go-rag/TODO.md at d8fd067a8c7404b9ccfedcd16beed2b5098f54ba

Snider d8fd067a8c feat: Phase 3 enhancements — sentence splitting, collection helpers, keyword filter, benchmarks

3.1: Sentence-aware chunk splitting at ". ", "? ", "! " boundaries when
paragraphs exceed ChunkConfig.Size. Overlap now aligns to word boundaries
to avoid mid-word splits.
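
As a rough illustration of 3.1, the sketch below splits an oversized paragraph at ". ", "? ", and "! " boundaries and picks an overlap that starts on a word boundary. The names `splitSentences`, `chunkParagraph`, and `overlapTail` are hypothetical and are not taken from go-rag; the greedy packing strategy and byte-based sizes are also assumptions, since the commit message only names the boundaries and the `ChunkConfig.Size` limit.

```go
package main

import (
	"fmt"
	"strings"
)

// splitSentences cuts text after ". ", "? " and "! ".
// The terminator stays with its sentence; the following space is dropped.
func splitSentences(text string) []string {
	var out []string
	start := 0
	for i := 0; i+1 < len(text); i++ {
		if (text[i] == '.' || text[i] == '?' || text[i] == '!') && text[i+1] == ' ' {
			if s := strings.TrimSpace(text[start : i+1]); s != "" {
				out = append(out, s)
			}
			start = i + 2
		}
	}
	if s := strings.TrimSpace(text[start:]); s != "" {
		out = append(out, s)
	}
	return out
}

// chunkParagraph returns para unchanged when it fits in size bytes.
// Otherwise it packs whole sentences greedily into chunks of at most
// size bytes. A single sentence longer than size becomes its own chunk.
func chunkParagraph(para string, size int) []string {
	if len(para) <= size {
		return []string{para}
	}
	var chunks []string
	var cur strings.Builder
	for _, s := range splitSentences(para) {
		if cur.Len() > 0 && cur.Len()+1+len(s) > size {
			chunks = append(chunks, cur.String())
			cur.Reset()
		}
		if cur.Len() > 0 {
			cur.WriteByte(' ')
		}
		cur.WriteString(s)
	}
	if cur.Len() > 0 {
		chunks = append(chunks, cur.String())
	}
	return chunks
}

// overlapTail returns at most the last n bytes of s, shortened so that
// the result starts at the beginning of a word rather than mid-word.
func overlapTail(s string, n int) string {
	if n <= 0 {
		return ""
	}
	if n >= len(s) {
		return s
	}
	cut := len(s) - n
	if s[cut-1] != ' ' {
		// The cut is inside a word (or on a space): skip past the next space.
		i := strings.IndexByte(s[cut:], ' ')
		if i < 0 {
			return ""
		}
		cut += i + 1
	}
	return s[cut:]
}

func main() {
	chunks := chunkParagraph("One two. Three four? Five six! Seven.", 20)
	for _, c := range chunks {
		fmt.Printf("%q\n", c)
	}
	fmt.Printf("%q\n", overlapTail(chunks[0], 8))
}
```

Trimming the overlap forward to the next word start means it can be shorter than requested, which seems preferable to starting a chunk with a word fragment.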

3.2: VectorStore interface gains ListCollections and CollectionInfo methods.
New collections.go with ListCollections, DeleteCollection, CollectionStats
helpers returning backend-agnostic CollectionInfo. Mock updated accordingly.

3.3: KeywordFilter re-ranks QueryResults by boosting scores for keyword
matches (case-insensitive, +10% per keyword). QueryConfig.Keywords flag
enables automatic extraction and filtering.
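
The re-ranking in 3.3 could look roughly like the sketch below. The `QueryResult` struct here is a stand-in with assumed fields, and `keywordFilter` is a hypothetical function name, so neither should be read as go-rag's actual API. The sketch also assumes that "+10% per keyword" means the score is multiplied by `1 + 0.10 * matches`; a compounding boost would be an equally plausible reading.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// QueryResult is a stand-in for illustration; the real type may differ.
type QueryResult struct {
	Text  string
	Score float32
}

// keywordFilter returns a re-ranked copy of results. Each keyword found
// in a result's text (case-insensitive substring match) raises that
// result's score by 10% of its original value.
func keywordFilter(results []QueryResult, keywords []string) []QueryResult {
	out := make([]QueryResult, len(results))
	copy(out, results)
	for i := range out {
		lower := strings.ToLower(out[i].Text)
		matches := 0
		for _, kw := range keywords {
			if kw != "" && strings.Contains(lower, strings.ToLower(kw)) {
				matches++
			}
		}
		out[i].Score *= 1 + 0.10*float32(matches)
	}
	// Stable sort keeps the original order for equal scores.
	sort.SliceStable(out, func(a, b int) bool { return out[a].Score > out[b].Score })
	return out
}

func main() {
	results := []QueryResult{
		{Text: "Go channels", Score: 0.80},
		{Text: "Qdrant vector store in Go", Score: 0.75},
	}
	for _, r := range keywordFilter(results, []string{"qdrant", "VECTOR"}) {
		fmt.Printf("%.2f %s\n", r.Score, r.Text)
	}
}
```

Working on a copy leaves the caller's slice untouched, which keeps the unboosted scores available for comparison.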

3.4: Mock-only benchmarks for chunking, query, ingest, formatting, and
keyword filtering.

Co-Authored-By: Virgil <virgil@lethean.io>

TODO.md — go-rag Task Queue

Phase 0: Environment Setup

Phase 1: Unit Tests (18.4% -> 38.8% coverage)

Testable Without External Services

Require External Services (use build tag `//go:build rag`)

Phase 2: Test Infrastructure (38.8% -> 69.0% coverage)

Phase 3: Enhancements

3.1 Chunk Boundary Improvements

3.2 Collection Management Helpers

3.3 Keyword Pre-Filter

3.4 Benchmarks

Phase 4: GPU Embeddings

Known Issues

Platform

Workflow
