Commit graph

4 commits

Author SHA1 Message Date
Snider
ebe9537b6a refactor: apply go fix modernizers for Go 1.26
Automated fixes: interface{} → any, range-over-int, t.Context(),
wg.Go(), strings.SplitSeq, strings.Builder, slices.Contains,
maps helpers, min/max builtins.

Co-Authored-By: Virgil <virgil@lethean.io>
2026-02-22 21:00:16 +00:00
Snider
d8fd067a8c feat: Phase 3 enhancements — sentence splitting, collection helpers, keyword filter, benchmarks
3.1: Sentence-aware chunk splitting at ". ", "? ", "! " boundaries when
paragraphs exceed ChunkConfig.Size. Overlap now aligns to word boundaries
to avoid mid-word splits.
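
A rough sketch of the sentence-boundary idea described in 3.1, under stated assumptions: sizes are counted in runes, overlap handling is left out, and splitSentences and chunkParagraph are hypothetical names, not the package's actual functions:

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// splitSentences cuts text after ". ", "? " or "! ". The punctuation
// stays with the sentence and the following space is dropped.
// Byte indexing is safe here because the delimiters are ASCII.
func splitSentences(text string) []string {
	var out []string
	start := 0
	for i := 0; i+1 < len(text); i++ {
		if (text[i] == '.' || text[i] == '?' || text[i] == '!') && text[i+1] == ' ' {
			out = append(out, text[start:i+1])
			start = i + 2
		}
	}
	if start < len(text) {
		out = append(out, text[start:])
	}
	return out
}

// chunkParagraph returns the paragraph unchanged when it fits in size
// runes, and otherwise packs whole sentences greedily into chunks.
// A single sentence longer than size is kept intact in this sketch.
func chunkParagraph(para string, size int) []string {
	if size <= 0 || utf8.RuneCountInString(para) <= size {
		return []string{para}
	}
	var chunks []string
	cur := ""
	for _, s := range splitSentences(para) {
		if cur != "" && utf8.RuneCountInString(cur)+1+utf8.RuneCountInString(s) > size {
			chunks = append(chunks, cur)
			cur = ""
		}
		if cur != "" {
			cur += " "
		}
		cur += s
	}
	if cur != "" {
		chunks = append(chunks, cur)
	}
	return chunks
}

func main() {
	for _, c := range chunkParagraph("One two. Three four? Five six! Seven.", 20) {
		fmt.Printf("%q\n", c)
	}
}
```
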

3.2: VectorStore interface gains ListCollections and CollectionInfo methods.
New collections.go with ListCollections, DeleteCollection, CollectionStats
helpers returning backend-agnostic CollectionInfo. Mock updated accordingly.

3.3: KeywordFilter re-ranks QueryResults by boosting scores for keyword
matches (case-insensitive, +10% per keyword). QueryConfig.Keywords flag
enables automatic extraction and filtering.
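
The re-ranking rule in 3.3 could look roughly like the sketch below. It assumes the boost is multiplicative (score × (1 + 0.10 × matched keywords)) and that each keyword counts at most once per result; the commit message does not settle either point. The result type and boostByKeywords are hypothetical stand-ins, not the package's QueryResult or KeywordFilter:

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// result is a hypothetical stand-in for a query hit.
type result struct {
	Text  string
	Score float64
}

// boostByKeywords returns a re-ranked copy of results. Each keyword
// found in the text (case-insensitive) raises the score by 10%.
func boostByKeywords(results []result, keywords []string) []result {
	out := make([]result, len(results))
	copy(out, results)
	for i := range out {
		lower := strings.ToLower(out[i].Text)
		matches := 0
		for _, kw := range keywords {
			if kw != "" && strings.Contains(lower, strings.ToLower(kw)) {
				matches++
			}
		}
		out[i].Score *= 1 + 0.10*float64(matches)
	}
	// Highest score first; a stable sort keeps the original order on ties.
	sort.SliceStable(out, func(a, b int) bool { return out[a].Score > out[b].Score })
	return out
}

func main() {
	hits := []result{
		{Text: "Qdrant stores vectors", Score: 0.80},
		{Text: "Ollama embeds text with Qdrant", Score: 0.75},
	}
	for _, r := range boostByKeywords(hits, []string{"OLLAMA", "qdrant"}) {
		fmt.Printf("%.2f %s\n", r.Score, r.Text)
	}
}
```

With two matching keywords, the lower-scored hit overtakes the first one, which is the effect the boost is meant to have.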

3.4: Mock-only benchmarks for chunking, query, ingest, formatting, and
keyword filtering.

Co-Authored-By: Virgil <virgil@lethean.io>
2026-02-20 08:02:00 +00:00
Claude
acb987a01d test: add Phase 1 pure-function unit tests (18.4% -> 38.8% coverage)
Add comprehensive unit tests for all pure functions that require no
external services (Qdrant, Ollama). Coverage of these testable functions
is now 100% for FormatResults*, Default*Config, EmbedDimension,
valueToGo, Model, ChunkID, and the ChunkMarkdown edge cases.

New test files:
- query_test.go: FormatResultsText, FormatResultsContext, FormatResultsJSON,
  DefaultQueryConfig (18 tests)
- ollama_test.go: DefaultOllamaConfig, EmbedDimension, Model (8 tests)
- qdrant_test.go: DefaultQdrantConfig, pointIDToString, valueToGo,
  Point/SearchResult structs (24 tests)

Extended chunk_test.go with edge cases:
- Empty input, whitespace-only, single newline
- Headers with no body content
- Unicode/emoji text with rune-safe overlap verification
- Very long single paragraph
- Config boundary conditions (zero/negative size, overlap >= size)
- Sequential chunk indexing
- ChunkID rune truncation with multibyte characters
- DefaultChunkConfig, DefaultIngestConfig

Co-Authored-By: Charon <developers@lethean.io>
2026-02-20 00:02:52 +00:00
Snider
bf047e4494 feat: extract go-rag from go-ai as standalone RAG package
Vector search with Qdrant + Ollama embeddings, document chunking.
Zero internal go-ai dependencies. Adds CLAUDE.md/TODO.md/FINDINGS.md.

Co-Authored-By: Virgil <virgil@lethean.io>
2026-02-19 18:29:59 +00:00