Commit graph

22 commits

Author SHA1 Message Date
Snider
ebe9537b6a refactor: apply go fix modernizers for Go 1.26
Automated fixes: interface{} → any, range-over-int, t.Context(),
wg.Go(), strings.SplitSeq, strings.Builder, slices.Contains,
maps helpers, min/max builtins.

Co-Authored-By: Virgil <virgil@lethean.io>
2026-02-22 21:00:16 +00:00
Snider
dac393cae2 chore: go mod tidy for 1.26.0
Co-Authored-By: Virgil <virgil@lethean.io>
2026-02-22 20:35:59 +00:00
Snider
c4c4bab176 chore: bump go directive to 1.26.0
Co-Authored-By: Virgil <virgil@lethean.io>
2026-02-22 20:33:48 +00:00
Snider
b2d88c06d1 chore: pin forge deps to v0.0.1 tags for Go 1.26 compat
Go 1.26 rejects non-semver version strings (like 'main') in go.mod.
Tags v0.0.1 now exist on all forge repos — workspace still overrides
for local development.

Co-Authored-By: Virgil <virgil@lethean.io>
2026-02-22 20:15:06 +00:00
Snider
dcdc2b9732 chore: use workspace-resolved versions, drop replace directives
Forge module versions now use main branch resolution via ~/Code/go.work
workspace. Removes local replace directives — the central go.work handles
all cross-repo resolution during development.

Co-Authored-By: Virgil <virgil@lethean.io>
2026-02-22 19:49:36 +00:00
Snider
ebce05d6a4 feat: add RAG Python tools from CLI
Python scripts for RAG ingestion and querying (ingest.py, query.py).

Co-Authored-By: Virgil <virgil@lethean.io>
2026-02-21 21:16:54 +00:00
Snider
2a8d8b0820 feat: migrate rag commands from CLI
Co-Authored-By: Virgil <virgil@lethean.io>
2026-02-21 19:38:11 +00:00
Snider
4c741992dc chore: resolve forge deps from registry, remove local replaces
Co-Authored-By: Virgil <virgil@lethean.io>
2026-02-21 19:19:26 +00:00
Snider
7d97562402 docs: add README with quick start and docs links
Co-Authored-By: Virgil <virgil@lethean.io>
2026-02-20 15:11:19 +00:00
Snider
ce4e311b54 docs: graduate TODO/FINDINGS into production documentation
Replace internal task tracking (TODO.md, FINDINGS.md) with structured
documentation in docs/. Trim CLAUDE.md to agent instructions only.

Co-Authored-By: Virgil <virgil@lethean.io>
2026-02-20 15:01:55 +00:00
Claude
f5f1e68c5c feat(bench): add Phase 4 GPU benchmarks on RX 7800 XT
Benchmarks for embedding (Ollama ROCm), chunking (pure CPU), and search
(Qdrant) latency. Key results: 97 embeds/sec single (10.3ms), Qdrant
search 152µs (6.5K QPS), chunking 11µs per 50-section doc.

EmbedBatch confirmed sequential — Ollama has no batch API.

Co-Authored-By: Charon <developers@lethean.io>
2026-02-20 11:51:25 +00:00
Snider
d8fd067a8c feat: Phase 3 enhancements — sentence splitting, collection helpers, keyword filter, benchmarks
3.1: Sentence-aware chunk splitting at ". ", "? ", "! " boundaries when
paragraphs exceed ChunkConfig.Size. Overlap now aligns to word boundaries
to avoid mid-word splits.

3.2: VectorStore interface gains ListCollections and CollectionInfo methods.
New collections.go with ListCollections, DeleteCollection, CollectionStats
helpers returning backend-agnostic CollectionInfo. Mock updated accordingly.

3.3: KeywordFilter re-ranks QueryResults by boosting scores for keyword
matches (case-insensitive, +10% per keyword). QueryConfig.Keywords flag
enables automatic extraction and filtering.

3.4: Mock-only benchmarks for chunking, query, ingest, formatting, and
keyword filtering.
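
Note on 3.1: a minimal sketch of sentence-aware splitting as described above, assuming a paragraph is first cut at ". ", "? " and "! " and the sentences are then packed greedily into chunks. The function names (splitSentences, packSentences) are hypothetical, size is counted in bytes for simplicity, and overlap handling is omitted; the package's actual ChunkConfig behaviour may differ.

```go
package main

import "fmt"

// splitSentences cuts text after '.', '?' or '!' when the next byte is a
// space. Byte indexing is safe here because these ASCII bytes never occur
// inside a multibyte UTF-8 sequence.
func splitSentences(text string) []string {
	var out []string
	start := 0
	for i := 0; i+1 < len(text); i++ {
		if (text[i] == '.' || text[i] == '?' || text[i] == '!') && text[i+1] == ' ' {
			out = append(out, text[start:i+1])
			start = i + 2 // skip the separating space
		}
	}
	if start < len(text) {
		out = append(out, text[start:])
	}
	return out
}

// packSentences joins sentences with single spaces into chunks of at most
// size bytes. A sentence longer than size becomes its own chunk.
func packSentences(sentences []string, size int) []string {
	var chunks []string
	cur := ""
	for _, s := range sentences {
		if cur != "" && len(cur)+1+len(s) > size {
			chunks = append(chunks, cur)
			cur = ""
		}
		if cur != "" {
			cur += " "
		}
		cur += s
	}
	if cur != "" {
		chunks = append(chunks, cur)
	}
	return chunks
}

func main() {
	sentences := splitSentences("Hello world. How are you? Fine!")
	for _, c := range packSentences(sentences, 20) {
		fmt.Printf("%q\n", c)
	}
}
```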

Co-Authored-By: Virgil <virgil@lethean.io>
2026-02-20 08:02:00 +00:00
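Note on 3.3 in d8fd067a8c: a minimal sketch of keyword re-ranking, assuming the +10% boost is applied as a single multiplier of 1 + 0.10 per matching keyword (the commit message does not say whether boosts compound). The result type and boostByKeywords function are hypothetical stand-ins, not the package's KeywordFilter signature.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// result is a hypothetical stand-in for the package's query result type.
type result struct {
	Text  string
	Score float64
}

// boostByKeywords returns a copy of results in which each score is raised by
// 10% for every keyword in the list that occurs in the text
// (case-insensitive), sorted by descending score.
func boostByKeywords(results []result, keywords []string) []result {
	out := make([]result, len(results))
	copy(out, results)
	for i := range out {
		text := strings.ToLower(out[i].Text)
		matches := 0
		for _, kw := range keywords {
			if kw != "" && strings.Contains(text, strings.ToLower(kw)) {
				matches++
			}
		}
		out[i].Score *= 1 + 0.10*float64(matches)
	}
	// Stable sort keeps the original order for equal scores.
	sort.SliceStable(out, func(a, b int) bool { return out[a].Score > out[b].Score })
	return out
}

func main() {
	in := []result{
		{Text: "Ollama embeddings", Score: 0.52},
		{Text: "Qdrant vector search", Score: 0.50},
	}
	for _, r := range boostByKeywords(in, []string{"qdrant", "SEARCH"}) {
		fmt.Printf("%.2f %s\n", r.Score, r.Text)
	}
}
```

With two matching keywords the second entry is multiplied by 1.2 and overtakes the first, which is the re-ranking effect the commit describes.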
Snider
cf1c6191c4 docs: flesh out Phase 3 enhancement specs — chunk boundaries, keyword filter, benchmarks
All Phase 3 tasks are pure Go, testable with existing mocks.
Phase 0 env setup also marked complete (Charon).

Co-Authored-By: Virgil <virgil@lethean.io>
2026-02-20 07:45:16 +00:00
Claude
5a93e91f56 docs: update TODO.md with Phase 3 commit hash (e90f281)
Co-Authored-By: Charon <developers@lethean.io>
2026-02-20 00:33:16 +00:00
Claude
e90f281f6b test: add Phase 3 integration tests with live Qdrant + Ollama (69.0% -> 89.2%)
32 new integration tests across 3 files, all gated behind //go:build rag:
- qdrant_integration_test.go (11): collection CRUD, upsert, search, filter, overwrite
- ollama_integration_test.go (9): embed, batch, consistency, dimension, model verify
- integration_test.go (12): end-to-end ingest+query, format results, all helpers,
  semantic similarity, recreate flag, convenience wrappers with default clients

Key discovery: Qdrant NewID() requires valid UUID/hex format — arbitrary strings
rejected. ChunkID's MD5 hex output works, but test point IDs must match.

Co-Authored-By: Charon <developers@lethean.io>
2026-02-20 00:33:03 +00:00
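Note on the key discovery in e90f281f6b: a minimal sketch of why an MD5 hex digest is usable where an arbitrary string is not. It uses only the standard library and does not call the Qdrant client. The chunkID inputs (path, index, text) and the isHexID check are assumptions for illustration; the package's real ChunkID may hash different fields.

```go
package main

import (
	"crypto/md5"
	"encoding/hex"
	"fmt"
)

// chunkID derives a deterministic ID from hypothetical chunk fields.
// An MD5 digest is 16 bytes, so the hex form is always 32 characters,
// the same length as a UUID written without hyphens.
func chunkID(path string, index int, text string) string {
	sum := md5.Sum([]byte(fmt.Sprintf("%s:%d:%s", path, index, text)))
	return hex.EncodeToString(sum[:])
}

// isHexID reports whether s decodes to exactly 16 bytes of hex, which is
// the shape a test point ID would need to match under this assumption.
func isHexID(s string) bool {
	b, err := hex.DecodeString(s)
	return err == nil && len(b) == 16
}

func main() {
	id := chunkID("docs/readme.md", 0, "Quick start")
	fmt.Println(id, isHexID(id))
	fmt.Println("my-test-point", isHexID("my-test-point"))
}
```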
Claude
7784315f6b docs: update TODO.md with Phase 2 commit hash
Co-Authored-By: Charon <developers@lethean.io>
2026-02-20 00:16:08 +00:00
Claude
a49761b1ba feat: extract Embedder and VectorStore interfaces, add mock-based tests
Phase 2 test infrastructure: extract interfaces to decouple business
logic from external services, enabling fast CI tests without live
Qdrant or Ollama.

- Add Embedder interface (embedder.go) satisfied by OllamaClient
- Add VectorStore interface (vectorstore.go) satisfied by QdrantClient
- Update Ingest, IngestFile, Query to accept interfaces
- Add QueryWith, QueryContextWith, IngestDirWith, IngestFileWith helpers
- Add mockEmbedder and mockVectorStore in mock_test.go
- Add 69 new mock-based tests (ingest: 23, query: 12, helpers: 16)
- Coverage: 38.8% -> 69.0% (135 leaf-level tests total)

Co-Authored-By: Charon <developers@lethean.io>
2026-02-20 00:15:54 +00:00
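Note on a49761b1ba: a minimal sketch of the interface-extraction pattern the commit describes, showing how business logic that accepts interfaces can run against in-memory mocks with no live Qdrant or Ollama. The interface names come from the commit message, but the method signatures, the ingestChunks function and the demo helper are assumptions for illustration and are likely narrower than the real ones.

```go
package main

import (
	"context"
	"fmt"
)

// Embedder and VectorStore use assumed, simplified method sets.
type Embedder interface {
	Embed(ctx context.Context, text string) ([]float32, error)
}

type VectorStore interface {
	Upsert(ctx context.Context, collection, id string, vector []float32) error
}

// mockEmbedder counts calls and returns a one-dimensional fake vector.
type mockEmbedder struct{ calls int }

func (m *mockEmbedder) Embed(_ context.Context, text string) ([]float32, error) {
	m.calls++
	return []float32{float32(len(text))}, nil
}

// mockVectorStore keeps upserted points in memory.
type mockVectorStore struct{ points map[string][]float32 }

func (m *mockVectorStore) Upsert(_ context.Context, _ string, id string, v []float32) error {
	m.points[id] = v
	return nil
}

// ingestChunks depends only on the interfaces, so a test can pass mocks.
func ingestChunks(ctx context.Context, e Embedder, s VectorStore, collection string, chunks []string) error {
	for i, c := range chunks {
		vec, err := e.Embed(ctx, c)
		if err != nil {
			return fmt.Errorf("embed chunk %d: %w", i, err)
		}
		if err := s.Upsert(ctx, collection, fmt.Sprintf("chunk-%d", i), vec); err != nil {
			return fmt.Errorf("upsert chunk %d: %w", i, err)
		}
	}
	return nil
}

// demo runs ingestChunks against the mocks and reports what they recorded.
func demo() (embedCalls, stored int, err error) {
	e := &mockEmbedder{}
	s := &mockVectorStore{points: map[string][]float32{}}
	err = ingestChunks(context.Background(), e, s, "docs", []string{"alpha", "beta"})
	return e.calls, len(s.points), err
}

func main() {
	calls, stored, err := demo()
	fmt.Println(calls, stored, err)
}
```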
Claude
49e9669419 docs: update TODO.md and FINDINGS.md for Phase 1 completion
Mark all Phase 1 pure-function test tasks as complete with commit
reference acb987a. Document coverage improvement (18.4% -> 38.8%),
per-function coverage breakdown, and discoveries about testability
of remaining functions.

Co-Authored-By: Charon <developers@lethean.io>
2026-02-20 00:03:47 +00:00
Claude
acb987a01d test: add Phase 1 pure-function unit tests (18.4% -> 38.8% coverage)
Add comprehensive unit tests for all pure functions that require no
external services (Qdrant, Ollama). Coverage of testable functions
now at 100% for FormatResults*, Default*Config, EmbedDimension,
valueToGo, Model, ChunkID, and ChunkMarkdown edge cases.

New test files:
- query_test.go: FormatResultsText, FormatResultsContext, FormatResultsJSON,
  DefaultQueryConfig (18 tests)
- ollama_test.go: DefaultOllamaConfig, EmbedDimension, Model (8 tests)
- qdrant_test.go: DefaultQdrantConfig, pointIDToString, valueToGo,
  Point/SearchResult structs (24 tests)

Extended chunk_test.go with edge cases:
- Empty input, whitespace-only, single newline
- Headers with no body content
- Unicode/emoji text with rune-safe overlap verification
- Very long single paragraph
- Config boundary conditions (zero/negative size, overlap >= size)
- Sequential chunk indexing
- ChunkID rune truncation with multibyte characters
- DefaultChunkConfig, DefaultIngestConfig

Co-Authored-By: Charon <developers@lethean.io>
2026-02-20 00:02:52 +00:00
Claude
034b9da45f fix: correct go.mod replace directive, prep contract files for fleet
- go.mod: replace ../core → ../go (actual framework location)
- go mod tidy: resolve missing indirect deps
- TODO.md: rewrite with phased task queue (0-4)
- FINDINGS.md: add environment review, testability analysis

Co-Authored-By: Charon <developers@lethean.io>
2026-02-19 23:33:56 +00:00
Snider
4999c8bdc3 docs: assign to Linux homelab agent with Qdrant/Ollama tasks
Co-Authored-By: Virgil <virgil@lethean.io>
2026-02-19 19:40:21 +00:00
Snider
bf047e4494 feat: extract go-rag from go-ai as standalone RAG package
Vector search with Qdrant + Ollama embeddings, document chunking.
Zero internal go-ai dependencies. Adds CLAUDE.md/TODO.md/FINDINGS.md.

Co-Authored-By: Virgil <virgil@lethean.io>
2026-02-19 18:29:59 +00:00