go-rag/TODO.md at f5f1e68c5cf1612bec6a2b8b88997d58fceda4c8

feat(bench): add Phase 4 GPU benchmarks on RX 7800 XT

Benchmarks for embedding (Ollama ROCm), chunking (pure CPU), and search
(Qdrant) latency. Key results: 97 embeds/sec single (10.3ms), Qdrant
search 152µs (6.5K QPS), chunking 11µs per 50-section doc.

EmbedBatch confirmed sequential — Ollama has no batch API.

Co-Authored-By: Charon <developers@lethean.io>

7.3 KiB


TODO.md — go-rag Task Queue

Phase 0: Environment Setup

Phase 1: Unit Tests (18.4% -> 38.8% coverage)

Testable Without External Services

Require External Services (use build tag `//go:build rag`)

Phase 2: Test Infrastructure (38.8% -> 69.0% coverage)

Phase 3: Enhancements

3.1 Chunk Boundary Improvements

3.2 Collection Management Helpers

3.3 Keyword Pre-Filter

3.4 Benchmarks

Phase 4: GPU Embeddings — COMPLETE

Known Issues

Platform

Workflow
