description: Retrieval-augmented generation via Qdrant vector search and Ollama embeddings.
---
# RAG Pipeline
go-ai integrates with the RAG (Retrieval-Augmented Generation) pipeline provided by `go-rag`. The integration is exposed as three MCP tools for vector search and a high-level facade function for programmatic use.
The tool embeds the question via Ollama, searches Qdrant with the specified parameters, and returns formatted context with source references.
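As a rough illustration of the final step, the sketch below turns search hits into a context string with numbered source references. The `Hit` type, its fields, and the output layout are assumptions made for this example; the format that `go-rag` actually emits may differ.

```go
package main

import (
	"fmt"
	"strings"
)

// Hit is a hypothetical search result used only in this sketch;
// it is not the go-rag result type.
type Hit struct {
	Source string  // where the chunk came from, e.g. a file path
	Text   string  // chunk text returned by the vector search
	Score  float64 // similarity score reported by the search
}

// formatContext renders hits as numbered blocks, each headed by its
// source reference and score, separated by blank lines.
func formatContext(hits []Hit) string {
	var b strings.Builder
	for i, h := range hits {
		fmt.Fprintf(&b, "[%d] %s (score %.2f)\n%s\n\n",
			i+1, h.Source, h.Score, strings.TrimSpace(h.Text))
	}
	// Drop the trailing blank line left by the last block.
	return strings.TrimRight(b.String(), "\n")
}

func main() {
	hits := []Hit{
		{Source: "docs/rag.md", Text: "Chunks are embedded via Ollama.", Score: 0.91},
		{Source: "docs/setup.md", Text: "Qdrant must be running.", Score: 0.84},
	}
	fmt.Println(formatContext(hits))
}
```

Numbering the blocks lets a downstream prompt cite a source by index rather than repeating the path.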
### `rag_ingest`
Ingest a file into the vector database. The file is chunked (for Markdown, this respects heading boundaries), each chunk is embedded via Ollama, and the resulting vectors are stored in Qdrant.
| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `path` | `string` | Yes | Path to the file to ingest (relative to workspace root) |
This tool is logged at `Security` level because it writes to the vector database.
### `rag_collections`
List all available collections in the connected Qdrant instance, with point counts and vector dimensions.
## AI Facade: QueryRAGForTask
The `ai` package provides a higher-level wrapper for programmatic RAG queries. It is used by agentic task planners to enrich task context without importing `go-rag` directly.
Both Qdrant and Ollama must be running for the RAG tools to function. In CI, tests that touch RAG tools are guarded with `skipIfShort(t)`.
## Embedding Benchmark
The `cmd/embed-bench/` utility compares embedding models for the OpenBrain knowledge store. It tests how well models separate semantically related vs unrelated agent memory pairs.
```bash
go run ./cmd/embed-bench
go run ./cmd/embed-bench -ollama http://localhost:11434
```
The benchmark evaluates:
- **Cluster separation** -- intra-group vs inter-group similarity
- **Query recall accuracy** -- top-1 and top-3 retrieval precision
- **Embedding throughput** -- milliseconds per memory
Models tested: `nomic-embed-text` and `embeddinggemma`.
## Testing
RAG tool tests cover handler validation (empty question/path fields, default behaviour) and graceful degradation when Qdrant or Ollama is unavailable. Full RAG round-trip tests require live services and are skipped in short mode.