# RAG Pipeline for Host UK Documentation
Store documentation in a vector database so Claude (and local LLMs) can retrieve relevant context without being reminded every conversation.
## The Problem This Solves

> "The amount of times I've had to re-tell you how to make a Flux button is crazy"
Instead of wasting context window on "remember, Flux buttons work like this...", the RAG system:
- Stores all documentation in Qdrant
- Claude queries before answering
- Relevant docs injected automatically
- No more re-teaching
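
Under the hood, a query round-trip looks roughly like this. A minimal sketch, assuming the `qdrant-client` package and Ollama's `/api/embeddings` endpoint; the `retrieve` helper and the `source` payload key are illustrative, not `query.py`'s actual internals:

```python
import requests
from qdrant_client import QdrantClient

def retrieve(question: str, collection: str = "flux-ui-docs", top_k: int = 5):
    # Turn the question into a 768-dim vector with nomic-embed-text
    resp = requests.post(
        "http://linux.snider.dev:11434/api/embeddings",
        json={"model": "nomic-embed-text", "prompt": question},
    )
    vector = resp.json()["embedding"]

    # Ask Qdrant for the stored chunks closest to that vector
    client = QdrantClient(host="linux.snider.dev", port=6333)
    hits = client.search(collection_name=collection, query_vector=vector, limit=top_k)
    return [(hit.payload.get("source"), hit.score) for hit in hits]

for source, score in retrieve("how to create a Flux button component"):
    print(f"{score:.2f}  {source}")
```

The retrieved text goes straight into the prompt, so the model answers from current docs instead of memory.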
## Prerequisites
Already running in your lab:

- Qdrant: `linux.snider.dev:6333`
- Ollama: `linux.snider.dev:11434` (or local)
Install Python deps:

```bash
pip install -r requirements.txt
```
Ensure the embedding model is available:

```bash
ollama pull nomic-embed-text
```
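
To confirm the model is reachable and returns vectors of the expected size, a quick check against Ollama's HTTP API (swap the host for `localhost` if running Ollama locally):

```python
import requests

resp = requests.post(
    "http://linux.snider.dev:11434/api/embeddings",
    json={"model": "nomic-embed-text", "prompt": "hello"},
)
print(len(resp.json()["embedding"]))  # nomic-embed-text -> 768
```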
## Quick Start

### 1. Ingest Documentation
```bash
# Ingest recovered Host UK docs
python ingest.py /Users/snider/Code/host-uk/core/tasks/recovered-hostuk \
    --collection hostuk-docs \
    --recreate

# Ingest Flux UI docs separately (higher priority)
python ingest.py /path/to/flux-ui-docs \
    --collection flux-ui-docs \
    --recreate
```
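
For orientation, an ingest step boils down to roughly this (a sketch, not `ingest.py`'s actual code; the `embed` helper, the example chunk, and the payload keys are illustrative assumptions):

```python
import uuid

import requests
from qdrant_client import QdrantClient
from qdrant_client.models import Distance, PointStruct, VectorParams

client = QdrantClient(host="linux.snider.dev", port=6333)

# --recreate drops and rebuilds the collection (nomic-embed-text emits 768 dims)
client.recreate_collection(
    collection_name="hostuk-docs",
    vectors_config=VectorParams(size=768, distance=Distance.COSINE),
)

def embed(text: str) -> list[float]:
    # Ask Ollama for the chunk's embedding vector
    resp = requests.post(
        "http://linux.snider.dev:11434/api/embeddings",
        json={"model": "nomic-embed-text", "prompt": text},
    )
    return resp.json()["embedding"]

# One (source path, chunk text) pair per chunk produced by the splitter
chunks = [("tasks/recovered-hostuk/example.md", "...chunk text...")]

client.upsert(
    collection_name="hostuk-docs",
    points=[
        PointStruct(
            id=str(uuid.uuid4()),
            vector=embed(text),
            payload={"source": source, "category": "documentation"},
        )
        for source, text in chunks
    ],
)
```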
### 2. Query the Database
```bash
# Search for Flux button docs
python query.py "how to create a Flux button component"

# Filter by category
python query.py "path sandboxing" --category architecture

# Get more results
python query.py "Vi personality" --top 10

# Output as JSON
python query.py "brand voice" --format json

# Output for LLM context injection
python query.py "Flux modal component" --format context
```
### 3. List Collections
```bash
python query.py --list-collections
python query.py --stats --collection flux-ui-docs
```
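
Listing collections amounts to a couple of `qdrant-client` calls (a sketch; `query.py` may gather its stats differently):

```python
from qdrant_client import QdrantClient

client = QdrantClient(host="linux.snider.dev", port=6333)
for collection in client.get_collections().collections:
    info = client.get_collection(collection.name)
    print(f"{collection.name}: {info.points_count} points")
```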
## Collections Strategy

| Collection | Content | Priority |
|---|---|---|
| `flux-ui-docs` | Flux Pro component docs | High (UI questions) |
| `hostuk-docs` | Recovered implementation docs | Medium |
| `brand-docs` | Vi, brand voice, visual identity | For content generation |
| `lethean-docs` | SASE/dVPN technical docs | Product-specific |
## Integration with Claude Code

### Option 1: MCP Server (Best)
Create an MCP server that Claude can query:
```go
// In core CLI
func (s *RagServer) Query(query string) ([]Document, error) {
    // Query Qdrant
    // Return relevant docs
}
```
Then Claude can call `rag.query("Flux button")` and get docs automatically.
### Option 2: CLAUDE.md Instruction
Add to project CLAUDE.md:
## Before Answering UI Questions
When asked about Flux UI components, query the RAG database first:
```bash
python /path/to/query.py "your question" --collection flux-ui-docs --format context
```

Include the retrieved context in your response.
### Option 3: Claude Code Hook
Create a hook that auto-injects context for certain queries.
## Category Taxonomy
The ingestion automatically categorizes files:
| Category | Matches |
|----------|---------|
| `ui-component` | flux, ui/component |
| `brand` | brand, mascot |
| `product-brief` | brief |
| `help-doc` | help, draft |
| `task` | task, plan |
| `architecture` | architecture, migration |
| `documentation` | default |
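
The matching is a set of substring checks against the file path, roughly like this (a sketch mirroring the table; `ingest.py`'s exact logic may differ):

```python
# First matching rule wins; anything unmatched falls through to "documentation"
RULES = [
    ("ui-component", ("flux", "ui/component")),
    ("brand", ("brand", "mascot")),
    ("product-brief", ("brief",)),
    ("help-doc", ("help", "draft")),
    ("task", ("task", "plan")),
    ("architecture", ("architecture", "migration")),
]

def categorize(path: str) -> str:
    lowered = path.lower()
    for category, needles in RULES:
        if any(needle in lowered for needle in needles):
            return category
    return "documentation"

print(categorize("doc/ui/flux/components/button.md"))  # ui-component
```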
## Environment Variables
| Variable | Default | Description |
|----------|---------|-------------|
| `QDRANT_HOST` | linux.snider.dev | Qdrant server |
| `QDRANT_PORT` | 6333 | Qdrant port |
| `EMBEDDING_MODEL` | nomic-embed-text | Ollama model |
| `CHUNK_SIZE` | 500 | Characters per chunk |
| `CHUNK_OVERLAP` | 50 | Overlap between chunks |
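
`CHUNK_SIZE` and `CHUNK_OVERLAP` interact like this (a sketch using raw character offsets; the real splitter may respect word or heading boundaries):

```python
def chunk(text: str, size: int = 500, overlap: int = 50) -> list[str]:
    step = size - overlap  # each chunk starts 450 chars after the previous one
    return [text[i : i + size] for i in range(0, max(len(text) - overlap, 1), step)]

print([len(p) for p in chunk("x" * 1200)])  # [500, 500, 300], 50 chars shared
```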
## Training a Model vs RAG
**RAG** (what this does):
- Model weights unchanged
- Documents retrieved at query time
- Knowledge updates instantly (re-ingest)
- Good for: facts, API docs, current information
**Fine-tuning** (separate process):
- Model weights updated
- Knowledge baked into model
- Requires retraining to update
- Good for: style, patterns, conventions
**For Flux UI**: RAG is perfect. The docs change, the API changes, and you want current info.
**For Vi's voice**: Fine-tuning is better. Style doesn't change often, should be "baked in".
## Vector Math (For Understanding)
"How do I make a Flux button?" ↓ Embedding [0.12, -0.45, 0.78, ...768 floats...] ↓ Cosine similarity search Find chunks with similar vectors ↓ Results
- doc/ui/flux/components/button.md (score: 0.89)
- doc/ui/flux/forms.md (score: 0.76)
- doc/ui/flux/components/input.md (score: 0.71)
The embedding model converts text to "meaning vectors". Similar meanings = similar vectors = found by search.
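
Cosine similarity is the comparison Qdrant runs between the query vector and every stored chunk vector, shown here with toy 3-dimensional vectors instead of 768:

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    # 1.0 = same direction (same meaning), 0 = unrelated, -1.0 = opposite
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

query = np.array([0.12, -0.45, 0.78])
button_doc = np.array([0.10, -0.40, 0.80])    # similar meaning
pricing_doc = np.array([-0.70, 0.20, -0.10])  # unrelated

print(cosine_similarity(query, button_doc))   # ~0.99, strong match
print(cosine_similarity(query, pricing_doc))  # negative, filtered out
```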
## Troubleshooting
**"No results found"**
- Lower threshold: `--threshold 0.3`
- Check collection has data: `--stats`
- Verify Ollama is running: `ollama list`
**"Connection refused"**
- Check Qdrant is running: `curl http://linux.snider.dev:6333/collections`
- Check firewall/network
**"Embedding model not available"**
```bash
ollama pull nomic-embed-text
```

---

*Part of the Host UK Core CLI tooling*