cli/tools/rag/README.md


2026-02-04 00:49:57 +00:00
# RAG Pipeline for Host UK Documentation
Store documentation in a vector database so Claude (and local LLMs) can retrieve relevant context without being reminded in every conversation.
## The Problem This Solves
> "The amount of times I've had to re-tell you how to make a Flux button is crazy"
Instead of spending context window on "remember, Flux buttons work like this...", the RAG system works as follows:
1. All documentation is stored in Qdrant.
2. Claude queries the database before answering.
3. Relevant docs are injected into the context automatically.
4. No more re-teaching.
## Prerequisites
**Already running in your lab:**
- Qdrant: `linux.snider.dev:6333`
- Ollama: `linux.snider.dev:11434` (or local)
**Install Python deps:**
```bash
pip install -r requirements.txt
```
**Ensure embedding model is available:**
```bash
ollama pull nomic-embed-text
```
## Quick Start
### 1. Ingest Documentation
```bash
# Ingest recovered Host UK docs
python ingest.py /Users/snider/Code/host-uk/core/tasks/recovered-hostuk \
--collection hostuk-docs \
--recreate
# Ingest Flux UI docs separately (higher priority)
python ingest.py /path/to/flux-ui-docs \
--collection flux-ui-docs \
--recreate
```
### 2. Query the Database
```bash
# Search for Flux button docs
python query.py "how to create a Flux button component"
# Filter by category
python query.py "path sandboxing" --category architecture
# Get more results
python query.py "Vi personality" --top 10
# Output as JSON
python query.py "brand voice" --format json
# Output for LLM context injection
python query.py "Flux modal component" --format context
```
### 3. List Collections
```bash
python query.py --list-collections
python query.py --stats --collection flux-ui-docs
```
## Collections Strategy
| Collection | Content | Priority |
|------------|---------|----------|
| `flux-ui-docs` | Flux Pro component docs | High (UI questions) |
| `hostuk-docs` | Recovered implementation docs | Medium |
| `brand-docs` | Vi, brand voice, visual identity | For content generation |
| `lethean-docs` | SASE/dVPN technical docs | Product-specific |
## Integration with Claude Code
### Option 1: MCP Server (Best)
Create an MCP server that Claude can query:
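```go
// In core CLI (sketch; RagServer and Document are defined elsewhere)
func (s *RagServer) Query(query string) ([]Document, error) {
	var docs []Document
	// 1. Embed the query with Ollama.
	// 2. Search Qdrant for the nearest chunks and append them to docs.
	return docs, nil
}
```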
Then Claude can call `rag.query("Flux button")` and get docs automatically.
### Option 2: CLAUDE.md Instruction
Add to project CLAUDE.md:
````markdown
## Before Answering UI Questions
When asked about Flux UI components, query the RAG database first:
```bash
python /path/to/query.py "your question" --collection flux-ui-docs --format context
```
Include the retrieved context in your response.
````
### Option 3: Claude Code Hook
Create a hook that auto-injects context for certain queries.
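
A hook along these lines could be sketched as below. This is only an illustration under stated assumptions: that the hook receives a JSON object with a `prompt` field on stdin, and that whatever it prints is added to the model's context. The keyword list, the `QUERY_SCRIPT` path, and all function names are hypothetical; only the `--collection` and `--format context` flags come from the Quick Start above.

```python
import json
import subprocess
import sys

# Hypothetical trigger words; tune these to your own docs.
UI_KEYWORDS = ("flux", "button", "modal", "component")
# Placeholder path, same as in Option 2.
QUERY_SCRIPT = "/path/to/query.py"


def needs_ui_context(prompt: str) -> bool:
    """Return True when the prompt mentions any of the trigger words."""
    lowered = prompt.lower()
    return any(keyword in lowered for keyword in UI_KEYWORDS)


def build_query_command(prompt: str) -> list[str]:
    """Build the query.py invocation shown in the Quick Start section."""
    return [
        sys.executable, QUERY_SCRIPT, prompt,
        "--collection", "flux-ui-docs",
        "--format", "context",
    ]


def main() -> int:
    # Assumption: the hook input is a JSON object with a "prompt" field.
    payload = json.load(sys.stdin)
    prompt = payload.get("prompt", "")
    if needs_ui_context(prompt):
        result = subprocess.run(
            build_query_command(prompt),
            capture_output=True, text=True, timeout=30,
        )
        if result.returncode == 0:
            # Assumption: printed text is injected as extra context.
            print(result.stdout)
    return 0

# When saved as a hook script, finish the file with: sys.exit(main())
```

Keeping the keyword check separate from the subprocess call makes it easy to test the trigger logic without a running Qdrant instance.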
## Category Taxonomy
The ingestion automatically categorizes files:
| Category | Matches |
|----------|---------|
| `ui-component` | flux, ui/component |
| `brand` | brand, mascot |
| `product-brief` | brief |
| `help-doc` | help, draft |
| `task` | task, plan |
| `architecture` | architecture, migration |
| `documentation` | default |
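
The table above can be read as an ordered list of keyword rules. The sketch below is one way to express that, assuming case-insensitive substring matching on the file path and first-match-wins ordering; neither detail is confirmed by this README, and `detect_category` is an illustrative name rather than the function in `ingest.py`.

```python
# Ordered (category, keywords) rules mirroring the table above.
# Rule order and substring matching are assumptions for illustration.
CATEGORY_RULES = [
    ("ui-component", ("flux", "ui/component")),
    ("brand", ("brand", "mascot")),
    ("product-brief", ("brief",)),
    ("help-doc", ("help", "draft")),
    ("task", ("task", "plan")),
    ("architecture", ("architecture", "migration")),
]


def detect_category(path: str) -> str:
    """Return the first category whose keyword occurs in the lowercased path."""
    normalized = path.replace("\\", "/").lower()
    for category, keywords in CATEGORY_RULES:
        if any(keyword in normalized for keyword in keywords):
            return category
    # Nothing matched: fall back to the default category.
    return "documentation"
```

With first-match-wins ordering, a path such as `notes/architecture/migration-plan.md` lands in `task` because `plan` is checked before `architecture`, so rule order matters when keywords overlap.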
## Environment Variables
| Variable | Default | Description |
|----------|---------|-------------|
| `QDRANT_HOST` | linux.snider.dev | Qdrant server |
| `QDRANT_PORT` | 6333 | Qdrant port |
| `EMBEDDING_MODEL` | nomic-embed-text | Ollama model |
| `CHUNK_SIZE` | 500 | Characters per chunk |
| `CHUNK_OVERLAP` | 50 | Overlap between chunks |
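
To make the interaction between `CHUNK_SIZE` and `CHUNK_OVERLAP` concrete, here is a minimal sketch that reads the variables with the defaults from the table and splits text into overlapping character windows. The names `RagConfig`, `load_config`, and `chunk_text` are hypothetical, and the real `ingest.py` may well chunk differently (for example along Markdown sections); this only illustrates what the two numbers mean.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class RagConfig:
    qdrant_host: str
    qdrant_port: int
    embedding_model: str
    chunk_size: int
    chunk_overlap: int


def load_config(env=None) -> RagConfig:
    """Read settings from the environment, falling back to the documented defaults."""
    env = os.environ if env is None else env
    config = RagConfig(
        qdrant_host=env.get("QDRANT_HOST", "linux.snider.dev"),
        qdrant_port=int(env.get("QDRANT_PORT", "6333")),
        embedding_model=env.get("EMBEDDING_MODEL", "nomic-embed-text"),
        chunk_size=int(env.get("CHUNK_SIZE", "500")),
        chunk_overlap=int(env.get("CHUNK_OVERLAP", "50")),
    )
    # An overlap >= size would make the window never advance.
    if config.chunk_size <= 0 or not 0 <= config.chunk_overlap < config.chunk_size:
        raise ValueError("CHUNK_OVERLAP must be >= 0 and smaller than CHUNK_SIZE")
    return config


def chunk_text(text: str, size: int, overlap: int) -> list[str]:
    """Split text into windows of `size` characters; consecutive windows share `overlap` characters."""
    step = size - overlap
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + size])
        if start + size >= len(text):
            break  # the last window already reaches the end of the text
        start += step
    return chunks
```

With the defaults, each chunk starts 450 characters after the previous one, so a sentence cut at a chunk boundary still appears whole in one of the two neighbouring chunks as long as it is shorter than the overlap.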
## Training a Model vs RAG
**RAG** (what this does):
- Model weights unchanged
- Documents retrieved at query time
- Knowledge updates instantly (re-ingest)
- Good for: facts, API docs, current information
**Fine-tuning** (separate process):
- Model weights updated
- Knowledge baked into model
- Requires retraining to update
- Good for: style, patterns, conventions
**For Flux UI**: RAG is the better fit. The docs and the API change, and you want current information at query time.
**For Vi's voice**: Fine-tuning is the better fit. Style rarely changes, so it should be "baked in".
## Vector Math (For Understanding)
```text
"How do I make a Flux button?"
↓ Embedding
[0.12, -0.45, 0.78, ...768 floats...]
↓ Cosine similarity search
Find chunks with similar vectors
↓ Results
1. doc/ui/flux/components/button.md (score: 0.89)
2. doc/ui/flux/forms.md (score: 0.76)
3. doc/ui/flux/components/input.md (score: 0.71)
```
The embedding model converts text to "meaning vectors". Similar meanings = similar vectors = found by search.
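
The similarity step in the diagram can be reproduced in a few lines of plain Python. This is a toy sketch for understanding only: in the pipeline, Qdrant performs the search, and real vectors have 768 dimensions rather than the 3 used here. `cosine_similarity` and `rank_chunks` are illustrative names, and the `threshold` parameter is assumed to behave like a minimum-score cut-off.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_chunks(query_vec, chunks, top=3, threshold=0.0):
    """Score every chunk vector against the query and return the best matches.

    `chunks` maps a source name to its embedding vector. Results scoring
    below `threshold` are dropped; at most `top` results are returned.
    """
    scored = [(cosine_similarity(query_vec, vec), source) for source, vec in chunks.items()]
    kept = [(score, source) for score, source in scored if score >= threshold]
    kept.sort(reverse=True)  # highest score first
    return [(source, score) for score, source in kept[:top]]


# Toy 3-dimensional "embeddings" standing in for real 768-float vectors.
toy_chunks = {
    "button.md": [1.0, 0.0, 0.0],
    "forms.md": [1.0, 1.0, 0.0],
    "billing.md": [0.0, 0.0, 1.0],
}
toy_results = rank_chunks([1.0, 0.0, 0.0], toy_chunks, top=3, threshold=0.3)
```

Here `button.md` points in the same direction as the query and scores 1.0, `forms.md` scores about 0.71, and `billing.md` is orthogonal to the query (score 0.0), so the threshold filters it out.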
## Troubleshooting
**"No results found"**
- Lower threshold: `--threshold 0.3`
- Check collection has data: `--stats`
- Verify Ollama is running: `ollama list`
**"Connection refused"**
- Check Qdrant is running: `curl http://linux.snider.dev:6333/collections`
- Check firewall/network
**"Embedding model not available"**
```bash
ollama pull nomic-embed-text
```
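
The connectivity checks above can also be scripted. The sketch below uses only the standard library and probes the Qdrant `/collections` endpoint from the `curl` command plus Ollama's `/api/tags` model listing; the helper names are hypothetical, and the host and ports are simply the defaults listed in this README.

```python
import http.client
import urllib.request


def service_url(host: str, port: int, path: str) -> str:
    """Build a plain-HTTP URL for a service endpoint."""
    return f"http://{host}:{port}/{path.lstrip('/')}"


def is_reachable(url: str, timeout: float = 3.0) -> bool:
    """Return True if the URL answers with a 2xx status within the timeout."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (OSError, http.client.HTTPException):
        # Covers refused connections, DNS failures, timeouts, and HTTP errors.
        return False


def check_services(host: str = "linux.snider.dev") -> dict[str, bool]:
    """Probe Qdrant and Ollama on their default ports."""
    targets = {
        "qdrant": service_url(host, 6333, "/collections"),
        "ollama": service_url(host, 11434, "/api/tags"),
    }
    return {name: is_reachable(url) for name, url in targets.items()}
```

Calling `check_services()` returns something like `{"qdrant": True, "ollama": False}`, which narrows a "Connection refused" error down to one service before you start looking at the firewall.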
---
*Part of the Host UK Core CLI tooling*