diff --git a/docs/agentic.md b/docs/agentic.md new file mode 100644 index 0000000..635c56e --- /dev/null +++ b/docs/agentic.md @@ -0,0 +1,214 @@ +--- +title: Agentic Client +description: Security scanning, metrics recording, and CLI command integrations. +--- + +# Agentic Client + +go-ai provides several CLI command packages that integrate AI-adjacent functionality into the `core` binary. These cover security scanning across multi-repo environments, AI metrics viewing, and homelab monitoring. + +## Security Scanning + +The `cmd/security/` package provides a comprehensive security scanning suite that queries GitHub's security APIs across repositories defined in a `repos.yaml` registry. + +### Command Structure + +``` +core security + +-- alerts # Unified view of all security alert types + +-- deps # Dependabot dependency vulnerability alerts + +-- scan # Code scanning alerts (CodeQL, etc.) + +-- secrets # Secret scanning alerts + +-- jobs # Create GitHub issues from scan findings +``` + +### Common Flags + +| Flag | Description | +|------|-------------| +| `--registry` | Path to `repos.yaml` (auto-detected if omitted) | +| `--repo` | Filter to a single repository | +| `--severity` | Filter by severity: `critical`, `high`, `medium`, `low` (comma-separated) | +| `--json` | Output as JSON instead of formatted table | +| `--target` | External repo target (e.g. 
`wailsapp/wails`) -- bypasses registry | + +### Alerts + +The `alerts` command provides a unified view combining Dependabot, code scanning, and secret scanning alerts: + +```bash +core security alerts # All repos in registry +core security alerts --repo core-php # Single repo +core security alerts --severity critical,high # Filter by severity +core security alerts --target wailsapp/wails # External repo +core security alerts --json # JSON output +``` + +### Dependencies + +Query Dependabot vulnerability alerts with upgrade suggestions: + +```bash +core security deps +core security deps --severity high +``` + +Output includes the vulnerable version range and the first patched version when available. + +### Code Scanning + +Query code scanning alerts from tools like CodeQL: + +```bash +core security scan +core security scan --tool codeql # Filter by scanning tool +``` + +Each scan execution records a metrics event via `ai.Record()` for tracking scan activity over time. + +### Secret Scanning + +Check for exposed secrets across repositories: + +```bash +core security secrets +core security secrets --json +``` + +Secrets are always treated as high severity. The output flags whether push protection was bypassed. + +### Jobs + +Create GitHub issues from security findings for agent-driven remediation: + +```bash +core security jobs --targets wailsapp/wails +core security jobs --targets wailsapp/wails --issue-repo host-uk/core +core security jobs --targets wailsapp/wails --dry-run +core security jobs --targets a/b,c/d --copies 3 +``` + +| Flag | Default | Description | +|------|---------|-------------| +| `--targets` | (required) | Comma-separated `owner/repo` targets | +| `--issue-repo` | `host-uk/core` | Repository where issues are created | +| `--dry-run` | `false` | Preview without creating issues | +| `--copies` | `1` | Number of issue copies per target | + +Each created issue includes a findings summary, checklist, and instructions. 
A metrics event is recorded for each issue created. + +## Metrics + +The `ai/` package provides a JSONL-based metrics recording system. Events are stored at: + +``` +~/.core/ai/metrics/YYYY-MM-DD.jsonl +``` + +### Event Structure + +```go +type Event struct { + Type string `json:"type"` + Timestamp time.Time `json:"timestamp"` + AgentID string `json:"agent_id,omitempty"` + Repo string `json:"repo,omitempty"` + Duration time.Duration `json:"duration,omitempty"` + Data map[string]any `json:"data,omitempty"` +} +``` + +### Recording Events + +```go +ai.Record(ai.Event{ + Type: "security.scan", + Timestamp: time.Now(), + Repo: "wailsapp/wails", + Data: map[string]any{ + "total": summary.Total, + "critical": summary.Critical, + }, +}) +``` + +Writing uses `O_APPEND` with a mutex for concurrent safety. Missing directories are created automatically. + +### Reading and Querying + +```go +events, err := ai.ReadEvents(since) // Read events from a time range +summary := ai.Summary(events) // Aggregate by type, repo, agent +``` + +`ReadEvents` iterates calendar days from `since` to today, opening each daily file. Missing files are silently skipped. Malformed JSONL lines are skipped without error. + +`Summary` returns a `map[string]any` with: +- `total` -- total event count +- `by_type` -- sorted slice of `{key, count}` maps +- `by_repo` -- sorted slice of `{key, count}` maps +- `by_agent` -- sorted slice of `{key, count}` maps + +### CLI Command + +```bash +core ai metrics # Last 7 days (default) +core ai metrics --since 30d # Last 30 days +core ai metrics --since 24h # Last 24 hours +core ai metrics --json # JSON output +``` + +Duration format: `Nd` (days), `Nh` (hours), `Nm` (minutes). 
+ +### MCP Tools + +The metrics system is also exposed via two MCP tools: + +| Tool | Description | +|------|-------------| +| `metrics_record` | Record an event to the JSONL store | +| `metrics_query` | Query and summarise events for a time period | + +## Lab Dashboard + +The `cmd/lab/` package provides a homelab monitoring dashboard with real-time data collection: + +```bash +core lab serve # Start on :8080 +core lab serve --bind :9090 # Custom port +``` + +### Collectors + +The dashboard aggregates data from multiple sources: + +| Collector | Interval | Source | +|-----------|----------|--------| +| System | 60s | Local machine stats | +| Prometheus | Configurable | Prometheus endpoint | +| HuggingFace | Configurable | HF model metadata | +| Docker | Configurable | Docker container status | +| Forgejo | Configurable | Forge CI/CD status | +| Training | Configurable | ML training run status | +| Services | 60s | Service health checks | +| InfluxDB | Configurable | Time-series metrics | + +### Routes + +**Web pages:** `/`, `/models`, `/training`, `/dataset`, `/agents`, `/services` + +**JSON API:** `/api/status`, `/api/models`, `/api/training`, `/api/dataset`, `/api/runs`, `/api/agents`, `/api/services` + +**Live updates:** `/events` (Server-Sent Events) + +**Health:** `/health` + +## RAG CLI + +The `cmd/rag/` package re-exports `go-rag`'s CLI commands for use within the `core` binary: + +```go +var AddRAGSubcommands = ragcmd.AddRAGSubcommands +``` + +This makes RAG operations (ingest, query, collection management) available as `core rag` subcommands without duplicating the implementation. diff --git a/docs/ide-bridge.md b/docs/ide-bridge.md new file mode 100644 index 0000000..b3941d7 --- /dev/null +++ b/docs/ide-bridge.md @@ -0,0 +1,223 @@ +--- +title: IDE Bridge +description: IDE integration bridge connecting the MCP server to Laravel core-agentic via WebSocket. +--- + +# IDE Bridge + +The `mcp/ide` package implements the IDE subsystem. 
It bridges the desktop MCP server to the Laravel `core-agentic` backend via a persistent WebSocket connection, enabling AI agents to interact with agent sessions, builds, and the platform dashboard. + +## Architecture + +``` +MCP Client (Claude, Cursor, etc.) + | + v MCP JSON-RPC +go-ai MCP Server + | + v ide_* tool calls +IDE Subsystem (mcp/ide/) + | + +-- Bridge -----------> Laravel core-agentic + | (WebSocket) ws://localhost:9876/ws + | + +-- ws.Hub <------------ Bridge dispatch + | + v real-time updates + Local WebSocket subscribers (browser UIs) +``` + +## Subsystem Structure + +```go +type Subsystem struct { + cfg Config + bridge *Bridge // nil in headless mode + hub *ws.Hub // local WebSocket hub for real-time forwarding +} +``` + +When a `ws.Hub` is provided, the subsystem creates a `Bridge` that actively connects to Laravel. Without a hub (`hub == nil`), the subsystem runs in **headless mode**: tools are still registered and return stub responses, but no real-time forwarding occurs. + +## Configuration + +```go +type Config struct { + LaravelWSURL string // WebSocket endpoint (default: ws://localhost:9876/ws) + WorkspaceRoot string // local path for workspace context + Token string // Bearer token for Authorization header + ReconnectInterval time.Duration // base backoff (default: 2s) + MaxReconnectInterval time.Duration // cap for exponential backoff (default: 30s) +} +``` + +All fields are overridable via functional options: + +```go +sub := ide.New(hub, + ide.WithLaravelURL("ws://custom:9876/ws"), + ide.WithToken("my-bearer-token"), + ide.WithWorkspaceRoot("/path/to/project"), + ide.WithReconnectInterval(5 * time.Second), +) +``` + +## WebSocket Bridge + +The `Bridge` maintains a persistent WebSocket connection to Laravel and forwards inbound messages to the local `ws.Hub`. 
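Internally the bridge is a small struct guarding its connection state behind a mutex. A rough sketch — the field names here are assumptions inferred from this page, not the exact source:

```go
package main

import (
	"fmt"
	"sync"
)

// A rough sketch of the bridge's internal state. Field names are
// assumptions inferred from this page, not the exact source.
type Config struct {
	LaravelWSURL string
}

type Bridge struct {
	cfg       Config
	mu        sync.Mutex // guards conn and connected (see Send below)
	conn      any        // *websocket.Conn in the real implementation
	connected bool
}

// Connected reports whether the WebSocket is currently established.
func (b *Bridge) Connected() bool {
	b.mu.Lock()
	defer b.mu.Unlock()
	return b.connected
}

func main() {
	b := &Bridge{cfg: Config{LaravelWSURL: "ws://localhost:9876/ws"}}
	fmt.Println(b.Connected()) // false until dial succeeds
}
```

Both `Send` and the read loop take the mutex before touching the connection, which is why a `Connected()` accessor like this is safe to poll from tests.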
+ +### Connection Lifecycle + +``` +StartBridge(ctx) + +-- go connectLoop(ctx) + +-- dial(ctx) <-- WebSocket upgrade with Bearer token + | sets b.connected = true + +-- readLoop(ctx) <-- blocks reading frames + +-- dispatch(msg) <-- routes to ws.Hub channel + [on read error] + sets b.connected = false, returns to connectLoop +``` + +### Exponential Backoff + +When the connection drops or fails to establish, the bridge uses exponential backoff: + +```go +delay := b.cfg.ReconnectInterval // starts at 2s +for { + if err := b.dial(ctx); err != nil { + // wait delay, then double it up to MaxReconnectInterval + delay = min(delay*2, b.cfg.MaxReconnectInterval) + continue + } + delay = b.cfg.ReconnectInterval // reset on successful connection + b.readLoop(ctx) +} +``` + +Backoff sequence with defaults: 2s, 4s, 8s, 16s, 30s, 30s, ... The delay resets to 2s on every successful connection. + +### Authentication + +```go +var header http.Header +if b.cfg.Token != "" { + header = http.Header{} + header.Set("Authorization", "Bearer "+b.cfg.Token) +} +conn, _, err := dialer.DialContext(ctx, b.cfg.LaravelWSURL, header) +``` + +When `Token` is empty, no `Authorization` header is sent. This is appropriate for development environments running without authentication. 
+ +### Message Dispatch + +Inbound frames from Laravel are deserialised into `BridgeMessage`: + +```go +type BridgeMessage struct { + Type string `json:"type"` + Channel string `json:"channel,omitempty"` + SessionID string `json:"sessionId,omitempty"` + Data any `json:"data,omitempty"` + Timestamp time.Time `json:"timestamp"` +} +``` + +The `dispatch` method routes messages to the local `ws.Hub`: + +```go +func (b *Bridge) dispatch(msg BridgeMessage) { + channel := msg.Channel + if channel == "" { + channel = "ide:" + msg.Type // synthetic channel name + } + b.hub.SendToChannel(channel, ws.Message{Type: ws.TypeEvent, Data: msg.Data}) +} +``` + +This allows browser-based UIs connected to the local WebSocket hub to receive real-time updates from Laravel without polling. + +### Outbound Messages + +MCP tool handlers call `bridge.Send()` to push requests to Laravel: + +```go +func (b *Bridge) Send(msg BridgeMessage) error { + b.mu.Lock() + defer b.mu.Unlock() + if b.conn == nil { + return fmt.Errorf("bridge: not connected") + } + msg.Timestamp = time.Now() + data, _ := json.Marshal(msg) + return b.conn.WriteMessage(websocket.TextMessage, data) +} +``` + +A mutex ensures `Send` and the `readLoop` do not race on `b.conn`. If the bridge is disconnected, `Send` returns an error which propagates to the MCP client as a JSON-RPC error. + +## IDE Tool Groups + +The subsystem registers 11 tools across three groups. 
+ +### Chat Tools (`tools_chat.go`) + +| Tool | Description | +|------|-------------| +| `ide_chat_send` | Send a message to an agent chat session | +| `ide_chat_history` | Retrieve message history for a session | +| `ide_session_list` | List active agent sessions | +| `ide_session_create` | Create a new agent session | +| `ide_plan_status` | Get current plan status for a session | + +### Build Tools (`tools_build.go`) + +| Tool | Description | +|------|-------------| +| `ide_build_status` | Get the status of a specific build | +| `ide_build_list` | List recent builds, optionally filtered by repository | +| `ide_build_logs` | Retrieve log output for a build | + +### Dashboard Tools (`tools_dashboard.go`) + +| Tool | Description | +|------|-------------| +| `ide_dashboard_overview` | High-level platform overview (repos, services, sessions, builds, bridge status) | +| `ide_dashboard_activity` | Recent activity feed | +| `ide_dashboard_metrics` | Aggregate build and agent metrics for a time period | + +### Tool Behaviour + +All IDE tools follow a **fire-and-forward** pattern: the tool sends a `BridgeMessage` to Laravel and returns an immediate acknowledgement or stub response. Real data arrives asynchronously via the WebSocket read loop and is forwarded to `ws.Hub` subscribers. + +The `ide_dashboard_overview` tool is the one exception -- it reads `bridge.Connected()` synchronously to populate the `BridgeOnline` field. + +## Registration + +The IDE subsystem is registered during MCP server construction: + +```go +svc, err := mcp.New( + mcp.WithWSHub(hub), + mcp.WithSubsystem(ide.New(hub, ide.WithToken(token))), +) +``` + +The subsystem implements `SubsystemWithShutdown`, closing the bridge connection gracefully when the MCP server shuts down. + +## Testing + +Bridge tests use `net/http/httptest` to stand up a real WebSocket server in-process. 
This keeps tests hermetic while exercising: + +- Reconnection logic and exponential backoff +- Authentication header injection +- Message dispatch routing +- Server shutdown detection + +The `waitConnected` helper polls `bridge.Connected()` with a deadline rather than using fixed sleeps. + +All 11 tool handlers are tested under two conditions: +- **nil bridge** -- verifies the error path +- **connected mock bridge** -- verifies the success path with JSON round-trip validation diff --git a/docs/index.md b/docs/index.md new file mode 100644 index 0000000..caa7b26 --- /dev/null +++ b/docs/index.md @@ -0,0 +1,130 @@ +--- +title: go-ai Overview +description: The AI integration hub for the Lethean Go ecosystem — MCP server, metrics, and facade. +--- + +# go-ai + +**Module**: `forge.lthn.ai/core/go-ai` +**Language**: Go 1.26 +**Licence**: EUPL-1.2 + +go-ai is the **integration hub** for the Lethean AI stack. It imports specialised modules and exposes them as a unified MCP server with IDE bridge support, metrics recording, and a thin AI facade. + +## Architecture + +``` +AI Clients (Claude, Cursor, any MCP-capable IDE) + | MCP JSON-RPC (stdio / TCP / Unix) + v + [ go-ai MCP Server ] <-- this module + | | | + | | +-- ide/ subsystem --> Laravel core-agentic (WebSocket) + | +-- go-rag -----------------> Qdrant + Ollama + +-- go-ml ---------------------------> inference backends (go-mlx, go-rocm, ...) + + Core CLI (forge.lthn.ai/core/cli) bootstraps and wires everything +``` + +go-ai is a pure library module. It contains no `main` package. The Core CLI (`core mcp serve`) imports `forge.lthn.ai/core/go-ai/mcp`, constructs a `mcp.Service`, and calls `Run()`. 
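The construction style used throughout (`mcp.New`, `ide.New`) is the functional-options pattern. A self-contained miniature of that pattern — all names illustrative, not the real API:

```go
package main

import "fmt"

// A self-contained miniature of the functional-options pattern.
// All names are illustrative, not the real go-ai API.
type Service struct {
	workspaceRoot string
	subsystems    []string
}

type Option func(*Service) error

func WithWorkspaceRoot(root string) Option {
	return func(s *Service) error { s.workspaceRoot = root; return nil }
}

func WithSubsystem(name string) Option {
	return func(s *Service) error { s.subsystems = append(s.subsystems, name); return nil }
}

// New applies each option in order before anything else happens,
// returning on the first error encountered.
func New(opts ...Option) (*Service, error) {
	s := &Service{}
	for _, opt := range opts {
		if err := opt(s); err != nil {
			return nil, err
		}
	}
	return s, nil
}

func main() {
	svc, _ := New(WithWorkspaceRoot("/tmp/project"), WithSubsystem("ide"))
	fmt.Println(svc.workspaceRoot, len(svc.subsystems)) // /tmp/project 1
}
```

The real `mcp.New` applies options the same way: in order, before any tools are registered.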
+ +## Package Layout + +``` +go-ai/ ++-- ai/ # AI facade: RAG queries and JSONL metrics +| +-- ai.go # Package documentation and composition overview +| +-- rag.go # QueryRAGForTask() with graceful degradation +| +-- metrics.go # Event, Record(), ReadEvents(), Summary() +| ++-- cmd/ # CLI command registrations +| +-- daemon/ # core daemon (MCP server lifecycle) +| +-- metrics/ # core ai metrics viewer +| +-- rag/ # re-exports go-rag CLI commands +| +-- security/ # security scanning tools (deps, alerts, secrets, scan, jobs) +| +-- lab/ # homelab monitoring dashboard +| +-- embed-bench/ # embedding model benchmark utility +| ++-- docs/ # This documentation +``` + +The MCP server and all its tool subsystems are provided by the separate `forge.lthn.ai/core/mcp` module. go-ai wires that server together with the `ai/` facade and the CLI command registrations. + +## Imported Modules + +| Module | Purpose | +|--------|---------| +| `forge.lthn.ai/core/go-ml` | Inference backends, scoring engine | +| `forge.lthn.ai/core/go-rag` | Vector search, embeddings | +| `forge.lthn.ai/core/go-inference` | Shared TextModel/Backend interfaces | +| `forge.lthn.ai/core/go-process` | Process lifecycle management | +| `forge.lthn.ai/core/go-log` | Structured logging with security levels | +| `forge.lthn.ai/core/go-io` | Sandboxed filesystem abstraction | +| `forge.lthn.ai/core/go-i18n` | Internationalisation | + +## Quick Start + +go-ai is not run directly. 
It is consumed by the Core CLI: + +```bash +# Start the MCP server on stdio (default) +core mcp serve + +# Start on TCP +core mcp serve --mcp-transport tcp --mcp-addr 127.0.0.1:9100 + +# Run as a background daemon +core daemon start + +# View AI metrics +core ai metrics --since 7d +``` + +## Documentation + +| Page | Description | +|------|-------------| +| [MCP Server](mcp-server.md) | Protocol implementation, transports, tool registration | +| [ML Pipeline](ml-pipeline.md) | ML scoring, model management, inference backends | +| [RAG Pipeline](rag.md) | Retrieval-augmented generation, vector search | +| [Agentic Client](agentic.md) | Security scanning, metrics, CLI commands | +| [IDE Bridge](ide-bridge.md) | IDE integration, WebSocket bridge to Laravel | + +## Build and Test + +```bash +go test ./... # Run all tests +go test -run TestName ./... # Run a single test +go test -v -race ./... # Verbose with race detector +go build ./... # Verify compilation (library -- no binary) +go vet ./... 
# Vet +``` + +Tests follow the `_Good`, `_Bad`, `_Ugly` suffix convention: + +- `_Good` -- Happy path, valid input +- `_Bad` -- Expected error conditions +- `_Ugly` -- Panics and edge cases + +## Dependencies + +### Direct + +| Module | Role | +|--------|------| +| `forge.lthn.ai/core/cli` | CLI framework (cobra-based command registration) | +| `forge.lthn.ai/core/go-api` | API server framework | +| `forge.lthn.ai/core/go-i18n` | Internationalisation strings | +| `forge.lthn.ai/core/go-inference` | Shared inference interfaces | +| `forge.lthn.ai/core/go-io` | Filesystem abstraction | +| `forge.lthn.ai/core/go-log` | Structured logging | +| `forge.lthn.ai/core/go-ml` | ML scoring and inference | +| `forge.lthn.ai/core/go-process` | Process lifecycle | +| `forge.lthn.ai/core/go-rag` | RAG pipeline | +| `github.com/modelcontextprotocol/go-sdk` | MCP Go SDK | +| `github.com/gorilla/websocket` | WebSocket client (IDE bridge) | +| `github.com/gin-gonic/gin` | HTTP router | + +### Indirect (via go-ml and go-rag) + +`go-mlx`, `go-rocm`, `go-duckdb`, `parquet-go`, `ollama`, `qdrant/go-client`, and the Arrow ecosystem are transitive dependencies not imported directly by go-ai. diff --git a/docs/mcp-server.md b/docs/mcp-server.md new file mode 100644 index 0000000..55c0b4c --- /dev/null +++ b/docs/mcp-server.md @@ -0,0 +1,228 @@ +--- +title: MCP Server +description: Model Context Protocol server implementation, transports, and tool registration. +--- + +# MCP Server + +The MCP server is the core of go-ai. It exposes 49 tools across file operations, RAG vector search, ML inference, process management, WebSocket streaming, browser automation, metrics, and IDE integration via the [Model Context Protocol](https://modelcontextprotocol.io/). + +## The Service Struct + +`mcp.Service` is the central container. 
It wraps the upstream MCP Go SDK server and owns all optional services: + +```go +type Service struct { + server *mcp.Server // upstream go-sdk server instance + workspaceRoot string // sandboxed root for file operations + medium io.Medium // filesystem abstraction (sandboxed or global) + subsystems []Subsystem // plugin subsystems registered via WithSubsystem + logger *log.Logger // audit logger for tool execution + processService *process.Service // optional: process lifecycle management + wsHub *ws.Hub // optional: WebSocket hub for streaming +} +``` + +## Construction + +`New()` uses functional options. All options are applied before tools are registered: + +```go +svc, err := mcp.New( + mcp.WithWorkspaceRoot("/path/to/project"), + mcp.WithProcessService(ps), + mcp.WithWSHub(hub), + mcp.WithSubsystem(ide.New(hub, ide.WithToken(token))), + mcp.WithSubsystem(mcp.NewMLSubsystem(mlSvc)), +) +``` + +**Construction sequence:** + +1. Allocate `Service` with an empty `mcp.Server` (name `core-cli`, version `0.1.0`). +2. Default workspace root to `os.Getwd()` and create a sandboxed medium. +3. Apply each `Option` in order. +4. Register built-in file, directory, and language tools (10 tools). +5. Register RAG, metrics, and conditionally WebSocket and process tools. +6. Iterate subsystems and call `sub.RegisterTools(s.server)` for each plugin. + +### Available Options + +| Option | Effect | +|--------|--------| +| `WithWorkspaceRoot(root)` | Restrict file operations to `root`; empty string removes restriction | +| `WithProcessService(ps)` | Enable process management tools | +| `WithWSHub(hub)` | Enable WebSocket streaming tools | +| `WithSubsystem(sub)` | Append a Subsystem plugin | + +## Workspace Sandboxing + +The `io.Medium` abstraction (from `forge.lthn.ai/core/go-io`) isolates file access. When a workspace root is configured, every read, write, list, and stat call is validated against that root. 
Paths that escape the sandbox are rejected before reaching the operating system.

```go
func WithWorkspaceRoot(root string) Option {
	return func(s *Service) error {
		if root == "" {
			s.medium = io.Local // unrestricted global filesystem
			return nil
		}
		abs, _ := filepath.Abs(root)
		m, err := io.NewSandboxed(abs)
		if err != nil {
			return err
		}
		s.medium = m
		return nil
	}
}
```

An empty root switches the medium to `io.Local` with no path restrictions. Production deployments should always provide an explicit root.

## Transports

The server supports three transports. `Run()` auto-selects between stdio and TCP based on the `MCP_ADDR` environment variable.

### Stdio (default)

Standard integration mode for AI clients (Claude, Cursor) that spawn the server as a subprocess:

```go
func (s *Service) ServeStdio(ctx context.Context) error {
	return s.server.Run(ctx, &mcp.StdioTransport{})
}
```

`Run()` delegates to `ServeStdio` when `MCP_ADDR` is unset.

### TCP

```go
const DefaultTCPAddr = "127.0.0.1:9100"
```

Each accepted TCP connection receives its own fresh `mcp.Server` instance to prevent per-session state from leaking between clients. Messages are framed as newline-delimited JSON-RPC with a 10 MB maximum message size.

```bash
# Start in TCP mode
MCP_ADDR=127.0.0.1:9100 core mcp serve
```

A warning is emitted when binding to `0.0.0.0`; local-only access is strongly preferred.

### Unix Domain Socket

```go
func (s *Service) ServeUnix(ctx context.Context, socketPath string) error
```

The socket file is removed before binding (to recover from unclean shutdowns) and again on shutdown. Like TCP, each connection spawns an independent server instance. Logging uses the `Security` level because socket access implies filesystem-based access control.
+ +### Transport Comparison + +| Transport | Activation | Use Case | +|-----------|-----------|----------| +| Stdio | No `MCP_ADDR` set | AI client subprocess integration | +| TCP | `MCP_ADDR=host:port` | Remote clients, multi-client daemons | +| Unix | Explicit `ServeUnix()` call | Local IPC with OS-level access control | + +## Subsystem Plugin Model + +### Interfaces + +```go +// Subsystem registers additional MCP tools at startup. +type Subsystem interface { + Name() string + RegisterTools(server *mcp.Server) +} + +// SubsystemWithShutdown extends Subsystem with graceful cleanup. +type SubsystemWithShutdown interface { + Subsystem + Shutdown(ctx context.Context) error +} +``` + +`RegisterTools` is called once during `New()`, after built-in tools are registered. `Shutdown` is optional -- the `Service.Shutdown(ctx)` method type-asserts each subsystem and calls `Shutdown` if implemented. + +### Built-in and Plugin Subsystems + +| Subsystem | Type | Source | +|-----------|------|--------| +| File, directory, language tools | Built-in | `mcp/mcp.go` | +| RAG tools | Built-in | `mcp/tools_rag.go` | +| Metrics tools | Built-in | `mcp/tools_metrics.go` | +| Process tools | Built-in (conditional) | `mcp/tools_process.go` | +| WebSocket tools | Built-in (conditional) | `mcp/tools_ws.go` | +| Webview tools | Built-in | `mcp/tools_webview.go` | +| ML subsystem | Plugin (`MLSubsystem`) | `mcp/tools_ml.go` | +| IDE subsystem | Plugin (`ide.Subsystem`) | `mcp/ide/` | + +## Tool Registration Pattern + +Every tool follows an identical pattern: a descriptor with name and description, and a typed handler: + +```go +mcp.AddTool(server, &mcp.Tool{ + Name: "file_read", + Description: "Read the contents of a file", +}, s.readFile) +``` + +The handler signature is: + +```go +func(ctx context.Context, req *mcp.CallToolRequest, input InputStruct) (*mcp.CallToolResult, OutputStruct, error) +``` + +The MCP Go SDK deserialises JSON-RPC `params` into `InputStruct` and serialises 
`OutputStruct` into the response. Returning a non-nil error produces a JSON-RPC error response. + +### Audit Logging + +Mutating operations (`file_write`, `file_delete`, `rag_ingest`, `ws_start`) are logged at `Security` level. Read-only operations use `Info`. The current OS username is captured via `log.Username()` and attached to every log entry. + +## Full Tool Inventory + +49 tools across 12 groups: + +| Group | Tools | Source | +|-------|-------|--------| +| File operations | `file_read`, `file_write`, `file_delete`, `file_rename`, `file_exists`, `file_edit` | `mcp/mcp.go` | +| Directory operations | `dir_list`, `dir_create` | `mcp/mcp.go` | +| Language detection | `lang_detect`, `lang_list` | `mcp/mcp.go` | +| RAG | `rag_query`, `rag_ingest`, `rag_collections` | `mcp/tools_rag.go` | +| ML inference | `ml_generate`, `ml_score`, `ml_probe`, `ml_status`, `ml_backends` | `mcp/tools_ml.go` | +| Metrics | `metrics_record`, `metrics_query` | `mcp/tools_metrics.go` | +| Process management | `process_start`, `process_stop`, `process_kill`, `process_list`, `process_output`, `process_input` | `mcp/tools_process.go` | +| WebSocket | `ws_start`, `ws_info` | `mcp/tools_ws.go` | +| Browser automation | `webview_connect`, `webview_disconnect`, `webview_navigate`, `webview_click`, `webview_type`, `webview_query`, `webview_console`, `webview_eval`, `webview_screenshot`, `webview_wait` | `mcp/tools_webview.go` | +| IDE chat | `ide_chat_send`, `ide_chat_history`, `ide_session_list`, `ide_session_create`, `ide_plan_status` | `mcp/ide/tools_chat.go` | +| IDE build | `ide_build_status`, `ide_build_list`, `ide_build_logs` | `mcp/ide/tools_build.go` | +| IDE dashboard | `ide_dashboard_overview`, `ide_dashboard_activity`, `ide_dashboard_metrics` | `mcp/ide/tools_dashboard.go` | + +## Daemon Mode + +The `cmd/daemon` package provides background service management: + +```go +type Config struct { + MCPTransport string // stdio, tcp, socket + MCPAddr string // address/path for tcp or 
socket + HealthAddr string // health check endpoint (default: 127.0.0.1:9101) + PIDFile string // PID file path +} +``` + +Configuration can be set via environment variables: + +| Variable | Default | Description | +|----------|---------|-------------| +| `CORE_MCP_TRANSPORT` | `tcp` | Transport type | +| `CORE_MCP_ADDR` | `127.0.0.1:9100` | Listen address | +| `CORE_HEALTH_ADDR` | `127.0.0.1:9101` | Health endpoint | +| `CORE_PID_FILE` | `~/.core/daemon.pid` | PID file | + +```bash +core daemon start # Start in background +core daemon start --mcp-transport socket # Unix socket mode +core daemon stop # Graceful shutdown +core daemon status # Check if running +``` diff --git a/docs/ml-pipeline.md b/docs/ml-pipeline.md new file mode 100644 index 0000000..f8109ff --- /dev/null +++ b/docs/ml-pipeline.md @@ -0,0 +1,144 @@ +--- +title: ML Pipeline +description: ML scoring, model management, and inference backend integration. +--- + +# ML Pipeline + +The ML pipeline in go-ai connects the MCP server to the scoring and inference capabilities provided by `go-ml` and `go-inference`. Five MCP tools expose generation, scoring, capability probes, and backend discovery. + +## Architecture + +``` +MCP Client + | tools/call ml_generate + v +MLSubsystem (go-ai/mcp/tools_ml.go) + | + +-- ml.Service (go-ml) + | +-- InferenceAdapter --> inference.TextModel (go-inference) + | +-- ScoringEngine (heuristic scores) + | +-- JudgeBackend (LLM-as-judge) + | + +-- inference.List() / inference.Get() / inference.Default() + +-- go-mlx (Metal GPU, macOS) + +-- go-rocm (AMD ROCm, Linux) + +-- Ollama (HTTP subprocess) +``` + +## ML Tools + +### `ml_generate` + +Generate text using the active inference backend. 
+ +| Parameter | Type | Required | Description | +|-----------|------|----------|-------------| +| `prompt` | `string` | Yes | The text prompt | +| `model` | `string` | No | Model name (uses default if omitted) | +| `max_tokens` | `int` | No | Maximum tokens to generate | +| `temperature` | `float64` | No | Sampling temperature | + +Returns the generated text and metadata about which backend and model were used. + +### `ml_score` + +Score content using the heuristic scoring engine. Supports three scoring modes: + +- **Heuristic** -- Pattern-based scoring across multiple dimensions (emotional register, sycophancy detection, vocabulary diversity, etc.) +- **Semantic** -- LLM-as-judge evaluation using a secondary model +- **Content** -- Combined scoring pipeline + +| Parameter | Type | Required | Description | +|-----------|------|----------|-------------| +| `content` | `string` | Yes | Text to score | +| `mode` | `string` | No | `heuristic`, `semantic`, or `content` | + +Returns dimension-level scores and an aggregate verdict. + +### `ml_probe` + +Run capability probes against the active model. Probes test specific model capabilities (instruction following, reasoning, factual recall, etc.). There are 23 built-in probes. + +| Parameter | Type | Required | Description | +|-----------|------|----------|-------------| +| `probe` | `string` | No | Specific probe name (runs all if omitted) | + +### `ml_status` + +Query the ML pipeline status, including active backends, loaded models, and InfluxDB pipeline health. + +### `ml_backends` + +List all registered inference backends and their availability status. Calls `inference.List()`, `inference.Get()`, and `inference.Default()` from the `go-inference` registry. + +Returns an array of backends with their names, availability flags, and which is currently the default. + +## Inference Backend Registry + +The `go-inference` module provides a global registry for inference backends. 
Backends are registered at startup and can be queried at runtime: + +```go +// Querying available backends (used by ml_backends tool) +backends := inference.List() // All registered backends +backend := inference.Get("mlx") // Specific backend by name +def := inference.Default() // Currently active default +``` + +### Supported Backends + +| Backend | Module | Platform | Description | +|---------|--------|----------|-------------| +| MLX | `go-mlx` | macOS (Apple Silicon) | Native Metal GPU inference | +| ROCm | `go-rocm` | Linux (AMD GPU) | AMD ROCm GPU inference via llama-server | +| Ollama | `go-ml` | Any | HTTP-based inference via Ollama subprocess | + +## Scoring Engine + +The scoring engine in `go-ml` provides heuristic analysis across multiple dimensions. Each dimension produces a normalised score (0.0 to 1.0) and a qualitative verdict. + +Scoring dimensions include: +- Emotional register (positive and negative pattern detection) +- Sycophancy detection +- Vocabulary diversity +- Sentence complexity +- Repetition analysis +- Format adherence + +The `ml_score` tool delegates directly to `go-ml`'s `ml.Service` rather than routing through `go-inference`, since the scoring engine is specific to go-ml and not an abstract backend capability. + +## Integration with the MCP Server + +The ML subsystem is registered as a plugin during MCP server construction: + +```go +svc, err := mcp.New( + mcp.WithSubsystem(mcp.NewMLSubsystem(mlSvc)), +) +``` + +`MLSubsystem` implements the `Subsystem` interface and registers all five ML tools when `RegisterTools` is called. 
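A miniature of that plugin shape, with a stand-in `Server` type (real registration goes through `mcp.AddTool` on the SDK server; everything below is illustrative):

```go
package main

import "fmt"

// Stand-in Server type; the real one is *mcp.Server from the SDK.
type Server struct{ tools []string }

func (s *Server) AddTool(name string) { s.tools = append(s.tools, name) }

// Miniature of the Subsystem interface described in the MCP Server docs.
type Subsystem interface {
	Name() string
	RegisterTools(server *Server)
}

type MLSubsystem struct{}

func (MLSubsystem) Name() string { return "ml" }

func (MLSubsystem) RegisterTools(server *Server) {
	// The five ML tools listed above.
	for _, t := range []string{"ml_generate", "ml_score", "ml_probe", "ml_status", "ml_backends"} {
		server.AddTool(t)
	}
}

func main() {
	srv := &Server{}
	var sub Subsystem = MLSubsystem{}
	sub.RegisterTools(srv)
	fmt.Println(sub.Name(), len(srv.tools)) // ml 5
}
```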
+
+## Testing
+
+ML tools can be tested with mock backends that satisfy the `ml.Backend` and `inference.Backend` interfaces:
+
+```go
+type mockMLBackend struct {
+	name         string
+	available    bool
+	generateResp string
+	generateErr  error
+}
+
+func (m *mockMLBackend) Name() string    { return m.name }
+func (m *mockMLBackend) Available() bool { return m.available }
+
+// The remaining interface methods (not shown) return the canned
+// generateResp / generateErr values.
+```
+
+Register lightweight mocks for CI environments where GPU backends and model weights are not available:
+
+```go
+inference.Register(&mockInferenceBackend{name: "test-ci-mock", available: true})
+```
+
+Note that `inference.Register` is global state -- use unique names to avoid conflicts between parallel test runs.
diff --git a/docs/rag.md b/docs/rag.md
new file mode 100644
index 0000000..5b8749b
--- /dev/null
+++ b/docs/rag.md
@@ -0,0 +1,144 @@
+---
+title: RAG Pipeline
+description: Retrieval-augmented generation via Qdrant vector search and Ollama embeddings.
+---
+
+# RAG Pipeline
+
+go-ai integrates with the RAG (Retrieval-Augmented Generation) pipeline provided by `go-rag`. This surfaces as three MCP tools for vector search and a high-level facade function for programmatic use.
+
+## Architecture
+
+```
+MCP Client                                  Programmatic callers
+    |                                               |
+    v                                               v
+rag_query / rag_ingest / rag_collections    ai.QueryRAGForTask()
+    |                                               |
+    +------------------- go-rag ------------------- +
+                |                  |
+                v                  v
+             Qdrant             Ollama
+            (vectors)        (embeddings)
+```
+
+## MCP Tools
+
+### `rag_query`
+
+Query the vector database for documents relevant to a natural-language question.
+
+| Parameter | Type | Required | Description |
+|-----------|------|----------|-------------|
+| `question` | `string` | Yes | Natural-language query |
+| `collection` | `string` | No | Qdrant collection name (default: `hostuk-docs`) |
+| `limit` | `int` | No | Maximum results to return (default: 3) |
+| `threshold` | `float64` | No | Minimum similarity score (default: 0.5) |
+
+The tool embeds the question via Ollama, searches Qdrant with the specified parameters, and returns formatted context with source references.
+
+### `rag_ingest`
+
+Ingest a file into the vector database. The file is chunked (for Markdown, this respects heading boundaries), each chunk is embedded via Ollama, and the resulting vectors are stored in Qdrant.
+
+| Parameter | Type | Required | Description |
+|-----------|------|----------|-------------|
+| `path` | `string` | Yes | Path to the file to ingest (relative to workspace root) |
+| `collection` | `string` | No | Target Qdrant collection |
+
+This tool is logged at `Security` level because it writes to the vector store.
+
+### `rag_collections`
+
+List all available collections in the connected Qdrant instance, with point counts and vector dimensions.
+
+## AI Facade: QueryRAGForTask
+
+The `ai` package provides a higher-level wrapper for programmatic RAG queries. It is used by agentic task planners to enrich task context without importing `go-rag` directly.
+ +```go +type TaskInfo struct { + Title string + Description string +} + +func QueryRAGForTask(task TaskInfo) (string, error) { + query := task.Title + " " + task.Description + + // Truncate to 500 runes to keep the embedding focused + runes := []rune(query) + if len(runes) > 500 { + query = string(runes[:500]) + } + + qdrantCfg := rag.DefaultQdrantConfig() + qdrantClient, err := rag.NewQdrantClient(qdrantCfg) + if err != nil { + return "", fmt.Errorf("rag qdrant client: %w", err) + } + defer qdrantClient.Close() + + ollamaCfg := rag.DefaultOllamaConfig() + ollamaClient, err := rag.NewOllamaClient(ollamaCfg) + if err != nil { + return "", fmt.Errorf("rag ollama client: %w", err) + } + + ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second) + defer cancel() + + results, err := rag.Query(ctx, qdrantClient, ollamaClient, query, rag.QueryConfig{ + Collection: "hostuk-docs", + Limit: 3, + Threshold: 0.5, + }) + if err != nil { + return "", fmt.Errorf("rag query: %w", err) + } + return rag.FormatResultsContext(results), nil +} +``` + +Key design decisions: +- The query is capped at **500 runes** to keep the embedding vector focused on the task's core intent +- A **10-second timeout** prevents hanging when services are slow +- The function returns an error rather than silently degrading, giving callers the choice of how to handle failures + +## External Service Dependencies + +### Qdrant + +Vector database storing embedded document chunks. + +- Default address: `localhost:6334` (gRPC) +- Configuration: `rag.DefaultQdrantConfig()` + +### Ollama + +Local LLM server providing embedding generation. + +- Default address: `localhost:11434` (HTTP) +- Configuration: `rag.DefaultOllamaConfig()` +- Default embedding model: `nomic-embed-text` + +Both services must be running for RAG tools to function. In CI, tests that touch RAG tools are guarded with `skipIfShort(t)`. 
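Returning to `rag_ingest`'s chunking step: splitting Markdown at heading boundaries can be sketched as starting a new chunk whenever a heading line appears. This is an assumption-laden illustration only; `go-rag`'s actual chunker may also cap chunk sizes or overlap adjacent chunks, and the function name here is invented.

```go
package main

import (
	"fmt"
	"strings"
)

// chunkMarkdown splits a Markdown document into chunks, starting a new
// chunk at each heading line ("#", "##", ...). Hypothetical sketch; a
// production chunker would also enforce a maximum chunk size.
func chunkMarkdown(doc string) []string {
	var chunks []string
	var cur []string
	flush := func() {
		if len(cur) > 0 {
			chunks = append(chunks, strings.Join(cur, "\n"))
			cur = nil
		}
	}
	for _, line := range strings.Split(doc, "\n") {
		if strings.HasPrefix(line, "#") {
			flush() // heading begins a fresh chunk
		}
		cur = append(cur, line)
	}
	flush()
	return chunks
}

func main() {
	doc := "# Intro\nSome text.\n## Usage\nMore text."
	fmt.Println(len(chunkMarkdown(doc))) // → 2
}
```

Each resulting chunk is what gets embedded via Ollama and stored as a point in Qdrant.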
+ +## Embedding Benchmark + +The `cmd/embed-bench/` utility compares embedding models for the OpenBrain knowledge store. It tests how well models separate semantically related vs unrelated agent memory pairs. + +```bash +go run ./cmd/embed-bench +go run ./cmd/embed-bench -ollama http://localhost:11434 +``` + +The benchmark evaluates: +- **Cluster separation** -- intra-group vs inter-group similarity +- **Query recall accuracy** -- top-1 and top-3 retrieval precision +- **Embedding throughput** -- milliseconds per memory + +Models tested: `nomic-embed-text` and `embeddinggemma`. + +## Testing + +RAG tool tests cover handler validation (empty question/path fields, default behaviour) and graceful degradation when Qdrant or Ollama are unavailable. Full RAG round-trip tests require live services and are skipped in short mode.
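For illustration, the benchmark's cluster-separation metric can be sketched as the gap between mean intra-group and mean inter-group cosine similarity: related memories should score high against each other and low against unrelated ones. The helper names below are assumptions for the example, not `embed-bench`'s actual code.

```go
package main

import (
	"fmt"
	"math"
)

// cosine returns the cosine similarity of two equal-length vectors.
func cosine(a, b []float64) float64 {
	var dot, na, nb float64
	for i := range a {
		dot += a[i] * b[i]
		na += a[i] * a[i]
		nb += b[i] * b[i]
	}
	if na == 0 || nb == 0 {
		return 0
	}
	return dot / (math.Sqrt(na) * math.Sqrt(nb))
}

// separation returns mean intra-group similarity minus mean inter-group
// similarity for two groups of embeddings (each with at least two
// members). Larger is better: related memories cluster together while
// unrelated ones stay apart.
func separation(groupA, groupB [][]float64) float64 {
	mean := func(xs, ys [][]float64, skipSelf bool) float64 {
		var sum float64
		var n int
		for i, x := range xs {
			for j, y := range ys {
				if skipSelf && i == j {
					continue // don't compare a vector with itself
				}
				sum += cosine(x, y)
				n++
			}
		}
		return sum / float64(n)
	}
	intra := (mean(groupA, groupA, true) + mean(groupB, groupB, true)) / 2
	inter := mean(groupA, groupB, false)
	return intra - inter
}

func main() {
	a := [][]float64{{1, 0}, {0.9, 0.1}}
	b := [][]float64{{0, 1}, {0.1, 0.9}}
	fmt.Printf("%.2f\n", separation(a, b)) // → 0.88
}
```

A model with good separation on agent-memory pairs should push this gap towards 1.0; a model that embeds everything similarly drives it towards 0.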