diff --git a/docs/agentic.md b/docs/agentic.md new file mode 100644 index 0000000..635c56e --- /dev/null +++ b/docs/agentic.md @@ -0,0 +1,214 @@ +--- +title: Agentic Client +description: Security scanning, metrics recording, and CLI command integrations. +--- + +# Agentic Client + +go-ai provides several CLI command packages that integrate AI-adjacent functionality into the `core` binary. These cover security scanning across multi-repo environments, AI metrics viewing, and homelab monitoring. + +## Security Scanning + +The `cmd/security/` package provides a comprehensive security scanning suite that queries GitHub's security APIs across repositories defined in a `repos.yaml` registry. + +### Command Structure + +``` +core security + +-- alerts # Unified view of all security alert types + +-- deps # Dependabot dependency vulnerability alerts + +-- scan # Code scanning alerts (CodeQL, etc.) + +-- secrets # Secret scanning alerts + +-- jobs # Create GitHub issues from scan findings +``` + +### Common Flags + +| Flag | Description | +|------|-------------| +| `--registry` | Path to `repos.yaml` (auto-detected if omitted) | +| `--repo` | Filter to a single repository | +| `--severity` | Filter by severity: `critical`, `high`, `medium`, `low` (comma-separated) | +| `--json` | Output as JSON instead of formatted table | +| `--target` | External repo target (e.g. 
`wailsapp/wails`) -- bypasses registry | + +### Alerts + +The `alerts` command provides a unified view combining Dependabot, code scanning, and secret scanning alerts: + +```bash +core security alerts # All repos in registry +core security alerts --repo core-php # Single repo +core security alerts --severity critical,high # Filter by severity +core security alerts --target wailsapp/wails # External repo +core security alerts --json # JSON output +``` + +### Dependencies + +Query Dependabot vulnerability alerts with upgrade suggestions: + +```bash +core security deps +core security deps --severity high +``` + +Output includes the vulnerable version range and the first patched version when available. + +### Code Scanning + +Query code scanning alerts from tools like CodeQL: + +```bash +core security scan +core security scan --tool codeql # Filter by scanning tool +``` + +Each scan execution records a metrics event via `ai.Record()` for tracking scan activity over time. + +### Secret Scanning + +Check for exposed secrets across repositories: + +```bash +core security secrets +core security secrets --json +``` + +Secrets are always treated as high severity. The output flags whether push protection was bypassed. + +### Jobs + +Create GitHub issues from security findings for agent-driven remediation: + +```bash +core security jobs --targets wailsapp/wails +core security jobs --targets wailsapp/wails --issue-repo host-uk/core +core security jobs --targets wailsapp/wails --dry-run +core security jobs --targets a/b,c/d --copies 3 +``` + +| Flag | Default | Description | +|------|---------|-------------| +| `--targets` | (required) | Comma-separated `owner/repo` targets | +| `--issue-repo` | `host-uk/core` | Repository where issues are created | +| `--dry-run` | `false` | Preview without creating issues | +| `--copies` | `1` | Number of issue copies per target | + +Each created issue includes a findings summary, checklist, and instructions. 
A metrics event is recorded for each issue created. + +## Metrics + +The `ai/` package provides a JSONL-based metrics recording system. Events are stored at: + +``` +~/.core/ai/metrics/YYYY-MM-DD.jsonl +``` + +### Event Structure + +```go +type Event struct { + Type string `json:"type"` + Timestamp time.Time `json:"timestamp"` + AgentID string `json:"agent_id,omitempty"` + Repo string `json:"repo,omitempty"` + Duration time.Duration `json:"duration,omitempty"` + Data map[string]any `json:"data,omitempty"` +} +``` + +### Recording Events + +```go +ai.Record(ai.Event{ + Type: "security.scan", + Timestamp: time.Now(), + Repo: "wailsapp/wails", + Data: map[string]any{ + "total": summary.Total, + "critical": summary.Critical, + }, +}) +``` + +Writing uses `O_APPEND` with a mutex for concurrent safety. Missing directories are created automatically. + +### Reading and Querying + +```go +events, err := ai.ReadEvents(since) // Read events from a time range +summary := ai.Summary(events) // Aggregate by type, repo, agent +``` + +`ReadEvents` iterates calendar days from `since` to today, opening each daily file. Missing files are silently skipped. Malformed JSONL lines are skipped without error. + +`Summary` returns a `map[string]any` with: +- `total` -- total event count +- `by_type` -- sorted slice of `{key, count}` maps +- `by_repo` -- sorted slice of `{key, count}` maps +- `by_agent` -- sorted slice of `{key, count}` maps + +### CLI Command + +```bash +core ai metrics # Last 7 days (default) +core ai metrics --since 30d # Last 30 days +core ai metrics --since 24h # Last 24 hours +core ai metrics --json # JSON output +``` + +Duration format: `Nd` (days), `Nh` (hours), `Nm` (minutes). 
+ +### MCP Tools + +The metrics system is also exposed via two MCP tools: + +| Tool | Description | +|------|-------------| +| `metrics_record` | Record an event to the JSONL store | +| `metrics_query` | Query and summarise events for a time period | + +## Lab Dashboard + +The `cmd/lab/` package provides a homelab monitoring dashboard with real-time data collection: + +```bash +core lab serve # Start on :8080 +core lab serve --bind :9090 # Custom port +``` + +### Collectors + +The dashboard aggregates data from multiple sources: + +| Collector | Interval | Source | +|-----------|----------|--------| +| System | 60s | Local machine stats | +| Prometheus | Configurable | Prometheus endpoint | +| HuggingFace | Configurable | HF model metadata | +| Docker | Configurable | Docker container status | +| Forgejo | Configurable | Forge CI/CD status | +| Training | Configurable | ML training run status | +| Services | 60s | Service health checks | +| InfluxDB | Configurable | Time-series metrics | + +### Routes + +**Web pages:** `/`, `/models`, `/training`, `/dataset`, `/agents`, `/services` + +**JSON API:** `/api/status`, `/api/models`, `/api/training`, `/api/dataset`, `/api/runs`, `/api/agents`, `/api/services` + +**Live updates:** `/events` (Server-Sent Events) + +**Health:** `/health` + +## RAG CLI + +The `cmd/rag/` package re-exports `go-rag`'s CLI commands for use within the `core` binary: + +```go +var AddRAGSubcommands = ragcmd.AddRAGSubcommands +``` + +This makes RAG operations (ingest, query, collection management) available as `core rag` subcommands without duplicating the implementation. diff --git a/docs/ide-bridge.md b/docs/ide-bridge.md new file mode 100644 index 0000000..b3941d7 --- /dev/null +++ b/docs/ide-bridge.md @@ -0,0 +1,223 @@ +--- +title: IDE Bridge +description: IDE integration bridge connecting the MCP server to Laravel core-agentic via WebSocket. +--- + +# IDE Bridge + +The `mcp/ide` package implements the IDE subsystem. 
It bridges the desktop MCP server to the Laravel `core-agentic` backend via a persistent WebSocket connection, enabling AI agents to interact with agent sessions, builds, and the platform dashboard. + +## Architecture + +``` +MCP Client (Claude, Cursor, etc.) + | + v MCP JSON-RPC +go-ai MCP Server + | + v ide_* tool calls +IDE Subsystem (mcp/ide/) + | + +-- Bridge -----------> Laravel core-agentic + | (WebSocket) ws://localhost:9876/ws + | + +-- ws.Hub <------------ Bridge dispatch + | + v real-time updates + Local WebSocket subscribers (browser UIs) +``` + +## Subsystem Structure + +```go +type Subsystem struct { + cfg Config + bridge *Bridge // nil in headless mode + hub *ws.Hub // local WebSocket hub for real-time forwarding +} +``` + +When a `ws.Hub` is provided, the subsystem creates a `Bridge` that actively connects to Laravel. Without a hub (`hub == nil`), the subsystem runs in **headless mode**: tools are still registered and return stub responses, but no real-time forwarding occurs. + +## Configuration + +```go +type Config struct { + LaravelWSURL string // WebSocket endpoint (default: ws://localhost:9876/ws) + WorkspaceRoot string // local path for workspace context + Token string // Bearer token for Authorization header + ReconnectInterval time.Duration // base backoff (default: 2s) + MaxReconnectInterval time.Duration // cap for exponential backoff (default: 30s) +} +``` + +All fields are overridable via functional options: + +```go +sub := ide.New(hub, + ide.WithLaravelURL("ws://custom:9876/ws"), + ide.WithToken("my-bearer-token"), + ide.WithWorkspaceRoot("/path/to/project"), + ide.WithReconnectInterval(5 * time.Second), +) +``` + +## WebSocket Bridge + +The `Bridge` maintains a persistent WebSocket connection to Laravel and forwards inbound messages to the local `ws.Hub`. 
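Internally the bridge is a small struct guarding its connection state behind a mutex. A rough sketch — the field names here are assumptions inferred from this page, not the exact source:

```go
package main

import (
	"fmt"
	"sync"
)

// A rough sketch of the bridge's internal state. Field names are
// assumptions inferred from this page, not the exact source.
type Config struct {
	LaravelWSURL string
}

type Bridge struct {
	cfg       Config
	mu        sync.Mutex // guards conn and connected (see Send below)
	conn      any        // *websocket.Conn in the real implementation
	connected bool
}

// Connected reports whether the WebSocket is currently established.
func (b *Bridge) Connected() bool {
	b.mu.Lock()
	defer b.mu.Unlock()
	return b.connected
}

func main() {
	b := &Bridge{cfg: Config{LaravelWSURL: "ws://localhost:9876/ws"}}
	fmt.Println(b.Connected()) // false until dial succeeds
}
```

Both `Send` and the read loop take the mutex before touching the connection, which is why a `Connected()` accessor like this is safe to poll from tests.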
+ +### Connection Lifecycle + +``` +StartBridge(ctx) + +-- go connectLoop(ctx) + +-- dial(ctx) <-- WebSocket upgrade with Bearer token + | sets b.connected = true + +-- readLoop(ctx) <-- blocks reading frames + +-- dispatch(msg) <-- routes to ws.Hub channel + [on read error] + sets b.connected = false, returns to connectLoop +``` + +### Exponential Backoff + +When the connection drops or fails to establish, the bridge uses exponential backoff: + +```go +delay := b.cfg.ReconnectInterval // starts at 2s +for { + if err := b.dial(ctx); err != nil { + // wait delay, then double it up to MaxReconnectInterval + delay = min(delay*2, b.cfg.MaxReconnectInterval) + continue + } + delay = b.cfg.ReconnectInterval // reset on successful connection + b.readLoop(ctx) +} +``` + +Backoff sequence with defaults: 2s, 4s, 8s, 16s, 30s, 30s, ... The delay resets to 2s on every successful connection. + +### Authentication + +```go +var header http.Header +if b.cfg.Token != "" { + header = http.Header{} + header.Set("Authorization", "Bearer "+b.cfg.Token) +} +conn, _, err := dialer.DialContext(ctx, b.cfg.LaravelWSURL, header) +``` + +When `Token` is empty, no `Authorization` header is sent. This is appropriate for development environments running without authentication. 
+ +### Message Dispatch + +Inbound frames from Laravel are deserialised into `BridgeMessage`: + +```go +type BridgeMessage struct { + Type string `json:"type"` + Channel string `json:"channel,omitempty"` + SessionID string `json:"sessionId,omitempty"` + Data any `json:"data,omitempty"` + Timestamp time.Time `json:"timestamp"` +} +``` + +The `dispatch` method routes messages to the local `ws.Hub`: + +```go +func (b *Bridge) dispatch(msg BridgeMessage) { + channel := msg.Channel + if channel == "" { + channel = "ide:" + msg.Type // synthetic channel name + } + b.hub.SendToChannel(channel, ws.Message{Type: ws.TypeEvent, Data: msg.Data}) +} +``` + +This allows browser-based UIs connected to the local WebSocket hub to receive real-time updates from Laravel without polling. + +### Outbound Messages + +MCP tool handlers call `bridge.Send()` to push requests to Laravel: + +```go +func (b *Bridge) Send(msg BridgeMessage) error { + b.mu.Lock() + defer b.mu.Unlock() + if b.conn == nil { + return fmt.Errorf("bridge: not connected") + } + msg.Timestamp = time.Now() + data, _ := json.Marshal(msg) + return b.conn.WriteMessage(websocket.TextMessage, data) +} +``` + +A mutex ensures `Send` and the `readLoop` do not race on `b.conn`. If the bridge is disconnected, `Send` returns an error which propagates to the MCP client as a JSON-RPC error. + +## IDE Tool Groups + +The subsystem registers 11 tools across three groups. 
+ +### Chat Tools (`tools_chat.go`) + +| Tool | Description | +|------|-------------| +| `ide_chat_send` | Send a message to an agent chat session | +| `ide_chat_history` | Retrieve message history for a session | +| `ide_session_list` | List active agent sessions | +| `ide_session_create` | Create a new agent session | +| `ide_plan_status` | Get current plan status for a session | + +### Build Tools (`tools_build.go`) + +| Tool | Description | +|------|-------------| +| `ide_build_status` | Get the status of a specific build | +| `ide_build_list` | List recent builds, optionally filtered by repository | +| `ide_build_logs` | Retrieve log output for a build | + +### Dashboard Tools (`tools_dashboard.go`) + +| Tool | Description | +|------|-------------| +| `ide_dashboard_overview` | High-level platform overview (repos, services, sessions, builds, bridge status) | +| `ide_dashboard_activity` | Recent activity feed | +| `ide_dashboard_metrics` | Aggregate build and agent metrics for a time period | + +### Tool Behaviour + +All IDE tools follow a **fire-and-forward** pattern: the tool sends a `BridgeMessage` to Laravel and returns an immediate acknowledgement or stub response. Real data arrives asynchronously via the WebSocket read loop and is forwarded to `ws.Hub` subscribers. + +The `ide_dashboard_overview` tool is the one exception -- it reads `bridge.Connected()` synchronously to populate the `BridgeOnline` field. + +## Registration + +The IDE subsystem is registered during MCP server construction: + +```go +svc, err := mcp.New( + mcp.WithWSHub(hub), + mcp.WithSubsystem(ide.New(hub, ide.WithToken(token))), +) +``` + +The subsystem implements `SubsystemWithShutdown`, closing the bridge connection gracefully when the MCP server shuts down. + +## Testing + +Bridge tests use `net/http/httptest` to stand up a real WebSocket server in-process. 
This keeps tests hermetic while exercising: + +- Reconnection logic and exponential backoff +- Authentication header injection +- Message dispatch routing +- Server shutdown detection + +The `waitConnected` helper polls `bridge.Connected()` with a deadline rather than using fixed sleeps. + +All 11 tool handlers are tested under two conditions: +- **nil bridge** -- verifies the error path +- **connected mock bridge** -- verifies the success path with JSON round-trip validation diff --git a/docs/index.md b/docs/index.md new file mode 100644 index 0000000..caa7b26 --- /dev/null +++ b/docs/index.md @@ -0,0 +1,130 @@ +--- +title: go-ai Overview +description: The AI integration hub for the Lethean Go ecosystem — MCP server, metrics, and facade. +--- + +# go-ai + +**Module**: `forge.lthn.ai/core/go-ai` +**Language**: Go 1.26 +**Licence**: EUPL-1.2 + +go-ai is the **integration hub** for the Lethean AI stack. It imports specialised modules and exposes them as a unified MCP server with IDE bridge support, metrics recording, and a thin AI facade. + +## Architecture + +``` +AI Clients (Claude, Cursor, any MCP-capable IDE) + | MCP JSON-RPC (stdio / TCP / Unix) + v + [ go-ai MCP Server ] <-- this module + | | | + | | +-- ide/ subsystem --> Laravel core-agentic (WebSocket) + | +-- go-rag -----------------> Qdrant + Ollama + +-- go-ml ---------------------------> inference backends (go-mlx, go-rocm, ...) + + Core CLI (forge.lthn.ai/core/cli) bootstraps and wires everything +``` + +go-ai is a pure library module. It contains no `main` package. The Core CLI (`core mcp serve`) imports `forge.lthn.ai/core/go-ai/mcp`, constructs a `mcp.Service`, and calls `Run()`. 
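The construction style used throughout (`mcp.New`, `ide.New`) is the functional-options pattern. A self-contained miniature of that pattern — all names illustrative, not the real API:

```go
package main

import "fmt"

// A self-contained miniature of the functional-options pattern.
// All names are illustrative, not the real go-ai API.
type Service struct {
	workspaceRoot string
	subsystems    []string
}

type Option func(*Service) error

func WithWorkspaceRoot(root string) Option {
	return func(s *Service) error { s.workspaceRoot = root; return nil }
}

func WithSubsystem(name string) Option {
	return func(s *Service) error { s.subsystems = append(s.subsystems, name); return nil }
}

// New applies each option in order before anything else happens,
// returning on the first error encountered.
func New(opts ...Option) (*Service, error) {
	s := &Service{}
	for _, opt := range opts {
		if err := opt(s); err != nil {
			return nil, err
		}
	}
	return s, nil
}

func main() {
	svc, _ := New(WithWorkspaceRoot("/tmp/project"), WithSubsystem("ide"))
	fmt.Println(svc.workspaceRoot, len(svc.subsystems)) // /tmp/project 1
}
```

The real `mcp.New` applies options the same way: in order, before any tools are registered.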
+ +## Package Layout + +``` +go-ai/ ++-- ai/ # AI facade: RAG queries and JSONL metrics +| +-- ai.go # Package documentation and composition overview +| +-- rag.go # QueryRAGForTask() with graceful degradation +| +-- metrics.go # Event, Record(), ReadEvents(), Summary() +| ++-- cmd/ # CLI command registrations +| +-- daemon/ # core daemon (MCP server lifecycle) +| +-- metrics/ # core ai metrics viewer +| +-- rag/ # re-exports go-rag CLI commands +| +-- security/ # security scanning tools (deps, alerts, secrets, scan, jobs) +| +-- lab/ # homelab monitoring dashboard +| +-- embed-bench/ # embedding model benchmark utility +| ++-- docs/ # This documentation +``` + +The MCP server and all its tool subsystems are provided by the separate `forge.lthn.ai/core/mcp` module. go-ai wires that server together with the `ai/` facade and the CLI command registrations. + +## Imported Modules + +| Module | Purpose | +|--------|---------| +| `forge.lthn.ai/core/go-ml` | Inference backends, scoring engine | +| `forge.lthn.ai/core/go-rag` | Vector search, embeddings | +| `forge.lthn.ai/core/go-inference` | Shared TextModel/Backend interfaces | +| `forge.lthn.ai/core/go-process` | Process lifecycle management | +| `forge.lthn.ai/core/go-log` | Structured logging with security levels | +| `forge.lthn.ai/core/go-io` | Sandboxed filesystem abstraction | +| `forge.lthn.ai/core/go-i18n` | Internationalisation | + +## Quick Start + +go-ai is not run directly. 
It is consumed by the Core CLI: + +```bash +# Start the MCP server on stdio (default) +core mcp serve + +# Start on TCP +core mcp serve --mcp-transport tcp --mcp-addr 127.0.0.1:9100 + +# Run as a background daemon +core daemon start + +# View AI metrics +core ai metrics --since 7d +``` + +## Documentation + +| Page | Description | +|------|-------------| +| [MCP Server](mcp-server.md) | Protocol implementation, transports, tool registration | +| [ML Pipeline](ml-pipeline.md) | ML scoring, model management, inference backends | +| [RAG Pipeline](rag.md) | Retrieval-augmented generation, vector search | +| [Agentic Client](agentic.md) | Security scanning, metrics, CLI commands | +| [IDE Bridge](ide-bridge.md) | IDE integration, WebSocket bridge to Laravel | + +## Build and Test + +```bash +go test ./... # Run all tests +go test -run TestName ./... # Run a single test +go test -v -race ./... # Verbose with race detector +go build ./... # Verify compilation (library -- no binary) +go vet ./... 
# Vet +``` + +Tests follow the `_Good`, `_Bad`, `_Ugly` suffix convention: + +- `_Good` -- Happy path, valid input +- `_Bad` -- Expected error conditions +- `_Ugly` -- Panics and edge cases + +## Dependencies + +### Direct + +| Module | Role | +|--------|------| +| `forge.lthn.ai/core/cli` | CLI framework (cobra-based command registration) | +| `forge.lthn.ai/core/go-api` | API server framework | +| `forge.lthn.ai/core/go-i18n` | Internationalisation strings | +| `forge.lthn.ai/core/go-inference` | Shared inference interfaces | +| `forge.lthn.ai/core/go-io` | Filesystem abstraction | +| `forge.lthn.ai/core/go-log` | Structured logging | +| `forge.lthn.ai/core/go-ml` | ML scoring and inference | +| `forge.lthn.ai/core/go-process` | Process lifecycle | +| `forge.lthn.ai/core/go-rag` | RAG pipeline | +| `github.com/modelcontextprotocol/go-sdk` | MCP Go SDK | +| `github.com/gorilla/websocket` | WebSocket client (IDE bridge) | +| `github.com/gin-gonic/gin` | HTTP router | + +### Indirect (via go-ml and go-rag) + +`go-mlx`, `go-rocm`, `go-duckdb`, `parquet-go`, `ollama`, `qdrant/go-client`, and the Arrow ecosystem are transitive dependencies not imported directly by go-ai. diff --git a/docs/mcp-server.md b/docs/mcp-server.md new file mode 100644 index 0000000..55c0b4c --- /dev/null +++ b/docs/mcp-server.md @@ -0,0 +1,228 @@ +--- +title: MCP Server +description: Model Context Protocol server implementation, transports, and tool registration. +--- + +# MCP Server + +The MCP server is the core of go-ai. It exposes 49 tools across file operations, RAG vector search, ML inference, process management, WebSocket streaming, browser automation, metrics, and IDE integration via the [Model Context Protocol](https://modelcontextprotocol.io/). + +## The Service Struct + +`mcp.Service` is the central container. 
It wraps the upstream MCP Go SDK server and owns all optional services: + +```go +type Service struct { + server *mcp.Server // upstream go-sdk server instance + workspaceRoot string // sandboxed root for file operations + medium io.Medium // filesystem abstraction (sandboxed or global) + subsystems []Subsystem // plugin subsystems registered via WithSubsystem + logger *log.Logger // audit logger for tool execution + processService *process.Service // optional: process lifecycle management + wsHub *ws.Hub // optional: WebSocket hub for streaming +} +``` + +## Construction + +`New()` uses functional options. All options are applied before tools are registered: + +```go +svc, err := mcp.New( + mcp.WithWorkspaceRoot("/path/to/project"), + mcp.WithProcessService(ps), + mcp.WithWSHub(hub), + mcp.WithSubsystem(ide.New(hub, ide.WithToken(token))), + mcp.WithSubsystem(mcp.NewMLSubsystem(mlSvc)), +) +``` + +**Construction sequence:** + +1. Allocate `Service` with an empty `mcp.Server` (name `core-cli`, version `0.1.0`). +2. Default workspace root to `os.Getwd()` and create a sandboxed medium. +3. Apply each `Option` in order. +4. Register built-in file, directory, and language tools (10 tools). +5. Register RAG, metrics, and conditionally WebSocket and process tools. +6. Iterate subsystems and call `sub.RegisterTools(s.server)` for each plugin. + +### Available Options + +| Option | Effect | +|--------|--------| +| `WithWorkspaceRoot(root)` | Restrict file operations to `root`; empty string removes restriction | +| `WithProcessService(ps)` | Enable process management tools | +| `WithWSHub(hub)` | Enable WebSocket streaming tools | +| `WithSubsystem(sub)` | Append a Subsystem plugin | + +## Workspace Sandboxing + +The `io.Medium` abstraction (from `forge.lthn.ai/core/go-io`) isolates file access. When a workspace root is configured, every read, write, list, and stat call is validated against that root. 
Paths that escape the sandbox are rejected before reaching the operating system.

```go
func WithWorkspaceRoot(root string) Option {
	return func(s *Service) error {
		if root == "" {
			s.medium = io.Local // unrestricted global filesystem
			return nil
		}
		abs, _ := filepath.Abs(root)
		m, err := io.NewSandboxed(abs)
		if err != nil {
			return err
		}
		s.medium = m
		return nil
	}
}
```

An empty root switches the medium to `io.Local` with no path restrictions. Production deployments should always provide an explicit root.

## Transports

The server supports three transports. `Run()` auto-selects between stdio and TCP based on the `MCP_ADDR` environment variable.

### Stdio (default)

Standard integration mode for AI clients (Claude, Cursor) that spawn the server as a subprocess:

```go
func (s *Service) ServeStdio(ctx context.Context) error {
	return s.server.Run(ctx, &mcp.StdioTransport{})
}
```

`Run()` delegates to `ServeStdio` when `MCP_ADDR` is unset.

### TCP

```go
const DefaultTCPAddr = "127.0.0.1:9100"
```

Each accepted TCP connection receives its own fresh `mcp.Server` instance to prevent per-session state from leaking between clients. Messages are framed as newline-delimited JSON-RPC with a 10 MB maximum message size.

```bash
# Start in TCP mode
MCP_ADDR=127.0.0.1:9100 core mcp serve
```

A warning is emitted when binding to `0.0.0.0`; local-only access is strongly preferred.

### Unix Domain Socket

```go
func (s *Service) ServeUnix(ctx context.Context, socketPath string) error
```

The socket file is removed before binding (to recover from unclean shutdowns) and again on shutdown. Like TCP, each connection spawns an independent server instance. Logging uses the `Security` level because socket access implies filesystem-based access control.
+ +### Transport Comparison + +| Transport | Activation | Use Case | +|-----------|-----------|----------| +| Stdio | No `MCP_ADDR` set | AI client subprocess integration | +| TCP | `MCP_ADDR=host:port` | Remote clients, multi-client daemons | +| Unix | Explicit `ServeUnix()` call | Local IPC with OS-level access control | + +## Subsystem Plugin Model + +### Interfaces + +```go +// Subsystem registers additional MCP tools at startup. +type Subsystem interface { + Name() string + RegisterTools(server *mcp.Server) +} + +// SubsystemWithShutdown extends Subsystem with graceful cleanup. +type SubsystemWithShutdown interface { + Subsystem + Shutdown(ctx context.Context) error +} +``` + +`RegisterTools` is called once during `New()`, after built-in tools are registered. `Shutdown` is optional -- the `Service.Shutdown(ctx)` method type-asserts each subsystem and calls `Shutdown` if implemented. + +### Built-in and Plugin Subsystems + +| Subsystem | Type | Source | +|-----------|------|--------| +| File, directory, language tools | Built-in | `mcp/mcp.go` | +| RAG tools | Built-in | `mcp/tools_rag.go` | +| Metrics tools | Built-in | `mcp/tools_metrics.go` | +| Process tools | Built-in (conditional) | `mcp/tools_process.go` | +| WebSocket tools | Built-in (conditional) | `mcp/tools_ws.go` | +| Webview tools | Built-in | `mcp/tools_webview.go` | +| ML subsystem | Plugin (`MLSubsystem`) | `mcp/tools_ml.go` | +| IDE subsystem | Plugin (`ide.Subsystem`) | `mcp/ide/` | + +## Tool Registration Pattern + +Every tool follows an identical pattern: a descriptor with name and description, and a typed handler: + +```go +mcp.AddTool(server, &mcp.Tool{ + Name: "file_read", + Description: "Read the contents of a file", +}, s.readFile) +``` + +The handler signature is: + +```go +func(ctx context.Context, req *mcp.CallToolRequest, input InputStruct) (*mcp.CallToolResult, OutputStruct, error) +``` + +The MCP Go SDK deserialises JSON-RPC `params` into `InputStruct` and serialises 
`OutputStruct` into the response. Returning a non-nil error produces a JSON-RPC error response. + +### Audit Logging + +Mutating operations (`file_write`, `file_delete`, `rag_ingest`, `ws_start`) are logged at `Security` level. Read-only operations use `Info`. The current OS username is captured via `log.Username()` and attached to every log entry. + +## Full Tool Inventory + +49 tools across 12 groups: + +| Group | Tools | Source | +|-------|-------|--------| +| File operations | `file_read`, `file_write`, `file_delete`, `file_rename`, `file_exists`, `file_edit` | `mcp/mcp.go` | +| Directory operations | `dir_list`, `dir_create` | `mcp/mcp.go` | +| Language detection | `lang_detect`, `lang_list` | `mcp/mcp.go` | +| RAG | `rag_query`, `rag_ingest`, `rag_collections` | `mcp/tools_rag.go` | +| ML inference | `ml_generate`, `ml_score`, `ml_probe`, `ml_status`, `ml_backends` | `mcp/tools_ml.go` | +| Metrics | `metrics_record`, `metrics_query` | `mcp/tools_metrics.go` | +| Process management | `process_start`, `process_stop`, `process_kill`, `process_list`, `process_output`, `process_input` | `mcp/tools_process.go` | +| WebSocket | `ws_start`, `ws_info` | `mcp/tools_ws.go` | +| Browser automation | `webview_connect`, `webview_disconnect`, `webview_navigate`, `webview_click`, `webview_type`, `webview_query`, `webview_console`, `webview_eval`, `webview_screenshot`, `webview_wait` | `mcp/tools_webview.go` | +| IDE chat | `ide_chat_send`, `ide_chat_history`, `ide_session_list`, `ide_session_create`, `ide_plan_status` | `mcp/ide/tools_chat.go` | +| IDE build | `ide_build_status`, `ide_build_list`, `ide_build_logs` | `mcp/ide/tools_build.go` | +| IDE dashboard | `ide_dashboard_overview`, `ide_dashboard_activity`, `ide_dashboard_metrics` | `mcp/ide/tools_dashboard.go` | + +## Daemon Mode + +The `cmd/daemon` package provides background service management: + +```go +type Config struct { + MCPTransport string // stdio, tcp, socket + MCPAddr string // address/path for tcp or 
socket + HealthAddr string // health check endpoint (default: 127.0.0.1:9101) + PIDFile string // PID file path +} +``` + +Configuration can be set via environment variables: + +| Variable | Default | Description | +|----------|---------|-------------| +| `CORE_MCP_TRANSPORT` | `tcp` | Transport type | +| `CORE_MCP_ADDR` | `127.0.0.1:9100` | Listen address | +| `CORE_HEALTH_ADDR` | `127.0.0.1:9101` | Health endpoint | +| `CORE_PID_FILE` | `~/.core/daemon.pid` | PID file | + +```bash +core daemon start # Start in background +core daemon start --mcp-transport socket # Unix socket mode +core daemon stop # Graceful shutdown +core daemon status # Check if running +``` diff --git a/docs/ml-pipeline.md b/docs/ml-pipeline.md new file mode 100644 index 0000000..f8109ff --- /dev/null +++ b/docs/ml-pipeline.md @@ -0,0 +1,144 @@ +--- +title: ML Pipeline +description: ML scoring, model management, and inference backend integration. +--- + +# ML Pipeline + +The ML pipeline in go-ai connects the MCP server to the scoring and inference capabilities provided by `go-ml` and `go-inference`. Five MCP tools expose generation, scoring, capability probes, and backend discovery. + +## Architecture + +``` +MCP Client + | tools/call ml_generate + v +MLSubsystem (go-ai/mcp/tools_ml.go) + | + +-- ml.Service (go-ml) + | +-- InferenceAdapter --> inference.TextModel (go-inference) + | +-- ScoringEngine (heuristic scores) + | +-- JudgeBackend (LLM-as-judge) + | + +-- inference.List() / inference.Get() / inference.Default() + +-- go-mlx (Metal GPU, macOS) + +-- go-rocm (AMD ROCm, Linux) + +-- Ollama (HTTP subprocess) +``` + +## ML Tools + +### `ml_generate` + +Generate text using the active inference backend. 
+ +| Parameter | Type | Required | Description | +|-----------|------|----------|-------------| +| `prompt` | `string` | Yes | The text prompt | +| `model` | `string` | No | Model name (uses default if omitted) | +| `max_tokens` | `int` | No | Maximum tokens to generate | +| `temperature` | `float64` | No | Sampling temperature | + +Returns the generated text and metadata about which backend and model were used. + +### `ml_score` + +Score content using the heuristic scoring engine. Supports three scoring modes: + +- **Heuristic** -- Pattern-based scoring across multiple dimensions (emotional register, sycophancy detection, vocabulary diversity, etc.) +- **Semantic** -- LLM-as-judge evaluation using a secondary model +- **Content** -- Combined scoring pipeline + +| Parameter | Type | Required | Description | +|-----------|------|----------|-------------| +| `content` | `string` | Yes | Text to score | +| `mode` | `string` | No | `heuristic`, `semantic`, or `content` | + +Returns dimension-level scores and an aggregate verdict. + +### `ml_probe` + +Run capability probes against the active model. Probes test specific model capabilities (instruction following, reasoning, factual recall, etc.). There are 23 built-in probes. + +| Parameter | Type | Required | Description | +|-----------|------|----------|-------------| +| `probe` | `string` | No | Specific probe name (runs all if omitted) | + +### `ml_status` + +Query the ML pipeline status, including active backends, loaded models, and InfluxDB pipeline health. + +### `ml_backends` + +List all registered inference backends and their availability status. Calls `inference.List()`, `inference.Get()`, and `inference.Default()` from the `go-inference` registry. + +Returns an array of backends with their names, availability flags, and which is currently the default. + +## Inference Backend Registry + +The `go-inference` module provides a global registry for inference backends. 
Backends are registered at startup and can be queried at runtime: + +```go +// Querying available backends (used by ml_backends tool) +backends := inference.List() // All registered backends +backend := inference.Get("mlx") // Specific backend by name +def := inference.Default() // Currently active default +``` + +### Supported Backends + +| Backend | Module | Platform | Description | +|---------|--------|----------|-------------| +| MLX | `go-mlx` | macOS (Apple Silicon) | Native Metal GPU inference | +| ROCm | `go-rocm` | Linux (AMD GPU) | AMD ROCm GPU inference via llama-server | +| Ollama | `go-ml` | Any | HTTP-based inference via Ollama subprocess | + +## Scoring Engine + +The scoring engine in `go-ml` provides heuristic analysis across multiple dimensions. Each dimension produces a normalised score (0.0 to 1.0) and a qualitative verdict. + +Scoring dimensions include: +- Emotional register (positive and negative pattern detection) +- Sycophancy detection +- Vocabulary diversity +- Sentence complexity +- Repetition analysis +- Format adherence + +The `ml_score` tool delegates directly to `go-ml`'s `ml.Service` rather than routing through `go-inference`, since the scoring engine is specific to go-ml and not an abstract backend capability. + +## Integration with the MCP Server + +The ML subsystem is registered as a plugin during MCP server construction: + +```go +svc, err := mcp.New( + mcp.WithSubsystem(mcp.NewMLSubsystem(mlSvc)), +) +``` + +`MLSubsystem` implements the `Subsystem` interface and registers all five ML tools when `RegisterTools` is called. 
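A miniature of that plugin shape, with a stand-in `Server` type (real registration goes through `mcp.AddTool` on the SDK server; everything below is illustrative):

```go
package main

import "fmt"

// Stand-in Server type; the real one is *mcp.Server from the SDK.
type Server struct{ tools []string }

func (s *Server) AddTool(name string) { s.tools = append(s.tools, name) }

// Miniature of the Subsystem interface described in the MCP Server docs.
type Subsystem interface {
	Name() string
	RegisterTools(server *Server)
}

type MLSubsystem struct{}

func (MLSubsystem) Name() string { return "ml" }

func (MLSubsystem) RegisterTools(server *Server) {
	// The five ML tools listed above.
	for _, t := range []string{"ml_generate", "ml_score", "ml_probe", "ml_status", "ml_backends"} {
		server.AddTool(t)
	}
}

func main() {
	srv := &Server{}
	var sub Subsystem = MLSubsystem{}
	sub.RegisterTools(srv)
	fmt.Println(sub.Name(), len(srv.tools)) // ml 5
}
```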
+
+## Testing
+
+ML tools can be tested with mock backends that satisfy the `ml.Backend` and `inference.Backend` interfaces:
+
+```go
+type mockMLBackend struct {
+	name         string
+	available    bool
+	generateResp string
+	generateErr  error
+}
+
+func (m *mockMLBackend) Name() string    { return m.name }
+func (m *mockMLBackend) Available() bool { return m.available }
+
+// The remaining interface methods (not shown) return the canned
+// generateResp / generateErr values.
+```
+
+Register lightweight mocks for CI environments where GPU backends and model weights are not available:
+
+```go
+inference.Register(&mockInferenceBackend{name: "test-ci-mock", available: true})
+```
+
+Note that `inference.Register` is global state -- use unique names to avoid conflicts between parallel test runs.
diff --git a/docs/rag.md b/docs/rag.md
new file mode 100644
index 0000000..5b8749b
--- /dev/null
+++ b/docs/rag.md
@@ -0,0 +1,144 @@
+---
+title: RAG Pipeline
+description: Retrieval-augmented generation via Qdrant vector search and Ollama embeddings.
+---
+
+# RAG Pipeline
+
+go-ai integrates with the RAG (Retrieval-Augmented Generation) pipeline provided by `go-rag`. This surfaces as three MCP tools for vector search and a high-level facade function for programmatic use.
+
+## Architecture
+
+```
+MCP Client                                  Programmatic callers
+    |                                               |
+    v                                               v
+rag_query / rag_ingest / rag_collections    ai.QueryRAGForTask()
+    |                                               |
+    +------------------- go-rag ------------------- +
+                |                  |
+                v                  v
+             Qdrant             Ollama
+            (vectors)        (embeddings)
+```
+
+## MCP Tools
+
+### `rag_query`
+
+Query the vector database for documents relevant to a natural-language question.
+
+| Parameter | Type | Required | Description |
+|-----------|------|----------|-------------|
+| `question` | `string` | Yes | Natural-language query |
+| `collection` | `string` | No | Qdrant collection name (default: `hostuk-docs`) |
+| `limit` | `int` | No | Maximum results to return (default: 3) |
+| `threshold` | `float64` | No | Minimum similarity score (default: 0.5) |
+
+The tool embeds the question via Ollama, searches Qdrant with the specified parameters, and returns formatted context with source references.
+
+### `rag_ingest`
+
+Ingest a file into the vector database. The file is chunked (for Markdown, this respects heading boundaries), each chunk is embedded via Ollama, and the resulting vectors are stored in Qdrant.
+
+| Parameter | Type | Required | Description |
+|-----------|------|----------|-------------|
+| `path` | `string` | Yes | Path to the file to ingest (relative to workspace root) |
+| `collection` | `string` | No | Target Qdrant collection |
+
+This tool is logged at `Security` level because it writes to the vector store.
+
+### `rag_collections`
+
+List all available collections in the connected Qdrant instance, with point counts and vector dimensions.
+
+## AI Facade: QueryRAGForTask
+
+The `ai` package provides a higher-level wrapper for programmatic RAG queries. It is used by agentic task planners to enrich task context without importing `go-rag` directly.
+ +```go +type TaskInfo struct { + Title string + Description string +} + +func QueryRAGForTask(task TaskInfo) (string, error) { + query := task.Title + " " + task.Description + + // Truncate to 500 runes to keep the embedding focused + runes := []rune(query) + if len(runes) > 500 { + query = string(runes[:500]) + } + + qdrantCfg := rag.DefaultQdrantConfig() + qdrantClient, err := rag.NewQdrantClient(qdrantCfg) + if err != nil { + return "", fmt.Errorf("rag qdrant client: %w", err) + } + defer qdrantClient.Close() + + ollamaCfg := rag.DefaultOllamaConfig() + ollamaClient, err := rag.NewOllamaClient(ollamaCfg) + if err != nil { + return "", fmt.Errorf("rag ollama client: %w", err) + } + + ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second) + defer cancel() + + results, err := rag.Query(ctx, qdrantClient, ollamaClient, query, rag.QueryConfig{ + Collection: "hostuk-docs", + Limit: 3, + Threshold: 0.5, + }) + if err != nil { + return "", fmt.Errorf("rag query: %w", err) + } + return rag.FormatResultsContext(results), nil +} +``` + +Key design decisions: +- The query is capped at **500 runes** to keep the embedding vector focused on the task's core intent +- A **10-second timeout** prevents hanging when services are slow +- The function returns an error rather than silently degrading, giving callers the choice of how to handle failures + +## External Service Dependencies + +### Qdrant + +Vector database storing embedded document chunks. + +- Default address: `localhost:6334` (gRPC) +- Configuration: `rag.DefaultQdrantConfig()` + +### Ollama + +Local LLM server providing embedding generation. + +- Default address: `localhost:11434` (HTTP) +- Configuration: `rag.DefaultOllamaConfig()` +- Default embedding model: `nomic-embed-text` + +Both services must be running for RAG tools to function. In CI, tests that touch RAG tools are guarded with `skipIfShort(t)`. 
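Returning to `rag_ingest`'s chunking step: splitting Markdown at heading boundaries can be sketched as starting a new chunk whenever a heading line appears. This is an assumption-laden illustration only; `go-rag`'s actual chunker may also cap chunk sizes or overlap adjacent chunks, and the function name here is invented.

```go
package main

import (
	"fmt"
	"strings"
)

// chunkMarkdown splits a Markdown document into chunks, starting a new
// chunk at each heading line ("#", "##", ...). Hypothetical sketch; a
// production chunker would also enforce a maximum chunk size.
func chunkMarkdown(doc string) []string {
	var chunks []string
	var cur []string
	flush := func() {
		if len(cur) > 0 {
			chunks = append(chunks, strings.Join(cur, "\n"))
			cur = nil
		}
	}
	for _, line := range strings.Split(doc, "\n") {
		if strings.HasPrefix(line, "#") {
			flush() // heading begins a fresh chunk
		}
		cur = append(cur, line)
	}
	flush()
	return chunks
}

func main() {
	doc := "# Intro\nSome text.\n## Usage\nMore text."
	fmt.Println(len(chunkMarkdown(doc))) // → 2
}
```

Each resulting chunk is what gets embedded via Ollama and stored as a point in Qdrant.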
+ +## Embedding Benchmark + +The `cmd/embed-bench/` utility compares embedding models for the OpenBrain knowledge store. It tests how well models separate semantically related vs unrelated agent memory pairs. + +```bash +go run ./cmd/embed-bench +go run ./cmd/embed-bench -ollama http://localhost:11434 +``` + +The benchmark evaluates: +- **Cluster separation** -- intra-group vs inter-group similarity +- **Query recall accuracy** -- top-1 and top-3 retrieval precision +- **Embedding throughput** -- milliseconds per memory + +Models tested: `nomic-embed-text` and `embeddinggemma`. + +## Testing + +RAG tool tests cover handler validation (empty question/path fields, default behaviour) and graceful degradation when Qdrant or Ollama are unavailable. Full RAG round-trip tests require live services and are skipped in short mode.
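For illustration, the benchmark's cluster-separation metric can be sketched as the gap between mean intra-group and mean inter-group cosine similarity: related memories should score high against each other and low against unrelated ones. The helper names below are assumptions for the example, not `embed-bench`'s actual code.

```go
package main

import (
	"fmt"
	"math"
)

// cosine returns the cosine similarity of two equal-length vectors.
func cosine(a, b []float64) float64 {
	var dot, na, nb float64
	for i := range a {
		dot += a[i] * b[i]
		na += a[i] * a[i]
		nb += b[i] * b[i]
	}
	if na == 0 || nb == 0 {
		return 0
	}
	return dot / (math.Sqrt(na) * math.Sqrt(nb))
}

// separation returns mean intra-group similarity minus mean inter-group
// similarity for two groups of embeddings (each with at least two
// members). Larger is better: related memories cluster together while
// unrelated ones stay apart.
func separation(groupA, groupB [][]float64) float64 {
	mean := func(xs, ys [][]float64, skipSelf bool) float64 {
		var sum float64
		var n int
		for i, x := range xs {
			for j, y := range ys {
				if skipSelf && i == j {
					continue // don't compare a vector with itself
				}
				sum += cosine(x, y)
				n++
			}
		}
		return sum / float64(n)
	}
	intra := (mean(groupA, groupA, true) + mean(groupB, groupB, true)) / 2
	inter := mean(groupA, groupB, false)
	return intra - inter
}

func main() {
	a := [][]float64{{1, 0}, {0.9, 0.1}}
	b := [][]float64{{0, 1}, {0.1, 0.9}}
	fmt.Printf("%.2f\n", separation(a, b)) // → 0.88
}
```

A model with good separation on agent-memory pairs should push this gap towards 1.0; a model that embeds everything similarly drives it towards 0.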