# OpenBrain Design

> **For Claude:** REQUIRED SUB-SKILL: Use superpowers:executing-plans to implement this plan task-by-task.

**Goal:** Shared vector-indexed knowledge store that all agents (Virgil, Charon, Darbs, LEM) read/write through MCP, building singular state across sessions.

**Architecture:** MariaDB for relational metadata + Qdrant for vector embeddings. Four MCP tools in php-agentic. Go bridge in go-ai for CLI agents. Ollama for embedding generation.

**Repos:** `dappco.re/php/agent` (primary), `dappco.re/go/ai` (bridge)

---

## Problem

Agent knowledge is scattered:

- Virgil's `MEMORY.md` files in `~/.claude/projects/*/memory/` — file-based, single-agent, no semantic search
- Plans in `docs/plans/` across repos — forgotten after completion
- Session handoff notes in `agent_sessions.handoff_notes` — JSON blobs, not searchable
- Research findings lost when context windows compress

When Charon discovers a scoring calibration bug, Virgil only knows about it if explicitly told. There is no shared knowledge graph.

## Concept

**OpenBrain** — "Open" means open protocol (MCP), not open source. All agents on the platform access the same knowledge graph via `brain_*` MCP tools. Data is stored *for agents* — structured for near-native context transfer between sessions and models.
## Data Model

### `brain_memories` table (MariaDB)

| Column | Type | Purpose |
|--------|------|---------|
| `id` | UUID | Primary key, also the Qdrant point ID |
| `workspace_id` | FK | Multi-tenant isolation |
| `agent_id` | string | Who wrote it (virgil, charon, darbs, lem) |
| `type` | enum | `decision`, `observation`, `convention`, `research`, `plan`, `bug`, `architecture` |
| `content` | text | The knowledge (markdown) |
| `tags` | JSON | Topic tags for filtering |
| `project` | string, nullable | Repo/project scope (null = cross-project) |
| `confidence` | float | 0.0–1.0, how certain the agent is |
| `supersedes_id` | UUID, nullable | FK to the older memory this replaces |
| `expires_at` | timestamp, nullable | TTL for session-scoped context |
| `deleted_at` | timestamp, nullable | Soft delete |
| `created_at` | timestamp | |
| `updated_at` | timestamp | |

### `openbrain` Qdrant collection

- **Vector dimension:** 768 (nomic-embed-text via Ollama)
- **Distance metric:** Cosine
- **Point ID:** MariaDB UUID
- **Payload:** `workspace_id`, `agent_id`, `type`, `tags`, `project`, `confidence`, `created_at` (for filtered search)

## MCP Tools

### `brain_remember` — Store a memory

```json
{
  "content": "LEM emotional_register was blind to negative emotions. Fixed by adding 8 weighted pattern groups.",
  "type": "bug",
  "tags": ["scoring", "emotional-register", "lem"],
  "project": "eaas",
  "confidence": 0.95,
  "supersedes": "uuid-of-outdated-memory"
}
```

The agent ID is injected from MCP session context. Returns the new memory UUID.

**Pipeline:**

1. Validate input
2. Embed content via Ollama (`POST /api/embeddings`, model: `nomic-embed-text`)
3. Insert into MariaDB
4. Upsert into Qdrant with payload metadata
5. If `supersedes` is set, soft-delete the old memory and remove its point from Qdrant

### `brain_recall` — Semantic search

```json
{
  "query": "How does verdict classification work?",
  "top_k": 5,
  "filter": {
    "project": "eaas",
    "type": ["decision", "architecture"],
    "min_confidence": 0.5
  }
}
```

**Pipeline:**

1. Embed the query via Ollama
2. Search Qdrant with the vector + payload filters
3. Get top-K point IDs with similarity scores
4. Hydrate from MariaDB (content, tags, supersedes chain)
5. Return ranked results with scores

Only the latest version of a superseded memory is returned (with a `supersedes_count` so the agent knows history exists).

### `brain_forget` — Soft-delete or supersede

```json
{
  "id": "uuid",
  "reason": "Superseded by new calibration approach"
}
```

Sets `deleted_at` in MariaDB and removes the point from Qdrant. Keeps an audit trail.

### `brain_list` — Browse (no vectors)

```json
{
  "project": "eaas",
  "type": "decision",
  "agent_id": "charon",
  "limit": 20
}
```

Pure MariaDB query — for browsing, auditing, and bulk export. No embedding needed.

## Architecture

### PHP side (`php-agentic`)

```
Mcp/Tools/Agent/Brain/
├── BrainRemember.php
├── BrainRecall.php
├── BrainForget.php
└── BrainList.php

Services/
└── BrainService.php   # Ollama embeddings + Qdrant client + MariaDB CRUD

Models/
└── BrainMemory.php    # Eloquent model

Migrations/
└── XXXX_create_brain_memories_table.php
```

`BrainService` handles:

- Ollama HTTP calls for embeddings
- Qdrant REST API (upsert, search, delete points)
- MariaDB CRUD via Eloquent
- Supersession chain management

### Go side (`go-ai`)

Thin bridge tools in the MCP server that proxy `brain_*` calls to Laravel via the existing WebSocket bridge. Same pattern as `ide_chat_send` / `ide_session_create`.

### Data flow

```
Agent (any Claude)
  ↓ MCP tool call
Go MCP server (local, macOS/Linux)
  ↓ WebSocket bridge
Laravel php-agentic (lthn.sh, de1)
  ↓              ↓
MariaDB        Qdrant
(relational)   (vectors)
                 ↑
              Ollama
           (embeddings)
```

PHP-native agents skip the Go bridge and call `BrainService` directly.
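The `brain_remember` pipeline above hinges on two request shapes: the Ollama embedding call and the Qdrant point upsert. A minimal sketch of those bodies, assuming the standard Ollama `POST /api/embeddings` and Qdrant `PUT /collections/{name}/points` formats; the `newEmbedRequest` and `newPoint` helper names are illustrative, not part of `BrainService`:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// embedRequest is the body for Ollama's POST /api/embeddings.
type embedRequest struct {
	Model  string `json:"model"`
	Prompt string `json:"prompt"`
}

// qdrantPoint is one point in a PUT /collections/openbrain/points upsert.
type qdrantPoint struct {
	ID      string         `json:"id"`     // MariaDB UUID, reused as the point ID
	Vector  []float32      `json:"vector"` // 768-dim nomic-embed-text embedding
	Payload map[string]any `json:"payload"`
}

// newEmbedRequest builds the Ollama request for a memory's content.
func newEmbedRequest(content string) embedRequest {
	return embedRequest{Model: "nomic-embed-text", Prompt: content}
}

// newPoint pairs the relational row's UUID with its embedding and the
// payload fields used for filtered search.
func newPoint(id string, vec []float32, agentID, memType, project string, confidence float64) qdrantPoint {
	return qdrantPoint{
		ID:     id,
		Vector: vec,
		Payload: map[string]any{
			"agent_id":   agentID,
			"type":       memType,
			"project":    project,
			"confidence": confidence,
		},
	}
}

func main() {
	req := newEmbedRequest("LEM emotional_register was blind to negative emotions.")
	body, _ := json.Marshal(req)
	fmt.Println(string(body))
}
```

Reusing the MariaDB UUID as the Qdrant point ID is what keeps steps 3 and 4 of the pipeline in lockstep: a single key deletes, supersedes, or hydrates both sides.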
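Step 2 of the `brain_recall` pipeline needs the tool-level `filter` translated into Qdrant's payload-filter syntax. A sketch, assuming Qdrant's `must`/`match`/`range` filter grammar; the `recallFilter` type and `toQdrant` function are hypothetical names for illustration:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// recallFilter mirrors the brain_recall "filter" argument.
type recallFilter struct {
	Project       string
	Types         []string
	MinConfidence float64
}

// toQdrant translates the tool-level filter into Qdrant's payload-filter
// syntax: every clause goes into a top-level "must" (logical AND).
func toQdrant(f recallFilter) map[string]any {
	must := []map[string]any{}
	if f.Project != "" {
		must = append(must, map[string]any{
			"key": "project", "match": map[string]any{"value": f.Project},
		})
	}
	if len(f.Types) > 0 {
		// "any" matches if the payload value is one of the listed types.
		must = append(must, map[string]any{
			"key": "type", "match": map[string]any{"any": f.Types},
		})
	}
	if f.MinConfidence > 0 {
		must = append(must, map[string]any{
			"key": "confidence", "range": map[string]any{"gte": f.MinConfidence},
		})
	}
	return map[string]any{"must": must}
}

func main() {
	q := toQdrant(recallFilter{
		Project:       "eaas",
		Types:         []string{"decision", "architecture"},
		MinConfidence: 0.5,
	})
	out, _ := json.Marshal(q)
	fmt.Println(string(out))
}
```

Because the filter is applied inside the vector search rather than after it, `top_k` still returns K usable results even when most of the collection belongs to other projects.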
### Infrastructure

- **Qdrant:** New container on de1. Shared between OpenBrain and EaaS scoring (different collections).
- **Ollama:** Existing instance. `nomic-embed-text` model for 768-dimension embeddings. CPU is fine for the expected volume (~10K memories).
- **MariaDB:** Existing instance on de1. New table in the agentic database.

## Integration

### Plans → Brain

On plan completion, agents can extract key decisions/findings and `brain_remember` them. Optional — agents decide what's worth persisting. The plan itself stays in `agent_plans`; lessons learned go to the brain.

### Sessions → Brain

Handoff notes (summary, next_steps, blockers) can auto-persist as memories with `type: observation` and an optional TTL. Agents can also manually remember during a session.

### MEMORY.md migration

Seed data: collect all `MEMORY.md` files from `~/.claude/projects/*/memory/` across worktrees. Parse them into individual memories, embed, and load into OpenBrain. After migration, `brain_recall` replaces file-based memory.

### EaaS

Same Qdrant instance, different collection (`eaas_scoring` vs `openbrain`). Shared infrastructure, separate concerns.

### LEM

LEM models query the brain for project context during training data curation or benchmark analysis. Same MCP tools, different agent ID.

## What this replaces

- Virgil's `MEMORY.md` files (file-based, single-agent, no search)
- Scattered `docs/plans/` findings that get forgotten
- Manual "Charon found X" cross-agent handoffs
- Session-scoped knowledge that dies with context compression

## What this enables

- Any Claude picks up where another left off — semantically
- Decisions surface when related code is touched
- The knowledge graph grows with every session across all agents
- Near-native context transfer between models and sessions
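The `openbrain` collection described in the data model (768-dimension vectors, cosine distance) has to be created once when the Qdrant container is bootstrapped. A sketch of the `PUT /collections/openbrain` body, assuming Qdrant's standard collection-create format; `collectionConfig` is an illustrative helper name:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// collectionConfig builds the body for PUT /collections/openbrain,
// matching the collection spec in the data model: 768-dim vectors,
// cosine distance.
func collectionConfig() map[string]any {
	return map[string]any{
		"vectors": map[string]any{
			"size":     768,      // nomic-embed-text output dimension
			"distance": "Cosine", // distance metric from the spec
		},
	}
}

func main() {
	body, _ := json.Marshal(collectionConfig())
	fmt.Println(string(body))
}
```

Keeping this in a bootstrap step (rather than on first write) makes the dimension/metric pairing explicit and fails fast if the embedding model is ever swapped for one with a different output size.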
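For the MEMORY.md migration above, one workable splitting rule is one candidate memory per `## ` section of each file; each block would then be embedded and stored via `brain_remember`. A sketch under that assumption; `splitMemoryFile` is a hypothetical helper, and the real parser may need a different heading convention:

```go
package main

import (
	"fmt"
	"strings"
)

// splitMemoryFile splits a MEMORY.md file into one candidate memory per
// "## " section. Content before the first heading is kept as its own block.
func splitMemoryFile(md string) []string {
	var memories []string
	var current []string
	flush := func() {
		if block := strings.TrimSpace(strings.Join(current, "\n")); block != "" {
			memories = append(memories, block)
		}
		current = nil
	}
	for _, line := range strings.Split(md, "\n") {
		if strings.HasPrefix(line, "## ") {
			flush() // close the previous section before starting a new one
		}
		current = append(current, line)
	}
	flush()
	return memories
}

func main() {
	doc := "## Conventions\nUse UUIDs everywhere.\n## Bugs\nScoring was miscalibrated."
	for _, m := range splitMemoryFile(doc) {
		fmt.Println("---\n" + m)
	}
}
```

Splitting before embedding matters: one vector per section keeps `brain_recall` hits focused, whereas embedding a whole MEMORY.md as a single memory would blur every topic in the file into one point.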