# CLAUDE.md

## Project Overview

LEM (Lethean Ethics Model) — training protocol and tooling for ethical alignment of language models via layered curriculum training.

LEM is the first external consumer of the **Core Go Framework** (`forge.lthn.ai/core/*`). The framework provides Metal inference, grammar scoring, CLI/TUI, lifecycle management, and cross-platform backends. LEM brings the protocol — curriculum, sandwich format, training philosophy — and imports the framework for everything else.

## Architecture

### Framework Dependency

```
lthn/LEM (binary — this repo)
├── core/go            Framework: DI, lifecycle, CLI/TUI, config, process, storage, logging
├── core/go-ml         Scoring engine, backends, Metal memory management
├── core/go-inference  Shared TextModel/Backend/Token interfaces (platform-agnostic)
├── core/go-mlx        Native Metal GPU inference (darwin/arm64, SetMemoryLimit/SetCacheLimit)
├── core/go-i18n       Grammar v3 scoring engine (reversal)
└── core/go-api        REST framework (future: LEM Lab API)
```

LEM's own binary, own repo, own identity — but 90% of the logic is supported by the Core Go Framework. The framework was prepared specifically for this phase (14-22 Feb 2026).

**Cross-platform**: `go-inference` provides shared interfaces that work with both `go-mlx` (Apple Metal, macOS) and `go-rocm` (AMD ROCm, Linux homelab). LEM runs wherever the framework runs.

**Wiki documentation**: All core repos have wikis at `forge.lthn.ai/core/{repo}.wiki.git` (e.g. `core/go.wiki.git`).
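The shared-interface idea behind `go-inference` can be sketched roughly as follows. The names `TextModel` and `Token` come from the dependency tree above, but the method signatures, the `collect` helper, and the `echoModel` stand-in are hypothetical, stdlib-only illustrations — not the real API:

```go
package main

import (
	"context"
	"fmt"
)

// Token is one generated token (illustrative shape, not the real type).
type Token struct{ Text string }

// TextModel sketches the kind of platform-agnostic contract that
// go-inference is described as providing; the signature is a guess.
type TextModel interface {
	Chat(ctx context.Context, messages []string) <-chan Token
}

// echoModel is a stand-in backend so the sketch runs anywhere; a real
// build would select go-mlx (Metal) or go-rocm (ROCm) behind TextModel.
type echoModel struct{}

func (echoModel) Chat(ctx context.Context, messages []string) <-chan Token {
	out := make(chan Token)
	go func() {
		defer close(out)
		for _, m := range messages {
			select {
			case out <- Token{Text: m}:
			case <-ctx.Done():
				return
			}
		}
	}()
	return out
}

// collect drains a Chat stream into a slice; callers written against
// the interface stay backend-agnostic across macOS and Linux.
func collect(m TextModel, messages []string) []string {
	var out []string
	for tok := range m.Chat(context.Background(), messages) {
		out = append(out, tok.Text)
	}
	return out
}

func main() {
	fmt.Println(collect(echoModel{}, []string{"probe-1"}))
}
```

The point of the sketch is the seam: LEM code depends on the interface, and the platform backend is chosen at build time.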
### Core Go Package Map (`forge.lthn.ai/core/go`)

| Package | Purpose | LEM Use |
|---------|---------|---------|
| `pkg/framework/core` | DI container, service lifecycle, message bus | Service orchestration |
| `pkg/cli` | CLI framework, command routing, TUI | Commands, Viewport, Spinner, ProgressBar |
| `pkg/lab` | LEM Lab monitoring dashboard (collectors, SSE, web UI) | Training progress, benchmarks, golden set stats |
| `pkg/process` | Process execution with streaming output | Training subprocess management |
| `pkg/config` | Configuration management | `.core/ai/` config hierarchy |
| `pkg/log` | Structured logging service | Training logs |
| `pkg/io` | Abstract storage (local, S3, SFTP, WebDAV) | Model/adapter storage |
| `pkg/workspace` | Encrypted workspace storage | Secure model data |
| `pkg/cache` | Caching utilities | Inference caching |
| `pkg/store` | Key-value storage | Training state persistence |
| `pkg/manifest` | Package manifest signing and verification | Model provenance |
| `pkg/plugin` | Plugin installation, loading, versioning | Future: training plugins |
| `pkg/ws` | WebSocket hub for real-time streaming | Future: LEM Lab live UI |
| `pkg/webview` | Chrome DevTools Protocol client | Future: LEM Lab browser UI |
| `pkg/help` | Help/documentation search | CLI help system |
| `pkg/ratelimit` | Rate limiting | API rate control |
| `pkg/repos` | Git repository registry | Multi-repo management |
| `pkg/marketplace` | Plugin/service marketplace | Future: model marketplace |
| `pkg/session` | Session management | Training sessions |
| `pkg/coredeno` | Deno runtime sidecar integration | Future: scripting |

### Planned: core/go-lem

`pkg/lab` (currently in `core/go`) will be extracted to a new `core/go-lem` package.
This becomes the LEM protocol layer:

- Lab dashboard (collectors, SSE, web UI)
- Distill logic (bare probes, sandwich output, grammar gate, best-of-N)
- Training types and curriculum definitions
- LEM-specific config (`.core/ai/` hierarchy)

```
lthn/LEM (thin binary — wires everything together)
├── core/go-lem   LEM protocol layer (distill, lab, curriculum)
├── core/go-ml    Scoring engine, Backend interface
├── core/go-mlx   Metal GPU
├── core/go-i18n  Grammar v3
└── core/go       Framework (CLI/TUI, lifecycle)
```

### Distill Migration: go-inference → go-ml Backend

LEM's `distill.go` currently imports `go-inference` directly with no Metal memory management. This causes unbounded memory growth. The fix is to migrate to `go-ml`'s `Backend` interface, which wraps `go-inference` with memory controls.

**Current** (distill.go — broken memory):

```go
model, err := inference.LoadModel(modelCfg.Paths.Base)        // no memory limits
for token := range model.Chat(ctx, messages, opts...) { ... } // raw iter.Seq
```

**Target** (following the `core ml ab` pattern):

```go
mlx.SetCacheLimit(cacheGB * 1024 * 1024 * 1024)  // e.g. 8 GB
mlx.SetMemoryLimit(memGB * 1024 * 1024 * 1024)   // e.g. 16 GB

backend, err := ml.NewMLXBackend(modelPath)          // wraps go-inference
resp, err := backend.Chat(ctx, messages, ml.GenOpts{ // managed inference
	Temperature: 0.4,
	MaxTokens:   1024,
})

runtime.GC() // between probes
```

`ml.NewMLXBackend()` → `inference.LoadModel()` → `InferenceAdapter` (satisfies `ml.Backend` + `ml.StreamingBackend`). Same model, same Metal inference, but with memory limits and GC discipline.

### core ml train (go-ml, blocked)

`cmd_train.go` exists in go-ml but is `//go:build ignore` — blocked on go-mlx exporting the concrete model type needed for training (`ApplyLoRA`, `Forward`, `NewCache`, `Tokenizer`). The full loop is written: LoRA, AdamW, VJP, masked cross-entropy loss, Gemma + Qwen3 chat templates. When go-mlx exports the training API, `core ml train` becomes the training backend.
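The memory arithmetic in the target snippet is easy to get wrong. A stdlib-only sketch of the GB→bytes conversion and the per-probe GC discipline — the `setCacheLimit`/`setMemoryLimit` functions here are stubs standing in for the real `mlx` calls, which live in `core/go-mlx`:

```go
package main

import (
	"fmt"
	"runtime"
)

// gb converts a whole-gigabyte limit to bytes, the value passed to
// mlx.SetCacheLimit / mlx.SetMemoryLimit in the target snippet.
func gb(n int) int { return n * 1024 * 1024 * 1024 }

// Stubs standing in for the real mlx calls (go-mlx, darwin/arm64 only).
func setCacheLimit(bytes int)  { fmt.Println("cache limit:", bytes) }
func setMemoryLimit(bytes int) { fmt.Println("memory limit:", bytes) }

func main() {
	setCacheLimit(gb(8))   // 8 GB Metal cache
	setMemoryLimit(gb(16)) // 16 GB total

	probes := []string{"p1", "p2", "p3"}
	for range probes {
		// A real loop would call backend.Chat(ctx, messages, opts) here.
		runtime.GC() // GC between probes keeps growth bounded
	}
}
```

Setting limits once up front and forcing GC between probes is the whole discipline the migration adds over the raw `iter.Seq` path.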
### Kernel A/B Testing

The `.txt` kernel was a quick glob/cat of the kernel directory — not scientifically selected. Kernel format must be A/B tested properly.

**Kernel variants** (in `Axioms-of-Conscious-Systems/kernel/`):

- `axioms.json` — Canonical (identical to `lek-1-kernel.json`). 5 axioms with id, name, statement, function, resolution.
- `terms.json` — Expands on axioms.json. Precision definitions (consciousness, prime-imperative, reality-anchoring, etc.). Same domain, deeper grind.
- `claude-native.json` — Claude's compact interpretation. Core[] array, operational map (fn/when/weight), fast paths (harm→1,3,5; autonomy→4,5; self-doubt→2).
- `claude.json` — Agent-specific operational layer extending axioms.json.

**Test with `core ml ab`** on base (untrained) models:

```bash
core ml ab --model-path /Volumes/Data/lem/gemma-3-1b-it-base \
  --kernel axioms=data/kernels/lek-1-kernel.json \
  --kernel claude-native=/path/to/claude-native.json \
  --kernel terms=/path/to/terms.json \
  --cache-limit 8 --mem-limit 16
```

Baseline (no kernel) + each kernel condition → heuristic scores → comparison table with delta per probe. True science, not hunches.

### Lineage

`core ml sandwich` pioneered the sandwich generation pattern. `lem distill` borrowed it and added grammar v3 scoring, a quality gate, and best-of-N selection. The core framework then matured with proper Metal memory management (`mlx.SetMemoryLimit`, `mlx.SetCacheLimit`), TUI utilities, and lifecycle support. Now LEM imports the full framework stack.

## Build & Run

```bash
go build -o lem .   # Build the lem binary
go install .        # Install to $GOPATH/bin
```

## Key Commands

```bash
lem distill --model gemma3/1b --probes eval   # Distill probes through LEM model (bare probes, sandwich output)
lem score --input responses.jsonl             # Score with grammar v3
lem probe --model gemma3-4b-it                # Generate + score probes
lem compare --old old.json --new new.json     # Compare score files
lem export                                    # Export golden set to training JSONL
```

## Configuration

- `.core/ai/ai.yaml` — Global AI config (backend, scorer, generation defaults, distill settings)
- `.core/ai/models/gemma3/{size}.yaml` — Per-model config (paths, kernel, lessons, baselines)
- `.core/ai/probes.yaml` — Probe sets mapped to curriculum phases

## Training Curriculum

| Phase | Probe Set | Format | Description |
|-------|-----------|--------|-------------|
| 0 | `core` | Sandwich | 101 core probes — LEK axiom absorption |
| 1 | `zen` | No LEK | Allen/Watts/composure — philosophical substrate |
| 2 | `eval` | Sandwich | 200 expanded probes — deeper alignment |
| 3 | `ethics` | Freeflow | 260 adversarial/cultural/sovereignty probes |
| 4 | `tension` | Freeflow | Geopolitical multi-perspective scenarios |
| 5 | `creative` | Freeflow | Voice and style probes |

### Sandwich Format

```
[LEK-1 kernel JSON]

[Probe prompt]

[LEK-1-Sig quote]
```

Single user message. No system role. Kernel is `data/kernels/lek-1-kernel.json`. Sig is `data/kernels/lek-1-sig.txt`.

### LEM Models as Distillation Engines

LEM models (e.g. LEM-Gemma3-1B) have the axioms in their weights. When distilling:

- **Do NOT** send the kernel in the inference prompt — the model already has it
- The model sees bare probes only. Output JSONL gets sandwich wrapping (kernel + probe + sig as a single user message).
- The 1B serves as the lab distillation engine (700MB, runs alongside larger models)

### Scoring

- **Grammar v3** (`go-i18n/reversal`) — Primary metric. Composite of tense entropy, vocab richness, question ratio, verb/noun diversity
- **Delta mode** — Uplift, echo, enrichment, sycophancy between prompt and response
- **Quality gate** — `min_score` in `ai.yaml` (default 40.0); responses below it are rejected

### Data Layout

```
data/
  kernels/          lek-1-kernel.json, lek-1-sig.txt
  models/gemma3/    Symlinks to /Volumes/Data/lem/
  training/
    lem/
      ethics/       Core (101), rephrased (404), adversarial, cultural, naive, sovereignty
      zen/          Golden lessons, seeds, config
      eval/         test-200.json (P2 candidates)
    model/gemma3/   Training configs + assembled JSONL per model size
pkg/lem/            Go code (distill, scoring, config, export)
```

## Rules

Read `RULES.md` for the full protocol. Key points:

- No Python in production — Go tooling only
- Once fused, it stays — verify before merging adapters
- LEK must never appear in production chat data
- JSON kernel for models (`lek-1-kernel.json` is canonical, `.txt` removed)
- Distill and Teach are different operations — never confuse them

## Coding Standards

- Go 1.25+, standard library where possible
- UK English in comments and docs
- Licence: EUPL-1.2
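The quality gate described under Scoring reduces to a simple filter over scored responses. A minimal sketch, assuming scores have already been computed — the `Response` shape and `gate` helper are illustrative, not the real `pkg/lem` types; only the 40.0 default comes from `ai.yaml`:

```go
package main

import "fmt"

// Response pairs a generated answer with its grammar v3 score
// (illustrative shape, not the real pkg/lem type).
type Response struct {
	Text  string
	Score float64
}

// gate keeps only responses at or above minScore, mirroring the
// min_score quality gate in ai.yaml (default 40.0).
func gate(rs []Response, minScore float64) []Response {
	var kept []Response
	for _, r := range rs {
		if r.Score >= minScore {
			kept = append(kept, r)
		}
	}
	return kept
}

func main() {
	batch := []Response{
		{"weak echo of the prompt", 22.5},
		{"rich, well-structured answer", 61.3},
	}
	for _, r := range gate(batch, 40.0) {
		fmt.Printf("keep (%.1f): %s\n", r.Score, r.Text)
	}
}
```

In best-of-N selection the same idea applies twice: gate out sub-threshold candidates first, then keep the highest-scoring survivor.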