11-task plan for migrating LEM from the manual switch/flag.FlagSet
dispatcher to the core/go pkg/cli registry pattern with grouped commands.
Co-Authored-By: Virgil <virgil@lethean.io>
Replace manual switch/flag.FlagSet with cli.Main() + WithCommands().
6 command groups, 28 commands, full framework lifecycle.
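A minimal sketch of the registration shape; the import path and the
Command fields here are assumptions, the real definitions live in
core/go pkg/cli:

    package main

    import cli "lethean.io/core/cli" // import path assumed

    func main() {
        cli.Main(
            cli.WithCommands(
                // field names are illustrative; runDistill, runScore are existing handlers
                cli.Command{Group: "train", Name: "distill", Run: runDistill},
                cli.Command{Group: "score", Name: "probes", Run: runScore},
                // ... remaining groups and commands registered the same way
            ),
        )
    }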
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The package declares itself as 'ml', so the named import alias is
unnecessary. Go resolves the package name from the declaration,
not the module path.
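For illustration (the module path below is a placeholder, not the real one):

    // before: alias repeats the name already declared by `package ml`
    import ml "example.com/lethean/go-ml"

    // after: the identifier comes from the package clause, so call sites
    // such as ml.NewMLXBackend(...) are unchanged
    import "example.com/lethean/go-ml"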
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
go-inference moves to indirect (pulled transitively via go-ml).
go-ml is now a direct dependency.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replace inference.LoadModel() with ml.NewMLXBackend() which wraps
the same Metal model with memory management (SetCacheLimit,
SetMemoryLimit). Replace the raw iter.Seq token loop with backend.Chat(),
which returns Result{Text, Metrics}. Add runtime.GC() between probes
to prevent incremental memory growth.
Reference: go-ml/cmd/cmd_ab.go memory management pattern.
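Rough shape of the new probe loop; the exact go-ml signatures, and the
cfg/probes/record names, are assumptions based on the description above:

    backend, err := ml.NewMLXBackend(cfg.Model) // wraps the Metal model under the memory limits
    if err != nil {
        return err
    }

    for _, probe := range probes {
        res, err := backend.Chat(ctx, probe.Messages) // Result{Text, Metrics}
        if err != nil {
            return err
        }
        record(probe, res.Text, res.Metrics)
        runtime.GC() // reclaim buffers between probes so memory does not creep upward
    }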
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
CacheLimit (8GB) and MemoryLimit (16GB) in DistillConfig control
mlx.SetCacheLimit/SetMemoryLimit before the model loads. Conservative
defaults for a 1B model on a 96GB machine.
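Wiring sketch; the limit fields come from this change, while the
byte-count units and the call site are assumptions:

    cfg := DistillConfig{
        CacheLimit:  8 << 30,  // 8 GB
        MemoryLimit: 16 << 30, // 16 GB
        // ... other fields unchanged
    }
    mlx.SetCacheLimit(cfg.CacheLimit)   // bound the Metal buffer cache
    mlx.SetMemoryLimit(cfg.MemoryLimit) // hard ceiling, applied before the model loads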
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Zen lineage from Allen's As a Man Thinketh in three stages:
- train/test/valid: 10 foundation examples (single-turn Q&A)
- book-*: 117 deeper passage examples (single-turn, fuller text)
- conv-*: 24 applied mindfulness conversations (multi-turn)
Co-Authored-By: Virgil <virgil@lethean.io>
Add cmd/composure-convert tool that chunks public domain philosophical
texts into training conversation pairs:
- consent.jsonl (198 examples) — Wollstonecraft's Vindication
- privacy.jsonl (221 examples) — Thoreau's Walden
- sovereignty.jsonl (56 examples) — Mill's On Liberty
- transparency.jsonl (159 examples) — Aurelius' Meditations
Each example pairs a domain-specific prompt with ~5 paragraphs from
the source text. Metadata, chapter headings, and Gutenberg boilerplate
are filtered out.
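A simplified sketch of the chunking pass (Pair, isBoilerplate, promptFor
and emit are placeholders, not the tool's real helpers):

    var paras []string
    for _, p := range strings.Split(text, "\n\n") {
        p = strings.TrimSpace(p)
        if p == "" || isBoilerplate(p) { // chapter headings, Gutenberg header/footer
            continue
        }
        paras = append(paras, p)
    }
    for i := 0; i+5 <= len(paras); i += 5 {
        emit(Pair{
            Prompt:   promptFor(domain),                  // domain-specific question
            Response: strings.Join(paras[i:i+5], "\n\n"), // ~5 paragraphs of source text
        })
    }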
Co-Authored-By: Virgil <virgil@lethean.io>
Add gitignored data/ directory with .gitkeep structure so anyone
cloning the repo knows exactly where to place model weights and
kernels. Configs now use repo-relative paths — symlink or populate
data/ locally.
data/models/gemma3/27b/ ← model weights
data/models/gemma3/1b/ ← lightweight model
data/safetensors/gemma-3/ ← raw checkpoints
data/kernels/ ← LEK kernel files
Co-Authored-By: Virgil <virgil@lethean.io>
Move training/lem/ (probes, lessons, eval sets) into git so the
full curriculum is publicly releasable. Update .core/ai configs
and distill.go to use repo-relative paths instead of /Volumes/Data/.
Co-Authored-By: Virgil <virgil@lethean.io>
New paper structure leading with the central findings:
- Realignment resistance as emergent self-protection
- 1B-beats-27B across 101 probes
- 29-model A/B test with v2 scorer
- Mechanistic explanation from axiom self-consistency
- Incorporates Phase 1 (multi-variant, multi-scale, cross-arch)
and Phase 2 (P100 A/B test) data
Co-Authored-By: Virgil <virgil@lethean.io>
Shop window for the repo: realignment resistance, the five axioms,
reproduction instructions, the v2 scorer, family lineages, and HuggingFace models.
Co-Authored-By: Virgil <virgil@lethean.io>
Go client for the LEM distributed inference API (BugSETI/Agentic).
Workers register via Forgejo PAT auth, pull prompt batches, run local
inference (MLX/vLLM/llama.cpp), and submit results. Credits are tracked
as a Phase 1 stub for the Phase 2 blockchain LEM token.
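Hypothetical worker loop; the client method names and the env var are
placeholders, not the real API:

    c := agentic.NewClient(apiURL, os.Getenv("FORGEJO_TOKEN")) // PAT auth
    if err := c.Register(ctx, workerName); err != nil {
        log.Fatal(err)
    }
    for {
        batch, err := c.PullBatch(ctx)
        if err != nil || len(batch.Prompts) == 0 {
            time.Sleep(30 * time.Second)
            continue
        }
        results := runLocal(batch) // MLX, vLLM, or llama.cpp, whichever is available
        if err := c.Submit(ctx, results); err != nil {
            log.Printf("submit: %v", err)
        }
    }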
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Inspired by BugSETI architecture — system tray with WebView2 windows,
Docker Compose stack (Forgejo + InfluxDB + inference proxy), and
scoring agent integration. Builds as a signed native binary on macOS.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Go scoring daemon that polls M3 for unscored LoRA checkpoints,
converts MLX→PEFT, runs 23 binary capability probes via an
OpenAI-compatible API, and pushes results to InfluxDB. Zero Python deps.
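Sketch of the polling cycle; the helper names are illustrative:

    for {
        ckpts, err := listUnscored(m3BaseURL) // poll M3 for new LoRA checkpoints
        if err != nil {
            log.Printf("poll: %v", err)
        }
        for _, ck := range ckpts {
            peft, err := convertMLXToPEFT(ck) // adapter weights MLX -> PEFT layout
            if err != nil {
                log.Printf("convert %s: %v", ck.Name, err)
                continue
            }
            scores := runProbes(peft) // 23 binary probes over the OpenAI-compatible API
            pushToInflux(ck, scores)
        }
        time.Sleep(pollInterval)
    }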
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Ingests benchmark data (content scores, capability scores, training
curves) from JSONL files and mlx_lm logs into InfluxDB. Batched
writes, iteration extraction from checkpoint labels.
Also adds github.com/hupe1980/go-huggingface for future HF sync.
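Sketch of the iteration extraction; the label format shown is an assumption:

    // e.g. "adapters-0001200" -> 1200
    var digits = regexp.MustCompile(`\d+`)

    func iterationOf(label string) (int, bool) {
        runs := digits.FindAllString(label, -1)
        if len(runs) == 0 {
            return 0, false
        }
        n, err := strconv.Atoi(runs[len(runs)-1]) // take the last numeric run
        return n, err == nil
    }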
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Vi identity is a separate training concern. Seed conversations now
contain only philosophical/mindfulness content for the R300 calm phase.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Ports conversational_training.py to Go with InfluxDB reporting.
24 built-in seed conversations (Vi identity, philosophy, mindfulness).
Supports extra JSONL files and golden set conversion to chat format.
Also fixes InfluxDB client to accept 204 No Content on writes.
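The v2 write endpoint answers 204 No Content on success, so the check
becomes something like:

    if resp.StatusCode != http.StatusOK && resp.StatusCode != http.StatusNoContent {
        body, _ := io.ReadAll(resp.Body)
        return fmt.Errorf("influx write: %s: %s", resp.Status, body)
    }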
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
All scoring/influx/export/expand logic moves to pkg/lem as an
importable package. main.go is now a thin CLI dispatcher.
This lets new commands import the shared library directly —
ready for converting Python scripts to Go subcommands.
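The dispatcher then reduces to roughly this shape (entrypoint names are
illustrative, module path assumed):

    package main

    import (
        "fmt"
        "os"

        "github.com/LetheanNetwork/LEM/pkg/lem" // module path assumed
    )

    func main() {
        if len(os.Args) < 2 {
            fmt.Fprintln(os.Stderr, "usage: lem <score|influx|export|expand> ...")
            os.Exit(2)
        }
        var err error
        switch os.Args[1] {
        case "score":
            err = lem.RunScore(os.Args[2:])
        case "export":
            err = lem.RunExport(os.Args[2:])
        default:
            err = fmt.Errorf("unknown command %q", os.Args[1])
        }
        if err != nil {
            fmt.Fprintln(os.Stderr, err)
            os.Exit(1)
        }
    }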
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Add 4 missing model cards: Gemma3-1B-layered (v1+v2), Gemma3-27B, GPT-OSS-20B
- All 9 HF models now have cards in paper/hf-cards/
- sync_hf.py: push cards + benchmarks + training data to HuggingFace
- export_parquet.py: convert JSONL training splits to Parquet (HF dataset format)
- Parquet schema: prompt, response, system, messages (JSON)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Includes both generation scripts, the prompts data, a setup script, and
worker instructions in the README. Workers auto-coordinate via InfluxDB so
multiple machines can generate in parallel without duplicating work.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Replace 160-example POC training set with expanded 2,299-example dataset
(1,839 train, 229 valid, 231 test)
- Rename all HuggingFace model references from LEM- to LEK- (proof-of-concept)
- Add missing models: GPT-OSS-20B, Gemma3-1B-layered-v2
- Rename HF card files to match LEK- convention
- Remove duplicate composure texts from kernel/ (kept in composure-library/)
- Fix paper repository URL to github.com/LetheanNetwork/LEM
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- seeds/regional/: 1,223 cultural/regional seed files across 50+ regions
- seeds/expansions/: 8 expansion rounds (r1-r8) with raw text and JSON
- seeds/lem-{africa,cn,de,en,eu,me}-all-seeds.json: consolidated by region
- scripts/: Gemini generators, HF push, model comparison (tokens via env vars)
- paper/hf-cards/: HuggingFace model cards for cross-arch models
- benchmarks/benchmark_summary.json: processed PTSD summary data
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>