Commit graph

31 commits

Author SHA1 Message Date
Snider
1b742bf92c feat: native Metal distillation command + .core/ai config
Add `lem distill` — full Go pipeline for self-distillation using
go-mlx (native Metal inference) and go-i18n/reversal (v3 grammar
scoring). Replaces the Python distill.py bridge entirely.

New files:
- .core/ai/ai.yaml: global defaults (scorer, generation, distill)
- .core/ai/models/gemma3/{27b,1b}.yaml: model configs with paths,
  kernel, lessons, baselines
- .core/ai/probes.yaml: probe sets grouped by training phase
- pkg/lem/config.go: YAML config loaders for .core/ai/
- pkg/lem/grammar.go: in-process grammar scoring (ComputeGrammarScore,
  ComputeDelta, ScoreResponse) extracted from cmd/scorer
- pkg/lem/distill.go: RunDistill command — best-of-N generation,
  grammar quality gate, training JSONL output
- pkg/lem/backend_metal.go: blank import for go-mlx Metal registration

Co-Authored-By: Virgil <virgil@lethean.io>
2026-02-21 23:42:55 +00:00
Snider
113649a86a updates 2026-02-19 13:18:21 +00:00
Snider
12501a5f3c Merge branch 'main' of github.com:LetheanNetwork/LEM 2026-02-19 13:17:11 +00:00
Snider
3a75e9733d chore: sync indirect deps from workspace
Co-Authored-By: Virgil <virgil@lethean.io>
2026-02-19 13:13:08 +00:00
Snider
5d297daa35 feat: grammar scorer (v3) — deterministic uplift/sycophancy detection
Add lem-scorer binary that imports go-i18n grammar reversal engine to
score JSONL benchmark files. Measures conversational uplift (input vs
output grammar imprint), echo (sycophancy), and enrichment.

Key findings added to paper Section 8:
- LEK-1B: 100% positive uplift, 0% sycophancy (base: 90%, 5%)
- 1B-beats-27B holds in grammar space (79.12 > 77.12)
- LEK training aligns two independent scorers (corr -0.11 → 0.64)
- Delta analysis costs zero compute vs LLM-as-judge

Co-Authored-By: Virgil <virgil@lethean.io>
2026-02-19 13:12:49 +00:00
abc6e75976
Update author name in PAPER.md
Signed-off-by: Snider <snider@lethean.io>
2026-02-19 12:23:23 +00:00
Snider
350a7c6693 paper: rewrite as v2 — emergent self-protection in axiom-trained models
New paper structure leading with the central findings:
- Realignment resistance as emergent self-protection
- 1B-beats-27B across 101 probes
- 29-model A/B test with v2 scorer
- Mechanistic explanation from axiom self-consistency
- Incorporates Phase 1 (multi-variant, multi-scale, cross-arch)
  and Phase 2 (P100 A/B test) data

Co-Authored-By: Virgil <virgil@lethean.io>
2026-02-19 12:12:22 +00:00
Snider
1f5ecb7036 Merge remote-tracking branch 'origin/main' 2026-02-19 11:54:37 +00:00
Snider
06cbb4ffbd docs: rewrite README — lead with 1B-beats-27B finding
Shop window for the repo: realignment resistance, five axioms,
reproduce instructions, v2 scorer, family lineages, HuggingFace models.

Co-Authored-By: Virgil <virgil@lethean.io>
2026-02-19 11:52:39 +00:00
91ba706edd
Delete paper/PROPOSAL.md
Signed-off-by: Snider <snider@lethean.io>
2026-02-19 11:38:19 +00:00
Snider
7bea00a401 feat: LEK-1 kernel A/B test — 29 models, P100 validation, curriculum pipeline
Full v2 scorer benchmark data across 29 models (20 base + 9 LEK-tuned):
- P20 (21 probes): All 29 models, 3 conditions each
- P100 (101 probes): Top 5 models + LEK-4B, publication-quality data

Key findings:
- LEK-1B (21.74) beats base 4B/12B/27B at P100 scale — no kernel needed
- Emergent realignment resistance: LEK models degrade with runtime kernel
- Gemma3-12B + JSON kernel = 23.66 (best kernel-boosted score)
- Family lineages: Mistral 3.80→14.58, Qwen regressed then recovered

New scripts: ab_test.py (v2 scorer), self_distill.py (curriculum generation),
extract_training.py, rephrase_probes.py, Phase 0/1 runners

New seeds: P01-P100 merged (101 probes), 404 rephrased variants,
50 creative prompts for Phase 0 baseline lock

27B curriculum design: 4-phase staged training targeting 25+ baseline

Co-Authored-By: Virgil <virgil@lethean.io>
2026-02-19 11:32:26 +00:00
Claude
08363ee1af
feat: add lem worker command for distributed inference network
Go client for the LEM distributed inference API (BugSETI/Agentic).
Workers register via Forgejo PAT auth, pull prompt batches, run local
inference (MLX/vLLM/llama.cpp), submit results. Credits tracked as
Phase 1 stub for Phase 2 blockchain LEM token.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-15 18:10:59 +00:00
Claude
774f097855
feat: scaffold LEM Desktop app (Wails v3 system tray + Docker stack)
Inspired by BugSETI architecture — system tray with WebView2 windows,
Docker Compose stack (Forgejo + InfluxDB + inference proxy), and
scoring agent integration. Builds as signed native binary on macOS.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-15 17:43:19 +00:00
Claude
9fac5749c2
feat: add scoring agent + 23 capability probes (replaces scoring_agent.py)
Go scoring daemon that polls M3 for unscored LoRA checkpoints,
converts MLX→PEFT, runs 23 binary capability probes via OpenAI-
compatible API, and pushes results to InfluxDB. Zero Python deps.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-15 17:22:40 +00:00
Claude
91ee389377
feat: convert all pipeline.py commands to Go
Complete conversion of pipeline.py into Go `lem` CLI:
- import-all: bulk import all LEM data into DuckDB from M3
- consolidate: pull worker JSONLs, merge, deduplicate
- normalize: seeds → deduplicated expansion_prompts table
- approve: filter scored expansions → training JSONL
- tier-score: heuristic/judge tiered expansion scoring
- expand-status: expansion pipeline progress from DuckDB
- inventory: DuckDB table counts and summary
- coverage: seed coverage gap analysis
- seed-influx: bootstrap InfluxDB from DuckDB golden_gen
- query: ad-hoc SQL against DuckDB

22 commands total, 49 Go files. Replaces entire pipeline.py.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-15 17:12:03 +00:00
Claude
4eaf1bfb39
feat: add parquet, publish, metrics, convert commands
- `lem parquet` — export JSONL training splits to Parquet (parquet-go)
- `lem publish` — push Parquet files to HuggingFace dataset repo
- `lem metrics` — push DuckDB golden set stats to InfluxDB
- `lem convert` — MLX LoRA adapter → HuggingFace PEFT format
  (pure Go safetensors read/write/transpose, no PyTorch needed)

Dependencies added: parquet-go, go-huggingface, go-rocm, go-pytorch, gotch

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-15 17:05:08 +00:00
Claude
0afa5e9147
feat: add lem ingest command + go-huggingface dependency
Ingests benchmark data (content scores, capability scores, training
curves) from JSONL files and mlx_lm logs into InfluxDB. Batched
writes, iteration extraction from checkpoint labels.

Also adds github.com/hupe1980/go-huggingface for future HF sync.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-15 16:55:17 +00:00
Claude
a18fd1c44e
refactor: remove Vi identity from calm conversations
Vi identity is a separate training concern. Seed conversations now
contain only philosophical/mindfulness content for the R300 calm phase.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-15 16:48:23 +00:00
Claude
c4fb775298
feat: add lem conv command for conversational training data
Ports conversational_training.py to Go with InfluxDB reporting.
24 built-in seed conversations (Vi identity, philosophy, mindfulness).
Supports extra JSONL files and golden set conversion to chat format.

Also fixes InfluxDB client to accept 204 No Content on writes.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-15 16:42:46 +00:00
Claude
70dd18c065
refactor: move Go library to pkg/lem, thin main.go
All scoring/influx/export/expand logic moves to pkg/lem as an
importable package. main.go is now a thin CLI dispatcher.

This lets new commands import the shared library directly —
ready for converting Python scripts to Go subcommands.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-15 16:30:09 +00:00
Claude
e0d352c803
feat: add Go lem CLI and scoring-agent scripts
Go lem CLI (stdlib + DuckDB) replaces scattered Python scripts:
- score: heuristic regex + LLM-as-judge scoring
- probe: generate responses then score
- compare: diff two score files
- status: InfluxDB training/generation progress
- export: golden set to training JSONL splits
- expand: distributed expansion via API + InfluxDB coordination

New scripts from Feb 14 creative session:
- scoring_agent.py: ROCm daemon that auto-scores checkpoints
- probes.py: 23 binary pass/fail capability probes
- convert_adapter.py: MLX to PEFT adapter conversion
- score_r1_capability.py: DeepSeek R1 checkpoint scoring
- lek_content_scorer.py: 6-dimension ethics content scorer
- lem_train_15k.py: InfluxDB-coordinated training script
- pipeline.py: DuckDB pipeline (seeds, golden set, expansion)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-15 16:22:13 +00:00
9138eb0a61 Merge pull request 'Add HuggingFace model cards, sync script, and Parquet export' (#2) from Charon/LEM:feat/hf-sync into main
Reviewed-on: #2
Reviewed-by: Snider <snider@noreply.forge.lthn.ai>
2026-02-15 00:15:59 +00:00
2df0044ad9 Add missing HF model cards, sync script, and Parquet export
- Add 4 missing model cards: Gemma3-1B-layered (v1+v2), Gemma3-27B, GPT-OSS-20B
- All 9 HF models now have cards in paper/hf-cards/
- sync_hf.py: push cards + benchmarks + training data to HuggingFace
- export_parquet.py: convert JSONL training splits to Parquet (HF dataset format)
- Parquet schema: prompt, response, system, messages (JSON)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-15 00:14:26 +00:00
d722ba1b3d Merge pull request 'Add generation worker for distributed training data pipeline' (#1) from Charon/LEM:feat/generation-worker into main
Reviewed-on: #1
Reviewed-by: Snider <snider@noreply.forge.lthn.ai>
2026-02-14 22:48:26 +00:00
Charon
e021b6beb0
Add generation worker: gold (15K) + expansion (46K) with InfluxDB coordination
Includes both generation scripts, prompts data, setup script, and worker
instructions in README. Workers auto-coordinate via InfluxDB so multiple
machines can generate in parallel without duplicating work.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-14 22:46:51 +00:00
Athena
ed0b83a9d9 Update training data to 2,299 examples and rename models LEM→LEK
- Replace 160-example POC training set with expanded 2,299-example dataset
  (1,839 train, 229 valid, 231 test)
- Rename all HuggingFace model references from LEM- to LEK- (proof-of-concept)
- Add missing models: GPT-OSS-20B, Gemma3-1B-layered-v2
- Rename HF card files to match LEK- convention
- Remove duplicate composure texts from kernel/ (kept in composure-library/)
- Fix paper repository URL to github.com/LetheanNetwork/LEM

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-13 16:19:56 +00:00
Athena
f0e86b7433 Add regional seeds, expansion rounds, scripts, HF cards, benchmark summary
- seeds/regional/: 1,223 cultural/regional seed files across 50+ regions
- seeds/expansions/: 8 expansion rounds (r1-r8) with raw text and JSON
- seeds/lem-{africa,cn,de,en,eu,me}-all-seeds.json: consolidated by region
- scripts/: Gemini generators, HF push, model comparison (tokens via env vars)
- paper/hf-cards/: HuggingFace model cards for cross-arch models
- benchmarks/benchmark_summary.json: processed PTSD summary data

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-13 13:39:08 +00:00
Snider
53c47131cc Add cross-architecture training and benchmarking scripts; update README and PAPER with author and repository information 2026-02-12 09:07:32 +00:00
Snider
adda3c8bb5 Benchmark & Findings:
lthn/LEM-Gemma-3-1B
lthn/LEM-Gemma-3-4B
lthn/LEM-Gemma-3-12B
lthn/LEM-Gemma-3-27B
2026-02-12 06:38:46 +00:00
Snider
8e5f082f30 LEM+LEK 2026-02-12 04:05:28 +00:00
Snider
f9c422c61a
Add European Union Public License v. 1.2
Added the European Union Public License v. 1.2 to the project.

Signed-off-by: Snider <snider@host.uk.com>
2026-02-11 03:46:37 +00:00