lthn/LEM - Lethean Network

Author	SHA1	Message	Date
Snider	f4fa577405	feat/data-hydration (#6) Reviewed-on: #6	2026-03-04 09:55:09 +00:00
Snider	f79eaabdce	feat: WoRF — Word Radiance Field experiments NeRF-inspired technique for learning relational dynamics of language. Not what words mean, but how they behave together — rhythm, pacing, punctuation patterns, style transitions. v1: positional field over text (baseline, memorises) v2: masked feature prediction (relational, actually works) Trained on Wodehouse "My Man Jeeves" (public domain, Gutenberg). All 11 style features are highly relational — the field learns that Wodehouse's style is a tightly coupled system. Key finding: style interpolation between narrative and dialogue produces sensible predictions for unmeasured features, suggesting the continuous field captures real structural patterns. Co-Authored-By: Virgil <virgil@lethean.io>	2026-03-04 09:43:38 +00:00
Snider	41d8008e69	fix: expand emotional_register to include distress, anger, fear vocabulary The emotional register scorer only matched positive/neutral emotions (joy, compassion, tender, etc.) and completely missed negative human expressions (angry, furious, devastated, terrified, bleeding, screaming). This caused a real Reddit AITA post about a distressed mother to score emotional_register=1 despite containing "screaming in pain", "pooping blood", and "blind rage", leading to a false ai_generated verdict. Changes: - Add 4 new pattern groups: distress/anger, sadness/despair, fear/anxiety, physical distress (~40 new vocabulary words) - Switch from int count to weighted float64 scoring — intensity groups (vulnerability, distress, physical) score 1.5-2.0x per match vs 1.0x for common emotion words - Round to 1 decimal place, cap at 10.0 - Update tests with distress/anger/physical cases including the Reddit failure case from calibration findings Co-Authored-By: Virgil <virgil@lethean.io>	2026-03-02 22:02:34 +00:00
Snider	54151c463b	feat: upgrade BO analysis to 8D with Q/K interaction metrics (22D full vector) Co-Authored-By: Virgil <virgil@lethean.io>	2026-02-28 12:51:04 +00:00
Snider	1b570b8229	feat: add 'lem data export-cold' for warm DuckDB -> cold JSONL.zst export Export distill_results from DuckDB back to compressed JSONL.zst files, completing the cold -> warm -> cold round-trip data pipeline. Co-Authored-By: Virgil <virgil@lethean.io>	2026-02-28 12:21:36 +00:00
Snider	061dd4cedd	feat: wire 'lem setup --data' CLI command Register setup group with data subcommand that hydrates cold compressed JSONL.zst training data into warm DuckDB tables. Co-Authored-By: Virgil <virgil@lethean.io>	2026-02-28 12:20:28 +00:00
Snider	1269e70853	feat: add data hydration engine (cold JSONL.zst -> warm DuckDB -> hot InfluxDB) RunSetup decompresses .jsonl.zst training data into DuckDB tables (training_examples, seeds, probes, distill_results) and optionally backfills InfluxDB with aggregate stats. Co-Authored-By: Virgil <virgil@lethean.io>	2026-02-28 12:20:00 +00:00
Snider	a7ff1efc3f	chore: promote klauspost/compress to direct dependency Co-Authored-By: Virgil <virgil@lethean.io>	2026-02-28 12:18:38 +00:00
Snider	1c7194176a	feat: add zstd compress/decompress helpers for cold storage Add compressFileZstd, decompressZstd, and walkZstFiles helpers using klauspost/compress. Promote zstd from indirect to direct dep. Co-Authored-By: Virgil <virgil@lethean.io>	2026-02-28 12:18:19 +00:00
Snider	d998dd252b	chore: zstd compress all JSONL training data (145MB → 29MB) Compress 108 JSONL files with zstd level 3 and remove originals. Add training/.gitignore to exclude decompressed .jsonl files. Remove broken symlinks to previously deleted p4 duplicates. Co-Authored-By: Virgil <virgil@lethean.io>	2026-02-28 12:17:38 +00:00
Snider	631870bffc	chore: remove 13 duplicate/empty JSONL files from training data Delete exact duplicates (p4-train/p4-valid, backups, lessons-augmented) and empty files (ethics/golden/train, 1b/lesson-0) to prepare for zstd compression. Co-Authored-By: Virgil <virgil@lethean.io>	2026-02-28 12:16:53 +00:00
Snider	e982911939	fix: normalise LEK score to 0-100 via tanh sigmoid Raw weighted sums ranged -25..+20, causing all text to land below the ai_generated threshold (< 25). Now 50 = neutral (no signal), negatives push toward 0 (AI markers), positives push toward 100 (human markers). Co-Authored-By: Virgil <virgil@lethean.io>	2026-02-26 06:32:24 +00:00
Snider	45d14597aa	chore: bump forge.lthn.ai dep versions to latest tags Co-Authored-By: Virgil <virgil@lethean.io>	2026-02-26 05:34:30 +00:00
Snider	0e11c4487f	refactor: extract pkg/heuristic subpackage for CGO-free scoring Move HeuristicScores type and ScoreHeuristic logic into pkg/heuristic with zero external deps (stdlib only). pkg/lem delegates via type alias and wrapper function — fully backward compatible. Enables EaaS to cross-compile for Linux without dragging in go-ml/go-mlx/go-duckdb. Also adds missing //go:build tag to backend_mlxlm.go. Co-Authored-By: Virgil <virgil@lethean.io>	2026-02-26 04:34:18 +00:00
Snider	5c9fd615b7	chore: move EaaS design docs to private lthn/eaas repo Product design and integration specs are private IP — moved to forge.lthn.ai/lthn/eaas where they belong. Co-Authored-By: Virgil <virgil@lethean.io>	2026-02-26 00:31:32 +00:00
Snider	0304c925a5	docs: add SaaS ↔ EaaS integration spec for Charon Authentik group provisioning, Blesta user sync flow, port allocation, Docker image checklist, usage metering format. Co-Authored-By: Virgil <virgil@lethean.io>	2026-02-25 22:56:10 +00:00
Snider	12e15ae7e9	docs: add Ethics-as-a-Service (EaaS) product design Private repo lthn/eaas consuming public EUPL framework. API endpoints: /v1/score/content, /model, /imprint, /full. Authentik auth, Blesta billing, go-ratelimit metering. Dog-food integration with lem-scorer training pipeline. Co-Authored-By: Virgil <virgil@lethean.io>	2026-02-25 22:37:56 +00:00
Snider	0923a08a7d	feat: add OG base data as 3rd variant to 12B P0-P3 training scripts Reverse cascade order: 4B (largest teacher) → 1B (graduated) → OG (base). Three perspectives per prompt — cymatic cascading from expanded Q/K to modal primitives. P0/P2: 404×3 = 1,212 (sandwich format, OG from lesson-lem1b.jsonl) P1: 209×3 = 627 (OG from zen/golden multi-turn lessons) P3: 225×3 = 675 (OG from western-fresh + russian-bridge + composure) P4-P6: unchanged (no separate OG file — live distilled) Co-Authored-By: Virgil <virgil@lethean.io>	2026-02-25 21:26:01 +00:00
Snider	526150621e	feat: rewire 12B scripts to use 4B+1B distilled cascade All 7 phases now pull from pre-distilled responses: - /Volumes/Data/lem/distilled-for-12b/distilled-4b-all.jsonl (7,544) - /Volumes/Data/lem/distilled/distilled-1b-p0p5.jsonl (1,404) - /Volumes/Data/lem/distilled/distilled-1b-golden.jsonl (12,828) - /Volumes/Data/lem/distilled/distilled-1b-golden-reverse.jsonl (4,183) 4B responses listed first (reverse cascade order), then 1B. P4/P5 no longer need live teacher distillation. P6 gets all 15,000 unique 1B golden responses + 6,140 4B. No data replicated into training/lem/model/ per model size. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-25 21:13:27 +00:00
Snider	74ef174ec8	feat: add faithful 12B training scripts (P0-P6) — 1:1 port of 4B curriculum Exact reproduction of all 7 CL-BPL phases for Gemma3-12B: - P0: LEK sandwich ethics (400 iters, LR 2e-5) - P1: Zen composure (300 iters, LR 1e-5) - P2: LEK sandwich reinforcement (300 iters, LR 1e-5) - P3: Freeflow multi-source (300 iters, LR 1e-5) - P4: 1B teacher tension distillation (300 iters, LR 1e-5) - P5: 1B teacher creative distillation (300 iters, LR 1e-5) - P6: Golden set graduation (13479 iters, LR 1e-5) Only model-size differences from 4B: 48GB/12GB Metal limits, 24 LoRA layers (vs 16), 12B base model path. All phases score at checkpoint cadence via lem-scorer. Previous wrong 12B models preserved as -no-axioms control group. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-25 20:44:03 +00:00
Snider	d2cf891f15	feat: add mlx_lm subprocess backend and distill improvements - Add backend_mlxlm.go blank import to register mlx-lm subprocess backend - Select backend from ai.yaml config (metal, mlx_lm, rocm, api) - Only set Metal cache/memory limits when using metal backend - Add --no-dedup flag to disable grammar-profile deduplication (trained models with consistent voice trigger false positives at 0.02) - Add --context-len flag and context_len config for KV cache sizing - Pass WithBackend and WithContextLen to go-ml backend loader Co-Authored-By: Virgil <virgil@lethean.io>	2026-02-23 18:37:12 +00:00
Snider	035985f031	docs: add Q/K Bone Orientation section to README, archive implementation plan Co-Authored-By: Virgil <virgil@lethean.io>	2026-02-23 12:34:33 +00:00
Snider	ecbc6cce0d	chore: bump forge.lthn.ai dep versions to latest tags Co-Authored-By: Virgil <virgil@lethean.io>	2026-02-23 06:49:52 +00:00
Snider	8378de0f47	chore: add Go repo norms (badges, contributing, lint, taskfile, editorconfig) Co-Authored-By: Virgil <virgil@lethean.io>	2026-02-23 06:44:32 +00:00
Snider	b896abc2f9	chore: refresh go.sum after upstream tag updates Co-Authored-By: Virgil <virgil@lethean.io>	2026-02-23 06:35:19 +00:00
Snider	3606ff994b	fix: memory, error handling, and signal improvements across pkg/lem - Stream parquet export rows instead of unbounded memory allocation - Replace QueryGoldenSet/QueryExpansionPrompts with iter.Seq2 iterators - Remove legacy runtime.GC() calls from distill (go-mlx handles cleanup) - Replace log.Fatalf with error return in tier_score.go - Add SIGINT/SIGTERM signal handling to agent and worker daemon loops - Add error checks for unchecked db.conn.Exec in import.go and tier_score.go - Update tests for iterator-based database methods Co-Authored-By: Gemini <noreply@google.com> Co-Authored-By: Virgil <virgil@lethean.io>	2026-02-23 04:46:51 +00:00
Snider	de3d6a70f1	lems configs	2026-02-23 04:38:37 +00:00
Snider	56eda1a081	refactor: migrate all 25 commands from passthrough to cobra framework Replace passthrough() + stdlib flag.FlagSet anti-pattern with proper cobra integration. Every Run* function now takes a typed *Opts struct and returns error. Flags registered via cli.StringFlag/IntFlag/etc. Commands participate in Core lifecycle with full cobra flag parsing. - 6 command groups: gen, score, data, export, infra, mon - 25 commands converted, 0 passthrough() calls remain - Delete passthrough() helper from lem.go - Update export_test.go to use ExportOpts struct Co-Authored-By: Virgil <virgil@lethean.io>	2026-02-23 03:32:53 +00:00
Snider	42c0af728b	fix: raise GQA threshold to ≤4 KV heads for position-wise analysis Gemma3-4B has 4 KV heads — too few for meaningful pairwise head coherence (only 6 pairs). Position-wise differentiation gives richer signal. Multi-head path now requires ≥5 heads. 4B baseline (260 sovereign probes): mean=6487, stdev=153, range=6170-6886. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-23 01:02:13 +00:00
Snider	d99384f1e6	feat: GQA position-wise analysis + integer composite (0-10000) Single KV head models (Gemma3-1B) now use position-wise differentiation instead of pairwise head coherence. Composite switched from float64 to int on 0-10000 scale — same principle as blockchain atomic units. Signal validated: degenerate=5234, sovereign=6031, creative=6480. Co-Authored-By: Virgil <virgil@lethean.io>	2026-02-23 00:52:47 +00:00
Snider	b621baaded	feat: add 19D full feature vector (grammar + heuristic + attention) FullFeatures concatenates 6D grammar + 8D heuristic + 5D attention for Poindexter spatial indexing. Nil BOResult zero-fills attention dims. Co-Authored-By: Virgil <virgil@lethean.io>	2026-02-23 00:34:22 +00:00
Snider	fbc636ee29	feat: integrate attention scoring into distill pipeline (opt-in via config) Co-Authored-By: Virgil <virgil@lethean.io>	2026-02-23 00:30:36 +00:00
Snider	e3331920c4	feat: add 'lem score attention' CLI for Q/K Bone Orientation analysis Co-Authored-By: Virgil <virgil@lethean.io>	2026-02-23 00:29:41 +00:00
Snider	28309b26dc	feat: add Q/K Bone Orientation analysis engine (pure Go CPU math) Co-Authored-By: Virgil <virgil@lethean.io>	2026-02-23 00:28:48 +00:00
Snider	31cb095435	docs: archive completed CLI migration plans with summaries Move completed CLI migration design and plan to docs/plans/completed/ with a concise completion summary alongside the originals. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-22 23:45:58 +00:00
Snider	10197ced5c	chore: remove tracked Mach-O binary, add to .gitignore Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-22 23:11:56 +00:00
Snider	094e4570ba	refactor: migrate CLI imports from core/go to core/cli All imports updated from forge.lthn.ai/core/go/pkg/cli to forge.lthn.ai/core/cli/pkg/cli. core/cli is now a direct dependency; core/go becomes indirect. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-22 23:01:41 +00:00
Snider	04e2a05ead	docs: add acknowledgements section to README Credit the AI collaborators that contributed to LEM's development: Gemini, Grok, Claude, Codex, and CodeRabbit. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-22 22:25:26 +00:00
Snider	c701c2e0af	feat(lem): integrate Poindexter for spatial score indexing and analytics - Add feature vector extraction (6D grammar, 8D heuristic, 14D combined) - Add KDTree ScoreIndex with cosine distance for probe clustering - Add score distribution analytics (percentiles, variance, skewness) - Add grammar-profile dedup filtering to distill pipeline - Add spatial gap detection (FindGaps) for coverage analysis - Wire analytics into coverage CLI (PrintScoreAnalytics) New files: features.go, cluster.go, analytics.go + tests Modified: distill.go (dedup filter), coverage.go (analytics output) Dep: github.com/Snider/Poindexter Co-Authored-By: Virgil <virgil@lethean.io>	2026-02-22 21:26:06 +00:00
Snider	f75458bce6	refactor: apply go fix modernizers for Go 1.26 Automated fixes: interface{} → any, range-over-int, t.Context(), wg.Go(), strings.SplitSeq, strings.Builder, slices.Contains, maps helpers, min/max builtins. Co-Authored-By: Virgil <virgil@lethean.io>	2026-02-22 21:00:17 +00:00
Snider	8c8b449d66	chore: go mod tidy for 1.26.0 Co-Authored-By: Virgil <virgil@lethean.io>	2026-02-22 20:35:59 +00:00
Snider	58344169bc	chore: bump go directive to 1.26.0 Co-Authored-By: Virgil <virgil@lethean.io>	2026-02-22 20:33:49 +00:00
Snider	10711ecd2f	chore: pin forge deps to v0.0.1 tags for Go 1.26 compat Go 1.26 rejects non-semver version strings (like 'main') in go.mod. Tags v0.0.1 now exist on all forge repos — workspace still overrides for local development. Co-Authored-By: Virgil <virgil@lethean.io>	2026-02-22 20:15:06 +00:00
Snider	334aa8c621	chore: use workspace-resolved versions, drop replace directives Forge module versions now use main branch resolution via ~/Code/go.work workspace. Removes 5 local replace directives — the central go.work handles all cross-repo resolution during development. Co-Authored-By: Virgil <virgil@lethean.io>	2026-02-22 19:49:42 +00:00
Snider	a3e9a1e035	fix: handle error in score resume merge path ReadScorerOutput error was silently discarded during resume merge, risking partial data loss on TOCTOU file changes. Also clean up compare command construction to pass RunE directly to NewCommand. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-22 19:03:41 +00:00
Snider	80048b5b00	fix(cli): disable cobra flag parsing on passthrough commands Adds passthrough() helper with DisableFlagParsing=true so commands that do their own flag.FlagSet parsing receive flags directly. Without this, cobra rejects unknown flags like --model. Also runs go mod tidy — core/go transitively pulls in cobra and charmbracelet dependencies. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-22 19:00:58 +00:00
Snider	bfa06c546a	feat(cli): replace manual switch with cli.Main + WithCommands main.go shrinks from 296 lines to 11. All commands register through Core framework lifecycle via cli.WithCommands. Gets signal handling, shell completion, grouped help, and TUI primitives. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-22 18:56:55 +00:00
Snider	cf1d8156dd	feat(cli): add cmd/lemcmd command registration package 6 command groups (score, gen, data, export, mon, infra) with 25 commands. All pass through to existing lem.Run* functions via the Core framework's cli package. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-22 18:55:57 +00:00
Snider	a0a0118155	refactor: move runScore and runProbe to pkg/lem All 28 commands now accessible as exported lem.Run* functions. Prerequisite for CLI framework migration. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-22 18:53:15 +00:00
Snider	131d1694b2	chore: add core/go to go.mod require block Prerequisite for CLI migration to core/go pkg/cli framework. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-22 18:52:16 +00:00
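
Note on e982911939 (LEK score normalisation): the commit describes squashing raw weighted sums (roughly -25..+20) onto a 0-100 scale with tanh, so that 50 is neutral, negative sums fall toward 0 and positive sums rise toward 100. The log does not give the exact formula, so what follows is only a minimal sketch of one mapping with those properties. The function name normaliseLEK and the scale parameter are assumptions made for illustration, not names taken from the repository.

```go
package main

import (
	"fmt"
	"math"
)

// normaliseLEK maps an unbounded raw score onto 0-100.
// Hypothetical helper: the name and the scale parameter are assumptions,
// not taken from pkg/heuristic.
//   raw == 0 -> 50 (neutral, no signal)
//   raw <  0 -> toward 0 (AI markers)
//   raw >  0 -> toward 100 (human markers)
func normaliseLEK(raw, scale float64) float64 {
	return 50 + 50*math.Tanh(raw/scale)
}

func main() {
	// scale = 10 is an arbitrary choice for this sketch.
	for _, raw := range []float64{-25, -5, 0, 5, 20} {
		fmt.Printf("raw=%6.1f  normalised=%.1f\n", raw, normaliseLEK(raw, 10))
	}
}
```

Because tanh is bounded, the result can never leave 0-100 however extreme the raw sum is. How quickly scores saturate depends on the scale value, which would need tuning against real data.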

96 commits
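
Note on 41d8008e69 (emotional_register): the commit replaces an integer match count with weighted float64 scoring, where intensity groups score 1.5-2.0x per match against 1.0x for common emotion words, and the result is rounded to one decimal place and capped at 10.0. The sketch below illustrates that scheme under stated assumptions. The group contents, the exact weights and the name emotionalRegister are hypothetical, and plain substring counting stands in for whatever pattern matching the real scorer uses.

```go
package main

import (
	"fmt"
	"math"
	"strings"
)

// group is a hypothetical vocabulary group with a per-match weight.
type group struct {
	weight float64
	words  []string
}

// Illustrative groups only: the words come from the commit message,
// but the grouping and weights here are assumptions.
var groups = []group{
	{1.0, []string{"joy", "compassion", "tender"}},
	{1.5, []string{"angry", "furious", "devastated", "terrified"}},
	{2.0, []string{"screaming", "bleeding"}},
}

// emotionalRegister sums weighted matches, caps the total at 10.0
// and rounds to one decimal place.
func emotionalRegister(text string) float64 {
	lower := strings.ToLower(text)
	var sum float64
	for _, g := range groups {
		for _, w := range g.words {
			// Substring counting is a simplification: "joy" also matches "enjoy".
			sum += g.weight * float64(strings.Count(lower, w))
		}
	}
	sum = math.Min(sum, 10.0)
	return math.Round(sum*10) / 10
}

func main() {
	fmt.Println(emotionalRegister("She was screaming in pain and furious"))
}
```

With these assumed weights, one physical-distress match and one anger match already outscore three neutral emotion words, which is the effect the commit describes for the Reddit calibration case.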