lthn/LEM - Lethean Network

Author	SHA1	Message	Date
Snider	c701c2e0af	feat(lem): integrate Poindexter for spatial score indexing and analytics - Add feature vector extraction (6D grammar, 8D heuristic, 14D combined) - Add KDTree ScoreIndex with cosine distance for probe clustering - Add score distribution analytics (percentiles, variance, skewness) - Add grammar-profile dedup filtering to distill pipeline - Add spatial gap detection (FindGaps) for coverage analysis - Wire analytics into coverage CLI (PrintScoreAnalytics) New files: features.go, cluster.go, analytics.go + tests Modified: distill.go (dedup filter), coverage.go (analytics output) Dep: github.com/Snider/Poindexter Co-Authored-By: Virgil <virgil@lethean.io>	2026-02-22 21:26:06 +00:00
Snider	f75458bce6	refactor: apply go fix modernizers for Go 1.26 Automated fixes: interface{} → any, range-over-int, t.Context(), wg.Go(), strings.SplitSeq, strings.Builder, slices.Contains, maps helpers, min/max builtins. Co-Authored-By: Virgil <virgil@lethean.io>	2026-02-22 21:00:17 +00:00
Snider	a3e9a1e035	fix: handle error in score resume merge path ReadScorerOutput error was silently discarded during resume merge, risking partial data loss on TOCTOU file changes. Also clean up compare command construction to pass RunE directly to NewCommand. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-22 19:03:41 +00:00
Snider	a0a0118155	refactor: move runScore and runProbe to pkg/lem All 28 commands now accessible as exported lem.Run* functions. Prerequisite for CLI framework migration. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-22 18:53:15 +00:00
Snider	8532077e46	style: remove redundant named import for go-ml Package declares itself as 'ml', so the named import alias is unnecessary. Go resolves the package name from the declaration, not the module path. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-22 18:08:01 +00:00
Snider	55519b24aa	feat(distill): migrate from go-inference to go-ml Backend Replace inference.LoadModel() with ml.NewMLXBackend() which wraps the same Metal model with memory management (SetCacheLimit, SetMemoryLimit). Replace raw iter.Seq token loop with backend.Chat() returning Result{Text, Metrics}. Add runtime.GC() between probes to prevent incremental memory leak. Reference: go-ml/cmd/cmd_ab.go memory management pattern. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-22 18:02:16 +00:00
Snider	8408cc0bab	feat(distill): add --cache-limit and --mem-limit flags Override ai.yaml memory config per-run. Values in GB. Not yet wired to model loading. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-22 18:00:04 +00:00
Snider	b9da23a0be	feat(distill): add Metal memory limit config fields CacheLimit (8GB) and MemoryLimit (16GB) in DistillConfig control mlx.SetCacheLimit/SetMemoryLimit before model load. Conservative defaults for 1B model on 96GB machine. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-22 17:59:11 +00:00
Snider	d233e76648	feat: add training data to repo + make paths repo-relative Move training/lem/ (probes, lessons, eval sets) into git so the full curriculum is publicly releasable. Update .core/ai configs and distill.go to use repo-relative paths instead of /Volumes/Data/. Co-Authored-By: Virgil <virgil@lethean.io>	2026-02-21 23:49:12 +00:00
Snider	1b742bf92c	feat: native Metal distillation command + .core/ai config Add `lem distill` — full Go pipeline for self-distillation using go-mlx (native Metal inference) and go-i18n/reversal (v3 grammar scoring). Replaces the Python distill.py bridge entirely. New files: - .core/ai/ai.yaml: global defaults (scorer, generation, distill) - .core/ai/models/gemma3/{27b,1b}.yaml: model configs with paths, kernel, lessons, baselines - .core/ai/probes.yaml: probe sets grouped by training phase - pkg/lem/config.go: YAML config loaders for .core/ai/ - pkg/lem/grammar.go: in-process grammar scoring (ComputeGrammarScore, ComputeDelta, ScoreResponse) extracted from cmd/scorer - pkg/lem/distill.go: RunDistill command — best-of-N generation, grammar quality gate, training JSONL output - pkg/lem/backend_metal.go: blank import for go-mlx Metal registration Co-Authored-By: Virgil <virgil@lethean.io>	2026-02-21 23:42:55 +00:00
Claude	08363ee1af	feat: add `lem worker` command for distributed inference network Go client for the LEM distributed inference API (BugSETI/Agentic). Workers register via Forgejo PAT auth, pull prompt batches, run local inference (MLX/vLLM/llama.cpp), submit results. Credits tracked as Phase 1 stub for Phase 2 blockchain LEM token. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-15 18:10:59 +00:00
Claude	774f097855	feat: scaffold LEM Desktop app (Wails v3 system tray + Docker stack) Inspired by BugSETI architecture — system tray with WebView2 windows, Docker Compose stack (Forgejo + InfluxDB + inference proxy), and scoring agent integration. Builds as signed native binary on macOS. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-15 17:43:19 +00:00
Claude	9fac5749c2	feat: add scoring agent + 23 capability probes (replaces scoring_agent.py) Go scoring daemon that polls M3 for unscored LoRA checkpoints, converts MLX→PEFT, runs 23 binary capability probes via OpenAI- compatible API, and pushes results to InfluxDB. Zero Python deps. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-15 17:22:40 +00:00
Claude	91ee389377	feat: convert all pipeline.py commands to Go Complete conversion of pipeline.py into Go `lem` CLI: - import-all: bulk import all LEM data into DuckDB from M3 - consolidate: pull worker JSONLs, merge, deduplicate - normalize: seeds → deduplicated expansion_prompts table - approve: filter scored expansions → training JSONL - tier-score: heuristic/judge tiered expansion scoring - expand-status: expansion pipeline progress from DuckDB - inventory: DuckDB table counts and summary - coverage: seed coverage gap analysis - seed-influx: bootstrap InfluxDB from DuckDB golden_gen - query: ad-hoc SQL against DuckDB 22 commands total, 49 Go files. Replaces entire pipeline.py. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-15 17:12:03 +00:00
Claude	4eaf1bfb39	feat: add parquet, publish, metrics, convert commands - `lem parquet` — export JSONL training splits to Parquet (parquet-go) - `lem publish` — push Parquet files to HuggingFace dataset repo - `lem metrics` — push DuckDB golden set stats to InfluxDB - `lem convert` — MLX LoRA adapter → HuggingFace PEFT format (pure Go safetensors read/write/transpose, no PyTorch needed) Dependencies added: parquet-go, go-huggingface, go-rocm, go-pytorch, gotch Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-15 17:05:08 +00:00
Claude	0afa5e9147	feat: add `lem ingest` command + go-huggingface dependency Ingests benchmark data (content scores, capability scores, training curves) from JSONL files and mlx_lm logs into InfluxDB. Batched writes, iteration extraction from checkpoint labels. Also adds github.com/hupe1980/go-huggingface for future HF sync. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-15 16:55:17 +00:00
Claude	a18fd1c44e	refactor: remove Vi identity from calm conversations Vi identity is a separate training concern. Seed conversations now contain only philosophical/mindfulness content for the R300 calm phase. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-15 16:48:23 +00:00
Claude	c4fb775298	feat: add `lem conv` command for conversational training data Ports conversational_training.py to Go with InfluxDB reporting. 24 built-in seed conversations (Vi identity, philosophy, mindfulness). Supports extra JSONL files and golden set conversion to chat format. Also fixes InfluxDB client to accept 204 No Content on writes. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-15 16:42:46 +00:00
Claude	70dd18c065	refactor: move Go library to pkg/lem, thin main.go All scoring/influx/export/expand logic moves to pkg/lem as an importable package. main.go is now a thin CLI dispatcher. This lets new commands import the shared library directly — ready for converting Python scripts to Go subcommands. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-15 16:30:09 +00:00

19 commits