Charon/LEM - Lethean Network

forked from lthn/LEM

Author	SHA1	Message	Date
Snider	da3db34e45	Merge branch 'main' into feat/standard-benchmarks	2026-02-15 00:13:47 +00:00
Charon	abd63d3342	Add standard benchmark suite using EleutherAI lm-evaluation-harness - run_benchmarks.sh: wrapper for lm-eval with suite presets (quick, classic, leaderboard-v2, full) - compare_models.py: compare base vs LEK results with delta table - Supports HF transformers, local-chat-completions (MLX/Ollama), and vLLM backends - Results comparable to HuggingFace Open LLM Leaderboard Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-15 00:05:48 +00:00
Charon	b8f9191b05	Add missing HF model cards, sync script, and Parquet export - Add 4 missing model cards: Gemma3-1B-layered (v1+v2), Gemma3-27B, GPT-OSS-20B - All 9 HF models now have cards in paper/hf-cards/ - sync_hf.py: push cards + benchmarks + training data to HuggingFace - export_parquet.py: convert JSONL training splits to Parquet (HF dataset format) - Parquet schema: prompt, response, system, messages (JSON) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-14 23:50:18 +00:00
Snider	d722ba1b3d	Merge pull request 'Add generation worker for distributed training data pipeline' (#1) from Charon/LEM:feat/generation-worker into main Reviewed-on: lthn/LEM#1 Reviewed-by: Snider <snider@noreply.forge.lthn.ai>	2026-02-14 22:48:26 +00:00
Charon	e021b6beb0	Add generation worker: gold (15K) + expansion (46K) with InfluxDB coordination. Includes both generation scripts, prompts data, setup script, and worker instructions in README. Workers auto-coordinate via InfluxDB so multiple machines can generate in parallel without duplicating work. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-14 22:46:51 +00:00
Athena	ed0b83a9d9	Update training data to 2,299 examples and rename models LEM→LEK - Replace 160-example POC training set with expanded 2,299-example dataset (1,839 train, 229 valid, 231 test) - Rename all HuggingFace model references from LEM- to LEK- (proof-of-concept) - Add missing models: GPT-OSS-20B, Gemma3-1B-layered-v2 - Rename HF card files to match LEK- convention - Remove duplicate composure texts from kernel/ (kept in composure-library/) - Fix paper repository URL to github.com/LetheanNetwork/LEM Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-13 16:19:56 +00:00
Athena	f0e86b7433	Add regional seeds, expansion rounds, scripts, HF cards, benchmark summary - seeds/regional/: 1,223 cultural/regional seed files across 50+ regions - seeds/expansions/: 8 expansion rounds (r1-r8) with raw text and JSON - seeds/lem-{africa,cn,de,en,eu,me}-all-seeds.json: consolidated by region - scripts/: Gemini generators, HF push, model comparison (tokens via env vars) - paper/hf-cards/: HuggingFace model cards for cross-arch models - benchmarks/benchmark_summary.json: processed PTSD summary data Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-13 13:39:08 +00:00
Snider	53c47131cc	Add cross-architecture training and benchmarking scripts; update README and PAPER with author and repository information	2026-02-12 09:07:32 +00:00
Snider	adda3c8bb5	Benchmark & Findings: lthn/LEM-Gemma-3-1B lthn/LEM-Gemma-3-4B lthn/LEM-Gemma-3-12B lthn/LEM-Gemma-3-27B	2026-02-12 06:38:46 +00:00
Snider	8e5f082f30	LEM+LEK	2026-02-12 04:05:28 +00:00
Snider	f9c422c61a	Add European Union Public License v. 1.2 Added the European Union Public License v. 1.2 to the project. Signed-off-by: Snider <snider@host.uk.com>	2026-02-11 03:46:37 +00:00

11 commits