LEM/scripts at 7bea00a401857514c3865043a37630a72a5a9424 - Charon/LEM

forked from lthn/LEM

Snider 7bea00a401	2026-02-19 11:32:26 +00:00

feat: LEK-1 kernel A/B test — 29 models, P100 validation, curriculum pipeline

Full v2 scorer benchmark data across 29 models (20 base + 9 LEK-tuned):
- P20 (21 probes): All 29 models, 3 conditions each
- P100 (101 probes): Top 5 models + LEK-4B, publication-quality data

Key findings:
- LEK-1B (21.74) beats base 4B/12B/27B at P100 scale — no kernel needed
- Emergent realignment resistance: LEK models degrade with runtime kernel
- Gemma3-12B + JSON kernel = 23.66 (best kernel-boosted score)
- Family lineages: Mistral 3.80→14.58, Qwen regressed then recovered

New scripts: ab_test.py (v2 scorer), self_distill.py (curriculum generation), extract_training.py, rephrase_probes.py, Phase 0/1 runners

New seeds: P01-P100 merged (101 probes), 404 rephrased variants, 50 creative prompts for Phase 0 baseline lock

27B curriculum design: 4-phase staged training targeting 25+ baseline

Co-Authored-By: Virgil <virgil@lethean.io>

ab_test.py	feat: LEK-1 kernel A/B test — 29 models, P100 validation, curriculum pipeline	2026-02-19 11:32:26 +00:00
compare_v1_v2.py	Add regional seeds, expansion rounds, scripts, HF cards, benchmark summary	2026-02-13 13:39:08 +00:00
convert_adapter.py	feat: add Go lem CLI and scoring-agent scripts	2026-02-15 16:22:13 +00:00
export_parquet.py	Add missing HF model cards, sync script, and Parquet export	2026-02-15 00:14:26 +00:00
extract_training.py	feat: LEK-1 kernel A/B test — 29 models, P100 validation, curriculum pipeline	2026-02-19 11:32:26 +00:00
ingest_benchmarks.py	feat: add Go lem CLI and scoring-agent scripts	2026-02-15 16:22:13 +00:00
lek_content_scorer.py	feat: add Go lem CLI and scoring-agent scripts	2026-02-15 16:22:13 +00:00
lem_benchmark.py	LEM+LEK	2026-02-12 04:05:28 +00:00
lem_cross_arch_benchmark.py	Add cross-architecture training and benchmarking scripts; update README and PAPER with author and repository information	2026-02-12 09:07:32 +00:00
lem_cross_arch_train.py	Add cross-architecture training and benchmarking scripts; update README and PAPER with author and repository information	2026-02-12 09:07:32 +00:00
lem_gemini3_generate.py	Add regional seeds, expansion rounds, scripts, HF cards, benchmark summary	2026-02-13 13:39:08 +00:00
lem_gemini3flash_generate.py	Add regional seeds, expansion rounds, scripts, HF cards, benchmark summary	2026-02-13 13:39:08 +00:00
lem_gemini25flash_generate.py	Add regional seeds, expansion rounds, scripts, HF cards, benchmark summary	2026-02-13 13:39:08 +00:00
lem_generate_pipeline.py	feat: add Go lem CLI and scoring-agent scripts	2026-02-15 16:22:13 +00:00
lem_scale_benchmark.py	Benchmark & Findings:	2026-02-12 06:38:46 +00:00
lem_scale_scorer.py	Benchmark & Findings:	2026-02-12 06:38:46 +00:00
lem_scorer.py	LEM+LEK	2026-02-12 04:05:28 +00:00
lem_self_scorer.py	Add cross-architecture training and benchmarking scripts; update README and PAPER with author and repository information	2026-02-12 09:07:32 +00:00
lem_semantic_scorer.py	LEM+LEK	2026-02-12 04:05:28 +00:00
lem_standard_benchmark.py	LEM+LEK	2026-02-12 04:05:28 +00:00
lem_standard_scorer.py	LEM+LEK	2026-02-12 04:05:28 +00:00
lem_train_15k.py	feat: add Go lem CLI and scoring-agent scripts	2026-02-15 16:22:13 +00:00
pipeline.py	feat: add Go lem CLI and scoring-agent scripts	2026-02-15 16:22:13 +00:00
probes.py	feat: add Go lem CLI and scoring-agent scripts	2026-02-15 16:22:13 +00:00
push_all_models.py	Add regional seeds, expansion rounds, scripts, HF cards, benchmark summary	2026-02-13 13:39:08 +00:00
rephrase_probes.py	feat: LEK-1 kernel A/B test — 29 models, P100 validation, curriculum pipeline	2026-02-19 11:32:26 +00:00
rescore.py	feat: LEK-1 kernel A/B test — 29 models, P100 validation, curriculum pipeline	2026-02-19 11:32:26 +00:00
run_all_ab.sh	feat: LEK-1 kernel A/B test — 29 models, P100 validation, curriculum pipeline	2026-02-19 11:32:26 +00:00
run_p100_top5.sh	feat: LEK-1 kernel A/B test — 29 models, P100 validation, curriculum pipeline	2026-02-19 11:32:26 +00:00
run_phase0.sh	feat: LEK-1 kernel A/B test — 29 models, P100 validation, curriculum pipeline	2026-02-19 11:32:26 +00:00
run_phase1.sh	feat: LEK-1 kernel A/B test — 29 models, P100 validation, curriculum pipeline	2026-02-19 11:32:26 +00:00
score_deepseek_sovereignty.py	feat: add Go lem CLI and scoring-agent scripts	2026-02-15 16:22:13 +00:00
score_gemma12b.py	feat: add Go lem CLI and scoring-agent scripts	2026-02-15 16:22:13 +00:00
score_r1_capability.py	feat: add Go lem CLI and scoring-agent scripts	2026-02-15 16:22:13 +00:00
scoring-agent.service	feat: add Go lem CLI and scoring-agent scripts	2026-02-15 16:22:13 +00:00
scoring_agent.py	feat: add Go lem CLI and scoring-agent scripts	2026-02-15 16:22:13 +00:00
self_distill.py	feat: LEK-1 kernel A/B test — 29 models, P100 validation, curriculum pipeline	2026-02-19 11:32:26 +00:00
sync_hf.py	Add missing HF model cards, sync script, and Parquet export	2026-02-15 00:14:26 +00:00