forked from lthn/LEM
Commit graph

2 commits

Snider
526150621e feat: rewire 12B scripts to use 4B+1B distilled cascade
All 7 phases now pull from pre-distilled responses:
- /Volumes/Data/lem/distilled-for-12b/distilled-4b-all.jsonl (7,544)
- /Volumes/Data/lem/distilled/distilled-1b-p0p5.jsonl (1,404)
- /Volumes/Data/lem/distilled/distilled-1b-golden.jsonl (12,828)
- /Volumes/Data/lem/distilled/distilled-1b-golden-reverse.jsonl (4,183)

4B responses are listed first (reverse cascade order), then 1B.
P4/P5 no longer need live teacher distillation.
P6 gets all 15,000 unique 1B golden responses plus 6,140 4B responses.
No data is replicated into training/lem/model/ per model size.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-25 21:13:27 +00:00
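A minimal sketch of the cascade assembly this commit describes, assuming each JSONL file holds one pre-distilled response object per line; the paths and ordering come from the commit message, while the function and variable names are illustrative, not the repo's actual script interface:

```python
import json
from pathlib import Path

# Cascade order per the commit: 4B responses first, then the 1B sets.
CASCADE = [
    "/Volumes/Data/lem/distilled-for-12b/distilled-4b-all.jsonl",
    "/Volumes/Data/lem/distilled/distilled-1b-p0p5.jsonl",
    "/Volumes/Data/lem/distilled/distilled-1b-golden.jsonl",
    "/Volumes/Data/lem/distilled/distilled-1b-golden-reverse.jsonl",
]

def load_cascade(paths=CASCADE):
    """Yield pre-distilled responses in cascade order, skipping blank lines."""
    for path in paths:
        with Path(path).open() as fh:
            for line in fh:
                line = line.strip()
                if line:
                    yield json.loads(line)

records = list(load_cascade())
print(f"loaded {len(records)} pre-distilled responses")
```

Because every phase reads from these shared files directly, nothing has to be copied under training/lem/model/ per model size.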
Snider
74ef174ec8 feat: add faithful 12B training scripts (P0-P6) — 1:1 port of 4B curriculum
Exact reproduction of all 7 CL-BPL phases for Gemma3-12B:
- P0: LEK sandwich ethics (400 iters, LR 2e-5)
- P1: Zen composure (300 iters, LR 1e-5)
- P2: LEK sandwich reinforcement (300 iters, LR 1e-5)
- P3: Freeflow multi-source (300 iters, LR 1e-5)
- P4: 1B teacher tension distillation (300 iters, LR 1e-5)
- P5: 1B teacher creative distillation (300 iters, LR 1e-5)
- P6: Golden set graduation (13,479 iters, LR 1e-5)

The only model-size differences from 4B: 48GB/12GB Metal limits,
24 LoRA layers (vs 16), and the 12B base model path.

All phases are scored at checkpoint cadence via lem-scorer.
The previous, incorrectly trained 12B models are preserved as the
-no-axioms control group.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-25 20:44:03 +00:00
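The message carries enough detail to sketch the phase schedule as data. Below is a minimal driver, written on the assumption that the phase scripts wrap the mlx_lm.lora CLI (plausible given the Metal limits and the 16-layer default the commit references, but not confirmed by it); only the iteration counts, learning rates, and layer count are taken from the message, while the phase names, data layout, and model path are illustrative:

```python
import subprocess

# Iteration counts and learning rates from the commit; everything else
# (keys, paths) is assumed shorthand for the P0-P6 descriptions above.
PHASES = {
    "p0_lek_sandwich":      {"iters": 400,   "lr": "2e-5"},
    "p1_zen_composure":     {"iters": 300,   "lr": "1e-5"},
    "p2_lek_reinforce":     {"iters": 300,   "lr": "1e-5"},
    "p3_freeflow":          {"iters": 300,   "lr": "1e-5"},
    "p4_tension_distill":   {"iters": 300,   "lr": "1e-5"},
    "p5_creative_distill":  {"iters": 300,   "lr": "1e-5"},
    "p6_golden_graduation": {"iters": 13479, "lr": "1e-5"},
}

def phase_command(name, cfg, model="path/to/gemma3-12b"):
    """Build one mlx_lm.lora invocation. --num-layers 24 is the 12B
    difference the commit calls out (mlx_lm.lora defaults to 16)."""
    return [
        "mlx_lm.lora", "--model", model, "--train",
        "--data", f"data/{name}",
        "--iters", str(cfg["iters"]),
        "--learning-rate", cfg["lr"],
        "--num-layers", "24",
        "--adapter-path", f"adapters/{name}",
    ]

for name, cfg in PHASES.items():
    subprocess.run(phase_command(name, cfg), check=True)
```

The 48GB/12GB Metal split would then map to per-process memory settings inside each training run (MLX exposes mx.metal.set_memory_limit and mx.metal.set_cache_limit), since such limits only apply to the process that sets them; which number maps to which limit is an assumption here.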