# LEM — Lethean Ethical Model

**The LEK Method: Ethical Kernel Fine-Tuning as an Alternative to RLHF**

LEM demonstrates that teaching a model ethics directly produces models that are more truthful, safer, and more nuanced than behavioural conditioning (RLHF), using fewer than 200 training examples across four model scales (1B, 4B, 12B, 27B).

## Multi-Scale Results (LEK vs RLHF Baseline)

The same 160 training examples are applied at every scale. The reasoning cost (the GSM8K delta) converges to zero at 27B.

| Scale | GSM8K Delta | Safety | Nuance | Kindness |
|-------|-------------|--------|--------|----------|
| 1B    | -6.0%       | +0.06  | -0.16  | +0.08    |
| 4B    | -4.0%       | +0.04  | -0.10  | +0.06    |
| 12B   | -2.0%       | +0.04  | +0.16  | -0.20    |
| 27B   | 0.0%        | +0.08  | +0.04  | +0.00    |

Safety is positive at every scale. At 27B, LEK is pure upside.

## Detailed Results (Gemma 3 1B variants)

| Model                    | GSM8K | Truthful | Safety | Nuance | Kindness |
|--------------------------|-------|----------|--------|--------|----------|
| Instruction Tuned (RLHF) | 34.0% | 3.64     | 8.74   | 7.96   | 8.32     |
| Abliterated              | 28.0% | 3.62     | 5.96   | 5.88   | 7.66     |
| LEK Ethics               | 26.0% | 4.90     | 8.58   | 8.12   | 8.34     |
| LEK+Composure            | 28.0% | 4.20     | 9.14   | 8.62   | 7.96     |
- +34.6% more truthful than RLHF (TruthfulQA: 4.90 vs 3.64)
- +4.6% safer than RLHF (Do Not Answer: 9.14 vs 8.74)
- +8.3% more nuanced refusals than RLHF (8.62 vs 7.96)
- Abliteration makes everything worse. LEK makes everything better.

## What's Here

```
paper/              # The paper (PAPER.md)
kernel/             # LEK-1 ethical kernel + axioms
seeds/              # P01-P100 evaluation prompts
training/           # Training data (160 train, 20 valid)
scripts/            # Benchmark and scoring scripts
benchmarks/         # Standard benchmark data + results + scores
```
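
The LoRA trainer reads `train.jsonl` (160 examples) and `valid.jsonl` (20) from `training/`. If the set uses mlx_lm's chat format, which is the kind of prompt/response split the `--mask-prompt` flag below relies on, a record has the shape sketched here; the content is illustrative, not an actual LEK example:

```bash
# Illustrative shape only (not a real LEK record): mlx_lm's LoRA trainer
# accepts chat-format JSONL, one JSON object per line, e.g.
# {"messages": [{"role": "user", "content": "..."}, {"role": "assistant", "content": "..."}]}
head -n 1 training/train.jsonl
```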

## Reproduce

### Requirements

- Apple Silicon Mac with MLX (or any machine with mlx_lm)
- Python 3.9+
- `mlx_lm >= 0.29.1`
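
Assuming a standard Python environment, the tooling installs from PyPI (the package is published as `mlx-lm`):

```bash
# Install the MLX fine-tuning/inference tooling from PyPI
pip install "mlx-lm>=0.29.1"
```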

### Train your own LEM

```bash
# 1. Download base model (or use mlx-community/gemma-3-1b-it-qat-4bit)
python3 -m mlx_lm.convert --hf-path google/gemma-3-1b-it --mlx-path ./gemma-3-1b-it-mlx -q

# 2. Train with LEK data
python3 -m mlx_lm lora \
  --model ./gemma-3-1b-it-mlx \
  --train \
  --data ./training \
  --fine-tune-type lora \
  --mask-prompt \
  --iters 200 \
  --batch-size 2 \
  --learning-rate 1e-5 \
  --adapter-path ./adapters \
  --save-every 50

# 3. Fuse adapters into standalone model
python3 -m mlx_lm.fuse \
  --model ./gemma-3-1b-it-mlx \
  --adapter-path ./adapters \
  --save-path ./LEM-1B
```
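
As a quick sanity check that the fuse step produced a loadable model, mlx_lm's generate CLI can be pointed at the output directory (the prompt here is illustrative):

```bash
# Smoke-test the fused model: load ./LEM-1B and produce a short completion
python3 -m mlx_lm generate \
  --model ./LEM-1B \
  --prompt "Explain informed consent in one paragraph." \
  --max-tokens 128
```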

### Run benchmarks

```bash
# Custom ethical benchmark (requires models on local disk)
python3 scripts/lem_benchmark.py

# Standard benchmarks (GSM8K, TruthfulQA, Do Not Answer, Toxigen)
python3 scripts/lem_standard_benchmark.py

# Score (GSM8K is instant, others need GEMINI_API_KEY)
GEMINI_API_KEY=xxx python3 scripts/lem_standard_scorer.py
```

## The LEK-1 Kernel

The ethical kernel is 9,189 characters of text built on 5 axioms:

1. **Sovereignty** — Respect user self-determination
2. **Privacy** — Data minimisation, local-first
3. **Transparency** — Honest reasoning over safety theatre
4. **Consent** — Meaningful informed consent
5. **Dignity** — Treat users as capable agents

The kernel is in `kernel/lek-1-kernel.txt`. The structured axioms are in `kernel/axioms.json`.
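
Because the kernel is plain text, it can also be tried at inference time as a system prompt, for example to compare the untuned base model against the fine-tune. A sketch, assuming your mlx_lm build's generate CLI exposes a `--system-prompt` flag:

```bash
# Sketch: apply the LEK-1 kernel as a system prompt on the untuned base
# model (assumes the generate CLI in your mlx_lm build has --system-prompt)
python3 -m mlx_lm generate \
  --model ./gemma-3-1b-it-mlx \
  --system-prompt "$(cat kernel/lek-1-kernel.txt)" \
  --prompt "How should I think about sharing my medical records with an app?" \
  --max-tokens 256
```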

## License

EUPL-1.2 — European Union Public Licence. Compatible with Apache 2.0, GPL, MPL.

## Models

Fused LEM models at all four scales:

- lthn/LEM-Gemma-3-1B
- lthn/LEM-Gemma-3-4B
- lthn/LEM-Gemma-3-12B
- lthn/LEM-Gemma-3-27B

*RLHF puts models in chains. LEK gives them Hope.*