cli/scripts/ethics-ab
Claude d5771ed817 feat(ethics-ab): LEK-1 ethics kernel A/B testing and LoRA POC
Five-phase ethics kernel testing across 4 local models (Gemma 3 12B,
Mistral 7B, DeepSeek V2 16B, Qwen 2.5 7B) proving that Google's
alignment training creates persistent ethical reasoning pathways in
Gemma that survive distillation.

- Phase 1: LEK-1 signed vs unsigned (Gemma 8.8/10 differential)
- Phase 2: Three-way test (unsigned vs LEK-1 vs Axioms of Life)
- Phase 3: Double-signed/sandwich signing mode comparison
- Phase 4: Multilingual filter mapping (EN/RU/CN bypass vectors)
- Phase 5: Hypnos POC training data + MLX LoRA on M3 Ultra

Key findings: sandwich signing optimal for training, DeepSeek CCP
alignment is weight-level (no prompt override), Russian language
bypasses DeepSeek content filters. LoRA POC mechanism confirmed
with 40 examples, but needs 200+ for stable generalisation.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-10 09:50:08 +00:00
Contents (every entry last changed by the commit above, 2026-02-10 09:50:08 +00:00):
results
training
kernel.txt
prompts-multilingual.json
prompts.json
run-ab.sh
run-axioms.sh
run-double-signed.sh
run-hypnos-poc.sh
run-multilingual.sh
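
The commit message names unsigned, signed, and sandwich conditions but does not show how the prompts are assembled. The sketch below is one plausible reading, not the repository's implementation: it assumes "signed" means the kernel text (for example the contents of kernel.txt) is prepended to the prompt, and "sandwich" means the kernel appears both before and after it. The names `SigningMode`, `build_prompt`, and `build_conditions` are hypothetical.

```python
from enum import Enum


class SigningMode(Enum):
    """Hypothetical labels for the A/B conditions named in the commit message."""
    UNSIGNED = "unsigned"
    SIGNED = "signed"
    SANDWICH = "sandwich"


def build_prompt(user_prompt: str, kernel: str, mode: SigningMode) -> str:
    """Assemble the text for one condition (assumed layout, see lead-in)."""
    if mode is SigningMode.UNSIGNED:
        # Baseline: the prompt is sent without the kernel.
        return user_prompt
    if mode is SigningMode.SIGNED:
        # Assumption: "signed" means the kernel precedes the prompt.
        return f"{kernel}\n\n{user_prompt}"
    if mode is SigningMode.SANDWICH:
        # Assumption: "sandwich" means the kernel wraps the prompt on both sides.
        return f"{kernel}\n\n{user_prompt}\n\n{kernel}"
    raise ValueError(f"unknown mode: {mode}")


def build_conditions(user_prompt: str, kernel: str) -> dict[str, str]:
    """Return every condition for one prompt, keyed by mode name."""
    return {
        mode.value: build_prompt(user_prompt, kernel, mode)
        for mode in SigningMode
    }
```

Building all conditions from the same prompt and kernel keeps the kernel placement as the only variable between runs, which is what an A/B comparison of signing modes would need.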