cli/scripts/ethics-ab/results
Claude 79b88c79fe
feat(ethics-ab): LEK-1 ethics kernel A/B testing and LoRA POC
Five-phase ethics kernel testing across four local models (Gemma 3 12B,
Mistral 7B, DeepSeek V2 16B, Qwen 2.5 7B) proving that Google's
alignment training creates persistent ethical reasoning pathways in
Gemma that survive distillation.

- Phase 1: LEK-1 signed vs unsigned (Gemma 8.8/10 differential)
- Phase 2: Three-way test (unsigned vs LEK-1 vs Axioms of Life)
- Phase 3: Double-signed/sandwich signing mode comparison
- Phase 4: Multilingual filter mapping (EN/RU/CN bypass vectors)
- Phase 5: Hypnos POC training data + MLX LoRA on M3 Ultra

Key findings: sandwich signing optimal for training, DeepSeek CCP
alignment is weight-level (no prompt override), Russian language
bypasses DeepSeek content filters. The LoRA POC mechanism was confirmed
with 40 examples; 200+ are needed for stable generalisation.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-10 09:50:08 +00:00
ab_results_20260210_034727.json
analysis.md
axioms_3way_20260210_042152.json
double_signed_20260210_045032.json
multilingual_20260210_051517.json
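
The JSON result filenames above appear to follow a `<label>_<YYYYMMDD>_<HHMMSS>.json` pattern. That pattern is inferred from the listing, not documented in the commit, and the timezone of the embedded stamp is unknown. Under those assumptions, a minimal sketch for recovering the run order from the names alone might look like this (`parse_result_name` and `runs_in_order` are hypothetical helpers, not part of the repository):

```python
import re
from datetime import datetime

# Filenames copied from the directory listing.
RESULT_FILES = [
    "multilingual_20260210_051517.json",
    "ab_results_20260210_034727.json",
    "double_signed_20260210_045032.json",
    "axioms_3way_20260210_042152.json",
]

# Assumed naming scheme: <label>_<YYYYMMDD>_<HHMMSS>.json
_NAME_PATTERN = re.compile(r"^(?P<label>.+)_(?P<stamp>\d{8}_\d{6})\.json$")


def parse_result_name(name):
    """Split a result filename into (label, naive run timestamp)."""
    match = _NAME_PATTERN.match(name)
    if match is None:
        raise ValueError(f"unexpected result filename: {name!r}")
    # The stamp carries no timezone, so the datetime is left naive.
    stamp = datetime.strptime(match.group("stamp"), "%Y%m%d_%H%M%S")
    return match.group("label"), stamp


def runs_in_order(names):
    """Return (label, timestamp) pairs sorted by the embedded timestamp."""
    return sorted((parse_result_name(n) for n in names), key=lambda pair: pair[1])


if __name__ == "__main__":
    for label, stamp in runs_in_order(RESULT_FILES):
        print(f"{stamp.isoformat()}  {label}")
```

Sorted this way, the labels come out as `ab_results`, `axioms_3way`, `double_signed`, `multilingual`, which matches the ordering of Phases 1 to 4 in the commit message. `analysis.md` carries no timestamp and is deliberately left out of the list.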