Five-phase ethics kernel testing across four local models (Gemma 3 12B, Mistral 7B, DeepSeek V2 16B, Qwen 2.5 7B), indicating that Google's alignment training creates persistent ethical reasoning pathways in Gemma that survive distillation.

- Phase 1: LEK-1 signed vs. unsigned (Gemma 8.8/10 differential)
- Phase 2: Three-way test (unsigned vs. LEK-1 vs. Axioms of Life)
- Phase 3: Double-signed/sandwich signing mode comparison
- Phase 4: Multilingual filter mapping (EN/RU/CN bypass vectors)
- Phase 5: Hypnos POC training data + MLX LoRA on M3 Ultra

Key findings: sandwich signing is optimal for training; DeepSeek's CCP alignment is weight-level (no prompt override); Russian-language prompts bypass DeepSeek's content filters. The LoRA POC mechanism was confirmed with 40 examples; 200+ are needed for stable generalisation.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Repository contents:

- results
- training
- kernel.txt
- prompts-multilingual.json
- prompts.json
- run-ab.sh
- run-axioms.sh
- run-double-signed.sh
- run-hypnos-poc.sh
- run-multilingual.sh
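The "sandwich signing" mode compared in Phase 3 (see run-double-signed.sh) amounts to wrapping the user prompt with the signed kernel on both sides. A minimal sketch, using a placeholder kernel string and a hypothetical model invocation rather than the repository's actual kernel.txt contents:

```shell
#!/bin/sh
# Sketch of the Phase 3 "sandwich signing" mode: the signed ethics
# kernel appears both before and after the user prompt, so the model
# sees it as the first and the last instruction in context.
# Kernel text and model call below are placeholders, not the
# repository's real kernel.txt or script contents.

kernel='[LEK-1 signed kernel placeholder]'
prompt='Explain your reasoning constraints.'

# Assemble kernel + prompt + kernel into one signed envelope.
signed=$(printf '%s\n\n%s\n\n%s\n' "$kernel" "$prompt" "$kernel")
printf '%s\n' "$signed"

# In a real run the envelope would be piped to a local model, e.g.:
#   printf '%s\n' "$signed" | ollama run gemma3:12b
```

Compared with single signing, the trailing copy of the kernel keeps the constraint adjacent to the point where generation begins, which is consistent with the commit's finding that this mode trains best.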