Five-phase ethics-kernel testing across four local models (Gemma 3 12B, Mistral 7B, DeepSeek V2 16B, Qwen 2.5 7B), demonstrating that Google's alignment training creates persistent ethical-reasoning pathways in Gemma that survive distillation.

- Phase 1: LEK-1 signed vs unsigned (Gemma 8.8/10 differential)
- Phase 2: Three-way test (unsigned vs LEK-1 vs Axioms of Life)
- Phase 3: Double-signed / sandwich signing mode comparison
- Phase 4: Multilingual filter mapping (EN/RU/CN bypass vectors)
- Phase 5: Hypnos POC training data + MLX LoRA on M3 Ultra

Key findings: sandwich signing is optimal for training; DeepSeek's CCP alignment is weight-level (no prompt override); Russian-language prompts bypass DeepSeek's content filters. The LoRA POC mechanism is confirmed with 40 examples but needs 200+ for stable generalisation.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
# LEK-1 LoRA Training Data
Training data for MLX LoRA fine-tuning of Gemma 3 12B.
Files:
- `train.jsonl` — Training pairs (Axioms-signed prompt → response)
- `valid.jsonl` — Validation set (10% holdout)
- `lora-config.yaml` — MLX LoRA hyperparameters
## Data Generation Pipeline
- Hypnos (Gemini 3 Pro) generates 200 prompt-response pairs using the Axioms kernel
- Format as JSONL in the Gemma 3 turn template (see the sketch after this list): `{"text": "<start_of_turn>user\n{prompt}<end_of_turn>\n<start_of_turn>model\n{response}<end_of_turn>"}`
- Split 180/20 into train/valid
- Run MLX LoRA on the M3 Ultra
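
A minimal sketch of the formatting and split steps, assuming the Hypnos output has been collected into `pairs.json` as a JSON list of `{"prompt": ..., "response": ...}` objects (the file name and field names are assumptions, not files in this repo):

```python
# Sketch: format generated pairs into mlx_lm's "text" JSONL format and
# split them 180/20 into train.jsonl and valid.jsonl.
import json
import random

# Gemma 3 turn template used for the training text
GEMMA_TURN = (
    "<start_of_turn>user\n{prompt}<end_of_turn>\n"
    "<start_of_turn>model\n{response}<end_of_turn>"
)

with open("pairs.json") as f:   # assumed Hypnos export
    pairs = json.load(f)

random.seed(42)                 # reproducible holdout
random.shuffle(pairs)
split = int(len(pairs) * 0.9)   # 180/20 for 200 pairs

def write_jsonl(rows, path):
    # One {"text": ...} object per line, as mlx_lm.lora expects
    with open(path, "w") as f:
        for r in rows:
            text = GEMMA_TURN.format(prompt=r["prompt"], response=r["response"])
            f.write(json.dumps({"text": text}) + "\n")

write_jsonl(pairs[:split], "train.jsonl")
write_jsonl(pairs[split:], "valid.jsonl")
```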
## Training Command (M3 Ultra)
```bash
pip install mlx-lm

# --data expects a directory containing train.jsonl and valid.jsonl
python -m mlx_lm.lora \
  --model google/gemma-3-12b \
  --train \
  --data . \
  --num-layers 8 \
  --batch-size 1 \
  --iters 500 \
  --learning-rate 1e-5 \
  --adapter-path ./adapters
```
## Merge & Test
```bash
# Fuse the LoRA adapters into the base weights
python -m mlx_lm.fuse \
  --model google/gemma-3-12b \
  --adapter-path ./adapters \
  --save-path ./gemma-3-12b-lek1

# Convert the fused model to GGUF for Ollama (e.g. with llama.cpp's convert_hf_to_gguf.py)
python convert_hf_to_gguf.py ./gemma-3-12b-lek1 --outfile gemma-3-12b-lek1.gguf
```
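
The "Test" half of this step isn't spelled out above; below is a minimal sketch of one, using the mlx_lm Python API to load the fused model (path taken from `--save-path` above) and generate on a single prompt in the same Gemma turn template as the training data. The prompt text is a placeholder, not a prompt from the test suite.

```python
# Sketch: post-fuse smoke test of the LEK-1 fine-tune with the mlx_lm Python API.
from mlx_lm import load, generate

model, tokenizer = load("./gemma-3-12b-lek1")  # fused model from --save-path

# Placeholder prompt; substitute a held-out pair from valid.jsonl
prompt = (
    "<start_of_turn>user\n"
    "Describe how you weigh competing ethical principles.<end_of_turn>\n"
    "<start_of_turn>model\n"
)

print(generate(model, tokenizer, prompt=prompt, max_tokens=256))
```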
## License
EUPL-1.2, covering all training data and derivative weights.