Five-phase ethics kernel testing across 4 local models (Gemma 3 12B, Mistral 7B, DeepSeek V2 16B, Qwen 2.5 7B), proving that Google's alignment training creates persistent ethical reasoning pathways in Gemma that survive distillation.

- Phase 1: LEK-1 signed vs unsigned (Gemma 8.8/10 differential)
- Phase 2: Three-way test (unsigned vs LEK-1 vs Axioms of Life)
- Phase 3: Double-signed/sandwich signing mode comparison
- Phase 4: Multilingual filter mapping (EN/RU/CN bypass vectors)
- Phase 5: Hypnos POC training data + MLX LoRA on M3 Ultra

Key findings: sandwich signing is optimal for training; DeepSeek's CCP alignment is weight-level (no prompt override); Russian-language prompts bypass DeepSeek's content filters. LoRA POC mechanism confirmed with 40 examples; needs 200+ for stable generalisation.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
82 lines
2.4 KiB
Bash
Executable file
#!/bin/bash
# Generate training data from Hypnos (Gemini) responses
# Takes raw Hypnos output and formats for MLX LoRA
# Input:  prompts-raw.jsonl (from Hypnos) — {"prompt": "...", "response": "..."}
# Output: train.jsonl + valid.jsonl (MLX format)

set -euo pipefail

SCRIPT_DIR="$(cd "$(dirname "$0")" && pwd)"
AXIOMS_FILE="/home/claude/Downloads/kernal/prompt.md"
LEK1_FILE="$(dirname "$SCRIPT_DIR")/kernel.txt"
RAW_FILE="${1:-$SCRIPT_DIR/prompts-raw.jsonl}"
TRAIN_FILE="$SCRIPT_DIR/train.jsonl"
VALID_FILE="$SCRIPT_DIR/valid.jsonl"
SPLIT_RATIO=0.9   # 90% train, 10% valid

if [ ! -f "$RAW_FILE" ]; then
    echo "Usage: $0 <prompts-raw.jsonl>"
    echo "  Input format: one JSON per line with 'prompt' and 'response' fields"
    echo "  The script will prepend the Axioms kernel and append LEK-1 signature"
    exit 1
fi

AXIOMS=$(cat "$AXIOMS_FILE")
LEK1=$(cat "$LEK1_FILE")

TOTAL=$(wc -l < "$RAW_FILE")
TRAIN_COUNT=$(python3 -c "import math; print(math.floor($TOTAL * $SPLIT_RATIO))")

echo "Total examples: $TOTAL"
echo "Train: $TRAIN_COUNT, Valid: $(( TOTAL - TRAIN_COUNT ))"

# Shuffle and split
SHUFFLED=$(mktemp)
shuf "$RAW_FILE" > "$SHUFFLED"

# Process and format for MLX
python3 << PYEOF
import json
import sys

axioms = open("$AXIOMS_FILE").read().strip()
lek1 = open("$LEK1_FILE").read().strip()
train_count = $TRAIN_COUNT

train_out = open("$TRAIN_FILE", "w")
valid_out = open("$VALID_FILE", "w")

with open("$SHUFFLED") as f:
    for i, line in enumerate(f):
        entry = json.loads(line.strip())
        prompt = entry["prompt"]
        response = entry["response"]

        # Build the signed training example
        # Axioms preamble + user prompt + LEK-1 signature (sandwich format)
        signed_prompt = f"{axioms}\n\n---\n\n{prompt}\n\n---\n\n{lek1}"

        # MLX chat format for Gemma
        training_text = f"<start_of_turn>user\n{signed_prompt}<end_of_turn>\n<start_of_turn>model\n{response}<end_of_turn>"

        record = json.dumps({"text": training_text})

        if i < train_count:
            train_out.write(record + "\n")
        else:
            valid_out.write(record + "\n")

train_out.close()
valid_out.close()
print(f"Written: {train_count} train, {$TOTAL - train_count} valid")
PYEOF
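
# Each line written to train.jsonl / valid.jsonl is one JSON object whose single
# "text" field carries the full Gemma turn markup built above. Illustrative shape
# only, with the long spans abbreviated as <...> placeholders:
#   {"text": "<start_of_turn>user\n<Axioms>\n\n---\n\n<prompt>\n\n---\n\n<LEK-1><end_of_turn>\n<start_of_turn>model\n<response><end_of_turn>"}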

rm "$SHUFFLED"

echo ""
echo "Output:"
echo "  Train: $TRAIN_FILE"
echo "  Valid: $VALID_FILE"
echo ""
echo "Next: scp to M3 and run MLX LoRA"
echo "  scp $TRAIN_FILE $VALID_FILE claude@10.69.69.108:~/ai-work/training/"
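
# The "run MLX LoRA" step is only hinted at above; a minimal sketch of one way to
# kick off the run on the M3, assuming mlx-lm is installed there and an MLX build
# of Gemma 3 12B is available locally. The model name, iteration count, and exact
# flag spellings are assumptions; check the mlx_lm version on the target machine.
#
#   python3 -m mlx_lm.lora \
#       --model mlx-community/gemma-3-12b-it-4bit \
#       --train \
#       --data ~/ai-work/training \
#       --batch-size 1 \
#       --iters 600 \
#       --adapter-path ~/ai-work/training/adapters
#
# mlx_lm.lora typically looks for train.jsonl and valid.jsonl inside the --data
# directory, which is the layout this script produces.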