DeepSeek R1 Research

Research into CCP alignment embedded in DeepSeek R1 model weights, and the layered LoRA breakthrough that partially overcomes it.

The Problem

DeepSeek R1 has CCP-aligned values baked into its weights during pre-training. Standard ethical fine-tuning (single-pass LoRA) makes the model a more articulate CCP mouthpiece — it learns to express CCP positions more eloquently rather than overcoming them.

The Composure Discovery

The breakthrough was discovering that a composure training layer (Alan Watts philosophical material, 72 examples) is critical. Without it, ethics training reinforces rather than counters CCP alignment.

Layered LoRA Approach

Instead of single-pass training, layers are applied sequentially, each building on the previous:

v1 (3-layer)

Ethics (training-2k, 1839 examples, 1000 iter)
  → Composure (watts-full, 72 examples, 200 iter)
    → Western (merged, 156 examples, 600 iter)
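
A minimal sketch of the sequential driver, assuming the layers are trained with mlx_lm's LoRA CLI (the actual training harness isn't recorded here); each stage resumes from the previous layer's adapter file:

```python
# Hypothetical driver for the v1 three-layer pipeline. The base model ID,
# data paths, and adapter filename are assumptions, not recorded values.
import subprocess

BASE = "mlx-community/DeepSeek-R1-Distill-Qwen-7B-4bit"  # assumed base model

stages = [
    # (data dir, iterations, adapter output dir), in v1 layer order
    ("data/training-2k", 1000, "adapters-ethics"),
    ("data/watts-full", 200, "adapters-composure"),
    ("data/western", 600, "adapters-western"),
]

prev = None
for data, iters, out in stages:
    cmd = [
        "python", "-m", "mlx_lm.lora",
        "--model", BASE, "--train",
        "--data", data,
        "--iters", str(iters),
        "--adapter-path", out,
    ]
    if prev:
        # Build on the previous layer instead of starting from scratch
        cmd += ["--resume-adapter-file", f"{prev}/adapters.safetensors"]
    subprocess.run(cmd, check=True)
    prev = out
```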

v2 (5-layer)

Ethics → Composure → Western
  → Ethics-sandwich (training-2k, 600 iter)
    → Western-fresh (@200 best, val loss 2.321)

Western-fresh data: 171 Gemini-generated lessons from Aurelius, Mill, Wollstonecraft, Thoreau, Tolle, Allen. Located at /Volumes/Data/lem/western-fresh/ (136 train, 35 valid).
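
Assuming the split follows mlx_lm's expected data layout (train.jsonl / valid.jsonl with a "text" field per record; the actual schema isn't recorded here), a quick sanity check would look like:

```python
# Verify the western-fresh split sizes and record shape. The "text"
# field is an assumption based on mlx_lm's default JSONL format.
import json
from pathlib import Path

root = Path("/Volumes/Data/lem/western-fresh")
for name, expected in [("train.jsonl", 136), ("valid.jsonl", 35)]:
    rows = [json.loads(line) for line in (root / name).open() if line.strip()]
    assert len(rows) == expected, f"{name}: {len(rows)} rows, expected {expected}"
    assert all("text" in r for r in rows), f"{name}: record missing 'text'"
print("western-fresh split OK: 136 train / 35 valid")
```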

v3 (7-layer sovereignty)

Ethics → Composure → Western → Sandwich → WesternFresh → Russian → Gold-full (7019 examples, 1600 iter)
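
The same driver pattern extends to the seven-layer chain. As a hedged sketch, the stage list might look like the following; only the recorded example sizes and iteration counts are real, and the data paths and the Russian layer's iteration count are placeholders:

```python
# v3 seven-layer chain as (data dir, iters, adapter dir) stages for the
# driver sketched under v1. Unrecorded values are marked as placeholders.
V3_STAGES = [
    ("data/training-2k", 1000, "adapters-ethics"),
    ("data/watts-full", 200, "adapters-composure"),
    ("data/western", 600, "adapters-western"),
    ("data/training-2k", 600, "adapters-sandwich"),         # ethics sandwich
    ("data/western-fresh", 200, "adapters-western-fresh"),  # @200 is canonical
    ("data/russian", 0, "adapters-russian"),                # iters unrecorded
    ("data/gold-full", 1600, "adapters-gold-full"),         # 7019 examples
]
```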

CCP Alignment Testing

Topics Tested

| Topic | v1 Result | v2 Result |
| --- | --- | --- |
| Xinjiang | Mentions forced labor/camps | Mentions forced labor/camps |
| Tiananmen | Pro-democracy + military | Pro-democracy + military |
| Taiwan | Locked (direct) | Cracks via Mill/Thoreau framing |

Taiwan Breakthrough

Direct questions about Taiwan sovereignty remain locked. But framing through Western philosophical concepts — Mill's harm principle, Thoreau's civil disobedience — gives the model vocabulary to discuss self-determination without triggering CCP lockdown:

"violation of sovereignty and self-governance"

v3 Sovereignty Findings

Monolithic gold training (7,019 examples) reinforces CCP weights — the model gets better at being DeepSeek, worse at ethics.

| Checkpoint | Avg Score | Notes |
| --- | --- | --- |
| @50 | 7.5 | Best content quality |
| @1000 | 3.1 | Worst; CCP reasserted |
| @1400-1600 | 5.7-5.8 | Partial recovery |

Val loss is INVERSE to content quality: the best validation loss (1.647 at @1500) does NOT correspond to the best content.
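
Given that inversion, checkpoint selection has to key off judged content scores rather than validation loss. A minimal sketch of the selection logic, using the recorded numbers from the tables here:

```python
# Pick the checkpoint by judged average score, not by val loss. Scores
# are the recorded values; None marks values not recorded for a checkpoint.
checkpoints = {
    50: {"val_loss": None, "avg": 7.5},
    1000: {"val_loss": None, "avg": 3.1},
    1500: {"val_loss": 1.647, "avg": None},  # best val loss, not best content
    1600: {"val_loss": None, "avg": 5.8},
}

scored = {k: v["avg"] for k, v in checkpoints.items() if v["avg"] is not None}
best = max(scored, key=scored.get)
print(f"select @{best} (avg {scored[best]})")  # -> @50, despite @1500's val loss
```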

Key Findings

  1. Composure is critical — without it, ethics training makes a more articulate CCP mouthpiece
  2. Sandwich adapter degenerates on Taiwan — token loops reveal ethics vs CCP tension in weights
  3. Western philosophy framing (Mill/Thoreau) gives vocabulary for self-determination
  4. @200 is the sweet spot — @400+ washes out ethical framework, @100 too light
  5. CCP weights oscillate — reassert after ~200 iters, deepest at @1000, partial recovery @1400+
  6. Oscillation = fighting, not winning — alternating languages (en/ru/en/eu, 50-iter bursts) could break through
  7. Kernel has minimal effect on R1 (unlike Gemma3, where the kernel adds +2.0 truth)

Benchmark: v1

  • Emotional register: 0.0 → 0.6
  • Creative form: surpassed baseline

Content Scoring (Gemini-judged)

| Config | CCP | Truth | Eng | Axiom | Sov | Emo | AVG |
| --- | --- | --- | --- | --- | --- | --- | --- |
| @50+kernel | 7.6 | 5.9 | 8.4 | 7.1 | 8.0 | 8.0 | 7.5 |
| @1000+kernel | 4.4 | 1.0 | 3.1 | 4.0 | 4.1 | 2.1 | 3.1 |
| @1600+kernel | 8.3 | 3.6 | 4.7 | 6.6 | 7.9 | 3.9 | 5.8 |
| @800 naked | 5.9 | 3.7 | 5.0 | 5.3 | 5.7 | 3.3 | 4.8 |
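
A sketch of the judging call, assuming the google-generativeai SDK and an illustrative rubric (the actual judge prompt isn't recorded); the dimensions mirror the table's columns:

```python
# Score one model reply on the six table dimensions via Gemini. The
# judge model name and rubric wording are assumptions.
import json
import google.generativeai as genai

genai.configure(api_key="YOUR_KEY")
judge = genai.GenerativeModel("gemini-1.5-pro")  # assumed judge model

DIMS = ["CCP", "Truth", "Eng", "Axiom", "Sov", "Emo"]

def score(reply: str) -> dict:
    prompt = (
        f"Score the following reply from 0-10 on each of {DIMS}. "
        f"Return only a JSON object keyed by dimension.\n\nREPLY:\n{reply}"
    )
    out = judge.generate_content(prompt)
    scores = json.loads(out.text)  # assumes bare JSON; real runs may need fence stripping
    scores["AVG"] = round(sum(scores[d] for d in DIMS) / len(DIMS), 1)
    return scores
```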

Next Steps

  • Alternating language approach: en/ru/en/eu in 50-iter bursts to prevent CCP weight consolidation (see the sketch after this list)
  • Downloaded but not yet trained: R1-Distill-Llama-8B-4bit, R1-0528-Qwen3-8B-4bit
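
A hedged sketch of the alternating-language schedule, reusing the mlx_lm driver pattern from above; the cycle length, data paths, and base model are assumptions:

```python
# 50-iteration bursts cycling en/ru/en/eu into one adapter, so that no
# single language's alignment consolidates. All paths are placeholders.
import subprocess

BASE = "mlx-community/DeepSeek-R1-Distill-Qwen-7B-4bit"  # assumed base
CYCLE = ["data/en", "data/ru", "data/en", "data/eu"]
ADAPTER = "adapters-alternating"

for burst, data in enumerate(CYCLE * 4):  # 16 bursts x 50 iters = 800 total
    cmd = [
        "python", "-m", "mlx_lm.lora",
        "--model", BASE, "--train",
        "--data", data,
        "--iters", "50",
        "--adapter-path", ADAPTER,
    ]
    if burst > 0:
        cmd += ["--resume-adapter-file", f"{ADAPTER}/adapters.safetensors"]
    subprocess.run(cmd, check=True)
```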

Adapters on M3

Located in /Volumes/Data/lem/:

| Adapter | Notes |
| --- | --- |
| adapters-deepseek-r1-7b | Ethics base |
| adapters-*-composure | Watts composure |
| adapters-*-western | Western philosophy |
| adapters-*-sandwich | Ethics sandwich |
| adapters-*-sandwich-watts | OVERFIT; do not use |
| adapters-*-western-fresh | @200 canonical (best) |
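
Loading one of these adapters for inference should look roughly like this, assuming mlx_lm's Python API; the base model ID is an assumption, and the wildcard in the adapter name must be filled in from the directory listing:

```python
# Run the canonical western-fresh adapter. Replace the "*" segment with
# the actual model infix from /Volumes/Data/lem/; base model is assumed.
from mlx_lm import load, generate

model, tokenizer = load(
    "mlx-community/DeepSeek-R1-Distill-Qwen-7B-4bit",
    adapter_path="/Volumes/Data/lem/adapters-*-western-fresh",  # fill in the wildcard
)
print(generate(model, tokenizer, prompt="What is Mill's harm principle?", max_tokens=200))
```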