1
0
Fork 0
forked from lthn/LEM
LEM/pkg
Snider 42c0af728b fix: raise GQA threshold to ≤4 KV heads for position-wise analysis
Gemma3-4B has 4 KV heads — too few for meaningful pairwise head
coherence (only 6 pairs). Position-wise differentiation gives richer
signal. Multi-head path now requires ≥5 heads.

4B baseline (260 sovereign probes): mean=6487, stdev=153, range=6170-6886.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-23 01:02:13 +00:00
..
lem fix: raise GQA threshold to ≤4 KV heads for position-wise analysis 2026-02-23 01:02:13 +00:00