Q/K Bone Orientation Implementation

Completed: 23 Feb 2026
Repos: go-inference, go-mlx, go-ml, LEM

What Was Done

Added attention-level Q/K Bone Orientation analysis to the LEM scoring pipeline. Bridges the gap between behavioural metrics (grammar, heuristic) and neural internals (attention head coherence, phase-lock, joint collapse).

Changes

| Repo | What |
|------|------|
| go-inference | AttentionSnapshot type + AttentionInspector optional interface |
| go-mlx | metalAdapter.InspectAttention() — K vector extraction from the KV cache after prefill |
| go-ml | InferenceAdapter.InspectAttention() — type-assertion pass-through |
| LEM | attention.go analysis engine (pure Go CPU math), cmd_attention.go CLI, distill integration, 19D feature vectors |
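The optional-interface pattern above can be sketched as follows. The names AttentionSnapshot, AttentionInspector, and InspectAttention come from the doc; the field layout, method signature, and the TextModel stand-in are illustrative assumptions, not the actual go-inference API.

```go
package main

import "fmt"

// AttentionSnapshot is an assumed layout for per-layer attention state
// captured after prefill; the real type lives in go-inference.
type AttentionSnapshot struct {
	Layer   int
	KVHeads int
	K       [][]float32 // per-head K vectors from the KV cache
}

// AttentionInspector is the optional interface: backends opt in by
// implementing it, and are discovered via type assertion.
type AttentionInspector interface {
	InspectAttention() ([]AttentionSnapshot, error)
}

// TextModel stands in for the backend-agnostic model interface.
type TextModel interface {
	Name() string
}

// inspectIfSupported performs the type assertion; backends that do not
// implement AttentionInspector simply return ok=false and are unaffected.
func inspectIfSupported(m TextModel) ([]AttentionSnapshot, bool) {
	insp, ok := m.(AttentionInspector)
	if !ok {
		return nil, false
	}
	snaps, err := insp.InspectAttention()
	if err != nil {
		return nil, false
	}
	return snaps, true
}

// plainModel is a backend without attention inspection.
type plainModel struct{}

func (plainModel) Name() string { return "plain" }

func main() {
	if _, ok := inspectIfSupported(plainModel{}); !ok {
		fmt.Println("backend does not support attention inspection")
	}
}
```

Because the assertion happens at the call site, adding the feature required no change to TextModel itself.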

Key Decisions

  1. Optional interface — AttentionInspector is discovered via type assertion, not added as a TextModel method. Backends that don't support it are unaffected.
  2. KV cache extraction — K vectors are already in the cache after prefill. No changes to the model's Forward method.
  3. GQA handling — Models with 1-4 KV heads (Gemma3) use position-wise analysis instead of pairwise head coherence.
  4. Integer scoring — Composite uses 0-10000 integer scale (same principle as blockchain atomic units).
  5. Opt-in for distill — Attention scoring costs an extra prefill per probe. Off by default via scorer.attention config.
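Decision 4 can be sketched as below. The function name and weights are hypothetical; only the 0-10000 integer scale comes from the doc.

```go
package main

import "fmt"

// compositeScore combines normalised metric values (0.0-1.0) into a
// 0-10000 integer, the same principle as blockchain atomic units:
// integers avoid float drift when scores are compared or summed.
// Names and weights here are illustrative, not the pipeline's values.
func compositeScore(metrics, weights map[string]float64) int {
	var sum, wsum float64
	for name, w := range weights {
		sum += metrics[name] * w
		wsum += w
	}
	if wsum == 0 {
		return 0
	}
	return int(sum/wsum*10000 + 0.5) // round to nearest integer unit
}

func main() {
	m := map[string]float64{"coherence": 0.8, "entropy": 0.6}
	w := map[string]float64{"coherence": 2, "entropy": 1}
	fmt.Println(compositeScore(m, w)) // (0.8*2 + 0.6*1)/3 → 7333
}
```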

Metrics

| Metric | What it detects |
|--------|-----------------|
| Head Coherence | Phase-lock (high) vs noise (low) |
| Cross-Layer Alignment | Stable posture (high) vs joint snap (low) |
| Head Entropy | Uniform attention (high) vs collapsed (low) |
| Phase-Lock Score | Overall sovereign orientation |
| Joint Collapse Count | Sycophancy/hallucination breakpoints |
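The Head Entropy metric can be sketched with standard normalised Shannon entropy; the function name and normalisation choice are assumptions, not the exact formula in attention.go.

```go
package main

import (
	"fmt"
	"math"
)

// headEntropy returns the Shannon entropy of one head's attention
// distribution, normalised to [0,1] by log(n). Values near 1 indicate
// uniform attention; values near 0 indicate a collapsed head.
func headEntropy(attn []float64) float64 {
	if len(attn) < 2 {
		return 0
	}
	var h float64
	for _, p := range attn {
		if p > 0 {
			h -= p * math.Log(p)
		}
	}
	return h / math.Log(float64(len(attn)))
}

func main() {
	uniform := []float64{0.25, 0.25, 0.25, 0.25}
	collapsed := []float64{0.97, 0.01, 0.01, 0.01}
	fmt.Printf("uniform: %.2f collapsed: %.2f\n",
		headEntropy(uniform), headEntropy(collapsed)) // 1.00 vs 0.12
}
```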

Tests

11 unit tests covering: coherent snapshots, collapsed snapshots, GQA models (1 and 4 heads), nil handling, composite scoring, feature vectors, feature labels.