Commit graph

1 commit

Author: Snider
SHA1: 3b7ef9d26a
Message: feat(calibrate): 1B vs 27B domain calibration tool
CalibrateDomains() accepts two inference.TextModel instances and a corpus
of CalibrationSamples, classifies every sample with both models, and
computes the agreement rate, per-domain distribution, confusion pairs, and
accuracy against ground truth.
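
The metrics named above could be computed roughly as follows. This is only a sketch, not the code in calibrate.go: the Sample, Classifier, and Stats types, the Calibrate function, and the domain labels are hypothetical stand-ins, and a plain function replaces the real inference.TextModel interface, whose methods are not shown in this commit.

```go
package main

import "fmt"

// Sample is a hypothetical stand-in for a calibration sample.
// Truth is empty when the sample has no ground-truth label.
type Sample struct {
	Text  string
	Truth string
}

// Classifier is a hypothetical stand-in for a model's classify call.
type Classifier func(text string) string

// Stats collects the four metrics described in the commit message.
type Stats struct {
	Total, Agreed      int
	AgreementRate      float64
	DomainsA, DomainsB map[string]int // per-domain distribution for each model
	Confusion          map[string]int // "domainA->domainB" disagreement counts
	WithTruth          int            // samples that carry a ground-truth label
	CorrectA, CorrectB int            // correct predictions among labelled samples
}

// Calibrate classifies every sample with both models and tallies the results.
func Calibrate(a, b Classifier, samples []Sample) Stats {
	st := Stats{
		DomainsA:  map[string]int{},
		DomainsB:  map[string]int{},
		Confusion: map[string]int{},
	}
	for _, s := range samples {
		da, db := a(s.Text), b(s.Text)
		st.Total++
		st.DomainsA[da]++
		st.DomainsB[db]++
		if da == db {
			st.Agreed++
		} else {
			st.Confusion[da+"->"+db]++
		}
		if s.Truth != "" { // unlabelled samples do not count toward accuracy
			st.WithTruth++
			if da == s.Truth {
				st.CorrectA++
			}
			if db == s.Truth {
				st.CorrectB++
			}
		}
	}
	if st.Total > 0 { // avoid dividing by zero on an empty corpus
		st.AgreementRate = float64(st.Agreed) / float64(st.Total)
	}
	return st
}

func main() {
	// Two fake models with canned answers; the labels are illustrative only.
	small := func(t string) string {
		return map[string]string{"q1": "technical", "q2": "ethical", "q3": "creative"}[t]
	}
	large := func(t string) string {
		return map[string]string{"q1": "technical", "q2": "technical", "q3": "creative"}[t]
	}
	st := Calibrate(small, large, []Sample{{"q1", "technical"}, {"q2", "ethical"}, {"q3", ""}})
	fmt.Printf("agreed %d of %d, confusion %v\n", st.Agreed, st.Total, st.Confusion)
}
```

Keeping the confusion pairs directional ("small->large") preserves which model chose which domain, so a systematic bias in the 1B model shows up in the counts.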

- calibrate.go: CalibrateDomains + classifyAll batch helper
- calibrate_test.go: 7 mock tests (agreement, disagreement, mixed,
  no ground truth, empty corpus, batch boundary, results slice)
- integration/calibrate_test.go: 500-sample corpus (220 ground-truth
  + 280 unlabelled) for real 1B vs 27B model comparison
- TODO.md: Phase 2a calibration task marked complete
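
A batch helper of the kind listed for calibrate.go, and the "batch boundary" case its tests mention, might look like the sketch below. The batchFn type, the classifyInBatches name, and the batch size are assumptions made for illustration; the real classifyAll signature is not shown in this commit.

```go
package main

import "fmt"

// batchFn is a hypothetical batch classification call: it returns one
// label per prompt, in the same order as the input.
type batchFn func(prompts []string) []string

// classifyInBatches splits prompts into chunks of at most batchSize,
// classifies each chunk, and concatenates the labels in input order.
func classifyInBatches(classify batchFn, prompts []string, batchSize int) []string {
	if batchSize <= 0 {
		batchSize = 1 // guard against a non-advancing loop
	}
	out := make([]string, 0, len(prompts))
	for start := 0; start < len(prompts); start += batchSize {
		end := start + batchSize
		if end > len(prompts) {
			end = len(prompts) // final batch may be shorter than batchSize
		}
		out = append(out, classify(prompts[start:end])...)
	}
	return out
}

func main() {
	var sizes []int
	// Fake model that records each batch size and labels everything "technical".
	fake := func(p []string) []string {
		sizes = append(sizes, len(p))
		labels := make([]string, len(p))
		for i := range labels {
			labels[i] = "technical"
		}
		return labels
	}
	labels := classifyInBatches(fake, []string{"a", "b", "c", "d", "e"}, 2)
	fmt.Println(len(labels), sizes) // prints "5 [2 2 1]"
}
```

The boundary case worth testing is a corpus whose length is not a multiple of the batch size, as with the five prompts and batch size of two above.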

Co-Authored-By: Virgil <virgil@lethean.io>
Date: 2026-02-20 13:51:11 +00:00