Replace internal task tracking (TODO.md, FINDINGS.md) with structured
documentation in docs/. Trim CLAUDE.md to agent instructions only.
Co-Authored-By: Virgil <virgil@lethean.io>
5-task TDD plan: dependency setup, types+mapper, ClassifyCorpus with
mock tests, integration test with real model, docs update.
Co-Authored-By: Virgil <virgil@lethean.io>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Streaming batch classification via go-inference Classify() API.
Package-level ClassifyCorpus() function with configurable batch size,
prompt template, and mock-friendly TextModel interface.
Co-Authored-By: Virgil <virgil@lethean.io>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Multi-signal probabilistic disambiguation with two-pass tokenisation.
Seven weighted signals resolve verb/noun ambiguity for words like
"commit", "run", "test", "check", "file", "build". Confidence scores
flow into imprints for the scoring/comprehension use case.
Co-Authored-By: Virgil <virgil@lethean.io>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Grammar engine as zero-cost data augmentation: tense/number/formality
flips across 88K seeds yield 528K+ verified training examples with no
API spend. The reversal engine provides automatic QA on transformed
variants.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Captures the bidirectional grammar engine idea: using go-i18n tables
in reverse as a deterministic parser to extract semantic imprints from
documents without retaining content. Covers TIM/DataNode architecture,
88K seed calibration, Poindexter integration, and privacy properties.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>