Shared inference interfaces (TextModel, Backend, Token). Zero deps, compiles everywhere.
Expose prefill/decode timing, token counts, throughput, and GPU memory stats from the last inference operation. Uses the same retrieval pattern as Err().
Co-Authored-By: Virgil <virgil@lethean.io>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
| File |
|---|
| CLAUDE.md |
| FINDINGS.md |
| go.mod |
| inference.go |
| options.go |
| README.md |
| TODO.md |
go-inference
Shared inference interfaces (TextModel, Backend, Token). Zero deps, compiles everywhere.
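
The latest commit message describes metrics that are read back after an inference operation, following the same retrieval pattern as Err(). The sketch below illustrates one way that pattern could look. It is not the package's actual API: the `Metrics` struct, its field names, the `Generate` signature, the `DecodeTokensPerSec` helper, and the `fakeModel` type are all hypothetical, and only the names `TextModel`, `Token`, and `Err()` come from the text above.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
	"time"
)

// Token is one generated token. The fields are assumptions for illustration.
type Token struct {
	ID   int32
	Text string
}

// Metrics is a hypothetical snapshot of the last inference operation.
type Metrics struct {
	PromptTokens    int           // tokens processed during prefill
	GeneratedTokens int           // tokens produced during decode
	PrefillDuration time.Duration // time spent on the prompt
	DecodeDuration  time.Duration // time spent generating
	PeakMemoryBytes uint64        // peak GPU memory, if the backend reports it
}

// DecodeTokensPerSec derives throughput from the stored counters.
func (m Metrics) DecodeTokensPerSec() float64 {
	if m.DecodeDuration <= 0 {
		return 0
	}
	return float64(m.GeneratedTokens) / m.DecodeDuration.Seconds()
}

// TextModel is a simplified stand-in for the shared interface.
// Err and Metrics both describe the most recent Generate call.
type TextModel interface {
	Generate(prompt string, yield func(Token) bool)
	Err() error
	Metrics() Metrics
}

// fakeModel is a toy implementation with fixed timings.
type fakeModel struct {
	err     error
	metrics Metrics
}

func (f *fakeModel) Generate(prompt string, yield func(Token) bool) {
	// Reset state so Err and Metrics never report a previous call.
	f.err = nil
	f.metrics = Metrics{}
	if strings.TrimSpace(prompt) == "" {
		f.err = errors.New("empty prompt")
		return
	}
	generated := 0
	for i, w := range []string{"hello", "world"} {
		generated++
		if !yield(Token{ID: int32(i), Text: w}) {
			break // caller stopped early
		}
	}
	f.metrics = Metrics{
		PromptTokens:    len(strings.Fields(prompt)), // whitespace split stands in for a tokenizer
		GeneratedTokens: generated,
		PrefillDuration: 10 * time.Millisecond,  // fixed placeholder timing
		DecodeDuration:  500 * time.Millisecond, // fixed placeholder timing
		PeakMemoryBytes: 1 << 20,
	}
}

func (f *fakeModel) Err() error       { return f.err }
func (f *fakeModel) Metrics() Metrics { return f.metrics }

func main() {
	var m TextModel = &fakeModel{}
	m.Generate("say hi", func(t Token) bool {
		fmt.Print(t.Text, " ")
		return true
	})
	fmt.Println()
	// Check Err first, then read the metrics for the same call.
	if err := m.Err(); err != nil {
		fmt.Println("generation failed:", err)
		return
	}
	stats := m.Metrics()
	fmt.Printf("prefill=%v decode=%v tok/s=%.1f\n",
		stats.PrefillDuration, stats.DecodeDuration, stats.DecodeTokensPerSec())
}
```

Keeping the metrics on the model, rather than returning them from `Generate`, lets the generation call stay a plain token stream. The trade-off in this sketch is that both `Err()` and `Metrics()` reflect only the most recent call, so callers should read them before starting another operation.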