go-mlx

Author	SHA1	Message	Date
Snider	ce1acef462	docs: batch inference API design (Phase 5). Two new TextModel methods: Classify (prefill-only, fast path for classification) and BatchGenerate (autoregressive, multi-prompt). Adds attention masking for padded batches. Primary consumer: go-i18n Phase 2a domain classification at ~5K sentences/sec. Co-Authored-By: Virgil <virgil@lethean.io>; Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-19 23:18:38 +00:00
Snider	443347a2f8	fix(metal): address 4 minor code review items. Rename New() → newArray() to signal internal-only intent (112 usages); remove unused Collect() function and its test; fix discarded json.Unmarshal error in qwen3.go; document AsStrided stride formula in gemma3.go. Co-Authored-By: Virgil <virgil@lethean.io>; Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-19 21:36:40 +00:00
Snider	97d9041455	docs(plan): fold Virgil review into design and implementation plan. Virgil review items integrated: context.Context on Generate/Chat (required for HTTP cancellation); Err() error on TextModel (distinguish EOS from OOM); Chat() on TextModel (model owns its chat template); memory control functions exposed at root package level; functional options convention confirmed; pkg/process confirmed, no changes needed for mlxlm. Co-Authored-By: Virgil <virgil@lethean.io>; Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-19 19:25:05 +00:00
Snider	28e2a07316	docs(plan): backend abstraction implementation plan (12 tasks). Detailed step-by-step plan for restructuring go-mlx: Tasks 1-8 (mechanical migration to internal/metal/); Task 9 (new Generate loop with iter.Seq[Token] streaming); Task 10 (deterministic memory cleanup, fixes leak); Tasks 11-12 (integration tests and doc updates). Critical checkpoint at Task 7: all 148 tests must pass. Co-Authored-By: Virgil <virgil@lethean.io>; Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-19 19:14:59 +00:00
Snider	c881813872	docs(design): backend abstraction with internal/metal reorganisation. Approved design for restructuring go-mlx: root package becomes clean interface (TextModel, LoadModel, Token); all CGO code moves to internal/metal/; deterministic memory management (Close + per-step cleanup); error propagation instead of silent logging; mlxlm/ backend placeholder for Python subprocess support. Includes API breaking change communication in FINDINGS.md and memory management research tasks in cpp/TODO.md. See: docs/plans/2026-02-19-backend-abstraction-design.md. Co-Authored-By: Virgil <virgil@lethean.io>; Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-19 19:12:04 +00:00

5 commits