go-mlx/internal/metal
Snider 18e8dca9f8 feat(metal): validate Gemma3-1B inference end-to-end (Phase 2)
- Fix model_type "gemma3_text" not matched in architecture dispatch
- Fix GPT-2 BPE false detection on large SentencePiece vocabs (Gemma3
  262K vocab contains Ġ but uses ▁ for spaces — check "Ġthe" not bare "Ġ")
- Add TestGemma3_1B_Inference: greedy decode, 46 tok/s, coherent output
- Add TestGemma3_1B_Chat: validates chat template formatting
- Add TestGemma3_1B_ContextCancel: validates ctx.Done() stops generation
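
The tokenizer fix in the list above can be illustrated with a minimal sketch. It is not the code in tokenizer.go: the names `tokenizerKind` and `detectKind`, the map-based vocabulary and the token ids are all hypothetical. It only shows why probing for the merged token "Ġthe" is a safer signal than probing for a bare "Ġ", which a large SentencePiece vocabulary may also contain.

```go
package main

import "fmt"

// tokenizerKind is a hypothetical label for the pre-tokenisation scheme.
type tokenizerKind string

const (
	kindGPT2BPE       tokenizerKind = "gpt2-bpe"
	kindSentencePiece tokenizerKind = "sentencepiece"
)

// detectKind guesses the scheme from the vocabulary alone (assumed heuristic).
// A bare "Ġ" entry is not conclusive, because a large SentencePiece vocab can
// contain that character as an ordinary token. "Ġthe" (space + "the" in
// GPT-2 byte-level BPE) is only expected when "Ġ" really encodes spaces.
func detectKind(vocab map[string]int) tokenizerKind {
	if _, ok := vocab["Ġthe"]; ok {
		return kindGPT2BPE
	}
	return kindSentencePiece
}

func main() {
	// Token ids are arbitrary placeholders, not real vocabulary ids.
	gemmaLike := map[string]int{"Ġ": 1, "▁": 2, "▁the": 3}
	gpt2Like := map[string]int{"Ġ": 1, "Ġthe": 2, "the": 3}

	fmt.Println(detectKind(gemmaLike)) // sentencepiece
	fmt.Println(detectKind(gpt2Like))  // gpt2-bpe
}
```

A check for the bare "Ġ" key would classify both vocabularies as GPT-2 BPE, which is the false detection the commit describes.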

4-bit quantised Gemma3-1B loads in ~700ms, generates at 46 tok/s on M3 Ultra.

Co-Authored-By: Virgil <virgil@lethean.io>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-19 21:44:28 +00:00
array.go fix(metal): address 4 minor code review items 2026-02-19 21:36:40 +00:00
array_test.go fix(metal): address 4 minor code review items 2026-02-19 21:36:40 +00:00
backend.go fix(metal): address 3 critical code review items 2026-02-19 21:24:10 +00:00
bench_test.go bench(metal): add 29 benchmarks baselined on M3 Ultra 2026-02-19 20:47:25 +00:00
cache.go refactor(metal): flatten model, tokenizer, sample, cache into internal/metal 2026-02-19 19:51:14 +00:00
cache_test.go refactor(metal): move all tests to internal/metal (148 tests passing) 2026-02-19 20:00:02 +00:00
close.go feat(metal): deterministic Close() and Array.Iter() 2026-02-19 21:09:39 +00:00
close_test.go feat(metal): deterministic Close() and Array.Iter() 2026-02-19 21:09:39 +00:00
compile.go fix(metal): address 5 important code review items 2026-02-19 21:31:45 +00:00
dtype.go refactor(metal): move dtype, array, metal, stream to internal/metal 2026-02-19 19:34:38 +00:00
error_test.go fix(metal): address 3 critical code review items 2026-02-19 21:24:10 +00:00
fast.go fix(metal): address 4 minor code review items 2026-02-19 21:36:40 +00:00
fast_test.go refactor(metal): move all tests to internal/metal (148 tests passing) 2026-02-19 20:00:02 +00:00
gemma3.go fix(metal): address 4 minor code review items 2026-02-19 21:36:40 +00:00
generate.go fix(metal): address 5 important code review items 2026-02-19 21:31:45 +00:00
grad.go fix(metal): address 4 minor code review items 2026-02-19 21:36:40 +00:00
grad_test.go refactor(metal): move all tests to internal/metal (148 tests passing) 2026-02-19 20:00:02 +00:00
io.go fix(metal): error handling audit — propagate MLX errors instead of swallowing 2026-02-19 20:59:37 +00:00
iter_test.go feat(metal): deterministic Close() and Array.Iter() 2026-02-19 21:09:39 +00:00
lora.go fix(metal): address 4 minor code review items 2026-02-19 21:36:40 +00:00
lora_test.go fix(metal): error handling audit — propagate MLX errors instead of swallowing 2026-02-19 20:59:37 +00:00
metal.go fix(metal): address 4 minor code review items 2026-02-19 21:36:40 +00:00
model.go feat(metal): validate Gemma3-1B inference end-to-end (Phase 2) 2026-02-19 21:44:28 +00:00
nn.go refactor(metal): move nn, io, grad, lora, optim to internal/metal 2026-02-19 19:39:58 +00:00
nn_test.go refactor(metal): move all tests to internal/metal (148 tests passing) 2026-02-19 20:00:02 +00:00
ops.go fix(metal): address 4 minor code review items 2026-02-19 21:36:40 +00:00
ops_test.go feat(metal): bind CumSum, implement TopP and MinP sampling 2026-02-19 20:39:44 +00:00
optim.go refactor(metal): move nn, io, grad, lora, optim to internal/metal 2026-02-19 19:39:58 +00:00
optim_test.go refactor(metal): move all tests to internal/metal (148 tests passing) 2026-02-19 20:00:02 +00:00
qwen3.go fix(metal): address 4 minor code review items 2026-02-19 21:36:40 +00:00
random.go fix(metal): address 4 minor code review items 2026-02-19 21:36:40 +00:00
sample.go feat(metal): bind CumSum, implement TopP and MinP sampling 2026-02-19 20:39:44 +00:00
sample_test.go fix(metal): address 5 important code review items 2026-02-19 21:31:45 +00:00
slice.go fix(metal): address 4 minor code review items 2026-02-19 21:36:40 +00:00
stream.go fix(metal): address 5 important code review items 2026-02-19 21:31:45 +00:00
tokenizer.go feat(metal): validate Gemma3-1B inference end-to-end (Phase 2) 2026-02-19 21:44:28 +00:00
tokenizer_test.go fix(metal): address 5 important code review items 2026-02-19 21:31:45 +00:00