go-mlx

metal	feat(metal): add Qwen2 model support (DeepSeek R1 validated)	2026-02-19 21:55:56 +00:00

Snider 535b04d5d6 feat(metal): add Qwen2 model support (DeepSeek R1 validated)

Qwen2 and Qwen3 share the same architecture — Qwen3 adds Q/K RMS
normalization which Qwen2 lacks. The loader auto-detects the variant
from weight presence and reports the correct ModelType().
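The detection described above can be sketched as a scan over weight names. This is a minimal illustration, not the loader's actual code: `detectModelType` is a hypothetical helper, and the `self_attn.q_norm.weight` / `self_attn.k_norm.weight` suffixes are an assumption about how the checkpoint names its Q/K norm tensors.

```go
package main

import "strings"

// detectModelType is a hypothetical helper (not taken from the repository).
// It reports "qwen3" when any Q/K norm weight is present among the
// checkpoint's tensor names, and "qwen2" otherwise.
// The name suffixes below are assumed, not confirmed by the commit.
func detectModelType(weightNames []string) string {
	for _, name := range weightNames {
		if strings.HasSuffix(name, "self_attn.q_norm.weight") ||
			strings.HasSuffix(name, "self_attn.k_norm.weight") {
			return "qwen3"
		}
	}
	return "qwen2"
}
```

Keying on weight presence rather than on a config field means a checkpoint is classified by what it actually contains.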

- Add "qwen2" to architecture dispatch in model.go
- Make Q/K norm optional in attention forward (nil-safe check)
- Store detected model type on Qwen3Model struct
- Add "qwen2" to chat template routing
- DeepSeek R1 7B (4-bit): 27 tok/s on M3 Ultra
- 2 new tests: inference + chat

Co-Authored-By: Virgil <virgil@lethean.io>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

2026-02-19 21:55:56 +00:00
