go/pkg/mlx
Claude 56c6e2fa8d feat: support quantized inference (4-bit) for Gemma 3
- Add QuantizedLinear with QuantizedMatmul for packed uint32 weights
- Add quantized Embedding with Dequantize before lookup
- Parse quantization config (group_size, bits) from config.json
- Detect .scales/.biases weight tensors and auto-select quantized path
- Add Dequantize op wrapping mlx_dequantize
- Add safety guard to KVCache.Update for malformed shapes
- Handle tied embeddings with quantization (AsLinear helper)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-16 05:53:52 +00:00
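
The commit message above mentions parsing `group_size` and `bits` from config.json and auto-selecting the quantized path when `.scales`/`.biases` tensors are present. A minimal sketch of that idea follows, using only the Go standard library. It is not the package's actual code: the top-level `quantization` key and the names `QuantConfig`, `ParseQuantConfig`, and `IsQuantized` are assumptions made for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// QuantConfig holds the two quantization parameters named in the commit.
// The JSON layout (a top-level "quantization" object) is an assumption.
type QuantConfig struct {
	GroupSize int `json:"group_size"`
	Bits      int `json:"bits"`
}

// modelConfig declares only the field this sketch needs; all other
// keys in config.json are ignored by encoding/json.
type modelConfig struct {
	Quantization *QuantConfig `json:"quantization"`
}

// ParseQuantConfig returns nil (and no error) when the config has no
// quantization block, so callers can fall back to the dense path.
func ParseQuantConfig(data []byte) (*QuantConfig, error) {
	var cfg modelConfig
	if err := json.Unmarshal(data, &cfg); err != nil {
		return nil, fmt.Errorf("parse config.json: %w", err)
	}
	return cfg.Quantization, nil
}

// IsQuantized reports whether the layer at prefix has both companion
// tensors (prefix.scales and prefix.biases) in the set of weight names.
func IsQuantized(names map[string]bool, prefix string) bool {
	return names[prefix+".scales"] && names[prefix+".biases"]
}

func main() {
	raw := []byte(`{"hidden_size": 1152, "quantization": {"group_size": 64, "bits": 4}}`)
	q, err := ParseQuantConfig(raw)
	if err != nil {
		panic(err)
	}

	// Hypothetical weight names, as they might appear in a safetensors file.
	names := map[string]bool{
		"layers.0.q_proj.weight": true,
		"layers.0.q_proj.scales": true,
		"layers.0.q_proj.biases": true,
		"layers.0.norm.weight":   true,
	}
	fmt.Println(q.GroupSize, q.Bits)
	fmt.Println(IsQuantized(names, "layers.0.q_proj"), IsQuantized(names, "layers.0.norm"))
}
```

Returning a nil config instead of an error for unquantized models keeps the dense path the default. Checking for the companion tensors per layer lets a loader mix quantized and unquantized layers in one model.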
cache feat: support quantized inference (4-bit) for Gemma 3 2026-02-16 05:53:52 +00:00
model feat: support quantized inference (4-bit) for Gemma 3 2026-02-16 05:53:52 +00:00
sample fix: remove unused vars in TopP sampler placeholder 2026-02-16 05:53:52 +00:00
tokenizer fix: handle both string and array merge formats in tokenizer 2026-02-16 05:53:52 +00:00
array.go fix: correct 20 mlx-c API mismatches for v0.4.1 2026-02-16 05:53:52 +00:00
CMakeLists.txt chore: target macOS 26.0, fix duplicate -lstdc++ linker warning 2026-02-16 05:53:52 +00:00
compile.go fix: correct 20 mlx-c API mismatches for v0.4.1 2026-02-16 05:53:52 +00:00
dtype.go feat: add native MLX backend for Apple Silicon inference (pkg/mlx) 2026-02-16 05:53:52 +00:00
fast.go fix: correct 20 mlx-c API mismatches for v0.4.1 2026-02-16 05:53:52 +00:00
io.go feat: add native MLX backend for Apple Silicon inference (pkg/mlx) 2026-02-16 05:53:52 +00:00
mlx.go chore: target macOS 26.0, fix duplicate -lstdc++ linker warning 2026-02-16 05:53:52 +00:00
mlx_stub.go feat: add native MLX backend for Apple Silicon inference (pkg/mlx) 2026-02-16 05:53:52 +00:00
nn.go feat: support quantized inference (4-bit) for Gemma 3 2026-02-16 05:53:52 +00:00
ops.go feat: support quantized inference (4-bit) for Gemma 3 2026-02-16 05:53:52 +00:00
random.go fix: correct 20 mlx-c API mismatches for v0.4.1 2026-02-16 05:53:52 +00:00
slice.go fix: correct 20 mlx-c API mismatches for v0.4.1 2026-02-16 05:53:52 +00:00
stream.go feat: add native MLX backend for Apple Silicon inference (pkg/mlx) 2026-02-16 05:53:52 +00:00