cache           2026-02-16 05:53:52 +00:00  feat: support quantized inference (4-bit) for Gemma 3
model           2026-02-16 05:53:52 +00:00  fix: correct SDPA mask mode and slice logits to last position
sample          2026-02-16 05:53:52 +00:00  fix: remove unused vars in TopP sampler placeholder
tokenizer       2026-02-16 05:53:52 +00:00  fix: handle both string and array merge formats in tokenizer
array.go        2026-02-16 05:53:52 +00:00  fix: correct 20 mlx-c API mismatches for v0.4.1
CMakeLists.txt  2026-02-16 05:53:52 +00:00  chore: target macOS 26.0, fix duplicate -lstdc++ linker warning
compile.go      2026-02-16 05:53:52 +00:00  fix: correct 20 mlx-c API mismatches for v0.4.1
dtype.go        2026-02-16 05:53:52 +00:00  feat: add native MLX backend for Apple Silicon inference (pkg/mlx)
fast.go         2026-02-16 05:53:52 +00:00  fix: correct SDPA mask mode and slice logits to last position
io.go           2026-02-16 05:53:52 +00:00  feat: add native MLX backend for Apple Silicon inference (pkg/mlx)
mlx.go          2026-02-16 05:53:52 +00:00  debug: add shape logging and stderr error handler for inference debugging
mlx_stub.go     2026-02-16 05:53:52 +00:00  feat: add native MLX backend for Apple Silicon inference (pkg/mlx)
nn.go           2026-02-16 05:53:52 +00:00  feat: support quantized inference (4-bit) for Gemma 3
ops.go          2026-02-16 05:53:52 +00:00  fix: use affine quantization mode and infer head_dim from weights
random.go       2026-02-16 05:53:52 +00:00  fix: correct 20 mlx-c API mismatches for v0.4.1
slice.go        2026-02-16 05:53:52 +00:00  fix: correct 20 mlx-c API mismatches for v0.4.1
stream.go       2026-02-16 05:53:52 +00:00  fix: add Metal cache management to prevent memory growth