cli/pkg/mlx
Claude 1d4ec55d05 fix: add GC-based memory management for MLX array handles
The Go GC cannot see Metal/C memory pressure, so intermediate arrays
from each forward pass accumulated without bound, causing OOM kills
after 3-4 requests. Fix: runtime.SetFinalizer on every Array releases
its C handle when the GC collects it, and runtime.GC() is forced every
4 tokens during generation. Also adds SetMemoryLimit(24GB) as a hard
Metal ceiling.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-16 05:53:52 +00:00
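
The finalizer-plus-periodic-GC approach described in the commit message above can be sketched roughly as follows. This is not the contents of array.go: the Array type here wraps a plain counter instead of a real mlx-c handle, and newArray, release, Free, generate and liveHandles are hypothetical names chosen for illustration. SetMemoryLimit is left out because its signature is not shown in this listing.

```go
package main

import (
	"fmt"
	"runtime"
	"sync/atomic"
)

// liveHandles counts native handles that are allocated but not yet released.
// It stands in for Metal/C memory, which the Go GC cannot observe.
var liveHandles int64

// nextID hands out fake handle values (assumption: real code would hold a C pointer).
var nextID int64

// Array is a hypothetical wrapper around a native array handle.
type Array struct {
	handle uintptr // zero once the handle has been released
}

// newArray allocates a fake handle and registers a finalizer so the handle
// is released once the Go wrapper becomes unreachable and is collected.
func newArray() *Array {
	a := &Array{handle: uintptr(atomic.AddInt64(&nextID, 1))}
	atomic.AddInt64(&liveHandles, 1)
	runtime.SetFinalizer(a, (*Array).release)
	return a
}

// release frees the native handle at most once.
func (a *Array) release() {
	if a.handle == 0 {
		return
	}
	a.handle = 0 // real code would call the C free function here
	atomic.AddInt64(&liveHandles, -1)
}

// Free releases the handle eagerly and clears the finalizer so it cannot run later.
func (a *Array) Free() {
	runtime.SetFinalizer(a, nil)
	a.release()
}

// generate simulates decoding n tokens and forces a collection every
// gcInterval tokens. It returns the number of forced collections.
func generate(n, gcInterval int) int {
	forced := 0
	for tok := 1; tok <= n; tok++ {
		_ = newArray() // intermediate array from one forward pass, garbage right away
		if tok%gcInterval == 0 {
			runtime.GC() // gives pending finalizers a chance to be scheduled
			forced++
		}
	}
	return forced
}

func main() {
	forced := generate(64, 4)
	fmt.Println("forced collections:", forced)
	// Finalizers run asynchronously, so this count is not deterministic.
	fmt.Println("live handles after generation:", atomic.LoadInt64(&liveHandles))
}
```

Finalizers run on a separate goroutine at some point after a collection, and Go does not guarantee they run before the program exits, so a sketch like this bounds growth rather than giving exact release timing. An explicit Free for arrays with a known lifetime avoids relying on the collector at all.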
cache feat: support quantized inference (4-bit) for Gemma 3 2026-02-16 05:53:52 +00:00
model fix: correct SDPA mask mode and slice logits to last position 2026-02-16 05:53:52 +00:00
sample fix: remove unused vars in TopP sampler placeholder 2026-02-16 05:53:52 +00:00
tokenizer fix: handle both string and array merge formats in tokenizer 2026-02-16 05:53:52 +00:00
array.go fix: add GC-based memory management for MLX array handles 2026-02-16 05:53:52 +00:00
CMakeLists.txt chore: target macOS 26.0, fix duplicate -lstdc++ linker warning 2026-02-16 05:53:52 +00:00
compile.go fix: correct 20 mlx-c API mismatches for v0.4.1 2026-02-16 05:53:52 +00:00
dtype.go feat: add native MLX backend for Apple Silicon inference (pkg/mlx) 2026-02-16 05:53:52 +00:00
fast.go fix: correct SDPA mask mode and slice logits to last position 2026-02-16 05:53:52 +00:00
io.go feat: add native MLX backend for Apple Silicon inference (pkg/mlx) 2026-02-16 05:53:52 +00:00
mlx.go debug: add shape logging and stderr error handler for inference debugging 2026-02-16 05:53:52 +00:00
mlx_stub.go feat: add native MLX backend for Apple Silicon inference (pkg/mlx) 2026-02-16 05:53:52 +00:00
nn.go feat: support quantized inference (4-bit) for Gemma 3 2026-02-16 05:53:52 +00:00
ops.go fix: use affine quantization mode and infer head_dim from weights 2026-02-16 05:53:52 +00:00
random.go fix: correct 20 mlx-c API mismatches for v0.4.1 2026-02-16 05:53:52 +00:00
slice.go fix: correct 20 mlx-c API mismatches for v0.4.1 2026-02-16 05:53:52 +00:00
stream.go fix: add Metal cache management to prevent memory growth 2026-02-16 05:53:52 +00:00