go/pkg/mlx
Claude 6b603ee20b fix: remove Go-side array ref tracking, rely on MLX-C refcounting
The Go wrapper was tracking inter-array references via desc.inputs,
creating chains that kept all intermediate arrays alive across requests.
After 3-4 requests, Metal memory grew past 170 GB and macOS killed the
process.

Fix: remove desc.inputs/numRefs entirely. MLX-C has its own internal
reference counting: when Go GC finalizes an Array wrapper, it calls
mlx_array_free, which decrements the C-side refcount. If the C-side
count reaches 0, Metal memory is freed. Together, Go GC and MLX-C
refcounting handle array lifecycle without any Go-side tracking.
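
A minimal sketch of the pattern described above, written in pure Go so it
runs without cgo. It is an illustration, not the code in array.go: the
buffer type stands in for the C-side array and its refcount, and the names
buffer, newArray and Free are hypothetical. The real wrapper would call
mlx_array_free where this sketch calls release.

```go
package main

import (
	"fmt"
	"runtime"
	"sync/atomic"
)

// buffer is a stand-in for the C-side array (hypothetical, not mlx-c).
// It owns the reference count; the Go wrapper never tracks other arrays.
type buffer struct {
	refs  atomic.Int32
	freed atomic.Bool // set when the count reaches zero
}

func (b *buffer) retain() { b.refs.Add(1) }

// release drops one reference and "frees" the buffer at zero,
// which is the role mlx_array_free plays in the commit message.
func (b *buffer) release() {
	if b.refs.Add(-1) == 0 {
		b.freed.Store(true)
	}
}

// Array is the Go-side wrapper: one handle, no input list, no numRefs.
type Array struct {
	buf *buffer
}

// newArray takes one reference and registers a finalizer, so the
// reference is dropped when the GC collects the wrapper.
func newArray(b *buffer) *Array {
	b.retain()
	a := &Array{buf: b}
	runtime.SetFinalizer(a, func(a *Array) { a.Free() })
	return a
}

// Free drops the wrapper's reference. It is safe to call more than once.
func (a *Array) Free() {
	if a.buf == nil {
		return
	}
	runtime.SetFinalizer(a, nil) // explicit free: finalizer no longer needed
	a.buf.release()
	a.buf = nil
}

func main() {
	b := &buffer{}
	x := newArray(b)
	y := newArray(b) // second wrapper sharing the same buffer

	x.Free()
	fmt.Println("freed after first Free:", b.freed.Load())

	y.Free()
	fmt.Println("freed after second Free:", b.freed.Load())
}
```

The explicit Free calls make the demonstration deterministic. In the
GC-driven path the finalizer performs the same release, but the Go runtime
does not guarantee when (or whether) a finalizer runs.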

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-16 05:53:52 +00:00
cache feat: support quantized inference (4-bit) for Gemma 3 2026-02-16 05:53:52 +00:00
model fix: correct SDPA mask mode and slice logits to last position 2026-02-16 05:53:52 +00:00
sample fix: remove unused vars in TopP sampler placeholder 2026-02-16 05:53:52 +00:00
tokenizer fix: handle both string and array merge formats in tokenizer 2026-02-16 05:53:52 +00:00
array.go fix: remove Go-side array ref tracking, rely on MLX-C refcounting 2026-02-16 05:53:52 +00:00
CMakeLists.txt chore: target macOS 26.0, fix duplicate -lstdc++ linker warning 2026-02-16 05:53:52 +00:00
compile.go fix: correct 20 mlx-c API mismatches for v0.4.1 2026-02-16 05:53:52 +00:00
dtype.go feat: add native MLX backend for Apple Silicon inference (pkg/mlx) 2026-02-16 05:53:52 +00:00
fast.go fix: correct SDPA mask mode and slice logits to last position 2026-02-16 05:53:52 +00:00
io.go fix: remove Go-side array ref tracking, rely on MLX-C refcounting 2026-02-16 05:53:52 +00:00
mlx.go debug: add shape logging and stderr error handler for inference debugging 2026-02-16 05:53:52 +00:00
mlx_stub.go feat: add native MLX backend for Apple Silicon inference (pkg/mlx) 2026-02-16 05:53:52 +00:00
nn.go feat: support quantized inference (4-bit) for Gemma 3 2026-02-16 05:53:52 +00:00
ops.go fix: use affine quantization mode and infer head_dim from weights 2026-02-16 05:53:52 +00:00
random.go fix: correct 20 mlx-c API mismatches for v0.4.1 2026-02-16 05:53:52 +00:00
slice.go fix: correct 20 mlx-c API mismatches for v0.4.1 2026-02-16 05:53:52 +00:00
stream.go fix: add Metal cache management to prevent memory growth 2026-02-16 05:53:52 +00:00