go-mlx

core/go-mlx

Commit graph

Author

SHA1

Message

Date

Snider

71fe4bb5ac

fix: add Detach/Free calls to reduce Metal GPU memory retention

Add deterministic memory cleanup across inference paths:
- Detach logits after Eval to release graph references
- Free intermediate arrays in attention (gemma3, qwen3)
- Add cache Detach helper for KV cache cleanup after generation
- New detach.cpp/go CGO bindings for mlx_array_detach

Reduces 4B model memory from 78GB to ~17GB (vs 2.4GB mlx-lm baseline).
Native Metal memory management still trails Python refcounting but is
now viable for 1B models.

Co-Authored-By: Virgil <virgil@lethean.io>

2026-02-26 05:14:09 +00:00

Snider

2292557fd6

chore: vendor MLX C headers for Go module consumers

dist/include/ contains the MLX and MLX-C headers needed for CGo
compilation. Without these, go-mlx cannot be used as a module
dependency (headers not found in module cache).

Libraries (dylib/metallib) are still gitignored — users build
those locally via cmake.

Co-Authored-By: Virgil <virgil@lethean.io>

2026-02-21 19:14:04 +00:00

2 commits