go-mlx

Native Apple Metal GPU inference via mlx-c bindings for Go.

Overview

go-mlx provides Go bindings (via cgo) to Apple's MLX framework through the mlx-c C API. It runs LLM inference on Apple Silicon GPUs (M1 through M4) using Metal compute shaders.

Platform: darwin/arm64 only. A stub provides MetalAvailable() bool on other platforms.
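
Callers that want to report the platform constraint before touching the bindings can check it with the standard `runtime` package. The following is a minimal sketch that does not use go-mlx at all; the helper name `supported` is hypothetical and only encodes the darwin/arm64 requirement stated above.

```go
package main

import (
	"fmt"
	"runtime"
)

// supported is a hypothetical helper (not part of go-mlx). It reports whether
// an OS/architecture pair matches the only platform listed for go-mlx:
// darwin/arm64.
func supported(goos, goarch string) bool {
	return goos == "darwin" && goarch == "arm64"
}

func main() {
	if supported(runtime.GOOS, runtime.GOARCH) {
		fmt.Println("darwin/arm64: the Metal-backed bindings apply")
	} else {
		fmt.Printf("%s/%s: only the stub is available\n", runtime.GOOS, runtime.GOARCH)
	}
}
```

In real code, `mlx.MetalAvailable()` remains the check to rely on, since it is the function the module itself exposes on every platform.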

Module: forge.lthn.ai/core/go-mlx

Packages

| Package | Purpose |
|---------|---------|
| Root (`mlx`) | Core bindings: Array, ops, materialisation, Metal init |
| `model/` | LLM implementations: Gemma3, Qwen3 |
| `tokenizer/` | BPE tokenizer (sentencepiece format) |
| `sample/` | Sampling strategies (temperature, top-k, top-p) |
| `cache/` | KV cache for autoregressive inference |

Quick Start

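import (
    "log"

    mlx "forge.lthn.ai/core/go-mlx"
    "forge.lthn.ai/core/go-mlx/model"
)

// prompt and maxTokens are defined by the caller.
if mlx.MetalAvailable() {
    m, err := model.LoadModel("/path/to/safetensors/")
    if err != nil {
        log.Fatal(err) // surface load failures instead of discarding them
    }
    tokens := m.Generate(prompt, maxTokens)
    _ = tokens // use the generated tokens here
}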
