**macOS on Apple Silicon only.** All CGO source files carry `//go:build darwin && arm64`. The package will not build for native Metal inference on any other platform; a stub (`mlx_stub.go`) provides `MetalAvailable() bool` returning false elsewhere.
### Required Tools
| Tool | Version | Purpose |
|------|---------|---------|
| Go | 1.25.5+ | Module toolchain |
| CMake | 3.24+ | Builds mlx-c from source |
| AppleClang | 17.0+ | C/C++ compiler for mlx-c |
| macOS SDK | 26.2+ | Metal framework headers |
| Xcode Command Line Tools | Current | Provides `xcrun`, frameworks |
Install CMake if absent:
```bash
brew install cmake
```
### Go Workspace
go-mlx participates in a Go workspace alongside go-inference. The `go.mod` uses a `replace` directive for local development:
CMake fetches mlx-c v0.4.1 from GitHub, builds it with:
-`MLX_BUILD_SAFETENSORS=ON` (model loading)
-`MLX_BUILD_GGUF=OFF`
-`BUILD_SHARED_LIBS=ON`
- macOS deployment target: 13.3 (minimum required by MLX)
The built library installs to `dist/include/` and `dist/lib/`. Build time is approximately 2 minutes on M3 Ultra.
The `dist/` directory is gitignored and must be rebuilt on each fresh checkout.
### Step 2: Run Tests
```bash
go test ./...
```
Tests require a working mlx-c build. Integration tests that load model files are skipped automatically when model paths are absent (`/Volumes/Data/lem/safetensors/...`).
---
## CGO Flags
The `#cgo` directives in `internal/metal/metal.go` set all required flags automatically when building on darwin/arm64:
All 180+ tests in `internal/metal/` are unit or integration tests that exercise the CGO layer directly. The 11 tests in the root package (`mlx_test.go`) exercise the public API via go-inference.
### Running a Single Test
```bash
go test -run TestRMSNorm_Good ./internal/metal/
```
### Running with Race Detector
```bash
go test -race ./...
```
---
## Benchmarks
29 benchmarks in `internal/metal/bench_test.go`. Run with:
```bash
go test -bench=. -benchtime=2s ./internal/metal/
```
Key benchmarks:
| Benchmark group | What it measures |
|----------------|-----------------|
| `BenchmarkMatmul_*` | Matrix multiply at 128² through 4096², plus token projection |
| `BenchmarkSoftmax_*` | Softmax at 1K through 128K vocab |
| `BenchmarkElementWise_*` | Add, Mul, SiLU at 1M elements |
| `BenchmarkRMSNorm_*` | Fused RMSNorm at decode and prefill shapes |
| `BenchmarkRoPE_*` | RoPE at single-token and 512-token shapes |
The `mlxlm/` package has no CGO dependency and tests run on any platform where Python 3 is available. Tests use `testdata/mock_bridge.py` instead of the real `bridge.py`, so no `mlx-lm` installation is required.
Run mlxlm tests:
```bash
go test ./mlxlm/
```
The mock bridge responds to all commands with fixed fake data, enabling full subprocess protocol testing without GPU or Python ML dependencies.
To opt out of building the mlxlm backend:
```bash
go build -tags nomlxlm ./...
```
---
## Dependency Graph
```
go-mlx
├── forge.lthn.ai/core/go-inference (shared interfaces, zero dependencies)
└── mlx-c v0.4.1 (CMake, fetched from GitHub at generate time)
└── Apple MLX (Metal GPU compute)
└── Foundation, Metal, Accelerate frameworks
```
The root package and `mlxlm/` have no CGO dependency. Only `internal/metal/` links against mlx-c.