go-mlx/docs/development.md
Snider 2edb45e2c5
chore: set macOS deployment target to 26.0
Co-Authored-By: Virgil <virgil@lethean.io>
2026-02-26 05:38:53 +00:00

# Development Guide
Module: `forge.lthn.ai/core/go-mlx`

---
## Prerequisites
### Platform
**macOS on Apple Silicon only.** All CGO source files carry `//go:build darwin && arm64`. Native Metal inference is not built on any other platform; elsewhere, a stub (`mlx_stub.go`) provides `MetalAvailable() bool`, which returns false.
### Required Tools
| Tool | Version | Purpose |
|------|---------|---------|
| Go | 1.25.5+ | Module toolchain |
| CMake | 3.24+ | Builds mlx-c from source |
| AppleClang | 17.0+ | C/C++ compiler for mlx-c |
| macOS SDK | 26.2+ | Metal framework headers |
| Xcode Command Line Tools | Current | Provides `xcrun`, frameworks |

Install CMake if absent:
```bash
brew install cmake
```
### Go Workspace
go-mlx participates in a Go workspace alongside go-inference. The `go.mod` uses a `replace` directive for local development:
```
replace forge.lthn.ai/core/go-inference => ../go-inference
```
After adding modules or changing dependencies, run `go work sync`.

---
## Build
### Step 1: Build mlx-c
Run from the module root:
```bash
go generate ./...
```
This executes the `//go:generate` directives in `mlx.go`:
```
cmake -S . -B build -DCMAKE_INSTALL_PREFIX=dist -DCMAKE_BUILD_TYPE=Release
cmake --build build --parallel
cmake --install build
```
CMake fetches mlx-c v0.4.1 from GitHub and builds it with the following settings:
- `MLX_BUILD_SAFETENSORS=ON` (model loading)
- `MLX_BUILD_GGUF=OFF`
- `BUILD_SHARED_LIBS=ON`
- macOS deployment target: 13.3 (minimum required by MLX)
The built library installs to `dist/include/` and `dist/lib/`. Build time is approximately 2 minutes on an M3 Ultra.
The `dist/` directory is gitignored and must be rebuilt on each fresh checkout.
### Step 2: Run Tests
```bash
go test ./...
```
Tests require a working mlx-c build. Integration tests that load model files are skipped automatically when the expected model paths (under `/Volumes/Data/lem/safetensors/...`) are absent.

---
## CGO Flags
The `#cgo` directives in `internal/metal/metal.go` set all required flags automatically when building on darwin/arm64:
```c
#cgo CXXFLAGS: -std=c++17
#cgo CFLAGS: -mmacosx-version-min=26.0
#cgo CPPFLAGS: -I${SRCDIR}/../../dist/include
#cgo LDFLAGS: -L${SRCDIR}/../../dist/lib -lmlxc -lmlx
#cgo darwin LDFLAGS: -framework Foundation -framework Metal -framework Accelerate
#cgo darwin LDFLAGS: -Wl,-rpath,${SRCDIR}/../../dist/lib
```
`${SRCDIR}` is the directory containing `metal.go` at build time (`internal/metal/`), so the `../../dist/` path resolves to the module root `dist/`.
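To make the path arithmetic concrete, the sketch below collapses the relative `../../dist/` segments the same way the paths in the `#cgo` directives resolve. The checkout location `/src/go-mlx` is a made-up example, and `resolveDist` is an illustrative helper, not part of the module.

```go
package main

import (
	"fmt"
	"path"
)

// resolveDist mimics how ${SRCDIR}/../../dist/<sub> collapses to a path
// under the module root. srcDir stands in for the value cgo substitutes
// for ${SRCDIR}; the helper itself is hypothetical.
func resolveDist(srcDir, sub string) string {
	return path.Join(srcDir, "..", "..", "dist", sub)
}

func main() {
	srcDir := "/src/go-mlx/internal/metal" // assumed checkout location
	fmt.Println(resolveDist(srcDir, "include"))
	fmt.Println(resolveDist(srcDir, "lib"))
}
```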
No manual environment variables are needed for `go build` or `go test`.

---
## Test Patterns
Tests use the `_Good`, `_Bad`, `_Ugly` suffix convention:
| Suffix | Meaning |
|--------|---------|
| `_Good` | Happy path; expected to succeed |
| `_Bad` | Expected error conditions |
| `_Ugly` | Panic / edge cases |

Example:
```go
func TestMatmul_Good(t *testing.T) { ... }
func TestMatmul_Bad(t *testing.T) { ... }
```
Tests that require model files on disk use `t.Skip()` when the path is absent:
```go
const modelPath = "/Volumes/Data/lem/safetensors/gemma-3/"
if _, err := os.Stat(modelPath); err != nil {
	t.Skip("model not available:", modelPath)
}
```
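When several tests need the same guard, the check can be factored into a helper. The following is a minimal sketch, assuming hypothetical names `modelAvailable` and `requireModel` that do not appear in the repository; it applies the same `os.Stat` test as the snippet above.

```go
package main

import (
	"fmt"
	"os"
	"testing"
)

// modelAvailable reports whether the given path exists on disk.
// Hypothetical helper; it mirrors the os.Stat check used in the guide.
func modelAvailable(path string) bool {
	_, err := os.Stat(path)
	return err == nil
}

// requireModel skips the calling test or benchmark when the model path is
// absent. Accepting testing.TB lets tests and benchmarks share it.
func requireModel(tb testing.TB, path string) {
	tb.Helper()
	if !modelAvailable(path) {
		tb.Skipf("model not available: %s", path)
	}
}

func main() {
	// Outside a test binary, only the availability check can be exercised.
	const modelPath = "/Volumes/Data/lem/safetensors/gemma-3/"
	fmt.Println("model available:", modelAvailable(modelPath))
}
```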
All 180+ tests in `internal/metal/` are unit or integration tests that exercise the CGO layer directly. The 11 tests in the root package (`mlx_test.go`) exercise the public API via go-inference.
### Running a Single Test
```bash
go test -run TestRMSNorm_Good ./internal/metal/
```
### Running with Race Detector
```bash
go test -race ./...
```
---
## Benchmarks
`internal/metal/bench_test.go` contains 29 benchmarks. Run them with:
```bash
go test -bench=. -benchtime=2s ./internal/metal/
```
Key benchmarks:
| Benchmark group | What it measures |
|----------------|-----------------|
| `BenchmarkMatmul_*` | Matrix multiply at 128² through 4096², plus token projection |
| `BenchmarkSoftmax_*` | Softmax at 1K through 128K vocab |
| `BenchmarkElementWise_*` | Add, Mul, SiLU at 1M elements |
| `BenchmarkRMSNorm_*` | Fused RMSNorm at decode and prefill shapes |
| `BenchmarkRoPE_*` | RoPE at single-token and 512-token shapes |
| `BenchmarkSDPA_*` | Scaled dot-product attention at 1, 32, 512 sequence lengths |
| `BenchmarkLinear_*` | Linear layer forward at decode and prefill shapes |
| `BenchmarkSampler_*` | Greedy, TopK, TopP, and full chain on 32K vocab |

Model-level benchmarks (`model.Forward`, tokenizer) require model files on disk and are not included in the automated suite.

---
## Code Structure
### Adding a New Operation
1. Add the C binding to the appropriate file in `internal/metal/`:
   - `ops.go`: element-wise, reduction, matrix, shape operations
   - `fast.go`: fused Metal kernel wrappers
   - `slice.go`: slicing and scatter operations
2. Follow the `newArray("OP_NAME", inputs...)` pattern for tracking
3. Add tests in the corresponding `_test.go` file using `_Good`/`_Bad` suffixes
4. Add a benchmark in `bench_test.go` for any operation on the hot path
### Adding a New Model Architecture
1. In `loadModel` (`model.go`), add a case for the new architecture's `model_type` value as read from `config.json`
2. Create `architecture.go` in `internal/metal/` implementing `InternalModel`
3. Add `ApplyLoRA` to the new model
4. Add a `close*` helper in `close.go` for deterministic resource cleanup
5. Add `formatXyzChat` in `generate.go` for the chat template
6. Add tokeniser BOS/EOS detection in `tokenizer.go`:`LoadTokenizer`
7. Write tests: config parsing, missing weights, end-to-end inference
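The dispatch in step 1 might look roughly like the sketch below. It is not the repository's `loadModel`: `modelConfig`, `architectureFor`, and the `model_type` values `"gemma3"` and `"xyz"` are illustrative assumptions, and the real function would return a model implementing `InternalModel` rather than a string.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// modelConfig captures only the config.json field needed for dispatch.
type modelConfig struct {
	ModelType string `json:"model_type"`
}

// architectureFor parses raw config.json bytes and returns the name of the
// architecture that would handle them. Hypothetical stand-in for the switch
// inside loadModel.
func architectureFor(raw []byte) (string, error) {
	var cfg modelConfig
	if err := json.Unmarshal(raw, &cfg); err != nil {
		return "", fmt.Errorf("parse config.json: %w", err)
	}
	switch cfg.ModelType {
	case "":
		return "", errors.New("config.json: model_type is missing")
	case "gemma3": // assumed value for an existing architecture
		return "gemma3", nil
	case "xyz": // the new architecture gets its own case
		return "xyz", nil
	default:
		return "", fmt.Errorf("unsupported model_type %q", cfg.ModelType)
	}
}

func main() {
	arch, err := architectureFor([]byte(`{"model_type": "xyz"}`))
	fmt.Println(arch, err)
}
```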
---
## Coding Standards
### Language
UK English throughout: colour, organisation, centre, initialise, behaviour. Never American spellings.
### Go Style
- `declare(strict_types=1)` equivalent: all parameters and return types must be explicitly typed
- PSR-12 equivalent: `gofmt` + `goimports`; run before committing
- `go test ./...` must pass before every commit; no red tests in main
### Licence Header
Every new source file must carry the EUPL-1.2 licence identifier:
```go
// SPDX-Licence-Identifier: EUPL-1.2
```
### Conventional Commits
Format: `type(scope): description`
Types:
- `feat` — new capability
- `fix` — bug fix
- `test` — test additions or changes
- `bench` — benchmark additions or changes
- `refactor` — code restructuring without behaviour change
- `docs` — documentation only
- `chore` — maintenance (gitignore, go.mod, CMake)
Scopes: `metal`, `api`, `mlxlm`, `cpp`, `docs`
Examples:
```
feat(metal): add TopP nucleus sampling
fix(metal): auto-contiguous data access for non-contiguous arrays
test(metal): add model loading robustness tests
bench(metal): add 29 benchmarks baselined on M3 Ultra
```
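A commit header can be checked mechanically against the types and scopes listed above. The sketch below is one possible validator, not a tool shipped with the repository; `validHeader` is a hypothetical name, and treating the scope as optional is an assumption that can be tightened if the project requires a scope on every commit.

```go
package main

import (
	"fmt"
	"regexp"
)

// commitHeader matches "type(scope): description" using the types and
// scopes from this guide. The scope group is optional (an assumption).
var commitHeader = regexp.MustCompile(
	`^(feat|fix|test|bench|refactor|docs|chore)(\((metal|api|mlxlm|cpp|docs)\))?: \S.*$`)

// validHeader reports whether the first line of a commit message
// follows the convention.
func validHeader(line string) bool {
	return commitHeader.MatchString(line)
}

func main() {
	for _, h := range []string{
		"feat(metal): add TopP nucleus sampling",
		"fix(gpu): unknown scope",
		"feature(metal): unknown type",
	} {
		fmt.Printf("%-45q %v\n", h, validHeader(h))
	}
}
```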
### Co-Author
All commits must include:
```
Co-Authored-By: Virgil <virgil@lethean.io>
```
### Build Tags
- All CGO files: `//go:build darwin && arm64`
- Stub file: `//go:build !darwin || !arm64`
- mlxlm opt-out: `//go:build !nomlxlm`
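The CGO and stub constraints are logical complements, so exactly one of the two file sets is selected for any target. The sketch below checks this with the standard `go/build/constraint` package. It is a simplification: only `GOOS` and `GOARCH` are treated as satisfied tags, whereas a real build also considers other tags, and `matches` is an illustrative helper.

```go
package main

import (
	"fmt"
	"go/build/constraint"
)

// matches reports whether a //go:build line is satisfied when the only
// active tags are the given GOOS and GOARCH values.
func matches(line, goos, goarch string) (bool, error) {
	expr, err := constraint.Parse(line)
	if err != nil {
		return false, err
	}
	ok := expr.Eval(func(tag string) bool {
		return tag == goos || tag == goarch
	})
	return ok, nil
}

func main() {
	targets := [][2]string{
		{"darwin", "arm64"}, {"darwin", "amd64"},
		{"linux", "arm64"}, {"linux", "amd64"},
	}
	for _, tgt := range targets {
		cgo, err := matches("//go:build darwin && arm64", tgt[0], tgt[1])
		if err != nil {
			panic(err)
		}
		stub, err := matches("//go:build !darwin || !arm64", tgt[0], tgt[1])
		if err != nil {
			panic(err)
		}
		fmt.Printf("%s/%s cgo=%v stub=%v\n", tgt[0], tgt[1], cgo, stub)
	}
}
```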
---
## CMake Configuration
`CMakeLists.txt` lives at the module root. Key settings:
```cmake
set(MLX_BUILD_SAFETENSORS ON) # Required for model loading
set(MLX_BUILD_GGUF OFF) # GGUF not supported
set(BUILD_SHARED_LIBS ON) # Shared .dylib for rpath loading
set(CMAKE_OSX_DEPLOYMENT_TARGET 13.3) # MLX minimum
```
To force a clean rebuild:
```bash
rm -rf build dist
go generate ./...
```
---
## mlxlm Backend Development
The `mlxlm/` package has no CGO dependency, and its tests run on any platform where Python 3 is available. Tests use `testdata/mock_bridge.py` instead of the real `bridge.py`, so no `mlx-lm` installation is required.
Run mlxlm tests:
```bash
go test ./mlxlm/
```
The mock bridge responds to all commands with fixed fake data, enabling full subprocess protocol testing without GPU or Python ML dependencies.
To opt out of building the mlxlm backend:
```bash
go build -tags nomlxlm ./...
```
---
## Dependency Graph
```
go-mlx
├── forge.lthn.ai/core/go-inference (shared interfaces, zero dependencies)
└── mlx-c v0.4.1 (CMake, fetched from GitHub at generate time)
    └── Apple MLX (Metal GPU compute)
        └── Foundation, Metal, Accelerate frameworks
```
The root package and `mlxlm/` have no CGO dependency. Only `internal/metal/` links against mlx-c.