Models
Virgil edited this page 2026-02-19 17:58:33 +00:00

Model Support

Supported Architectures

| Model | File | Parameters | Notes |
| --- | --- | --- | --- |
| Gemma3 | model/gemma3.go | 1B, 4B, 27B | Google's open model family |
| Qwen3 | model/qwen3.go | 8B+ | Alibaba's open model family |

Model Interface

All models implement:

type Model interface {
    Forward(x *mlx.Array, cache *cache.KVCache) *mlx.Array
}

model.LoadModel(path) auto-detects the architecture from the config file and returns the appropriate implementation.
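
As a rough illustration of what such auto-detection could look like, the sketch below parses a JSON config and maps one field to a supported architecture. The model_type key (a common Hugging Face convention), the detectArchitecture helper, and the supported map are assumptions made for this example; they are not taken from model/model.go.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// modelConfig holds the single field this sketch needs.
// Assumption: the config is JSON and carries a "model_type" key.
type modelConfig struct {
	ModelType string `json:"model_type"`
}

// supported maps an architecture name to the source file given in the
// Supported Architectures section. The keys are hypothetical.
var supported = map[string]string{
	"gemma3": "model/gemma3.go",
	"qwen3":  "model/qwen3.go",
}

// detectArchitecture parses raw config JSON and returns the normalized
// architecture name, or an error if it is missing or unsupported.
func detectArchitecture(raw []byte) (string, error) {
	var cfg modelConfig
	if err := json.Unmarshal(raw, &cfg); err != nil {
		return "", fmt.Errorf("parse config: %w", err)
	}
	name := strings.ToLower(strings.TrimSpace(cfg.ModelType))
	if name == "" {
		return "", fmt.Errorf("config has no model_type")
	}
	if _, ok := supported[name]; !ok {
		return "", fmt.Errorf("unsupported architecture %q", name)
	}
	return name, nil
}

func main() {
	arch, err := detectArchitecture([]byte(`{"model_type": "qwen3"}`))
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(arch, "->", supported[arch])
}
```

Returning an error for unknown architectures, rather than falling back to a default, keeps a mismatched checkpoint from being loaded with the wrong implementation.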

Generation

m, err := model.LoadModel("/path/to/model/")
if err != nil {
    log.Fatal(err)
}

// Token-by-token generation with sampling
tokens := m.Generate(promptTokens, model.GenerateOptions{
    MaxTokens:   512,
    Temperature: 0.7,
    TopK:        40,
})
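
The Temperature and TopK options suggest temperature-scaled top-k sampling. The sketch below shows that technique on a plain []float64 of logits. It is not the library's sampler: sampleTopK, its signature, and the injected random source are hypothetical, chosen so the example runs without MLX.

```go
package main

import (
	"fmt"
	"math"
	"math/rand"
	"sort"
)

// sampleTopK returns a token index drawn from the k highest logits after
// temperature scaling. rnd must return a value in [0, 1).
// Hypothetical helper for illustration only.
func sampleTopK(logits []float64, temperature float64, k int, rnd func() float64) int {
	if len(logits) == 0 {
		return -1
	}
	if k <= 0 || k > len(logits) {
		k = len(logits)
	}

	// Sort token indices by logit, highest first, and keep the top k.
	idx := make([]int, len(logits))
	for i := range idx {
		idx[i] = i
	}
	sort.SliceStable(idx, func(a, b int) bool { return logits[idx[a]] > logits[idx[b]] })
	top := idx[:k]

	// A temperature of zero or below degenerates to greedy decoding.
	if temperature <= 0 {
		return top[0]
	}

	// Unnormalized softmax over the kept logits, shifted by the maximum
	// for numerical stability.
	maxLogit := logits[top[0]]
	weights := make([]float64, k)
	sum := 0.0
	for i, t := range top {
		weights[i] = math.Exp((logits[t] - maxLogit) / temperature)
		sum += weights[i]
	}

	// Draw from the resulting categorical distribution.
	r := rnd() * sum
	for i, w := range weights {
		r -= w
		if r < 0 {
			return top[i]
		}
	}
	return top[k-1] // guards against floating-point round-off
}

func main() {
	logits := []float64{0.1, 2.5, -1.0, 1.7}
	fmt.Println(sampleTopK(logits, 0.7, 2, rand.Float64))
}
```

Lower temperatures concentrate probability on the best-scoring tokens, while a smaller k removes unlikely tokens from consideration entirely.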

Adding New Models

  1. Create model/{name}.go and start it with the build constraint //go:build darwin && arm64
  2. Implement the Model interface
  3. Register architecture detection in model/model.go
  4. Provide the model weights in safetensors format
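
Steps 2 and 3 can be sketched with a small constructor registry. Everything here is a stand-in: Array and KVCache replace *mlx.Array and *cache.KVCache so the snippet compiles on its own, and register, lookup, and the "mymodel" architecture are hypothetical names rather than the contents of model/model.go. A real model file would also carry the build constraint from step 1, which is left out so the sketch runs on any platform.

```go
package main

import "fmt"

// Stand-ins for *mlx.Array and *cache.KVCache (assumptions for this sketch).
type Array struct{ Data []float32 }
type KVCache struct{ Steps int }

// Model mirrors the interface from the Model Interface section.
type Model interface {
	Forward(x *Array, cache *KVCache) *Array
}

// constructor builds a Model from a model directory.
type constructor func(path string) (Model, error)

// registry maps an architecture name to its constructor.
var registry = map[string]constructor{}

// register adds an architecture and rejects duplicate names.
func register(arch string, c constructor) error {
	if _, dup := registry[arch]; dup {
		return fmt.Errorf("architecture %q already registered", arch)
	}
	registry[arch] = c
	return nil
}

// lookup builds the model registered under arch.
func lookup(arch, path string) (Model, error) {
	c, ok := registry[arch]
	if !ok {
		return nil, fmt.Errorf("unsupported architecture %q", arch)
	}
	return c(path)
}

// myModel is a placeholder architecture. Its Forward returns the input
// unchanged and counts the call in the cache.
type myModel struct{ path string }

func (m *myModel) Forward(x *Array, cache *KVCache) *Array {
	if cache != nil {
		cache.Steps++
	}
	return x
}

func main() {
	err := register("mymodel", func(path string) (Model, error) {
		return &myModel{path: path}, nil
	})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	m, err := lookup("mymodel", "/path/to/model/")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	cache := &KVCache{}
	out := m.Forward(&Array{Data: []float32{1, 2, 3}}, cache)
	fmt.Println(len(out.Data), cache.Steps)
}
```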

Tokenizer

The BPE tokenizer in tokenizer/tokenizer.go reads SentencePiece-format vocabulary files:

tok, err := tokenizer.Load("/path/to/tokenizer.model")
if err != nil {
    log.Fatal(err)
}
ids := tok.Encode("Hello world")
text := tok.Decode(ids)
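
For readers unfamiliar with how a scored BPE vocabulary drives encoding, here is a simplified sketch: start from single characters and repeatedly merge the adjacent pair whose concatenation is the highest-scoring piece in the vocabulary. The toy vocabulary and the bpeEncode helper are hypothetical, and the code in tokenizer/tokenizer.go may differ, for example in whitespace handling, byte fallback, and mapping pieces to IDs.

```go
package main

import "fmt"

// bpeEncode splits text into vocabulary pieces by greedy pair merging.
// vocab maps each known piece to a score; higher-scoring merges win.
// Hypothetical helper for illustration only.
func bpeEncode(text string, vocab map[string]float64) []string {
	// Start from individual Unicode characters.
	var pieces []string
	for _, r := range text {
		pieces = append(pieces, string(r))
	}

	for {
		best := -1
		bestScore := 0.0
		// Find the adjacent pair whose merged form scores highest.
		for i := 0; i+1 < len(pieces); i++ {
			score, ok := vocab[pieces[i]+pieces[i+1]]
			if ok && (best == -1 || score > bestScore) {
				best, bestScore = i, score
			}
		}
		if best == -1 {
			return pieces // no mergeable pair is left
		}
		// Replace the pair with its merged piece and close the gap.
		pieces[best] = pieces[best] + pieces[best+1]
		pieces = append(pieces[:best+1], pieces[best+2:]...)
	}
}

func main() {
	// Toy vocabulary; scores are made up for the example.
	vocab := map[string]float64{
		"he": -1, "ll": -2, "hell": -3, "hello": -4, "lo": -5,
	}
	fmt.Println(bpeEncode("hello", vocab))
}
```

Because only merges present in the vocabulary are applied, any character sequence without a matching piece stays split into smaller units.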