Models
Virgil edited this page 2026-02-19 17:58:33 +00:00
Model Support
Supported Architectures
| Model | File | Parameters | Notes |
|---|---|---|---|
| Gemma3 | model/gemma3.go | 1B, 4B, 27B | Google's open model family |
| Qwen3 | model/qwen3.go | 8B+ | Alibaba's open model family |
Model Interface
All models implement:
type Model interface {
	Forward(x *mlx.Array, cache *cache.KVCache) *mlx.Array
}
model.LoadModel(path) auto-detects the architecture from the config file and returns the appropriate implementation.
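One way this kind of detection could work is sketched below. It assumes a Hugging Face style config.json with a model_type field; the field name, the detectArch function, and the loaders map are illustrative assumptions, not the package's actual internals.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// modelConfig holds the only field this sketch needs. The "model_type" key
// follows the Hugging Face config.json convention and is an assumption here.
type modelConfig struct {
	ModelType string `json:"model_type"`
}

// loaders maps an architecture name to a hypothetical constructor.
// A real constructor would return a value that implements Model.
var loaders = map[string]func(dir string) string{
	"gemma3": func(dir string) string { return "gemma3 model from " + dir },
	"qwen3":  func(dir string) string { return "qwen3 model from " + dir },
}

// detectArch parses raw config JSON and returns the architecture name,
// or an error when the field is missing or no loader is registered for it.
func detectArch(raw []byte) (string, error) {
	var cfg modelConfig
	if err := json.Unmarshal(raw, &cfg); err != nil {
		return "", fmt.Errorf("parse config: %w", err)
	}
	if cfg.ModelType == "" {
		return "", errors.New("config has no model_type field")
	}
	if _, ok := loaders[cfg.ModelType]; !ok {
		return "", fmt.Errorf("unsupported architecture %q", cfg.ModelType)
	}
	return cfg.ModelType, nil
}

func main() {
	arch, err := detectArch([]byte(`{"model_type": "qwen3"}`))
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(loaders[arch]("/path/to/model/"))
}
```

Keeping the dispatch in a map means an unknown architecture fails with a clear error instead of falling through to a default implementation.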
Generation
m, err := model.LoadModel("/path/to/model/")
if err != nil {
	log.Fatal(err)
}
// Token-by-token generation with sampling
tokens := m.Generate(promptTokens, model.GenerateOptions{
	MaxTokens:   512,
	Temperature: 0.7,
	TopK:        40,
})
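How Temperature and TopK typically interact during sampling can be illustrated with a small standalone sketch. It is not the library's implementation: sampleTopK and its rand01 parameter are hypothetical names, and it works on plain float64 logits rather than mlx arrays.

```go
package main

import (
	"fmt"
	"math"
	"math/rand"
	"sort"
)

// sampleTopK picks a token id from logits using top-k filtering and
// temperature-scaled softmax. It assumes temperature > 0 and a non-empty
// logits slice. rand01 must return a uniform value in [0, 1).
func sampleTopK(logits []float64, temperature float64, k int, rand01 func() float64) int {
	// Sort token ids by logit, highest first.
	ids := make([]int, len(logits))
	for i := range ids {
		ids[i] = i
	}
	sort.Slice(ids, func(a, b int) bool { return logits[ids[a]] > logits[ids[b]] })

	// Keep only the k most likely tokens.
	if k <= 0 || k > len(ids) {
		k = len(ids)
	}
	ids = ids[:k]

	// Softmax over the survivors; subtracting the max keeps exp() stable.
	maxLogit := logits[ids[0]]
	weights := make([]float64, k)
	var sum float64
	for i, id := range ids {
		weights[i] = math.Exp((logits[id] - maxLogit) / temperature)
		sum += weights[i]
	}

	// Draw from the resulting categorical distribution.
	r := rand01() * sum
	for i, w := range weights {
		r -= w
		if r < 0 {
			return ids[i]
		}
	}
	return ids[k-1] // guard against floating-point rounding
}

func main() {
	logits := []float64{2.0, 1.0, 0.5, -1.0}
	fmt.Println(sampleTopK(logits, 0.7, 2, rand.Float64))
}
```

A lower temperature sharpens the distribution toward the highest logit, while a smaller k removes unlikely tokens from consideration entirely. Passing the random source as a function keeps the sketch deterministic under test.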
Adding New Models
- Create model/{name}.go with //go:build darwin && arm64
- Implement the Model interface
- Register architecture detection in model/model.go
- The model must use safetensors format
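The safetensors requirement in the last step can be checked before wiring up a new architecture. A safetensors file starts with an 8-byte little-endian header length followed by a JSON header describing each tensor. The sketch below parses that header from a byte slice; parseHeader, withHeader, and the tensor name embed.weight are hypothetical and not part of this repository.

```go
package main

import (
	"encoding/binary"
	"encoding/json"
	"errors"
	"fmt"
)

// tensorInfo mirrors one tensor entry of the safetensors JSON header.
type tensorInfo struct {
	DType       string   `json:"dtype"`
	Shape       []int64  `json:"shape"`
	DataOffsets [2]int64 `json:"data_offsets"`
}

// parseHeader decodes the header of a safetensors file held in data:
// an 8-byte little-endian length N, then N bytes of JSON.
func parseHeader(data []byte) (map[string]tensorInfo, error) {
	if len(data) < 8 {
		return nil, errors.New("file too short for a safetensors header")
	}
	n := binary.LittleEndian.Uint64(data[:8])
	if n > uint64(len(data)-8) {
		return nil, errors.New("header length exceeds file size")
	}
	var raw map[string]json.RawMessage
	if err := json.Unmarshal(data[8:8+n], &raw); err != nil {
		return nil, fmt.Errorf("parse header JSON: %w", err)
	}
	delete(raw, "__metadata__") // optional free-form metadata, not a tensor

	tensors := make(map[string]tensorInfo, len(raw))
	for name, msg := range raw {
		var info tensorInfo
		if err := json.Unmarshal(msg, &info); err != nil {
			return nil, fmt.Errorf("tensor %q: %w", name, err)
		}
		tensors[name] = info
	}
	return tensors, nil
}

// withHeader prepends the 8-byte length prefix to a JSON header, producing
// the leading bytes of a safetensors file (tensor data omitted).
func withHeader(headerJSON string) []byte {
	out := make([]byte, 8, 8+len(headerJSON))
	binary.LittleEndian.PutUint64(out, uint64(len(headerJSON)))
	return append(out, headerJSON...)
}

func main() {
	data := withHeader(`{"embed.weight":{"dtype":"BF16","shape":[4,2],"data_offsets":[0,16]}}`)
	tensors, err := parseHeader(data)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for name, t := range tensors {
		fmt.Println(name, t.DType, t.Shape)
	}
}
```

For real checkpoints it would be enough to read the first 8 bytes plus the header, since the tensor data that follows can be many gigabytes.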
Tokenizer
The BPE tokenizer in tokenizer/tokenizer.go reads SentencePiece-format vocabulary files:
tok, err := tokenizer.Load("/path/to/tokenizer.model")
if err != nil {
	log.Fatal(err)
}
ids := tok.Encode("Hello world")
text := tok.Decode(ids)
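For readers unfamiliar with BPE, the merge step at its core can be sketched in a few lines. This is a generic toy version with a hand-written merge table, not the logic in tokenizer/tokenizer.go; bpeEncode and the ranks map are hypothetical, and a SentencePiece vocabulary stores scored pieces rather than this explicit pair table.

```go
package main

import (
	"fmt"
	"strings"
)

// bpeEncode splits word into characters, then repeatedly merges the adjacent
// pair with the lowest rank until no known pair remains.
func bpeEncode(word string, ranks map[[2]string]int) []string {
	parts := strings.Split(word, "")
	for len(parts) > 1 {
		// Find the adjacent pair with the best (lowest) merge rank.
		best, bestRank := -1, 0
		for i := 0; i+1 < len(parts); i++ {
			r, ok := ranks[[2]string{parts[i], parts[i+1]}]
			if ok && (best == -1 || r < bestRank) {
				best, bestRank = i, r
			}
		}
		if best == -1 {
			break // no mergeable pair left
		}
		// Replace the pair with its concatenation.
		parts[best] = parts[best] + parts[best+1]
		parts = append(parts[:best+1], parts[best+2:]...)
	}
	return parts
}

func main() {
	// Toy merge table: lower rank means the merge was learned earlier.
	ranks := map[[2]string]int{
		{"l", "l"}:   0,
		{"h", "e"}:   1,
		{"he", "ll"}: 2,
	}
	fmt.Println(bpeEncode("hello", ranks)) // prints [hell o]
}
```

Each resulting piece would then be looked up in the vocabulary to obtain the integer ids that Encode returns.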