Add "Models"
parent 57aa54cb64
commit dec29b00e0
1 changed file with 53 additions and 0 deletions
Models.md (normal file, 53 lines)
@@ -0,0 +1,53 @@
# Model Support

## Supported Architectures

| Model | File | Parameters | Notes |
|-------|------|-----------|-------|
| Gemma3 | `model/gemma3.go` | 1B, 4B, 27B | Google's open model family |
| Qwen3 | `model/qwen3.go` | 8B+ | Alibaba's open model family |

## Model Interface

All models implement:

```go
type Model interface {
	Forward(x *mlx.Array, cache *cache.KVCache) *mlx.Array
}
```

`model.LoadModel(path)` auto-detects the architecture from the config file and returns the appropriate implementation.
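
As a rough illustration of the auto-detection step, the sketch below parses a config and dispatches on its `model_type` field. The `model_type` key (the Hugging Face `config.json` convention), the accepted values, and the `detectArchitecture` helper are all assumptions made for illustration; the actual logic inside `model.LoadModel` may differ.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// modelConfig holds only the field needed for detection.
// Assumption: the config carries a Hugging Face style "model_type" key.
type modelConfig struct {
	ModelType string `json:"model_type"`
}

// detectArchitecture is a hypothetical helper that maps raw config bytes
// to one of the supported architecture names.
func detectArchitecture(raw []byte) (string, error) {
	var cfg modelConfig
	if err := json.Unmarshal(raw, &cfg); err != nil {
		return "", fmt.Errorf("parse config: %w", err)
	}
	switch strings.ToLower(cfg.ModelType) {
	case "gemma3", "gemma3_text":
		return "gemma3", nil
	case "qwen3":
		return "qwen3", nil
	default:
		return "", fmt.Errorf("unsupported architecture %q", cfg.ModelType)
	}
}

func main() {
	arch, err := detectArchitecture([]byte(`{"model_type": "gemma3_text"}`))
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(arch)
}
```

Returning an error for unknown values, rather than falling back to a default, keeps a misconfigured model directory from being loaded with the wrong weights layout.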

## Generation

```go
m, err := model.LoadModel("/path/to/model/")
if err != nil {
	log.Fatal(err)
}

// Token-by-token generation with sampling
tokens := m.Generate(promptTokens, model.GenerateOptions{
	MaxTokens:   512,
	Temperature: 0.7,
	TopK:        40,
})
```
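
To make the `Temperature` and `TopK` options concrete, here is a minimal sketch of temperature-scaled top-k sampling over a plain slice of logits. The `sampleTopK` helper and its signature are hypothetical and work on `[]float64` rather than MLX arrays; it shows the general technique, not necessarily how `Generate` samples internally.

```go
package main

import (
	"fmt"
	"math"
	"math/rand"
	"sort"
)

// sampleTopK picks one token ID from logits after top-k filtering and
// temperature scaling. u is a uniform random draw in [0, 1).
// Assumes logits is non-empty. Hypothetical helper for illustration.
func sampleTopK(logits []float64, temperature float64, topK int, u float64) int {
	// Rank token IDs by logit, highest first.
	ids := make([]int, len(logits))
	for i := range ids {
		ids[i] = i
	}
	sort.Slice(ids, func(a, b int) bool { return logits[ids[a]] > logits[ids[b]] })

	// Keep only the k most likely candidates.
	if topK > 0 && topK < len(ids) {
		ids = ids[:topK]
	}
	// Treat a non-positive temperature as greedy decoding.
	if temperature <= 0 {
		return ids[0]
	}

	// Softmax weights over the survivors, shifted by the max for stability.
	maxLogit := logits[ids[0]]
	weights := make([]float64, len(ids))
	var sum float64
	for i, id := range ids {
		weights[i] = math.Exp((logits[id] - maxLogit) / temperature)
		sum += weights[i]
	}

	// Inverse-CDF sampling: walk the weights until the draw is used up.
	r := u * sum
	for i, w := range weights {
		r -= w
		if r < 0 {
			return ids[i]
		}
	}
	return ids[len(ids)-1]
}

func main() {
	logits := []float64{0.1, 2.5, 1.7, -0.3}
	fmt.Println(sampleTopK(logits, 0.7, 2, rand.Float64()))
}
```

Lower temperatures sharpen the distribution toward the highest logit, while `topK` bounds how many candidates can ever be chosen.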

## Adding New Models

1. Create `model/{name}.go` with `//go:build darwin && arm64`
2. Implement the `Model` interface
3. Register architecture detection in `model/model.go`
4. The model must use safetensors format
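
Steps 2 and 3 above might be sketched as follows. Everything here is hypothetical: `Array` and `KVCache` stand in for `*mlx.Array` and `*cache.KVCache` so the example compiles without the MLX bindings, and the `Register`/`Load` registry is one possible way to wire up detection, not necessarily what `model/model.go` does. A real model file would also carry the `//go:build darwin && arm64` constraint from step 1.

```go
package main

import "fmt"

// Array and KVCache are placeholders for *mlx.Array and *cache.KVCache.
type Array struct{ Data []float32 }
type KVCache struct{}

// Model mirrors the interface described in this document.
type Model interface {
	Forward(x *Array, cache *KVCache) *Array
}

// registry maps an architecture name to a constructor (hypothetical).
var registry = map[string]func(dir string) (Model, error){}

// Register makes an architecture available to the loader.
func Register(name string, ctor func(dir string) (Model, error)) {
	registry[name] = ctor
}

// echoModel is a toy architecture that returns its input unchanged.
type echoModel struct{}

func (echoModel) Forward(x *Array, _ *KVCache) *Array { return x }

// Each model file registers itself when the package is initialised.
func init() {
	Register("echo", func(dir string) (Model, error) { return echoModel{}, nil })
}

// Load looks up the constructor for an already detected architecture.
func Load(arch, dir string) (Model, error) {
	ctor, ok := registry[arch]
	if !ok {
		return nil, fmt.Errorf("unknown architecture %q", arch)
	}
	return ctor(dir)
}

func main() {
	m, err := Load("echo", "/path/to/model/")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	out := m.Forward(&Array{Data: []float32{1, 2, 3}}, &KVCache{})
	fmt.Println(len(out.Data))
}
```

A registry keeps each architecture's wiring inside its own file, so adding a model does not require editing a central switch statement.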

## Tokenizer

The BPE tokenizer in `tokenizer/tokenizer.go` reads SentencePiece-format vocabulary files:

```go
tok, err := tokenizer.Load("/path/to/tokenizer.model")
if err != nil {
	log.Fatal(err)
}
ids := tok.Encode("Hello world")
text := tok.Decode(ids)
```
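
For readers unfamiliar with BPE, the toy sketch below shows the core merge loop: a word starts as individual characters, and the adjacent pair with the best merge priority is joined repeatedly until no known pair remains. The `bpeEncode` helper and the merge table are invented for illustration; the real tokenizer reads its vocabulary from the model file and may rank merges differently.

```go
package main

import (
	"fmt"
	"strings"
)

// bpeEncode splits word into characters, then repeatedly merges the
// adjacent pair with the lowest rank until no ranked pair is left.
// ranks is keyed by "left right" (hypothetical layout).
func bpeEncode(word string, ranks map[string]int) []string {
	pieces := strings.Split(word, "")
	for len(pieces) > 1 {
		best, bestRank := -1, 0
		for i := 0; i+1 < len(pieces); i++ {
			r, ok := ranks[pieces[i]+" "+pieces[i+1]]
			if ok && (best == -1 || r < bestRank) {
				best, bestRank = i, r
			}
		}
		if best == -1 {
			break // no mergeable pair left
		}
		// Replace the two pieces at best and best+1 with their concatenation.
		merged := pieces[best] + pieces[best+1]
		tail := append([]string{merged}, pieces[best+2:]...)
		pieces = append(pieces[:best], tail...)
	}
	return pieces
}

func main() {
	// Toy merge table: lower value means the pair is merged earlier.
	ranks := map[string]int{"l l": 0, "h e": 1, "he ll": 2}
	pieces := bpeEncode("hello", ranks)
	fmt.Println(pieces)
	fmt.Println(strings.Join(pieces, "") == "hello")
}
```

With this table, `hello` ends up as the pieces `hell` and `o`, and joining the pieces recovers the original text, which is the same round-trip property `Encode` and `Decode` provide at the token ID level.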