1
Backend Registry
Virgil edited this page 2026-03-11 12:02:35 +00:00
Table of Contents
Backend Registry
Module: forge.lthn.ai/core/go-inference
The backend registry manages inference engine implementations. Backends self-register via init() with build tags, enabling platform-specific GPU acceleration without import-time coupling.
Registry Functions
| Function | Signature | Description |
|---|---|---|
Register |
Register(b Backend) |
Add a backend (called from init()) |
Get |
Get(name string) (Backend, bool) |
Retrieve by name |
List |
List() []string |
All registered backend names (sorted) |
All |
All() iter.Seq2[string, Backend] |
Iterator over all backends |
Default |
Default() (Backend, error) |
First available backend (prefers metal > rocm > llama_cpp) |
LoadModel |
LoadModel(path string, opts ...LoadOption) (TextModel, error) |
Load via specified or default backend |
Load Options
| Option | Description |
|---|---|
WithBackend(name) |
Select specific backend ("metal", "rocm", "llama_cpp") |
WithContextLen(n) |
Context window size (0 = model default) |
WithGPULayers(n) |
GPU layer offload count (-1 = all, 0 = none) |
WithParallelSlots(n) |
Concurrent inference slots |
WithAdapterPath(path) |
LoRA adapter directory |
Generate Options
| Option | Description |
|---|---|
WithMaxTokens(n) |
Cap output length (default 256) |
WithTemperature(t) |
Sampling temperature (0 = greedy) |
WithTopK(k) |
Top-k sampling (0 = disabled) |
WithTopP(p) |
Nucleus sampling threshold |
WithRepeatPenalty(p) |
Repetition penalty (1.0 = none) |
WithStopTokens(ids...) |
Token IDs that stop generation |
WithLogits() |
Return raw logits in ClassifyResult |
Model Discovery
Discover(baseDir string) iter.Seq[DiscoveredModel] scans for model directories containing config.json + *.safetensors. Returns DiscoveredModel with path, architecture, quantisation info, and file count.