Virgil edited this page 2026-03-11 12:02:35 +00:00

Backend Registry

Module: forge.lthn.ai/core/go-inference

The backend registry manages inference engine implementations. Backends self-register via init() with build tags, enabling platform-specific GPU acceleration without import-time coupling.
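
The self-registration pattern described above can be sketched as follows. This is a minimal illustration, not the module's actual code: the `Name()` method on `Backend`, the mutex-guarded map, and the `metalBackend` type are all assumptions made for the example.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// Backend is a stand-in for the package's interface.
// Only a Name() method is assumed here; the real interface is likely larger.
type Backend interface {
	Name() string
}

var (
	mu       sync.RWMutex
	backends = map[string]Backend{}
)

// Register adds a backend under its name; intended to be called from init().
func Register(b Backend) {
	mu.Lock()
	defer mu.Unlock()
	backends[b.Name()] = b
}

// Get retrieves a backend by name.
func Get(name string) (Backend, bool) {
	mu.RLock()
	defer mu.RUnlock()
	b, ok := backends[name]
	return b, ok
}

// List returns all registered backend names, sorted.
func List() []string {
	mu.RLock()
	defer mu.RUnlock()
	names := make([]string, 0, len(backends))
	for n := range backends {
		names = append(names, n)
	}
	sort.Strings(names)
	return names
}

// metalBackend is hypothetical. In a real backend package, the file holding
// this type would begin with a build constraint (for example one limiting it
// to darwin/arm64), so it is only compiled on supported platforms.
type metalBackend struct{}

func (metalBackend) Name() string { return "metal" }

// init runs before main, so the backend is registered as soon as the
// package is linked into the binary.
func init() { Register(metalBackend{}) }

func main() {
	fmt.Println(List()) // prints [metal]
	_, ok := Get("rocm")
	fmt.Println(ok) // prints false
}
```

Because registration happens in `init()`, the registry never has to name a concrete backend package; which backends exist is decided by which files the build tags let through.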

Registry Functions

| Function | Signature | Description |
|----------|-----------|-------------|
| Register | Register(b Backend) | Add a backend (called from init()) |
| Get | Get(name string) (Backend, bool) | Retrieve by name |
| List | List() []string | All registered backend names (sorted) |
| All | All() iter.Seq2[string, Backend] | Iterator over all backends |
| Default | Default() (Backend, error) | First available backend (prefers metal > rocm > llama_cpp) |
| LoadModel | LoadModel(path string, opts ...LoadOption) (TextModel, error) | Load via specified or default backend |

Load Options

| Option | Description |
|--------|-------------|
| WithBackend(name) | Select specific backend ("metal", "rocm", "llama_cpp") |
| WithContextLen(n) | Context window size (0 = model default) |
| WithGPULayers(n) | GPU layer offload count (-1 = all, 0 = none) |
| WithParallelSlots(n) | Concurrent inference slots |
| WithAdapterPath(path) | LoRA adapter directory |
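
These options follow the shape of Go's functional-options pattern. The sketch below shows one plausible way they could be defined and applied; the `LoadConfig` struct, its field names, the `applyLoadOpts` helper, and the starting value for `GPULayers` are hypothetical and not taken from the module.

```go
package main

import "fmt"

// LoadConfig is a hypothetical container for the settings in the table.
type LoadConfig struct {
	Backend       string // "" means "use the default backend"
	ContextLen    int    // 0 = model default
	GPULayers     int    // -1 = all, 0 = none
	ParallelSlots int
	AdapterPath   string
}

// LoadOption mutates a LoadConfig; each With* function returns one.
type LoadOption func(*LoadConfig)

func WithBackend(name string) LoadOption   { return func(c *LoadConfig) { c.Backend = name } }
func WithContextLen(n int) LoadOption      { return func(c *LoadConfig) { c.ContextLen = n } }
func WithGPULayers(n int) LoadOption       { return func(c *LoadConfig) { c.GPULayers = n } }
func WithParallelSlots(n int) LoadOption   { return func(c *LoadConfig) { c.ParallelSlots = n } }
func WithAdapterPath(path string) LoadOption { return func(c *LoadConfig) { c.AdapterPath = path } }

// applyLoadOpts starts from placeholder defaults and applies options in order.
// GPULayers: -1 is a placeholder; the real default is not stated on this page.
func applyLoadOpts(opts ...LoadOption) LoadConfig {
	cfg := LoadConfig{GPULayers: -1}
	for _, opt := range opts {
		opt(&cfg)
	}
	return cfg
}

func main() {
	cfg := applyLoadOpts(WithBackend("rocm"), WithContextLen(4096))
	fmt.Printf("%+v\n", cfg)
}
```

A call such as `LoadModel(path, WithBackend("rocm"), WithContextLen(4096))` would then resolve its configuration in the same way before handing off to the chosen backend.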

Generate Options

| Option | Description |
|--------|-------------|
| WithMaxTokens(n) | Cap output length (default 256) |
| WithTemperature(t) | Sampling temperature (0 = greedy) |
| WithTopK(k) | Top-k sampling (0 = disabled) |
| WithTopP(p) | Nucleus sampling threshold |
| WithRepeatPenalty(p) | Repetition penalty (1.0 = none) |
| WithStopTokens(ids...) | Token IDs that stop generation |
| WithLogits() | Return raw logits in ClassifyResult |
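
To make the sampling parameters concrete, here is a small sketch of what greedy selection, top-k filtering, and a repetition penalty conventionally mean. It is not the backends' implementation: the helper names are hypothetical, and the penalty rule (divide positive logits, multiply negative ones) is the common convention, assumed here rather than confirmed by this page.

```go
package main

import (
	"fmt"
	"sort"
)

// topKIndices returns the indices of the k largest logits, highest first.
// k <= 0 (or k >= len(logits)) means top-k is disabled: every index is kept.
func topKIndices(logits []float32, k int) []int {
	idx := make([]int, len(logits))
	for i := range idx {
		idx[i] = i
	}
	sort.SliceStable(idx, func(a, b int) bool {
		return logits[idx[a]] > logits[idx[b]]
	})
	if k > 0 && k < len(idx) {
		idx = idx[:k]
	}
	return idx
}

// applyRepeatPenalty lowers the score of already-generated tokens in place.
// A penalty of 1.0 leaves the logits unchanged.
func applyRepeatPenalty(logits []float32, seen []int, penalty float32) {
	for _, id := range seen {
		if id < 0 || id >= len(logits) {
			continue
		}
		if logits[id] > 0 {
			logits[id] /= penalty
		} else {
			logits[id] *= penalty
		}
	}
}

// greedy picks the single highest-scoring token, which is what a
// temperature of 0 amounts to.
func greedy(logits []float32) int {
	return topKIndices(logits, 1)[0]
}

func main() {
	logits := []float32{0.5, 2.0, -1.0, 1.5}
	fmt.Println(topKIndices(logits, 2)) // prints [1 3]

	applyRepeatPenalty(logits, []int{1, 2}, 2.0)
	fmt.Println(logits)         // prints [0.5 1 -2 1.5]
	fmt.Println(greedy(logits)) // prints 3
}
```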

Model Discovery

Discover(baseDir string) iter.Seq[DiscoveredModel] scans baseDir for model directories that contain both config.json and *.safetensors files. Each DiscoveredModel it yields reports the model's path, architecture, quantisation info, and file count.