forked from lthn/LEM
LEM/pkg/lem/backend_mlxlm.go
Snider d2cf891f15 feat: add mlx_lm subprocess backend and distill improvements
- Add backend_mlxlm.go blank import to register mlx-lm subprocess backend
- Select backend from ai.yaml config (metal, mlx_lm, rocm, api)
- Only set Metal cache/memory limits when using metal backend
- Add --no-dedup flag to disable grammar-profile deduplication
  (trained models with a consistent voice trigger false positives at the 0.02 threshold)
- Add --context-len flag and context_len config for KV cache sizing
- Pass WithBackend and WithContextLen to go-ml backend loader
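The backend selection and context-length settings above live in ai.yaml. A plausible sketch of that config, assuming the key names follow the commit's wording (`backend`, `context_len` — exact schema not shown in this file):

```yaml
# ai.yaml — hypothetical sketch, keys inferred from the commit message
backend: mlx_lm      # one of: metal, mlx_lm, rocm, api
context_len: 8192    # KV cache sizing; overridable with --context-len
```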

Co-Authored-By: Virgil <virgil@lethean.io>
2026-02-23 18:37:12 +00:00


package lem
// Blank import registers the mlx-lm subprocess backend with go-inference.
// The backend spawns a Python process running mlx-lm for inference; Python's
// reference counting handles memory management natively (2.4 GB vs 17+ GB
// observed under CGO).
import _ "forge.lthn.ai/core/go-mlx/mlxlm"
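The blank import has an effect because Go runs every imported package's `init` functions, so the `mlxlm` package can register itself as a side effect of being linked in. A minimal sketch of that registration idiom (all names here are hypothetical, not the actual go-mlx API):

```go
package main

import "fmt"

// backends is a hypothetical registry keyed by backend name.
var backends = map[string]func() string{}

// Register is what a backend package would call to make itself selectable.
func Register(name string, factory func() string) {
	backends[name] = factory
}

// In the real code, an init like this would live inside the mlxlm package,
// so `import _ "forge.lthn.ai/core/go-mlx/mlxlm"` alone is enough to run it.
func init() {
	Register("mlx_lm", func() string { return "mlx-lm subprocess backend" })
}

func main() {
	// Later, a loader can look the backend up by the configured name.
	if factory, ok := backends["mlx_lm"]; ok {
		fmt.Println(factory())
	}
}
```

The payoff of the pattern is that adding or removing a backend is a one-line import change in the consuming package, with no other wiring.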