# Platform Matrix

Claude edited this page 2026-02-19 20:05:22 +00:00

Which packages run where and why.

## macOS (M3 Ultra, 96GB unified memory)

| Package | Why macOS | Build Tag |
|---|---|---|
| core/go-mlx | Metal GPU inference, CGO/mlx-c | `darwin && arm64` |
| core/go-i18n | Grammar engine (no platform dep, but Claude actively works on it here) | none |
| core/go-html | WASM rendering (no platform dep) | none |
| core/go (framework) | Virgil's home base | none |

## Linux (snider-linux: Ryzen 9 9950X, RX 7800 XT gfx1100, 16GB VRAM, ROCm 7.2, Kernel 6.17)

| Package | Why Linux | Build Tag | Status |
|---|---|---|---|
| core/go-rocm | ROCm/HIP GPU inference, llama-server | `linux && amd64` | Phase 0 done, Phase 1 pending |
| core/go-p2p | Real networking tests, no sandbox | none | Pending |
| core/go-rag | Qdrant + Ollama with GPU embeddings | none | Pending |

## Platform-Agnostic (can run anywhere)

| Package | Notes |
|---|---|
| core/go-inference | Zero deps, shared interfaces |
| core/go-ai | Meta hub, imports others |
| core/go-ml | Scoring engine, wraps backends |
| core/go-agentic | Service lifecycle |
| core/go-crypt | Crypto/auth |
| core/go-scm | SCM/CI/AgentCI |
| core/go-help, go-ratelimit, go-session, go-store | Small utilities |
| core/go-ws | WebSocket hub |
| core/go-webview | CDP automation |
| core/go-devops, go-netops | Infra tooling |

## GPU Backend Comparison

| Aspect | go-mlx (macOS) | go-rocm (Linux) |
|---|---|---|
| GPU | Apple Metal (M-series) | AMD ROCm (RDNA 3, gfx1100) |
| Approach | Direct CGO (mlx-c) | Subprocess (llama-server) |
| Model format | Safetensors | GGUF |
| Interface | `go-inference.TextModel` | `go-inference.TextModel` |
| VRAM | Unified (shared with CPU) | Dedicated 16GB GDDR6 |
| Strength | Low latency, tight integration | Model variety (50+ architectures via llama.cpp) |
| Baseline | TBD | Gemma3-4B: 109 tok/s decode, 396 tok/s prefill |
| Gotcha | None known | iGPU crash; `HIP_VISIBLE_DEVICES=0` required |