go-rocm: AMD ROCm GPU inference for Linux. llama.cpp + HIP backend for RDNA 3.
Latest commit: replaces the filename-based guessModelType with GGUF header parsing, and caps the default context at 4096 to prevent VRAM exhaustion on models with 128K+ native context. Co-Authored-By: Virgil <virgil@lethean.io>
Repository contents:

- docs/plans
- internal
- backend.go
- CLAUDE.md
- discover.go
- discover_test.go
- FINDINGS.md
- go.mod
- go.sum
- model.go
- README.md
- register_rocm.go
- rocm.go
- rocm_integration_test.go
- rocm_stub.go
- server.go
- server_test.go
- TODO.md
- vram.go
- vram_test.go
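The 4096 default context cap guards against KV-cache growth: the cache holds one K and one V tensor per layer, each n_ctx × n_embd elements, so memory scales linearly with context length. A rough estimate assuming classic multi-head attention with an f16 cache (no grouped-query sharing; `kvCacheBytes` is a hypothetical helper, not the API of go-rocm's vram.go):

```go
package main

import "fmt"

// kvCacheBytes estimates transformer KV-cache size: 2 tensors (K and V)
// per layer, each nCtx x nEmbd elements at bytesPerElt bytes (2 for f16).
// Hypothetical helper for illustration only.
func kvCacheBytes(nLayer, nCtx, nEmbd, bytesPerElt uint64) uint64 {
	return 2 * nLayer * nCtx * nEmbd * bytesPerElt
}

func main() {
	// A 7B-class model: 32 layers, 4096-wide embeddings, f16 cache.
	at4k := kvCacheBytes(32, 4096, 4096, 2)
	at128k := kvCacheBytes(32, 131072, 4096, 2)
	fmt.Printf("4K ctx:   %.1f GiB\n", float64(at4k)/(1<<30))   // 2.0 GiB
	fmt.Printf("128K ctx: %.1f GiB\n", float64(at128k)/(1<<30)) // 64.0 GiB
}
```

At a model's full 128K native context the cache alone would dwarf the VRAM of any RDNA 3 consumer card, which is why a conservative default with an explicit override is the safer design.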
go-rocm