Claude edited this page 2026-02-19 20:07:58 +00:00

go-rocm

AMD ROCm GPU inference backend for Linux. Implements the inference.Backend and inference.TextModel interfaces from core/go-inference, using llama.cpp's server mode with HIP/ROCm.

  • Environment — Hardware, ROCm, llama-server setup (validated 19 Feb 2026)
  • Architecture — How it works, design decisions, file layout
  • Interface-Contract — go-inference types this package must implement
  • Models — Available GGUF models and VRAM budget
  • Fleet-Context — How this repo fits in the wider agent fleet

Status

Phase                           Status                  Notes
Phase 0: Environment            Done (Charon, 19 Feb)   ROCm 7.2, llama-server built, baselines recorded
Phase 1: Core Implementation    Pending                 GPU detection, server lifecycle, HTTP client, TextModel
Phase 2: Robustness             Pending                 Crash recovery, graceful shutdown, VRAM monitoring
Phase 3: Model Support          Pending                 GGUF discovery, chat templates, context sizing
Phase 4: Performance            Pending                 Benchmarks, flash attention, batch inference

Module

forge.lthn.ai/core/go-rocm

Depends on: forge.lthn.ai/core/go-inference (shared interfaces, zero external dependencies)
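A consumer's go.mod would then pull in both modules. This is a sketch under assumptions: the consumer module path is invented, and the version numbers are placeholders, not published releases.

```
// Hypothetical consumer module; path and versions are placeholders.
module example.com/my-agent

go 1.22

require (
	forge.lthn.ai/core/go-inference v0.0.0 // shared interfaces, zero external deps
	forge.lthn.ai/core/go-rocm v0.0.0      // this package: ROCm backend
)
```

Keeping go-inference dependency-free means consumers that only need the interfaces never pull in ROCm-specific code.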