Claude edited this page 2026-02-19 20:07:58 +00:00

Environment

Validated by Charon, 19 Feb 2026.

Hardware

| Component | Detail |
|---|---|
| Machine | snider-linux (10.69.69.165) |
| CPU | AMD Ryzen 9 9950X (32 threads @ 5.7GHz) |
| GPU (dGPU) | AMD Radeon RX 7800 XT — gfx1100, RDNA 3, 16GB GDDR6, 60 CUs |
| GPU (iGPU) | AMD Radeon Graphics — gfx1100, integrated in CPU |
| OS | Ubuntu (kernel 6.17.0-14-generic) |
| ROCm | 7.2.0 |
| Go | 1.25.6 |

iGPU Crash — Critical

The Ryzen 9 9950X iGPU is detected by ROCm as Device 1. It reports ~100GB free memory (actually system RAM, not real VRAM). llama-server's auto-fit logic tries to split model tensors across both devices, causing:

ROCm error: unspecified launch failure
  in function ggml_backend_cuda_buffer_set_tensor

This crashes with a core dump every time.

Fix: Always set HIP_VISIBLE_DEVICES=0 before spawning llama-server:

cmd.Env = append(os.Environ(), "HIP_VISIBLE_DEVICES=0")

Without this, every model load crashes. The override must be applied in server.go, at the point where the llama-server process is spawned.

llama-server

| Detail | Value |
|---|---|
| Binary | /usr/local/bin/llama-server |
| Source | /home/claude/llama.cpp/ (commit 11c325c) |
| Build flags | -DGGML_HIP=ON -DAMDGPU_TARGETS=gfx1100 -DGGML_HIP_ROCWMMA_FATTN=ON -DCMAKE_BUILD_TYPE=Release |
| Version output | ggml_cuda_init: found 2 ROCm devices (with HIP_VISIBLE_DEVICES=0: 1 device) |

Rebuild

cd /home/claude/llama.cpp
git pull
cmake -B build \
    -DGGML_HIP=ON \
    -DAMDGPU_TARGETS=gfx1100 \
    -DGGML_HIP_ROCWMMA_FATTN=ON \
    -DCMAKE_BUILD_TYPE=Release
cmake --build build --parallel $(nproc) -t llama-server
sudo cp build/bin/llama-server /usr/local/bin/llama-server

Manual Test

HIP_VISIBLE_DEVICES=0 llama-server \
    --model /data/lem/gguf/LEK-Gemma3-4B-Q4_K_M.gguf \
    --port 8199 --n-gpu-layers 99 --ctx-size 4096

# In another terminal:
curl -s http://localhost:8199/v1/chat/completions \
    -H "Content-Type: application/json" \
    -d '{"messages":[{"role":"user","content":"Hello"}],"max_tokens":32,"stream":false}' | jq .

ROCm Verification

rocm-smi          # GPU status, temp, VRAM usage
rocminfo          # Detailed GPU info (confirms gfx1100)
ls -l /dev/kfd    # Device node must exist (ROCm kernel driver loaded)
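The `/dev/kfd` check can also be scripted from Go; a small sketch (the devicePresent helper is illustrative):

```go
package main

import (
	"fmt"
	"os"
)

// devicePresent reports whether a device node exists. /dev/kfd only
// appears once the amdgpu/KFD kernel driver is loaded.
func devicePresent(path string) bool {
	_, err := os.Stat(path)
	return err == nil
}

func main() {
	fmt.Println("/dev/kfd present:", devicePresent("/dev/kfd"))
}
```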

Environment Variables

| Variable | Required | Default | Purpose |
|---|---|---|---|
| HIP_VISIBLE_DEVICES | Yes | must be set to 0 | Masks iGPU to prevent crash |
| ROCM_LLAMA_SERVER_PATH | No | llama-server (via PATH) | Custom binary path |
| ROCM_MODEL_DIR | No | none | Default model directory |
| HSA_OVERRIDE_GFX_VERSION | No | unset | Not needed (native gfx1100) |