Claude edited this page 2026-02-19 20:07:58 +00:00
Environment
Validated by Charon, 19 Feb 2026.
Hardware
| Component | Detail |
|---|---|
| Machine | snider-linux (10.69.69.165) |
| CPU | AMD Ryzen 9 9950X (16 cores / 32 threads, up to 5.7 GHz boost) |
| GPU (dGPU) | AMD Radeon RX 7800 XT — gfx1100, RDNA 3, 16GB GDDR6, 60 CUs |
| GPU (iGPU) | AMD Radeon Graphics — gfx1100, integrated in CPU |
| OS | Ubuntu (Kernel 6.17.0-14-generic) |
| ROCm | 7.2.0 |
| Go | 1.25.6 |
iGPU Crash — Critical
The Ryzen 9 9950X iGPU is detected by ROCm as Device 1. It reports ~100GB of free memory (shared system RAM, not dedicated VRAM). llama-server's auto-fit logic therefore tries to split model tensors across both devices, causing:
ROCm error: unspecified launch failure
in function ggml_backend_cuda_buffer_set_tensor
This crashes with a core dump every time.
Fix: Always set HIP_VISIBLE_DEVICES=0 before spawning llama-server:
cmd.Env = append(os.Environ(), "HIP_VISIBLE_DEVICES=0")
Without this, every model load will crash. It must be set in server.go wherever llama-server is spawned.
llama-server
| Detail | Value |
|---|---|
| Binary | /usr/local/bin/llama-server |
| Source | /home/claude/llama.cpp/ (commit 11c325c) |
| Build flags | -DGGML_HIP=ON -DAMDGPU_TARGETS=gfx1100 -DGGML_HIP_ROCWMMA_FATTN=ON -DCMAKE_BUILD_TYPE=Release |
| Version output | ggml_cuda_init: found 2 ROCm devices (with HIP_VISIBLE_DEVICES=0: 1 device) |
Rebuild
cd /home/claude/llama.cpp
git pull
cmake -B build \
-DGGML_HIP=ON \
-DAMDGPU_TARGETS=gfx1100 \
-DGGML_HIP_ROCWMMA_FATTN=ON \
-DCMAKE_BUILD_TYPE=Release
cmake --build build --parallel $(nproc) -t llama-server
sudo cp build/bin/llama-server /usr/local/bin/llama-server
Manual Test
HIP_VISIBLE_DEVICES=0 llama-server \
--model /data/lem/gguf/LEK-Gemma3-4B-Q4_K_M.gguf \
--port 8199 --n-gpu-layers 99 --ctx-size 4096
# In another terminal:
curl -s http://localhost:8199/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{"messages":[{"role":"user","content":"Hello"}],"max_tokens":32,"stream":false}' | jq .
ROCm Verification
rocm-smi # GPU status, temp, VRAM usage
rocminfo # Detailed GPU info (confirms gfx1100)
ls -l /dev/kfd     # Must exist (ROCm kernel driver)
Environment Variables
| Variable | Required | Default | Purpose |
|---|---|---|---|
| HIP_VISIBLE_DEVICES | Yes | must be set to 0 | Masks the iGPU to prevent the crash |
| ROCM_LLAMA_SERVER_PATH | No | llama-server (found via PATH) | Custom binary path |
| ROCM_MODEL_DIR | No | none | Default model directory |
| HSA_OVERRIDE_GFX_VERSION | No | unset | Not needed (native gfx1100) |