Integration test verifies model discovery on real GGUF files.
All 9 models in /data/lem/gguf/ are discovered with correct metadata.
Co-Authored-By: Virgil <virgil@lethean.io>
Clear lastErr at the start of each Generate/Chat call so that Err()
reflects the most recent call, not a stale cancellation from a prior one.
Add two integration tests:
- GracefulShutdown: cancel mid-stream then generate again on the same
  model, verifying the server survives cancellation.
- ConcurrentRequests: three goroutines calling Generate() simultaneously,
  verifying no panics or deadlocks (llama-server serialises via slots).
Co-Authored-By: Virgil <virgil@lethean.io>
Add integration tests that run LoadModel → Generate → Chat → Close on real
AMD GPU hardware. The tests are build-tagged //go:build rocm so a normal
go test skips them.
Co-Authored-By: Virgil <virgil@lethean.io>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>