Clear lastErr at the start of each Generate/Chat call so that Err()
reflects the most recent call, not a stale cancellation from a prior one.
Add two integration tests:
- GracefulShutdown: cancel mid-stream then generate again on the same
model, verifying the server survives cancellation.
- ConcurrentRequests: three goroutines calling Generate() simultaneously,
verifying no panics or deadlocks (llama-server serialises via slots).
Co-Authored-By: Virgil <virgil@lethean.io>
rocmModel implements inference.TextModel with Generate() and Chat()
methods that delegate to the llamacpp HTTP client, mapping go-inference
types to llama-server's OpenAI-compatible API. Tokens stream via
iter.Seq[inference.Token] with mutex-protected error propagation.
Co-Authored-By: Virgil <virgil@lethean.io>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>