6 commits

Author SHA1 Message Date
Claude
c7c9389749
feat: GGUF metadata parser for model discovery
Reads GGUF v2/v3 binary headers to extract model metadata (architecture,
name, quantisation type, context length, block count). Includes string
length limits for malformed input protection and uint64 value support
for compatibility with varied GGUF producers.

Co-Authored-By: Virgil <virgil@lethean.io>
2026-02-19 22:20:02 +00:00
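
As a rough illustration of what this commit describes, here is a minimal sketch of reading a GGUF v2/v3 header and a few metadata values with a string length cap. It is not the code from the commit: the names `parseMetadata`, `readString`, and `maxStringLen` are hypothetical, the 1 MiB limit is an assumed value, and little-endian byte order is assumed throughout. Only three value types are handled; any other type returns an error.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"errors"
	"fmt"
	"io"
)

// maxStringLen is an assumed cap; the commit does not state the real limit.
const maxStringLen = 1 << 20

// GGUF value type IDs handled by this sketch.
const (
	typeUint32 = 4
	typeString = 8
	typeUint64 = 10
)

// readString reads a GGUF string: a uint64 length followed by that many bytes.
// The length is checked before allocating so malformed input cannot force a
// huge allocation.
func readString(r io.Reader) (string, error) {
	var n uint64
	if err := binary.Read(r, binary.LittleEndian, &n); err != nil {
		return "", err
	}
	if n > maxStringLen {
		return "", fmt.Errorf("string length %d exceeds limit %d", n, maxStringLen)
	}
	buf := make([]byte, n)
	if _, err := io.ReadFull(r, buf); err != nil {
		return "", err
	}
	return string(buf), nil
}

// parseMetadata reads the fixed header and then the key/value section.
// uint32 values are widened to uint64 so callers see a single integer type.
func parseMetadata(data []byte) (map[string]any, error) {
	r := bytes.NewReader(data)
	var hdr struct {
		Magic   [4]byte
		Version uint32
		Tensors uint64
		KVCount uint64
	}
	if err := binary.Read(r, binary.LittleEndian, &hdr); err != nil {
		return nil, err
	}
	if string(hdr.Magic[:]) != "GGUF" {
		return nil, errors.New("not a GGUF file")
	}
	if hdr.Version != 2 && hdr.Version != 3 {
		return nil, fmt.Errorf("unsupported GGUF version %d", hdr.Version)
	}
	meta := make(map[string]any)
	for i := uint64(0); i < hdr.KVCount; i++ {
		key, err := readString(r)
		if err != nil {
			return nil, err
		}
		var valueType uint32
		if err := binary.Read(r, binary.LittleEndian, &valueType); err != nil {
			return nil, err
		}
		switch valueType {
		case typeUint32:
			var v uint32
			if err := binary.Read(r, binary.LittleEndian, &v); err != nil {
				return nil, err
			}
			meta[key] = uint64(v)
		case typeUint64:
			var v uint64
			if err := binary.Read(r, binary.LittleEndian, &v); err != nil {
				return nil, err
			}
			meta[key] = v
		case typeString:
			v, err := readString(r)
			if err != nil {
				return nil, err
			}
			meta[key] = v
		default:
			return nil, fmt.Errorf("value type %d not handled in this sketch", valueType)
		}
	}
	return meta, nil
}
```

Calling `parseMetadata` on the leading bytes of a model file would return a map keyed by names such as `general.architecture`. A fuller parser would also need to handle arrays and the remaining scalar types.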
Claude
5778f1f011
fix: guard response body lifecycle in SSE streaming client
Use sync.Once to ensure resp.Body is closed exactly once, preventing
TCP connection leaks when the iterator is never consumed and a
double close when it is iterated twice. Also add an Accept:
text/event-stream header to both SSE endpoints.

Co-Authored-By: Virgil <virgil@lethean.io>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-19 21:04:02 +00:00
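
The pattern this fix describes could look roughly like the sketch below. It assumes the client hands back an iterator together with a stop function, which the commit message does not confirm. `streamLines`, `countingBody`, and `newCountingBody` are hypothetical names, and `countingBody` merely stands in for `resp.Body` so the number of `Close` calls can be observed.

```go
package main

import (
	"bufio"
	"io"
	"strings"
	"sync"
)

// countingBody stands in for resp.Body and records how often Close is called.
type countingBody struct {
	io.Reader
	closes int
}

func (b *countingBody) Close() error {
	b.closes++
	return nil
}

// newCountingBody wraps a string as a closable body for demonstration.
func newCountingBody(s string) *countingBody {
	return &countingBody{Reader: strings.NewReader(s)}
}

// streamLines returns a push-style iterator over the lines of body plus a
// stop function. Both share one sync.Once, so body is closed exactly once
// whether the caller iterates once, iterates twice, or never iterates.
func streamLines(body io.ReadCloser) (lines func(yield func(string) bool), stop func()) {
	var once sync.Once
	stop = func() {
		once.Do(func() { body.Close() })
	}
	lines = func(yield func(string) bool) {
		defer stop() // closes on normal end and on early exit
		sc := bufio.NewScanner(body)
		for sc.Scan() {
			if !yield(sc.Text()) {
				return
			}
		}
	}
	return lines, stop
}
```

A caller that abandons the stream without iterating would call `stop()` itself, typically with `defer`. A second pass over `lines` yields nothing and leaves the close count at one.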
Claude
1bc8c9948b
test: completion streaming tests for llamacpp client
Add TestComplete_Streaming (multi-chunk SSE with three tokens) and
TestComplete_HTTPError (400 status propagation) to exercise the
Complete() method alongside the existing chat tests.

Co-Authored-By: Virgil <virgil@lethean.io>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-19 20:59:21 +00:00
Claude
def3167199
feat: llamacpp SSE streaming client for chat completions
Add ChatComplete() and Complete() methods to the llamacpp Client,
backed by a shared parseSSE() line parser. Types include ChatMessage,
ChatRequest, CompletionRequest and their chunked response structs.

Tests cover multi-chunk streaming, empty responses, HTTP errors, and
context cancellation — all using httptest SSE servers.

Co-Authored-By: Virgil <virgil@lethean.io>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-19 20:58:46 +00:00
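
To give a sense of what a shared SSE line parser involves, here is a minimal sketch. The commit names `parseSSE()`, but the signature below is an assumption, as is the `collect` helper. The `[DONE]` sentinel follows the OpenAI-style streaming convention and may not match what the actual client checks for.

```go
package main

import (
	"bufio"
	"io"
	"strings"
)

// parseSSE scans r line by line and passes the payload of each "data:" line
// to emit. It stops at the "[DONE]" sentinel, when emit returns false, or at
// end of input. Blank separator lines, comments, and other fields are skipped.
func parseSSE(r io.Reader, emit func(payload string) bool) error {
	sc := bufio.NewScanner(r)
	for sc.Scan() {
		line := sc.Text()
		if !strings.HasPrefix(line, "data:") {
			continue
		}
		payload := strings.TrimSpace(strings.TrimPrefix(line, "data:"))
		if payload == "[DONE]" {
			return nil
		}
		if !emit(payload) {
			return nil
		}
	}
	return sc.Err()
}

// collect is a small convenience wrapper that gathers every payload from a
// stream held in a string.
func collect(stream string) ([]string, error) {
	var out []string
	err := parseSSE(strings.NewReader(stream), func(p string) bool {
		out = append(out, p)
		return true
	})
	return out, err
}
```

In a chat or completion method, each payload would then be decoded with `encoding/json` into the matching chunk struct. Keeping the line handling in one function lets both endpoints share it.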
Claude
d5a92c7212
fix: health check includes response body in errors, adds 503 test
Co-Authored-By: Virgil <virgil@lethean.io>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-19 20:54:52 +00:00
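
A health check that surfaces the response body in its error might look like the following sketch. The `Client` fields, the error wording, the 512-byte read limit, and the `checkAgainst` helper are all assumptions made for illustration. Only the `/health` path and the idea of including the body come from the commit messages.

```go
package main

import (
	"context"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"strings"
)

// Client is a hypothetical minimal client for llama-server.
type Client struct {
	baseURL string
	http    *http.Client
}

// Health requests /health and returns nil on 200. On any other status it
// returns an error that includes a bounded prefix of the response body.
func (c *Client) Health(ctx context.Context) error {
	req, err := http.NewRequestWithContext(ctx, http.MethodGet, c.baseURL+"/health", nil)
	if err != nil {
		return err
	}
	resp, err := c.http.Do(req)
	if err != nil {
		return err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		// Cap the read so a misbehaving server cannot flood the error message.
		body, _ := io.ReadAll(io.LimitReader(resp.Body, 512))
		return fmt.Errorf("health: status %d: %s", resp.StatusCode, strings.TrimSpace(string(body)))
	}
	return nil
}

// checkAgainst starts a throwaway httptest server that answers with the given
// status and body, then runs Health against it.
func checkAgainst(status int, body string) error {
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.WriteHeader(status)
		io.WriteString(w, body)
	}))
	defer srv.Close()
	c := &Client{baseURL: srv.URL, http: srv.Client()}
	return c.Health(context.Background())
}
```

With a 503 response, the returned error carries both the status code and whatever text the server sent. That tends to make a failure such as a model still loading easier to diagnose.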
Claude
3c756771ec
feat: llamacpp health check client
Add internal/llamacpp package with Client type and Health() method.
Client communicates with llama-server via HTTP; Health checks the
/health endpoint and reports readiness. Foundation type for the
streaming methods (Tasks 2-3).

Co-Authored-By: Virgil <virgil@lethean.io>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-19 20:50:36 +00:00