Commit graph

7 commits

Author SHA1 Message Date
Snider
4669cc503d
refactor: replace fmt.Errorf/errors.New with coreerr.E()
Some checks failed
Security Scan / security (push) Successful in 8s
Test / Vet & Build (push) Failing after 23s
Co-Authored-By: Virgil <virgil@lethean.io>
2026-03-16 21:08:52 +00:00
Claude
72120bb200
feat: pass --parallel N to llama-server for concurrent inference slots
Co-Authored-By: Virgil <virgil@lethean.io>
2026-02-19 23:13:19 +00:00
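
A minimal sketch of how a slot count might be turned into llama-server arguments, as commit 72120bb200 describes. The function name buildServerArgs, the model path, and the port are hypothetical; the commit log does not show the real argument list, only that --parallel N is passed.

```go
package main

import (
	"fmt"
	"strconv"
)

// buildServerArgs is a hypothetical helper, not the repository's code.
// It assembles a llama-server argument list and appends --parallel N
// only when a positive slot count is requested.
func buildServerArgs(modelPath string, port, parallel int) []string {
	args := []string{"--model", modelPath, "--port", strconv.Itoa(port)}
	if parallel > 0 {
		args = append(args, "--parallel", strconv.Itoa(parallel))
	}
	return args
}

func main() {
	// Illustrative values only.
	fmt.Println(buildServerArgs("/models/example.gguf", 8080, 4))
}
```

Omitting the flag for non-positive values leaves the server on its own default rather than passing an invalid slot count.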
Claude
b7342ec819
fix: only retry startServer on process exit, not timeout
Distinguishes retryable failures (process exited, e.g. port conflict)
from non-retryable ones (60s timeout, e.g. stuck server). Avoids
3x timeout penalty.

Co-Authored-By: Virgil <virgil@lethean.io>
2026-02-19 21:43:06 +00:00
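
The distinction drawn in commit b7342ec819 can be sketched roughly as follows. The sentinel errors, the startWithRetry name, and the attempt count are assumptions for illustration; the real startServer is not shown in this log.

```go
package main

import (
	"errors"
	"fmt"
)

// Hypothetical sentinel errors standing in for the two failure classes
// named in the commit message.
var (
	errProcessExited = errors.New("llama-server process exited")
	errReadyTimeout  = errors.New("timed out waiting for llama-server readiness")
)

// startWithRetry calls start up to maxAttempts times. It retries only when
// the failure wraps errProcessExited (e.g. a port conflict); any other
// error, including a readiness timeout, is returned immediately so the
// caller does not wait out the timeout again.
// It returns the number of attempts made and the final error.
func startWithRetry(maxAttempts int, start func(attempt int) error) (int, error) {
	if maxAttempts < 1 {
		maxAttempts = 1
	}
	var err error
	for attempt := 1; attempt <= maxAttempts; attempt++ {
		err = start(attempt)
		if err == nil {
			return attempt, nil
		}
		if !errors.Is(err, errProcessExited) {
			return attempt, err // non-retryable
		}
	}
	return maxAttempts, err
}

func main() {
	// Simulated start: the process exits twice, then comes up.
	n, err := startWithRetry(3, func(attempt int) error {
		if attempt < 3 {
			return fmt.Errorf("attempt %d: %w", attempt, errProcessExited)
		}
		return nil
	})
	fmt.Println(n, err)

	// Simulated stuck server: the timeout is reported after one attempt.
	n, err = startWithRetry(3, func(int) error { return errReadyTimeout })
	fmt.Println(n, err)
}
```

Using errors.Is lets the start function wrap the sentinel with context (attempt number, port) without breaking the retry decision.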
Claude
c50a8e9e9b
feat: retry port selection in startServer on process failure
Co-Authored-By: Virgil <virgil@lethean.io>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-19 21:40:05 +00:00
Claude
c07f37afe9
fix: guard nil exitErr wrapping, document concurrency invariant
Co-Authored-By: Virgil <virgil@lethean.io>
2026-02-19 21:38:01 +00:00
Claude
2c4966e652
feat: detect server crash before Generate/Chat calls
Co-Authored-By: Virgil <virgil@lethean.io>
2026-02-19 21:34:46 +00:00
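
One way a crash check before Generate/Chat could look, sketched under assumptions: the serverSketch type, its fields, and the method names are hypothetical, and a real implementation would have a goroutine call markExited after waiting on the subprocess. The nil check mirrors the "guard nil exitErr wrapping" fix listed above.

```go
package main

import (
	"errors"
	"fmt"
)

// serverSketch is a hypothetical stand-in for the subprocess handle.
// Assumed invariant: exitErr is written once, before exited is closed,
// so reading it after a receive on the closed channel is race-free.
type serverSketch struct {
	exited  chan struct{}
	exitErr error
}

func newServerSketch() *serverSketch {
	return &serverSketch{exited: make(chan struct{})}
}

// markExited records the exit error and signals that the process is gone.
// It must be called at most once.
func (s *serverSketch) markExited(err error) {
	s.exitErr = err
	close(s.exited)
}

// checkAlive returns nil while the process is running. It never blocks:
// the default case is taken when the exited channel is still open.
func (s *serverSketch) checkAlive() error {
	select {
	case <-s.exited:
		if s.exitErr == nil {
			// Avoid wrapping a nil error with %w.
			return errors.New("llama-server exited unexpectedly")
		}
		return fmt.Errorf("llama-server exited: %w", s.exitErr)
	default:
		return nil
	}
}

func main() {
	s := newServerSketch()
	fmt.Println(s.checkAlive()) // still running
	s.markExited(errors.New("signal: killed"))
	fmt.Println(s.checkAlive()) // reports the crash
}
```

A request method would call checkAlive first and return its error instead of sending an HTTP request to a dead server.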
Claude
9aa7f624ba
feat: server lifecycle and helpers for llama-server subprocess
Adds server.go with the process lifecycle layer that manages spawning
llama-server, waiting for readiness, and graceful shutdown. Includes
three helper functions (findLlamaServer, freePort, serverEnv) and the
full startServer/waitReady/stop lifecycle. The serverEnv function
filters HIP_VISIBLE_DEVICES to mask the Ryzen 9 iGPU, which
crashes llama-server if it is not excluded.

Co-Authored-By: Virgil <virgil@lethean.io>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-19 21:08:07 +00:00
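
The environment filtering described in commit 9aa7f624ba might look something like the sketch below. The function name serverEnvSketch and the device index "0" are assumptions; the log does not say which value the real serverEnv sets, only that the iGPU must be masked.

```go
package main

import (
	"fmt"
	"strings"
)

// serverEnvSketch is a hypothetical version of the serverEnv helper.
// It copies base, drops any inherited HIP_VISIBLE_DEVICES entry, and
// appends a single entry exposing only the devices named in visible.
func serverEnvSketch(base []string, visible string) []string {
	const key = "HIP_VISIBLE_DEVICES="
	out := make([]string, 0, len(base)+1)
	for _, kv := range base {
		if strings.HasPrefix(kv, key) {
			continue // discard the inherited value
		}
		out = append(out, kv)
	}
	return append(out, key+visible)
}

func main() {
	// Assumption: the discrete GPU is device 0 and the iGPU is device 1.
	env := serverEnvSketch([]string{"PATH=/usr/bin", "HIP_VISIBLE_DEVICES=0,1"}, "0")
	fmt.Println(env)
}
```

In a real caller the base slice would typically be os.Environ() and the result would be assigned to the Env field of the exec.Cmd that launches llama-server.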