| docs/plans | docs: Phase 4 performance implementation plan | 2026-02-19 23:11:30 +00:00 |
| internal | feat: GGUF metadata parser for model discovery | 2026-02-19 22:20:02 +00:00 |
| backend.go | feat: use GGUF metadata for model type and context window auto-detection | 2026-02-19 22:23:07 +00:00 |
| CLAUDE.md | docs: Phase 0 complete — environment validated, llama-server built | 2026-02-19 19:57:14 +00:00 |
| discover.go | feat: model discovery scanning directories for GGUF files | 2026-02-19 22:21:48 +00:00 |
| discover_test.go | feat: model discovery scanning directories for GGUF files | 2026-02-19 22:21:48 +00:00 |
| FINDINGS.md | docs: Phase 3 complete — GGUF metadata, discovery, auto context | 2026-02-19 22:24:52 +00:00 |
| go.mod | feat: llamacpp health check client | 2026-02-19 20:50:36 +00:00 |
| go.sum | feat: llamacpp health check client | 2026-02-19 20:50:36 +00:00 |
| model.go | test: graceful shutdown and concurrent request integration tests | 2026-02-19 21:50:47 +00:00 |
| README.md | Initial commit | 2026-02-19 19:35:55 +00:00 |
| register_rocm.go | feat: scaffold go-rocm AMD GPU inference package | 2026-02-19 19:39:40 +00:00 |
| rocm.go | feat: model discovery scanning directories for GGUF files | 2026-02-19 22:21:48 +00:00 |
| rocm_integration_test.go | docs: Phase 3 complete — GGUF metadata, discovery, auto context | 2026-02-19 22:24:52 +00:00 |
| rocm_stub.go | feat: VRAM monitoring via sysfs with dGPU auto-detection | 2026-02-19 21:45:02 +00:00 |
| server.go | fix: only retry startServer on process exit, not timeout | 2026-02-19 21:43:06 +00:00 |
| server_test.go | feat: use GGUF metadata for model type and context window auto-detection | 2026-02-19 22:23:07 +00:00 |
| TODO.md | docs: Phase 3 complete — GGUF metadata, discovery, auto context | 2026-02-19 22:24:52 +00:00 |
| vram.go | fix: clamp VRAM Free to prevent uint64 underflow | 2026-02-19 21:48:19 +00:00 |
| vram_test.go | feat: VRAM monitoring via sysfs with dGPU auto-detection | 2026-02-19 21:45:02 +00:00 |