go-inference

Author	SHA1	Message	Date
Snider	1576f744d1	docs: add README with quick start and docs links Co-Authored-By: Virgil <virgil@lethean.io>	2026-02-20 15:11:19 +00:00
Snider	aaeeee4898	Merge remote-tracking branch 'origin/main' # Conflicts: # TODO.md	2026-02-20 15:03:05 +00:00
Snider	74b3e7a53d	docs: graduate TODO/FINDINGS into production documentation Replace internal task tracking (TODO.md, FINDINGS.md) with structured documentation in docs/. Trim CLAUDE.md to agent instructions only. Co-Authored-By: Virgil <virgil@lethean.io>	2026-02-20 15:01:55 +00:00
Claude	35da94a138	docs: mark Phase 1 foundation tests as complete Co-Authored-By: Charon <developers@lethean.io>	2026-02-20 11:46:21 +00:00
Claude	e0ec07e667	test(inference): complete Phase 1 foundation tests. Add comprehensive tests for all three Phase 1 items: (1) Option application: DefaultGenerateConfig idempotency, field isolation (WithMaxTokens leaves others at defaults), bad-input acceptance (negative temperature, negative TopK), empty variadic StopTokens, WithLogits default-is-false, partial-options preserve defaults, last-wins overrides for all GenerateOption and LoadOption types. (2) Backend registry: concurrent read/write safety (70 goroutines with -race), overwrite-keeps-count, capturingBackend verifies LoadModel forwards all options to both explicit and default backends, Get after overwrite returns latest, List returns independent slices. (3) Default() platform preference: registration order is irrelevant (metal wins regardless), all-preferred-unavailable falls back to custom, multiple custom backends finds the available one, empty path forwarding. 85 tests, 100% statement coverage, -race clean. Co-Authored-By: Charon <developers@lethean.io>	2026-02-20 11:45:59 +00:00
Snider	c91e30599c	docs: mark Phase 2 integration complete - all 3 backends migrated. go-mlx: register_metal.go implements inference.Backend (metalBackend + metalAdapter); go-rocm: register_rocm.go implements inference.Backend (rocmBackend, 5,794 LOC); go-ml: adapter.go bridges inference.TextModel → ml.Backend (118 LOC, 13 tests). Phase 3 (extended interfaces) deliberately deferred per design principles. Co-Authored-By: Virgil <virgil@lethean.io>	2026-02-20 09:06:19 +00:00
Snider	85f587ade3	docs: mark Phase 1 foundation tests complete (Charon `d76448d`) Co-Authored-By: Virgil <virgil@lethean.io>	2026-02-20 07:32:27 +00:00
Claude	d76448d4a9	test(inference): add comprehensive tests for all exported API. Cover options (GenerateConfig defaults, all With* options, ApplyGenerateOpts/ApplyLoadOpts), backend registry (Register, Get, List, Default priority order metal > rocm > llama_cpp), LoadModel routing (explicit/auto backend, error paths), and Discover (model directory scanning, quantisation, edge cases). 69 tests, 100% statement coverage, race-clean. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-20 02:06:49 +00:00
Snider	15ee86ec62	fix: add json struct tags to Message for API serialization Message is used as a type alias by go-ml. Without json tags, HTTP backends that serialize messages to JSON (e.g. OpenAI API) would produce incorrect field names. Co-Authored-By: Virgil <virgil@lethean.io>	2026-02-20 02:05:46 +00:00
Snider	c61ec9f5c7	docs: expand package doc with workflow examples Cover chat, classify, batch generate, options, and discovery. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-19 23:44:05 +00:00
Snider	884225d8a9	feat: add Discover() for scanning model directories Scans a base directory for model directories (subdirs with config.json + *.safetensors). Returns path, architecture, quantisation info, and file count for each discovered model. Useful for embedding in applications that need to find available models at runtime. Co-Authored-By: Virgil <virgil@lethean.io> Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-19 23:37:30 +00:00
Snider	28f444ced4	feat: add ModelInfo type and Info() to TextModel Expose model metadata: architecture, vocab size, layer count, hidden dimension, quantisation bits and group size. Co-Authored-By: Virgil <virgil@lethean.io> Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-19 23:36:16 +00:00
Snider	df176765e7	feat: add GenerateMetrics type and Metrics() to TextModel Expose prefill/decode timing, token counts, throughput, and GPU memory stats from the last inference operation. Same retrieval pattern as Err(). Co-Authored-By: Virgil <virgil@lethean.io> Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-19 23:34:31 +00:00
Snider	2517b774b8	feat: add batch inference API (Classify, BatchGenerate) Add ClassifyResult, BatchResult types and Classify/BatchGenerate methods to TextModel for batched prefill-only and autoregressive inference. Add WithLogits option for returning raw vocab logits. Co-Authored-By: Virgil <virgil@lethean.io> Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-19 23:29:28 +00:00
Claude	3719734f56	feat: add ParallelSlots to LoadConfig for concurrent inference Co-Authored-By: Virgil <virgil@lethean.io>	2026-02-19 23:12:29 +00:00
Snider	07cd917259	feat: define shared TextModel, Backend, Token, Message interfaces. Zero-dependency interface package for the Core inference ecosystem. Backends (go-mlx, go-rocm) implement these interfaces. Consumers (go-ml, go-ai, go-i18n) import them. Includes: (1) TextModel: Generate, Chat, Err, Close (with context.Context); (2) Backend: named engine registry with platform preference; (3) Functional options: WithMaxTokens, WithTemperature, WithTopK, etc.; (4) LoadModel: auto-selects best available backend. Co-Authored-By: Virgil <virgil@lethean.io>	2026-02-19 19:37:27 +00:00
Virgil	fca0ed8e16	Initial commit	2026-02-19 19:35:54 +00:00

17 commits
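
The commit messages above describe two mechanisms without showing code: functional options where the last option wins, and a backend registry whose Default() prefers metal > rocm > llama_cpp regardless of registration order and falls back to any other available backend. The following is a minimal, self-contained sketch of those two ideas, not the package's actual source. The names WithMaxTokens, WithTemperature, ApplyGenerateOpts, GenerateConfig, Register and Default appear in the log, but their signatures here are assumptions. The default values, the reduced Backend interface (Name, Available), stubBackend, Registry, NewRegistry and the alphabetical fallback order are hypothetical and exist only for illustration.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// GenerateConfig is a hypothetical stand-in; the real struct may have more fields.
type GenerateConfig struct {
	MaxTokens   int
	Temperature float64
}

// GenerateOption mutates a GenerateConfig (functional option pattern).
type GenerateOption func(*GenerateConfig)

func WithMaxTokens(n int) GenerateOption {
	return func(c *GenerateConfig) { c.MaxTokens = n }
}

func WithTemperature(t float64) GenerateOption {
	return func(c *GenerateConfig) { c.Temperature = t }
}

// ApplyGenerateOpts starts from placeholder defaults (assumed values) and
// applies options in order, so a later option overrides an earlier one.
func ApplyGenerateOpts(opts ...GenerateOption) GenerateConfig {
	cfg := GenerateConfig{MaxTokens: 256, Temperature: 1.0}
	for _, opt := range opts {
		opt(&cfg)
	}
	return cfg
}

// Backend is a reduced, assumed interface: a name plus an availability check.
type Backend interface {
	Name() string
	Available() bool
}

// stubBackend is a test double used only in this sketch.
type stubBackend struct {
	name      string
	available bool
}

func (b stubBackend) Name() string    { return b.name }
func (b stubBackend) Available() bool { return b.available }

// Registry is a concurrency-safe map of backends keyed by name.
type Registry struct {
	mu       sync.RWMutex
	backends map[string]Backend
}

func NewRegistry() *Registry {
	return &Registry{backends: make(map[string]Backend)}
}

// Register adds a backend, replacing any existing entry with the same name.
func (r *Registry) Register(b Backend) {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.backends[b.Name()] = b
}

// Default returns the first available backend in the order metal, rocm,
// llama_cpp, then any other available backend.
func (r *Registry) Default() (Backend, bool) {
	r.mu.RLock()
	defer r.mu.RUnlock()
	for _, name := range []string{"metal", "rocm", "llama_cpp"} {
		if b, ok := r.backends[name]; ok && b.Available() {
			return b, true
		}
	}
	names := make([]string, 0, len(r.backends))
	for name := range r.backends {
		names = append(names, name)
	}
	sort.Strings(names) // deterministic fallback order (an assumption of this sketch)
	for _, name := range names {
		if b := r.backends[name]; b.Available() {
			return b, true
		}
	}
	return nil, false
}

func main() {
	cfg := ApplyGenerateOpts(WithMaxTokens(64), WithMaxTokens(128))
	fmt.Println(cfg.MaxTokens, cfg.Temperature)

	reg := NewRegistry()
	reg.Register(stubBackend{name: "custom", available: true})
	reg.Register(stubBackend{name: "rocm", available: false})
	reg.Register(stubBackend{name: "metal", available: true})
	if b, ok := reg.Default(); ok {
		fmt.Println(b.Name())
	}
}
```

In this sketch the preference list is checked before any custom backend, which is why registering "custom" first does not change the result. A read-write mutex guards the map so that concurrent Register and Default calls stay race-free, in the spirit of the "-race clean" tests mentioned in the log.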