Commit graph

26 commits

Author SHA1 Message Date
Snider
66d9d63017 docs: document AttentionSnapshot and AttentionInspector in architecture guide
All checks were successful
Security Scan / security (push) Successful in 7s
Test / test (push) Successful in 47s
Co-Authored-By: Virgil <virgil@lethean.io>
2026-02-23 12:34:30 +00:00
Snider
63c3d6aa53 chore: refresh go.sum for missing transitive deps
All checks were successful
Security Scan / security (push) Successful in 10s
Test / test (push) Successful in 1m55s
Co-Authored-By: Virgil <virgil@lethean.io>
2026-02-23 06:58:20 +00:00
Snider
3293253177 chore: add Go repo norms (badges, contributing, lint, taskfile, editorconfig)
Some checks failed
Security Scan / security (push) Successful in 10s
Test / test (push) Failing after 34s
Co-Authored-By: Virgil <virgil@lethean.io>
2026-02-23 06:45:40 +00:00
Snider
6333fa7b6d feat: modernise to Go 1.26 — iterators, slices.Sorted, maps.Keys
Some checks failed
Security Scan / security (push) Successful in 10s
Test / test (push) Failing after 37s
- List() now returns deterministic alphabetical order via slices.Sorted(maps.Keys())
- Add All() iter.Seq2[string, Backend] for iterator-based registry access
- Use slices.Insert for prepend in Discover
- Use maps.Values in Default() fallback
- Remove redundant sort.Strings in tests (List() is now sorted)

Co-Authored-By: Gemini <noreply@google.com>
Co-Authored-By: Virgil <virgil@lethean.io>
2026-02-23 05:07:50 +00:00
Claude
b65283bae7 ci: add Forgejo Actions test and security scan workflows
Some checks failed
Security Scan / security (push) Successful in 10s
Test / test (push) Failing after 25s
Uses reusable workflows from core/go-devops for Go testing
(with race detector and coverage) and security scanning
(govulncheck, gitleaks, trivy).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-23 03:28:52 +00:00
Snider
0f7263f731 feat: add AttentionInspector optional interface for Q/K Bone Orientation
Co-Authored-By: Virgil <virgil@lethean.io>
2026-02-23 00:36:45 +00:00
Claude
bd09ec4997 chore: sync workspace dependency versions
Run go work sync to align dependency versions across workspace.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-22 21:41:56 +00:00
Snider
82fd78ef90 refactor: apply go fix modernizers for Go 1.26
Automated fixes: interface{} → any, range-over-int, t.Context(),
wg.Go(), strings.SplitSeq, strings.Builder, slices.Contains,
maps helpers, min/max builtins.

Co-Authored-By: Virgil <virgil@lethean.io>
2026-02-22 21:00:16 +00:00
Snider
33414f608d chore: bump go directive to 1.26.0
Co-Authored-By: Virgil <virgil@lethean.io>
2026-02-22 20:33:48 +00:00
Snider
1576f744d1 docs: add README with quick start and docs links
Co-Authored-By: Virgil <virgil@lethean.io>
2026-02-20 15:11:19 +00:00
Snider
aaeeee4898 Merge remote-tracking branch 'origin/main'
# Conflicts:
#	TODO.md
2026-02-20 15:03:05 +00:00
Snider
74b3e7a53d docs: graduate TODO/FINDINGS into production documentation
Replace internal task tracking (TODO.md, FINDINGS.md) with structured
documentation in docs/. Trim CLAUDE.md to agent instructions only.

Co-Authored-By: Virgil <virgil@lethean.io>
2026-02-20 15:01:55 +00:00
Claude
35da94a138 docs: mark Phase 1 foundation tests as complete
Co-Authored-By: Charon <developers@lethean.io>
2026-02-20 11:46:21 +00:00
Claude
e0ec07e667 test(inference): complete Phase 1 foundation tests
Add comprehensive tests for all three Phase 1 items:

- Option application: DefaultGenerateConfig idempotency, field isolation
  (WithMaxTokens leaves others at defaults), bad-input acceptance
  (negative temperature, negative TopK), empty variadic StopTokens,
  WithLogits default-is-false, partial-options preserve defaults,
  last-wins overrides for all GenerateOption and LoadOption types.

- Backend registry: concurrent read/write safety (70 goroutines with
  -race), overwrite-keeps-count, capturingBackend verifies LoadModel
  forwards all options to both explicit and default backends, Get
  after overwrite returns latest, List returns independent slices.

- Default() platform preference: registration order is irrelevant
  (metal wins regardless), all-preferred-unavailable falls back to
  custom, multiple custom backends finds the available one, empty
  path forwarding.

85 tests, 100% statement coverage, -race clean.

Co-Authored-By: Charon <developers@lethean.io>
2026-02-20 11:45:59 +00:00
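
The option behaviour that e0ec07e667 tests (field isolation, partial options preserving defaults, last-wins overrides) follows from the functional options pattern. A minimal sketch is shown below. The names `GenerateConfig`, `GenerateOption`, `DefaultGenerateConfig`, `WithMaxTokens`, `WithTemperature`, and `ApplyGenerateOpts` appear in the log; the struct fields, the default values, and the exact signatures are assumptions.

```go
package main

import "fmt"

// GenerateConfig fields are assumed; the real struct is not shown in the log.
type GenerateConfig struct {
	MaxTokens   int
	Temperature float32
	TopK        int
}

// GenerateOption mutates a config in place.
type GenerateOption func(*GenerateConfig)

// DefaultGenerateConfig returns placeholder defaults (values are hypothetical).
func DefaultGenerateConfig() GenerateConfig {
	return GenerateConfig{MaxTokens: 256, Temperature: 0.7, TopK: 40}
}

func WithMaxTokens(n int) GenerateOption {
	return func(c *GenerateConfig) { c.MaxTokens = n }
}

func WithTemperature(t float32) GenerateOption {
	return func(c *GenerateConfig) { c.Temperature = t }
}

// ApplyGenerateOpts starts from the defaults and applies options in order,
// so a later option overrides an earlier one for the same field.
func ApplyGenerateOpts(opts []GenerateOption) GenerateConfig {
	cfg := DefaultGenerateConfig()
	for _, opt := range opts {
		opt(&cfg)
	}
	return cfg
}

func main() {
	cfg := ApplyGenerateOpts([]GenerateOption{WithMaxTokens(10), WithMaxTokens(20)})
	fmt.Println(cfg.MaxTokens, cfg.TopK) // 20 40
}
```

Because each option touches only its own field, "bad" inputs such as a negative temperature pass straight through; any validation would have to happen in the backend that consumes the config.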
Snider
c91e30599c docs: mark Phase 2 integration complete — all 3 backends migrated
go-mlx: register_metal.go implements inference.Backend (metalBackend + metalAdapter)
go-rocm: register_rocm.go implements inference.Backend (rocmBackend, 5,794 LOC)
go-ml: adapter.go bridges inference.TextModel → ml.Backend (118 LOC, 13 tests)

Phase 3 (extended interfaces) deliberately deferred per design principles.

Co-Authored-By: Virgil <virgil@lethean.io>
2026-02-20 09:06:19 +00:00
Snider
85f587ade3 docs: mark Phase 1 foundation tests complete (Charon d76448d)
Co-Authored-By: Virgil <virgil@lethean.io>
2026-02-20 07:32:27 +00:00
Claude
d76448d4a9 test(inference): add comprehensive tests for all exported API
Cover options (GenerateConfig defaults, all With* options, ApplyGenerateOpts/
ApplyLoadOpts), backend registry (Register, Get, List, Default priority order
metal > rocm > llama_cpp), LoadModel routing (explicit/auto backend, error
paths), and Discover (model directory scanning, quantisation, edge cases).

69 tests, 100% statement coverage, race-clean.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-20 02:06:49 +00:00
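
The `Default` priority order named in d76448d4a9 (metal > rocm > llama_cpp), combined with the fallback to an available custom backend described in e0ec07e667, might be implemented along these lines. The `Available()` method, the `fake` type, the explicit registry parameter, and the error text are assumptions; only the preference order and the fallback behaviour come from the commit messages.

```go
package main

import (
	"errors"
	"fmt"
	"maps"
	"slices"
)

// Backend is a hypothetical shape; Available is an assumed method.
type Backend interface {
	Name() string
	Available() bool
}

// fake is an illustrative backend with a fixed availability flag.
type fake struct {
	name string
	ok   bool
}

func (f fake) Name() string    { return f.name }
func (f fake) Available() bool { return f.ok }

// preferred is the priority order stated in the commit message.
var preferred = []string{"metal", "rocm", "llama_cpp"}

// Default picks the first available preferred backend, then falls back
// to any other available backend (checked in sorted order for determinism).
func Default(reg map[string]Backend) (Backend, error) {
	for _, name := range preferred {
		if b, ok := reg[name]; ok && b.Available() {
			return b, nil
		}
	}
	for _, name := range slices.Sorted(maps.Keys(reg)) {
		if b := reg[name]; b.Available() {
			return b, nil
		}
	}
	return nil, errors.New("no available backend")
}

func main() {
	reg := map[string]Backend{
		"rocm":  fake{name: "rocm", ok: true},
		"metal": fake{name: "metal", ok: true},
	}
	b, err := Default(reg)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(b.Name()) // metal
}
```

Looking names up in a fixed preference slice, rather than iterating the map, is what makes registration order irrelevant.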
Snider
15ee86ec62 fix: add json struct tags to Message for API serialization
Message is used as a type alias by go-ml. Without json tags,
HTTP backends that serialize messages to JSON (e.g. OpenAI API)
would produce incorrect field names.

Co-Authored-By: Virgil <virgil@lethean.io>
2026-02-20 02:05:46 +00:00
Snider
c61ec9f5c7 docs: expand package doc with workflow examples
Cover chat, classify, batch generate, options, and discovery.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-19 23:44:05 +00:00
Snider
884225d8a9 feat: add Discover() for scanning model directories
Scans a base directory for model directories (subdirs with config.json +
*.safetensors). Returns path, architecture, quantisation info, and file
count for each discovered model. Useful for embedding in applications
that need to find available models at runtime.

Co-Authored-By: Virgil <virgil@lethean.io>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-19 23:37:30 +00:00
Snider
28f444ced4 feat: add ModelInfo type and Info() to TextModel
Expose model metadata: architecture, vocab size, layer count, hidden
dimension, quantisation bits and group size.

Co-Authored-By: Virgil <virgil@lethean.io>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-19 23:36:16 +00:00
Snider
df176765e7 feat: add GenerateMetrics type and Metrics() to TextModel
Expose prefill/decode timing, token counts, throughput, and GPU memory
stats from the last inference operation. Same retrieval pattern as Err().

Co-Authored-By: Virgil <virgil@lethean.io>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-19 23:34:31 +00:00
Snider
2517b774b8 feat: add batch inference API (Classify, BatchGenerate)
Add ClassifyResult, BatchResult types and Classify/BatchGenerate methods
to TextModel for batched prefill-only and autoregressive inference.
Add WithLogits option for returning raw vocab logits.

Co-Authored-By: Virgil <virgil@lethean.io>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-19 23:29:28 +00:00
Claude
3719734f56 feat: add ParallelSlots to LoadConfig for concurrent inference
Co-Authored-By: Virgil <virgil@lethean.io>
2026-02-19 23:12:29 +00:00
Snider
07cd917259 feat: define shared TextModel, Backend, Token, Message interfaces
Zero-dependency interface package for the Core inference ecosystem.
Backends (go-mlx, go-rocm) implement these interfaces.
Consumers (go-ml, go-ai, go-i18n) import them.

Includes:
- TextModel: Generate, Chat, Err, Close (with context.Context)
- Backend: Named engine registry with platform preference
- Functional options: WithMaxTokens, WithTemperature, WithTopK, etc.
- LoadModel: Auto-selects best available backend

Co-Authored-By: Virgil <virgil@lethean.io>
2026-02-19 19:37:27 +00:00
fca0ed8e16 Initial commit
2026-02-19 19:35:54 +00:00