Commit graph

17 commits

Author SHA1 Message Date
Snider
4d73fa250c feat(ml): Phase 2 — migrate ml_backends to go-inference registry
- Make go-inference a direct dependency (was indirect)
- Rewrite mlBackends() to use inference.List()/Get()/Default()
  instead of ml.Service.Backends()/Backend()/DefaultBackend()
- Add documentation comments clarifying generation flow
- mlGenerate/mlScore/mlProbe unchanged (work via go-ml.Service)

Co-Authored-By: Virgil <virgil@lethean.io>
2026-02-20 06:01:30 +00:00
Snider
4665bea75e fix: post-split cleanup — remove test script, fix tests, update docs
- Delete test-mlx.go (standalone test script, not library code)
- Fix TestSandboxing_Symlinks_Followed to match security behaviour
  (renamed to TestSandboxing_Symlinks_Blocked — asserts sandbox
  correctly blocks symlinks escaping the workspace root)
- Fix TestNewTCPTransport_Warning by adding missing security warning
  to NewTCPTransport when binding to 0.0.0.0 (all interfaces)
- Update CLAUDE.md dependency table (go-mlx, duckdb, parquet, ollama,
  qdrant now indirect via go-ml/go-rag)
- Tidy go.mod (direct vs indirect aligned with actual imports)
- Verify go build, go vet, go test all pass

Co-Authored-By: Virgil <virgil@lethean.io>
2026-02-20 01:18:27 +00:00
Snider
a36466a870 docs: rewrite TODO.md with concrete file-level tasks
Replace vague Phase 1-4 with specific actionable items:
remove test-mlx.go, go-inference migration details,
MCP transport e2e testing, IDE bridge hardening.

Co-Authored-By: Virgil <virgil@lethean.io>
2026-02-20 00:28:42 +00:00
Virgil
e777d84a46 docs: add CLAUDE.md, TODO.md, FINDINGS.md for fleet delegation
Co-Authored-By: Virgil <virgil@lethean.io>
2026-02-19 21:34:55 +00:00
Snider
0af152eaef refactor: extract ml/ to standalone core/go-ml module
go-ai is now a meta package: ai/ facade + mcp/ hub that imports
go-ml, go-rag, go-mlx, and go-agentic as separate modules.

Co-Authored-By: Virgil <virgil@lethean.io>
2026-02-19 18:32:42 +00:00
Snider
2886ffaa4c refactor: extract rag/ to standalone core/go-rag module
Update ai/rag.go and mcp/tools_rag.go imports to reference new module.

Co-Authored-By: Virgil <virgil@lethean.io>
2026-02-19 18:30:37 +00:00
Snider
f99ca10c6c refactor: extract agentic/ to standalone core/go-agentic module
Now lives at forge.lthn.ai/core/go-agentic.

Co-Authored-By: Virgil <virgil@lethean.io>
2026-02-19 18:28:48 +00:00
Snider
34d0f9ce41 refactor: extract mlx/ to standalone core/go-mlx module
Remove mlx/ directory (now lives at forge.lthn.ai/core/go-mlx).
Update ml/backend_mlx.go imports to reference the new module.
Add replace directive for local development.

Co-Authored-By: Virgil <virgil@lethean.io>
2026-02-19 18:24:32 +00:00
Snider
906a535899 chore: update module paths and add gitignore
Sync module imports across agentic, mcp, ml, and rag packages.
Add .gitignore for mlx build artifacts.

Co-Authored-By: Virgil <virgil@lethean.io>
2026-02-17 19:19:38 +00:00
Snider
98749c66f2 fix(mlx): add DecodeToken for correct streaming word boundaries
The Decode method strips the SentencePiece leading space from every
token, which loses word boundaries during streaming. DecodeToken
preserves the space (it represents the word boundary) and only the
first token of each generation has its leading space stripped.

Fixes Gemma3 space prefix appearing in chat UI output.

Co-Authored-By: Virgil <virgil@lethean.io>
2026-02-17 19:18:27 +00:00
Snider
2d870385f9 feat(mlx): LoRA injection into models + masked cross-entropy loss
Add LoRA field to Linear for transparent adapter injection via model's
Forward() path. ApplyLoRA() on Qwen3/Gemma3 wraps target projections.
Deterministic param ordering for adapter save/load consistency.
MaskedCrossEntropyLoss for training on assistant tokens only.

Co-Authored-By: Virgil <virgil@lethean.io>
2026-02-17 17:37:44 +00:00
Snider
0eaf3d5a17 feat(mlx): add LoRA adapter layers and AdamW optimizer
LoRA: low-rank adaptation with trainable A/B matrices, Kaiming normal
init, safetensors save/load. AdamW: decoupled weight decay optimizer
with positional moment tracking for gradient-replaced params.

14 tests passing including end-to-end LoRA+AdamW training loop.

Co-Authored-By: Virgil <virgil@lethean.io>
2026-02-17 17:25:42 +00:00
Snider
e9973aef3c feat(mlx): add autograd — VJP, JVP, ValueAndGrad, loss functions
Native Go bindings for MLX-C gradient computation on Apple Silicon.
Foundation for LoRA training without Python.

- VJP (reverse-mode autodiff) for backward pass
- JVP (forward-mode autodiff) for directional derivatives
- ValueAndGrad for combined loss + gradient computation
- Checkpoint for memory-efficient gradient recomputation
- CrossEntropyLoss (numerically stable via LogSumExp)
- MSELoss, Log, SumAll, MeanAll, OnesLike helpers
- TakeAlongAxis and LogSumExp ops
- Fix closure callback null vector bug (affects compile.go too)
- Fix Float() returning 0 for float32 arrays

14 tests passing on Metal GPU.

Co-Authored-By: Virgil <virgil@lethean.io>
2026-02-17 17:18:47 +00:00
Snider
92c6282d50 refactor(mlx): drop mlx build tag, auto-enable on darwin/arm64
Remove the manual -tags mlx requirement. MLX is now automatically
compiled on darwin/arm64 via build constraints. Stubs remain for
other platforms. No functional change.

Co-Authored-By: Virgil <virgil@lethean.io>
2026-02-17 16:57:41 +00:00
d0cbd5065e Merge pull request 'test: validate MLX inference and scoring pipeline on M3 Ultra' (#4) from test/mlx-validation-m3-ultra into main
Reviewed-on: #4
2026-02-16 17:47:16 +00:00
3916633f4d test: validate MLX inference and scoring pipeline on M3 Ultra
Fixes #2

- Run complete test suite: all 84 tests passing (100%)
- Verify Metal 4 GPU support and hardware capabilities
- Test scoring pipeline (heuristic + judge + engine)
- Confirm GGUF model directory with 9 models (40.43 GB)
- Document MLX backend build requirements
- Update module imports from forge.lthn.ai/core/go to forge.lthn.ai/core/cli
- Add comprehensive TEST-RESULTS.md with findings

Platform: M3 Ultra (60 GPU cores, 96GB RAM, Metal 4)
Results: All tests passing, scoring pipeline operational, MLX ready to build

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-16 17:24:36 +00:00
Claude
e84d6ad3c9
feat: extract AI/ML packages from core/go
LEM scoring pipeline, native MLX Metal bindings, Claude SDK wrapper,
RAG with Qdrant/Ollama, unified AI facade, and MCP protocol server.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-16 15:25:55 +00:00