Commit graph

8 commits

Snider
98749c66f2 fix(mlx): add DecodeToken for correct streaming word boundaries
The Decode method strips the SentencePiece leading space from every
token, which loses word boundaries during streaming. DecodeToken
preserves the space (it represents the word boundary) and only the
first token of each generation has its leading space stripped.

Fixes Gemma3 space prefix appearing in chat UI output.

Co-Authored-By: Virgil <virgil@lethean.io>
2026-02-17 19:18:27 +00:00
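A minimal sketch of the word-boundary handling described above, in plain Go with a hypothetical Tokenizer type (the actual MLX binding API may differ): the SentencePiece ▁ marker becomes a space and is kept for every token except the first of a generation.

```go
// Sketch only: how a streaming DecodeToken can preserve SentencePiece
// word boundaries. Type and field names are illustrative, not the
// actual MLX tokenizer API.
package main

import (
	"fmt"
	"strings"
)

type Tokenizer struct {
	vocab []string // token id -> piece, e.g. "\u2581world"
}

// DecodeToken turns one token id into streamable text. The SentencePiece
// "\u2581" marker becomes a space and is kept, because that space is the
// word boundary; only the very first token of a generation drops it so
// the output does not begin with a stray space.
func (t *Tokenizer) DecodeToken(id int, first bool) string {
	text := strings.ReplaceAll(t.vocab[id], "\u2581", " ")
	if first {
		text = strings.TrimPrefix(text, " ")
	}
	return text
}

func main() {
	tok := &Tokenizer{vocab: []string{"\u2581Hello", "\u2581world", "!"}}
	var out strings.Builder
	for i, id := range []int{0, 1, 2} {
		out.WriteString(tok.DecodeToken(id, i == 0))
	}
	fmt.Println(out.String()) // "Hello world!"
}
```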
Snider
2d870385f9 feat(mlx): LoRA injection into models + masked cross-entropy loss
Add a LoRA field to Linear for transparent adapter injection via the model's
Forward() path. ApplyLoRA() on Qwen3/Gemma3 wraps the target projections.
Deterministic param ordering for adapter save/load consistency.
MaskedCrossEntropyLoss for training on assistant tokens only.

Co-Authored-By: Virgil <virgil@lethean.io>
2026-02-17 17:37:44 +00:00
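A rough sketch of the injection pattern this commit describes, using plain float64 matrices and hypothetical type names in place of the real mlx arrays: the Linear layer gains an optional LoRA field, so any model whose Forward() already goes through Linear picks up the adapter automatically.

```go
// Sketch only (hypothetical names, plain float64 math in place of MLX arrays):
// a Linear layer with an optional LoRA field, so adapters ride along the
// model's existing Forward() path without changing call sites.
package main

import "fmt"

type Matrix [][]float64

func matmul(a, b Matrix) Matrix {
	out := make(Matrix, len(a))
	for i := range a {
		out[i] = make([]float64, len(b[0]))
		for k := range b {
			for j := range b[0] {
				out[i][j] += a[i][k] * b[k][j]
			}
		}
	}
	return out
}

type LoRA struct {
	A, B  Matrix  // low-rank factors: A is (in, r), B is (r, out)
	Scale float64 // alpha / rank
}

type Linear struct {
	W    Matrix // (in, out) for simplicity in this sketch
	LoRA *LoRA  // nil until an adapter is injected
}

func (l *Linear) Forward(x Matrix) Matrix {
	y := matmul(x, l.W)
	if l.LoRA == nil {
		return y
	}
	// Low-rank update: y += scale * (x @ A) @ B
	delta := matmul(matmul(x, l.LoRA.A), l.LoRA.B)
	for i := range y {
		for j := range y[i] {
			y[i][j] += l.LoRA.Scale * delta[i][j]
		}
	}
	return y
}

func main() {
	lin := &Linear{W: Matrix{{1, 0}, {0, 1}}}
	lin.LoRA = &LoRA{A: Matrix{{1}, {1}}, B: Matrix{{0.5, 0.5}}, Scale: 1.0}
	fmt.Println(lin.Forward(Matrix{{2, 3}})) // [[4.5 5.5]]
}
```

Under this reading, an ApplyLoRA() helper would simply set the LoRA field on the chosen projection layers and register the A/B factors in a deterministic order for save/load.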
Snider
0eaf3d5a17 feat(mlx): add LoRA adapter layers and AdamW optimizer
LoRA: low-rank adaptation with trainable A/B matrices, Kaiming normal
init, safetensors save/load. AdamW: decoupled weight decay optimizer
with positional moment tracking for gradient-replaced params.

14 tests passing including end-to-end LoRA+AdamW training loop.

Co-Authored-By: Virgil <virgil@lethean.io>
2026-02-17 17:25:42 +00:00
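The AdamW update mentioned here is standard enough to write out; below is a compact sketch of a single decoupled-weight-decay step on a flat parameter slice (plain Go, illustrative names, not the package's actual optimizer API).

```go
// Sketch only: one AdamW step on a flat parameter slice. Decoupled weight
// decay means the decay term is applied to the weights directly, outside
// the Adam moment estimates.
package main

import (
	"fmt"
	"math"
)

type AdamW struct {
	LR, Beta1, Beta2, Eps, WeightDecay float64
	m, v                               []float64 // first/second moment estimates
	step                               int
}

func (o *AdamW) Update(params, grads []float64) {
	if o.m == nil {
		o.m = make([]float64, len(params))
		o.v = make([]float64, len(params))
	}
	o.step++
	for i, g := range grads {
		o.m[i] = o.Beta1*o.m[i] + (1-o.Beta1)*g
		o.v[i] = o.Beta2*o.v[i] + (1-o.Beta2)*g*g
		mHat := o.m[i] / (1 - math.Pow(o.Beta1, float64(o.step)))
		vHat := o.v[i] / (1 - math.Pow(o.Beta2, float64(o.step)))
		// Adam step plus decoupled weight decay on the raw parameter.
		params[i] -= o.LR * (mHat/(math.Sqrt(vHat)+o.Eps) + o.WeightDecay*params[i])
	}
}

func main() {
	opt := &AdamW{LR: 0.01, Beta1: 0.9, Beta2: 0.999, Eps: 1e-8, WeightDecay: 0.01}
	w := []float64{1.0, -0.5}
	g := []float64{0.2, -0.1}
	opt.Update(w, g)
	fmt.Println(w)
}
```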
Snider
e9973aef3c feat(mlx): add autograd — VJP, JVP, ValueAndGrad, loss functions
Native Go bindings for MLX-C gradient computation on Apple Silicon.
Foundation for LoRA training without Python.

- VJP (reverse-mode autodiff) for backward pass
- JVP (forward-mode autodiff) for directional derivatives
- ValueAndGrad for combined loss + gradient computation
- Checkpoint for memory-efficient gradient recomputation
- CrossEntropyLoss (numerically stable via LogSumExp)
- MSELoss, Log, SumAll, MeanAll, OnesLike helpers
- TakeAlongAxis and LogSumExp ops
- Fix closure callback null vector bug (affects compile.go too)
- Fix Float() returning 0 for float32 arrays

14 tests passing on Metal GPU.

Co-Authored-By: Virgil <virgil@lethean.io>
2026-02-17 17:18:47 +00:00
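The "numerically stable via LogSumExp" note refers to computing -log softmax(logits)[target] as logSumExp(logits) - logits[target], with the max subtracted inside logSumExp so no exp() overflows. A plain-Go sketch of the idea (not the MLX binding API):

```go
// Sketch only: cross-entropy for one example computed through LogSumExp,
// so large logits never overflow exp().
package main

import (
	"fmt"
	"math"
)

// logSumExp returns log(sum(exp(x))) with the max subtracted first,
// which keeps every exp() argument <= 0.
func logSumExp(x []float64) float64 {
	max := x[0]
	for _, v := range x {
		if v > max {
			max = v
		}
	}
	sum := 0.0
	for _, v := range x {
		sum += math.Exp(v - max)
	}
	return max + math.Log(sum)
}

// crossEntropy = -log softmax(logits)[target] = logSumExp(logits) - logits[target]
func crossEntropy(logits []float64, target int) float64 {
	return logSumExp(logits) - logits[target]
}

func main() {
	// Logits this large would overflow a naive exp-then-normalize softmax.
	fmt.Println(crossEntropy([]float64{1000, 998, 990}, 0)) // ~0.127
}
```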
Snider
92c6282d50 refactor(mlx): drop mlx build tag, auto-enable on darwin/arm64
Remove the manual -tags mlx requirement. MLX is now automatically
compiled on darwin/arm64 via build constraints. Stubs remain for
other platforms. No functional change.

Co-Authored-By: Virgil <virgil@lethean.io>
2026-02-17 16:57:41 +00:00
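The mechanism here is the standard Go build-constraint pattern; a sketch of what the gated file could look like (file and package names illustrative):

```go
// mlx_darwin.go: real MLX-C bindings, compiled automatically on Apple
// Silicon with no -tags mlx flag required.

//go:build darwin && arm64

package mlx
```

A mirror stub file would carry the negated constraint `//go:build !(darwin && arm64)` so the same symbols exist, as no-ops, on every other platform.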
d0cbd5065e Merge pull request 'test: validate MLX inference and scoring pipeline on M3 Ultra' (#4) from test/mlx-validation-m3-ultra into main
Reviewed-on: #4
2026-02-16 17:47:16 +00:00
3916633f4d test: validate MLX inference and scoring pipeline on M3 Ultra
Fixes #2

- Run complete test suite: all 84 tests passing (100%)
- Verify Metal 4 GPU support and hardware capabilities
- Test scoring pipeline (heuristic + judge + engine)
- Confirm GGUF model directory with 9 models (40.43 GB)
- Document MLX backend build requirements
- Update module imports from forge.lthn.ai/core/go to forge.lthn.ai/core/cli
- Add comprehensive TEST-RESULTS.md with findings

Platform: M3 Ultra (60 GPU cores, 96GB RAM, Metal 4)
Results: All tests passing, scoring pipeline operational, MLX ready to build

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-16 17:24:36 +00:00
Claude
e84d6ad3c9 feat: extract AI/ML packages from core/go
LEM scoring pipeline, native MLX Metal bindings, Claude SDK wrapper,
RAG with Qdrant/Ollama, unified AI facade, and MCP protocol server.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-16 15:25:55 +00:00