Commit graph

20 commits

Author SHA1 Message Date
Snider
df0b300b1a fix(metal): auto-contiguous data access for non-contiguous arrays
Bind mlx_contiguous and _mlx_array_is_row_contiguous from mlx-c.
Floats(), DataInt32(), and Ints() now automatically handle non-contiguous
arrays (from Transpose, BroadcastTo, SliceAxis, etc.) by checking
IsRowContiguous() and making a contiguous copy when needed.

Previously these methods returned silently wrong data for view arrays.
The old workaround of Reshape(arr, totalSize) is no longer needed.

7 new tests for contiguous handling (transpose, broadcast, slice views).

Co-Authored-By: Virgil <virgil@lethean.io>
2026-02-19 20:39:36 +00:00
Snider
bff97ccf19 feat(api): migrate to go-inference shared interfaces
Replace local TextModel, Backend, Token, Message, and option types with
forge.lthn.ai/core/go-inference. go-mlx is now a pure backend that
registers "metal" into the shared inference registry via init().

Deleted: textmodel.go, options.go, backend.go
Updated: register_metal.go (implements inference.Backend with Available()),
  mlx_test.go (uses inference.* types, 4 new tests), go.mod,
  internal/metal/generate.go (added RepeatPenalty)

159 tests passing (148 internal/metal + 11 root).

Co-Authored-By: Virgil <virgil@lethean.io>
2026-02-19 20:15:42 +00:00
Snider
95d92fffff docs: update project docs for backend abstraction completion
- CLAUDE.md: new architecture diagram, public API examples
- TODO.md: Phase 4 marked complete, remaining items noted
- FINDINGS.md: migration completion notes, import cycle resolution

Co-Authored-By: Virgil <virgil@lethean.io>
2026-02-19 20:07:01 +00:00
Snider
eb8dee31bf test(api): integration tests for public LoadModel + Generate
Tests: MetalAvailable, DefaultBackend, GetBackend, LoadModel error
paths, options, defaults. Model-dependent test skips when model
not available on disk.

Co-Authored-By: Virgil <virgil@lethean.io>
2026-02-19 20:05:14 +00:00
Snider
4d1bff3d78 refactor(api): clean root package — interfaces only, metal auto-registered
Root package now contains only:
- mlx.go: package doc + go:generate directives
- textmodel.go: TextModel, Token, Message interfaces
- options.go: GenerateOption, LoadOption functional options
- backend.go: Backend interface, Register/Get/Default/LoadModel
- register_metal.go: build-tagged init() + adapter + memory delegates
- mlx_stub.go: non-darwin fallback

internal/metal/ has its own Token, GenerateConfig, Model types.
register_metal.go adapts between the two via metalAdapter.

Co-Authored-By: Virgil <virgil@lethean.io>
2026-02-19 20:04:19 +00:00
Snider
c612c3e060 refactor(metal): move all tests to internal/metal (148 tests passing)
Co-Authored-By: Virgil <virgil@lethean.io>
2026-02-19 20:00:02 +00:00
Snider
08976aa504 refactor(metal): flatten model, tokenizer, sample, cache into internal/metal
Co-Authored-By: Virgil <virgil@lethean.io>
2026-02-19 19:51:14 +00:00
Snider
a669d1d9c1 refactor(metal): move nn, io, grad, lora, optim to internal/metal
Co-Authored-By: Virgil <virgil@lethean.io>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-19 19:39:58 +00:00
Snider
d6a49544bd refactor(metal): move ops, slice, random, fast, compile to internal/metal
Co-Authored-By: Virgil <virgil@lethean.io>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-19 19:39:49 +00:00
Snider
1cf5178c80 refactor(metal): move dtype, array, metal, stream to internal/metal
Move foundation CGO files from root package to internal/metal/ package.
Changes package declaration from `package mlx` to `package metal`.
Updates CGO SRCDIR paths to account for new location (two levels deeper).
Extracts go:generate directives into root generate.go.

Co-Authored-By: Virgil <virgil@lethean.io>
2026-02-19 19:34:38 +00:00
Snider
643df757e0 feat(api): define public TextModel, Backend, and options interfaces
Co-Authored-By: Virgil <virgil@lethean.io>
2026-02-19 19:32:00 +00:00
Snider
f7a57657d4 chore: gitignore dist/ (CMake install output)
Co-Authored-By: Virgil <virgil@lethean.io>
2026-02-19 19:30:23 +00:00
Snider
97d9041455 docs(plan): fold Virgil review into design and implementation plan
Virgil review items integrated:
- context.Context on Generate/Chat (required for HTTP cancellation)
- Err() error on TextModel (distinguish EOS from OOM)
- Chat() on TextModel (model owns its chat template)
- Memory control functions exposed at root package level
- Functional options convention confirmed
- pkg/process confirmed — no changes needed for mlxlm

Co-Authored-By: Virgil <virgil@lethean.io>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-19 19:25:05 +00:00
Snider
28e2a07316 docs(plan): backend abstraction implementation plan (12 tasks)
Detailed step-by-step plan for restructuring go-mlx:
- Tasks 1-8: mechanical migration to internal/metal/
- Task 9: new Generate loop with iter.Seq[Token] streaming
- Task 10: deterministic memory cleanup (fixes leak)
- Tasks 11-12: integration tests and doc updates

Critical checkpoint at Task 7: all 148 tests must pass.

Co-Authored-By: Virgil <virgil@lethean.io>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-19 19:14:59 +00:00
Snider
c881813872 docs(design): backend abstraction with internal/metal reorganisation
Approved design for restructuring go-mlx:
- Root package becomes clean interface (TextModel, LoadModel, Token)
- All CGO code moves to internal/metal/
- Deterministic memory management (Close + per-step cleanup)
- Error propagation instead of silent logging
- mlxlm/ backend placeholder for Python subprocess support

Includes API breaking change communication in FINDINGS.md and
memory management research tasks in cpp/TODO.md.

See: docs/plans/2026-02-19-backend-abstraction-design.md

Co-Authored-By: Virgil <virgil@lethean.io>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-19 19:12:04 +00:00
Snider
3b155f6ca7 chore(cpp): set up standalone CLion project for parallel C++ sessions
cpp/ now has its own CMakeLists.txt that independently fetches mlx-c
v0.4.1, allowing a CLion Claude to build and explore the C API surface
without conflicting with the Go-side CMake build. Updated CLAUDE.md to
reflect cpp/ as the project root with standalone build instructions.

Added priority tasks to cpp/TODO.md: mlx_contiguous search, eval
semantics verification, mlx_cumsum check for TopP sampling.

Co-Authored-By: Virgil <virgil@lethean.io>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-19 18:49:16 +00:00
Snider
28b5c5bb47 test(sub-packages): add 33 tests for cache, sample, tokenizer
- cache_test.go (10): KVCache and RotatingKVCache — creation, single/multi
  update, bounded sliding window, reset, state access
- sample_test.go (8): greedy, temperature (high/low), topK (single/multi),
  chain composition, TopP/MinP stub pass-through
- tokenizer_test.go (15): Load (valid/missing/invalid), BOS/EOS detection,
  encode produces tokens, decode skips specials, decode regular tokens,
  DecodeToken (regular/special/space/unknown), FormatGemmaPrompt,
  GPT-2 byte maps (round-trip, printable ASCII, control chars)

Total test count: 29 → 148 across 10 test files.

Co-Authored-By: Virgil <virgil@lethean.io>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-19 18:40:49 +00:00
Snider
37abc496ba test(core): add 86 tests for ops, array, nn, fast kernels
Phase 1 hardening: cover all previously-untested core operations.

- array_test.go (25): scalar/array creation, shape, clone, free, data access
- ops_test.go (44): arithmetic, math, matmul, reductions, shape ops, indexing, slicing, random
- nn_test.go (8): Linear (dense/bias/LoRA), Embedding, RMSNormModule, RepeatKV
- fast_test.go (9): RMSNorm, LayerNorm, RoPE, ScaledDotProductAttention

Found: Floats()/DataInt32() return wrong data on non-contiguous arrays
(transpose, broadcast, slice views). Documented in FINDINGS.md.

Also: cpp/ workspace docs for CLion Claude session, Go 1.26 impact
assessment, verified go generate → test round-trip (29→115 tests).

Co-Authored-By: Virgil <virgil@lethean.io>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-19 18:37:30 +00:00
Snider
40cbdd7201 docs(todo): add ecosystem integration and Go 1.26 wishlist items
Streaming iterator for LEM Lab, batch inference for go-i18n throughput,
metrics, quantisation awareness, and Go 1.26 modernisation evaluation.

Co-Authored-By: Virgil <virgil@lethean.io>
2026-02-19 18:18:32 +00:00
Snider
cae7ef05e8 feat: extract go-mlx from go-ai as standalone Metal inference package
Split mlx/ directory from forge.lthn.ai/core/go-ai into its own module.
Rewrites import paths, adds CLAUDE.md/TODO.md/FINDINGS.md for dedicated
Claude sessions. Zero external Go deps — pure CGO + mlx-c v0.4.1.

Co-Authored-By: Virgil <virgil@lethean.io>
2026-02-19 17:57:37 +00:00