diff --git a/FINDINGS.md b/FINDINGS.md index 0010f5b..8634b45 100644 --- a/FINDINGS.md +++ b/FINDINGS.md @@ -295,3 +295,50 @@ for tok := range m.Generate(ctx, prompt, inference.WithMaxTokens(128)) { ... } - 148 internal/metal tests — all pass - 11 root integration tests — all pass - Total: 159 tests passing + +--- + +## 2026-02-19: CLion Claude Research Applied + +### Contiguous Array Fix (data correctness bug) + +`Floats()`, `DataInt32()`, and `Ints()` now automatically handle non-contiguous arrays. Previously, reading data from view arrays (Transpose, BroadcastTo, SliceAxis) returned silently wrong results. + +**Fix**: Bound `mlx_contiguous` and `_mlx_array_is_row_contiguous` from mlx-c. The `ensureContiguous()` helper checks `IsRowContiguous()` and makes a contiguous copy when needed before accessing the raw data pointer. + +The old workaround of `Reshape(arr, totalSize)` to force contiguity is no longer needed. + +### TopP (Nucleus) Sampling Implemented + +Was a stub that passed logits through unchanged. Now fully implemented: +1. Softmax to get probabilities +2. Argsort descending to get sorted indices +3. CumSum of sorted probabilities +4. Mask tokens where cumulative probability (excluding current) exceeds threshold +5. Scatter mask back to original positions via PutAlongAxis + argsort indices + +### MinP Sampling Implemented + +Was a stub. Now masks tokens whose probability is below `min_p * max_prob`. Uses MaxAxis to find the peak probability per position. + +### New Bindings + +| Function | Header | Purpose | +|----------|--------|---------| +| `mlx_contiguous` | ops.h | Force row-major contiguous layout | +| `_mlx_array_is_row_contiguous` | array.h | Check contiguity without copying | +| `mlx_cumsum` | ops.h | Cumulative sum (forward/reverse, inclusive/exclusive) | +| `mlx_sort_axis` | ops.h | Sort along axis | +| `mlx_argsort_axis` | ops.h | Indices that would sort | +| `mlx_greater` | ops.h | Element-wise comparison | +| `mlx_max_axis` | ops.h | Maximum along axis | +| `mlx_get_cache_memory` | memory.h | Current allocator cache size | +| `mlx_reset_peak_memory` | memory.h | Reset peak memory tracking | +| `mlx_set_wired_limit` | memory.h | Wired memory limit control | +| `mlx_metal_device_info` | metal.h | GPU hardware info | + +### Test results + +- 165 internal/metal tests — all pass +- 11 root integration tests — all pass +- Total: 176 tests passing diff --git a/TODO.md b/TODO.md index 1c1e921..2c64d72 100644 --- a/TODO.md +++ b/TODO.md @@ -38,8 +38,8 @@ Implementation plan: `docs/plans/2026-02-19-backend-abstraction-plan.md` - [x] **All CGO moved to `internal/metal/`** — 19 source files, 10 test files, 148 tests passing. - [x] **Public API: `TextModel`, `Backend`, functional options** — Clean root package, compiles on all platforms. - [x] **Integration tests** — 7 tests for public API (backend registration, options, LoadModel paths). -- [ ] **Error handling audit** — `checkError()` still logs + swallows. Needs conversion to error returns. Low priority — existing behaviour, not a regression. -- [ ] **Memory management — deterministic cleanup** — `Close()` stub in place. Needs CLion Claude research on `mlx_array_free` safety before implementing per-step cleanup. See `cpp/TODO.md`. +- [ ] **Error handling audit** — `checkError()` still logs + swallows. Needs conversion to error returns (return code 0/1 + stored error string pattern confirmed by CLion Claude). Low priority — existing behaviour, not a regression. +- [ ] **Memory management — deterministic cleanup** — `Close()` stub in place. CLion Claude confirmed `mlx_array_free()` is safe on graph-referenced arrays (refcounted via shared_ptr). Double-free is UB. Can now implement per-step cleanup. - [ ] **Documentation** — Public API has godoc but needs examples for common workflows. ## Phase 5: Ecosystem Integration (Virgil wishlist)