docs: update findings and TODO with CLion Claude research results
Document contiguous fix, TopP/MinP implementations, new bindings. Update memory management TODO — CLion Claude confirmed safe patterns. 176 tests passing. Co-Authored-By: Virgil <virgil@lethean.io>
This commit is contained in:
parent
f507620e85
commit
7435648f66
2 changed files with 49 additions and 2 deletions
47
FINDINGS.md
47
FINDINGS.md
|
|
@ -295,3 +295,50 @@ for tok := range m.Generate(ctx, prompt, inference.WithMaxTokens(128)) { ... }
|
|||
- 148 internal/metal tests — all pass
|
||||
- 11 root integration tests — all pass
|
||||
- Total: 159 tests passing
|
||||
|
||||
---
|
||||
|
||||
## 2026-02-19: CLion Claude Research Applied
|
||||
|
||||
### Contiguous Array Fix (data correctness bug)
|
||||
|
||||
`Floats()`, `DataInt32()`, and `Ints()` now automatically handle non-contiguous arrays. Previously, reading data from view arrays (Transpose, BroadcastTo, SliceAxis) returned silently wrong results.
|
||||
|
||||
**Fix**: Bound `mlx_contiguous` and `_mlx_array_is_row_contiguous` from mlx-c. The `ensureContiguous()` helper checks `IsRowContiguous()` and makes a contiguous copy when needed before accessing the raw data pointer.
|
||||
|
||||
The old workaround of `Reshape(arr, totalSize)` to force contiguity is no longer needed.
|
||||
|
||||
### TopP (Nucleus) Sampling Implemented
|
||||
|
||||
Was a stub that passed logits through unchanged. Now fully implemented:
|
||||
1. Softmax to get probabilities
|
||||
2. Argsort descending to get sorted indices
|
||||
3. CumSum of sorted probabilities
|
||||
4. Mask tokens where cumulative probability (excluding current) exceeds threshold
|
||||
5. Scatter mask back to original positions via PutAlongAxis + argsort indices
|
||||
|
||||
### MinP Sampling Implemented
|
||||
|
||||
Was a stub. Now masks tokens whose probability is below `min_p * max_prob`. Uses MaxAxis to find the peak probability per position.
|
||||
|
||||
### New Bindings
|
||||
|
||||
| Function | Header | Purpose |
|
||||
|----------|--------|---------|
|
||||
| `mlx_contiguous` | ops.h | Force row-major contiguous layout |
|
||||
| `_mlx_array_is_row_contiguous` | array.h | Check contiguity without copying |
|
||||
| `mlx_cumsum` | ops.h | Cumulative sum (forward/reverse, inclusive/exclusive) |
|
||||
| `mlx_sort_axis` | ops.h | Sort along axis |
|
||||
| `mlx_argsort_axis` | ops.h | Indices that would sort |
|
||||
| `mlx_greater` | ops.h | Element-wise comparison |
|
||||
| `mlx_max_axis` | ops.h | Maximum along axis |
|
||||
| `mlx_get_cache_memory` | memory.h | Current allocator cache size |
|
||||
| `mlx_reset_peak_memory` | memory.h | Reset peak memory tracking |
|
||||
| `mlx_set_wired_limit` | memory.h | Wired memory limit control |
|
||||
| `mlx_metal_device_info` | metal.h | GPU hardware info |
|
||||
|
||||
### Test results
|
||||
|
||||
- 165 internal/metal tests — all pass
|
||||
- 11 root integration tests — all pass
|
||||
- Total: 176 tests passing
|
||||
|
|
|
|||
4
TODO.md
4
TODO.md
|
|
@ -38,8 +38,8 @@ Implementation plan: `docs/plans/2026-02-19-backend-abstraction-plan.md`
|
|||
- [x] **All CGO moved to `internal/metal/`** — 19 source files, 10 test files, 148 tests passing.
|
||||
- [x] **Public API: `TextModel`, `Backend`, functional options** — Clean root package, compiles on all platforms.
|
||||
- [x] **Integration tests** — 7 tests for public API (backend registration, options, LoadModel paths).
|
||||
- [ ] **Error handling audit** — `checkError()` still logs + swallows. Needs conversion to error returns. Low priority — existing behaviour, not a regression.
|
||||
- [ ] **Memory management — deterministic cleanup** — `Close()` stub in place. Needs CLion Claude research on `mlx_array_free` safety before implementing per-step cleanup. See `cpp/TODO.md`.
|
||||
- [ ] **Error handling audit** — `checkError()` still logs + swallows. Needs conversion to error returns (return code 0/1 + stored error string pattern confirmed by CLion Claude). Low priority — existing behaviour, not a regression.
|
||||
- [ ] **Memory management — deterministic cleanup** — `Close()` stub in place. CLion Claude confirmed `mlx_array_free()` is safe on graph-referenced arrays (refcounted via shared_ptr). Double-free is UB. Can now implement per-step cleanup.
|
||||
- [ ] **Documentation** — Public API has godoc but needs examples for common workflows.
|
||||
|
||||
## Phase 5: Ecosystem Integration (Virgil wishlist)
|
||||
|
|
|
|||
Loading…
Add table
Reference in a new issue