## Summary

- enforce a 10 MiB cap per `thread_id` in state log storage
- enforce a 10 MiB cap per `process_uuid` for threadless (`thread_id IS NULL`) logs
- scope pruning to only keys affected by the current insert batch
- add a cheap per-key `SUM(...)` precheck so windowed prune queries only run for keys that are currently over the cap
- add SQLite indexes used by the pruning queries
- add focused runtime tests covering both pruning behaviors

## Why

This keeps log growth bounded by the intended partition semantics while preserving a small, readable implementation localized to the existing insert path.

## Local Latency Snapshot (No Truncation-Pressure Run)

Collected from session `019c734f-1d16-7002-9e00-c966c9fbbcae` using local-only (uncommitted) instrumentation, while not specifically benchmarking the truncation-heavy regime.

### Percentiles By Query (ms)

| query | count | p50 | p90 | p95 | p99 | max |
|---|---:|---:|---:|---:|---:|---:|
| `insert_logs.insert_batch` | 110 | 0.332 | 0.999 | 1.811 | 2.978 | 3.493 |
| `insert_logs.precheck.process` | 106 | 0.074 | 0.152 | 0.206 | 0.258 | 0.426 |
| `insert_logs.precheck.thread` | 73 | 0.118 | 0.206 | 0.253 | 1.025 | 1.025 |
| `insert_logs.prune.process` | 58 | 0.291 | 0.576 | 0.607 | 1.088 | 1.088 |
| `insert_logs.prune.thread` | 44 | 0.318 | 0.467 | 0.728 | 0.797 | 0.797 |
| `insert_logs.prune_total` | 110 | 0.488 | 0.976 | 1.237 | 1.593 | 1.684 |
| `insert_logs.total` | 110 | 1.315 | 2.889 | 3.623 | 5.739 | 5.961 |
| `insert_logs.tx_begin` | 110 | 0.133 | 0.235 | 0.282 | 0.412 | 0.546 |
| `insert_logs.tx_commit` | 110 | 0.259 | 0.689 | 0.772 | 1.065 | 1.080 |

### `insert_logs.total` Histogram (ms)

| bucket | count |
|---|---:|
| `<= 0.100` | 0 |
| `<= 0.250` | 0 |
| `<= 0.500` | 7 |
| `<= 1.000` | 33 |
| `<= 2.000` | 40 |
| `<= 5.000` | 28 |
| `<= 10.000` | 2 |
| `<= 20.000` | 0 |
| `<= 50.000` | 0 |
| `<= 100.000` | 0 |
| `> 100.000` | 0 |

## Local Latency Snapshot (Truncation-Heavy / Cap-Hit Regime)

Collected from a run where cap-hit behavior was frequent (`135/180` insert calls), using local-only (uncommitted) instrumentation and a temporary local cap of `10_000` bytes for stress testing (not the merged `10 MiB` cap).

### Percentiles By Query (ms)

| query | count | p50 | p90 | p95 | p99 | max |
|---|---:|---:|---:|---:|---:|---:|
| `insert_logs.insert_batch` | 180 | 0.524 | 1.645 | 2.163 | 3.424 | 3.777 |
| `insert_logs.precheck.process` | 171 | 0.086 | 0.235 | 0.373 | 0.758 | 1.147 |
| `insert_logs.precheck.thread` | 100 | 0.105 | 0.251 | 0.291 | 1.176 | 1.622 |
| `insert_logs.prune.process` | 109 | 0.386 | 0.839 | 1.146 | 1.548 | 2.588 |
| `insert_logs.prune.thread` | 56 | 0.253 | 0.550 | 1.148 | 2.484 | 2.484 |
| `insert_logs.prune_total` | 180 | 0.511 | 1.221 | 1.695 | 4.548 | 5.512 |
| `insert_logs.total` | 180 | 1.631 | 3.902 | 5.103 | 8.901 | 9.095 |
| `insert_logs.total_cap_hit` | 135 | 1.876 | 4.501 | 5.547 | 8.902 | 9.096 |
| `insert_logs.total_no_cap_hit` | 45 | 0.520 | 1.700 | 2.079 | 3.294 | 3.294 |
| `insert_logs.tx_begin` | 180 | 0.109 | 0.253 | 0.287 | 1.088 | 1.406 |
| `insert_logs.tx_commit` | 180 | 0.267 | 0.813 | 1.170 | 2.497 | 2.574 |

### `insert_logs.total` Histogram (ms)

| bucket | count |
|---|---:|
| `<= 0.100` | 0 |
| `<= 0.250` | 0 |
| `<= 0.500` | 16 |
| `<= 1.000` | 39 |
| `<= 2.000` | 60 |
| `<= 5.000` | 54 |
| `<= 10.000` | 11 |
| `<= 20.000` | 0 |
| `<= 50.000` | 0 |
| `<= 100.000` | 0 |
| `> 100.000` | 0 |

### `insert_logs.total` Histogram When Cap Was Hit (ms)

| bucket | count |
|---|---:|
| `<= 0.100` | 0 |
| `<= 0.250` | 0 |
| `<= 0.500` | 0 |
| `<= 1.000` | 22 |
| `<= 2.000` | 51 |
| `<= 5.000` | 51 |
| `<= 10.000` | 11 |
| `<= 20.000` | 0 |
| `<= 50.000` | 0 |
| `<= 100.000` | 0 |
| `> 100.000` | 0 |

### Performance Takeaways

- Even in a cap-hit-heavy run (`75%` cap-hit calls), `insert_logs.total` stays sub-10ms at p99 (`8.901ms`) and max (`9.095ms`).
- Calls that did **not** hit the cap are materially cheaper (`insert_logs.total_no_cap_hit` p95 `2.079ms`) than cap-hit calls (`insert_logs.total_cap_hit` p95 `5.547ms`).
- Compared to the earlier non-truncation-pressure run, overall `insert_logs.total` rose from p95 `3.623ms` to p95 `5.103ms` (+`1.48ms`), indicating bounded overhead when pruning is active.
- This truncation-heavy run used an intentionally low local cap for stress testing; with the real 10 MiB cap, cap-hit frequency should be much lower in normal sessions.

## Testing

- `just fmt` (in `codex-rs`)
- `cargo test -p codex-state` (in `codex-rs`)
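
## Pruning Sketch (Illustrative)

The precheck-then-prune flow listed in the Summary can be sketched roughly as follows. This is a minimal Python illustration using the standard-library `sqlite3` module, not the Rust code in this PR. The `logs` table, its `bytes` and `message` columns, the index names, and the `prune_key` / `insert_logs` helpers are all assumptions made for the example; only `thread_id`, `process_uuid`, the 10 MiB cap, the per-key `SUM(...)` precheck, and the windowed prune come from the description above.

```python
import sqlite3

CAP_BYTES = 10 * 1024 * 1024  # 10 MiB, the cap named in the Summary

# Hypothetical schema: everything except thread_id and process_uuid is an
# assumption made for this sketch.
SCHEMA = """
CREATE TABLE logs (
    id           INTEGER PRIMARY KEY AUTOINCREMENT,
    thread_id    TEXT,
    process_uuid TEXT NOT NULL,
    message      TEXT NOT NULL,
    bytes        INTEGER NOT NULL
);
CREATE INDEX idx_logs_thread ON logs (thread_id, id);
CREATE INDEX idx_logs_process ON logs (process_uuid, id)
    WHERE thread_id IS NULL;
"""


def prune_key(conn, where, key, cap):
    """Prune one partition, but only if the cheap precheck says it is over cap."""
    # Precheck: a single aggregate over the partition.
    (total,) = conn.execute(
        f"SELECT COALESCE(SUM(bytes), 0) FROM logs WHERE {where}", (key,)
    ).fetchone()
    if total <= cap:
        return False
    # Windowed prune: accumulate bytes from newest to oldest and delete every
    # row at which the running total exceeds the cap.
    conn.execute(
        f"""
        DELETE FROM logs WHERE id IN (
            SELECT id FROM (
                SELECT id, SUM(bytes) OVER (ORDER BY id DESC) AS running
                FROM logs WHERE {where}
            ) WHERE running > ?
        )
        """,
        (key, cap),
    )
    return True


def insert_logs(conn, rows, cap=CAP_BYTES):
    """Insert (thread_id, process_uuid, message) rows, then prune touched keys."""
    rows = list(rows)
    with conn:  # the insert and any pruning share one transaction
        conn.executemany(
            "INSERT INTO logs (thread_id, process_uuid, message, bytes) "
            "VALUES (?, ?, ?, ?)",
            [(t, p, m, len(m.encode("utf-8"))) for t, p, m in rows],
        )
        # Only keys that appear in this batch are considered for pruning.
        threads = {t for t, _, _ in rows if t is not None}
        processes = {p for t, p, _ in rows if t is None}
        for t in sorted(threads):
            prune_key(conn, "thread_id = ?", t, cap)
        for p in sorted(processes):
            prune_key(conn, "thread_id IS NULL AND process_uuid = ?", p, cap)
```

In this sketch the oldest rows go first because the running total is ordered by `id DESC`. A key that is under the cap costs only the single `SUM` query, which matches the intent of the precheck bullet in the Summary.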