This PR add an URI-based system to reference agents within a tree. This
comes from a sync between research and engineering.
The main agent (the one manually spawned by a user) is always called
`/root`. Any sub-agent spawned by it will be `/root/agent_1` for example
where `agent_1` is chosen by the model.
Any agent can contact any agents using the path.
Paths can be used either in absolute or relative to the calling agents
Resume is not supported for now on this new path
Add a representation of the agent graph. This is now used for:
* Cascade close agents (when I close a parent, it close the kids)
* Cascade resume (oposite)
Later, this will also be used for post-compaction stuffing of the
context
Direct fix for: https://github.com/openai/codex/issues/14458
### Summary
The goal is for us to get the latest turn model and reasoning effort on
thread/resume is no override is provided on the thread/resume func call.
This is the part 1 which we write the model and reasoning effort for a
thread to the sqlite db and there will be a followup PR to consume the
two new fields on thread/resume.
[part 2 PR is currently WIP](https://github.com/openai/codex/pull/14888)
and this one can be merged independently.
This extends dynamic_tool_calls to allow us to hide a tool from the
model context but still use it as part of the general tool calling
runtime (for ex from js_repl/code_mode)
## Summary
This PR is the first slice of the per-session `/feedback` logging work:
it adds a process-unique identifier to SQLite log rows.
It does **not** change `/feedback` sourcing behavior yet.
## Changes
- Add migration `0009_logs_process_id.sql` to extend `logs` with:
- `process_uuid TEXT`
- `idx_logs_process_uuid` index
- Extend state log models:
- `LogEntry.process_uuid: Option<String>`
- `LogRow.process_uuid: Option<String>`
- Stamp each log row with a stable per-process UUID in the sqlite log
layer:
- generated once per process as `pid:<pid>:<uuid>`
- Update sqlite log insert/query paths to persist and read
`process_uuid`:
- `INSERT INTO logs (..., process_uuid, ...)`
- `SELECT ..., process_uuid, ... FROM logs`
## Why
App-server runs many sessions in one process. This change provides a
process-scoping primitive we need for follow-up `/feedback` work, so
threadless/process-level logs can be associated with the emitting
process without mixing across processes.
## Non-goals in this PR
- No `/feedback` transport/source changes
- No attachment size changes
- No sqlite retention/trim policy changes
## Testing
- `just fmt`
- CI will run the full checks
## Align with the new phase-1 design
Basically we know run phase 1 in parallel by considering:
* Max 64 rollouts
* Max 1 month old
* Consider the most recent first
This PR also adds stronger parallelization capabilities by detecting
stale jobs, retry policies, ownership of computation to prevent double
computations etc etc
Summary:
- read conversation summaries and cwd info from the state DB when
possible so we no longer rely on rollout files for metadata and avoid
extra I/O
- persist CLI version in thread metadata, surface it through summary
builders, and add the necessary DB migration hooks
- simplify thread listing by using enriched state DB data directly
rather than reading rollout heads
Testing:
- Not run (not requested)
## Summary
This PR makes SQLite rollout backfill resumable and repeatable instead
of one-shot-on-db-create.
## What changed
- Added a persisted backfill state table:
- state/migrations/0008_backfill_state.sql
- Tracks status (pending|running|complete), last_watermark, and
last_success_at.
- Added backfill state model/types in codex-state:
- BackfillState, BackfillStatus (state/src/model/backfill_state.rs)
- Added runtime APIs to manage backfill lifecycle/progress:
- get_backfill_state
- mark_backfill_running
- checkpoint_backfill
- mark_backfill_complete
- Updated core startup behavior:
- Backfill now runs whenever state is not Complete (not only when DB
file is newly created).
- Reworked backfill execution:
- Collect rollout files, derive deterministic watermark per path, sort,
resume from last_watermark.
- Process in batches (BACKFILL_BATCH_SIZE = 200), checkpoint after each
batch.
- Mark complete with last_success_at at the end.
## Why
Previous behavior could leave users permanently partially backfilled if
the process exited during initial async backfill. This change allows
safe continuation across restarts and avoids restarting from scratch.
- Schema: thread_id (PK, FK to threads.id with cascade delete),
trace_summary, memory_summary, updated_at.
- Migration: creates the table and an index on (updated_at DESC,
thread_id DESC) for efficient recent-first reads.
- Runtime API (DB-only):
- `get_thread_memory(thread_id)`: fetch one memory row.
- `upsert_thread_memory(thread_id, trace_summary, memory_summary)`:
insert/update by thread id and always advance updated_at.
- `get_last_n_thread_memories_for_cwd(cwd, n)`: join thread_memory with
threads and return newest n rows for an exact cwd match.
- Model layer: introduced ThreadMemory and row conversion types to keep
query decoding typed and consistent with existing state models.
Persist thread_dynamic_tools in sqlite and read first from it. Fall back
to rollout files if it's not found. Persist dynamic tools to both sqlite
and rollout files.
Saw that new sessions get populated to db correctly & old sessions get
backfilled correctly at startup:
```
celia@com-92114 codex-rs % sqlite3 ~/.codex/state.sqlite \ "select thread_id, position,name,description,input_schema from thread_dynamic_tools;"
019c0cad-ec0d-74b2-a787-e8b33a349117|0|geo_lookup|lookup a city|{"properties":{"city":{"type":"string"}},"required":["city"],"type":"object"}
....
019c10ca-aa4b-7620-ae40-c0919fbd7ea7|0|geo_lookup|lookup a city|{"properties":{"city":{"type":"string"}},"required":["city"],"type":"object"}
```
Add a `.sqlite` database to be used to store rollout metatdata (and
later logs)
This PR is phase 1:
* Add the database and the required infrastructure
* Add a backfill of the database
* Persist the newly created rollout both in files and in the DB
* When we need to get metadata or a rollout, consider the `JSONL` as the
source of truth but compare the results with the DB and show any errors