Added multi-limit support end-to-end by carrying limit_name in
rate-limit snapshots and handling multiple buckets instead of only
codex.
Extended /usage client parsing to consume additional_rate_limits
Updated TUI /status and in-memory state to store/render per-limit
snapshots
Extended app-server rate-limit read response: kept rate_limits and added
rate_limits_by_name.
Adjusted usage-limit error messaging for non-default codex limit buckets
Problem:
1. turn id is constructed in-memory;
2. on resuming threads, turn_id might not be unique;
3. client cannot no the boundary of a turn from rollout files easily.
This PR does three things:
1. persist `task_started` and `task_complete` events;
1. persist `turn_id` in rollout turn events;
5. generate turn_id as unique uuids instead of incrementing it in
memory.
This helps us resolve the issue of clients wanting to have unique turn
ids for resuming a thread, and knowing the boundry of each turn in
rollout files.
example debug logs
```
2026-02-11T00:32:10.746876Z DEBUG codex_app_server_protocol::protocol::thread_history: built turn from rollout items turn_index=8 turn=Turn { id: "019c4a07-d809-74c3-bc4b-fd9618487b4b", items: [UserMessage { id: "item-24", content: [Text { text: "hi", text_elements: [] }] }, AgentMessage { id: "item-25", text: "Hi. I’m in the workspace with your current changes loaded and ready. Send the next task and I’ll execute it end-to-end." }], status: Completed, error: None }
2026-02-11T00:32:10.746888Z DEBUG codex_app_server_protocol::protocol::thread_history: built turn from rollout items turn_index=9 turn=Turn { id: "019c4a18-1004-76c0-a0fb-a77610f6a9b8", items: [UserMessage { id: "item-26", content: [Text { text: "hello", text_elements: [] }] }, AgentMessage { id: "item-27", text: "Hello. Ready for the next change in `codex-rs`; I can continue from the current in-progress diff or start a new task." }], status: Completed, error: None }
2026-02-11T00:32:10.746899Z DEBUG codex_app_server_protocol::protocol::thread_history: built turn from rollout items turn_index=10 turn=Turn { id: "019c4a19-41f0-7db0-ad78-74f1503baeb8", items: [UserMessage { id: "item-28", content: [Text { text: "hello", text_elements: [] }] }, AgentMessage { id: "item-29", text: "Hello. Send the specific change you want in `codex-rs`, and I’ll implement it and run the required checks." }], status: Completed, error: None }
```
backward compatibility:
if you try to resume an old session without task_started and
task_complete event populated, the following happens:
- If you resume and do nothing: those reconstructed historical IDs can
differ next time you resume.
- If you resume and send a new turn: the new turn gets a fresh UUID from
live submission flow and is persisted, so that new turn’s ID is stable
on later resumes.
I think this behavior is fine, because we only care about deterministic
turn id once a turn is triggered.
As of this PR, `SessionServices` retains a
`Option<StartedNetworkProxy>`, if appropriate.
Now the `network` field on `Config` is `Option<NetworkProxySpec>`
instead of `Option<NetworkProxy>`.
Over in `Session::new()`, we invoke `NetworkProxySpec::start_proxy()` to
create the `StartedNetworkProxy`, which is a new struct that retains the
`NetworkProxy` as well as the `NetworkProxyHandle`. (Note that `Drop` is
implemented for `NetworkProxyHandle` to ensure the proxies are shutdown
when it is dropped.)
The `NetworkProxy` from the `StartedNetworkProxy` is threaded through to
the appropriate places.
---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/11207).
* #11285
* __->__ #11207
With this PR we do not close the unified exec processes (i.e. background
terminals) at the end of a turn unless:
* The user interrupt the turn
* The user decide to clean the processes through `app-server` or
`/clean`
I made sure that `codex exec` correctly kill all the processes
Summary
- add the new resume_agent collab tool path through core, protocol, and
the app server API, including the resume events
- update the schema/TypeScript definitions plus docs so resume_agent
appears in generated artifacts and README
- note that resumed agents rehydrate rollout history without overwriting
their base instructions
Testing
- Not run (not requested)
In light of https://github.com/openai/codex/pull/10317, because
`.agents` can include resources that Codex can run in a privileged way,
it should be read-only by default just as `.codex` is.
### What
add wiring for `phase` field on `ResponseItem::Message` to lay
groundwork for differentiating model preambles and final messages.
currently optional.
follows pattern in #9698.
updated schemas with `just write-app-server-schema` so we can see type
changes.
### Tests
Updated existing tests for SSE parsing and hydrating from history
We started working with MCP in Codex before
https://crates.io/crates/rmcp was mature, so we had our own crate for
MCP types that was generated from the MCP schema:
8b95d3e082/codex-rs/mcp-types/README.md
Now that `rmcp` is more mature, it makes more sense to use their MCP
types in Rust, as they handle details (like the `_meta` field) that our
custom version ignored. Though one advantage that our custom types had
is that our generated types implemented `JsonSchema` and `ts_rs::TS`,
whereas the types in `rmcp` do not. As such, part of the work of this PR
is leveraging the adapters between `rmcp` types and the serializable
types that are API for us (app server and MCP) introduced in #10356.
Note this PR results in a number of changes to
`codex-rs/app-server-protocol/schema`, which merit special attention
during review. We must ensure that these changes are still
backwards-compatible, which is possible because we have:
```diff
- export type CallToolResult = { content: Array<ContentBlock>, isError?: boolean, structuredContent?: JsonValue, };
+ export type CallToolResult = { content: Array<JsonValue>, structuredContent?: JsonValue, isError?: boolean, _meta?: JsonValue, };
```
so `ContentBlock` has been replaced with the more general `JsonValue`.
Note that `ContentBlock` was defined as:
```typescript
export type ContentBlock = TextContent | ImageContent | AudioContent | ResourceLink | EmbeddedResource;
```
so the deletion of those individual variants should not be a cause of
great concern.
Similarly, we have the following change in
`codex-rs/app-server-protocol/schema/typescript/Tool.ts`:
```
- export type Tool = { annotations?: ToolAnnotations, description?: string, inputSchema: ToolInputSchema, name: string, outputSchema?: ToolOutputSchema, title?: string, };
+ export type Tool = { name: string, title?: string, description?: string, inputSchema: JsonValue, outputSchema?: JsonValue, annotations?: JsonValue, icons?: Array<JsonValue>, _meta?: JsonValue, };
```
so:
- `annotations?: ToolAnnotations` ➡️ `JsonValue`
- `inputSchema: ToolInputSchema` ➡️ `JsonValue`
- `outputSchema?: ToolOutputSchema` ➡️ `JsonValue`
and two new fields: `icons?: Array<JsonValue>, _meta?: JsonValue`
---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/10349).
* #10357
* __->__ #10349
* #10356
## Summary
- Stream proposed plans in Plan Mode using `<proposed_plan>` tags parsed
in core, emitting plan deltas plus a plan `ThreadItem`, while stripping
tags from normal assistant output.
- Persist plan items and rebuild them on resume so proposed plans show
in thread history.
- Wire plan items/deltas through app-server protocol v2 and render a
dedicated proposed-plan view in the TUI, including the “Implement this
plan?” prompt only when a plan item is present.
## Changes
### Core (`codex-rs/core`)
- Added a generic, line-based tag parser that buffers each line until it
can disprove a tag prefix; implements auto-close on `finish()` for
unterminated tags. `codex-rs/core/src/tagged_block_parser.rs`
- Refactored proposed plan parsing to wrap the generic parser.
`codex-rs/core/src/proposed_plan_parser.rs`
- In plan mode, stream assistant deltas as:
- **Normal text** → `AgentMessageContentDelta`
- **Plan text** → `PlanDelta` + `TurnItem::Plan` start/completion
(`codex-rs/core/src/codex.rs`)
- Final plan item content is derived from the completed assistant
message (authoritative), not necessarily the concatenated deltas.
- Strips `<proposed_plan>` blocks from assistant text in plan mode so
tags don’t appear in normal messages.
(`codex-rs/core/src/stream_events_utils.rs`)
- Persist `ItemCompleted` events only for plan items for rollout replay.
(`codex-rs/core/src/rollout/policy.rs`)
- Guard `update_plan` tool in Plan Mode with a clear error message.
(`codex-rs/core/src/tools/handlers/plan.rs`)
- Updated Plan Mode prompt to:
- keep `<proposed_plan>` out of non-final reasoning/preambles
- require exact tag formatting
- allow only one `<proposed_plan>` block per turn
(`codex-rs/core/templates/collaboration_mode/plan.md`)
### Protocol / App-server protocol
- Added `TurnItem::Plan` and `PlanDeltaEvent` to core protocol items.
(`codex-rs/protocol/src/items.rs`, `codex-rs/protocol/src/protocol.rs`)
- Added v2 `ThreadItem::Plan` and `PlanDeltaNotification` with
EXPERIMENTAL markers and note that deltas may not match the final plan
item. (`codex-rs/app-server-protocol/src/protocol/v2.rs`)
- Added plan delta route in app-server protocol common mapping.
(`codex-rs/app-server-protocol/src/protocol/common.rs`)
- Rebuild plan items from persisted `ItemCompleted` events on resume.
(`codex-rs/app-server-protocol/src/protocol/thread_history.rs`)
### App-server
- Forward plan deltas to v2 clients and map core plan items to v2 plan
items. (`codex-rs/app-server/src/bespoke_event_handling.rs`,
`codex-rs/app-server/src/codex_message_processor.rs`)
- Added v2 plan item tests.
(`codex-rs/app-server/tests/suite/v2/plan_item.rs`)
### TUI
- Added a dedicated proposed plan history cell with special background
and padding, and moved “• Proposed Plan” outside the highlighted block.
(`codex-rs/tui/src/history_cell.rs`, `codex-rs/tui/src/style.rs`)
- Only show “Implement this plan?” when a plan item exists.
(`codex-rs/tui/src/chatwidget.rs`,
`codex-rs/tui/src/chatwidget/tests.rs`)
<img width="831" height="847" alt="Screenshot 2026-01-29 at 7 06 24 PM"
src="https://github.com/user-attachments/assets/69794c8c-f96b-4d36-92ef-c1f5c3a8f286"
/>
### Docs / Misc
- Updated protocol docs to mention plan deltas.
(`codex-rs/docs/protocol_v1.md`)
- Minor plumbing updates in exec/debug clients to tolerate plan deltas.
(`codex-rs/debug-client/src/reader.rs`, `codex-rs/exec/...`)
## Tests
- Added core integration tests:
- Plan mode strips plan from agent messages.
- Missing `</proposed_plan>` closes at end-of-message.
(`codex-rs/core/tests/suite/items.rs`)
- Added unit tests for generic tag parser (prefix buffering, non-tag
lines, auto-close). (`codex-rs/core/src/tagged_block_parser.rs`)
- Existing app-server plan item tests in v2.
(`codex-rs/app-server/tests/suite/v2/plan_item.rs`)
## Notes / Behavior
- Plan output no longer appears in standard assistant text in Plan Mode;
it streams via `PlanDelta` and completes as a `TurnItem::Plan`.
- The final plan item content is authoritative and may diverge from
streamed deltas (documented as experimental).
- Reasoning summaries are not filtered; prompt instructs the model not
to include `<proposed_plan>` outside the final plan message.
## Codex Author
`codex fork 019bec2d-b09d-7450-b292-d7bcdddcdbfb`
Session renaming:
- `/rename my_session`
- `/rename` without arg and passing an argument in `customViewPrompt`
- AppExitInfo shows resume hint using the session name if set instead of
uuid, defaults to uuid if not set
- Names are stored in `CODEX_HOME/sessions.jsonl`
Session resuming:
- codex resume <name> lookup for `CODEX_HOME/sessions.jsonl` first entry
matching the name and resumes the session
---------
Co-authored-by: jif-oai <jif@openai.com>
Add dynamic tools to rollout file for persistence & read from rollout on
resume. Ran a real example and spotted the following in the rollout
file:
```
{"timestamp":"2026-01-29T01:27:57.468Z","type":"session_meta","payload":{"id":"019c075d-3f0b-77e3-894e-c1c159b04b1e","timestamp":"2026-01-29T01:27:57.451Z","...."dynamic_tools":[{"name":"demo_tool","description":"Demo dynamic tool","inputSchema":{"additionalProperties":false,"properties":{"city":{"type":"string"}},"required":["city"],"type":"object"}}],"git":{"commit_hash":"ebc573f15c01b8af158e060cfedd401f043e9dfa","branch":"dev/cc/dynamic-tools","repository_url":"https://github.com/openai/codex.git"}}}
```
###### Problem
Users get generic 429s with no guidance when a model is at capacity.
###### Solution
Detect model-cap headers, surface a clear “try a different model”
message, and keep behavior non‑intrusive (no auto‑switch).
###### Scope
CLI/TUI only; protocol + error mapping updated to carry model‑cap info.
###### Tests
- just fmt
- cargo test -p codex-tui
- cargo test -p codex-core --lib
shell_snapshot::tests::try_new_creates_and_deletes_snapshot_file --
--nocapture (ran in isolated env)
- validate local build with backend
<img width="719" height="845" alt="image"
src="https://github.com/user-attachments/assets/1470b33d-0974-4b1f-b8e6-d11f892f4b54"
/>
### Summary
- Parse all `web_search` tool actions (`search`, `find_in_page`,
`open_page`).
- Previously we only parsed + displayed `search`, which made the TUI
appear to pause when the other actions were being used.
- Show in progress `web_search` calls as `Searching the web`
- Previously we only showed completed tool calls
<img width="308" height="149" alt="image"
src="https://github.com/user-attachments/assets/90a4e8ff-b06a-48ff-a282-b57b31121845"
/>
### Tests
Added + updated tests, tested locally
### Follow ups
Update VSCode extension to display these as well
## Summary
Add dynamic tool injection to thread startup in API v2, wire dynamic
tool calls through the app server to clients, and plumb responses back
into the model tool pipeline.
### Flow (high level)
- Thread start injects `dynamic_tools` into the model tool list for that
thread (validation is done here).
- When the model emits a tool call for one of those names, core raises a
`DynamicToolCallRequest` event.
- The app server forwards it to the client as `item/tool/call`, waits
for the client’s response, then submits a `DynamicToolResponse` back to
core.
- Core turns that into a `function_call_output` in the next model
request so the model can continue.
### What changed
- Added dynamic tool specs to v2 thread start params and protocol types;
introduced `item/tool/call` (request/response) for dynamic tool
execution.
- Core now registers dynamic tool specs at request time and routes those
calls via a new dynamic tool handler.
- App server validates tool names/schemas, forwards dynamic tool call
requests to clients, and publishes tool outputs back into the session.
- Integration tests
## Summary
Support updating Personality mid-Thread via UserTurn/OverwriteTurn. This
is explicitly unused by the clients so far, to simplify PRs - app-server
and tui implementations will be follow-ups.
## Testing
- [x] added integration tests
This PR adds support for chained (layered) config.toml file merging for
clients that use the app server interface. This feature already exists
for the TUI, but it does not work for GUI clients.
It does the following:
* Changes code paths for new thread, resume thread, and fork thread to
use the effective config based on the cwd.
* Updates the `config/read` API to accept an optional `cwd` parameter.
If specified, the API returns the effective config based on that cwd
path. Also optionally includes all layers including project config
files. If cwd is not specified, the API falls back on its older behavior
where it considers only the global (non-project) config files when
computing the effective config.
The changes in codex_message_processor.rs look deceptively large. They
mostly just involve moving existing blocks of code to a later point in
some functions so it can use the cwd to calculate the config.
This PR builds upon #9509 and should be reviewed and merged after that
PR.
Tested:
* Verified change with (dependent, as-yet-uncommitted) changes to IDE
Extension and confirmed correct behavior
The full fix requires additional changes in the IDE Extension code base,
but they depend on this PR.
## Summary
- add optional `collaboration_mode` to `TurnContextItem` in rollouts
- persist the current collaboration mode when recording turn context
(sampling + compaction)
## Rationale
We already persist turn context data for resume logic. Capturing
collaboration mode in the rollout gives us the mode context for each
turn, enabling follow‑up work to diff mode instructions correctly on
resume.
## Changes
- protocol: add optional `collaboration_mode` field to `TurnContextItem`
- core: persist collaboration mode alongside other turn context settings
in rollouts
## Summary
This PR consolidates base_instructions onto SessionMeta /
SessionConfiguration, so we ensure `base_instructions` is set once per
session and should be (mostly) immutable, unless:
- overridden by config on resume / fork
- sub-agent tasks, like review or collab
In a future PR, we should convert all references to `base_instructions`
to consistently used the typed struct, so it's less likely that we put
other strings there. See #9423. However, this PR is already quite
complex, so I'm deferring that to a follow-up.
## Testing
- [x] Added a resume test to assert that instructions are preserved. In
particular, `resume_switches_models_preserves_base_instructions` fails
against main.
Existing test coverage thats assert base instructions are preserved
across multiple requests in a session:
- Manual compact keeps baseline instructions:
core/tests/suite/compact.rs:199
- Auto-compact keeps baseline instructions:
core/tests/suite/compact.rs:1142
- Prompt caching reuses the same instructions across two requests:
core/tests/suite/prompt_caching.rs:150 and
core/tests/suite/prompt_caching.rs:157
- Prompt caching with explicit expected string across two requests:
core/tests/suite/prompt_caching.rs:213 and
core/tests/suite/prompt_caching.rs:222
- Resume with model switch keeps original instructions:
core/tests/suite/resume.rs:136
- Compact/resume/fork uses request 0 instructions for later expected
payloads: core/tests/suite/compact_resume_fork.rs:215
- `tui/` and `tui2/` submit `Op::UserTurn` and own full turn context
(cwd/approval/sandbox/model/etc.).
- `Op::UserInput` is documented as legacy in `codex-protocol` (doc-only;
no `#[deprecated]` to avoid `-D warnings` fallout).
- Remove obsolete `#[allow(deprecated)]` and the unused `ConversationId`
alias/re-export.
### Description
- Remove the now-unused `instructions` field from the session metadata
to simplify SessionMeta and stop propagating transient instruction text
through the rollout recorder API. This was only saving
user_instructions, and was never being read.
- Stop passing user instructions into the rollout writer at session
creation so the rollout header only contains canonical session metadata.
### Testing
- Ran `just fmt` which completed successfully.
- Ran `just fix -p codex-protocol`, `just fix -p codex-core`, `just fix
-p codex-app-server`, `just fix -p codex-tui`, and `just fix -p
codex-tui2` which completed (Clippy fixes applied) as part of
verification.
- Ran `cargo test -p codex-protocol` which passed (28 tests).
- Ran `cargo test -p codex-core` which showed failures in a small set of
tests (not caused by the protocol type change directly):
`default_client::tests::test_create_client_sets_default_headers`,
several `models_manager::manager::tests::refresh_available_models_*`,
and `shell_snapshot::tests::linux_sh_snapshot_includes_sections` (these
tests failed in this CI run).
- Ran `cargo test -p codex-app-server` which reported several failing
integration tests (including
`suite::codex_message_processor_flow::test_codex_jsonrpc_conversation_flow`,
`suite::output_schema::send_user_turn_*`, and
`suite::user_agent::get_user_agent_returns_current_codex_user_agent`).
- `cargo test -p codex-tui` and `cargo test -p codex-tui2` were
attempted but aborted due to disk space exhaustion (`No space left on
device`).
------
[Codex
Task](https://chatgpt.com/codex/tasks/task_i_696bd8ce632483228d298cf07c7eb41c)
Summary:
- Add forked_from to SessionMeta/SessionConfiguredEvent and persist it
for forked sessions.
- Surface forked_from in /status for tui + tui2 and add snapshots.
We’re introducing a new SKILL.toml to hold skill metadata so Codex can
deliver a richer Skills experience.
Initial focus is the interface block:
```
[interface]
display_name = "Optional user-facing name"
short_description = "Optional user-facing description"
icon_small = "./assets/small-400px.png"
icon_large = "./assets/large-logo.svg"
brand_color = "#3B82F6"
default_prompt = "Optional surrounding prompt to use the skill with"
```
All fields are exposed via the app server API.
display_name and short_description are consumed by the TUI.
Emit the following events around the collab tools. On the `app-server`
this will be under `item/started` and `item/completed`
```
#[derive(Debug, Clone, Deserialize, Serialize, PartialEq, JsonSchema, TS)]
pub struct CollabAgentSpawnBeginEvent {
/// Identifier for the collab tool call.
pub call_id: String,
/// Thread ID of the sender.
pub sender_thread_id: ThreadId,
/// Initial prompt sent to the agent. Can be empty to prevent CoT leaking at the
/// beginning.
pub prompt: String,
}
#[derive(Debug, Clone, Deserialize, Serialize, PartialEq, JsonSchema, TS)]
pub struct CollabAgentSpawnEndEvent {
/// Identifier for the collab tool call.
pub call_id: String,
/// Thread ID of the sender.
pub sender_thread_id: ThreadId,
/// Thread ID of the newly spawned agent, if it was created.
pub new_thread_id: Option<ThreadId>,
/// Initial prompt sent to the agent. Can be empty to prevent CoT leaking at the
/// beginning.
pub prompt: String,
/// Last known status of the new agent reported to the sender agent.
pub status: AgentStatus,
}
#[derive(Debug, Clone, Deserialize, Serialize, PartialEq, JsonSchema, TS)]
pub struct CollabAgentInteractionBeginEvent {
/// Identifier for the collab tool call.
pub call_id: String,
/// Thread ID of the sender.
pub sender_thread_id: ThreadId,
/// Thread ID of the receiver.
pub receiver_thread_id: ThreadId,
/// Prompt sent from the sender to the receiver. Can be empty to prevent CoT
/// leaking at the beginning.
pub prompt: String,
}
#[derive(Debug, Clone, Deserialize, Serialize, PartialEq, JsonSchema, TS)]
pub struct CollabAgentInteractionEndEvent {
/// Identifier for the collab tool call.
pub call_id: String,
/// Thread ID of the sender.
pub sender_thread_id: ThreadId,
/// Thread ID of the receiver.
pub receiver_thread_id: ThreadId,
/// Prompt sent from the sender to the receiver. Can be empty to prevent CoT
/// leaking at the beginning.
pub prompt: String,
/// Last known status of the receiver agent reported to the sender agent.
pub status: AgentStatus,
}
#[derive(Debug, Clone, Deserialize, Serialize, PartialEq, JsonSchema, TS)]
pub struct CollabWaitingBeginEvent {
/// Thread ID of the sender.
pub sender_thread_id: ThreadId,
/// Thread ID of the receiver.
pub receiver_thread_id: ThreadId,
/// ID of the waiting call.
pub call_id: String,
}
#[derive(Debug, Clone, Deserialize, Serialize, PartialEq, JsonSchema, TS)]
pub struct CollabWaitingEndEvent {
/// Thread ID of the sender.
pub sender_thread_id: ThreadId,
/// Thread ID of the receiver.
pub receiver_thread_id: ThreadId,
/// ID of the waiting call.
pub call_id: String,
/// Last known status of the receiver agent reported to the sender agent.
pub status: AgentStatus,
}
#[derive(Debug, Clone, Deserialize, Serialize, PartialEq, JsonSchema, TS)]
pub struct CollabCloseBeginEvent {
/// Identifier for the collab tool call.
pub call_id: String,
/// Thread ID of the sender.
pub sender_thread_id: ThreadId,
/// Thread ID of the receiver.
pub receiver_thread_id: ThreadId,
}
#[derive(Debug, Clone, Deserialize, Serialize, PartialEq, JsonSchema, TS)]
pub struct CollabCloseEndEvent {
/// Identifier for the collab tool call.
pub call_id: String,
/// Thread ID of the sender.
pub sender_thread_id: ThreadId,
/// Thread ID of the receiver.
pub receiver_thread_id: ThreadId,
/// Last known status of the receiver agent reported to the sender agent before
/// the close.
pub status: AgentStatus,
}
```
### Motivation
- Landlock alone cannot prevent writes to sensitive in-repo files like
`.git/` when the repo root is writable, so explicit mount restrictions
are required for those paths.
- The sandbox must set up any mounts before calling Landlock so Landlock
can still be applied afterwards and the two mechanisms compose
correctly.
### Description
- Add a new `linux-sandbox` helper `apply_read_only_mounts` in
`linux-sandbox/src/mounts.rs` that: unshares namespaces, maps uids/gids
when required, makes mounts private, bind-mounts targets, and remounts
them read-only.
- Wire the mount step into the sandbox flow by calling
`apply_read_only_mounts(...)` before network/seccomp and before applying
Landlock rules in `linux-sandbox/src/landlock.rs`.
### Summary
* Added `mcpServer/refresh` command to inform app servers and active
threads to refresh mcpServer on next turn event.
* Added `pending_mcp_server_refresh_config` to codex core so that if the
value is populated, we reinitialize the mcp server manager on the thread
level.
* The config is updated on `mcpServer/refresh` command which we iterate
through threads and provide with the latest config value after last
write.
Add implementation for the `wait` tool.
For this we consider all status different from `PendingInit` and
`Running` as terminal. The `wait` tool call will return either after a
given timeout or when the tool reaches a non-terminal status.
A few points to note:
* The usage of a channel is preferred to prevent some races (just
looping on `get_status()` could "miss" a terminal status)
* The order of operations is very important, we need to first subscribe
and then check the last known status to prevent race conditions
* If the channel gets dropped, we return an error on purpose
- Merge ModelFamily into ModelInfo
- Remove logic for adding instructions to apply patch
- Add compaction limit and visible context window to `ModelInfo`
Add `thread/rollback` to app-server to support IDEs undo-ing the last N
turns of a thread.
For context, an IDE partner will be supporting an "undo" capability
where the IDE (the app-server client) will be responsible for reverting
the local changes made during the last turn. To support this well, we
also need a way to drop the last turn (or more generally, the last N
turns) from the agent's context. This is what `thread/rollback` does.
**Core idea**: A Thread rollback is represented as a persisted event
message (EventMsg::ThreadRollback) in the rollout JSONL file, not by
rewriting history. On resume, both the model's context (core replay) and
the UI turn list (app-server v2's thread history builder) apply these
markers so the pruned history is consistent across live conversations
and `thread/resume`.
Implementation notes:
- Rollback only affects agent context and appends to the rollout file;
clients are responsible for reverting files on disk.
- If a thread rollback is currently in progress, subsequent
`thread/rollback` calls are rejected.
- Because we use `CodexConversation::submit` and codex core tracks
active turns, returning an error on concurrent rollbacks is communicated
via an `EventMsg::Error` with a new variant
`CodexErrorInfo::ThreadRollbackFailed`. app-server watches for that and
sends the BAD_REQUEST RPC response.
Tests cover thread rollbacks in both core and app-server, including when
`num_turns` > existing turns (which clears all turns).
**Note**: this explicitly does **not** behave like `/undo` which we just
removed from the CLI, which does the opposite of what `thread/rollback`
does. `/undo` reverts local changes via ghost commits/snapshots and does
not modify the agent's context / conversation history.
What changed
- Added `outputSchema` support to the app-server APIs, mirroring `codex
exec --output-schema` behavior.
- V1 `sendUserTurn` now accepts `outputSchema` and constrains the final
assistant message for that turn.
- V2 `turn/start` now accepts `outputSchema` and constrains the final
assistant message for that turn (explicitly per-turn only).
Core behavior
- `Op::UserTurn` already supported `final_output_json_schema`; now V1
`sendUserTurn` forwards `outputSchema` into that field.
- `Op::UserInput` now carries `final_output_json_schema` for per-turn
settings updates; core maps it into
`SessionSettingsUpdate.final_output_json_schema` so it applies to the
created turn context.
- V2 `turn/start` does NOT persist the schema via `OverrideTurnContext`
(it’s applied only for the current turn). Other overrides
(cwd/model/etc) keep their existing persistent behavior.
API / docs
- `codex-rs/app-server-protocol/src/protocol/v1.rs`: add `output_schema:
Option<serde_json::Value>` to `SendUserTurnParams` (serialized as
`outputSchema`).
- `codex-rs/app-server-protocol/src/protocol/v2.rs`: add `output_schema:
Option<JsonValue>` to `TurnStartParams` (serialized as `outputSchema`).
- `codex-rs/app-server/README.md`: document `outputSchema` for
`turn/start` and clarify it applies only to the current turn.
- `codex-rs/docs/codex_mcp_interface.md`: document `outputSchema` for v1
`sendUserTurn` and v2 `turn/start`.
Tests added/updated
- New app-server integration tests asserting `outputSchema` is forwarded
into outbound `/responses` requests as `text.format`:
- `codex-rs/app-server/tests/suite/output_schema.rs`
- `codex-rs/app-server/tests/suite/v2/output_schema.rs`
- Added per-turn semantics tests (schema does not leak to the next
turn):
- `send_user_turn_output_schema_is_per_turn_v1`
- `turn_start_output_schema_is_per_turn_v2`
- Added protocol wire-compat tests for the merged op:
- serialize omits `final_output_json_schema` when `None`
- deserialize works when field is missing
- serialize includes `final_output_json_schema` when `Some(schema)`
Call site updates (high level)
- Updated all `Op::UserInput { .. }` constructions to include
`final_output_json_schema`:
- `codex-rs/app-server/src/codex_message_processor.rs`
- `codex-rs/core/src/codex_delegate.rs`
- `codex-rs/mcp-server/src/codex_tool_runner.rs`
- `codex-rs/tui/src/chatwidget.rs`
- `codex-rs/tui2/src/chatwidget.rs`
- plus impacted core tests.
Validation
- `just fmt`
- `cargo test -p codex-core`
- `cargo test -p codex-app-server`
- `cargo test -p codex-mcp-server`
- `cargo test -p codex-tui`
- `cargo test -p codex-tui2`
- `cargo test -p codex-protocol`
- `cargo clippy --all-features --tests --profile dev --fix -- -D
warnings`