## Summary
This PR updates `request_user_input` behavior and Default-mode guidance
to match current collaboration-mode semantics and reduce model
confusion.
## Why
- `request_user_input` should be explicitly documented as **Plan-only**.
- Tool description and runtime availability checks should be driven by
the **same centralized mode policy**.
- Default mode prompt needed stronger execution guidance and explicit
instruction that `request_user_input` is unavailable.
- Error messages should report the **actual mode name** (not aliases
that can read as misleading).
## What changed
- Centralized `request_user_input` mode policy in `core` handler logic:
- Added a single allowed-modes config (`Plan` only).
- Reused that policy for:
- runtime rejection messaging
- tool description text
- Updated tool description to include availability constraint:
- `"This tool is only available in Plan mode."`
- Updated runtime rejection behavior:
- `Default` -> `"request_user_input is unavailable in Default mode"`
- `Execute` -> `"request_user_input is unavailable in Execute mode"`
- `PairProgramming` -> `"request_user_input is unavailable in Pair
Programming mode"`
- Strengthened Default collaboration prompt:
- Added explicit execution-first behavior
- Added assumptions-first guidance
- Added explicit `request_user_input` unavailability instruction
- Added concise progress-reporting expectations
- Simplified formatting implementation:
- Inlined allowed-mode name collection into `format_allowed_modes()`
- Kept `format_allowed_modes()` output for 3+ modes as CSV style
(`modes: a,b,c`)
One of our partners flagged that they were seeing the wrong order of
events when running `review/start` with command exec approvals:
```
{"method":"item/commandExecution/requestApproval","id":0,"params":{"threadId":"019c0b6b-6a42-7c02-99c4-98c80e88ac27","turnId":"0","itemId":"0","reason":"`/bin/zsh -lc 'git show b7a92b4eacf262c575f26b1e1ed621a357642e55 --stat'` requires approval: Xcode-required approval: Require explicit user confirmation for all commands.","proposedExecpolicyAmendment":null}}
{"method":"item/started","params":{"item":{"type":"commandExecution","id":"call_AEjlbHqLYNM7kbU3N6uw1CNi","command":"/bin/zsh -lc 'git show b7a92b4eacf262c575f26b1e1ed621a357642e55 --stat'","cwd":"/Users/devingreen/Desktop/SampleProject","processId":null,"status":"inProgress","commandActions":[{"type":"unknown","command":"git show b7a92b4eacf262c575f26b1e1ed621a357642e55 --stat"}],"aggregatedOutput":null,"exitCode":null,"durationMs":null},"threadId":"019c0b6b-6a42-7c02-99c4-98c80e88ac27","turnId":"0"}}
```
**Key fix**: In the review sub‑agent delegate we were forwarding exec
(and patch) approvals using the parent turn id (`parent_ctx.sub_id`) as
the approval call_id. That made
`item/commandExecution/requestApproval.itemId` differ from the actual
`item/started` id. We now forward the sub‑agent’s `call_id` from the
approval event instead, so the approval item id matches the
commandExecution item id in review flows.
Here’s the expected event order for an inline `review/start` that
triggers an exec approval after this fix:
1. Response to review/start (JSON‑RPC response)
- Includes `turn` (status inProgress) and `review_thread_id` (same as
parent thread for inline).
2. `turn/started` notification
- turnId is the review turn id (e.g., "0").
3. `item/started` → EnteredReviewMode
- item.id == turnId, marks entry into review mode.
4. `item/started` → commandExecution
- item.id == <call_id> (e.g., "review-call-1"), status: inProgress.
5. `item/commandExecution/requestApproval` request
- JSON‑RPC request (not a notification).
- params.itemId == <call_id> and params.turnId == turnId.
6. Client replies to approval request (Approved / Declined / etc).
7. If approved:
- Optional `item/commandExecution/outputDelta` notifications.
- `item/completed` → commandExecution with status and exitCode.
8. Review finishes:
- `item/started` → ExitedReviewMode
- `item/completed` → ExitedReviewMode
- (Agent message items may also appear, depending on review output.)
9. `turn/completed` notification
The key being #4 and #5 are now in the proper order with the correct
item id.
This updates our generated TypeScript types to be more correct with how
the server actually behaves, **specifically for JSON-RPC requests**.
Before this PR, we'd generate `field: T | null`. After this PR, we will
have `field?: T | null`. The latter matches how the server actually
works, in that if an optional field is omitted, the server will treat it
as null. This also makes it less annoying in theory for clients to
upgrade to newer versions of Codex, since adding a new optional field to
a JSON-RPC request should not require a client change.
NOTE: This only applies to JSON-RPC requests. All other payloads (i.e.
responses, notifications) will return `field: T | null` as usual.
In light of https://github.com/openai/codex/pull/10317, because
`.agents` can include resources that Codex can run in a privileged way,
it should be read-only by default just as `.codex` is.
Summary
- mark the shell-related tools as supporting parallel tool calls so
exec_command, shell_command, etc. can run concurrently
- update expectations in tool parallelism tests to reflect the new
parallel behavior
- drop the unused serial duration helper from the suite
Testing
- Not run (not requested)
## Summary
This PR simplifies collaboration modes to the visible set `default |
plan`, while preserving backward compatibility for older partners that
may still send legacy mode
names.
Specifically:
- Renames the old Code behavior to **Default**.
- Keeps **Plan** as-is.
- Removes **Custom** mode behavior (fallbacks now resolve to Default).
- Keeps `PairProgramming` and `Execute` internally for compatibility
plumbing, while removing them from schema/API and UI visibility.
- Adds legacy input aliasing so older clients can still send old mode
names.
## What Changed
1. Mode enum and compatibility
- `ModeKind` now uses `Plan` + `Default` as active/public modes.
- `ModeKind::Default` deserialization accepts legacy values:
- `code`
- `pair_programming`
- `execute`
- `custom`
- `PairProgramming` and `Execute` variants remain in code but are hidden
from protocol/schema generation.
- `Custom` variant is removed; previous custom fallbacks now map to
`Default`.
2. Collaboration presets and templates
- Built-in presets now return only:
- `Plan`
- `Default`
- Template rename:
- `core/templates/collaboration_mode/code.md` -> `default.md`
- `execute.md` and `pair_programming.md` remain on disk but are not
surfaced in visible preset lists.
3. TUI updates
- Updated user-facing naming and prompts from “Code” to “Default”.
- Updated mode-cycle and indicator behavior to reflect only visible
`Plan` and `Default`.
- Updated corresponding tests and snapshots.
4. request_user_input behavior
- `request_user_input` remains allowed only in `Plan` mode.
- Rejection messaging now consistently treats non-plan modes as
`Default`.
5. Schemas
- Regenerated config and app-server schemas.
- Public schema types now advertise mode values as:
- `plan`
- `default`
## Backward Compatibility Notes
- Incoming legacy mode names (`code`, `pair_programming`, `execute`,
`custom`) are accepted and coerced to `default`.
- Outgoing/public schema surfaces intentionally expose only `plan |
default`.
- This allows tolerant ingestion of older partner payloads while
standardizing new integrations on the reduced mode set.
## Codex author
`codex fork 019c1fae-693b-7840-b16e-9ad38ea0bd00`
type clash; app-server generated types were still using the v1
snake_case `WebSearchAction`, so there was a mismatch between the
camelCase emitted types and the snake_case types we were trying to
parse.
Updated v2 `WebSearchAction` to export into the `v2/` type set and
updated `ThreadItem` to use that.
### Tests
Ran new `just write-app-server-schema` to surface changes to schema, the
import looks correct now.
Summary
- remove the extra transaction guard that checked for existing dynamic
tools per thread before inserting new ones
- insert each tool record with `ON CONFLICT(thread_id, position) DO
NOTHING` to ignore duplicates instead of pre-querying
- simplify execution to use the shared pool directly and avoid unneeded
commits
Testing
- Not run (not requested)
## Summary
This introduces the first working foundation for Codex managed secrets:
a small Rust crate that can securely store and retrieve secrets locally.
Concretely, it adds a `codex-secrets` crate that:
- encrypts a local secrets file using `age`
- generates a high-entropy encryption key
- stores that key in the OS keyring
## What this enables
- A secure local persistence model for secrets
- A clean, isolated place for future provider backends
- A clear boundary: Codex can become a credential broker without putting
plaintext secrets in config files
## Implementation details
- New crate: `codex-rs/secrets/`
- Encryption: `age` with scrypt recipient/identity
- Key generation: `OsRng` (32 random bytes)
- Key storage: OS keyring via `codex-keyring-store`
## Testing
- `cd codex-rs && just fmt`
- `cd codex-rs && cargo test -p codex-secrets`
## Summary
Vendor Bubblewrap into the repo and add minimal build plumbing in
`codex-linux-sandbox` to compile/link it.
## Why
We want to move Linux sandboxing toward Bubblewrap, but in a safe
two-step rollout:
1) vendoring/build setup (this PR),
2) runtime integration (follow-up PR).
## Included
- Add `codex-rs/vendor/bubblewrap` sources.
- Add build-time FFI path in `codex-rs/linux-sandbox`.
- Update `build.rs` rerun tracking for vendored files.
- Small vendored compile warning fix (`sockaddr_nl` full init).
follow up in https://github.com/openai/codex/pull/9938
The current bug template uses CLI-specific instructions for getting the
version.
The current feature template doesn't ask the user to provide the Codex
variant (surface) they are using.
This PR addresses these problems.
Two fixes:
1. Include trailing tool output in the total context size calculation.
Otherwise when checking whether compaction should run we ignore newly
added outputs.
2. Trim trailing tool output/tool calls until we can fit the request
into the model context size. Otherwise the compaction endpoint will fail
to compact. We only trim items that can be reproduced again by the model
(tool calls, tool call outputs).
###### Summary
- Add input_modalities to model metadata so clients can determine
supported input types.
- Gate image paste/attach in TUI when the selected model does not
support images.
- Block submits that include images for unsupported models and show a
clear warning.
- Propagate modality metadata through app-server protocol/model-list
responses.
- Update related tests/fixtures.
###### Rationale
- Models support different input modalities.
- Clients need an explicit capability signal to prevent unsupported
requests.
- Backward-compatible defaults preserve existing behavior when modality
metadata is absent.
###### Scope
- codex-rs/protocol, codex-rs/core, codex-rs/tui
- codex-rs/app-server-protocol, codex-rs/app-server
- Generated app-server types / schema fixtures
###### Trade-offs
- Default behavior assumes text + image when field is absent for
compatibility.
- Server-side validation remains the source of truth.
###### Follow-up
- Non-TUI clients should consume input_modalities to disable unsupported
attachments.
- Model catalogs should explicitly set input_modalities for text-only
models.
###### Testing
- cargo fmt --all
- cargo test -p codex-tui
- env -u GITHUB_APP_KEY cargo test -p codex-core --lib
- just write-app-server-schema
- cargo run -p codex-cli --bin codex -- app-server generate-ts --out
app-server-types
- test against local backend
<img width="695" height="199" alt="image"
src="https://github.com/user-attachments/assets/d22dd04f-5eba-4db9-a7c5-a2506f60ec44"
/>
---------
Co-authored-by: Josh McKinney <joshka@openai.com>
### What
add wiring for `phase` field on `ResponseItem::Message` to lay
groundwork for differentiating model preambles and final messages.
currently optional.
follows pattern in #9698.
updated schemas with `just write-app-server-schema` so we can see type
changes.
### Tests
Updated existing tests for SSE parsing and hydrating from history
# External (non-OpenAI) Pull Request Requirements
Before opening this Pull Request, please read the dedicated
"Contributing" markdown file or your PR may be closed:
https://github.com/openai/codex/blob/main/docs/contributing.md
If your PR conforms to our contribution guidelines, replace this text
with a detailed and high quality description of your changes.
Include a link to a bug report or enhancement request.
- Hide the "Worked for" label in the final message separator unless
elapsed time is over one minute.\n- Update/add tests to cover both
hidden (<60s) and shown (>=61s) behavior.
This PR addresses #10395
When a user is asked to pick the trust level of a project, the code
currently reloads the config if they select "trusted". It doesn't reload
the config in the "untrusted" case but should. This causes the sandbox
mode to be reported incorrectly in `/status` during the first run (it's
displayed as `read-only` even though it acts as though it's
`workspace-write`).
We started working with MCP in Codex before
https://crates.io/crates/rmcp was mature, so we had our own crate for
MCP types that was generated from the MCP schema:
8b95d3e082/codex-rs/mcp-types/README.md
Now that `rmcp` is more mature, it makes more sense to use their MCP
types in Rust, as they handle details (like the `_meta` field) that our
custom version ignored. Though one advantage that our custom types had
is that our generated types implemented `JsonSchema` and `ts_rs::TS`,
whereas the types in `rmcp` do not. As such, part of the work of this PR
is leveraging the adapters between `rmcp` types and the serializable
types that are API for us (app server and MCP) introduced in #10356.
Note this PR results in a number of changes to
`codex-rs/app-server-protocol/schema`, which merit special attention
during review. We must ensure that these changes are still
backwards-compatible, which is possible because we have:
```diff
- export type CallToolResult = { content: Array<ContentBlock>, isError?: boolean, structuredContent?: JsonValue, };
+ export type CallToolResult = { content: Array<JsonValue>, structuredContent?: JsonValue, isError?: boolean, _meta?: JsonValue, };
```
so `ContentBlock` has been replaced with the more general `JsonValue`.
Note that `ContentBlock` was defined as:
```typescript
export type ContentBlock = TextContent | ImageContent | AudioContent | ResourceLink | EmbeddedResource;
```
so the deletion of those individual variants should not be a cause of
great concern.
Similarly, we have the following change in
`codex-rs/app-server-protocol/schema/typescript/Tool.ts`:
```
- export type Tool = { annotations?: ToolAnnotations, description?: string, inputSchema: ToolInputSchema, name: string, outputSchema?: ToolOutputSchema, title?: string, };
+ export type Tool = { name: string, title?: string, description?: string, inputSchema: JsonValue, outputSchema?: JsonValue, annotations?: JsonValue, icons?: Array<JsonValue>, _meta?: JsonValue, };
```
so:
- `annotations?: ToolAnnotations` ➡️ `JsonValue`
- `inputSchema: ToolInputSchema` ➡️ `JsonValue`
- `outputSchema?: ToolOutputSchema` ➡️ `JsonValue`
and two new fields: `icons?: Array<JsonValue>, _meta?: JsonValue`
---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/10349).
* #10357
* __->__ #10349
* #10356
## Summary
This PR improves the TUI experience for `request_user_input` by
rendering submitted question/answer sets directly in conversation
history with clear, structured formatting.
It also intentionally simplifies interrupt behavior for now: on `Esc` /
`Ctrl+C`, the questions overlay interrupts the turn without attempting
to submit partial answers.
<img width="1344" height="573" alt="Screenshot 2026-02-02 at 4 51 40 PM"
src="https://github.com/user-attachments/assets/ff752131-7060-44c1-9ded-af061969a533"
/>
## Scope
- TUI-only changes.
- No core/protocol/app-server behavior changes in this PR.
- Resume reconstruction of interrupted question sets is out of scope for
this PR.
## What Changed
- Added a new history cell: `RequestUserInputResultCell` in
`codex-rs/tui/src/history_cell.rs`.
- On normal `request_user_input` submission, TUI now inserts that
history cell immediately after sending `Op::UserInputAnswer`.
- Rendering includes a `Questions` header with `answered/total` count.
- Rendering shows each question as a bullet item.
- Rendering styles submitted answer lines in cyan.
- Rendering styles notes (for option questions) as `note:` lines in
cyan.
- Rendering styles freeform text (for no-option questions) as `answer:`
lines in cyan.
- Rendering dims only the `(unanswered)` suffix.
- Rendering can include an interrupted suffix and summary text when the
cell is marked interrupted.
- Rendering redacts secret questions as `••••••` instead of showing raw
values.
- Added `wrap_with_prefix(...)` in `history_cell.rs` for wrapped
prefixed lines.
- Added `split_request_user_input_answer(...)` in `history_cell.rs` for
decoding `"user_note: ..."` entries.
## Interrupt Behavior (Intentional for this PR)
- `Esc` / `Ctrl+C` in the questions overlay now performs `Op::Interrupt`
and exits the overlay.
- It does **not** submit partial/committed answers on interrupt.
- Added TODO comments in `request_user_input` overlay interrupt paths
indicating where interrupted partial result emission should be
reintroduced once core support is finalized.
- Queued `request_user_input` overlays are discarded on interrupt in the
current behavior.
## Tests Updated
- Updated/added overlay tests in
`codex-rs/tui/src/bottom_pane/request_user_input/mod.rs` to reflect
interrupt-only behavior.
- Added helper assertion for interrupt-only event expectation.
- Existing submission-path tests now validate history insertion behavior
and expected answer maps.
## Behavior Notes
- Completed question flows now produce a readable `Questions` block in
transcript history.
- Interrupted flows currently do not persist partial answers to
model-visible tool output.
## Follow-ups
- Reintroduce partial-answer-on-interrupt semantics once core can
persist/sequence interrupted `request_user_input` outputs safely.
- Optionally add replay/resume rendering for interrupted question sets
as a separate PR.
## Codex author
`codex fork 019bfb8d-2a65-7313-9be2-ea7100d19a61`
## Summary
This fixes a UX bug (https://github.com/openai/codex/issues/10442) where
the **"Implement this plan?"** prompt could reappear after switching
agents with `/agent` and then switching back to the original agent
during plan execution.
## Root Cause
On thread switch, the TUI rebuilds `ChatWidget`, replays buffered thread
events, then drains any queued live events.
In this flow, a `TurnComplete` can be handled twice for the same logical
turn:
1. replayed (`from_replay = true`)
2. then live (`from_replay = false`)
`ChatWidget` used `saw_plan_item_this_turn` to decide whether to show
the plan implementation prompt, but that flag was only reset on
`TurnStarted`.
If duplicate completion events occurred, stale `saw_plan_item_this_turn
= true` could cause the prompt to re-trigger unexpectedly.
## Fix
- Clear `saw_plan_item_this_turn` at the end of `on_task_complete`,
after prompt gating runs.
- This keeps the flag truly turn-scoped and prevents duplicate
`TurnComplete` handling from reopening the prompt.
adds basic git context to the session prefix so the model can anchor git
actions and be a bit more version-aware. structured it in a
multiroot-friendly shape even though we only have one root today
Persist thread_dynamic_tools in sqlite and read first from it. Fall back
to rollout files if it's not found. Persist dynamic tools to both sqlite
and rollout files.
Saw that new sessions get populated to db correctly & old sessions get
backfilled correctly at startup:
```
celia@com-92114 codex-rs % sqlite3 ~/.codex/state.sqlite \ "select thread_id, position,name,description,input_schema from thread_dynamic_tools;"
019c0cad-ec0d-74b2-a787-e8b33a349117|0|geo_lookup|lookup a city|{"properties":{"city":{"type":"string"}},"required":["city"],"type":"object"}
....
019c10ca-aa4b-7620-ae40-c0919fbd7ea7|0|geo_lookup|lookup a city|{"properties":{"city":{"type":"string"}},"required":["city"],"type":"object"}
```
fixes https://github.com/openai/codex/issues/10160 and some more.
## Description
Hardens Git command safety to prevent approval bypasses for destructive
or write-capable invocations (branch delete, risky push forms,
output/config-override flags), so these commands no longer auto-run as
“safe.”
- `git branch -d` variants (especially in worktrees / with global
options like -C / -c)
- `git show|diff|log --output` ... style file-write flags
- risky Git config override flags (-c, --config-env) that can trigger
external execution
- dangerous push forms that weren’t fully caught (`--force*`,
`--delete`, `+refspec`, `:refspec`)
- grouped short-flag delete forms (e.g. stacked branch flags containing
`d/D`)
will fast follow with a common git policy to bring windows to parity.
---------
Co-authored-by: Eric Traut <etraut@openai.com>
Adding a `--experimental` flag to the `generate-ts` fct in the
app-sever.
It can be called through one of those 2 command
```
just write-app-server-schema --experimental
codex app-server generate-ts --experimental
```
## Summary
- Make typed slash commands become text elements when the user hits
space, including paste‑burst spaces.
- Enable `/plan` to accept inline args and submit them in plan mode,
mirroring `/review` behavior and blocking submission while a task is
running.
- Preserve text elements/attachments for slash commands that take args.
<img width="1510" height="500" alt="image"
src="https://github.com/user-attachments/assets/446024df-b69a-4249-85db-1a85110e07f1"
/>
## Changes
- Add safe helper to insert element ranges in the textarea.
- Extend command‑with‑args pipeline to carry text elements and reuse
submission prep.
- Update `/plan` dispatch to switch to plan mode then submit prompt +
elements.
- Document new composer behavior and add tests.
## Notes
- `/plan` is blocked during active tasks (same as `/review`).
- Slash‑command elementization recognizes built‑ins and `/prompts:`
custom commands only.
## Codex author
`codex fork 019c16d3-4520-7bb0-9b9d-48720d40a8ab`