## Why
The `notify` hook payload did not identify which Codex client started
the turn. That meant downstream notification hooks could not distinguish
between completions coming from the TUI and completions coming from
app-server clients such as VS Code or Xcode. Now that the Codex App
provides its own desktop notifications, it would be nice to be able to
filter those out.
This change adds that context without changing the existing payload
shape for callers that do not know the client name, and keeps the new
end-to-end test cross-platform.
## What changed
- added an optional top-level `client` field to the legacy `notify` JSON
payload
- threaded that value through `core` and `hooks`; the internal session
and turn state now carries it as `app_server_client_name`
- set the field to `codex-tui` for TUI turns
- captured `initialize.clientInfo.name` in the app server and applied it
to subsequent turns before dispatching hooks
- replaced the notify integration test hook with a `python3` script so
the test does not rely on Unix shell permissions or `bash`
- documented the new field in `docs/config.md`
## Testing
- `cargo test -p codex-hooks`
- `cargo test -p codex-tui`
- `cargo test -p codex-app-server
suite::v2::initialize::turn_start_notify_payload_includes_initialize_client_name
-- --exact --nocapture`
- `cargo test -p codex-core` (`src/lib.rs` passed; `core/tests/all.rs`
still has unrelated existing failures in this environment)
## Docs
The public config reference on `developers.openai.com/codex` should
mention that the legacy `notify` payload may include a top-level
`client` field. The TUI reports `codex-tui`, and the app server reports
`initialize.clientInfo.name` when it is available.
- replace show_nux with structured availability_nux model metadata
- expose availability NUX data through the app-server model API
- update shared fixtures and tests for the new field
- Update the cloud requirements cache TTL to 30 minutes.
- Add a background job to refresh the cache every 5 minutes.
- Ensure there is only one refresh job per process.
Addresses bug https://github.com/openai/codex/issues/12589
Builds on community PR #12763.
This adds `oauth_resource` support for MCP `streamable_http` servers and
wires it through the relevant config and login paths. It fixes the bug
where the configured OAuth resource was not reliably included in the
authorization request, causing MCP login to omit the expected
`resource` parameter.
## Summary
This changes `custom_tool_call_output` to use the same output payload
shape as `function_call_output`, so freeform tools can return either
plain text or structured content items.
The main goal is to let `js_repl` return image content from nested
`view_image` calls in its own `custom_tool_call_output`, instead of
relying on a separate injected message.
## What changed
- Changed `custom_tool_call_output.output` from `string` to
`FunctionCallOutputPayload`
- Updated freeform tool plumbing to preserve structured output bodies
- Updated `js_repl` to aggregate nested tool content items and attach
them to the outer `js_repl` result
- Removed the old `js_repl` special case that injected `view_image`
results as a separate pending user image message
- Updated normalization/history/truncation paths to handle multimodal
`custom_tool_call_output`
- Regenerated app-server protocol schema artifacts
## Behavior
Direct `view_image` calls still return a `function_call_output` with
image content.
When `view_image` is called inside `js_repl`, the outer `js_repl`
`custom_tool_call_output` now carries:
- an `input_text` item if the JS produced text output
- one or more `input_image` items from nested tool results
So the nested image result now stays inside the `js_repl` tool output
instead of being injected as a separate message.
## Compatibility
This is intended to be backward-compatible for resumed conversations.
Older histories that stored `custom_tool_call_output.output` as a plain
string still deserialize correctly, and older histories that used the
previous injected-image-message flow also continue to resume.
Added regression coverage for resuming a pre-change rollout containing:
- string-valued `custom_tool_call_output`
- legacy injected image message history
#### [git stack](https://github.com/magus/git-stack-cli)
- 👉 `1` https://github.com/openai/codex/pull/12948
## Summary
- add a dedicated /audio picker for realtime microphone and speaker
selection
- persist realtime audio choices and prompt to restart only local audio
when voice is live
- add snapshot coverage for the new picker surfaces
## Validation
- cargo test -p codex-tui
- cargo insta accept
- just fix -p codex-tui
- just fmt
## Summary\n- add a websocket test-server request waiter so tests can
synchronize on recorded client messages\n- use that waiter in the
realtime delegation test instead of a fixed audio timeout\n- add
temporary timing logs in the test and websocket mock to inspect where
the flake stalls
## Summary
This PR includes the session's local date and timezone in the
model-visible environment context and persists that data in
`TurnContextItem`.
## What changed
- captures the current local date and IANA timezone when building a turn
context, with a UTC fallback if the timezone lookup fails
- includes current_date and timezone in the serialized
<environment_context> payload
- stores those fields on TurnContextItem so they survive rollout/history
handling, subagent review threads, and resume flows
- treats date/timezone changes as environment updates, so prompt caching
and context refresh logic do not silently reuse stale time context
- updates tests to validate the new environment fields without depending
on a single hardcoded environment-context string
## test
built a local build and saw it in the rollout file:
```
{"timestamp":"2026-02-26T21:39:50.737Z","type":"response_item","payload":{"type":"message","role":"user","content":[{"type":"input_text","text":"<environment_context>\n <shell>zsh</shell>\n <current_date>2026-02-26</current_date>\n <timezone>America/Los_Angeles</timezone>\n</environment_context>"}]}}
```
## Why
`unix_escalation.rs` had a large inline `mod tests` block that made the
implementation harder to scan. This change moves those tests into a
sibling file while keeping them as a child module, so they can still
exercise private items without widening visibility.
## What Changed
- replaced the inline `#[cfg(test)] mod tests` block in
`codex-rs/core/src/tools/runtimes/shell/unix_escalation.rs` with a
path-based test module declaration
- moved the existing unit tests into
`codex-rs/core/src/tools/runtimes/shell/unix_escalation_tests.rs`
- kept the extracted tests using `super::...` imports so they continue
to access private helpers and types from `unix_escalation.rs`
## Testing
- `cargo test -p codex-core unix_escalation::tests`
## Why
`codex features list` currently prints features in declaration order
from `codex_core::features::FEATURES`. That makes the output harder to
scan when looking for a specific flag, and the order can change for
reasons unrelated to the CLI.
## What changed
- Sort the `codex features list` rows by feature key before printing
them in `codex-rs/cli/src/main.rs`.
- Add an integration test in `codex-rs/cli/tests/features.rs` that runs
`codex features list` and asserts the feature-name column is
alphabetized.
## Verification
- Added `features_list_is_sorted_alphabetically_by_feature_name`.
- Ran `cargo test -p codex-cli`.
Summary is a required parameter on UserTurn. Ideally we'd like the core
to decide the appropriate summary level.
Make the summary optional and don't send it when not needed.
## Summary
- make resume/fork targets explicit and typed as `SessionTarget { path,
thread_id }` (non-optional `thread_id`)
- resolve `thread_id` centrally via `resolve_session_thread_id(...)`:
- use CLI input directly when it is a UUID (`--resume <uuid>` / `--fork
<uuid>`)
- otherwise read `thread_id` from rollout `SessionMeta` for path-based
selections (picker, `--resume-last`, name-based resume/fork)
- use `thread_id` to read cwd from SQLite first during resume/fork cwd
resolution
- keep rollout fallback for cwd resolution when SQLite is unavailable or
does not return thread metadata (`TurnContext` tail, then `SessionMeta`)
- keep the resume picker open when a selected row has unreadable session
metadata, and show an inline recoverable error instead of aborting the
TUI
## Why
This removes ad-hoc rollout filename parsing and makes resume/fork
target identity explicit. The resume/fork cwd check can use indexed
SQLite lookup by `thread_id` in the common path, while preserving
rollout-based fallback behavior. It also keeps malformed legacy rows
recoverable in the picker instead of letting a selection failure unwind
the app.
## Notes
- minimal TUI-only change; no schema/protocol changes
- includes TUI test coverage for SQLite cwd precedence when `thread_id`
is available
- includes TUI regression coverage for picker inline error rendering /
non-fatal unreadable session rows
## Codex author
`codex resume 019c9205-7f8b-7173-a2a2-f082d4df3de3`
## Why
Before this change, an escalation approval could say that a command
should be rerun, but it could not carry the sandbox configuration that
should still apply when the escalated command is actually spawned.
That left an unsafe gap in the `zsh-fork` skill path: skill scripts
under `scripts/` that did not declare permissions could be escalated
without a sandbox, and scripts that did declare permissions could lose
their bounded sandbox on rerun or cached session approval.
This PR extends the escalation protocol so approvals can optionally
carry sandbox configuration all the way through execution. That lets the
shell runtime preserve the intended sandbox instead of silently widening
access.
We likely want a single permissions type for this codepath eventually,
probably centered on `Permissions`. For now, the protocol needs to
represent both the existing `PermissionProfile` form and the fuller
`Permissions` form, so this introduces a temporary disjoint union,
`EscalationPermissions`, to carry either one.
Further, this means that today, a skill either:
- does not declare any permissions, in which case it is run using the
default sandbox for the turn
- specifies permissions, in which case the skill is run using that exact
sandbox, which might be more restrictive than the default sandbox for
the turn
We will likely change the skill's permissions to be additive to the
existing permissions for the turn.
## What Changed
- Added `EscalationPermissions` to `codex-protocol` so escalation
requests can carry either a `PermissionProfile` or a full `Permissions`
payload.
- Added an explicit `EscalationExecution` mode to the shell escalation
protocol so reruns distinguish between `Unsandboxed`, `TurnDefault`, and
`Permissions(...)` instead of overloading `None`.
- Updated `zsh-fork` shell reruns to resolve `TurnDefault` at execution
time, which keeps ordinary `UseDefault` commands on the turn sandbox and
preserves turn-level macOS seatbelt profile extensions.
- Updated the `zsh-fork` skill path so a skill with no declared
permissions inherits the conversation's effective sandbox instead of
escalating unsandboxed.
- Updated the `zsh-fork` skill path so a skill with declared permissions
reruns with exactly those permissions, including when a cached session
approval is reused.
## Testing
- Added unit coverage in
`core/src/tools/runtimes/shell/unix_escalation.rs` for the explicit
`UseDefault` / `RequireEscalated` / `WithAdditionalPermissions`
execution mapping.
- Added unit coverage in
`core/src/tools/runtimes/shell/unix_escalation.rs` for macOS seatbelt
extension preservation in both the `TurnDefault` and
explicit-permissions rerun paths.
- Added integration coverage in `core/tests/suite/skill_approval.rs` for
permissionless skills inheriting the turn sandbox and explicit skill
permissions remaining bounded across cached approval reuse.
## Summary
- add tracing-based diagnostics for nested `codex.tool(...)` calls made
from `js_repl`
- emit a bounded, sanitized summary at `info!`
- emit the exact raw serialized response object or error string seen by
JavaScript at `trace!`
- document how to enable these logs and where to find them, especially
for `codex app-server`
## Why
Nested `codex.tool(...)` calls inside `js_repl` are a debugging
boundary: JavaScript sees the tool result, but that result is otherwise
hard to inspect from outside the kernel.
This change adds explicit tracing for that path using the repo’s normal
observability pattern:
- `info` for compact summaries
- `trace` for exact raw payloads when deep debugging is needed
## What changed
- `js_repl` now summarizes nested tool-call results across the response
shapes it can receive:
- message content
- function-call outputs
- custom tool outputs
- MCP tool results and MCP error results
- direct error strings
- each nested `codex.tool(...)` completion logs:
- `exec_id`
- `tool_call_id`
- `tool_name`
- `ok`
- a bounded summary struct describing the payload shape
- at `trace`, the same path also logs the exact serialized response
object or error string that JavaScript received
- docs now include concrete logging examples for `codex app-server`
- unit coverage was added for multimodal function output summaries and
error summaries
## How to use it
### Summary-only logging
Set:
```sh
RUST_LOG=codex_core::tools::js_repl=info
```
For `codex app-server`, tracing output is written to the server process
`stderr`.
Example:
```sh
RUST_LOG=codex_core::tools::js_repl=info \
LOG_FORMAT=json \
codex app-server \
2> /tmp/codex-app-server.log
```
This emits bounded summary lines for nested `codex.tool(...)` calls.
### Full raw debugging
Set:
```sh
RUST_LOG=codex_core::tools::js_repl=trace
```
Example:
```sh
RUST_LOG=codex_core::tools::js_repl=trace \
LOG_FORMAT=json \
codex app-server \
2> /tmp/codex-app-server.log
```
At `trace`, you get:
- the same `info` summary line
- a `trace` line with the exact serialized response object seen by
JavaScript
- or the exact error string if the nested tool call failed
### Where the logs go
For `codex app-server`, these logs go to process `stderr`, so redirect
or capture `stderr` to inspect them.
Example:
```sh
RUST_LOG=codex_core::tools::js_repl=trace \
LOG_FORMAT=json \
/Users/fjord/code/codex/codex-rs/target/debug/codex app-server \
2> /tmp/codex-app-server.log
```
Then inspect:
```sh
rg "js_repl nested tool call" /tmp/codex-app-server.log
```
Without an explicit `RUST_LOG` override, these `js_repl` nested
tool-call logs are typically not visible.
## Summary
- make `Config.model_reasoning_summary` optional so unset means use
model default
- resolve the optional config value to a concrete summary when building
`TurnContext`
- add protocol support for `default_reasoning_summary` in model metadata
## Validation
- `cargo test -p codex-core --lib client::tests -- --nocapture`
---------
Co-authored-by: Codex <noreply@openai.com>
Attempt to reduce disk usage in mac ci.
>off - This is the default for platforms with ELF binaries and
windows-gnu (not Windows MSVC and not macOS). This typically means that
DWARF debug information can be found in the final artifact in sections
of the executable. This option is not supported on Windows MSVC. On
macOS this options prevents the final execution of dsymutil to generate
debuginfo.
we recently changed file linking so the model uses markdown links when
it wants something to be clickable.
This works well across the GUI surfaces because they can render markdown
cleanly and use the full absolute path in the anchor target.
A previous pass hid the absolute path in the TUI (and only showed the
label), but that also meant we could lose useful location info when the
model put the line number or range in the anchor target instead of the
label.
This follow-up keeps the TUI behavior simple while making local file
links feel closer to the old TUI file reference style.
key changes:
- Local markdown file links in the TUI keep the old file-ref feel: code
styling, no underline, no visible absolute path.
- If the hidden local anchor target includes a location suffix and the
label does not already include one, we append that suffix to the visible
label.
- This works for single lines, line/column references, and ranges.
- If the label already includes the location, we leave it alone.
- normal web links keep the old TUI markdown-link behavior
some examples:
- `[foo.rs](/abs/path/foo.rs)` renders as `foo.rs`
- `[foo.rs](/abs/path/foo.rs:45)` renders as `foo.rs:45`
- `[foo.rs](/abs/path/foo.rs:45:3-48:9)` renders as `foo.rs:45:3-48:9`
- `[foo.rs:45](/abs/path/foo.rs:45)` stays `foo.rs:45`
- `[docs](https://example.com/docs)` still renders like a normal web
link
how it looks:
<img width="732" height="813" alt="Screenshot 2026-02-26 at 9 27 55 AM"
src="https://github.com/user-attachments/assets/d51bf236-653a-4e83-96e4-9427f0804471"
/>
## Summary
- bundle contextual prompt injection into at most one developer message
plus one contextual user message in both:
- per-turn settings updates
- initial context insertion
- preserve `<model_switch>` across compaction by rebuilding it through
canonical initial-context injection, instead of relying on
strip/reattach hacks
- centralize contextual user fragment detection in one shared definition
table and reuse it for parsing/compaction logic
- keep `AGENTS.md` in its natural serialized format:
- `# AGENTS.md instructions for {dirname}`
- `<INSTRUCTIONS>...</INSTRUCTIONS>`
- simplify related tests/helpers and accept the expected snapshot/layout
updates from bundled multi-part messages
## Why
The goal is to converge toward a simpler, more intentional prompt shape
where contextual updates are consistently represented as one developer
envelope plus one contextual user envelope, while keeping parsing and
compaction behavior aligned with that representation.
## Notable details
- the temporary `SettingsUpdateEnvelope` wrapper was removed; these
paths now return `Vec<ResponseItem>` directly
- local/remote compaction no longer rely on model-switch strip/restore
helpers
- contextual user detection is now driven by shared fragment definitions
instead of ad hoc matcher assembly
- AGENTS/user instructions are still the same logical context; only the
synthetic `<user_instructions>` wrapper was replaced by the natural
AGENTS text format
## Testing
- `just fmt`
- `cargo test -p codex-app-server
codex_message_processor::tests::extract_conversation_summary_prefers_plain_user_messages
-- --exact`
- `cargo test -p codex-core
compact::tests::collect_user_messages_filters_session_prefix_entries
--lib -- --exact`
- `cargo test -p codex-core --test all
'suite::compact::snapshot_request_shape_pre_turn_compaction_strips_incoming_model_switch'
-- --exact`
- `cargo test -p codex-core --test all
'suite::compact_remote::snapshot_request_shape_remote_pre_turn_compaction_strips_incoming_model_switch'
-- --exact`
- `cargo test -p codex-core --test all
'suite::client::includes_apps_guidance_as_developer_message_when_enabled'
-- --exact`
- `cargo test -p codex-core --test all
'suite::client::includes_developer_instructions_message_in_request' --
--exact`
- `cargo test -p codex-core --test all
'suite::client::includes_user_instructions_message_in_request' --
--exact`
- `cargo test -p codex-core --test all
'suite::client::resume_includes_initial_messages_and_sends_prior_items'
-- --exact`
- `cargo test -p codex-core --test all
'suite::review::review_input_isolated_from_parent_history' -- --exact`
- `cargo test -p codex-exec --test all
'suite::resume::exec_resume_last_respects_cwd_filter_and_all_flag' --
--exact`
- `cargo test -p core_test_support
context_snapshot::tests::full_text_mode_preserves_unredacted_text --
--exact`
## Notes
- I also ran several targeted `compact`, `compact_remote`,
`prompt_caching`, `model_visible_layout`, and `event_mapping` tests
while iterating on prompt-shape changes.
- I have not claimed a clean full-workspace `cargo test` from this
environment because local sandbox/resource conditions have previously
produced unrelated failures in large workspace runs.
Currently there is no bound on the length of a user message submitted in
the TUI or through the app server interface. That means users can paste
many megabytes of text, which can lead to bad performance, hangs, and
crashes. In extreme cases, it can lead to a [kernel
panic](https://github.com/openai/codex/issues/12323).
This PR limits the length of a user input to 2**20 (about 1M)
characters. This value was chosen because it fills the entire context
window on the latest models, so accepting longer inputs wouldn't make
sense anyway.
Summary
- add a shared `MAX_USER_INPUT_TEXT_CHARS` constant in codex-protocol
and surface it in TUI and app server code
- block oversized submissions in the TUI submit flow and emit error
history cells when validation fails
- reject heavy app-server requests with JSON-RPC `-32602` and structured
`input_too_large` data, plus document the behavior
Testing
- ran the IDE extension with this change and verified that when I
attempt to paste a user message that's several MB long, it correctly
reports an error instead of crashing or making my computer hot.
## Summary
- hide appended destinations for local path-style markdown links in the
TUI renderer
- keep web links rendering with their visible destination and style link
labels consistently
- add markdown renderer tests and a snapshot for the new file-link
output
## Testing
- just fmt
- cargo test -p codex-tui
<img width="1120" height="968" alt="image"
src="https://github.com/user-attachments/assets/490e8eda-ae47-4231-89fa-b254a1f83eed"
/>
## Summary
Lower the `js_repl` minimum Node version from `24.13.1` to `22.22.0`.
This updates the enforced minimum in `codex-rs/node-version.txt` and the
corresponding user-facing `/experimental` description for the JavaScript
REPL feature.
## Rationale
The previous `24.13.1` floor was stricter than necessary for `js_repl`.
I validated the REPL kernel behavior under Node `22.22.0` still works.
## Why `22.22.0`
`22.22.0` is a current, widely packaged Node 22 release across common
developer environments and distros, including Homebrew `node@22`, Fedora
`nodejs22`, Arch `nodejs-lts-jod`, and Debian testing. That makes it a
better exact floor than guessing at an older `22.x` patch we have not
validated.
`22.x` is also a maintenance branch that will be supported through April
2027, where the previous maintenance branch of `20.x` is only supported
through April of this year.
## Changes
- Update `codex-rs/node-version.txt` from `24.13.1` to `22.22.0`
- Update the `/experimental` JavaScript REPL description to say
`Requires Node >= v22.22.0 installed.`
# External (non-OpenAI) Pull Request Requirements
Before opening this Pull Request, please read the dedicated
"Contributing" markdown file or your PR may be closed:
https://github.com/openai/codex/blob/main/docs/contributing.md
If your PR conforms to our contribution guidelines, replace this text
with a detailed and high quality description of your changes.
Include a link to a bug report or enhancement request.