## Summary
- add `approvals_reviewer = "user" | "guardian_subagent"` as the runtime
control for who reviews approval requests
- route Smart Approvals guardian review through core for command
execution, file changes, managed-network approvals, MCP approvals, and
delegated/subagent approval flows
- expose guardian review in app-server with temporary unstable
`item/autoApprovalReview/{started,completed}` notifications carrying
`targetItemId`, `review`, and `action`
- update the TUI so Smart Approvals can be enabled from `/experimental`,
aligned with the matching `/approvals` mode, and surfaced clearly while
reviews are pending or resolved
## Runtime model
This PR does not introduce a new `approval_policy`.
Instead:
- `approval_policy` still controls when approval is needed
- `approvals_reviewer` controls who reviewable approval requests are
routed to:
- `user`
- `guardian_subagent`
`guardian_subagent` is a carefully prompted reviewer subagent that
gathers relevant context and applies a risk-based decision framework
before approving or denying the request.
The `smart_approvals` feature flag is a rollout/UI gate. Core runtime
behavior keys off `approvals_reviewer`.
When Smart Approvals is enabled from the TUI, it also switches the
current `/approvals` settings to the matching Smart Approvals mode so
users immediately see guardian review in the active thread:
- `approval_policy = on-request`
- `approvals_reviewer = guardian_subagent`
- `sandbox_mode = workspace-write`
Users can still change `/approvals` afterward.
Config-load behavior stays intentionally narrow:
- plain `smart_approvals = true` in `config.toml` remains just the
rollout/UI gate and does not auto-set `approvals_reviewer`
- the deprecated `guardian_approval = true` alias migration does
backfill `approvals_reviewer = "guardian_subagent"` in the same scope
when that reviewer is not already configured there, so old configs
preserve their original guardian-enabled behavior
ARC remains a separate safety check. For MCP tool approvals, ARC
escalations now flow into the configured reviewer instead of always
bypassing guardian and forcing manual review.
## Config stability
The runtime reviewer override is stable, but the config-backed
app-server protocol shape is still settling.
- `thread/start`, `thread/resume`, and `turn/start` keep stable
`approvalsReviewer` overrides
- the config-backed `approvals_reviewer` exposure returned via
`config/read` (including profile-level config) is now marked
`[UNSTABLE]` / experimental in the app-server protocol until we are more
confident in that config surface
## App-server surface
This PR intentionally keeps the guardian app-server shape narrow and
temporary.
It adds generic unstable lifecycle notifications:
- `item/autoApprovalReview/started`
- `item/autoApprovalReview/completed`
with payloads of the form:
- `{ threadId, turnId, targetItemId, review, action? }`
`review` is currently:
- `{ status, riskScore?, riskLevel?, rationale? }`
- where `status` is one of `inProgress`, `approved`, `denied`, or
`aborted`
`action` carries the guardian action summary payload from core when
available. This lets clients render temporary standalone pending-review
UI, including parallel reviews, even when the underlying tool item has
not been emitted yet.
These notifications are explicitly documented as `[UNSTABLE]` and
expected to change soon.
This PR does **not** persist guardian review state onto `thread/read`
tool items. The intended follow-up is to attach guardian review state to
the reviewed tool item lifecycle instead, which would improve
consistency with manual approvals and allow thread history / reconnect
flows to replay guardian review state directly.
## TUI behavior
- `/experimental` exposes the rollout gate as `Smart Approvals`
- enabling it in the TUI enables the feature and switches the current
session to the matching Smart Approvals `/approvals` mode
- disabling it in the TUI clears the persisted `approvals_reviewer`
override when appropriate and returns the session to default manual
review when the effective reviewer changes
- `/approvals` still exposes the reviewer choice directly
- the TUI renders:
- pending guardian review state in the live status footer, including
parallel review aggregation
- resolved approval/denial state in history
## Scope notes
This PR includes the supporting core/runtime work needed to make Smart
Approvals usable end-to-end:
- shell / unified-exec / apply_patch / managed-network / MCP guardian
review
- delegated/subagent approval routing into guardian review
- guardian review risk metadata and action summaries for app-server/TUI
- config/profile/TUI handling for `smart_approvals`, `guardian_approval`
alias migration, and `approvals_reviewer`
- a small internal cleanup of delegated approval forwarding to dedupe
fallback paths and simplify guardian-vs-parent approval waiting (no
intended behavior change)
Out of scope for this PR:
- redesigning the existing manual approval protocol shapes
- persisting guardian review state onto app-server `ThreadItem`s
- delegated MCP elicitation auto-review (the current delegated MCP
guardian shim only covers the legacy `RequestUserInput` path)
---------
Co-authored-by: Codex <noreply@openai.com>
## Description
This PR expands tracing coverage across app-server thread startup, core
session initialization, and the Responses transport layer. It also gives
core dispatch spans stable operation-specific names so traces are easier
to follow than the old generic `submission_dispatch` spans.
Also use `fmt::Display` for types that we serialize in traces so we send
strings instead of rust types
- add model and reasoning effort to app-server collab spawn items and
notifications
- regenerate app-server protocol schemas for the new fields
---------
Co-authored-by: Codex <noreply@openai.com>
## Why
to support a new bring your own search tool in Responses
API(https://developers.openai.com/api/docs/guides/tools-tool-search#client-executed-tool-search)
we migrating our bm25 search tool to use official way to execute search
on client and communicate additional tools to the model.
## What
- replace the legacy `search_tool_bm25` flow with client-executed
`tool_search`
- add protocol, SSE, history, and normalization support for
`tool_search_call` and `tool_search_output`
- return namespaced Codex Apps search results and wire namespaced
follow-up tool calls back into MCP dispatch
- include the requested sub-agent model and reasoning effort in the
spawn begin event\n- render that metadata next to the spawned agent name
and role in the TUI transcript
---------
Co-authored-by: Codex <noreply@openai.com>
## Summary
- add `skill_approval` to `RejectConfig` and the app-server v2
`AskForApproval::Reject` payload so skill-script prompts can be
configured independently from sandbox and rule-based prompts
- update Unix shell escalation to reject prompts based on the actual
decision source, keeping prefix rules tied to `rules`, unmatched command
fallbacks tied to `sandbox_approval`, and skill scripts tied to
`skill_approval`
- regenerate the affected protocol/config schemas and expand
unit/integration coverage for the new flag and skill approval behavior
Summary
- drop `McpToolOutput` in favor of `CallToolResult`, moving its helpers
to keep MCP tooling focused on the final result shape
- wire the new schema definitions through code mode, context, handlers,
and spec modules so MCP tools serialize the exact output shape expected
by the model
- extend code mode tests to cover multiple MCP call scenarios and ensure
the serialized data matches the new schema
- refresh JS runner helpers and protocol models alongside the schema
changes
Testing
- Not run (not requested)
- collect input/output transcript deltas into active handoff transcript
state
- attach and clear that transcript on each handoff, and regenerate
schema/tests
(Experimental)
This PR adds a first MVP for hooks, with SessionStart and Stop
The core design is:
- hooks live in a dedicated engine under codex-rs/hooks
- each hook type has its own event-specific file
- hook execution is synchronous and blocks normal turn progression while
running
- matching hooks run in parallel, then their results are aggregated into
a normalized HookRunSummary
On the AppServer side, hooks are exposed as operational metadata rather
than transcript-native items:
- new live notifications: hook/started, hook/completed
- persisted/replayed hook results live on Turn.hookRuns
- we intentionally did not add hook-specific ThreadItem variants
Hooks messages are not persisted, they remain ephemeral. The context
changes they add are (they get appended to the user's prompt)
## Summary
- preserve unknown `:special_path` tokens, including nested entries, so
older Codex builds warn and ignore instead of failing config load
- fail closed with a startup warning when a permissions profile has
missing or empty filesystem entries instead of aborting profile
compilation
- normalize Windows verbatim paths like `\?\C:\...` before absolute-path
validation while keeping explicit errors for truly invalid paths
## Testing
- just fmt
- cargo test -p codex-core permissions_profiles_allow
- cargo test -p codex-core
normalize_absolute_path_for_platform_simplifies_windows_verbatim_paths
- cargo test -p codex-protocol
unknown_special_paths_are_ignored_by_legacy_bridge
- cargo clippy -p codex-core -p codex-protocol --all-targets -- -D
warnings
- cargo clean
## Summary
This is a fast follow to the initial `[permissions]` structure.
- keep the new split-policy carveout behavior for narrower non-write
entries under broader writable roots
- preserve legacy `WorkspaceWrite` semantics by using a cwd-aware bridge
that drops only redundant nested readable roots when projecting from
`SandboxPolicy`
- route the legacy macOS seatbelt adapter through that same legacy
bridge so redundant nested readable roots do not become read-only
carveouts on macOS
- derive the legacy bridge for `command_exec` using the sandbox root cwd
rather than the request cwd so policy derivation matches later sandbox
enforcement
- add regression coverage for the legacy macOS nested-readable-root case
## Examples
### Legacy `workspace-write` on macOS
A legacy `workspace-write` policy can redundantly list a nested readable
root under an already-writable workspace root.
For example, legacy config can effectively mean:
- workspace root (`.` / `cwd`) is writable
- `docs/` is also listed in `readable_roots`
The new shared split-policy helper intentionally treats a narrower
non-write entry under a broader writable root as a carveout for real
`[permissions]` configs. Without this fast follow, the unchanged macOS
seatbelt legacy adapter could project that legacy shape into a
`FileSystemSandboxPolicy` that treated `docs/` like a read-only carveout
under the writable workspace root. In practice, legacy callers on macOS
could unexpectedly lose write access inside `docs/`, even though that
path was writable before the `[permissions]` migration work.
This change fixes that by routing the legacy seatbelt path through the
cwd-aware legacy bridge, so:
- legacy `workspace-write` keeps `docs/` writable when `docs/` was only
a redundant readable root
- explicit `[permissions]` entries like `'.' = 'write'` and `'docs' =
'read'` still make `docs/` read-only, which is the new intended
split-policy behavior
### Legacy `command_exec` with a subdirectory cwd
`command_exec` can run a command from a request cwd that is narrower
than the sandbox root cwd.
For example:
- sandbox root cwd is `/repo`
- request cwd is `/repo/subdir`
- legacy policy is still `workspace-write` rooted at `/repo`
Before this fast follow, `command_exec` derived the legacy bridge using
the request cwd, but the sandbox was later built using the sandbox root
cwd. That mismatch could miss redundant legacy readable roots during
projection and accidentally reintroduce read-only carveouts for paths
that should still be writable under the legacy model.
This change fixes that by deriving the legacy bridge with the same
sandbox root cwd that sandbox enforcement later uses.
## Verification
- `just fmt`
- `cargo test -p codex-core
seatbelt_legacy_workspace_write_nested_readable_root_stays_writable`
- `cargo test -p codex-core test_sandbox_config_parsing`
- `cargo clippy -p codex-core -p codex-app-server --all-targets -- -D
warnings`
- `cargo clean`
## Summary
We need to support allowing request_permissions calls when using
`Reject` policy
<img width="1133" height="588" alt="Screenshot 2026-03-09 at 12 06
40 PM"
src="https://github.com/user-attachments/assets/a8df987f-c225-4866-b8ab-5590960daec5"
/>
Note that this is a backwards-incompatible change for Reject policy. I'm
not sure if we need to add a default based on our current use/setup
## Testing
- [x] Added tests
- [x] Tested locally
## Summary
request_permissions flows should support persisting results for the
session.
Open Question: Still deciding if we need within-turn approvals - this
adds complexity but I could see it being useful
## Testing
- [x] Updated unit tests
---------
Co-authored-by: Codex <noreply@openai.com>
## Summary
- remove the remaining model-visible guardian-specific `on-request`
prompt additions so enabling the feature does not change the main
approval-policy instructions
- neutralize user-facing guardian wording to talk about automatic
approval review / approval requests rather than a second reviewer or
only sandbox escalations
- tighten guardian retry-context handling so agent-authored
`justification` stays in the structured action JSON and is not also
injected as raw retry context
- simplify guardian review plumbing in core by deleting dead
prompt-append paths and trimming some request/transcript setup code
## Notable Changes
- delete the dead `permissions/approval_policy/guardian.md` append path
and stop threading `guardian_approval_enabled` through model-facing
developer-instruction builders
- rename the experimental feature copy to `Automatic approval review`
and update the `/experimental` snapshot text accordingly
- make approval-review status strings generic across shell, patch,
network, and MCP review types
- forward real sandbox/network retry reasons for shell and unified-exec
guardian review, but do not pass agent-authored justification as raw
retry context
- simplify `guardian.rs` by removing the one-field request wrapper,
deduping reasoning-effort selection, and cleaning up transcript entry
collection
## Testing
- `just fmt`
- full validation left to CI
---------
Co-authored-by: Codex <noreply@openai.com>
Adds a built-in `request_permissions` tool and wires it through the
Codex core, protocol, and app-server layers so a running turn can ask
the client for additional permissions instead of relying on a static
session policy.
The new flow emits a `RequestPermissions` event from core, tracks the
pending request by call ID, forwards it through app-server v2 as an
`item/permissions/requestApproval` request, and resumes the tool call
once the client returns an approved subset of the requested permission
profile.
## Why
A restricted filesystem policy that grants `:root` read or write access
but also carries explicit deny entries should still behave like scoped
access with carveouts, not like unrestricted disk access.
Without that distinction, later platform backends cannot preserve
blocked subpaths under root-level permissions because the protocol layer
reports the policy as fully unrestricted.
## What changed
- taught `FileSystemSandboxPolicy` to treat root access plus explicit
deny entries as scoped access rather than full-disk access
- derived readable and writable roots from the filesystem root when root
access is combined with carveouts, while preserving the denied paths as
read-only subpaths
- added protocol coverage for root-write policies with carveouts and a
core sandboxing regression so those policies still require platform
sandboxing
## Verification
- added protocol coverage in `protocol/src/permissions.rs` and
`protocol/src/protocol.rs` for root access with explicit carveouts
- added platform-sandbox regression coverage in
`core/src/sandboxing/mod.rs`
- verified the current PR state with `just clippy`
---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/13452).
* #13453
* __->__ #13452
* #13451
* #13449
* #13448
* #13445
* #13440
* #13439
---------
Co-authored-by: viyatb-oai <viyatb@openai.com>
## Why
The Linux sandbox helper still only accepted the legacy `SandboxPolicy`
payload.
That meant the runtime could compute split filesystem and network
policies, but the helper would immediately collapse them back to the
compatibility projection before applying seccomp or staging the
bubblewrap inner command.
## What changed
- added hidden `--file-system-sandbox-policy` and
`--network-sandbox-policy` flags alongside the legacy `--sandbox-policy`
flag so the helper can migrate incrementally
- updated the core-side Landlock wrapper to pass the split policies
explicitly when launching `codex-linux-sandbox`
- added helper-side resolution logic that accepts either the legacy
policy alone or a complete split-policy pair and normalizes that into
one effective configuration
- switched Linux helper network decisions to use `NetworkSandboxPolicy`
directly
- added `FromStr` support for the split policy types so the helper can
parse them from CLI JSON
## Verification
- added helper coverage in `linux-sandbox/src/linux_run_main_tests.rs`
for split-policy flags and policy resolution
- added CLI argument coverage in `core/src/landlock.rs`
- verified the current PR state with `just clippy`
---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/13449).
* #13453
* #13452
* #13451
* __->__ #13449
* #13448
* #13445
* #13440
* #13439
---------
Co-authored-by: viyatb-oai <viyatb@openai.com>
## Summary
- add the guardian reviewer flow for `on-request` approvals in command,
patch, sandbox-retry, and managed-network approval paths
- keep guardian behind `features.guardian_approval` instead of exposing
a public `approval_policy = guardian` mode
- route ordinary `OnRequest` approvals to the guardian subagent when the
feature is enabled, without changing the public approval-mode surface
## Public model
- public approval modes stay unchanged
- guardian is enabled via `features.guardian_approval`
- when that feature is on, `approval_policy = on-request` keeps the same
approval boundaries but sends those approval requests to the guardian
reviewer instead of the user
- `/experimental` only persists the feature flag; it does not rewrite
`approval_policy`
- CLI and app-server no longer expose a separate `guardian` approval
mode in this PR
## Guardian reviewer
- the reviewer runs as a normal subagent and reuses the existing
subagent/thread machinery
- it is locked to a read-only sandbox and `approval_policy = never`
- it does not inherit user/project exec-policy rules
- it prefers `gpt-5.4` when the current provider exposes it, otherwise
falls back to the parent turn's active model
- it fail-closes on timeout, startup failure, malformed output, or any
other review error
- it currently auto-approves only when `risk_score < 80`
## Review context and policy
- guardian mirrors `OnRequest` approval semantics rather than
introducing a separate approval policy
- explicit `require_escalated` requests follow the same approval surface
as `OnRequest`; the difference is only who reviews them
- managed-network allowlist misses that enter the approval flow are also
reviewed by guardian
- the review prompt includes bounded recent transcript history plus
recent tool call/result evidence
- transcript entries and planned-action strings are truncated with
explicit `<guardian_truncated ... />` markers so large payloads stay
bounded
- apply-patch reviews include the full patch content (without
duplicating the structured `changes` payload)
- the guardian request layout is snapshot-tested using the same
model-visible Responses request formatter used elsewhere in core
## Guardian network behavior
- the guardian subagent inherits the parent session's managed-network
allowlist when one exists, so it can use the same approved network
surface while reviewing
- exact session-scoped network approvals are copied into the guardian
session with protocol/port scope preserved
- those copied approvals are now seeded before the guardian's first turn
is submitted, so inherited approvals are available during any immediate
review-time checks
## Out of scope / follow-ups
- the sandbox-permission validation split was pulled into a separate PR
and is not part of this diff
- a future follow-up can enable `serde_json` preserve-order in
`codex-core` and then simplify the guardian action rendering further
---------
Co-authored-by: Codex <noreply@openai.com>
## Why
`#13434` and `#13439` introduce split filesystem and network policies,
but the only code that could answer basic filesystem questions like "is
access effectively unrestricted?" or "which roots are readable and
writable for this cwd?" still lived on the legacy `SandboxPolicy` path.
That would force later backends to either keep projecting through
`SandboxPolicy` or duplicate path-resolution logic. This PR moves those
queries onto `FileSystemSandboxPolicy` itself so later runtime and
platform changes can consume the split policy directly.
## What changed
- added `FileSystemSandboxPolicy` helpers for full-read/full-write
checks, platform-default reads, readable roots, writable roots, and
explicit unreadable roots resolved against a cwd
- added a shared helper for the default read-only carveouts under
writable roots so the legacy and split-policy paths stay aligned
- added protocol coverage for full-access detection and derived
readable, writable, and unreadable roots
## Verification
- added protocol coverage in `protocol/src/protocol.rs` and
`protocol/src/permissions.rs` for full-root access and derived
filesystem roots
- verified the current PR state with `just clippy`
---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/13440).
* #13453
* #13452
* #13451
* #13449
* #13448
* #13445
* __->__ #13440
* #13439
---------
Co-authored-by: viyatb-oai <viyatb@openai.com>
## Why
`#13434` introduces split `FileSystemSandboxPolicy` and
`NetworkSandboxPolicy`, but the runtime still made most execution-time
sandbox decisions from the legacy `SandboxPolicy` projection.
That projection loses information about combinations like unrestricted
filesystem access with restricted network access. In practice, that
means the runtime can choose the wrong platform sandbox behavior or set
the wrong network-restriction environment for a command even when config
has already separated those concerns.
This PR carries the split policies through the runtime so sandbox
selection, process spawning, and exec handling can consult the policy
that actually matters.
## What changed
- threaded `FileSystemSandboxPolicy` and `NetworkSandboxPolicy` through
`TurnContext`, `ExecRequest`, sandbox attempts, shell escalation state,
unified exec, and app-server exec overrides
- updated sandbox selection in `core/src/sandboxing/mod.rs` and
`core/src/exec.rs` to key off `FileSystemSandboxPolicy.kind` plus
`NetworkSandboxPolicy`, rather than inferring behavior only from the
legacy `SandboxPolicy`
- updated process spawning in `core/src/spawn.rs` and the platform
wrappers to use `NetworkSandboxPolicy` when deciding whether to set
`CODEX_SANDBOX_NETWORK_DISABLED`
- kept additional-permissions handling and legacy `ExternalSandbox`
compatibility projections aligned with the split policies, including
explicit user-shell execution and Windows restricted-token routing
- updated callers across `core`, `app-server`, and `linux-sandbox` to
pass the split policies explicitly
## Verification
- added regression coverage in `core/tests/suite/user_shell_cmd.rs` to
verify `RunUserShellCommand` does not inherit
`CODEX_SANDBOX_NETWORK_DISABLED` from the active turn
- added coverage in `core/src/exec.rs` for Windows restricted-token
sandbox selection when the legacy projection is `ExternalSandbox`
- updated Linux sandbox coverage in
`linux-sandbox/tests/suite/landlock.rs` to exercise the split-policy
exec path
- verified the current PR state with `just clippy`
---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/13439).
* #13453
* #13452
* #13451
* #13449
* #13448
* #13445
* #13440
* __->__ #13439
---------
Co-authored-by: viyatb-oai <viyatb@openai.com>
This fixes a schema export bug where two different `WebSearchAction`
types were getting merged under the same name in the app-server v2 JSON
schema bundle.
The problem was that v2 thread items use the app-server API's
`WebSearchAction` with camelCase variants like `openPage`, while
`ThreadResumeParams.history` and
`RawResponseItemCompletedNotification.item` pull in the upstream
`ResponseItem` graph, which uses the Responses API snake_case shape like
`open_page`. During bundle generation we were flattening nested
definitions into the v2 namespace by plain name, so the later definition
could silently overwrite the earlier one.
That meant clients generating code from the bundled schema could end up
with the wrong `WebSearchAction` definition for v2 thread history. In
practice this shows up on web search items reconstructed from rollout
files with persisted extended history.
This change does two things:
- Gives the upstream Responses API schema a distinct JSON schema name:
`ResponsesApiWebSearchAction`
- Makes namespace-level schema definition collisions fail loudly instead
of silently overwriting
Previously, we could only configure whether web search was on/off.
This PR enables sending along a web search config, which includes all
the stuff responsesapi supports: filters, location, etc.
## Why
`SandboxPolicy` currently mixes together three separate concerns:
- parsing layered config from `config.toml`
- representing filesystem sandbox state
- carrying basic network policy alongside filesystem choices
That makes the existing config awkward to extend and blocks the new TOML
proposal where `[permissions]` becomes a table of named permission
profiles selected by `default_permissions`. (The idea is that if
`default_permissions` is not specified, we assume the user is opting
into the "traditional" way to configure the sandbox.)
This PR adds the config-side plumbing for those profiles while still
projecting back to the legacy `SandboxPolicy` shape that the current
macOS and Linux sandbox backends consume.
It also tightens the filesystem profile model so scoped entries only
exist for `:project_roots`, and so nested keys must stay within a
project root instead of using `.` or `..` traversal.
This drops support for the short-lived `[permissions.network]` in
`config.toml` because now that would be interpreted as a profile named
`network` within `[permissions]`.
## What Changed
- added `PermissionsToml`, `PermissionProfileToml`,
`FilesystemPermissionsToml`, and `FilesystemPermissionToml` so config
can parse named profiles under `[permissions.<profile>.filesystem]`
- added top-level `default_permissions` selection, validation for
missing or unknown profiles, and compilation from a named profile into
split `FileSystemSandboxPolicy` and `NetworkSandboxPolicy` values
- taught config loading to choose between the legacy `sandbox_mode` path
and the profile-based path without breaking legacy users
- introduced `codex-protocol::permissions` for the split filesystem and
network sandbox types, and stored those alongside the legacy projected
`sandbox_policy` in runtime `Permissions`
- modeled `FileSystemSpecialPath` so only `ProjectRoots` can carry a
nested `subpath`, matching the intended config syntax instead of
allowing invalid states for other special paths
- restricted scoped filesystem maps to `:project_roots`, with validation
that nested entries are non-empty descendant paths and cannot use `.` or
`..` to escape the project root
- kept existing runtime consumers working by projecting
`FileSystemSandboxPolicy` back into `SandboxPolicy`, with an explicit
error for profiles that request writes outside the workspace root
- loaded proxy settings from top-level `[network]`
- regenerated `core/config.schema.json`
## Verification
- added config coverage for profile deserialization,
`default_permissions` selection, top-level `[network]` loading, network
enablement, rejection of writes outside the workspace root, rejection of
nested entries for non-`:project_roots` special paths, and rejection of
parent-directory traversal in `:project_roots` maps
- added protocol coverage for the legacy bridge rejecting non-workspace
writes
## Docs
- update the Codex config docs on developers.openai.com/codex to
document named `[permissions.<profile>]` entries, `default_permissions`,
scoped `:project_roots` syntax, the descendant-path restriction for
nested `:project_roots` entries, and top-level `[network]` proxy
configuration
---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/13434).
* #13453
* #13452
* #13451
* #13449
* #13448
* #13445
* #13440
* #13439
* __->__ #13434
#### What
Add structured `@plugin` parsing and TUI support for plugin mentions.
- Core: switch from plain-text `@display_name` parsing to structured
`plugin://...` mentions via `UserInput::Mention` and
`[$...](plugin://...)` links in text, same pattern as apps/skills.
- TUI: add plugin mention popup, autocomplete, and chips when typing
`$`. Load plugin capability summaries and feed them into the composer;
plugin mentions appear alongside skills and apps.
- Generalize mention parsing to a sigil parameter, still defaults to `$`
<img width="797" height="119" alt="image"
src="https://github.com/user-attachments/assets/f0fe2658-d908-4927-9139-73f850805ceb"
/>
Builds on #13510. Currently clients have to build their own `id` via
`plugin@marketplace` and filter plugins to show by `enabled`, but we
will add `id` and `available` as fields returned from `plugin/list`
soon.
####Tests
Added tests, verified locally.
## Summary
Today `SandboxPermissions::requires_additional_permissions()` does not
actually mean "is `WithAdditionalPermissions`". It returns `true` for
any non-default sandbox override, including `RequireEscalated`. That
broad behavior is relied on in multiple `main` callsites.
The naming is security-sensitive because `SandboxPermissions` is used on
shell-like tool calls to tell the executor how a single command should
relate to the turn sandbox:
- `UseDefault`: run with the turn sandbox unchanged
- `RequireEscalated`: request execution outside the sandbox
- `WithAdditionalPermissions`: stay sandboxed but widen permissions for
that command only
## Problem
The old helper name reads as if it only applies to the
`WithAdditionalPermissions` variant. In practice it means "this command
requested any explicit sandbox override."
That ambiguity made it easy to read production checks incorrectly and
made the guardian change look like a standalone `main` fix when it is
not.
On `main` today:
- `shell` and `unified_exec` intentionally reject any explicit
`sandbox_permissions` request unless approval policy is `OnRequest`
- `exec_policy` intentionally treats any explicit sandbox override as
prompt-worthy in restricted sandboxes
- tests intentionally serialize both `RequireEscalated` and
`WithAdditionalPermissions` as explicit sandbox override requests
So changing those callsites from the broad helper to a narrow
`WithAdditionalPermissions` check would be a behavior change, not a pure
cleanup.
## What This PR Does
- documents `SandboxPermissions` as a per-command sandbox override, not
a generic permissions bag
- adds `requests_sandbox_override()` for the broad meaning: anything
except `UseDefault`
- adds `uses_additional_permissions()` for the narrow meaning: only
`WithAdditionalPermissions`
- keeps `requires_additional_permissions()` as a compatibility alias to
the broad meaning for now
- updates the current broad callsites to use the accurately named broad
helper
- adds unit coverage that locks in the semantics of all three helpers
## What This PR Does Not Do
This PR does not change runtime behavior. That is intentional.
---------
Co-authored-by: Codex <noreply@openai.com>
## Summary
- delete the network proxy admin server and its runtime listener/task
plumbing
- remove the admin endpoint config, runtime, requirement, protocol,
schema, and debug-surface fields
- update proxy docs to reflect the remaining HTTP and SOCKS listeners
only
## Summary
This PR:
1. fixes a deserialization mismatch for macOS automation permissions in
approval payloads by making core parsing accept both supported wire
shapes for bundle IDs.
2. added `#[serde(default)]` to `MacOsSeatbeltProfileExtensions` so
omitted fields deserialize to secure defaults.
## Why this change is needed
`MacOsAutomationPermission` uses `#[serde(try_from =
"MacOsAutomationPermissionDe")]`, so deserialization is controlled by
`MacOsAutomationPermissionDe`. After we aligned v2
`additionalPermissions.macos.automations` to the core shape, approval
payloads started including `{ "bundle_ids": [...] }` in some paths.
`MacOsAutomationPermissionDe` previously accepted only `"none" | "all"`
or a plain array, so object-shaped bundle IDs failed with `data did not
match any variant of untagged enum MacOsAutomationPermissionDe`. This
change restores compatibility by accepting both forms while preserving
existing normalization behavior (trim values and map empty bundle lists
to `None`).
## Validation
saw this error went away when running
```
cargo run -p codex-app-server-test-client -- \
--codex-bin ./target/debug/codex \
-c 'approval_policy="on-request"' \
-c 'features.shell_zsh_fork=true' \
-c 'zsh_path="/tmp/codex-zsh-fork/package/vendor/aarch64-apple-darwin/zsh/macos-15/zsh"' \
send-message-v2 --experimental-api \
'Use $apple-notes and run scripts/notes_info now.'
```
:
```
Error: failed to deserialize ServerRequest from JSONRPCRequest
Caused by:
data did not match any variant of untagged enum MacOsAutomationPermissionDe
```
## Summary
This PR removes legacy macOS permission model types from
`codex-rs/protocol/src/models.rs`:
- `MacOsPermissions`
- `MacOsPreferencesValue`
- `MacOsAutomationValue`
The protocol now relies on the current `MacOsSeatbeltProfileExtensions`
model for macOS permission data.
This PR adds a durable trace linkage for each turn by storing the active
trace ID on the rollout TurnContext record stored in session rollout
files.
Before this change, we propagated trace context at runtime but didn’t
persist a stable per-turn trace key in rollout history. That made
after-the-fact debugging harder (for example, mapping a historical turn
to the corresponding trace in datadog). This sets us up for much easier
debugging in the future.
### What changed
- Added an optional `trace_id` to TurnContextItem (rollout schema).
- Added a small OTEL helper to read the current span trace ID.
- Captured `trace_id` when creating `TurnContext` and included it in
`to_turn_context_item()`.
- Updated tests and fixtures that construct TurnContextItem so
older/no-trace cases still work.
### Why this approach
TurnContext is already the canonical durable per-turn metadata in
rollout. This keeps ownership clean: trace linkage lives with other
persisted turn metadata.
This adds a first-class server request for MCP server elicitations:
`mcpServer/elicitation/request`.
Until now, MCP elicitation requests only showed up as a raw
`codex/event/elicitation_request` event from core. That made it hard for
v2 clients to handle elicitations using the same request/response flow
as other server-driven interactions (like shell and `apply_patch`
tools).
This also updates the underlying MCP elicitation request handling in
core to pass through the full MCP request (including URL and form data)
so we can expose it properly in app-server.
### Why not `item/mcpToolCall/elicitationRequest`?
This is because MCP elicitations are related to MCP servers first, and
only optionally to a specific MCP tool call.
In the MCP protocol, elicitation is a server-to-client capability: the
server sends `elicitation/create`, and the client replies with an
elicitation result. RMCP models it that way as well.
In practice an elicitation is often triggered by an MCP tool call, but
not always.
### What changed
- add `mcpServer/elicitation/request` to the v2 app-server API
- translate core `codex/event/elicitation_request` events into the new
v2 server request
- map client responses back into `Op::ResolveElicitation` so the MCP
server can continue
- update app-server docs and generated protocol schema
- add an end-to-end app-server test that covers the full round trip
through a real RMCP elicitation flow
- The new test exercises a realistic case where an MCP tool call
triggers an elicitation, the app-server emits
mcpServer/elicitation/request, the client accepts it, and the tool call
resumes and completes successfully.
### app-server API flow
- Client starts a thread with `thread/start`.
- Client starts a turn with `turn/start`.
- App-server sends `item/started` for the `mcpToolCall`.
- While that tool call is in progress, app-server sends
`mcpServer/elicitation/request`.
- Client responds to that request with `{ action: "accept" | "decline" |
"cancel" }`.
- App-server sends `serverRequest/resolved`.
- App-server sends `item/completed` for the mcpToolCall.
- App-server sends `turn/completed`.
- If the turn is interrupted while the elicitation is pending,
app-server still sends `serverRequest/resolved` before the turn
finishes.
add `web_search_tool_type` on model_info that can be populated from
backend. will be used to filter which models can use `web_search` with
images and which cant.
added small unit test.
## Summary
Changes the permission profile shape from a bare network boolean to a
nested object.
Before:
```yaml
permissions:
network: true
```
After:
```yaml
permissions:
network:
enabled: true
```
This also updates the shared Rust and app-server protocol types so
`PermissionProfile.network` is no longer `Option<bool>`, but
`Option<NetworkPermissions>` with `enabled: Option<bool>`.
## What Changed
- Updated `PermissionProfile` in `codex-rs/protocol/src/models.rs`:
- `pub network: Option<bool>` -> `pub network:
Option<NetworkPermissions>`
- Added `NetworkPermissions` with:
- `pub enabled: Option<bool>`
- Changed emptiness semantics so `network` is only considered empty when
`enabled` is `None`
- Updated skill metadata parsing to accept `permissions.network.enabled`
- Updated core permission consumers to read
`network.enabled.unwrap_or(false)` where a concrete boolean is needed
- Updated app-server v2 protocol types and regenerated schema/TypeScript
outputs
- Updated docs to mention `additionalPermissions.network.enabled`
## Summary
`PermissionProfile.network` could not be preserved when additional or
compiled permissions resolved to
`SandboxPolicy::ReadOnly`, because `ReadOnly` had no network_access
field. This change makes read-only + network
enabled representable directly and threads that through the protocol,
app-server v2 mirror, and permission-
merging logic.
## What changed
- Added `network_access: bool` to `SandboxPolicy::ReadOnly` in the core
protocol and app-server v2 protocol.
- Kept backward compatibility by defaulting the new field to false, so
legacy read-only payloads still
deserialize unchanged.
- Updated `has_full_network_access()` and sandbox summaries to respect
read-only network access.
- Preserved PermissionProfile.network when:
- compiling skill permission profiles into sandbox policies
- normalizing additional permissions
- merging additional permissions into existing sandbox policies
- Updated the approval overlay to show network in the rendered
permission rule when requested.
- Regenerated app-server schema fixtures for the new v2 wire shape.