Commit graph

329 commits

Author SHA1 Message Date
pash-openai
07e532dcb9
app-server service tier plumbing (plus some cleanup) (#13334)
followup to https://github.com/openai/codex/pull/13212 to expose fast
tier controls to app server
(majority of this PR is generated schema jsons - actual code is +69 /
-35 and +24 tests )

- add service tier fields to the app-server protocol surfaces used by
thread lifecycle, turn start, config, and session configured events
- thread service tier through the app-server message processor and core
thread config snapshots
- allow runtime config overrides to carry service tier for app-server
callers

cleanup:
- Removing useless "legacy" code supporting "standard" - we moved to
None | "fast", so "standard" is not needed.
2026-03-03 02:35:09 -08:00
pash-openai
2f5b01abd6
add fast mode toggle (#13212)
- add a local Fast mode setting in codex-core (similar to how model id
is currently stored on disk locally)
- send `service_tier=priority` on requests when Fast is enabled
- add `/fast` in the TUI and persist it locally
- feature flag
2026-03-02 20:29:33 -08:00
Ruslan Nigmatullin
9022cdc563
app-server: Silence thread status changes caused by thread being created (#13079)
Currently we emit `thread/status/changed` with `Idle` status right
before sending `thread/started` event (which also has `Idle` status in
it).
It feels that there is no point in that as client has no way to know
prior state of the thread as it didn't exist yet, so silence these kinds
of notifications.
2026-03-03 00:52:28 +00:00
Owen Lin
146b798129
fix(app-server): emit turn/started only when turn actually starts (#13261)
This is a follow-up for https://github.com/openai/codex/pull/13047

## Why
We had a race where `turn/started` could be observed before the thread
had actually transitioned to `Active`. This was because we eagerly
emitted `turn/started` in the request handler for `turn/start` (and
`review/start`).

That was showing up as flaky `thread/resume` tests, but the real issue
was broader: a client could see `turn/started` and still get back an
idle thread immediately afterward.

The first idea was to eagerly call
`thread_watch_manager.note_turn_started(...)` from the `turn/start`
request path. That turns out to be unsafe, because
`submit(Op::UserInput)` only queues work. If a turn starts and completes
quickly, request-path bookkeeping can race with the real lifecycle
events and leave stale running state behind.

**The real fix** is to move `turn/started` to emit only after the turn
_actually_ starts, so we do that by waiting for the
`EventMsg::TurnStarted` notification emitted by codex core. We do this
for both `turn/start` and `review/start`.

I also verified this change is safe for our first-party codex apps -
they don't have any assumptions that `turn/started` is emitted before
the RPC response to `turn/start` (which is correct anyway).

I also removed `single_client_mode` since it isn't really necessary now.

## Testing
- `cargo test -p codex-app-server thread_resume -- --nocapture`
- `cargo test -p codex-app-server
'suite::v2::turn_start::turn_start_emits_notifications_and_accepts_model_override'
-- --exact --nocapture`
- `cargo test -p codex-app-server`
2026-03-02 16:43:31 -08:00
Ahmed Ibrahim
b20b6aa46f
Update realtime websocket API (#13265)
- migrate the realtime websocket transport to the new session and
handoff flow
- make the realtime model configurable in config.toml and use API-key
auth for the websocket

---------

Co-authored-by: Codex <noreply@openai.com>
2026-03-02 16:05:40 -08:00
Owen Lin
d473e8d56d
feat(app-server): add tracing to all app-server APIs (#13285)
### Overview
This PR adds the first piece of tracing for app-server JSON-RPC
requests.

There are two main changes:
- JSON-RPC requests can now take an optional W3C trace context at the
top level via a `trace` field (`traceparent` / `tracestate`).
- app-server now creates a dedicated request span for every inbound
JSON-RPC request in `MessageProcessor`, and uses the request-level trace
context as the parent when present.

For compatibility with existing flows, app-server still falls back to
the TRACEPARENT env var when there is no request-level traceparent.

This PR is intentionally scoped to the app-server boundary. In a
followup, we'll actually propagate trace context through the async
handoff into core execution spans like run_turn, which will make
app-server traces much more useful.

### Spans
A few details on the app-server span shape:
- each inbound request gets its own server span
- span/resource names are based on the JSON-RPC method (`initialize`,
`thread/start`, `turn/start`, etc.)
- spans record transport (stdio vs websocket), request id, connection
id, and client name/version when available
- `initialize` stores client metadata in session state so later requests
on the same connection can reuse it
2026-03-02 16:01:41 -08:00
Ruslan Nigmatullin
14fcb6645c
app-server: Update thread/name/set to support not-loaded threads (#13282)
Currently `thread/name/set` does only work for loaded threads.
Expand the scope to also support persisted but not-yet-loaded ones for a
more predictable API surface.
This will make it possible to rename threads discovered via
`thread/list` and similar operations.
2026-03-02 15:13:18 -08:00
Thibault Sottiaux
c9cef6ba9e
[codex] include plan type in account updates (#13181)
This change fixes a Codex app account-state sync bug where clients could
know the user was signed in but still miss the ChatGPT subscription
tier, which could lead to incorrect upgrade messaging for paid users.

The root cause was that `account/updated` only carried `authMode` while
plan information was available separately via `account/read` and
rate-limit snapshots, so this update adds `planType` to
`account/updated`, populates it consistently across login and refresh
paths.
2026-03-01 13:43:37 -08:00
xl-openai
752402c4fe
feat: load from plugins (#12864)
Support loading plugins.

Plugins can now be enabled via [plugins.<name>] in config.toml. They are
loaded as first-class entities through PluginsManager, and their default
skills/ and .mcp.json contributions are integrated into the existing
skills and MCP flows.
2026-03-01 10:50:56 -08:00
Ruslan Nigmatullin
8c1e3f3e64
app-server: Add ephemeral field to Thread object (#13084)
Currently there is no alternative way to know that thread is ephemeral,
only client which did create it has the knowledge.
2026-02-27 17:42:25 -08:00
Owen Lin
8fa792868c
fix(app-server): make thread/start non-blocking (#13033)
Stop `thread/start` from blocking other app-server requests.

Before this change, `thread/start ran` inline on the request loop, so
slow startup paths like MCP auth checks could hold up unrelated requests
on the same connection, including `thread/loaded/list`. This moves
`thread/start` into a background task.

While doing so, it revealed an issue where we were doing nested locking
(and there were some race conditions possible that could introduce a
"phantom listener"). This PR also refactors the listener/subscription
bookkeeping - listener/subscription state is now centralized in
`ThreadStateManager` instead of being split across multiple lock
domains. That makes late auto-attach on `thread/start` race-safe and
avoids reintroducing disconnected clients as phantom subscribers.
2026-02-28 01:40:08 +00:00
Matthew Zeng
392fa7de50
[apps] Stablize app list updated event. (#13067)
Stablize app list updated event so that we only send 2 updates: 1 when
installed apps become available, one when all directory apps are
available. Previously it also updates when directory apps become
available before installed apps, which cuts off installed apps.
2026-02-27 15:23:24 -08:00
Ruslan Nigmatullin
69d7a456bb
app-server: Replay pending item requests on thread/resume (#12560)
Replay pending client requests after `thread/resume` and emit resolved
notifications when those requests clear so approval/input UI state stays
in sync after reconnects and across subscribed clients.

Affected RPCs:
- `item/commandExecution/requestApproval`
- `item/fileChange/requestApproval`
- `item/tool/requestUserInput`

Motivation:
- Resumed clients need to see pending approval/input requests that were
already outstanding before the reconnect.
- Clients also need an explicit signal when a pending request resolves
or is cleared so stale UI can be removed on turn start, completion, or
interruption.

Implementation notes:
- Use pending client requests from `OutgoingMessageSender` in order to
replay them after `thread/resume` attaches the connection, using
original request ids.
- Emit `serverRequest/resolved` when pending requests are answered
or cleared by lifecycle cleanup.
- Update the app-server protocol schema, generated TypeScript bindings,
and README docs for the replay/resolution flow.

High-level test plan:
- Added automated coverage for replaying pending command execution and
file change approval requests on `thread/resume`.
- Added automated coverage for resolved notifications in command
approval, file change approval, request_user_input, turn start, and turn
interrupt flows.
- Verified schema/docs updates in the relevant protocol and app-server
tests.

Manual testing:
- Tested reconnect/resume with multiple connections.
- Confirmed state stayed in sync between connections.
2026-02-27 12:45:59 -08:00
Michael Bolin
d09a7535ed
fix: use AbsolutePathBuf for permission profile file roots (#12970)
## Why
`PermissionProfile` should describe filesystem roots as absolute paths
at the type level. Using `PathBuf` in `FileSystemPermissions` made the
shared type too permissive and blurred together three different
deserialization cases:

- skill metadata in `agents/openai.yaml`, where relative paths should
resolve against the skill directory
- app-server API payloads, where callers should have to send absolute
paths
- local tool-call payloads for commands like `shell_command` and
`exec_command`, where `additional_permissions.file_system` may
legitimately be relative to the command `workdir`

This change tightens the shared model without regressing the existing
local command flow.

## What Changed
- changed `protocol::models::FileSystemPermissions` and the app-server
`AdditionalFileSystemPermissions` mirror to use `AbsolutePathBuf`
- wrapped skill metadata deserialization in `AbsolutePathBufGuard`, so
relative permission roots in `agents/openai.yaml` resolve against the
containing skill directory
- kept app-server/API deserialization strict, so relative
`additionalPermissions.fileSystem.*` paths are rejected at the boundary
- restored cwd/workdir-relative deserialization for local tool-call
payloads by parsing `shell`, `shell_command`, and `exec_command`
arguments under an `AbsolutePathBufGuard` rooted at the resolved command
working directory
- simplified runtime additional-permission normalization so it only
canonicalizes and deduplicates absolute roots instead of trying to
recover relative ones later
- updated the app-server schema fixtures, `app-server/README.md`, and
the affected transport/TUI tests to match the final behavior
2026-02-27 17:42:52 +00:00
Michael Bolin
e6cd75a684
notify: include client in legacy hook payload (#12968)
## Why

The `notify` hook payload did not identify which Codex client started
the turn. That meant downstream notification hooks could not distinguish
between completions coming from the TUI and completions coming from
app-server clients such as VS Code or Xcode. Now that the Codex App
provides its own desktop notifications, it would be nice to be able to
filter those out.

This change adds that context without changing the existing payload
shape for callers that do not know the client name, and keeps the new
end-to-end test cross-platform.

## What changed

- added an optional top-level `client` field to the legacy `notify` JSON
payload
- threaded that value through `core` and `hooks`; the internal session
and turn state now carries it as `app_server_client_name`
- set the field to `codex-tui` for TUI turns
- captured `initialize.clientInfo.name` in the app server and applied it
to subsequent turns before dispatching hooks
- replaced the notify integration test hook with a `python3` script so
the test does not rely on Unix shell permissions or `bash`
- documented the new field in `docs/config.md`

## Testing

- `cargo test -p codex-hooks`
- `cargo test -p codex-tui`
- `cargo test -p codex-app-server
suite::v2::initialize::turn_start_notify_payload_includes_initialize_client_name
-- --exact --nocapture`
- `cargo test -p codex-core` (`src/lib.rs` passed; `core/tests/all.rs`
still has unrelated existing failures in this environment)

## Docs

The public config reference on `developers.openai.com/codex` should
mention that the legacy `notify` payload may include a top-level
`client` field. The TUI reports `codex-tui`, and the app server reports
`initialize.clientInfo.name` when it is available.
2026-02-26 22:27:34 -08:00
Ahmed Ibrahim
4d180ae428
Add model availability NUX metadata (#12972)
- replace show_nux with structured availability_nux model metadata
- expose availability NUX data through the app-server model API
- update shared fixtures and tests for the new field
2026-02-26 22:02:57 -08:00
Eric Traut
cee009d117
Add oauth_resource handling for MCP login flows (#12866)
Addresses bug https://github.com/openai/codex/issues/12589

Builds on community PR #12763.

This adds `oauth_resource` support for MCP `streamable_http` servers and
wires it through the relevant config and login paths. It fixes the bug
where the configured OAuth resource was not reliably included in the
authorization request, causing MCP login to omit the expected
`resource` parameter.
2026-02-26 20:10:12 -08:00
Matthew Zeng
6fe3dc2e22
[apps] Improve app/list with force_fetch=true (#12745)
- [x] Improve app/list with force_fetch=true, we now keep cached
snapshot until both install apps and directory apps load.
2026-02-27 03:54:03 +00:00
Shijie Rao
8715a6ef84
Feat: cxa-1833 update model/list (#12958)
### Summary
Update `model/list` in app server to include more upgrade information.
2026-02-26 17:02:24 -08:00
pakrym-oai
951a389654
Allow clients not to send summary as an option (#12950)
Summary is a required parameter on UserTurn. Ideally we'd like the core
to decide the appropriate summary level.

Make the summary optional and don't send it when not needed.
2026-02-26 14:37:38 -08:00
Charley Cunningham
07aefffb1f
core: bundle settings diff updates into one dev/user envelope (#12417)
## Summary
- bundle contextual prompt injection into at most one developer message
plus one contextual user message in both:
  - per-turn settings updates
  - initial context insertion
- preserve `<model_switch>` across compaction by rebuilding it through
canonical initial-context injection, instead of relying on
strip/reattach hacks
- centralize contextual user fragment detection in one shared definition
table and reuse it for parsing/compaction logic
- keep `AGENTS.md` in its natural serialized format:
  - `# AGENTS.md instructions for {dirname}`
  - `<INSTRUCTIONS>...</INSTRUCTIONS>`
- simplify related tests/helpers and accept the expected snapshot/layout
updates from bundled multi-part messages

## Why
The goal is to converge toward a simpler, more intentional prompt shape
where contextual updates are consistently represented as one developer
envelope plus one contextual user envelope, while keeping parsing and
compaction behavior aligned with that representation.

## Notable details
- the temporary `SettingsUpdateEnvelope` wrapper was removed; these
paths now return `Vec<ResponseItem>` directly
- local/remote compaction no longer rely on model-switch strip/restore
helpers
- contextual user detection is now driven by shared fragment definitions
instead of ad hoc matcher assembly
- AGENTS/user instructions are still the same logical context; only the
synthetic `<user_instructions>` wrapper was replaced by the natural
AGENTS text format

## Testing
- `just fmt`
- `cargo test -p codex-app-server
codex_message_processor::tests::extract_conversation_summary_prefers_plain_user_messages
-- --exact`
- `cargo test -p codex-core
compact::tests::collect_user_messages_filters_session_prefix_entries
--lib -- --exact`
- `cargo test -p codex-core --test all
'suite::compact::snapshot_request_shape_pre_turn_compaction_strips_incoming_model_switch'
-- --exact`
- `cargo test -p codex-core --test all
'suite::compact_remote::snapshot_request_shape_remote_pre_turn_compaction_strips_incoming_model_switch'
-- --exact`
- `cargo test -p codex-core --test all
'suite::client::includes_apps_guidance_as_developer_message_when_enabled'
-- --exact`
- `cargo test -p codex-core --test all
'suite::client::includes_developer_instructions_message_in_request' --
--exact`
- `cargo test -p codex-core --test all
'suite::client::includes_user_instructions_message_in_request' --
--exact`
- `cargo test -p codex-core --test all
'suite::client::resume_includes_initial_messages_and_sends_prior_items'
-- --exact`
- `cargo test -p codex-core --test all
'suite::review::review_input_isolated_from_parent_history' -- --exact`
- `cargo test -p codex-exec --test all
'suite::resume::exec_resume_last_respects_cwd_filter_and_all_flag' --
--exact`
- `cargo test -p core_test_support
context_snapshot::tests::full_text_mode_preserves_unredacted_text --
--exact`

## Notes
- I also ran several targeted `compact`, `compact_remote`,
`prompt_caching`, `model_visible_layout`, and `event_mapping` tests
while iterating on prompt-shape changes.
- I have not claimed a clean full-workspace `cargo test` from this
environment because local sandbox/resource conditions have previously
produced unrelated failures in large workspace runs.
2026-02-26 00:12:08 -08:00
Eric Traut
28bfbb8f2b
Enforce user input length cap (#12823)
Currently there is no bound on the length of a user message submitted in
the TUI or through the app server interface. That means users can paste
many megabytes of text, which can lead to bad performance, hangs, and
crashes. In extreme cases, it can lead to a [kernel
panic](https://github.com/openai/codex/issues/12323).

This PR limits the length of a user input to 2**20 (about 1M)
characters. This value was chosen because it fills the entire context
window on the latest models, so accepting longer inputs wouldn't make
sense anyway.

Summary
- add a shared `MAX_USER_INPUT_TEXT_CHARS` constant in codex-protocol
and surface it in TUI and app server code
- block oversized submissions in the TUI submit flow and emit error
history cells when validation fails
- reject heavy app-server requests with JSON-RPC `-32602` and structured
`input_too_large` data, plus document the behavior

Testing
- ran the IDE extension with this change and verified that when I
attempt to paste a user message that's several MB long, it correctly
reports an error instead of crashing or making my computer hot.
2026-02-25 22:23:51 -08:00
Michael Bolin
14116ade8d
feat: include available decisions in command approval requests (#12758)
Command-approval clients currently infer which choices to show from
side-channel fields like `networkApprovalContext`,
`proposedExecpolicyAmendment`, and `additionalPermissions`. That makes
the request shape harder to evolve, and it forces each client to
replicate the server's heuristics instead of receiving the exact
decision list for the prompt.

This PR introduces a mapping between `CommandExecutionApprovalDecision`
and `codex_protocol::protocol::ReviewDecision`:

```rust
impl From<CoreReviewDecision> for CommandExecutionApprovalDecision {
    fn from(value: CoreReviewDecision) -> Self {
        match value {
            CoreReviewDecision::Approved => Self::Accept,
            CoreReviewDecision::ApprovedExecpolicyAmendment {
                proposed_execpolicy_amendment,
            } => Self::AcceptWithExecpolicyAmendment {
                execpolicy_amendment: proposed_execpolicy_amendment.into(),
            },
            CoreReviewDecision::ApprovedForSession => Self::AcceptForSession,
            CoreReviewDecision::NetworkPolicyAmendment {
                network_policy_amendment,
            } => Self::ApplyNetworkPolicyAmendment {
                network_policy_amendment: network_policy_amendment.into(),
            },
            CoreReviewDecision::Abort => Self::Cancel,
            CoreReviewDecision::Denied => Self::Decline,
        }
    }
}
```

And updates `CommandExecutionRequestApprovalParams` to have a new field:

```rust
available_decisions: Option<Vec<CommandExecutionApprovalDecision>>
```

when, if specified, should make it easier for clients to display an
appropriate list of options in the UI.

This makes it possible for `CoreShellActionProvider::prompt()` in
`unix_escalation.rs` to specify the `Vec<ReviewDecision>` directly,
adding support for `ApprovedForSession` when approving a skill script,
which was previously missing in the TUI.

Note this results in a significant change to `exec_options()` in
`approval_overlay.rs`, as the displayed options are now derived from
`available_decisions: &[ReviewDecision]`.

## What Changed

- Add `available_decisions` to
[`ExecApprovalRequestEvent`](de00e932dd/codex-rs/protocol/src/approvals.rs (L111-L175)),
including helpers to derive the legacy default choices when older
senders omit the field.
- Map `codex_protocol::protocol::ReviewDecision` to app-server
`CommandExecutionApprovalDecision` and expose the ordered list as
experimental `availableDecisions` in
[`CommandExecutionRequestApprovalParams`](de00e932dd/codex-rs/app-server-protocol/src/protocol/v2.rs (L3798-L3807)).
- Thread optional `available_decisions` through the core approval path
so Unix shell escalation can explicitly request `ApprovedForSession` for
session-scoped approvals instead of relying on client heuristics.
[`unix_escalation.rs`](de00e932dd/codex-rs/core/src/tools/runtimes/shell/unix_escalation.rs (L194-L214))
- Update the TUI approval overlay to build its buttons from the ordered
decision list, while preserving the legacy fallback when
`available_decisions` is missing.
- Update the app-server README, test client output, and generated schema
artifacts to document and surface the new field.

## Testing

- Add `approval_overlay.rs` coverage for explicit decision lists,
including the generic `ApprovedForSession` path and network approval
options.
- Update `chatwidget/tests.rs` and app-server protocol tests to populate
the new optional field and keep older event shapes working.

## Developers Docs

- If we document `item/commandExecution/requestApproval` on
[developers.openai.com/codex](https://developers.openai.com/codex), add
experimental `availableDecisions` as the preferred source of approval
choices and note that older servers may omit it.
2026-02-26 01:10:46 +00:00
Celia Chen
4f45668106
Revert "Add skill approval event/response (#12633)" (#12811)
This reverts commit https://github.com/openai/codex/pull/12633. We no
longer need this PR, because we favor sending normal exec command
approval server request with `additional_permissions` of skill
permissions instead
2026-02-26 01:02:42 +00:00
Charley Cunningham
2f4d6ded1d
Enable request_user_input in Default mode (#12735)
## Summary
- allow `request_user_input` in Default collaboration mode as well as
Plan
- update the Default-mode instructions to prefer assumptions first and
use `request_user_input` only when a question is unavoidable
- update request_user_input and app-server tests to match the new
Default-mode behavior
- refactor collaboration-mode availability plumbing into
`CollaborationModesConfig` for future mode-related flags

## Codex author
`codex resume 019c9124-ed28-7c13-96c6-b916b1c97d49`
2026-02-25 15:20:46 -08:00
Celia Chen
b6d20748e0
Revert "Ensure shell command skills trigger approval (#12697)" (#12721)
This reverts commit daf0f03ac8.

# External (non-OpenAI) Pull Request Requirements

Before opening this Pull Request, please read the dedicated
"Contributing" markdown file or your PR may be closed:
https://github.com/openai/codex/blob/main/docs/contributing.md

If your PR conforms to our contribution guidelines, replace this text
with a detailed and high quality description of your changes.

Include a link to a bug report or enhancement request.
2026-02-25 22:49:53 +00:00
Owen Lin
21f7032dbb
feat(app-server): thread/unsubscribe API (#10954)
Adds a new v2 app-server API for a client to be able to unsubscribe to a
thread:
- New RPC method: `thread/unsubscribe`
- New server notification: `thread/closed`

Today clients can start/resume/archive threads, but there wasn’t a way
to explicitly unload a live thread from memory without archiving it.
With `thread/unsubscribe`, a client can indicate it is no longer
actively working with a live Thread. If this is the only client
subscribed to that given thread, the thread will be automatically closed
by app-server, at which point the server will send `thread/closed` and
`thread/status/changed` with `status: notLoaded` notifications.

This gives clients a way to prevent long-running app-server processes
from accumulating too many thread (and related) objects in memory.

Closed threads will also be removed from `thread/loaded/list`.
2026-02-25 13:14:30 -08:00
Owen Lin
a0fd94bde6
feat(app-server): add ThreadItem::DynamicToolCall (#12732)
Previously, clients would call `thread/start` with dynamic_tools set,
and when a model invokes a dynamic tool, it would just make the
server->client `item/tool/call` request and wait for the client's
response to complete the tool call. This works, but it doesn't have an
`item/started` or `item/completed` event.

Now we are doing this:
- [new] emit `item/started` with `DynamicToolCall` populated with the
call arguments
- send an `item/tool/call` server request
- [new] once the client responds, emit `item/completed` with
`DynamicToolCall` populated with the response.

Also, with `persistExtendedHistory: true`, dynamic tool calls are now
reconstructable in `thread/read` and `thread/resume` as
`ThreadItem::DynamicToolCall`.
2026-02-25 12:00:10 -08:00
Ahmed Ibrahim
947092283a
Add app-server v2 thread realtime API (#12715)
Add experimental `thread/realtime/*` v2 requests and notifications, then
route app-server realtime events through that thread-scoped surface with
integration coverage.

---------

Co-authored-by: Codex <noreply@openai.com>
2026-02-25 09:59:10 -08:00
mcgrew-oai
9a393c9b6f
feat(network-proxy): add embedded OTEL policy audit logging (#12046)
**PR Summary**

This PR adds embedded-only OTEL policy audit logging for
`codex-network-proxy` and threads audit metadata from `codex-core` into
managed proxy startup.

### What changed
- Added structured audit event emission in `network_policy.rs` with
target `codex_otel.network_proxy`.
- Emitted:
- `codex.network_proxy.domain_policy_decision` once per domain-policy
evaluation.
  - `codex.network_proxy.block_decision` for non-domain denies.
- Added required policy/network fields, RFC3339 UTC millisecond
`event.timestamp`, and fallback defaults (`http.request.method="none"`,
`client.address="unknown"`).
- Added non-domain deny audit emission in HTTP/SOCKS handlers for
mode-guard and proxy-state denies, including unix-socket deny paths.
- Added `REASON_UNIX_SOCKET_UNSUPPORTED` and used it for unsupported
unix-socket auditing.
- Added `NetworkProxyAuditMetadata` to runtime/state, re-exported from
`lib.rs` and `state.rs`.
- Added `start_proxy_with_audit_metadata(...)` in core config, with
`start_proxy()` delegating to default metadata.
- Wired metadata construction in `codex.rs` from session/auth context,
including originator sanitization for OTEL-safe tagging.
- Updated `network-proxy/README.md` with embedded-mode audit schema and
behavior notes.
- Refactored HTTP block-audit emission to a small local helper to reduce
duplication.
- Preserved existing unix-socket proxy-disabled host/path behavior for
responses and blocked history while using an audit-only endpoint
override (`server.address="unix-socket"`, `server.port=0`).

### Explicit exclusions
- No standalone proxy OTEL startup work.
- No `main.rs` binary wiring.
- No `standalone_otel.rs`.
- No standalone docs/tests.

### Tests
- Extended `network_policy.rs` tests for event mapping, metadata
propagation, fallbacks, timestamp format, and target prefix.
- Extended HTTP tests to assert unix-socket deny block audit events.
- Extended SOCKS tests to cover deny emission from handler deny
branches.
- Added/updated core tests to verify audit metadata threading into
managed proxy state.

### Validation run
- `just fmt`
- `cargo test -p codex-network-proxy` 
- `cargo test -p codex-core` ran with one unrelated flaky timeout
(`shell_snapshot::tests::snapshot_shell_does_not_inherit_stdin`), and
the test passed when rerun directly 

---------

Co-authored-by: viyatb-oai <viyatb@openai.com>
2026-02-25 11:46:37 -05:00
alexsong-oai
6d6570d89d
Support external agent config detect and import (#12660)
Migration Behavior

* Config
  *  Migrates settings.json into config.toml
* Only adds fields when config.toml is missing, or when those fields are
missing from the existing file
  *  Supported mappings:
    env -> shell_environment_policy
     sandbox.enabled = true -> sandbox_mode = "workspace-write"

* Skills
  *  Copies home and repo .claude/skills into .agents/skills
  *  Existing skill directories are not overwritten
  *  SKILL.md content is rewritten from Claude-related terms to Codex

* AgentsMd
  *  Repo only
  *  Migrates CLAUDE.md into AGENTS.md
* Detect/import only proceed when AGENTS.md is missing or present but
empty
  *  Content is rewritten from Claude-related terms to Codex
2026-02-25 02:11:51 -08:00
jif-oai
f46b767b7e
feat: add search term to thread list (#12578)
Add `searchTerm` to `thread/list` that will search for a match in the
titles (the condition being `searchTerm` $$\in$$ `title`)
2026-02-25 09:59:41 +00:00
jif-oai
10c04e11b8
feat: add service name to app-server (#12319)
Add service name to the app-server so that the app can use it's own
service name

This is on thread level because later we might plan the app-server to
become a singleton on the computer
2026-02-25 09:51:42 +00:00
viyatb-oai
c086b36b58
feat(ui): add network approval persistence plumbing (#12358)
## Summary
- add TUI approval options for persistent network host rules
- add app-server v2 approval payload plumbing for network approval
context + proposed network policy amendments
- add app-server handling to translate `applyNetworkPolicyAmendment`
decisions back into core review decisions
- update docs/test client output and generated app-server schemas/types
2026-02-25 07:06:19 +00:00
Celia Chen
1151972fb2
feat: add experimental additionalPermissions to v2 command execution approval requests (#12737)
This adds additionalPermissions to the app-server v2
item/commandExecution/requestApproval payload as an experimental field.

The field is now exposed on CommandExecutionRequestApprovalParams and is
populated from the existing core approval event when a command requests
additional sandbox permissions.

This PR also contains changes to make server requests to support
experiment API.

A real app server test client test:

sample payload with experimental flag off:
```
 {
<   "id": 0,
<   "method": "item/commandExecution/requestApproval",
<   "params": {
<     "command": "/bin/zsh -lc 'mkdir -p ~/some/test && touch ~/some/test/file'",
<     "commandActions": [
<       {
<         "command": "mkdir -p '~/some/test'",
<         "type": "unknown"
<       },
<       {
<         "command": "touch '~/some/test/file'",
<         "type": "unknown"
<       }
<     ],
<     "cwd": "/Users/celia/code/codex/codex-rs",
<     "itemId": "call_QLp0LWkQ1XkU6VW9T2vUZFWB",
<     "proposedExecpolicyAmendment": [
<       "mkdir",
<       "-p",
<       "~/some/test"
<     ],
<     "reason": "Do you want to allow creating ~/some/test/file outside the workspace?",
<     "threadId": "019c9309-e209-7d82-a01b-dcf9556a354d",
<     "turnId": "019c9309-e27a-7f33-834f-6011e795c2d6"
<   }
< }
```
with experimental flag on: 
```
< {
<   "id": 0,
<   "method": "item/commandExecution/requestApproval",
<   "params": {
<     "additionalPermissions": {
<       "fileSystem": null,
<       "macos": null,
<       "network": true
<     },
<     "command": "/bin/zsh -lc 'install -D /dev/null ~/some/test/file'",
<     "commandActions": [
<       {
<         "command": "install -D /dev/null '~/some/test/file'",
<         "type": "unknown"
<       }
<     ],
<     "cwd": "/Users/celia/code/codex/codex-rs",
<     "itemId": "call_K3U4b3dRbj3eMCqslmncbGsq",
<     "proposedExecpolicyAmendment": [
<       "install",
<       "-D"
<     ],
<     "reason": "Do you want to allow creating the file at ~/some/test/file outside the workspace sandbox?",
<     "threadId": "019c9303-3a8e-76e1-81bf-d67ac446d892",
<     "turnId": "019c9303-3af1-7143-88a1-73132f771234"
<   }
< }
```
2026-02-25 05:16:35 +00:00
Michael Bolin
e88f74d140
feat: pass helper executable paths via Arg0DispatchPaths (#12719)
## Why

`codex-rs/core/src/tools/runtimes/shell/unix_escalation.rs` previously
located `codex-execve-wrapper` by scanning `PATH` and sibling
directories. That lookup is brittle and can select the wrong binary when
the runtime environment differs from startup assumptions.

We already pass `codex-linux-sandbox` from `codex-arg0`;
`codex-execve-wrapper` should use the same startup-driven path plumbing.

## What changed

- Introduced `Arg0DispatchPaths` in `codex-arg0` to carry both helper
executable paths:
  - `codex_linux_sandbox_exe`
  - `main_execve_wrapper_exe`
- Updated `arg0_dispatch_or_else()` to pass `Arg0DispatchPaths` to
top-level binaries and preserve helper paths created in
`prepend_path_entry_for_codex_aliases()`.
- Threaded `Arg0DispatchPaths` through entrypoints in `cli`, `exec`,
`tui`, `app-server`, and `mcp-server`.
- Added `main_execve_wrapper_exe` to core configuration plumbing
(`Config`, `ConfigOverrides`, and `SessionServices`).
- Updated zsh-fork shell escalation to consume the configured
`main_execve_wrapper_exe` and removed path-sniffing fallback logic.
- Updated app-server config reload paths so reloaded configs keep the
same startup-provided helper executable paths.

## References

- [`Arg0DispatchPaths`
definition](e355b43d5c/codex-rs/arg0/src/lib.rs (L20-L24))
- [`arg0_dispatch_or_else()` forwarding both
paths](e355b43d5c/codex-rs/arg0/src/lib.rs (L145-L176))
- [zsh-fork escalation using configured wrapper
path](e355b43d5c/codex-rs/core/src/tools/runtimes/shell/unix_escalation.rs (L109-L150))

## Testing

- `cargo check -p codex-arg0 -p codex-core -p codex-exec -p codex-tui -p
codex-mcp-server -p codex-app-server`
- `cargo test -p codex-arg0`
- `cargo test -p codex-core tools::runtimes:🐚:unix_escalation:: --
--nocapture`
2026-02-24 17:44:38 -08:00
Michael Bolin
448fb6ac22
fix: clarify the value of SkillMetadata.path (#12729)
Rename `SkillMetadata.path` to `SkillMetadata.path_to_skills_md` for
clarity.

Would ideally change the type to `AbsolutePathBuf`, but that can be done
later.
2026-02-24 17:15:54 -08:00
Max Johnson
5163850025
codex-rs/app-server: graceful websocket restart on Ctrl-C (#12517)
## Summary
- add graceful websocket app-server restart on Ctrl-C by draining until
no assistant turns are running
- stop the websocket acceptor and disconnect existing connections once
the drain condition is met
- add a websocket integration test that verifies Ctrl-C waits for an
in-flight turn before exit

## Verification
- `cargo check -p codex-app-server --quiet`
- `cargo test -p codex-app-server --test all
suite::v2::connection_handling_websocket`
- I (maxj) tested remote and local Codex.app

---------

Co-authored-by: Codex <noreply@openai.com>
2026-02-24 16:27:59 -08:00
pakrym-oai
5571a022eb
Add app-server event tracing (#12695)
To help with debugging
2026-02-24 14:45:50 -08:00
pakrym-oai
daf0f03ac8
Ensure shell command skills trigger approval (#12697)
Summary
- detect skill-invoking shell commands based on the original command
string, request approvals when needed, and cache positive decisions per
session
- keep implicit skill invocation emitted after approval and keep skill
approval decline messaging centralized to the shell handler
- expand and adjust skill approval tests to cover shell-based skill
scripts while matching the new detection expectations

Testing
- Not run (not requested)
2026-02-24 12:13:20 -08:00
Michael Bolin
3ca0e7673b
feat: run zsh fork shell tool via shell-escalation (#12649)
## Why

This PR switches the `shell_command` zsh-fork path over to
`codex-shell-escalation` so the new shell tool can use the shared
exec-wrapper/escalation protocol instead of the `zsh_exec_bridge`
implementation that was introduced in
https://github.com/openai/codex/pull/12052. `zsh_exec_bridge` relied on
UNIX domain sockets, which is not as tamper-proof as the FD-based
approach in `codex-shell-escalation`.

## What Changed

- Added a Unix zsh-fork runtime adapter in `core`
(`core/src/tools/runtimes/shell/unix_escalation.rs`) that:
- runs zsh-fork commands through
`codex_shell_escalation::run_escalate_server`
  - bridges exec-policy / approval decisions into `ShellActionProvider`
- executes escalated commands via a `ShellCommandExecutor` that calls
`process_exec_tool_call`
- Updated `ShellRuntime` / `ShellCommandHandler` / tool spec wiring to
select a `shell_command` backend (`classic` vs `zsh-fork`) while leaving
the generic `shell` tool path unchanged.
- Removed the `zsh_exec_bridge`-based session service and deleted
`core/src/zsh_exec_bridge/mod.rs`.
- Moved exec-wrapper entrypoint dispatch to `arg0` by handling the
`codex-execve-wrapper` arg0 alias there, and removed the old
`codex_core::maybe_run_zsh_exec_wrapper_mode()` hooks from `cli` and
`app-server` mains.
- Added the needed `codex-shell-escalation` dependencies for `core` and
`arg0`.

## Tests

- `cargo test -p codex-core
shell_zsh_fork_prefers_shell_command_over_unified_exec`
- `cargo test -p codex-app-server turn_start_shell_zsh_fork --
--nocapture`
- verifies zsh-fork command execution and approval flows through the new
backend
- includes subcommand approve/decline coverage using the shared zsh
DotSlash fixture in `app-server/tests/suite/zsh`
- To test manually, I added the following to `~/.codex/config.toml`:

```toml
zsh_path = "/Users/mbolin/code/codex3/codex-rs/app-server/tests/suite/zsh"

[features]
shell_zsh_fork = true
```

Then I ran `just c` to run the dev build of Codex with these changes and
sent it the message:

```
run `echo $0`
```

And it replied with:

```
  echo $0 printed:

  /Users/mbolin/code/codex3/codex-rs/app-server/tests/suite/zsh

  In this tool context, $0 reflects the script path used to invoke the shell, not just zsh.
```

so the tool appears to be wired up correctly.

## Notes

- The zsh subcommand-decline integration test now uses `rm` under a
`WorkspaceWrite` sandbox. The previous `/usr/bin/true` scenario is
auto-allowed by the new `shell-escalation` policy path, which no longer
produces subcommand approval prompts.
2026-02-24 10:31:08 -08:00
pakrym-oai
58763afa0f
Add skill approval event/response (#12633)
Set the stage for skill-level permission approval in addition to
command-level.

Behind a feature flag.
2026-02-23 22:28:58 -08:00
Javi
5a3bdcb27b
app-server: fix connecting via websockets with Sec-WebSocket-Extensions: permessage-deflate (#12629)
# External (non-OpenAI) Pull Request Requirements

Before opening this Pull Request, please read the dedicated
"Contributing" markdown file or your PR may be closed:
https://github.com/openai/codex/blob/main/docs/contributing.md

If your PR conforms to our contribution guidelines, replace this text
with a detailed and high quality description of your changes.

Include a link to a bug report or enhancement request.
2026-02-24 02:41:03 +00:00
Charley Cunningham
3cea3e665e
app-server: box request dispatch future to reduce stack pressure (#12421) 2026-02-23 10:05:41 -08:00
Michael Bolin
e8949f4507
test: vendor zsh fork via DotSlash and stabilize zsh-fork tests (#12518)
## Why

The zsh integration tests were still brittle in two ways:

- they relied on `CODEX_TEST_ZSH_PATH` / environment-specific setup, so
they often did not exercise the patched zsh fork that `shell-tool-mcp`
ships
- once the tests consistently used the vendored zsh fork, they exposed
real Linux-specific zsh-fork issues in CI

In particular, the Linux failures were not just test noise:

- the zsh-fork launch path was dropping `ExecRequest.arg0`, so Linux
`codex-linux-sandbox` arg0 dispatch did not run and zsh wrapper-mode
could receive malformed arguments
- the
`turn_start_shell_zsh_fork_subcommand_decline_marks_parent_declined_v2`
test uses the zsh exec bridge (which talks to the parent over a Unix
socket), but Linux restricted sandbox seccomp denies `connect(2)`,
causing timeouts on `ubuntu-24.04` x86/arm

This PR makes the zsh tests consistently run against the intended
vendored zsh fork and fixes/hardens the zsh-fork path so the Linux CI
signal is meaningful.

## What Changed

- Added a single shared test-only DotSlash file for the patched zsh fork
at `codex-rs/exec-server/tests/suite/zsh` (analogous to the existing
`bash` test resource).
- Updated both app-server and exec-server zsh tests to use that shared
DotSlash zsh (no duplicate zsh DotSlash file, no `CODEX_TEST_ZSH_PATH`
dependency).
- Updated the app-server zsh-fork test helper to resolve the shared
DotSlash zsh and avoid silently falling back to host zsh.
- Kept the app-server zsh-fork tests configured via `config.toml`, using
a test wrapper path where needed to force `zsh -df` (and rewrite `-lc`
to `-c`) for the subcommand-decline test.
- Hardened the app-server subcommand-decline zsh-fork test for CI
variability:
  - tolerate an extra `/responses` POST with a no-op mock response
- tolerate non-target approval ordering while remaining strict on the
two `/usr/bin/true` approvals and decline behavior
- use `DangerFullAccess` on Linux for this one test because it validates
zsh approval flow, not Linux sandbox socket restrictions
- Fixed zsh-fork process launching on Linux by preserving `req.arg0` in
`ZshExecBridge::execute_shell_request(...)` so `codex-linux-sandbox`
arg0 dispatch continues to work.
- Moved `maybe_run_zsh_exec_wrapper_mode()` under
`arg0_dispatch_or_else(...)` in `app-server` and `cli` so wrapper-mode
handling coexists correctly with arg0-dispatched helper modes.
- Consolidated duplicated `dotslash -- fetch` resolution logic into
shared test support (`core/tests/common/lib.rs`).
- Updated `codex-rs/exec-server/tests/suite/accept_elicitation.rs` to
use DotSlash zsh and hardened the zsh elicitation test for Bazel/zsh
differences by:
  - resolving an absolute `git` path
  - running `git init --quiet .`
- asserting success / `.git` creation instead of relying on banner text

## Verification

- `cargo test -p codex-app-server turn_start_zsh_fork -- --nocapture`
- `cargo test -p codex-exec-server accept_elicitation -- --nocapture`
- `bazel test //codex-rs/exec-server:exec-server-all-test
--test_output=streamed --test_arg=--nocapture
--test_arg=accept_elicitation_for_prompt_rule_with_zsh`
- CI (`rust-ci`) on the final cleaned commit: `Tests — ubuntu-24.04 -
x86_64-unknown-linux-gnu` and `Tests — ubuntu-24.04-arm -
aarch64-unknown-linux-gnu` passed in [run
22291424358](https://github.com/openai/codex/actions/runs/22291424358)
2026-02-22 19:39:56 -08:00
Max Johnson
37610240ec
app-server: retain thread listener across disconnects (#12373)
- keep the per-thread app-server listener alive when the last client
unsubscribes or disconnects
- preserve listener-side active turn history so running `thread/resume`
can merge an in-progress turn snapshot after reconnect
- add `ThreadStateManager` regressions for disconnect/unsubscribe
retention and explicit thread teardown cleanup

Added unit tests, and I manually tested to confirm the fix

---------

Co-authored-by: Codex <noreply@openai.com>
2026-02-22 05:33:33 +00:00
Michael Bolin
1af2a37ada
chore: remove codex-core public protocol/shell re-exports (#12432)
## Why

`codex-rs/core/src/lib.rs` re-exported a broad set of types and modules
from `codex-protocol` and `codex-shell-command`. That made it easy for
workspace crates to import those APIs through `codex-core`, which in
turn hides dependency edges and makes it harder to reduce compile-time
coupling over time.

This change removes those public re-exports so call sites must import
from the source crates directly. Even when a crate still depends on
`codex-core` today, this makes dependency boundaries explicit and
unblocks future work to drop `codex-core` dependencies where possible.

## What Changed

- Removed public re-exports from `codex-rs/core/src/lib.rs` for:
- `codex_protocol::protocol` and related protocol/model types (including
`InitialHistory`)
  - `codex_protocol::config_types` (`protocol_config_types`)
- `codex_shell_command::{bash, is_dangerous_command, is_safe_command,
parse_command, powershell}`
- Migrated workspace Rust call sites to import directly from:
  - `codex_protocol::protocol`
  - `codex_protocol::config_types`
  - `codex_protocol::models`
  - `codex_shell_command`
- Added explicit `Cargo.toml` dependencies (`codex-protocol` /
`codex-shell-command`) in crates that now import those crates directly.
- Kept `codex-core` internal modules compiling by using `pub(crate)`
aliases in `core/src/lib.rs` (internal-only, not part of the public
API).
- Updated the two utility crates that can already drop a `codex-core`
dependency edge entirely:
  - `codex-utils-approval-presets`
  - `codex-utils-cli`

## Verification

- `cargo test -p codex-utils-approval-presets`
- `cargo test -p codex-utils-cli`
- `cargo check --workspace --all-targets`
- `just clippy`
2026-02-20 23:45:35 -08:00
Michael Bolin
a73efab8dd
fix: address flakiness in thread_resume_rejoins_running_thread_even_with_override_mismatch (#12381)
## Why
`thread/resume` responses for already-running threads can be reported as
`Idle` even while a turn is still in progress. This is caused by a
timing window where the runtime watch state has not yet observed the
running-thread transition, so API clients can receive stale status
information at resume time.

Possibly related: https://github.com/openai/codex/pull/11786

## What
- Add a shared status normalization helper, `resolve_thread_status`, in
`codex-rs/app-server/src/thread_status.rs` that resolves
`Idle`/`NotLoaded` to `Active { active_flags: [] }` when an in-progress
turn is known.
- Reuse this helper across thread response paths in
`codex-rs/app-server/src/codex_message_processor.rs` (including
`thread/start`, `thread/unarchive`, `thread/read`, `thread/resume`,
`thread/fork`, and review/thread-started notification responses).
- In `handle_pending_thread_resume_request`, use both the in-memory
`active_turn_snapshot` and the resumed rollout turns to decide whether a
turn is in progress before resolving thread status for the response.
- Extend `thread_status` tests to validate the new status-resolution
behavior directly.

## Verification
- `cargo test -p codex-app-server
suite::v2::thread_resume::thread_resume_rejoins_running_thread_even_with_override_mismatch`
2026-02-20 20:36:04 -08:00
natea-oai
936e744c93
Add field to Thread object for the latest rename set for a given thread (#12301)
Exposes through the app server updated names set for a thread. This
enables other surfaces to use the core as the source of truth for thread
naming. `threadName` is gathered using the helper functions used to
interact with `session_index.jsonl`, and is hydrated in:
- `thread/list`
- `thread/read`
- `thread/resume`
- `thread/unarchive`
- `thread/rollback`

We don't do this for `thread/start` and `thread/fork`.
2026-02-20 18:26:57 -08:00
pakrym-oai
1bb7989b20
Add ability to attach extra files to feedback (#12370)
Allow clients to provide extra files.
2026-02-20 22:26:14 +00:00