Commit graph

388 commits

Author SHA1 Message Date
Jack Mousseau
7c7e267501
Simplify permissions available in request permissions tool (#14529) 2026-03-12 21:13:17 -07:00
alexsong-oai
650beb177e
Refactor cloud requirements error and surface in JSON-RPC error (#14504)
Refactors cloud requirements error handling to carry structured error
metadata and surfaces that metadata through JSON-RPC config-load
failures, including:
* adds typed CloudRequirementsLoadErrorCode values plus optional
statusCode
* marks thread/start, thread/resume, and thread/fork config failures
with structured cloud-requirements error data
2026-03-13 03:30:51 +00:00
alexsong-oai
1a363d5fcf
Add plugin usage telemetry (#14531)
adding metrics including: 
* plugin used
* plugin installed/uninstalled
* plugin enabled/disabled
2026-03-12 19:22:30 -07:00
xl-openai
1ea69e8d50
feat: add plugin/read. (#14445)
return more information for a specific plugin.
2026-03-12 16:52:21 -07:00
Owen Lin
d3e6680531
fix turn_start_jsonrpc_span_parents_core_turn_spans flakiness (#14490)
This makes the test less flaky by checking the core invariant instead of
the full span chain.

Before, the test waited for several specific internal spans
(`submission_dispatch`, `session_task.turn`, `run_turn`) and asserted
their exact relationships. That was brittle because those spans are
exported asynchronously and are more of an implementation detail than
the thing we actually care about.

Now, the test only checks that:
- `turn/start` is on the expected remote trace with the expected remote
parent
- at least one representative core turn span on that same trace descends
from it

That keeps the sanity-check we want while making the test less sensitive
to timing and internal refactors.
2026-03-12 12:16:56 -07:00
jgershen-oai
3e96c867fe
use scopes_supported for OAuth when present on MCP servers (#14419)
Fixes [#8889](https://github.com/openai/codex/issues/8889).

## Summary
- Discover and use advertised MCP OAuth `scopes_supported` when no
explicit or configured scopes are present.
- Apply the same scope precedence across `mcp add`, `mcp login`, skill
dependency auto-login, and app-server MCP OAuth login.
- Keep discovered scopes ephemeral and non-persistent.
- Retry once without scopes for CLI and skill auto-login flows if the
OAuth provider rejects discovered scopes.

## Motivation
Some MCP servers advertise the scopes they expect clients to request
during OAuth, but Codex was ignoring that metadata and typically
starting OAuth with no scopes unless the user manually passed `--scopes`
or configured `server.scopes`.

That made compliant MCP servers harder to use out of the box and is the
behavior described in
[#8889](https://github.com/openai/codex/issues/8889).

This change also brings our behavior in line with the MCP authorization
spec's scope selection guidance:

https://modelcontextprotocol.io/specification/2025-11-25/basic/authorization#scope-selection-strategy

## Behavior
Scope selection now follows this order everywhere:
1. Explicit request scopes / CLI `--scopes`
2. Configured `server.scopes`
3. Discovered `scopes_supported`
4. Legacy empty-scope behavior

Compatibility notes:
- Existing working setups keep the same behavior because explicit and
configured scopes still win.
- Discovered scopes are never written back into config or token storage.
- If discovery is missing, malformed, or empty, behavior falls back to
the previous empty-scope path.
- App-server login gets the same precedence rules, but does not add a
transparent retry path in this change.

## Implementation
- Extend streamable HTTP OAuth discovery to parse and normalize
`scopes_supported`.
- Add a shared MCP scope resolver in `core` so all login entrypoints use
the same precedence rules.
- Preserve provider callback errors from the OAuth flow so CLI/skill
flows can safely distinguish provider rejections from other failures.
- Reuse discovered scopes from the existing OAuth support check where
possible instead of persisting new config.
2026-03-12 11:57:06 -07:00
gabec-openai
4fa7d6f444
Handle malformed agent role definitions nonfatally (#14488)
## Summary
- make malformed agent role definitions nonfatal during config loading
- drop invalid agent roles and record warnings in `startup_warnings`
- forward startup warnings through app-server `configWarning`
notifications

## Testing
- `cargo test -p codex-core agent_role_ -- --nocapture`
- `just fix -p codex-core`
- `just fmt`
- `cargo test -p codex-app-server config_warning -- --nocapture`

Co-authored-by: Codex <noreply@openai.com>
2026-03-12 11:20:31 -07:00
viyatb-oai
04892b4ceb
refactor: make bubblewrap the default Linux sandbox (#13996)
## Summary
- make bubblewrap the default Linux sandbox and keep
`use_legacy_landlock` as the only override
- remove `use_linux_sandbox_bwrap` from feature, config, schema, and
docs surfaces
- update Linux sandbox selection, CLI/config plumbing, and related
tests/docs to match the new default
- fold in the follow-up CI fixes for request-permissions responses and
Linux read-only sandbox error text
2026-03-11 23:31:18 -07:00
sayan-oai
917c2df201
chore: use AVAILABLE and ON_INSTALL as default plugin install and auth policies (#14407)
make `AVAILABLE` the default plugin installPolicy when unset in
`marketplace.json`. similarly, make `ON_INSTALL` the default authPolicy.

this means, when unset, plugins are available to be installed (but not
auto-installed), and the contained connectors will be authed at
install-time.

updated tests.
2026-03-11 20:33:17 -07:00
Owen Lin
5bc82c5b93
feat(app-server): propagate traces across tasks and core ops (#14387)
## Summary

This PR keeps app-server RPC request trace context alive for the full
lifetime of the work that request kicks off (e.g. for `thread/start`,
this is `app-server rpc handler -> tokio background task -> core op
submissions`). Previously we lose trace lineage once the request handler
returns or hands work off to background tasks.

This approach is especially relevant for `thread/start` and other RPC
handlers that run in a non-blocking way. In the near future we'll most
likely want to make all app-server handlers run in a non-blocking way by
default, and only queue operations that must operate in order (e.g.
thread RPCs per thread?), so we want to make sure tracing in app-server
just generally works.

Depends on https://github.com/openai/codex/pull/14300

**Before**
<img width="155" height="207" alt="image"
src="https://github.com/user-attachments/assets/c9487459-36f1-436c-beb7-fafeb40737af"
/>


**After**
<img width="299" height="337" alt="image"
src="https://github.com/user-attachments/assets/727392b2-d072-4427-9dc4-0502d8652dea"
/>

## What changed

- Keep request-scoped trace context around until we send the final
response or error, or the connection closes.
- Thread that trace context through detached `thread/start` work so
background startup stays attached to the originating request.
- Pass request trace context through to downstream core operations,
including:
  - thread creation
  - resume/fork flows
  - turn submission
  - review
  - interrupt
  - realtime conversation operations
- Add tracing tests that verify:
  - remote W3C trace context is preserved for `thread/start`
  - remote W3C trace context is preserved for `turn/start`
  - downstream core spans stay under the originating request span
  - request-scoped tracing state is cleaned up correctly
- Clean up shutdown behavior so detached background tasks and spawned
threads are drained before process exit.
2026-03-11 20:18:31 -07:00
Ahmed Ibrahim
bf5e997b31
Include spawn agent model metadata in app-server items (#14410)
- add model and reasoning effort to app-server collab spawn items and
notifications
- regenerate app-server protocol schemas for the new fields

---------

Co-authored-by: Codex <noreply@openai.com>
2026-03-11 19:25:21 -07:00
Owen Lin
72631755e0
chore(app-server): stop emitting codex/event/ notifications (#14392)
## Description

This PR stops emitting legacy `codex/event/*` notifications from the
public app-server transports.

It's been a long time coming! app-server was still producing a raw
notification stream from core, alongside the typed app-server
notifications and server requests, for compatibility reasons. Now,
external clients should no longer be depending on those legacy
notifications, so this change removes them from the stdio and websocket
contract and updates the surrounding docs, examples, and tests to match.

### Caveat
I left the "in-process" version of app-server alone for now, since
`codex exec` was recently based on top of app-server via this in-process
form here: https://github.com/openai/codex/pull/14005

Seems like `codex exec` still consumes some legacy notifications
internally, so this branch only removes `codex/event/*` from app-server
over stdio and websockets.

## Follow-up

Once `codex exec` is fully migrated off `codex/event/*` notifications,
we'll be able to stop emitting them entirely entirely instead of just
filtering it at the external transport boundary.
2026-03-12 00:45:20 +00:00
sayan-oai
7b2cee53db chore: wire through plugin policies + category from marketplace.json (#14305)
wire plugin marketplace metadata through app-server endpoints:
- `plugin/list` has `installPolicy` and `authPolicy`
- `plugin/install` has plugin-level `authPolicy`

`plugin/install` also now enforces `NOT_AVAILABLE` `installPolicy` when
installing.


added tests.
2026-03-11 12:33:10 -07:00
Owen Lin
fa1242c83b fix(otel): make HTTP trace export survive app-server runtimes (#14300)
## Summary

This PR fixes OTLP HTTP trace export in runtimes where the previous
exporter setup was unreliable, especially around app-server usage. It
also removes the old `codex_otel::otel_provider` compatibility shim and
switches remaining call sites over to the crate-root
`codex_otel::OtelProvider` export.

## What changed

- Use a runtime-safe OTLP HTTP trace exporter path for Tokio runtimes.
- Add an async HTTP client path for trace export when we are already
inside a multi-thread Tokio runtime.
- Make provider shutdown flush traces before tearing down the tracer
provider.
- Add loopback coverage that verifies traces are actually sent to
`/v1/traces`:
  - outside Tokio
  - inside a multi-thread Tokio runtime
  - inside a current-thread Tokio runtime
- Remove the `codex_otel::otel_provider` shim and update remaining
imports.

## Why

I hit cases where spans were being created correctly but never made it
to the collector. The issue turned out to be in exporter/runtime
behavior rather than the span plumbing itself. This PR narrows that gap
and gives us regression coverage for the actual export path.
2026-03-11 12:33:10 -07:00
Leo Shimonaka
889b4796fc feat: Add additional macOS Sandbox Permissions for Launch Services, Contacts, Reminders (#14155)
Add additional macOS Sandbox Permissions levers for the following:

- Launch Services
- Contacts
- Reminders
2026-03-11 12:33:09 -07:00
joeytrasatti-openai
8ac27b2a16 Add ephemeral flag support to thread fork (#14248)
### Summary
This PR adds first-class ephemeral support to thread/fork, bringing it
in line with thread/start. The goal is to support one-off completions on
full forked threads without persisting them as normal user-visible
threads.

### Testing
2026-03-11 12:33:08 -07:00
xl-openai
d751e68f44 feat: Allow sync with remote plugin status. (#14176)
Add forceRemoteSync to plugin/list.
When it is set to True, we will sync the local plugin status with the
remote one (backend-api/plugins/list).
2026-03-11 12:33:08 -07:00
guinness-oai
4ac6042850 Mark incomplete resumed turns interrupted when idle (#14125)
Fixes a Codex app bug where quitting the app mid-run could leave the
reopened thread stuck in progress and non-interactable. On cold thread
resume, app-server could return an idle thread with a replayed turn
still marked in progress. This marks incomplete replayed turns as
interrupted unless the thread is actually active.
2026-03-11 12:33:07 -07:00
Eric Traut
f9cba5cb16 Log ChatGPT user ID for feedback tags (#13901)
There are some bug investigations that currently require us to ask users
for their user ID even though they've already uploaded logs and session
details via `/feedback`. This frustrates users and increases the time
for diagnosis.

This PR includes the ChatGPT user ID in the metadata uploaded for
`/feedback` (both the TUI and app-server).
2026-03-11 12:33:07 -07:00
Ahmed Ibrahim
f3f47cf455
Stabilize app-server notify initialize test (#13939)
## What changed
- This PR changes only the flaky test setup for
`turn_start_notify_payload_includes_initialize_client_name`.
- Instead of shelling out to `python3` to write the notify payload, the
test uses the first-party `codex-app-server-test-notify-capture` helper.
- The helper writes `notify.json` atomically and the test waits for the
file to exist before reading it.

## Why this fixes the flake
- The old test depended on an external Python interpreter being present
and behaving consistently on every CI runner.
- It also raced the file write: the test could observe the path before
the payload had been fully written, which produced partial reads and
intermittent assertion failures.
- Moving the write into a repo-owned helper removes the external
dependency, and atomic write-plus-wait makes the handoff deterministic.

## Scope
- Test-only change.
2026-03-09 23:41:58 -07:00
Ahmed Ibrahim
2e24be2134
Use realtime transcript for handoff context (#14132)
- collect input/output transcript deltas into active handoff transcript
state
- attach and clear that transcript on each handoff, and regenerate
schema/tests
2026-03-09 22:30:03 -07:00
Channing Conger
c6343e0649
Implemented thread-level atomic elicitation counter for stopwatch pausing (#12296)
### Purpose
While trying to build out CLI-Tools for the agent to use under skills we
have found that those tools sometimes need to invoke a user elicitation.
These elicitations are handled out of band of the codex app-server but
need to indicate to the exec manager that the command running is not
going to progress on the usual timeout horizon.

### Example
Model calls universal exec:
`$ download-credit-card-history --start-date 2026-01-19 --end-date
2026-02-19 > credit_history.jsonl`

download-cred-card-history might hit a hosted/preauthenticated service
to fetch data. That service might decide that the request requires an
end user approval the access to the personal data. It should be able to
signal to the running thread that the command in question is blocked on
user elicitation. In that case we want the exec to continue, but the
timeout to not expire on the tool call, essentially freezing time until
the user approves or rejects the command at which point the tool would
signal the app-server to decrement the outstanding elicitation count.
Now timeouts would proceed as normal.

### What's Added

- New v2 RPC methods:
    - thread/increment_elicitation
    - thread/decrement_elicitation
- Protocol updates in:
    - codex-rs/app-server-protocol/src/protocol/common.rs
    - codex-rs/app-server-protocol/src/protocol/v2.rs
- App-server handlers wired in:
    - codex-rs/app-server/src/codex_message_processor.rs

### Behavior

- Counter starts at 0 per thread.
- increment atomically increases the counter.
- decrement atomically decreases the counter; decrement at 0 returns
invalid request.
- Transition rules:
- 0 -> 1: broadcast pause state, pausing all active stopwatches
immediately.
    - \>0 -> >0: remain paused.
    - 1 -> 0: broadcast unpause state, resuming stopwatches.
- Core thread/session logic:
    - codex-rs/core/src/codex_thread.rs
    - codex-rs/core/src/codex.rs
    - codex-rs/core/src/mcp_connection_manager.rs

### Exec-server stopwatch integration

- Added centralized stopwatch tracking/controller:
    - codex-rs/exec-server/src/posix/stopwatch_controller.rs
- Hooked pause/unpause broadcast handling + stopwatch registration:
    - codex-rs/exec-server/src/posix/mcp.rs
    - codex-rs/exec-server/src/posix/stopwatch.rs
    - codex-rs/exec-server/src/posix.rs
2026-03-09 22:29:26 -07:00
Matthew Zeng
566e4cee4b
[apps] Fix apps enablement condition. (#14011)
- [x] Fix apps enablement condition to check both the feature flag and
that the user is not an API key user.
2026-03-09 22:25:43 -07:00
xl-openai
0c33af7746
feat: support disabling bundled system skills (#13792)
Support disable bundled system skills with a config:

[skills.bundled]
enabled = false
2026-03-09 22:02:53 -07:00
Andrei Eternal
244b2d53f4
start of hooks engine (#13276)
(Experimental)

This PR adds a first MVP for hooks, with SessionStart and Stop

The core design is:

- hooks live in a dedicated engine under codex-rs/hooks
- each hook type has its own event-specific file
- hook execution is synchronous and blocks normal turn progression while
running
- matching hooks run in parallel, then their results are aggregated into
a normalized HookRunSummary

On the AppServer side, hooks are exposed as operational metadata rather
than transcript-native items:

- new live notifications: hook/started, hook/completed
- persisted/replayed hook results live on Turn.hookRuns
- we intentionally did not add hook-specific ThreadItem variants

Hooks messages are not persisted, they remain ephemeral. The context
changes they add are (they get appended to the user's prompt)
2026-03-10 04:11:31 +00:00
viyatb-oai
b0cbc25a48
fix(protocol): preserve legacy workspace-write semantics (#13957)
## Summary
This is a fast follow to the initial `[permissions]` structure.

- keep the new split-policy carveout behavior for narrower non-write
entries under broader writable roots
- preserve legacy `WorkspaceWrite` semantics by using a cwd-aware bridge
that drops only redundant nested readable roots when projecting from
`SandboxPolicy`
- route the legacy macOS seatbelt adapter through that same legacy
bridge so redundant nested readable roots do not become read-only
carveouts on macOS
- derive the legacy bridge for `command_exec` using the sandbox root cwd
rather than the request cwd so policy derivation matches later sandbox
enforcement
- add regression coverage for the legacy macOS nested-readable-root case

## Examples
### Legacy `workspace-write` on macOS
A legacy `workspace-write` policy can redundantly list a nested readable
root under an already-writable workspace root.

For example, legacy config can effectively mean:
- workspace root (`.` / `cwd`) is writable
- `docs/` is also listed in `readable_roots`

The new shared split-policy helper intentionally treats a narrower
non-write entry under a broader writable root as a carveout for real
`[permissions]` configs. Without this fast follow, the unchanged macOS
seatbelt legacy adapter could project that legacy shape into a
`FileSystemSandboxPolicy` that treated `docs/` like a read-only carveout
under the writable workspace root. In practice, legacy callers on macOS
could unexpectedly lose write access inside `docs/`, even though that
path was writable before the `[permissions]` migration work.

This change fixes that by routing the legacy seatbelt path through the
cwd-aware legacy bridge, so:
- legacy `workspace-write` keeps `docs/` writable when `docs/` was only
a redundant readable root
- explicit `[permissions]` entries like `'.' = 'write'` and `'docs' =
'read'` still make `docs/` read-only, which is the new intended
split-policy behavior

### Legacy `command_exec` with a subdirectory cwd
`command_exec` can run a command from a request cwd that is narrower
than the sandbox root cwd.

For example:
- sandbox root cwd is `/repo`
- request cwd is `/repo/subdir`
- legacy policy is still `workspace-write` rooted at `/repo`

Before this fast follow, `command_exec` derived the legacy bridge using
the request cwd, but the sandbox was later built using the sandbox root
cwd. That mismatch could miss redundant legacy readable roots during
projection and accidentally reintroduce read-only carveouts for paths
that should still be writable under the legacy model.

This change fixes that by deriving the legacy bridge with the same
sandbox root cwd that sandbox enforcement later uses.

## Verification
- `just fmt`
- `cargo test -p codex-core
seatbelt_legacy_workspace_write_nested_readable_root_stays_writable`
- `cargo test -p codex-core test_sandbox_config_parsing`
- `cargo clippy -p codex-core -p codex-app-server --all-targets -- -D
warnings`
- `cargo clean`
2026-03-09 18:43:27 -07:00
Max Johnson
66e71cce11
codex-rs/app-server: add health endpoints for --listen websocket server (#13782)
Healthcheck endpoints for the websocket server

- serve `GET /readyz` and `GET /healthz` from the same listener used for
`--listen ws://...`
- switch the websocket listener over to `axum` upgrade handling instead
of manual socket parsing
- add websocket transport coverage for the health endpoints and document
the new behavior

Testing
- integration tests
- built and tested e2e

```
> curl -i http://127.0.0.1:9234/readyz
HTTP/1.1 200 OK
content-length: 0
date: Fri, 06 Mar 2026 19:20:23 GMT

>  curl -i http://127.0.0.1:9234/healthz
HTTP/1.1 200 OK
content-length: 0
date: Fri, 06 Mar 2026 19:20:24 GMT
```
2026-03-09 22:11:30 +00:00
Dylan Hurd
d241dc598c
feat(core) Persist request_permission data across turns (#14009)
## Summary
request_permissions flows should support persisting results for the
session.

Open Question: Still deciding if we need within-turn approvals - this
adds complexity but I could see it being useful

## Testing
- [x] Updated unit tests

---------

Co-authored-by: Codex <noreply@openai.com>
2026-03-09 14:36:38 -07:00
sayan-oai
6ad448b658
chore: plugin/uninstall endpoint (#14111)
add `plugin/uninstall` app-server endpoint to fully rm plugin from
plugins cache dir and rm entry from user config file.

plugin-enablement is session-scoped, so uninstalls are only picked up in
new sessions (like installs).

added tests.
2026-03-09 12:40:25 -07:00
xl-openai
c1f3ef16ec
fix(plugin): Also load curated plugins for TUI. (#14050)
Also run maybe_start_curated_repo_sync_for_config at TUI start time.
2026-03-09 11:05:02 -07:00
Ahmed Ibrahim
0dc242a672
Order websocket initialize after handshake (#13943)
## What changed
- `app-server` now sends initialize notifications to the specific
websocket connection before that connection is marked outbound-ready.
- `message_processor` now exposes the forwarding hook needed to target
that initialize delivery path.

## Why this fixes the flake
- This was a real websocket ordering bug.
- The old code allowed “connection is ready for outbound broadcasts” to
become true before the initialize notification had been routed to the
intended client.
- On CI this showed up as a race where tests would occasionally miss or
misorder initialize delivery depending on scheduler timing.
- Sending initialize to the exact connection first, then exposing it to
the general outbound path, removes that race instead of hiding it with
timing slack.

## Scope
- Production logic change.
2026-03-09 10:27:19 -07:00
Jack Mousseau
e6b93841c5
Add request permissions tool (#13092)
Adds a built-in `request_permissions` tool and wires it through the
Codex core, protocol, and app-server layers so a running turn can ask
the client for additional permissions instead of relying on a static
session policy.

The new flow emits a `RequestPermissions` event from core, tracks the
pending request by call ID, forwards it through app-server v2 as an
`item/permissions/requestApproval` request, and resumes the tool call
once the client returns an approved subset of the requested permission
profile.
2026-03-08 20:23:06 -07:00
Celia Chen
340f9c9ecb
app-server: include experimental skill metadata in exec approval requests (#13929)
## Summary

This change surfaces skill metadata on command approval requests so
app-server clients can tell when an approval came from a skill script
and identify the originating `SKILL.md`.

- add `skill_metadata` to exec approval events in the shared protocol
- thread skill metadata through core shell escalation and delegated
approval handling for skill-triggered approvals
- expose the field in app-server v2 as experimental `skillMetadata`
- regenerate the JSON/TypeScript schemas and cover the new field in
protocol, transport, core, and TUI tests

## Why

Skill-triggered approvals already carry skill context inside core, but
app-server clients could not see which skill caused the prompt. Sending
the skill metadata with the approval request makes it possible for
clients to present better approval UX and connect the prompt back to the
relevant skill definition.


## example event in app-server-v2
verified that we see this event when experimental api is on:
```
< {
<   "id": 11,
<   "method": "item/commandExecution/requestApproval",
<   "params": {
<     "additionalPermissions": {
<       "fileSystem": null,
<       "macos": {
<         "accessibility": false,
<         "automations": {
<           "bundle_ids": [
<             "com.apple.Notes"
<           ]
<         },
<         "calendar": false,
<         "preferences": "read_only"
<       },
<       "network": null
<     },
<     "approvalId": "25d600ee-5a3c-4746-8d17-e2e61fb4c563",
<     "availableDecisions": [
<       "accept",
<       "acceptForSession",
<       "cancel"
<     ],
<     "command": "/Applications/ChatGPT.app/Contents/Resources/CodexAppServer_CodexAppServerBundledSkills.bundle/Contents/Resources/skills/apple-notes/scripts/notes_info",
<     "commandActions": [
<       {
<         "command": "/Applications/ChatGPT.app/Contents/Resources/CodexAppServer_CodexAppServerBundledSkills.bundle/Contents/Resources/skills/apple-notes/scripts/notes_info",
<         "type": "unknown"
<       }
<     ],
<     "cwd": "/Applications/ChatGPT.app/Contents/Resources/CodexAppServer_CodexAppServerBundledSkills.bundle/Contents/Resources/skills/apple-notes",
<     "itemId": "call_jZp3xFpNg4D8iKAD49cvEvZy",
<     "skillMetadata": {
<       "pathToSkillsMd": "/Applications/ChatGPT.app/Contents/Resources/CodexAppServer_CodexAppServerBundledSkills.bundle/Contents/Resources/skills/apple-notes/SKILL.md"
<     },
<     "threadId": "019ccc10-b7d3-7ff2-84fe-3a75e7681e69",
<     "turnId": "019ccc10-b848-76f1-81b3-4a1fa225493f"
<   }
< }`
```

& verified that this is the event when experimental api is off:
```
< {
<   "id": 13,
<   "method": "item/commandExecution/requestApproval",
<   "params": {
<     "approvalId": "5fbbf776-261b-4cf8-899b-c125b547f2c0",
<     "availableDecisions": [
<       "accept",
<       "acceptForSession",
<       "cancel"
<     ],
<     "command": "/Applications/ChatGPT.app/Contents/Resources/CodexAppServer_CodexAppServerBundledSkills.bundle/Contents/Resources/skills/apple-notes/scripts/notes_info",
<     "commandActions": [
<       {
<         "command": "/Applications/ChatGPT.app/Contents/Resources/CodexAppServer_CodexAppServerBundledSkills.bundle/Contents/Resources/skills/apple-notes/scripts/notes_info",
<         "type": "unknown"
<       }
<     ],
<     "cwd": "/Users/celia/code/codex/codex-rs",
<     "itemId": "call_OV2DHzTgYcbYtWaTTBWlocOt",
<     "threadId": "019ccc16-2a2b-7be1-8500-e00d45b892d4",
<     "turnId": "019ccc16-2a8e-7961-98ec-649600e7d06a"
<   }
< }
```
2026-03-08 18:07:46 -07:00
Eric Traut
da3689f0ef
Add in-process app server and wire up exec to use it (#14005)
This is a subset of PR #13636. See that PR for a full overview of the
architectural change.

This PR implements the in-process app server and modifies the
non-interactive "exec" entry point to use the app server.

---------

Co-authored-by: Felipe Coury <felipe.coury@gmail.com>
2026-03-08 18:43:55 -06:00
Matthew Zeng
a684a36091
[app-server] Support hot-reload user config when batch writing config. (#13839)
- [x] Support hot-reload user config when batch writing config.
2026-03-08 17:38:01 -07:00
Ahmed Ibrahim
dc19e78962
Stabilize abort task follow-up handling (#13874)
- production logic plus tests; cancel running tasks before clearing
pending turn state
- suppress follow-up model requests after cancellation and assert on
stabilized request counts instead of fixed sleeps
2026-03-07 22:56:00 -08:00
jif-oai
cf143bf71e
feat: simplify DB further (#13771) 2026-03-07 03:48:36 -08:00
iceweasel-oai
4b4f61d379
app-server: require absolute cwd for windowsSandbox/setupStart (#13833)
## Summary
- require windowsSandbox/setupStart.cwd to be an AbsolutePathBuf
- reject relative cwd values at request parsing instead of normalizing
them later in the setup flow
- add RPC-layer coverage for relative cwd rejection and update the
checked-in protocol schemas/docs

## Why
windowsSandbox/setupStart was carrying the client-provided cwd as a raw
PathBuf for command_cwd while config derivation normalized the same
value into an absolute policy_cwd.

That left room for relative-path ambiguity in the setup path, especially
for inputs like cwd: "repo". Making the RPC accept only absolute paths
removes that split entirely: the handler now receives one
already-validated absolute path and uses it for both config derivation
and setup.

This keeps the trust model unchanged. Trusted clients could already
choose the session cwd; this change is only about making the setup RPC
reject relative paths so command_cwd and policy_cwd cannot diverge.

## Testing
- cargo test -p codex-app-server windows_sandbox_setup (run locally by
user)
- cargo test -p codex-app-server-protocol windows_sandbox (run locally
by user)
2026-03-06 22:47:08 -08:00
Michael Bolin
22ac6b9aaa
sandboxing: plumb split sandbox policies through runtime (#13439)
## Why

`#13434` introduces split `FileSystemSandboxPolicy` and
`NetworkSandboxPolicy`, but the runtime still made most execution-time
sandbox decisions from the legacy `SandboxPolicy` projection.

That projection loses information about combinations like unrestricted
filesystem access with restricted network access. In practice, that
means the runtime can choose the wrong platform sandbox behavior or set
the wrong network-restriction environment for a command even when config
has already separated those concerns.

This PR carries the split policies through the runtime so sandbox
selection, process spawning, and exec handling can consult the policy
that actually matters.

## What changed

- threaded `FileSystemSandboxPolicy` and `NetworkSandboxPolicy` through
`TurnContext`, `ExecRequest`, sandbox attempts, shell escalation state,
unified exec, and app-server exec overrides
- updated sandbox selection in `core/src/sandboxing/mod.rs` and
`core/src/exec.rs` to key off `FileSystemSandboxPolicy.kind` plus
`NetworkSandboxPolicy`, rather than inferring behavior only from the
legacy `SandboxPolicy`
- updated process spawning in `core/src/spawn.rs` and the platform
wrappers to use `NetworkSandboxPolicy` when deciding whether to set
`CODEX_SANDBOX_NETWORK_DISABLED`
- kept additional-permissions handling and legacy `ExternalSandbox`
compatibility projections aligned with the split policies, including
explicit user-shell execution and Windows restricted-token routing
- updated callers across `core`, `app-server`, and `linux-sandbox` to
pass the split policies explicitly

## Verification

- added regression coverage in `core/tests/suite/user_shell_cmd.rs` to
verify `RunUserShellCommand` does not inherit
`CODEX_SANDBOX_NETWORK_DISABLED` from the active turn
- added coverage in `core/src/exec.rs` for Windows restricted-token
sandbox selection when the legacy projection is `ExternalSandbox`
- updated Linux sandbox coverage in
`linux-sandbox/tests/suite/landlock.rs` to exercise the split-policy
exec path
- verified the current PR state with `just clippy`




---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/13439).
* #13453
* #13452
* #13451
* #13449
* #13448
* #13445
* #13440
* __->__ #13439

---------

Co-authored-by: viyatb-oai <viyatb@openai.com>
2026-03-07 02:30:21 +00:00
viyatb-oai
25fa974166
fix: support managed network allowlist controls (#12752)
## Summary
- treat `requirements.toml` `allowed_domains` and `denied_domains` as
managed network baselines for the proxy
- in restricted modes by default, build the effective runtime policy
from the managed baseline plus user-configured allowlist and denylist
entries, so common hosts can be pre-approved without blocking later user
expansion
- add `experimental_network.managed_allowed_domains_only = true` to pin
the effective allowlist to managed entries, ignore user allowlist
additions, and hard-deny non-managed domains without prompting
- apply `managed_allowed_domains_only` anywhere managed network
enforcement is active, including full access, while continuing to
respect denied domains from all sources
- add regression coverage for merged-baseline behavior, managed-only
behavior, and full-access managed-only enforcement

## Behavior
Assuming `requirements.toml` defines both
`experimental_network.allowed_domains` and
`experimental_network.denied_domains`.

### Default mode
- By default, the effective allowlist is
`experimental_network.allowed_domains` plus user or persisted allowlist
additions.
- By default, the effective denylist is
`experimental_network.denied_domains` plus user or persisted denylist
additions.
- Allowlist misses can go through the network approval flow.
- Explicit denylist hits and local or private-network blocks are still
hard-denied.
- When `experimental_network.managed_allowed_domains_only = true`, only
managed `allowed_domains` are respected, user allowlist additions are
ignored, and non-managed domains are hard-denied without prompting.
- Denied domains continue to be respected from all sources.

### Full access
- With managed requirements present, the effective allowlist is pinned
to `experimental_network.allowed_domains`.
- With managed requirements present, the effective denylist is pinned to
`experimental_network.denied_domains`.
- There is no allowlist-miss approval path in full access.
- Explicit denylist hits are hard-denied.
- `experimental_network.managed_allowed_domains_only = true` now also
applies in full access, so managed-only behavior remains in effect
anywhere managed network enforcement is active.
2026-03-06 17:52:54 -08:00
Ruslan Nigmatullin
e9bd8b20a1
app-server: Add streaming and tty/pty capabilities to command/exec (#13640)
* Add an ability to stream stdin, stdout, and stderr
* Streaming of stdout and stderr has a configurable cap for total amount
of transmitted bytes (with an ability to disable it)
* Add support for overriding environment variables
* Add an ability to terminate running applications (using
`command/exec/terminate`)
* Add TTY/PTY support, with an ability to resize the terminal (using
`command/exec/resize`)
2026-03-06 17:30:17 -08:00
xl-openai
0243734300
feat: Add curated plugin marketplace + Metadata Cleanup. (#13712)
1. Add a synced curated plugin marketplace and include it in marketplace
discovery.
2. Expose optional plugin.json interface metadata in plugin/list
3. Tighten plugin and marketplace path handling using validated absolute
paths.
4. Let manifests override skill, MCP, and app config paths.
5. Restrict plugin enablement/config loading to the user config layer so
plugin enablement is at global level
2026-03-06 19:39:35 -05:00
Ruslan Nigmatullin
51fcdc760d
app-server: Emit thread/name/updated event globally (#13674) 2026-03-06 10:25:18 -08:00
jif-oai
fa16c26908
feat: drop sqlite db feature flag (#13750) 2026-03-06 17:57:52 +01:00
Matthew Zeng
98dca99db7
[elicitations] Switch to use MCP style elicitation payload for mcp tool approvals. (#13621)
- [x] Switch to use MCP style elicitation payload for mcp tool
approvals.
- [ ] TODO: Update the UI to support the full spec.
2026-03-06 01:50:26 -08:00
sayan-oai
014a59fb0b
check app auth in plugin/install (#13685)
#### What
on `plugin/install`, check if installed apps are already authed on
chatgpt, and return list of all apps that are not. clients can use this
list to trigger auth workflows as needed.

checks are best effort based on `codex_apps` loading, much like
`app/list`.

#### Tests
Added integration tests, tested locally.
2026-03-06 06:45:00 +00:00
viyatb-oai
6a79ed5920
refactor: remove proxy admin endpoint (#13687)
## Summary
- delete the network proxy admin server and its runtime listener/task
plumbing
- remove the admin endpoint config, runtime, requirement, protocol,
schema, and debug-surface fields
- update proxy docs to reflect the remaining HTTP and SOCKS listeners
only
2026-03-05 22:03:16 -08:00
xl-openai
520ed724d2
support plugin/list. (#13540)
Introduce a plugin/list which reads from local marketplace.json.
Also update the signature for plugin/install.
2026-03-05 21:58:50 -05:00
Max Johnson
1980b6ce00
treat SIGTERM like ctrl-c for graceful shutdown (#13594)
treat SIGTERM the same as SIGINT for graceful app-server websocket
shutdown
2026-03-05 18:16:58 +00:00
Owen Lin
926b2f19e8
feat(app-server): support mcp elicitations in v2 api (#13425)
This adds a first-class server request for MCP server elicitations:
`mcpServer/elicitation/request`.

Until now, MCP elicitation requests only showed up as a raw
`codex/event/elicitation_request` event from core. That made it hard for
v2 clients to handle elicitations using the same request/response flow
as other server-driven interactions (like shell and `apply_patch`
tools).

This also updates the underlying MCP elicitation request handling in
core to pass through the full MCP request (including URL and form data)
so we can expose it properly in app-server.

### Why not `item/mcpToolCall/elicitationRequest`?
This is because MCP elicitations are related to MCP servers first, and
only optionally to a specific MCP tool call.

In the MCP protocol, elicitation is a server-to-client capability: the
server sends `elicitation/create`, and the client replies with an
elicitation result. RMCP models it that way as well.

In practice an elicitation is often triggered by an MCP tool call, but
not always.

### What changed
- add `mcpServer/elicitation/request` to the v2 app-server API
- translate core `codex/event/elicitation_request` events into the new
v2 server request
- map client responses back into `Op::ResolveElicitation` so the MCP
server can continue
- update app-server docs and generated protocol schema
- add an end-to-end app-server test that covers the full round trip
through a real RMCP elicitation flow
- The new test exercises a realistic case where an MCP tool call
triggers an elicitation, the app-server emits
mcpServer/elicitation/request, the client accepts it, and the tool call
resumes and completes successfully.

### app-server API flow
- Client starts a thread with `thread/start`.
- Client starts a turn with `turn/start`.
- App-server sends `item/started` for the `mcpToolCall`.
- While that tool call is in progress, app-server sends
`mcpServer/elicitation/request`.
- Client responds to that request with `{ action: "accept" | "decline" |
"cancel" }`.
- App-server sends `serverRequest/resolved`.
- App-server sends `item/completed` for the mcpToolCall.
- App-server sends `turn/completed`.
- If the turn is interrupted while the elicitation is pending,
app-server still sends `serverRequest/resolved` before the turn
finishes.
2026-03-05 07:20:20 -08:00