core-agent-ide

Author	SHA1	Message	Date
xl-openai	db5781a088	feat: support product-scoped plugins. (#15041 ) 1. Added SessionSource::Custom(String) and --session-source. 2. Enforced plugin and skill products by session_source. 3. Applied the same filtering to curated background refresh.	2026-03-19 00:46:15 -07:00
xl-openai	86982ca1f9	Revert "fix: harden plugin feature gating" (#15102 ) Reverts openai/codex#15020 I messed up the commit in my PR and accidentally merged changes that were still under review.	2026-03-18 15:19:29 -07:00
xl-openai	580f32ad2a	fix: harden plugin feature gating (#15020 ) 1. Use requirement-resolved config.features as the plugin gate. 2. Guard plugin/list, plugin/read, and related flows behind that gate. 3. Skip bad marketplace.json files instead of failing the whole list. 4. Simplify plugin state and caching.	2026-03-18 10:11:43 -07:00
jif-oai	a265d6043e	feat: add memory citation to agent message (#14821 ) Client side to come	2026-03-18 10:03:38 +00:00
Andrei Eternal	6fef421654	[hooks] userpromptsubmit - hook before user's prompt is executed (#14626 ) - this allows blocking the user's prompts from executing, and also prevents them from entering history - handles the edge case where you can both prevent the user's prompt AND add n amount of additionalContexts - refactors some old code into common.rs where hooks overlap functionality - refactors additionalContext being previously added to user messages, instead we use developer messages for them - handles queued messages correctly Sample hook for testing - if you write "[block-user-submit]" this hook will stop the thread: example run ``` › sup • Running UserPromptSubmit hook: reading the observatory notes UserPromptSubmit hook (completed) warning: wizard-tower UserPromptSubmit demo inspected: sup hook context: Wizard Tower UserPromptSubmit demo fired. For this reply only, include the exact phrase 'observatory lanterns lit' exactly once near the end. • Just riding the cosmic wave and ready to help, my friend. What are we building today? observatory lanterns lit › and [block-user-submit] • Running UserPromptSubmit hook: reading the observatory notes UserPromptSubmit hook (stopped) warning: wizard-tower UserPromptSubmit demo blocked the prompt on purpose. stop: Wizard Tower demo block: remove [block-user-submit] to continue. ``` .codex/config.toml ``` [features] codex_hooks = true ``` .codex/hooks.json ``` { "hooks": { "UserPromptSubmit": [ { "hooks": [ { "type": "command", "command": "/usr/bin/python3 .codex/hooks/user_prompt_submit_demo.py", "timeoutSec": 10, "statusMessage": "reading the observatory notes" } ] } ] } } ``` .codex/hooks/user_prompt_submit_demo.py ``` #!/usr/bin/env python3 import json import sys from pathlib import Path def prompt_from_payload(payload: dict) -> str: prompt = payload.get("prompt") if isinstance(prompt, str) and prompt.strip(): return prompt.strip() event = payload.get("event") if isinstance(event, dict): user_prompt = event.get("user_prompt") if isinstance(user_prompt, str): return user_prompt.strip() return "" def main() -> int: payload = json.load(sys.stdin) prompt = prompt_from_payload(payload) cwd = Path(payload.get("cwd", ".")).name or "wizard-tower" if "[block-user-submit]" in prompt: print( json.dumps( { "systemMessage": ( f"{cwd} UserPromptSubmit demo blocked the prompt on purpose." ), "decision": "block", "reason": ( "Wizard Tower demo block: remove [block-user-submit] to continue." ), } ) ) return 0 prompt_preview = prompt or "(empty prompt)" if len(prompt_preview) > 80: prompt_preview = f"{prompt_preview[:77]}..." print( json.dumps( { "systemMessage": ( f"{cwd} UserPromptSubmit demo inspected: {prompt_preview}" ), "hookSpecificOutput": { "hookEventName": "UserPromptSubmit", "additionalContext": ( "Wizard Tower UserPromptSubmit demo fired. " "For this reply only, include the exact phrase " "'observatory lanterns lit' exactly once near the end." ), }, } ) ) return 0 if __name__ == "__main__": raise SystemExit(main()) ```	2026-03-17 22:09:22 -07:00
xl-openai	a5d3114e97	feat: Add product-aware plugin policies and clean up manifest naming (#14993 ) - Add shared Product support to marketplace plugin policy and skill policy (no enforced yet). - Move marketplace installation/authentication under policy and model it as MarketplacePluginPolicy. - Rename plugin/marketplace local manifest types to separate raw serde shapes from resolved in-memory models.	2026-03-17 17:01:34 -07:00
Ahmed Ibrahim	c6ab4ee537	Gate realtime audio interruption logic to v2 (#14984 ) - thread the realtime version into conversation start and app-server notifications - keep playback-aware mic gating and playback interruption behavior on v2 only, leaving v1 on the legacy path	2026-03-17 15:24:37 -07:00
xl-openai	1a9555eda9	Cleanup skills/remote/xxx endpoints. (#14977 ) Remote skills/remote/xxx as they are not in used for now.	2026-03-17 15:22:36 -07:00
jif-oai	e8add54e5d	feat: show effective model in spawn agent event (#14944 ) Show effective model after the full config layering for the sub agent	2026-03-17 16:58:58 +00:00
Ahmed Ibrahim	fbd7f9b986	[stack 2/4] Align main realtime v2 wire and runtime flow (#14830 ) ## Stack Position 2/4. Built on top of #14828. ## Base - #14828 ## Unblocks - #14829 - #14827 ## Scope - Port the realtime v2 wire parsing, session, app-server, and conversation runtime behavior onto the split websocket-method base. - Branch runtime behavior directly on the current realtime session kind instead of parser-derived flow flags. - Keep regression coverage in the existing e2e suites. --------- Co-authored-by: Codex <noreply@openai.com>	2026-03-16 21:38:07 -07:00
jif-oai	3f266bcd68	feat: make interrupt state not final for multi-agents (#13850 ) Make `interrupted` an agent state and make it not final. As a result, a `wait` won't return on an interrupted agent and no notification will be send to the parent agent. The rationals are: * If a user interrupt a sub-agent for any reason, you don't want the parent agent to instantaneously ask the sub-agent to restart * If a parent agent interrupt a sub-agent, no need to add a noisy notification in the parent agen	2026-03-16 16:39:40 +00:00
friel-openai	ba463a9dc7	Preserve background terminals on interrupt and rename cleanup command to /stop (#14602 ) ### Motivation - Interrupting a running turn (Ctrl+C / Esc) currently also terminates long‑running background shells, which is surprising for workflows like local dev servers or file watchers. - The existing cleanup command name was confusing; callers expect an explicit command to stop background terminals rather than a UI clear action. - Make background‑shell termination explicit and surface a clearer command name while preserving backward compatibility. ### Description - Renamed the background‑terminal cleanup slash command from `Clean` (`/clean`) to `Stop` (`/stop`) and kept `clean` as an alias in the command parsing/visibility layer, updated the user descriptions and command popup wiring accordingly. - Updated the unified‑exec footer text and snapshots to point to `/stop` (and trimmed corresponding snapshot output to match the new label). - Changed interrupt behavior so `Op::Interrupt` (Ctrl+C / Esc interrupt) no longer closes or clears tracked unified exec / background terminal processes in the TUI or core cleanup path; background shells are now preserved after an interrupt. - Updated protocol/docs to clarify that `turn/interrupt` (or `Op::Interrupt`) interrupts the active turn but does not terminate background terminals, and that `thread/backgroundTerminals/clean` is the explicit API to stop those shells. - Updated unit/integration tests and insta snapshots in the TUI and core unified‑exec suites to reflect the new semantics and command name. ### Testing - Ran formatting with `just fmt` in `codex-rs` (succeeded). - Ran `cargo test -p codex-protocol` (succeeded). - Attempted `cargo test -p codex-tui` but the build could not complete in this environment due to a native build dependency that requires `libcap` development headers (the `codex-linux-sandbox` vendored build step); install `libcap-dev` / make `libcap.pc` available in `PKG_CONFIG_PATH` to run the TUI test suite locally. - Updated and accepted the affected `insta` snapshots for the TUI changes so visual diffs reflect the new `/stop` wording and preserved interrupt behavior. ------ [Codex Task](https://chatgpt.com/codex/tasks/task_i_69b39c44b6dc8323bd133ae206310fae)	2026-03-15 22:17:25 -07:00
viyatb-oai	9060dc7557	fix: fix symlinked writable roots in sandbox policies (#14674 ) ## Summary - normalize effective readable, writable, and unreadable sandbox roots after resolving special paths so symlinked roots use canonical runtime paths - add a protocol regression test for a symlinked writable root with a denied child and update protocol expectations to canonicalized effective paths - update macOS seatbelt tests to assert against effective normalized roots produced by the shared policy helpers ## Testing - just fmt - cargo test -p codex-protocol - cargo test -p codex-core explicit_unreadable_paths_are_excluded_ - cargo clippy -p codex-protocol -p codex-core --tests -- -D warnings ## Notes - This is intended to fix the symlinked TMPDIR bind failure in bubblewrap described in #14672. Fixes #14672	2026-03-14 13:24:43 -07:00
sayan-oai	d272f45058	move plugin/skill instructions into dev msg and reorder (#14609 ) Move the general `Apps`, `Skills` and `Plugins` instructions blocks out of `user_instructions` and into the developer message, with new `Apps -> Skills -> Plugins` order for better clarity. Also wrap those sections in stable XML-style instruction tags (like other sections) and update prompt-layout tests/snapshots. This makes the tests less brittle in snapshot output (we can parse the sections), and it consolidates the capability instructions in one place. #### Tests Updated snapshots, added tests. `<AGENTS_MD>` disappearing in snapshots is expected: before this change, the wrapped user-instructions message was kept alive by `Skills` content. Now that `Skills` and `Plugins` are in the developer message, that wrapper only appears when there is real project-doc/user-instructions content. --------- Co-authored-by: Charley Cunningham <ccunningham@openai.com>	2026-03-13 20:51:01 -07:00
Charley Cunningham	bc24017d64	Add Smart Approvals guardian review across core, app-server, and TUI (#13860 ) ## Summary - add `approvals_reviewer = "user" \| "guardian_subagent"` as the runtime control for who reviews approval requests - route Smart Approvals guardian review through core for command execution, file changes, managed-network approvals, MCP approvals, and delegated/subagent approval flows - expose guardian review in app-server with temporary unstable `item/autoApprovalReview/{started,completed}` notifications carrying `targetItemId`, `review`, and `action` - update the TUI so Smart Approvals can be enabled from `/experimental`, aligned with the matching `/approvals` mode, and surfaced clearly while reviews are pending or resolved ## Runtime model This PR does not introduce a new `approval_policy`. Instead: - `approval_policy` still controls when approval is needed - `approvals_reviewer` controls who reviewable approval requests are routed to: - `user` - `guardian_subagent` `guardian_subagent` is a carefully prompted reviewer subagent that gathers relevant context and applies a risk-based decision framework before approving or denying the request. The `smart_approvals` feature flag is a rollout/UI gate. Core runtime behavior keys off `approvals_reviewer`. When Smart Approvals is enabled from the TUI, it also switches the current `/approvals` settings to the matching Smart Approvals mode so users immediately see guardian review in the active thread: - `approval_policy = on-request` - `approvals_reviewer = guardian_subagent` - `sandbox_mode = workspace-write` Users can still change `/approvals` afterward. Config-load behavior stays intentionally narrow: - plain `smart_approvals = true` in `config.toml` remains just the rollout/UI gate and does not auto-set `approvals_reviewer` - the deprecated `guardian_approval = true` alias migration does backfill `approvals_reviewer = "guardian_subagent"` in the same scope when that reviewer is not already configured there, so old configs preserve their original guardian-enabled behavior ARC remains a separate safety check. For MCP tool approvals, ARC escalations now flow into the configured reviewer instead of always bypassing guardian and forcing manual review. ## Config stability The runtime reviewer override is stable, but the config-backed app-server protocol shape is still settling. - `thread/start`, `thread/resume`, and `turn/start` keep stable `approvalsReviewer` overrides - the config-backed `approvals_reviewer` exposure returned via `config/read` (including profile-level config) is now marked `[UNSTABLE]` / experimental in the app-server protocol until we are more confident in that config surface ## App-server surface This PR intentionally keeps the guardian app-server shape narrow and temporary. It adds generic unstable lifecycle notifications: - `item/autoApprovalReview/started` - `item/autoApprovalReview/completed` with payloads of the form: - `{ threadId, turnId, targetItemId, review, action? }` `review` is currently: - `{ status, riskScore?, riskLevel?, rationale? }` - where `status` is one of `inProgress`, `approved`, `denied`, or `aborted` `action` carries the guardian action summary payload from core when available. This lets clients render temporary standalone pending-review UI, including parallel reviews, even when the underlying tool item has not been emitted yet. These notifications are explicitly documented as `[UNSTABLE]` and expected to change soon. This PR does not persist guardian review state onto `thread/read` tool items. The intended follow-up is to attach guardian review state to the reviewed tool item lifecycle instead, which would improve consistency with manual approvals and allow thread history / reconnect flows to replay guardian review state directly. ## TUI behavior - `/experimental` exposes the rollout gate as `Smart Approvals` - enabling it in the TUI enables the feature and switches the current session to the matching Smart Approvals `/approvals` mode - disabling it in the TUI clears the persisted `approvals_reviewer` override when appropriate and returns the session to default manual review when the effective reviewer changes - `/approvals` still exposes the reviewer choice directly - the TUI renders: - pending guardian review state in the live status footer, including parallel review aggregation - resolved approval/denial state in history ## Scope notes This PR includes the supporting core/runtime work needed to make Smart Approvals usable end-to-end: - shell / unified-exec / apply_patch / managed-network / MCP guardian review - delegated/subagent approval routing into guardian review - guardian review risk metadata and action summaries for app-server/TUI - config/profile/TUI handling for `smart_approvals`, `guardian_approval` alias migration, and `approvals_reviewer` - a small internal cleanup of delegated approval forwarding to dedupe fallback paths and simplify guardian-vs-parent approval waiting (no intended behavior change) Out of scope for this PR: - redesigning the existing manual approval protocol shapes - persisting guardian review state onto app-server `ThreadItem`s - delegated MCP elicitation auto-review (the current delegated MCP guardian shim only covers the legacy `RequestUserInput` path) --------- Co-authored-by: Codex <noreply@openai.com>	2026-03-13 15:27:00 -07:00
Owen Lin	014e19510d	feat(app-server, core): add more spans (#14479 ) ## Description This PR expands tracing coverage across app-server thread startup, core session initialization, and the Responses transport layer. It also gives core dispatch spans stable operation-specific names so traces are easier to follow than the old generic `submission_dispatch` spans. Also use `fmt::Display` for types that we serialize in traces so we send strings instead of rust types	2026-03-13 13:16:33 -07:00
Jack Mousseau	b7dba72dbd	Rename reject approval policy to granular (#14516 )	2026-03-12 16:38:04 -07:00
Jack Mousseau	a314c7d3ae	Decouple request permissions feature and tool (#14426 )	2026-03-12 14:47:08 -07:00
Ahmed Ibrahim	bf5e997b31	Include spawn agent model metadata in app-server items (#14410 ) - add model and reasoning effort to app-server collab spawn items and notifications - regenerate app-server protocol schemas for the new fields --------- Co-authored-by: Codex <noreply@openai.com>	2026-03-11 19:25:21 -07:00
Ahmed Ibrahim	285b3a5143	Show spawned agent model and effort in TUI (#14273 ) - include the requested sub-agent model and reasoning effort in the spawn begin event\n- render that metadata next to the spawned agent name and role in the TUI transcript --------- Co-authored-by: Codex <noreply@openai.com>	2026-03-11 12:33:09 -07:00
Celia Chen	c1a424691f	chore: add a separate reject-policy flag for skill approvals (#14271 ) ## Summary - add `skill_approval` to `RejectConfig` and the app-server v2 `AskForApproval::Reject` payload so skill-script prompts can be configured independently from sandbox and rule-based prompts - update Unix shell escalation to reject prompts based on the actual decision source, keeping prefix rules tied to `rules`, unmatched command fallbacks tied to `sandbox_approval`, and skill scripts tied to `skill_approval` - regenerate the affected protocol/config schemas and expand unit/integration coverage for the new flag and skill approval behavior	2026-03-11 12:33:09 -07:00
Ahmed Ibrahim	2e24be2134	Use realtime transcript for handoff context (#14132 ) - collect input/output transcript deltas into active handoff transcript state - attach and clear that transcript on each handoff, and regenerate schema/tests	2026-03-09 22:30:03 -07:00
Dylan Hurd	772259b01f	fix(core) default RejectConfig.request_permissions (#14165 ) ## Summary Adds a default here so existing config deserializes ## Testing - [x] Added a unit test	2026-03-10 04:56:23 +00:00
Andrei Eternal	244b2d53f4	start of hooks engine (#13276 ) (Experimental) This PR adds a first MVP for hooks, with SessionStart and Stop The core design is: - hooks live in a dedicated engine under codex-rs/hooks - each hook type has its own event-specific file - hook execution is synchronous and blocks normal turn progression while running - matching hooks run in parallel, then their results are aggregated into a normalized HookRunSummary On the AppServer side, hooks are exposed as operational metadata rather than transcript-native items: - new live notifications: hook/started, hook/completed - persisted/replayed hook results live on Turn.hookRuns - we intentionally did not add hook-specific ThreadItem variants Hooks messages are not persisted, they remain ephemeral. The context changes they add are (they get appended to the user's prompt)	2026-03-10 04:11:31 +00:00
viyatb-oai	b0cbc25a48	fix(protocol): preserve legacy workspace-write semantics (#13957 ) ## Summary This is a fast follow to the initial `[permissions]` structure. - keep the new split-policy carveout behavior for narrower non-write entries under broader writable roots - preserve legacy `WorkspaceWrite` semantics by using a cwd-aware bridge that drops only redundant nested readable roots when projecting from `SandboxPolicy` - route the legacy macOS seatbelt adapter through that same legacy bridge so redundant nested readable roots do not become read-only carveouts on macOS - derive the legacy bridge for `command_exec` using the sandbox root cwd rather than the request cwd so policy derivation matches later sandbox enforcement - add regression coverage for the legacy macOS nested-readable-root case ## Examples ### Legacy `workspace-write` on macOS A legacy `workspace-write` policy can redundantly list a nested readable root under an already-writable workspace root. For example, legacy config can effectively mean: - workspace root (`.` / `cwd`) is writable - `docs/` is also listed in `readable_roots` The new shared split-policy helper intentionally treats a narrower non-write entry under a broader writable root as a carveout for real `[permissions]` configs. Without this fast follow, the unchanged macOS seatbelt legacy adapter could project that legacy shape into a `FileSystemSandboxPolicy` that treated `docs/` like a read-only carveout under the writable workspace root. In practice, legacy callers on macOS could unexpectedly lose write access inside `docs/`, even though that path was writable before the `[permissions]` migration work. This change fixes that by routing the legacy seatbelt path through the cwd-aware legacy bridge, so: - legacy `workspace-write` keeps `docs/` writable when `docs/` was only a redundant readable root - explicit `[permissions]` entries like `'.' = 'write'` and `'docs' = 'read'` still make `docs/` read-only, which is the new intended split-policy behavior ### Legacy `command_exec` with a subdirectory cwd `command_exec` can run a command from a request cwd that is narrower than the sandbox root cwd. For example: - sandbox root cwd is `/repo` - request cwd is `/repo/subdir` - legacy policy is still `workspace-write` rooted at `/repo` Before this fast follow, `command_exec` derived the legacy bridge using the request cwd, but the sandbox was later built using the sandbox root cwd. That mismatch could miss redundant legacy readable roots during projection and accidentally reintroduce read-only carveouts for paths that should still be writable under the legacy model. This change fixes that by deriving the legacy bridge with the same sandbox root cwd that sandbox enforcement later uses. ## Verification - `just fmt` - `cargo test -p codex-core seatbelt_legacy_workspace_write_nested_readable_root_stays_writable` - `cargo test -p codex-core test_sandbox_config_parsing` - `cargo clippy -p codex-core -p codex-app-server --all-targets -- -D warnings` - `cargo clean`	2026-03-09 18:43:27 -07:00
Dylan Hurd	6da84efed8	feat(approvals) RejectConfig for request_permissions (#14118 ) ## Summary We need to support allowing request_permissions calls when using `Reject` policy <img width="1133" height="588" alt="Screenshot 2026-03-09 at 12 06 40 PM" src="https://github.com/user-attachments/assets/a8df987f-c225-4866-b8ab-5590960daec5" /> Note that this is a backwards-incompatible change for Reject policy. I'm not sure if we need to add a default based on our current use/setup ## Testing - [x] Added tests - [x] Tested locally	2026-03-09 18:16:54 -07:00
Jack Mousseau	e6b93841c5	Add request permissions tool (#13092 ) Adds a built-in `request_permissions` tool and wires it through the Codex core, protocol, and app-server layers so a running turn can ask the client for additional permissions instead of relying on a static session policy. The new flow emits a `RequestPermissions` event from core, tracks the pending request by call ID, forwards it through app-server v2 as an `item/permissions/requestApproval` request, and resumes the tool call once the client returns an approved subset of the requested permission profile.	2026-03-08 20:23:06 -07:00
Celia Chen	340f9c9ecb	app-server: include experimental skill metadata in exec approval requests (#13929 ) ## Summary This change surfaces skill metadata on command approval requests so app-server clients can tell when an approval came from a skill script and identify the originating `SKILL.md`. - add `skill_metadata` to exec approval events in the shared protocol - thread skill metadata through core shell escalation and delegated approval handling for skill-triggered approvals - expose the field in app-server v2 as experimental `skillMetadata` - regenerate the JSON/TypeScript schemas and cover the new field in protocol, transport, core, and TUI tests ## Why Skill-triggered approvals already carry skill context inside core, but app-server clients could not see which skill caused the prompt. Sending the skill metadata with the approval request makes it possible for clients to present better approval UX and connect the prompt back to the relevant skill definition. ## example event in app-server-v2 verified that we see this event when experimental api is on: ``` < { < "id": 11, < "method": "item/commandExecution/requestApproval", < "params": { < "additionalPermissions": { < "fileSystem": null, < "macos": { < "accessibility": false, < "automations": { < "bundle_ids": [ < "com.apple.Notes" < ] < }, < "calendar": false, < "preferences": "read_only" < }, < "network": null < }, < "approvalId": "25d600ee-5a3c-4746-8d17-e2e61fb4c563", < "availableDecisions": [ < "accept", < "acceptForSession", < "cancel" < ], < "command": "/Applications/ChatGPT.app/Contents/Resources/CodexAppServer_CodexAppServerBundledSkills.bundle/Contents/Resources/skills/apple-notes/scripts/notes_info", < "commandActions": [ < { < "command": "/Applications/ChatGPT.app/Contents/Resources/CodexAppServer_CodexAppServerBundledSkills.bundle/Contents/Resources/skills/apple-notes/scripts/notes_info", < "type": "unknown" < } < ], < "cwd": "/Applications/ChatGPT.app/Contents/Resources/CodexAppServer_CodexAppServerBundledSkills.bundle/Contents/Resources/skills/apple-notes", < "itemId": "call_jZp3xFpNg4D8iKAD49cvEvZy", < "skillMetadata": { < "pathToSkillsMd": "/Applications/ChatGPT.app/Contents/Resources/CodexAppServer_CodexAppServerBundledSkills.bundle/Contents/Resources/skills/apple-notes/SKILL.md" < }, < "threadId": "019ccc10-b7d3-7ff2-84fe-3a75e7681e69", < "turnId": "019ccc10-b848-76f1-81b3-4a1fa225493f" < } < }` ``` & verified that this is the event when experimental api is off: ``` < { < "id": 13, < "method": "item/commandExecution/requestApproval", < "params": { < "approvalId": "5fbbf776-261b-4cf8-899b-c125b547f2c0", < "availableDecisions": [ < "accept", < "acceptForSession", < "cancel" < ], < "command": "/Applications/ChatGPT.app/Contents/Resources/CodexAppServer_CodexAppServerBundledSkills.bundle/Contents/Resources/skills/apple-notes/scripts/notes_info", < "commandActions": [ < { < "command": "/Applications/ChatGPT.app/Contents/Resources/CodexAppServer_CodexAppServerBundledSkills.bundle/Contents/Resources/skills/apple-notes/scripts/notes_info", < "type": "unknown" < } < ], < "cwd": "/Users/celia/code/codex/codex-rs", < "itemId": "call_OV2DHzTgYcbYtWaTTBWlocOt", < "threadId": "019ccc16-2a2b-7be1-8500-e00d45b892d4", < "turnId": "019ccc16-2a8e-7961-98ec-649600e7d06a" < } < } ```	2026-03-08 18:07:46 -07:00
Michael Bolin	3b5fe5ca35	protocol: keep root carveouts sandboxed (#13452 ) ## Why A restricted filesystem policy that grants `:root` read or write access but also carries explicit deny entries should still behave like scoped access with carveouts, not like unrestricted disk access. Without that distinction, later platform backends cannot preserve blocked subpaths under root-level permissions because the protocol layer reports the policy as fully unrestricted. ## What changed - taught `FileSystemSandboxPolicy` to treat root access plus explicit deny entries as scoped access rather than full-disk access - derived readable and writable roots from the filesystem root when root access is combined with carveouts, while preserving the denied paths as read-only subpaths - added protocol coverage for root-write policies with carveouts and a core sandboxing regression so those policies still require platform sandboxing ## Verification - added protocol coverage in `protocol/src/permissions.rs` and `protocol/src/protocol.rs` for root access with explicit carveouts - added platform-sandbox regression coverage in `core/src/sandboxing/mod.rs` - verified the current PR state with `just clippy` --- [//]: # (BEGIN SAPLING FOOTER) Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/openai/codex/pull/13452). * #13453 * __->__ #13452 * #13451 * #13449 * #13448 * #13445 * #13440 * #13439 --------- Co-authored-by: viyatb-oai <viyatb@openai.com>	2026-03-07 21:15:47 -08:00
Michael Bolin	07a30da3fb	linux-sandbox: plumb split sandbox policies through helper (#13449 ) ## Why The Linux sandbox helper still only accepted the legacy `SandboxPolicy` payload. That meant the runtime could compute split filesystem and network policies, but the helper would immediately collapse them back to the compatibility projection before applying seccomp or staging the bubblewrap inner command. ## What changed - added hidden `--file-system-sandbox-policy` and `--network-sandbox-policy` flags alongside the legacy `--sandbox-policy` flag so the helper can migrate incrementally - updated the core-side Landlock wrapper to pass the split policies explicitly when launching `codex-linux-sandbox` - added helper-side resolution logic that accepts either the legacy policy alone or a complete split-policy pair and normalizes that into one effective configuration - switched Linux helper network decisions to use `NetworkSandboxPolicy` directly - added `FromStr` support for the split policy types so the helper can parse them from CLI JSON ## Verification - added helper coverage in `linux-sandbox/src/linux_run_main_tests.rs` for split-policy flags and policy resolution - added CLI argument coverage in `core/src/landlock.rs` - verified the current PR state with `just clippy` --- [//]: # (BEGIN SAPLING FOOTER) Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/openai/codex/pull/13449). * #13453 * #13452 * #13451 * __->__ #13449 * #13448 * #13445 * #13440 * #13439 --------- Co-authored-by: viyatb-oai <viyatb@openai.com>	2026-03-07 19:40:10 -08:00
Michael Bolin	b52c18e414	protocol: derive effective file access from filesystem policies (#13440 ) ## Why `#13434` and `#13439` introduce split filesystem and network policies, but the only code that could answer basic filesystem questions like "is access effectively unrestricted?" or "which roots are readable and writable for this cwd?" still lived on the legacy `SandboxPolicy` path. That would force later backends to either keep projecting through `SandboxPolicy` or duplicate path-resolution logic. This PR moves those queries onto `FileSystemSandboxPolicy` itself so later runtime and platform changes can consume the split policy directly. ## What changed - added `FileSystemSandboxPolicy` helpers for full-read/full-write checks, platform-default reads, readable roots, writable roots, and explicit unreadable roots resolved against a cwd - added a shared helper for the default read-only carveouts under writable roots so the legacy and split-policy paths stay aligned - added protocol coverage for full-access detection and derived readable, writable, and unreadable roots ## Verification - added protocol coverage in `protocol/src/protocol.rs` and `protocol/src/permissions.rs` for full-root access and derived filesystem roots - verified the current PR state with `just clippy` --- [//]: # (BEGIN SAPLING FOOTER) Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/openai/codex/pull/13440). * #13453 * #13452 * #13451 * #13449 * #13448 * #13445 * __->__ #13440 * #13439 --------- Co-authored-by: viyatb-oai <viyatb@openai.com>	2026-03-07 03:49:29 +00:00
Michael Bolin	f82678b2a4	config: add initial support for the new permission profile config language in config.toml (#13434 ) ## Why `SandboxPolicy` currently mixes together three separate concerns: - parsing layered config from `config.toml` - representing filesystem sandbox state - carrying basic network policy alongside filesystem choices That makes the existing config awkward to extend and blocks the new TOML proposal where `[permissions]` becomes a table of named permission profiles selected by `default_permissions`. (The idea is that if `default_permissions` is not specified, we assume the user is opting into the "traditional" way to configure the sandbox.) This PR adds the config-side plumbing for those profiles while still projecting back to the legacy `SandboxPolicy` shape that the current macOS and Linux sandbox backends consume. It also tightens the filesystem profile model so scoped entries only exist for `:project_roots`, and so nested keys must stay within a project root instead of using `.` or `..` traversal. This drops support for the short-lived `[permissions.network]` in `config.toml` because now that would be interpreted as a profile named `network` within `[permissions]`. ## What Changed - added `PermissionsToml`, `PermissionProfileToml`, `FilesystemPermissionsToml`, and `FilesystemPermissionToml` so config can parse named profiles under `[permissions.<profile>.filesystem]` - added top-level `default_permissions` selection, validation for missing or unknown profiles, and compilation from a named profile into split `FileSystemSandboxPolicy` and `NetworkSandboxPolicy` values - taught config loading to choose between the legacy `sandbox_mode` path and the profile-based path without breaking legacy users - introduced `codex-protocol::permissions` for the split filesystem and network sandbox types, and stored those alongside the legacy projected `sandbox_policy` in runtime `Permissions` - modeled `FileSystemSpecialPath` so only `ProjectRoots` can carry a nested `subpath`, matching the intended config syntax instead of allowing invalid states for other special paths - restricted scoped filesystem maps to `:project_roots`, with validation that nested entries are non-empty descendant paths and cannot use `.` or `..` to escape the project root - kept existing runtime consumers working by projecting `FileSystemSandboxPolicy` back into `SandboxPolicy`, with an explicit error for profiles that request writes outside the workspace root - loaded proxy settings from top-level `[network]` - regenerated `core/config.schema.json` ## Verification - added config coverage for profile deserialization, `default_permissions` selection, top-level `[network]` loading, network enablement, rejection of writes outside the workspace root, rejection of nested entries for non-`:project_roots` special paths, and rejection of parent-directory traversal in `:project_roots` maps - added protocol coverage for the legacy bridge rejecting non-workspace writes ## Docs - update the Codex config docs on developers.openai.com/codex to document named `[permissions.<profile>]` entries, `default_permissions`, scoped `:project_roots` syntax, the descendant-path restriction for nested `:project_roots` entries, and top-level `[network]` proxy configuration --- [//]: # (BEGIN SAPLING FOOTER) Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/openai/codex/pull/13434). * #13453 * #13452 * #13451 * #13449 * #13448 * #13445 * #13440 * #13439 * __->__ #13434	2026-03-06 15:39:13 -08:00
Matthew Zeng	98dca99db7	[elicitations] Switch to use MCP style elicitation payload for mcp tool approvals. (#13621 ) - [x] Switch to use MCP style elicitation payload for mcp tool approvals. - [ ] TODO: Update the UI to support the full spec.	2026-03-06 01:50:26 -08:00
Won Park	ee1a20258a	Enabling CWD Saving for Image-Gen (#13607 ) Codex now saves the generated image on to your current working directory.	2026-03-06 00:47:21 -08:00
viyatb-oai	6a79ed5920	refactor: remove proxy admin endpoint (#13687 ) ## Summary - delete the network proxy admin server and its runtime listener/task plumbing - remove the admin endpoint config, runtime, requirement, protocol, schema, and debug-surface fields - update proxy docs to reflect the remaining HTTP and SOCKS listeners only	2026-03-05 22:03:16 -08:00
Owen Lin	aa3fe8abf8	feat(core): persist trace_id for turns in RolloutItem::TurnContext (#13602 ) This PR adds a durable trace linkage for each turn by storing the active trace ID on the rollout TurnContext record stored in session rollout files. Before this change, we propagated trace context at runtime but didn’t persist a stable per-turn trace key in rollout history. That made after-the-fact debugging harder (for example, mapping a historical turn to the corresponding trace in datadog). This sets us up for much easier debugging in the future. ### What changed - Added an optional `trace_id` to TurnContextItem (rollout schema). - Added a small OTEL helper to read the current span trace ID. - Captured `trace_id` when creating `TurnContext` and included it in `to_turn_context_item()`. - Updated tests and fixtures that construct TurnContextItem so older/no-trace cases still work. ### Why this approach TurnContext is already the canonical durable per-turn metadata in rollout. This keeps ownership clean: trace linkage lives with other persisted turn metadata.	2026-03-05 13:26:48 -08:00
Owen Lin	926b2f19e8	feat(app-server): support mcp elicitations in v2 api (#13425 ) This adds a first-class server request for MCP server elicitations: `mcpServer/elicitation/request`. Until now, MCP elicitation requests only showed up as a raw `codex/event/elicitation_request` event from core. That made it hard for v2 clients to handle elicitations using the same request/response flow as other server-driven interactions (like shell and `apply_patch` tools). This also updates the underlying MCP elicitation request handling in core to pass through the full MCP request (including URL and form data) so we can expose it properly in app-server. ### Why not `item/mcpToolCall/elicitationRequest`? This is because MCP elicitations are related to MCP servers first, and only optionally to a specific MCP tool call. In the MCP protocol, elicitation is a server-to-client capability: the server sends `elicitation/create`, and the client replies with an elicitation result. RMCP models it that way as well. In practice an elicitation is often triggered by an MCP tool call, but not always. ### What changed - add `mcpServer/elicitation/request` to the v2 app-server API - translate core `codex/event/elicitation_request` events into the new v2 server request - map client responses back into `Op::ResolveElicitation` so the MCP server can continue - update app-server docs and generated protocol schema - add an end-to-end app-server test that covers the full round trip through a real RMCP elicitation flow - The new test exercises a realistic case where an MCP tool call triggers an elicitation, the app-server emits mcpServer/elicitation/request, the client accepts it, and the tool call resumes and completes successfully. ### app-server API flow - Client starts a thread with `thread/start`. - Client starts a turn with `turn/start`. - App-server sends `item/started` for the `mcpToolCall`. - While that tool call is in progress, app-server sends `mcpServer/elicitation/request`. - Client responds to that request with `{ action: "accept" \| "decline" \| "cancel" }`. - App-server sends `serverRequest/resolved`. - App-server sends `item/completed` for the mcpToolCall. - App-server sends `turn/completed`. - If the turn is interrupted while the elicitation is pending, app-server still sends `serverRequest/resolved` before the turn finishes.	2026-03-05 07:20:20 -08:00
Won Park	229e6d0347	image-gen-event/client_processing (#13512 ) enabling client-side to process with image-generation capabilities (setting app-server)	2026-03-04 16:54:38 -08:00
Celia Chen	e6773f856c	Feat: Preserve network access on read-only sandbox policies (#13409 ) ## Summary `PermissionProfile.network` could not be preserved when additional or compiled permissions resolved to `SandboxPolicy::ReadOnly`, because `ReadOnly` had no network_access field. This change makes read-only + network enabled representable directly and threads that through the protocol, app-server v2 mirror, and permission- merging logic. ## What changed - Added `network_access: bool` to `SandboxPolicy::ReadOnly` in the core protocol and app-server v2 protocol. - Kept backward compatibility by defaulting the new field to false, so legacy read-only payloads still deserialize unchanged. - Updated `has_full_network_access()` and sandbox summaries to respect read-only network access. - Preserved PermissionProfile.network when: - compiling skill permission profiles into sandbox policies - normalizing additional permissions - merging additional permissions into existing sandbox policies - Updated the approval overlay to show network in the rendered permission rule when requested. - Regenerated app-server schema fixtures for the new v2 wire shape.	2026-03-04 02:41:57 +00:00
Owen Lin	52521a5e40	feat(app-server): propagate app-server trace context into core (#13368 ) ### Summary Propagate trace context originating at app-server RPC method handlers -> codex core submission loop (so this includes spans such as `run_turn`!). This implements PR 2 of the app-server tracing rollout. This also removes the old lower-level env-based reparenting in core so explicit request/submission ancestry wins instead of being overridden by ambient `TRACEPARENT` state. ### What changed - Added `trace: Option<W3cTraceContext>` to codex_protocol::Submission - Taught `Codex::submit()` / `submit_with_id()` to automatically capture the current span context when constructing or forwarding a submission - Wrapped the core submission loop in a submission_dispatch span parented from Submission.trace - Warn on invalid submission trace carriers and ignore them cleanly - Removed the old env-based downstream reparenting path in core task execution - Stopped OTEL provider init from implicitly attaching env trace context process-wide - Updated mcp-server Submission call sites for the new field Added focused unit tests for: - capturing trace context into Submission - preferring `Submission.trace` when building the core dispatch span ### Why PR 1 gave us consistent inbound request spans in app-server, but that only covered the transport boundary. For long-running work like turns and reviews, the important missing piece was preserving ancestry after the request handler returns and core continues work on a different async path. This change makes that handoff explicit and keeps the parentage rules simple: - app-server request span sets the current context - `Submission.trace` snapshots that context - core restores it once, at the submission boundary - deeper core spans inherit naturally That also lets us stop relying on env-based reparenting for this path, which was too ambient and could override explicit ancestry.	2026-03-04 01:03:45 +00:00
pash-openai	07e532dcb9	app-server service tier plumbing (plus some cleanup) (#13334 ) followup to https://github.com/openai/codex/pull/13212 to expose fast tier controls to app server (majority of this PR is generated schema jsons - actual code is +69 / -35 and +24 tests ) - add service tier fields to the app-server protocol surfaces used by thread lifecycle, turn start, config, and session configured events - thread service tier through the app-server message processor and core thread config snapshots - allow runtime config overrides to carry service tier for app-server callers cleanup: - Removing useless "legacy" code supporting "standard" - we moved to None \| "fast", so "standard" is not needed.	2026-03-03 02:35:09 -08:00
pash-openai	2f5b01abd6	add fast mode toggle (#13212 ) - add a local Fast mode setting in codex-core (similar to how model id is currently stored on disk locally) - send `service_tier=priority` on requests when Fast is enabled - add `/fast` in the TUI and persist it locally - feature flag	2026-03-02 20:29:33 -08:00
Ahmed Ibrahim	b20b6aa46f	Update realtime websocket API (#13265 ) - migrate the realtime websocket transport to the new session and handoff flow - make the realtime model configurable in config.toml and use API-key auth for the websocket --------- Co-authored-by: Codex <noreply@openai.com>	2026-03-02 16:05:40 -08:00
Owen Lin	d473e8d56d	feat(app-server): add tracing to all app-server APIs (#13285 ) ### Overview This PR adds the first piece of tracing for app-server JSON-RPC requests. There are two main changes: - JSON-RPC requests can now take an optional W3C trace context at the top level via a `trace` field (`traceparent` / `tracestate`). - app-server now creates a dedicated request span for every inbound JSON-RPC request in `MessageProcessor`, and uses the request-level trace context as the parent when present. For compatibility with existing flows, app-server still falls back to the TRACEPARENT env var when there is no request-level traceparent. This PR is intentionally scoped to the app-server boundary. In a followup, we'll actually propagate trace context through the async handoff into core execution spans like run_turn, which will make app-server traces much more useful. ### Spans A few details on the app-server span shape: - each inbound request gets its own server span - span/resource names are based on the JSON-RPC method (`initialize`, `thread/start`, `turn/start`, etc.) - spans record transport (stdio vs websocket), request id, connection id, and client name/version when available - `initialize` stores client metadata in session state so later requests on the same connection can reuse it	2026-03-02 16:01:41 -08:00
jif-oai	b649953845	feat: polluted memories (#13008 ) Add a feature flag to disable memory creation for "polluted"	2026-03-02 11:57:32 +00:00
Ahmed Ibrahim	0aeb55bf08	Record realtime close marker on replacement (#13058 ) ## Summary - record a realtime close developer message when a new realtime session replaces an active one - assert the replacement marker through the mocked responses request path --------- Co-authored-by: Codex <noreply@openai.com> Co-authored-by: Charles Cunningham <ccunningham@openai.com>	2026-03-01 13:54:12 -08:00
Charley Cunningham	695957a348	Unify rollout reconstruction with resume/fork TurnContext hydration (#12612 ) ## Summary This PR unifies rollout history reconstruction and resume/fork metadata hydration under a single `Session::reconstruct_history_from_rollout` implementation. The key change from main is that replay metadata now comes from the same reconstruction pass that rebuilds model-visible history, instead of doing a second bespoke rollout scan to recover `previous_model` / `reference_context_item`. ## What Changed ### Unified reconstruction output `reconstruct_history_from_rollout` now returns a single `RolloutReconstruction` bundle containing: - rebuilt `history` - `previous_model` - `reference_context_item` Resume and fork both consume that shared output directly. ### Reverse replay core The reconstruction logic moved into `codex-rs/core/src/codex/rollout_reconstruction.rs` and now scans rollout items newest-to-oldest. That reverse pass: - derives `previous_model` - derives whether `reference_context_item` is preserved or cleared - stops early once it has both resume metadata and a surviving `replacement_history` checkpoint History materialization is still bridged eagerly for now by replaying only the surviving suffix forward, which keeps the history result stable while moving the control flow toward the future lazy reverse loader design. ### Removed bespoke context lookup This deletes `last_rollout_regular_turn_context_lookup` and its separate compaction-aware scan. The previous model / baseline metadata is now computed from the same replay state that rebuilds history, so resume/fork cannot drift from the reconstructed transcript view. ### `TurnContextItem` persistence contract `TurnContextItem` is now treated as the replay source of truth for durable model-visible baselines. This PR keeps the following contract explicit: - persist `TurnContextItem` for the first real user turn so resume can recover `previous_model` - persist it for later turns that emit model-visible context updates - if mid-turn compaction reinjects full initial context into replacement history, persist a fresh `TurnContextItem` after `Compacted` so resume/fork can re-establish the baseline from the rewritten history - do not treat manual compaction or pre-sampling compaction as creating a new durable baseline on their own ## Behavior Preserved - rollback replay stays aligned with `drop_last_n_user_turns` - rollback skips only user turns - incomplete active user turns are dropped before older finalized turns when rollback applies - unmatched aborts do not consume the current active turn - missing abort IDs still conservatively clear stale compaction state - compaction clears `reference_context_item` until a later `TurnContextItem` re-establishes it - `previous_model` still comes from the newest surviving user turn that established one ## Tests Targeted validation run for the current branch shape: - `cd codex-rs && cargo test -p codex-core --lib codex::rollout_reconstruction_tests -- --nocapture` - `cd codex-rs && just fmt` The branch also extracts the rollout reconstruction tests into `codex-rs/core/src/codex/rollout_reconstruction_tests.rs` so this logic has a dedicated home instead of living inline in `codex.rs`.	2026-02-27 13:50:45 -08:00
Celia Chen	90cc4e79a2	feat: add local date/timezone to turn environment context (#12947 ) ## Summary This PR includes the session's local date and timezone in the model-visible environment context and persists that data in `TurnContextItem`. ## What changed - captures the current local date and IANA timezone when building a turn context, with a UTC fallback if the timezone lookup fails - includes current_date and timezone in the serialized <environment_context> payload - stores those fields on TurnContextItem so they survive rollout/history handling, subagent review threads, and resume flows - treats date/timezone changes as environment updates, so prompt caching and context refresh logic do not silently reuse stale time context - updates tests to validate the new environment fields without depending on a single hardcoded environment-context string ## test built a local build and saw it in the rollout file: ``` {"timestamp":"2026-02-26T21:39:50.737Z","type":"response_item","payload":{"type":"message","role":"user","content":[{"type":"input_text","text":"<environment_context>\n <shell>zsh</shell>\n <current_date>2026-02-26</current_date>\n <timezone>America/Los_Angeles</timezone>\n</environment_context>"}]}} ```	2026-02-26 23:17:35 +00:00
pakrym-oai	951a389654	Allow clients not to send summary as an option (#12950 ) Summary is a required parameter on UserTurn. Ideally we'd like the core to decide the appropriate summary level. Make the summary optional and don't send it when not needed.	2026-02-26 14:37:38 -08:00
jif-oai	14a08d6c14	nit: captial (#12885 )	2026-02-26 09:36:13 +00:00

1 2 3 4 5

247 commits