core-agent-ide

Author	SHA1	Message	Date
sayan-oai	8a54d3caaa	feat: structured plugin parsing (#13711 ) #### What Add structured `@plugin` parsing and TUI support for plugin mentions. - Core: switch from plain-text `@display_name` parsing to structured `plugin://...` mentions via `UserInput::Mention` and `[$...](plugin://...)` links in text, same pattern as apps/skills. - TUI: add plugin mention popup, autocomplete, and chips when typing `$`. Load plugin capability summaries and feed them into the composer; plugin mentions appear alongside skills and apps. - Generalize mention parsing to a sigil parameter, still defaults to `$` <img width="797" height="119" alt="image" src="https://github.com/user-attachments/assets/f0fe2658-d908-4927-9139-73f850805ceb" /> Builds on #13510. Currently clients have to build their own `id` via `plugin@marketplace` and filter plugins to show by `enabled`, but we will add `id` and `available` as fields returned from `plugin/list` soon. ####Tests Added tests, verified locally.	2026-03-06 11:08:36 -08:00
Charley Cunningham	cb1a182bbe	Clarify sandbox permission override helper semantics (#13703 ) ## Summary Today `SandboxPermissions::requires_additional_permissions()` does not actually mean "is `WithAdditionalPermissions`". It returns `true` for any non-default sandbox override, including `RequireEscalated`. That broad behavior is relied on in multiple `main` callsites. The naming is security-sensitive because `SandboxPermissions` is used on shell-like tool calls to tell the executor how a single command should relate to the turn sandbox: - `UseDefault`: run with the turn sandbox unchanged - `RequireEscalated`: request execution outside the sandbox - `WithAdditionalPermissions`: stay sandboxed but widen permissions for that command only ## Problem The old helper name reads as if it only applies to the `WithAdditionalPermissions` variant. In practice it means "this command requested any explicit sandbox override." That ambiguity made it easy to read production checks incorrectly and made the guardian change look like a standalone `main` fix when it is not. On `main` today: - `shell` and `unified_exec` intentionally reject any explicit `sandbox_permissions` request unless approval policy is `OnRequest` - `exec_policy` intentionally treats any explicit sandbox override as prompt-worthy in restricted sandboxes - tests intentionally serialize both `RequireEscalated` and `WithAdditionalPermissions` as explicit sandbox override requests So changing those callsites from the broad helper to a narrow `WithAdditionalPermissions` check would be a behavior change, not a pure cleanup. ## What This PR Does - documents `SandboxPermissions` as a per-command sandbox override, not a generic permissions bag - adds `requests_sandbox_override()` for the broad meaning: anything except `UseDefault` - adds `uses_additional_permissions()` for the narrow meaning: only `WithAdditionalPermissions` - keeps `requires_additional_permissions()` as a compatibility alias to the broad meaning for now - updates the current broad callsites to use the accurately named broad helper - adds unit coverage that locks in the semantics of all three helpers ## What This PR Does Not Do This PR does not change runtime behavior. That is intentional. --------- Co-authored-by: Codex <noreply@openai.com>	2026-03-06 09:57:48 -08:00
Matthew Zeng	98dca99db7	[elicitations] Switch to use MCP style elicitation payload for mcp tool approvals. (#13621 ) - [x] Switch to use MCP style elicitation payload for mcp tool approvals. - [ ] TODO: Update the UI to support the full spec.	2026-03-06 01:50:26 -08:00
Won Park	ee1a20258a	Enabling CWD Saving for Image-Gen (#13607 ) Codex now saves the generated image on to your current working directory.	2026-03-06 00:47:21 -08:00
viyatb-oai	6a79ed5920	refactor: remove proxy admin endpoint (#13687 ) ## Summary - delete the network proxy admin server and its runtime listener/task plumbing - remove the admin endpoint config, runtime, requirement, protocol, schema, and debug-surface fields - update proxy docs to reflect the remaining HTTP and SOCKS listeners only	2026-03-05 22:03:16 -08:00
Celia Chen	f9ce403b5a	fix: accept two macOS automation input shapes for approval payload compatibility (#13683 ) ## Summary This PR: 1. fixes a deserialization mismatch for macOS automation permissions in approval payloads by making core parsing accept both supported wire shapes for bundle IDs. 2. added `#[serde(default)]` to `MacOsSeatbeltProfileExtensions` so omitted fields deserialize to secure defaults. ## Why this change is needed `MacOsAutomationPermission` uses `#[serde(try_from = "MacOsAutomationPermissionDe")]`, so deserialization is controlled by `MacOsAutomationPermissionDe`. After we aligned v2 `additionalPermissions.macos.automations` to the core shape, approval payloads started including `{ "bundle_ids": [...] }` in some paths. `MacOsAutomationPermissionDe` previously accepted only `"none" \| "all"` or a plain array, so object-shaped bundle IDs failed with `data did not match any variant of untagged enum MacOsAutomationPermissionDe`. This change restores compatibility by accepting both forms while preserving existing normalization behavior (trim values and map empty bundle lists to `None`). ## Validation saw this error went away when running ``` cargo run -p codex-app-server-test-client -- \ --codex-bin ./target/debug/codex \ -c 'approval_policy="on-request"' \ -c 'features.shell_zsh_fork=true' \ -c 'zsh_path="/tmp/codex-zsh-fork/package/vendor/aarch64-apple-darwin/zsh/macos-15/zsh"' \ send-message-v2 --experimental-api \ 'Use $apple-notes and run scripts/notes_info now.' ``` : ``` Error: failed to deserialize ServerRequest from JSONRPCRequest Caused by: data did not match any variant of untagged enum MacOsAutomationPermissionDe ```	2026-03-06 06:02:33 +00:00
Celia Chen	fb9fcf060f	chore: remove unused legacy macOS permission types (#13677 ) ## Summary This PR removes legacy macOS permission model types from `codex-rs/protocol/src/models.rs`: - `MacOsPermissions` - `MacOsPreferencesValue` - `MacOsAutomationValue` The protocol now relies on the current `MacOsSeatbeltProfileExtensions` model for macOS permission data.	2026-03-06 05:32:40 +00:00
Celia Chen	aaefee04cd	core/protocol: add structured macOS additional permissions and merge them into sandbox execution (#13499 ) ## Summary - Introduce strongly-typed macOS additional permissions across protocol/core/app-server boundaries. - Merge additional permissions into effective sandbox execution, including macOS seatbelt profile extensions. - Expand docs, schema/tool definitions, UI rendering, and tests for `network`, `file_system`, and `macos` additional permissions.	2026-03-05 16:21:45 -08:00
Owen Lin	aa3fe8abf8	feat(core): persist trace_id for turns in RolloutItem::TurnContext (#13602 ) This PR adds a durable trace linkage for each turn by storing the active trace ID on the rollout TurnContext record stored in session rollout files. Before this change, we propagated trace context at runtime but didn’t persist a stable per-turn trace key in rollout history. That made after-the-fact debugging harder (for example, mapping a historical turn to the corresponding trace in datadog). This sets us up for much easier debugging in the future. ### What changed - Added an optional `trace_id` to TurnContextItem (rollout schema). - Added a small OTEL helper to read the current span trace ID. - Captured `trace_id` when creating `TurnContext` and included it in `to_turn_context_item()`. - Updated tests and fixtures that construct TurnContextItem so older/no-trace cases still work. ### Why this approach TurnContext is already the canonical durable per-turn metadata in rollout. This keeps ownership clean: trace linkage lives with other persisted turn metadata.	2026-03-05 13:26:48 -08:00
Owen Lin	926b2f19e8	feat(app-server): support mcp elicitations in v2 api (#13425 ) This adds a first-class server request for MCP server elicitations: `mcpServer/elicitation/request`. Until now, MCP elicitation requests only showed up as a raw `codex/event/elicitation_request` event from core. That made it hard for v2 clients to handle elicitations using the same request/response flow as other server-driven interactions (like shell and `apply_patch` tools). This also updates the underlying MCP elicitation request handling in core to pass through the full MCP request (including URL and form data) so we can expose it properly in app-server. ### Why not `item/mcpToolCall/elicitationRequest`? This is because MCP elicitations are related to MCP servers first, and only optionally to a specific MCP tool call. In the MCP protocol, elicitation is a server-to-client capability: the server sends `elicitation/create`, and the client replies with an elicitation result. RMCP models it that way as well. In practice an elicitation is often triggered by an MCP tool call, but not always. ### What changed - add `mcpServer/elicitation/request` to the v2 app-server API - translate core `codex/event/elicitation_request` events into the new v2 server request - map client responses back into `Op::ResolveElicitation` so the MCP server can continue - update app-server docs and generated protocol schema - add an end-to-end app-server test that covers the full round trip through a real RMCP elicitation flow - The new test exercises a realistic case where an MCP tool call triggers an elicitation, the app-server emits mcpServer/elicitation/request, the client accepts it, and the tool call resumes and completes successfully. ### app-server API flow - Client starts a thread with `thread/start`. - Client starts a turn with `turn/start`. - App-server sends `item/started` for the `mcpToolCall`. - While that tool call is in progress, app-server sends `mcpServer/elicitation/request`. - Client responds to that request with `{ action: "accept" \| "decline" \| "cancel" }`. - App-server sends `serverRequest/resolved`. - App-server sends `item/completed` for the mcpToolCall. - App-server sends `turn/completed`. - If the turn is interrupted while the elicitation is pending, app-server still sends `serverRequest/resolved` before the turn finishes.	2026-03-05 07:20:20 -08:00
sayan-oai	03d55f0e6f	chore: add web_search_tool_type for image support (#13538 ) add `web_search_tool_type` on model_info that can be populated from backend. will be used to filter which models can use `web_search` with images and which cant. added small unit test.	2026-03-05 07:02:27 +00:00
Won Park	229e6d0347	image-gen-event/client_processing (#13512 ) enabling client-side to process with image-generation capabilities (setting app-server)	2026-03-04 16:54:38 -08:00
Won Park	fa2306b303	image-gen-core (#13290 ) Core tool-calling for image-gen, handles requesting and receiving logic for images using response API	2026-03-03 23:11:28 -08:00
Val Kharitonov	4f6c4bb143	support 'flex' tier in app-server in addition to 'fast' (#13391 )	2026-03-03 22:46:05 -08:00
Celia Chen	d622bff384	chore: Nest skill and protocol network permissions under `network.enabled` (#13427 ) ## Summary Changes the permission profile shape from a bare network boolean to a nested object. Before: ```yaml permissions: network: true ``` After: ```yaml permissions: network: enabled: true ``` This also updates the shared Rust and app-server protocol types so `PermissionProfile.network` is no longer `Option<bool>`, but `Option<NetworkPermissions>` with `enabled: Option<bool>`. ## What Changed - Updated `PermissionProfile` in `codex-rs/protocol/src/models.rs`: - `pub network: Option<bool>` -> `pub network: Option<NetworkPermissions>` - Added `NetworkPermissions` with: - `pub enabled: Option<bool>` - Changed emptiness semantics so `network` is only considered empty when `enabled` is `None` - Updated skill metadata parsing to accept `permissions.network.enabled` - Updated core permission consumers to read `network.enabled.unwrap_or(false)` where a concrete boolean is needed - Updated app-server v2 protocol types and regenerated schema/TypeScript outputs - Updated docs to mention `additionalPermissions.network.enabled`	2026-03-03 20:57:29 -08:00
Celia Chen	e6773f856c	Feat: Preserve network access on read-only sandbox policies (#13409 ) ## Summary `PermissionProfile.network` could not be preserved when additional or compiled permissions resolved to `SandboxPolicy::ReadOnly`, because `ReadOnly` had no network_access field. This change makes read-only + network enabled representable directly and threads that through the protocol, app-server v2 mirror, and permission- merging logic. ## What changed - Added `network_access: bool` to `SandboxPolicy::ReadOnly` in the core protocol and app-server v2 protocol. - Kept backward compatibility by defaulting the new field to false, so legacy read-only payloads still deserialize unchanged. - Updated `has_full_network_access()` and sandbox summaries to respect read-only network access. - Preserved PermissionProfile.network when: - compiling skill permission profiles into sandbox policies - normalizing additional permissions - merging additional permissions into existing sandbox policies - Updated the approval overlay to show network in the rendered permission rule when requested. - Regenerated app-server schema fixtures for the new v2 wire shape.	2026-03-04 02:41:57 +00:00
Owen Lin	52521a5e40	feat(app-server): propagate app-server trace context into core (#13368 ) ### Summary Propagate trace context originating at app-server RPC method handlers -> codex core submission loop (so this includes spans such as `run_turn`!). This implements PR 2 of the app-server tracing rollout. This also removes the old lower-level env-based reparenting in core so explicit request/submission ancestry wins instead of being overridden by ambient `TRACEPARENT` state. ### What changed - Added `trace: Option<W3cTraceContext>` to codex_protocol::Submission - Taught `Codex::submit()` / `submit_with_id()` to automatically capture the current span context when constructing or forwarding a submission - Wrapped the core submission loop in a submission_dispatch span parented from Submission.trace - Warn on invalid submission trace carriers and ignore them cleanly - Removed the old env-based downstream reparenting path in core task execution - Stopped OTEL provider init from implicitly attaching env trace context process-wide - Updated mcp-server Submission call sites for the new field Added focused unit tests for: - capturing trace context into Submission - preferring `Submission.trace` when building the core dispatch span ### Why PR 1 gave us consistent inbound request spans in app-server, but that only covered the transport boundary. For long-running work like turns and reviews, the important missing piece was preserving ancestry after the request handler returns and core continues work on a different async path. This change makes that handoff explicit and keeps the parentage rules simple: - app-server request span sets the current context - `Submission.trace` snapshots that context - core restores it once, at the submission boundary - deeper core spans inherit naturally That also lets us stop relying on env-based reparenting for this path, which was too ambient and could override explicit ancestry.	2026-03-04 01:03:45 +00:00
Curtis 'Fjord' Hawthorne	b92146d48b	Add under-development original-resolution view_image support (#13050 ) ## Summary Add original-resolution support for `view_image` behind the under-development `view_image_original_resolution` feature flag. When the flag is enabled and the target model is `gpt-5.3-codex` or newer, `view_image` now preserves original PNG/JPEG/WebP bytes and sends `detail: "original"` to the Responses API instead of using the legacy resize/compress path. ## What changed - Added `view_image_original_resolution` as an under-development feature flag. - Added `ImageDetail` to the protocol models and support for serializing `detail: "original"` on tool-returned images. - Added `PromptImageMode::Original` to `codex-utils-image`. - Preserves original PNG/JPEG/WebP bytes. - Keeps legacy behavior for the resize path. - Updated `view_image` to: - use the shared `local_image_content_items_with_label_number(...)` helper in both code paths - select original-resolution mode only when: - the feature flag is enabled, and - the model slug parses as `gpt-5.3-codex` or newer - Kept local user image attachments on the existing resize path; this change is specific to `view_image`. - Updated history/image accounting so only `detail: "original"` images use the docs-based GPT-5 image cost calculation; legacy images still use the old fixed estimate. - Added JS REPL guidance, gated on the same feature flag, to prefer JPEG at 85% quality unless lossless is required, while still allowing other formats when explicitly requested. - Updated tests and helper code that construct `FunctionCallOutputContentItem::InputImage` to carry the new `detail` field. ## Behavior ### Feature off - `view_image` keeps the existing resize/re-encode behavior. - History estimation keeps the existing fixed-cost heuristic. ### Feature on + `gpt-5.3-codex+` - `view_image` sends original-resolution images with `detail: "original"`. - PNG/JPEG/WebP source bytes are preserved when possible. - History estimation uses the GPT-5 docs-based image-cost calculation for those `detail: "original"` images. #### [git stack](https://github.com/magus/git-stack-cli) - 👉 `1` https://github.com/openai/codex/pull/13050 - ⏳ `2` https://github.com/openai/codex/pull/13331 - ⏳ `3` https://github.com/openai/codex/pull/13049	2026-03-03 15:56:54 -08:00
Ahmed Ibrahim	041c896509	Revert "Revert "realtime prompt changes"" (#13398 ) Reverts openai/codex#13385	2026-03-03 14:41:26 -08:00
Ahmed Ibrahim	8afe2127dc	Revert "realtime prompt changes" (#13385 ) Reverts openai/codex#13376	2026-03-03 12:30:37 -08:00
Ahmed Ibrahim	f6288248f4	realtime prompt changes (#13376 ) # External (non-OpenAI) Pull Request Requirements Before opening this Pull Request, please read the dedicated "Contributing" markdown file or your PR may be closed: https://github.com/openai/codex/blob/main/docs/contributing.md If your PR conforms to our contribution guidelines, replace this text with a detailed and high quality description of your changes. Include a link to a bug report or enhancement request.	2026-03-03 10:13:14 -08:00
pash-openai	07e532dcb9	app-server service tier plumbing (plus some cleanup) (#13334 ) followup to https://github.com/openai/codex/pull/13212 to expose fast tier controls to app server (majority of this PR is generated schema jsons - actual code is +69 / -35 and +24 tests ) - add service tier fields to the app-server protocol surfaces used by thread lifecycle, turn start, config, and session configured events - thread service tier through the app-server message processor and core thread config snapshots - allow runtime config overrides to carry service tier for app-server callers cleanup: - Removing useless "legacy" code supporting "standard" - we moved to None \| "fast", so "standard" is not needed.	2026-03-03 02:35:09 -08:00
pash-openai	2f5b01abd6	add fast mode toggle (#13212 ) - add a local Fast mode setting in codex-core (similar to how model id is currently stored on disk locally) - send `service_tier=priority` on requests when Fast is enabled - add `/fast` in the TUI and persist it locally - feature flag	2026-03-02 20:29:33 -08:00
Ahmed Ibrahim	b20b6aa46f	Update realtime websocket API (#13265 ) - migrate the realtime websocket transport to the new session and handoff flow - make the realtime model configurable in config.toml and use API-key auth for the websocket --------- Co-authored-by: Codex <noreply@openai.com>	2026-03-02 16:05:40 -08:00
Owen Lin	d473e8d56d	feat(app-server): add tracing to all app-server APIs (#13285 ) ### Overview This PR adds the first piece of tracing for app-server JSON-RPC requests. There are two main changes: - JSON-RPC requests can now take an optional W3C trace context at the top level via a `trace` field (`traceparent` / `tracestate`). - app-server now creates a dedicated request span for every inbound JSON-RPC request in `MessageProcessor`, and uses the request-level trace context as the parent when present. For compatibility with existing flows, app-server still falls back to the TRACEPARENT env var when there is no request-level traceparent. This PR is intentionally scoped to the app-server boundary. In a followup, we'll actually propagate trace context through the async handoff into core execution spans like run_turn, which will make app-server traces much more useful. ### Spans A few details on the app-server span shape: - each inbound request gets its own server span - span/resource names are based on the JSON-RPC method (`initialize`, `thread/start`, `turn/start`, etc.) - spans record transport (stdio vs websocket), request id, connection id, and client name/version when available - `initialize` stores client metadata in session state so later requests on the same connection can reuse it	2026-03-02 16:01:41 -08:00
jif-oai	b649953845	feat: polluted memories (#13008 ) Add a feature flag to disable memory creation for "polluted"	2026-03-02 11:57:32 +00:00
Ahmed Ibrahim	0aeb55bf08	Record realtime close marker on replacement (#13058 ) ## Summary - record a realtime close developer message when a new realtime session replaces an active one - assert the replacement marker through the mocked responses request path --------- Co-authored-by: Codex <noreply@openai.com> Co-authored-by: Charles Cunningham <ccunningham@openai.com>	2026-03-01 13:54:12 -08:00
Charley Cunningham	695957a348	Unify rollout reconstruction with resume/fork TurnContext hydration (#12612 ) ## Summary This PR unifies rollout history reconstruction and resume/fork metadata hydration under a single `Session::reconstruct_history_from_rollout` implementation. The key change from main is that replay metadata now comes from the same reconstruction pass that rebuilds model-visible history, instead of doing a second bespoke rollout scan to recover `previous_model` / `reference_context_item`. ## What Changed ### Unified reconstruction output `reconstruct_history_from_rollout` now returns a single `RolloutReconstruction` bundle containing: - rebuilt `history` - `previous_model` - `reference_context_item` Resume and fork both consume that shared output directly. ### Reverse replay core The reconstruction logic moved into `codex-rs/core/src/codex/rollout_reconstruction.rs` and now scans rollout items newest-to-oldest. That reverse pass: - derives `previous_model` - derives whether `reference_context_item` is preserved or cleared - stops early once it has both resume metadata and a surviving `replacement_history` checkpoint History materialization is still bridged eagerly for now by replaying only the surviving suffix forward, which keeps the history result stable while moving the control flow toward the future lazy reverse loader design. ### Removed bespoke context lookup This deletes `last_rollout_regular_turn_context_lookup` and its separate compaction-aware scan. The previous model / baseline metadata is now computed from the same replay state that rebuilds history, so resume/fork cannot drift from the reconstructed transcript view. ### `TurnContextItem` persistence contract `TurnContextItem` is now treated as the replay source of truth for durable model-visible baselines. This PR keeps the following contract explicit: - persist `TurnContextItem` for the first real user turn so resume can recover `previous_model` - persist it for later turns that emit model-visible context updates - if mid-turn compaction reinjects full initial context into replacement history, persist a fresh `TurnContextItem` after `Compacted` so resume/fork can re-establish the baseline from the rewritten history - do not treat manual compaction or pre-sampling compaction as creating a new durable baseline on their own ## Behavior Preserved - rollback replay stays aligned with `drop_last_n_user_turns` - rollback skips only user turns - incomplete active user turns are dropped before older finalized turns when rollback applies - unmatched aborts do not consume the current active turn - missing abort IDs still conservatively clear stale compaction state - compaction clears `reference_context_item` until a later `TurnContextItem` re-establishes it - `previous_model` still comes from the newest surviving user turn that established one ## Tests Targeted validation run for the current branch shape: - `cd codex-rs && cargo test -p codex-core --lib codex::rollout_reconstruction_tests -- --nocapture` - `cd codex-rs && just fmt` The branch also extracts the rollout reconstruction tests into `codex-rs/core/src/codex/rollout_reconstruction_tests.rs` so this logic has a dedicated home instead of living inline in `codex.rs`.	2026-02-27 13:50:45 -08:00
daniel-oai	6046ca19ba	Clarify escalation guidance for sandbox-related network failures (#13051 ) This updates the on-request permissions instructions so likely sandbox-related network failures during dependency installation are treated as escalation candidates. Repro: - Run `codex -a on-request -s workspace-write` in a fresh temp dir. - Prompt: `Build a new rust app with one dependency, anyhow, and try installing the dependency`. - Before this change, DNS/registry failures like `Could not resolve host: index.crates.io` could be treated like ordinary transient failures and not escalate. Fix: - Clarify that likely sandbox-related network errors such as DNS/host resolution, registry/index access, and dependency download failures should trigger escalation. Validation: - Rebuild the CLI and rerun the same repro. The same instructions should now be more likely to trigger escalation instead of silently stopping. Related Slack canvas: - https://openai.enterprise.slack.com/docs/T0BQTNSUF/F0ACVNJAV09	2026-02-27 13:48:52 -08:00
Michael Bolin	d09a7535ed	fix: use AbsolutePathBuf for permission profile file roots (#12970 ) ## Why `PermissionProfile` should describe filesystem roots as absolute paths at the type level. Using `PathBuf` in `FileSystemPermissions` made the shared type too permissive and blurred together three different deserialization cases: - skill metadata in `agents/openai.yaml`, where relative paths should resolve against the skill directory - app-server API payloads, where callers should have to send absolute paths - local tool-call payloads for commands like `shell_command` and `exec_command`, where `additional_permissions.file_system` may legitimately be relative to the command `workdir` This change tightens the shared model without regressing the existing local command flow. ## What Changed - changed `protocol::models::FileSystemPermissions` and the app-server `AdditionalFileSystemPermissions` mirror to use `AbsolutePathBuf` - wrapped skill metadata deserialization in `AbsolutePathBufGuard`, so relative permission roots in `agents/openai.yaml` resolve against the containing skill directory - kept app-server/API deserialization strict, so relative `additionalPermissions.fileSystem.*` paths are rejected at the boundary - restored cwd/workdir-relative deserialization for local tool-call payloads by parsing `shell`, `shell_command`, and `exec_command` arguments under an `AbsolutePathBufGuard` rooted at the resolved command working directory - simplified runtime additional-permission normalization so it only canonicalizes and deduplicates absolute roots instead of trying to recover relative ones later - updated the app-server schema fixtures, `app-server/README.md`, and the affected transport/TUI tests to match the final behavior	2026-02-27 17:42:52 +00:00
Ahmed Ibrahim	4d180ae428	Add model availability NUX metadata (#12972 ) - replace show_nux with structured availability_nux model metadata - expose availability NUX data through the app-server model API - update shared fixtures and tests for the new field	2026-02-26 22:02:57 -08:00
Curtis 'Fjord' Hawthorne	7e980d7db6	Support multimodal custom tool outputs (#12948 ) ## Summary This changes `custom_tool_call_output` to use the same output payload shape as `function_call_output`, so freeform tools can return either plain text or structured content items. The main goal is to let `js_repl` return image content from nested `view_image` calls in its own `custom_tool_call_output`, instead of relying on a separate injected message. ## What changed - Changed `custom_tool_call_output.output` from `string` to `FunctionCallOutputPayload` - Updated freeform tool plumbing to preserve structured output bodies - Updated `js_repl` to aggregate nested tool content items and attach them to the outer `js_repl` result - Removed the old `js_repl` special case that injected `view_image` results as a separate pending user image message - Updated normalization/history/truncation paths to handle multimodal `custom_tool_call_output` - Regenerated app-server protocol schema artifacts ## Behavior Direct `view_image` calls still return a `function_call_output` with image content. When `view_image` is called inside `js_repl`, the outer `js_repl` `custom_tool_call_output` now carries: - an `input_text` item if the JS produced text output - one or more `input_image` items from nested tool results So the nested image result now stays inside the `js_repl` tool output instead of being injected as a separate message. ## Compatibility This is intended to be backward-compatible for resumed conversations. Older histories that stored `custom_tool_call_output.output` as a plain string still deserialize correctly, and older histories that used the previous injected-image-message flow also continue to resume. Added regression coverage for resuming a pre-change rollout containing: - string-valued `custom_tool_call_output` - legacy injected image message history #### [git stack](https://github.com/magus/git-stack-cli) - 👉 `1` https://github.com/openai/codex/pull/12948	2026-02-26 18:17:46 -08:00
Celia Chen	90cc4e79a2	feat: add local date/timezone to turn environment context (#12947 ) ## Summary This PR includes the session's local date and timezone in the model-visible environment context and persists that data in `TurnContextItem`. ## What changed - captures the current local date and IANA timezone when building a turn context, with a UTC fallback if the timezone lookup fails - includes current_date and timezone in the serialized <environment_context> payload - stores those fields on TurnContextItem so they survive rollout/history handling, subagent review threads, and resume flows - treats date/timezone changes as environment updates, so prompt caching and context refresh logic do not silently reuse stale time context - updates tests to validate the new environment fields without depending on a single hardcoded environment-context string ## test built a local build and saw it in the rollout file: ``` {"timestamp":"2026-02-26T21:39:50.737Z","type":"response_item","payload":{"type":"message","role":"user","content":[{"type":"input_text","text":"<environment_context>\n <shell>zsh</shell>\n <current_date>2026-02-26</current_date>\n <timezone>America/Los_Angeles</timezone>\n</environment_context>"}]}} ```	2026-02-26 23:17:35 +00:00
pakrym-oai	951a389654	Allow clients not to send summary as an option (#12950 ) Summary is a required parameter on UserTurn. Ideally we'd like the core to decide the appropriate summary level. Make the summary optional and don't send it when not needed.	2026-02-26 14:37:38 -08:00
Michael Bolin	7fa9d9ae35	feat: include sandbox config with escalation request (#12839 ) ## Why Before this change, an escalation approval could say that a command should be rerun, but it could not carry the sandbox configuration that should still apply when the escalated command is actually spawned. That left an unsafe gap in the `zsh-fork` skill path: skill scripts under `scripts/` that did not declare permissions could be escalated without a sandbox, and scripts that did declare permissions could lose their bounded sandbox on rerun or cached session approval. This PR extends the escalation protocol so approvals can optionally carry sandbox configuration all the way through execution. That lets the shell runtime preserve the intended sandbox instead of silently widening access. We likely want a single permissions type for this codepath eventually, probably centered on `Permissions`. For now, the protocol needs to represent both the existing `PermissionProfile` form and the fuller `Permissions` form, so this introduces a temporary disjoint union, `EscalationPermissions`, to carry either one. Further, this means that today, a skill either: - does not declare any permissions, in which case it is run using the default sandbox for the turn - specifies permissions, in which case the skill is run using that exact sandbox, which might be more restrictive than the default sandbox for the turn We will likely change the skill's permissions to be additive to the existing permissions for the turn. ## What Changed - Added `EscalationPermissions` to `codex-protocol` so escalation requests can carry either a `PermissionProfile` or a full `Permissions` payload. - Added an explicit `EscalationExecution` mode to the shell escalation protocol so reruns distinguish between `Unsandboxed`, `TurnDefault`, and `Permissions(...)` instead of overloading `None`. - Updated `zsh-fork` shell reruns to resolve `TurnDefault` at execution time, which keeps ordinary `UseDefault` commands on the turn sandbox and preserves turn-level macOS seatbelt profile extensions. - Updated the `zsh-fork` skill path so a skill with no declared permissions inherits the conversation's effective sandbox instead of escalating unsandboxed. - Updated the `zsh-fork` skill path so a skill with declared permissions reruns with exactly those permissions, including when a cached session approval is reused. ## Testing - Added unit coverage in `core/src/tools/runtimes/shell/unix_escalation.rs` for the explicit `UseDefault` / `RequireEscalated` / `WithAdditionalPermissions` execution mapping. - Added unit coverage in `core/src/tools/runtimes/shell/unix_escalation.rs` for macOS seatbelt extension preservation in both the `TurnDefault` and explicit-permissions rerun paths. - Added integration coverage in `core/tests/suite/skill_approval.rs` for permissionless skills inheriting the turn sandbox and explicit skill permissions remaining bounded across cached approval reuse.	2026-02-26 12:00:18 -08:00
pakrym-oai	ba41e84a50	Use model catalog default for reasoning summary fallback (#12873 ) ## Summary - make `Config.model_reasoning_summary` optional so unset means use model default - resolve the optional config value to a concrete summary when building `TurnContext` - add protocol support for `default_reasoning_summary` in model metadata ## Validation - `cargo test -p codex-core --lib client::tests -- --nocapture` --------- Co-authored-by: Codex <noreply@openai.com>	2026-02-26 09:31:13 -08:00
jif-oai	14a08d6c14	nit: captial (#12885 )	2026-02-26 09:36:13 +00:00
Eric Traut	28bfbb8f2b	Enforce user input length cap (#12823 ) Currently there is no bound on the length of a user message submitted in the TUI or through the app server interface. That means users can paste many megabytes of text, which can lead to bad performance, hangs, and crashes. In extreme cases, it can lead to a [kernel panic](https://github.com/openai/codex/issues/12323). This PR limits the length of a user input to 2**20 (about 1M) characters. This value was chosen because it fills the entire context window on the latest models, so accepting longer inputs wouldn't make sense anyway. Summary - add a shared `MAX_USER_INPUT_TEXT_CHARS` constant in codex-protocol and surface it in TUI and app server code - block oversized submissions in the TUI submit flow and emit error history cells when validation fails - reject heavy app-server requests with JSON-RPC `-32602` and structured `input_too_large` data, plus document the behavior Testing - ran the IDE extension with this change and verified that when I attempt to paste a user message that's several MB long, it correctly reports an error instead of crashing or making my computer hot.	2026-02-25 22:23:51 -08:00
Michael Bolin	14116ade8d	feat: include available decisions in command approval requests (#12758 ) Command-approval clients currently infer which choices to show from side-channel fields like `networkApprovalContext`, `proposedExecpolicyAmendment`, and `additionalPermissions`. That makes the request shape harder to evolve, and it forces each client to replicate the server's heuristics instead of receiving the exact decision list for the prompt. This PR introduces a mapping between `CommandExecutionApprovalDecision` and `codex_protocol::protocol::ReviewDecision`: ```rust impl From<CoreReviewDecision> for CommandExecutionApprovalDecision { fn from(value: CoreReviewDecision) -> Self { match value { CoreReviewDecision::Approved => Self::Accept, CoreReviewDecision::ApprovedExecpolicyAmendment { proposed_execpolicy_amendment, } => Self::AcceptWithExecpolicyAmendment { execpolicy_amendment: proposed_execpolicy_amendment.into(), }, CoreReviewDecision::ApprovedForSession => Self::AcceptForSession, CoreReviewDecision::NetworkPolicyAmendment { network_policy_amendment, } => Self::ApplyNetworkPolicyAmendment { network_policy_amendment: network_policy_amendment.into(), }, CoreReviewDecision::Abort => Self::Cancel, CoreReviewDecision::Denied => Self::Decline, } } } ``` And updates `CommandExecutionRequestApprovalParams` to have a new field: ```rust available_decisions: Option<Vec<CommandExecutionApprovalDecision>> ``` when, if specified, should make it easier for clients to display an appropriate list of options in the UI. This makes it possible for `CoreShellActionProvider::prompt()` in `unix_escalation.rs` to specify the `Vec<ReviewDecision>` directly, adding support for `ApprovedForSession` when approving a skill script, which was previously missing in the TUI. Note this results in a significant change to `exec_options()` in `approval_overlay.rs`, as the displayed options are now derived from `available_decisions: &[ReviewDecision]`. ## What Changed - Add `available_decisions` to [`ExecApprovalRequestEvent`](`de00e932dd/codex-rs/protocol/src/approvals.rs (L111-L175)`), including helpers to derive the legacy default choices when older senders omit the field. - Map `codex_protocol::protocol::ReviewDecision` to app-server `CommandExecutionApprovalDecision` and expose the ordered list as experimental `availableDecisions` in [`CommandExecutionRequestApprovalParams`](`de00e932dd/codex-rs/app-server-protocol/src/protocol/v2.rs (L3798-L3807)`). - Thread optional `available_decisions` through the core approval path so Unix shell escalation can explicitly request `ApprovedForSession` for session-scoped approvals instead of relying on client heuristics. [`unix_escalation.rs`](`de00e932dd/codex-rs/core/src/tools/runtimes/shell/unix_escalation.rs (L194-L214)`) - Update the TUI approval overlay to build its buttons from the ordered decision list, while preserving the legacy fallback when `available_decisions` is missing. - Update the app-server README, test client output, and generated schema artifacts to document and surface the new field. ## Testing - Add `approval_overlay.rs` coverage for explicit decision lists, including the generic `ApprovedForSession` path and network approval options. - Update `chatwidget/tests.rs` and app-server protocol tests to populate the new optional field and keep older event shapes working. ## Developers Docs - If we document `item/commandExecution/requestApproval` on [developers.openai.com/codex](https://developers.openai.com/codex), add experimental `availableDecisions` as the preferred source of approval choices and note that older servers may omit it.	2026-02-26 01:10:46 +00:00
Celia Chen	4f45668106	Revert "Add skill approval event/response (#12633 )" (#12811 ) This reverts commit https://github.com/openai/codex/pull/12633. We no longer need this PR, because we favor sending normal exec command approval server request with `additional_permissions` of skill permissions instead	2026-02-26 01:02:42 +00:00
Owen Lin	a0fd94bde6	feat(app-server): add ThreadItem::DynamicToolCall (#12732 ) Previously, clients would call `thread/start` with dynamic_tools set, and when a model invokes a dynamic tool, it would just make the server->client `item/tool/call` request and wait for the client's response to complete the tool call. This works, but it doesn't have an `item/started` or `item/completed` event. Now we are doing this: - [new] emit `item/started` with `DynamicToolCall` populated with the call arguments - send an `item/tool/call` server request - [new] once the client responds, emit `item/completed` with `DynamicToolCall` populated with the response. Also, with `persistExtendedHistory: true`, dynamic tool calls are now reconstructable in `thread/read` and `thread/resume` as `ThreadItem::DynamicToolCall`.	2026-02-25 12:00:10 -08:00
Celia Chen	16ca527c80	chore: migrate additional permissions to PermissionProfile (#12731 ) This PR replaces the old `additional_permissions.fs_read/fs_write` shape with a shared `PermissionProfile` model and wires it through the command approval, sandboxing, protocol, and TUI layers. The schema is adopted from the `SkillManifestPermissions`, which is also refactored to use this unified struct. This helps us easily expose permission profiles in app server/core as a follow-up.	2026-02-25 03:35:28 +00:00
Dylan Hurd	f6053fdfb3	feat(core) Introduce Feature::RequestPermissions (#11871 ) ## Summary Introduces the initial implementation of Feature::RequestPermissions. RequestPermissions allows the model to request that a command be run inside the sandbox, with additional permissions, like writing to a specific folder. Eventually this will include other rules as well, and the ability to persist these permissions, but this PR is already quite large - let's get the core flow working and go from there! <img width="1279" height="541" alt="Screenshot 2026-02-15 at 2 26 22 PM" src="https://github.com/user-attachments/assets/0ee3ec0f-02ec-4509-91a2-809ac80be368" /> ## Testing - [x] Added tests - [x] Tested locally - [x] Feature	2026-02-24 09:48:57 -08:00
sayan-oai	7e46e5b9c2	chore: rm hardcoded PRESETS list (#12650 ) rm `PRESETS` list harcoded in `model_presets` as we now have bundled `models.json` with equivalent info. update logic to rely on bundled models instead, update tests.	2026-02-23 22:35:51 -08:00
pakrym-oai	58763afa0f	Add skill approval event/response (#12633 ) Set the stage for skill-level permission approval in addition to command-level. Behind a feature flag.	2026-02-23 22:28:58 -08:00
viyatb-oai	c3048ff90a	feat(core): persist network approvals in execpolicy (#12357 ) ## Summary Persist network approval allow/deny decisions as `network_rule(...)` entries in execpolicy (not proxy config) It adds `network_rule` parsing + append support in `codex-execpolicy`, including `decision="prompt"` (parse-only; not compiled into proxy allow/deny lists) - compile execpolicy network rules into proxy allow/deny lists and update the live proxy state on approval - preserve requirements execpolicy `network_rule(...)` entries when merging with file-based execpolicy - reject broad wildcard hosts (for example `*`) for persisted `network_rule(...)`	2026-02-23 21:37:46 -08:00
jif-oai	6fbf19ef5f	chore: phase 2 name (#12568 )	2026-02-23 11:04:55 +00:00
Charley Cunningham	bb0ac5be70	Fix compaction context reinjection and model baselines (#12252 ) ## Summary - move regular-turn context diff/full-context persistence into `run_turn` so pre-turn compaction runs before incoming context updates are recorded - after successful pre-turn compaction, rely on a cleared `reference_context_item` to trigger full context reinjection on the follow-up regular turn (manual `/compact` keeps replacement history summary-only and also clears the baseline) - preserve `<model_switch>` when full context is reinjected, and inject it before the rest of the full-context items - scope `reference_context_item` and `previous_model` to regular user turns only so standalone tasks (`/compact`, shell, review, undo) cannot suppress future reinjection or `<model_switch>` behavior - make context-diff persistence + `reference_context_item` updates explicit in the regular-turn path, with clearer docs/comments around the invariant - stop persisting local `/compact` `RolloutItem::TurnContext` snapshots (only regular turns persist `TurnContextItem` now) - simplify resume/fork previous-model/reference-baseline hydration by looking up the last surviving turn context from rollout lifecycle events, including rollback and compaction-crossing handling - remove the legacy fallback that guessed from bare `TurnContext` rollouts without lifecycle events - update compaction/remote-compaction/model-visible snapshots and compact test assertions (including remote compaction mock response shape) ## Why We were persisting incoming context items before spawning the regular turn task, which let pre-turn compaction requests accidentally include incoming context diffs without the new user message. Fixing that exposed follow-on baseline issues around `/compact`, resume/fork, and standalone tasks that could cause duplicate context injection or suppress `<model_switch>` instructions. This PR re-centers the invariants around regular turns: - regular turns persist model-visible context diffs/full reinjection and update the `reference_context_item` - standalone tasks do not advance those regular-turn baselines - compaction clears the baseline when replacement history may have stripped the referenced context diffs ## Follow-ups (TODOs left in code) - `TODO(ccunningham)`: fix rollback/backtracking baseline handling more comprehensively - `TODO(ccunningham)`: include pending incoming context items in pre-turn compaction threshold estimation - `TODO(ccunningham)`: inject updated personality spec alongside `<model_switch>` so some model-switch paths can avoid forced full reinjection - `TODO(ccunningham)`: review task turn lifecycle (`TurnStarted`/`TurnComplete`) behavior and emit task-start context diffs for task types that should have them (excluding `/compact`) ## Validation - `just fmt` - CI should cover the updated compaction/resume/model-visible snapshot expectations and rollout-hydration behavior - I did not rerun the full local test suite after the latest resume-lookup / rollout-persistence simplifications	2026-02-20 23:13:08 -08:00
Ahmed Ibrahim	6817f0be8a	Wire realtime api to core (#12268 ) - Introduce `RealtimeConversationManager` for realtime API management - Add `op::conversation` to start conversation, insert audio, insert text, and close conversation. - emit conversation lifecycle and realtime events. - Move shared realtime payload types into codex-protocol and add core e2e websocket tests for start/replace/transport-close paths. Things to consider: - Should we use the same `op::` and `Events` channel to carry audio? I think we should try this simple approach and later we can create separate one if the channels got congested. - Sending text updates to the client: we can start simple and later restrict that. - Provider auth isn't wired for now intentionally	2026-02-20 19:06:35 -08:00
jif-oai	4d60c803ba	feat: cleaner TUI for sub-agents (#12327 ) <img width="760" height="496" alt="Screenshot 2026-02-20 at 14 31 25" src="https://github.com/user-attachments/assets/1983b825-bb47-417e-9925-6f727af56765" />	2026-02-20 15:26:33 +00:00

1 2 3 4 5 ...

354 commits