core-agent-ide

Author	SHA1	Message	Date
sayan-oai	7b2cee53db	chore: wire through plugin policies + category from marketplace.json (#14305 ) wire plugin marketplace metadata through app-server endpoints: - `plugin/list` has `installPolicy` and `authPolicy` - `plugin/install` has plugin-level `authPolicy` `plugin/install` also now enforces `NOT_AVAILABLE` `installPolicy` when installing. added tests.	2026-03-11 12:33:10 -07:00
pakrym-oai	548583198a	Allow bool web_search in ToolsToml (#14352 ) Summary - add a custom deserializer so `[tools].web_search` can be a bool (treated as disabled) or a config object - extend core and app-server tests to cover bool handling in TOML config Testing - Not run (not requested)	2026-03-11 12:33:10 -07:00
Celia Chen	c1a424691f	chore: add a separate reject-policy flag for skill approvals (#14271 ) ## Summary - add `skill_approval` to `RejectConfig` and the app-server v2 `AskForApproval::Reject` payload so skill-script prompts can be configured independently from sandbox and rule-based prompts - update Unix shell escalation to reject prompts based on the actual decision source, keeping prefix rules tied to `rules`, unmatched command fallbacks tied to `sandbox_approval`, and skill scripts tied to `skill_approval` - regenerate the affected protocol/config schemas and expand unit/integration coverage for the new flag and skill approval behavior	2026-03-11 12:33:09 -07:00
joeytrasatti-openai	8ac27b2a16	Add ephemeral flag support to thread fork (#14248 ) ### Summary This PR adds first-class ephemeral support to thread/fork, bringing it in line with thread/start. The goal is to support one-off completions on full forked threads without persisting them as normal user-visible threads. ### Testing	2026-03-11 12:33:08 -07:00
Dylan Hurd	d5694529ca	app-server: propagate nested experimental gating for AskForApproval::Reject (#14191 ) ## Summary This change makes `AskForApproval::Reject` gate correctly anywhere it appears inside otherwise-stable app-server protocol types. Previously, experimental gating for `approval_policy: Reject` was handled with request-specific logic in `ClientRequest` detection. That covered a few request params types, but it did not generalize to other nested uses such as `ProfileV2`, `Config`, `ConfigReadResponse`, or `ConfigRequirements`. This PR replaces that ad hoc handling with a generic nested experimental propagation mechanism. ## Testing seeing this when run app-server-test-client without experimental api enabled: ``` initialize response: InitializeResponse { user_agent: "codex-toy-app-server/0.0.0 (Mac OS 26.3.1; arm64) vscode/2.4.36 (codex-toy-app-server; 0.0.0)" } > { > "id": "50244f6a-270a-425d-ace0-e9e98205bde7", > "method": "thread/start", > "params": { > "approvalPolicy": { > "reject": { > "mcp_elicitations": false, > "request_permissions": true, > "rules": false, > "sandbox_approval": true > } > }, > "baseInstructions": null, > "config": null, > "cwd": null, > "developerInstructions": null, > "dynamicTools": null, > "ephemeral": null, > "experimentalRawEvents": false, > "mockExperimentalField": null, > "model": null, > "modelProvider": null, > "persistExtendedHistory": false, > "personality": null, > "sandbox": null, > "serviceName": null > } > } < { < "error": { < "code": -32600, < "message": "askForApproval.reject requires experimentalApi capability" < }, < "id": "50244f6a-270a-425d-ace0-e9e98205bde7" < } [verified] thread/start rejected approvalPolicy=Reject without experimentalApi ``` --------- Co-authored-by: celia-oai <celia@openai.com>	2026-03-11 12:33:08 -07:00
xl-openai	d751e68f44	feat: Allow sync with remote plugin status. (#14176 ) Add forceRemoteSync to plugin/list. When it is set to True, we will sync the local plugin status with the remote one (backend-api/plugins/list).	2026-03-11 12:33:08 -07:00
guinness-oai	4ac6042850	Mark incomplete resumed turns interrupted when idle (#14125 ) Fixes a Codex app bug where quitting the app mid-run could leave the reopened thread stuck in progress and non-interactable. On cold thread resume, app-server could return an idle thread with a replayed turn still marked in progress. This marks incomplete replayed turns as interrupted unless the thread is actually active.	2026-03-11 12:33:07 -07:00
Ahmed Ibrahim	f3f47cf455	Stabilize app-server notify initialize test (#13939 ) ## What changed - This PR changes only the flaky test setup for `turn_start_notify_payload_includes_initialize_client_name`. - Instead of shelling out to `python3` to write the notify payload, the test uses the first-party `codex-app-server-test-notify-capture` helper. - The helper writes `notify.json` atomically and the test waits for the file to exist before reading it. ## Why this fixes the flake - The old test depended on an external Python interpreter being present and behaving consistently on every CI runner. - It also raced the file write: the test could observe the path before the payload had been fully written, which produced partial reads and intermittent assertion failures. - Moving the write into a repo-owned helper removes the external dependency, and atomic write-plus-wait makes the handoff deterministic. ## Scope - Test-only change.	2026-03-09 23:41:58 -07:00
Ahmed Ibrahim	b39ae9501f	Stabilize websocket test server binding (#14002 ) ## Summary - stop reserving a localhost port in the websocket tests before spawning the server - let the app-server bind `127.0.0.1:0` itself and read back the actual bound websocket address from stderr - update the websocket test helpers and callers to use the discovered address ## Why this fixes the flake The previous harness reserved a port in the test process, dropped it, and then asked the server process to bind that same address. On busy runners there is a race between releasing the reservation and the child process rebinding it, which can produce sporadic startup failures. Binding to port `0` inside the server removes that race entirely, and waiting for the server to report the real bound address makes the tests connect only after the listener is actually ready.	2026-03-09 23:39:56 -07:00
Channing Conger	c6343e0649	Implemented thread-level atomic elicitation counter for stopwatch pausing (#12296 ) ### Purpose While trying to build out CLI-Tools for the agent to use under skills we have found that those tools sometimes need to invoke a user elicitation. These elicitations are handled out of band of the codex app-server but need to indicate to the exec manager that the command running is not going to progress on the usual timeout horizon. ### Example Model calls universal exec: `$ download-credit-card-history --start-date 2026-01-19 --end-date 2026-02-19 > credit_history.jsonl` download-cred-card-history might hit a hosted/preauthenticated service to fetch data. That service might decide that the request requires an end user approval the access to the personal data. It should be able to signal to the running thread that the command in question is blocked on user elicitation. In that case we want the exec to continue, but the timeout to not expire on the tool call, essentially freezing time until the user approves or rejects the command at which point the tool would signal the app-server to decrement the outstanding elicitation count. Now timeouts would proceed as normal. ### What's Added - New v2 RPC methods: - thread/increment_elicitation - thread/decrement_elicitation - Protocol updates in: - codex-rs/app-server-protocol/src/protocol/common.rs - codex-rs/app-server-protocol/src/protocol/v2.rs - App-server handlers wired in: - codex-rs/app-server/src/codex_message_processor.rs ### Behavior - Counter starts at 0 per thread. - increment atomically increases the counter. - decrement atomically decreases the counter; decrement at 0 returns invalid request. - Transition rules: - 0 -> 1: broadcast pause state, pausing all active stopwatches immediately. - \>0 -> >0: remain paused. - 1 -> 0: broadcast unpause state, resuming stopwatches. - Core thread/session logic: - codex-rs/core/src/codex_thread.rs - codex-rs/core/src/codex.rs - codex-rs/core/src/mcp_connection_manager.rs ### Exec-server stopwatch integration - Added centralized stopwatch tracking/controller: - codex-rs/exec-server/src/posix/stopwatch_controller.rs - Hooked pause/unpause broadcast handling + stopwatch registration: - codex-rs/exec-server/src/posix/mcp.rs - codex-rs/exec-server/src/posix/stopwatch.rs - codex-rs/exec-server/src/posix.rs	2026-03-09 22:29:26 -07:00
Matthew Zeng	566e4cee4b	[apps] Fix apps enablement condition. (#14011 ) - [x] Fix apps enablement condition to check both the feature flag and that the user is not an API key user.	2026-03-09 22:25:43 -07:00
pakrym-oai	d71e042694	Enforce single tool output type in codex handlers (#14157 ) We'll need to associate output schema with each tool. Each tool can only have on output type.	2026-03-09 21:49:44 -07:00
Max Johnson	66e71cce11	codex-rs/app-server: add health endpoints for --listen websocket server (#13782 ) Healthcheck endpoints for the websocket server - serve `GET /readyz` and `GET /healthz` from the same listener used for `--listen ws://...` - switch the websocket listener over to `axum` upgrade handling instead of manual socket parsing - add websocket transport coverage for the health endpoints and document the new behavior Testing - integration tests - built and tested e2e ``` > curl -i http://127.0.0.1:9234/readyz HTTP/1.1 200 OK content-length: 0 date: Fri, 06 Mar 2026 19:20:23 GMT > curl -i http://127.0.0.1:9234/healthz HTTP/1.1 200 OK content-length: 0 date: Fri, 06 Mar 2026 19:20:24 GMT ```	2026-03-09 22:11:30 +00:00
Dylan Hurd	d241dc598c	feat(core) Persist request_permission data across turns (#14009 ) ## Summary request_permissions flows should support persisting results for the session. Open Question: Still deciding if we need within-turn approvals - this adds complexity but I could see it being useful ## Testing - [x] Updated unit tests --------- Co-authored-by: Codex <noreply@openai.com>	2026-03-09 14:36:38 -07:00
sayan-oai	6ad448b658	chore: plugin/uninstall endpoint (#14111 ) add `plugin/uninstall` app-server endpoint to fully rm plugin from plugins cache dir and rm entry from user config file. plugin-enablement is session-scoped, so uninstalls are only picked up in new sessions (like installs). added tests.	2026-03-09 12:40:25 -07:00
Ahmed Ibrahim	10bf6008f4	Stabilize thread resume replay tests (#13885 ) ## What changed - The thread-resume replay tests now use unchecked mock sequencing so the replay flow can complete before the test asserts. - They also poll outbound `/responses` request counts and fail immediately if replay emits an unexpected extra request. ## Why this fixes the flake - The previous version asserted while the replay machinery was still mid-flight, so the test was sometimes checking an intermediate state instead of the completed behavior. - Strict mock sequencing made that problem worse by forcing the test to care about exact sub-step timing rather than about the end result. - Letting replay settle and then asserting on stabilized request counts makes the test validate the real contract: the replay path finishes and does not send extra model requests. ## Scope - Test-only change.	2026-03-09 10:41:23 -07:00
Ahmed Ibrahim	6b68d1ef66	Stabilize plan item app-server tests (#14058 ) ## What changed - run the two plan-mode app-server tests on a multi-thread Tokio runtime instead of the default single-thread test runtime - stop relying on wiremock teardown expectations for `/responses` and explicitly wait for the expected request count after the turn completes ## Why this fixes the flake - this failure was showing up on Windows ARM as a late wiremock panic saying the mock server saw zero `/responses` calls, but the real issue was that the test could stall around app-server startup and only fail during teardown - moving these tests to the same multi-thread runtime used by the other collaboration-mode app-server tests removes that startup scheduling race - asserting the `/responses` count directly makes the test deterministic: we now wait for the real POST instead of depending on a drop-time verification that can hide the underlying timing issue ## Scope - test-only change; no production logic changes	2026-03-09 10:24:18 -07:00
Ahmed Ibrahim	615ed0e437	Stabilize zsh fork app-server tests (#13872 ) ## What changed - `turn_start_shell_zsh_fork_executes_command_v2` now keeps the shell command alive with a file marker until the interrupt arrives instead of using a command that can finish too quickly. - `turn_start_shell_zsh_fork_subcommand_decline_marks_parent_declined_v2` now waits for `turn/completed` before sending a fallback interrupt and accepts the real terminal outcomes observed across platforms. ## Why this fixes the flake - The original tests assumed a narrow ordering window: the child command would still be running when the interrupt happened, and completion would always arrive in one specific order. - In CI, especially across different shells and runner speeds, those assumptions break. Sometimes the child finishes before the interrupt; sometimes the protocol completes while the fallback path is still arming itself. - Holding the command open until the interrupt and waiting for the explicit protocol completion event makes the tests synchronize on the behavior under test instead of on wall-clock timing. ## Scope - Test-only change.	2026-03-09 09:38:16 -07:00
Ahmed Ibrahim	3f1280ce1c	Reduce app-server test timeout pressure (#13884 ) ## What changed - The auth/account/fuzzy-file-search test configs disable unrelated `shell_snapshot` setup. - The fuzzy-file-search fixture set was reduced so the stop-updates test does less incidental work before reaching the assertion. ## Why this fixes the flake - These failures were caused by cumulative timeout pressure, not by a missing product-level delay. - The old tests were paying for shell snapshot initialization and extra fixture volume that were not part of the behavior being validated. - Removing that incidental work keeps the same coverage but shortens the critical path enough that the tests finish comfortably inside the existing timeout budget, which is the right fix versus simply extending the timeout. ## Scope - Test-only change.	2026-03-09 09:37:41 -07:00
Ahmed Ibrahim	2bc3e52a91	Stabilize app list update ordering test (#14052 ) ## Summary - make `list_apps_waits_for_accessible_data_before_emitting_directory_updates` accept the two valid notification paths the server can emit - keep rejecting the real bug this test is meant to catch: a directory-only `app/list/updated` notification before accessible app data is available ## Why this fixes the flake The old test used a fixed `150ms` silence window and assumed the first notification after that window had to be the fully merged final update. In CI, scheduling occasionally lets accessible app data arrive before directory data, so the first valid notification can be an accessible-only interim update. That made the test fail even though the server behavior was correct. This change makes the test deterministic by reading notifications until the final merged payload arrives. Any interim update is only accepted if it contains accessible apps only; if the server ever emits inaccessible directory data before accessible data is ready, the test still fails immediately. ## Change type - test-only; no production app-list logic changes	2026-03-09 00:16:13 -07:00
Jack Mousseau	e6b93841c5	Add request permissions tool (#13092 ) Adds a built-in `request_permissions` tool and wires it through the Codex core, protocol, and app-server layers so a running turn can ask the client for additional permissions instead of relying on a static session policy. The new flow emits a `RequestPermissions` event from core, tracks the pending request by call ID, forwards it through app-server v2 as an `item/permissions/requestApproval` request, and resumes the tool call once the client returns an approved subset of the requested permission profile.	2026-03-08 20:23:06 -07:00
Matthew Zeng	a684a36091	[app-server] Support hot-reload user config when batch writing config. (#13839 ) - [x] Support hot-reload user config when batch writing config.	2026-03-08 17:38:01 -07:00
Charley Cunningham	7ba1fccfc1	fix(ci): restore guardian coverage and bazel unit tests (#13912 ) ## Summary - restore the guardian review request snapshot test and its tracked snapshot after it was dropped from `main` - make Bazel Rust unit-test wrappers resolve runfiles correctly on manifest-only platforms like macOS and point Insta at the real workspace root - harden the shell-escalation socket-closure assertion so the musl Bazel test no longer depends on fd reuse behavior ## Verification - cargo test -p codex-core guardian_review_request_layout_matches_model_visible_request_snapshot - cargo test -p codex-shell-escalation - bazel test //codex-rs/exec:exec-unit-tests //codex-rs/shell-escalation:shell-escalation-unit-tests Supersedes #13894. --------- Co-authored-by: Ahmed Ibrahim <aibrahim@openai.com> Co-authored-by: viyatb-oai <viyatb@openai.com> Co-authored-by: Codex <noreply@openai.com>	2026-03-08 12:05:19 -07:00
Ahmed Ibrahim	dc19e78962	Stabilize abort task follow-up handling (#13874 ) - production logic plus tests; cancel running tasks before clearing pending turn state - suppress follow-up model requests after cancellation and assert on stabilized request counts instead of fixed sleeps	2026-03-07 22:56:00 -08:00
jif-oai	cf143bf71e	feat: simplify DB further (#13771 )	2026-03-07 03:48:36 -08:00
iceweasel-oai	4b4f61d379	app-server: require absolute cwd for windowsSandbox/setupStart (#13833 ) ## Summary - require windowsSandbox/setupStart.cwd to be an AbsolutePathBuf - reject relative cwd values at request parsing instead of normalizing them later in the setup flow - add RPC-layer coverage for relative cwd rejection and update the checked-in protocol schemas/docs ## Why windowsSandbox/setupStart was carrying the client-provided cwd as a raw PathBuf for command_cwd while config derivation normalized the same value into an absolute policy_cwd. That left room for relative-path ambiguity in the setup path, especially for inputs like cwd: "repo". Making the RPC accept only absolute paths removes that split entirely: the handler now receives one already-validated absolute path and uses it for both config derivation and setup. This keeps the trust model unchanged. Trusted clients could already choose the session cwd; this change is only about making the setup RPC reject relative paths so command_cwd and policy_cwd cannot diverge. ## Testing - cargo test -p codex-app-server windows_sandbox_setup (run locally by user) - cargo test -p codex-app-server-protocol windows_sandbox (run locally by user)	2026-03-06 22:47:08 -08:00
Ruslan Nigmatullin	e9bd8b20a1	app-server: Add streaming and tty/pty capabilities to `command/exec` (#13640 ) * Add an ability to stream stdin, stdout, and stderr * Streaming of stdout and stderr has a configurable cap for total amount of transmitted bytes (with an ability to disable it) * Add support for overriding environment variables * Add an ability to terminate running applications (using `command/exec/terminate`) * Add TTY/PTY support, with an ability to resize the terminal (using `command/exec/resize`)	2026-03-06 17:30:17 -08:00
Rohan Mehta	61098c7f51	Allow full web search tool config (#13675 ) Previously, we could only configure whether web search was on/off. This PR enables sending along a web search config, which includes all the stuff responsesapi supports: filters, location, etc.	2026-03-07 00:50:50 +00:00
xl-openai	0243734300	feat: Add curated plugin marketplace + Metadata Cleanup. (#13712 ) 1. Add a synced curated plugin marketplace and include it in marketplace discovery. 2. Expose optional plugin.json interface metadata in plugin/list 3. Tighten plugin and marketplace path handling using validated absolute paths. 4. Let manifests override skill, MCP, and app config paths. 5. Restrict plugin enablement/config loading to the user config layer so plugin enablement is at global level	2026-03-06 19:39:35 -05:00
Ruslan Nigmatullin	51fcdc760d	app-server: Emit `thread/name/updated` event globally (#13674 )	2026-03-06 10:25:18 -08:00
Owen Lin	6c98a59dbd	fix(app-server): fix turn_start_shell_zsh_fork_executes_command_v2 flake (#13770 ) This fixes a flaky `turn_start_shell_zsh_fork_executes_command_v2` test. The interrupt path can race with the follow-up `/responses` request that reports the aborted tool call, so the test now allows that extra no-op response instead of assuming there will only ever be one request. The assertions still stay focused on the behavior the test actually cares about: starting the zsh-forked command correctly. Testing: - `just fmt` - `cargo test -p codex-app-server --test all suite::v2::turn_start_zsh_fork::turn_start_shell_zsh_fork_executes_command_v2 -- --exact --nocapture`	2026-03-06 10:10:16 -08:00
Matthew Zeng	98dca99db7	[elicitations] Switch to use MCP style elicitation payload for mcp tool approvals. (#13621 ) - [x] Switch to use MCP style elicitation payload for mcp tool approvals. - [ ] TODO: Update the UI to support the full spec.	2026-03-06 01:50:26 -08:00
sayan-oai	014a59fb0b	check app auth in plugin/install (#13685 ) #### What on `plugin/install`, check if installed apps are already authed on chatgpt, and return list of all apps that are not. clients can use this list to trigger auth workflows as needed. checks are best effort based on `codex_apps` loading, much like `app/list`. #### Tests Added integration tests, tested locally.	2026-03-06 06:45:00 +00:00
xl-openai	520ed724d2	support plugin/list. (#13540 ) Introduce a plugin/list which reads from local marketplace.json. Also update the signature for plugin/install.	2026-03-05 21:58:50 -05:00
Ahmed Ibrahim	6cf0ed4e79	Refine realtime startup context formatting (#13560 ) ## Summary - group recent work by git repo when available, otherwise by directory - render recent work as bounded user asks with per-thread cwd context - exclude hidden files and directories from workspace trees	2026-03-05 16:31:20 -08:00
sayan-oai	4e77ea0ec7	add @plugin mentions (#13510 ) ## Note-- added plugin mentions via @, but that conflicts with file mentions depends and builds upon #13433. - introduces explicit `@plugin` mentions. this injects the plugin's mcp servers, app names, and skill name format into turn context as a dev message. - we do not yet have UI for these mentions, so we currently parse raw text (as opposed to skills and apps which have UI chips, autocomplete, etc.) this depends on a `plugins/list` app-server endpoint we can feed the UI with, which is upcoming - also annotate mcp and app tool descriptions with the plugin(s) they come from. this gives the model a first class way of understanding what tools come from which plugins, which will help implicit invocation. ### Tests Added and updated tests, unit and integration. Also confirmed locally a raw `@plugin` injects the dev message, and the model knows about its apps, mcps, and skills.	2026-03-06 00:03:39 +00:00
Max Johnson	1980b6ce00	treat SIGTERM like ctrl-c for graceful shutdown (#13594 ) treat SIGTERM the same as SIGINT for graceful app-server websocket shutdown	2026-03-05 18:16:58 +00:00
Owen Lin	926b2f19e8	feat(app-server): support mcp elicitations in v2 api (#13425 ) This adds a first-class server request for MCP server elicitations: `mcpServer/elicitation/request`. Until now, MCP elicitation requests only showed up as a raw `codex/event/elicitation_request` event from core. That made it hard for v2 clients to handle elicitations using the same request/response flow as other server-driven interactions (like shell and `apply_patch` tools). This also updates the underlying MCP elicitation request handling in core to pass through the full MCP request (including URL and form data) so we can expose it properly in app-server. ### Why not `item/mcpToolCall/elicitationRequest`? This is because MCP elicitations are related to MCP servers first, and only optionally to a specific MCP tool call. In the MCP protocol, elicitation is a server-to-client capability: the server sends `elicitation/create`, and the client replies with an elicitation result. RMCP models it that way as well. In practice an elicitation is often triggered by an MCP tool call, but not always. ### What changed - add `mcpServer/elicitation/request` to the v2 app-server API - translate core `codex/event/elicitation_request` events into the new v2 server request - map client responses back into `Op::ResolveElicitation` so the MCP server can continue - update app-server docs and generated protocol schema - add an end-to-end app-server test that covers the full round trip through a real RMCP elicitation flow - The new test exercises a realistic case where an MCP tool call triggers an elicitation, the app-server emits mcpServer/elicitation/request, the client accepts it, and the tool call resumes and completes successfully. ### app-server API flow - Client starts a thread with `thread/start`. - Client starts a turn with `turn/start`. - App-server sends `item/started` for the `mcpToolCall`. - While that tool call is in progress, app-server sends `mcpServer/elicitation/request`. - Client responds to that request with `{ action: "accept" \| "decline" \| "cancel" }`. - App-server sends `serverRequest/resolved`. - App-server sends `item/completed` for the mcpToolCall. - App-server sends `turn/completed`. - If the turn is interrupted while the elicitation is pending, app-server still sends `serverRequest/resolved` before the turn finishes.	2026-03-05 07:20:20 -08:00
sayan-oai	03d55f0e6f	chore: add web_search_tool_type for image support (#13538 ) add `web_search_tool_type` on model_info that can be populated from backend. will be used to filter which models can use `web_search` with images and which cant. added small unit test.	2026-03-05 07:02:27 +00:00
joeytrasatti-openai	22f4113ac1	Preserve persisted thread git info in resume (#13504 ) ## Summary - ensure `thread.resume` reuses the stored `gitInfo` instead of rebuilding it from the live working tree - persist and apply thread git metadata through the resume flow and add a regression test covering branch mismatch cases ## Testing - Not run (not requested)	2026-03-04 17:16:43 -08:00
iceweasel-oai	54a1c81d73	allow apps to specify cwd for sandbox setup. (#13484 ) The electron app doesn't start up the app-server in a particular workspace directory. So sandbox setup happens in the app-installed directory instead of the project workspace. This allows the app do specify the workspace cwd so that the sandbox setup actually sets up the ACLs instead of exiting fast and then having the first shell command be slow.	2026-03-04 10:54:30 -08:00
Val Kharitonov	4f6c4bb143	support 'flex' tier in app-server in addition to 'fast' (#13391 )	2026-03-03 22:46:05 -08:00
Owen Lin	0fbd84081b	feat(app-server): add a skills/changed v2 notification (#13414 ) This adds a first-class app-server v2 `skills/changed` notification for the existing skills live-reload signal. Before this change, clients only had the legacy raw `codex/event/skills_update_available` event. With this PR, v2 clients can listen for a typed JSON-RPC notification instead of depending on the legacy `codex/event/*` stream, which we want to remove soon.	2026-03-03 17:01:00 -08:00
Curtis 'Fjord' Hawthorne	b92146d48b	Add under-development original-resolution view_image support (#13050 ) ## Summary Add original-resolution support for `view_image` behind the under-development `view_image_original_resolution` feature flag. When the flag is enabled and the target model is `gpt-5.3-codex` or newer, `view_image` now preserves original PNG/JPEG/WebP bytes and sends `detail: "original"` to the Responses API instead of using the legacy resize/compress path. ## What changed - Added `view_image_original_resolution` as an under-development feature flag. - Added `ImageDetail` to the protocol models and support for serializing `detail: "original"` on tool-returned images. - Added `PromptImageMode::Original` to `codex-utils-image`. - Preserves original PNG/JPEG/WebP bytes. - Keeps legacy behavior for the resize path. - Updated `view_image` to: - use the shared `local_image_content_items_with_label_number(...)` helper in both code paths - select original-resolution mode only when: - the feature flag is enabled, and - the model slug parses as `gpt-5.3-codex` or newer - Kept local user image attachments on the existing resize path; this change is specific to `view_image`. - Updated history/image accounting so only `detail: "original"` images use the docs-based GPT-5 image cost calculation; legacy images still use the old fixed estimate. - Added JS REPL guidance, gated on the same feature flag, to prefer JPEG at 85% quality unless lossless is required, while still allowing other formats when explicitly requested. - Updated tests and helper code that construct `FunctionCallOutputContentItem::InputImage` to carry the new `detail` field. ## Behavior ### Feature off - `view_image` keeps the existing resize/re-encode behavior. - History estimation keeps the existing fixed-cost heuristic. ### Feature on + `gpt-5.3-codex+` - `view_image` sends original-resolution images with `detail: "original"`. - PNG/JPEG/WebP source bytes are preserved when possible. - History estimation uses the GPT-5 docs-based image-cost calculation for those `detail: "original"` images. #### [git stack](https://github.com/magus/git-stack-cli) - 👉 `1` https://github.com/openai/codex/pull/13050 - ⏳ `2` https://github.com/openai/codex/pull/13331 - ⏳ `3` https://github.com/openai/codex/pull/13049	2026-03-03 15:56:54 -08:00
joeytrasatti-openai	935754baa3	Add thread metadata update endpoint to app server (#13280 ) ## Summary - add the v2 `thread/metadata/update` API, including protocol/schema/TypeScript exports and app-server docs - patch stored thread `gitInfo` in sqlite without resuming the thread, with validation plus support for explicit `null` clears - repair missing sqlite thread rows from rollout data before patching, and make those repairs safe by inserting only when absent and updating only git columns so newer metadata is not clobbered - keep sqlite authoritative for mutable thread git metadata by preserving existing sqlite git fields during reconcile/backfill and only using rollout `SessionMeta` git fields to fill gaps - add regression coverage for the endpoint, repair paths, concurrent sqlite writes, clearing git fields, and rollout/backfill reconciliation - fix the login server shutdown race so cancelling before the waiter starts still terminates `block_until_done()` correctly ## Testing - `cargo test -p codex-state apply_rollout_items_preserves_existing_git_branch_and_fills_missing_git_fields` - `cargo test -p codex-state update_thread_git_info_preserves_newer_non_git_metadata` - `cargo test -p codex-core backfill_sessions_preserves_existing_git_branch_and_fills_missing_git_fields` - `cargo test -p codex-app-server thread_metadata_update` - `cargo test` - currently fails in existing `codex-core` grep-files tests with `unsupported call: grep_files`: - `suite::grep_files::grep_files_tool_collects_matches` - `suite::grep_files::grep_files_tool_reports_empty_results`	2026-03-03 15:56:11 -08:00
Owen Lin	167158f93c	chore(app-server): delete v1 RPC methods and notifications (#13375 ) ## Summary This removes the old app-server v1 methods and notifications we no longer need, while keeping the small set the main codex app client still depends on for now. The remaining legacy surface is: - `initialize` - `getConversationSummary` - `getAuthStatus` - `gitDiffToRemote` - `fuzzyFileSearch` - `fuzzyFileSearch/sessionStart` - `fuzzyFileSearch/sessionUpdate` - `fuzzyFileSearch/sessionStop` And the raw `codex/event/*` notifications emitted from core. These notifications will be removed in a followup PR. ## What changed - removed deprecated v1 request variants from the protocol and app-server dispatcher - removed deprecated typed notifications: `authStatusChange`, `loginChatGptComplete`, and `sessionConfigured` - updated the app-server test client to use v2 flows instead of deleted v1 flows - deleted legacy-only app-server test suites and added focused coverage for `getConversationSummary` - regenerated app-server schema fixtures and updated the MCP interface docs to match the remaining compatibility surface ## Testing - `just write-app-server-schema` - `cargo test -p codex-app-server-protocol` - `cargo test -p codex-app-server`	2026-03-03 13:18:25 -08:00
pash-openai	07e532dcb9	app-server service tier plumbing (plus some cleanup) (#13334 ) followup to https://github.com/openai/codex/pull/13212 to expose fast tier controls to app server (majority of this PR is generated schema jsons - actual code is +69 / -35 and +24 tests ) - add service tier fields to the app-server protocol surfaces used by thread lifecycle, turn start, config, and session configured events - thread service tier through the app-server message processor and core thread config snapshots - allow runtime config overrides to carry service tier for app-server callers cleanup: - Removing useless "legacy" code supporting "standard" - we moved to None \| "fast", so "standard" is not needed.	2026-03-03 02:35:09 -08:00
Ruslan Nigmatullin	9022cdc563	app-server: Silence thread status changes caused by thread being created (#13079 ) Currently we emit `thread/status/changed` with `Idle` status right before sending `thread/started` event (which also has `Idle` status in it). It feels that there is no point in that as client has no way to know prior state of the thread as it didn't exist yet, so silence these kinds of notifications.	2026-03-03 00:52:28 +00:00
Owen Lin	146b798129	fix(app-server): emit turn/started only when turn actually starts (#13261 ) This is a follow-up for https://github.com/openai/codex/pull/13047 ## Why We had a race where `turn/started` could be observed before the thread had actually transitioned to `Active`. This was because we eagerly emitted `turn/started` in the request handler for `turn/start` (and `review/start`). That was showing up as flaky `thread/resume` tests, but the real issue was broader: a client could see `turn/started` and still get back an idle thread immediately afterward. The first idea was to eagerly call `thread_watch_manager.note_turn_started(...)` from the `turn/start` request path. That turns out to be unsafe, because `submit(Op::UserInput)` only queues work. If a turn starts and completes quickly, request-path bookkeeping can race with the real lifecycle events and leave stale running state behind. The real fix is to move `turn/started` to emit only after the turn _actually_ starts, so we do that by waiting for the `EventMsg::TurnStarted` notification emitted by codex core. We do this for both `turn/start` and `review/start`. I also verified this change is safe for our first-party codex apps - they don't have any assumptions that `turn/started` is emitted before the RPC response to `turn/start` (which is correct anyway). I also removed `single_client_mode` since it isn't really necessary now. ## Testing - `cargo test -p codex-app-server thread_resume -- --nocapture` - `cargo test -p codex-app-server 'suite::v2::turn_start::turn_start_emits_notifications_and_accepts_model_override' -- --exact --nocapture` - `cargo test -p codex-app-server`	2026-03-02 16:43:31 -08:00
Ahmed Ibrahim	b20b6aa46f	Update realtime websocket API (#13265 ) - migrate the realtime websocket transport to the new session and handoff flow - make the realtime model configurable in config.toml and use API-key auth for the websocket --------- Co-authored-by: Codex <noreply@openai.com>	2026-03-02 16:05:40 -08:00

1 2 3 4 5 ...

327 commits