core-agent-ide

Author	SHA1	Message	Date
Michael Bolin	abbd74e2be	feat: make sandbox read access configurable with `ReadOnlyAccess` (#11387 ) `SandboxPolicy::ReadOnly` previously implied broad read access and could not express a narrower read surface. This change introduces an explicit read-access model so we can support user-configurable read restrictions in follow-up work, while preserving current behavior today. It also ensures unsupported backends fail closed for restricted-read policies instead of silently granting broader access than intended. ## What - Added `ReadOnlyAccess` in protocol with: - `Restricted { include_platform_defaults, readable_roots }` - `FullAccess` - Updated `SandboxPolicy` to carry read-access configuration: - `ReadOnly { access: ReadOnlyAccess }` - `WorkspaceWrite { ..., read_only_access: ReadOnlyAccess }` - Preserved existing behavior by defaulting current construction paths to `ReadOnlyAccess::FullAccess`. - Threaded the new fields through sandbox policy consumers and call sites across `core`, `tui`, `linux-sandbox`, `windows-sandbox`, and related tests. - Updated Seatbelt policy generation to honor restricted read roots by emitting scoped read rules when full read access is not granted. - Added fail-closed behavior on Linux and Windows backends when restricted read access is requested but not yet implemented there (`UnsupportedOperation`). - Regenerated app-server protocol schema and TypeScript artifacts, including `ReadOnlyAccess`. ## Compatibility / rollout - Runtime behavior remains unchanged by default (`FullAccess`). - API/schema changes are in place so future config wiring can enable restricted read access without another policy-shape migration.	2026-02-11 18:31:14 -08:00
Michael Bolin	572ab66496	test(app-server): stabilize app/list thread feature-flag test by using file-backed MCP OAuth creds (#11521 ) ## Why `suite::v2::app_list::list_apps_uses_thread_feature_flag_when_thread_id_is_provided` has been flaky in CI. The test exercises `thread/start`, which initializes `codex_apps`. In CI/Linux, that path can reach OS keyring-backed MCP OAuth credential lookup (`Codex MCP Credentials`) and intermittently abort the MCP process (observed stack overflow in `zbus`), causing the test to fail before the assertion logic runs. ## What Changed - Updated the test config in `codex-rs/app-server/tests/suite/v2/app_list.rs` to set `mcp_oauth_credentials_store = "file"` in both relevant config-writing paths: - The in-test config override inside `list_apps_uses_thread_feature_flag_when_thread_id_is_provided` - `write_connectors_config(...)`, which is used by the v2 `app_list` test suite - This keeps test coverage focused on thread-scoped app feature flags while removing OS keyring/DBus dependency from this test path. ## How It Was Verified - `cargo test -p codex-app-server` - `cargo test -p codex-app-server list_apps_uses_thread_feature_flag_when_thread_id_is_provided -- --nocapture`	2026-02-11 18:30:18 -08:00
Max Johnson	c0ecc2e1e1	app-server: thread resume subscriptions (#11474 ) This stack layer makes app-server thread event delivery connection-aware so resumed/attached threads only emit notifications and approval prompts to subscribed connections. - Added per-thread subscription tracking in `ThreadState` (`subscribed_connections`) and mapped subscription ids to `(thread_id, connection_id)`. - Updated listener lifecycle so removing a subscription or closing a connection only removes that connection from the thread’s subscriber set; listener shutdown now happens when the last subscriber is gone. - Added `connection_closed(connection_id)` plumbing (`lib.rs` -> `message_processor.rs` -> `codex_message_processor.rs`) so disconnect cleanup happens immediately. - Scoped bespoke event handling outputs through `TargetedOutgoing` to send requests/notifications only to subscribed connections. - Kept existing threadresume behavior while aligning with the latest split-loop transport structure.	2026-02-11 16:21:13 -08:00
Max Johnson	b5339a591d	refactor: codex app-server ThreadState (#11419 ) this is a no-op functionality wise. consolidates thread-specific message processor / event handling state in ThreadState	2026-02-11 12:20:54 -08:00
iceweasel-oai	87279de434	Promote Windows Sandbox (#11341 ) 1. Move Windows Sandbox NUX to right after trust directory screen 2. Don't offer read-only as an option in Sandbox NUX. Elevated/Legacy/Quit 3. Don't allow new untrusted directories. It's trust or quit 4. move experimental sandbox features to `[windows] sandbox="elevated\|unelevatd"` 5. Copy tweaks = elevated -> default, non-elevated -> non-admin	2026-02-11 11:48:33 -08:00
Max Johnson	7053aa5457	Reapply "Add app-server transport layer with websocket support" (#11370 ) Reapply "Add app-server transport layer with websocket support" with additional fixes from https://github.com/openai/codex/pull/11313/changes to avoid deadlocking. This reverts commit `47356ff83c`. ## Summary To avoid deadlocking when queues are full, we maintain separate tokio tasks dedicated to incoming vs outgoing event handling - split the app-server main loop into two tasks in `run_main_with_transport` - inbound handling (`transport_event_rx`) - outbound handling (`outgoing_rx` + `thread_created_rx`) - separate incoming and outgoing websocket tasks ## Validation Integration tests, testing thoroughly e2e in codex app w/ >10 concurrent requests <img width="1365" height="979" alt="Screenshot 2026-02-10 at 2 54 22 PM" src="https://github.com/user-attachments/assets/47ca2c13-f322-4e5c-bedd-25859cbdc45f" /> --------- Co-authored-by: jif-oai <jif@openai.com>	2026-02-11 18:13:39 +00:00
gt-oai	886d9377d3	Cache cloud requirements (#11305 ) We're loading these from the web on every startup. This puts them in a local file with a 1hr TTL. We sign the downloaded requirements with a key compiled into the Codex CLI to prevent unsophisticated tampering (determined circumvention is outside of our threat model: after all, one could just compile Codex without any of these checks). If any of the following are true, we ignore the local cache and re-fetch from Cloud: * The signature is invalid for the payload (== requirements, sign time, ttl, user identity) * The identity does not match the auth'd user's identity * The TTL has expired * We cannot parse requirements.toml from the payload	2026-02-11 14:06:41 +00:00
Michael Bolin	8b7f8af343	feat: split codex-common into smaller utils crates (#11422 ) We are removing feature-gated shared crates from the `codex-rs` workspace. `codex-common` grouped several unrelated utilities behind `[features]`, which made dependency boundaries harder to reason about and worked against the ongoing effort to eliminate feature flags from workspace crates. Splitting these utilities into dedicated crates under `utils/` aligns this area with existing workspace structure and keeps each dependency explicit at the crate boundary. ## What changed - Removed `codex-rs/common` (`codex-common`) from workspace members and workspace dependencies. - Added six new utility crates under `codex-rs/utils/`: - `codex-utils-cli` - `codex-utils-elapsed` - `codex-utils-sandbox-summary` - `codex-utils-approval-presets` - `codex-utils-oss` - `codex-utils-fuzzy-match` - Migrated the corresponding modules out of `codex-common` into these crates (with tests), and added matching `BUILD.bazel` targets. - Updated direct consumers to use the new crates instead of `codex-common`: - `codex-rs/cli` - `codex-rs/tui` - `codex-rs/exec` - `codex-rs/app-server` - `codex-rs/mcp-server` - `codex-rs/chatgpt` - `codex-rs/cloud-tasks` - Updated workspace lockfile entries to reflect the new dependency graph and removal of `codex-common`.	2026-02-11 12:59:24 +00:00
Michael Bolin	476c1a7160	Remove `test-support` feature from `codex-core` and replace it with explicit test toggles (#11405 ) ## Why `codex-core` was being built in multiple feature-resolved permutations because test-only behavior was modeled as crate features. For a large crate, those permutations increase compile cost and reduce cache reuse. ## Net Change - Removed the `test-support` crate feature and related feature wiring so `codex-core` no longer needs separate feature shapes for test consumers. - Standardized cross-crate test-only access behind `codex_core::test_support`. - External test code now imports helpers from `codex_core::test_support`. - Underlying implementation hooks are kept internal (`pub(crate)`) instead of broadly public. ## Outcome - Fewer `codex-core` build permutations. - Better incremental cache reuse across test targets. - No intended production behavior change.	2026-02-10 22:44:02 -08:00
xl-openai	fdd0cd1de9	feat: support multiple rate limits (#11260 ) Added multi-limit support end-to-end by carrying limit_name in rate-limit snapshots and handling multiple buckets instead of only codex. Extended /usage client parsing to consume additional_rate_limits Updated TUI /status and in-memory state to store/render per-limit snapshots Extended app-server rate-limit read response: kept rate_limits and added rate_limits_by_name. Adjusted usage-limit error messaging for non-default codex limit buckets	2026-02-10 20:09:31 -08:00
Celia Chen	641d5268fa	chore: persist turn_id in rollout session and make turn_id uuid based (#11246 ) Problem: 1. turn id is constructed in-memory; 2. on resuming threads, turn_id might not be unique; 3. client cannot no the boundary of a turn from rollout files easily. This PR does three things: 1. persist `task_started` and `task_complete` events; 1. persist `turn_id` in rollout turn events; 5. generate turn_id as unique uuids instead of incrementing it in memory. This helps us resolve the issue of clients wanting to have unique turn ids for resuming a thread, and knowing the boundry of each turn in rollout files. example debug logs ``` 2026-02-11T00:32:10.746876Z DEBUG codex_app_server_protocol::protocol::thread_history: built turn from rollout items turn_index=8 turn=Turn { id: "019c4a07-d809-74c3-bc4b-fd9618487b4b", items: [UserMessage { id: "item-24", content: [Text { text: "hi", text_elements: [] }] }, AgentMessage { id: "item-25", text: "Hi. I’m in the workspace with your current changes loaded and ready. Send the next task and I’ll execute it end-to-end." }], status: Completed, error: None } 2026-02-11T00:32:10.746888Z DEBUG codex_app_server_protocol::protocol::thread_history: built turn from rollout items turn_index=9 turn=Turn { id: "019c4a18-1004-76c0-a0fb-a77610f6a9b8", items: [UserMessage { id: "item-26", content: [Text { text: "hello", text_elements: [] }] }, AgentMessage { id: "item-27", text: "Hello. Ready for the next change in `codex-rs`; I can continue from the current in-progress diff or start a new task." }], status: Completed, error: None } 2026-02-11T00:32:10.746899Z DEBUG codex_app_server_protocol::protocol::thread_history: built turn from rollout items turn_index=10 turn=Turn { id: "019c4a19-41f0-7db0-ad78-74f1503baeb8", items: [UserMessage { id: "item-28", content: [Text { text: "hello", text_elements: [] }] }, AgentMessage { id: "item-29", text: "Hello. Send the specific change you want in `codex-rs`, and I’ll implement it and run the required checks." }], status: Completed, error: None } ``` backward compatibility: if you try to resume an old session without task_started and task_complete event populated, the following happens: - If you resume and do nothing: those reconstructed historical IDs can differ next time you resume. - If you resume and send a new turn: the new turn gets a fresh UUID from live submission flow and is persisted, so that new turn’s ID is stable on later resumes. I think this behavior is fine, because we only care about deterministic turn id once a turn is triggered.	2026-02-11 03:56:01 +00:00
pakrym-oai	c68999ee6d	Prefer websocket transport when model opts in (#11386 ) Summary - add a `prefer_websockets` field to `ModelInfo`, defaulting to `false` in all fixtures and constructors - wire the new flag into websocket selection so models that opt in always use websocket transport even when the feature gate is off Testing - Not run (not requested)	2026-02-10 18:50:48 -08:00
pakrym-oai	bfd4e2112c	Disable very flaky tests (#11394 ) Collected from last 20 builds of main in https://github.com/openai/codex/commits/main/.	2026-02-10 18:50:11 -08:00
jif-oai	847a6092e6	fix: reduce usage of `open_if_present` (#11344 )	2026-02-10 19:25:07 +00:00
jif-oai	a364dd8b56	feat: opt-out of events in the app-server (#11319 ) Add `optOutNotificationMethods` in the app-server to opt-out events based on exact method matching	2026-02-10 18:04:52 +00:00
Shijie Rao	c4b771a16f	Fix: update parallel tool call exec approval to approve on request id (#11162 ) ### Summary In parallel tool call, exec command approvals were not approved at request level but at a turn level. i.e. when a single request is approved, the system currently treats all requests in turn as approved. ### Before https://github.com/user-attachments/assets/d50ed129-b3d2-4b2f-97fa-8601eb11f6a8 ### After https://github.com/user-attachments/assets/36528a43-a4aa-4775-9e12-f13287ef19fc	2026-02-10 09:38:00 -08:00
Max Johnson	47356ff83c	Revert "Add app-server transport layer with websocket support (#10693 )" (#11323 ) Suspected cause of deadlocking bug	2026-02-10 17:37:49 +00:00
Michael Bolin	44ebf4588f	feat: retain NetworkProxy, when appropriate (#11207 ) As of this PR, `SessionServices` retains a `Option<StartedNetworkProxy>`, if appropriate. Now the `network` field on `Config` is `Option<NetworkProxySpec>` instead of `Option<NetworkProxy>`. Over in `Session::new()`, we invoke `NetworkProxySpec::start_proxy()` to create the `StartedNetworkProxy`, which is a new struct that retains the `NetworkProxy` as well as the `NetworkProxyHandle`. (Note that `Drop` is implemented for `NetworkProxyHandle` to ensure the proxies are shutdown when it is dropped.) The `NetworkProxy` from the `StartedNetworkProxy` is threaded through to the appropriate places. --- [//]: # (BEGIN SAPLING FOOTER) Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/openai/codex/pull/11207). * #11285 * __->__ #11207	2026-02-10 02:09:23 -08:00
Matthew Zeng	005e040f97	[apps] Add thread_id param to optionally load thread config for apps feature check. (#11279 ) - [x] Add thread_id param to optionally load thread config for apps feature check	2026-02-09 23:10:26 -08:00
Owen Lin	53741013ab	fix(app-server): for external auth, replace id_token with chatgpt_acc… (#11240 ) …ount_id and chatgpt_plan_type ### Summary Following up on external auth mode which was introduced here: https://github.com/openai/codex/pull/10012 Turns out some clients have a differently shaped ID token and don't have a chosen workspace (aka chatgpt_account_id) encoded in their ID token. So, let's replace `id_token` param with `chatgpt_account_id` and `chatgpt_plan_type` (optional) when initializing the external ChatGPT auth mode (`account/login/start` with `chatgptAuthTokens`). The client was able to test end-to-end with a Codex build from this branch and verified it worked!	2026-02-09 20:48:58 -08:00
Dylan Hurd	168c359b71	Adjust shell command timeouts for Windows (#11247 ) Summary - add platform-aware defaults for shell command timeouts so Windows tests get longer waits - keep medium timeout longer on Windows to ensure flakiness is reduced Testing - Not run (not requested)	2026-02-09 20:03:32 -08:00
Josh McKinney	de59e550c0	test: deflake nextest child-process leak in MCP harnesses (#11263 ) ## Summary - add deterministic child-process cleanup to both test `McpProcess` helpers - keep Tokio `kill_on_drop(true)` but also reap via bounded `try_wait()` polling in `Drop` - document the failure mode and why this avoids nondeterministic `LEAK` flakes ## Why `cargo nextest` leak detection can intermittently report `LEAK` when a spawned server outlives test teardown, making CI flaky. ## Testing - `just fmt` - `cargo test -p codex-app-server` - `cargo test -p codex-mcp-server` ## Failing CI Reference - Original failing job: https://github.com/openai/codex/actions/runs/21845226299/job/63039443593?pr=11245	2026-02-10 03:43:24 +00:00
xl-openai	a33ee46e3b	feat: extend skills/list to support additional roots. (#10835 ) Add an optional perCwdExtraUserRoots	2026-02-09 13:30:38 -08:00
gt-oai	9fe925b15a	Load requirements on windows (#10770 ) We support requirements on Unix, loading from `/etc/codex/requirements.toml`. On MacOS, we also support MDM. Now, on Windows, we'll load requirements from `%ProgramData%\OpenAI\Codex\requirements.toml`	2026-02-09 16:05:38 +00:00
jif-oai	6cf61725d0	feat: do not close unified exec processes across turns (#10799 ) With this PR we do not close the unified exec processes (i.e. background terminals) at the end of a turn unless: * The user interrupt the turn * The user decide to clean the processes through `app-server` or `/clean` I made sure that `codex exec` correctly kill all the processes	2026-02-09 10:27:46 +00:00
Michael Bolin	383b45279e	feat: include NetworkConfig through ExecParams (#11105 ) This PR adds the following field to `Config`: ```rust pub network: Option<NetworkProxy>, ``` Though for the moment, it will always be initialized as `None` (this will be addressed in a subsequent PR). This PR does the work to thread `network` through to `execute_exec_env()`, `process_exec_tool_call()`, and `UnifiedExecRuntime.run()` to ensure it is available whenever we span a process.	2026-02-09 03:32:17 +00:00
Matthew Zeng	45b7763c3f	[apps] Improve app loading. (#10994 ) There are two concepts of apps that we load in the harness: - Directory apps, which is all the apps that the user can install. - Accessible apps, which is what the user actually installed and can be $ inserted and be used by the model. These are extracted from the tools that are loaded through the gateway MCP. Previously we wait for both sets of apps before returning the full apps list. Which causes many issues because accessible apps won't be available to the UI or the model if directory apps aren't loaded or failed to load. In this PR we are separating them so that accessible apps can be loaded separately and are instantly available to be shown in the UI and to be provided in model context. We also added an app-server event so that clients can subscribe to also get accessible apps without being blocked on the full app list. - [x] Separate accessible apps and directory apps loading. - [x] `app/list` request will also emit `app/list/updated` notifications that app-server clients can subscribe. Which allows clients to get accessible apps list to render in the $ menu without being blocked by directory apps. - [x] Cache both accessible and directory apps with 1 hour TTL to avoid reloading them when creating new threads. - [x] TUI improvements to redraw $ menu and /apps menu when app list is updated.	2026-02-08 15:24:56 -08:00
Matthew Zeng	9f1009540b	Upgrade rmcp to 0.14 (#10718 ) - [x] Upgrade rmcp to 0.14	2026-02-08 15:07:53 -08:00
Eric Traut	b3de6c7f2b	Defer persistence of rollout file (#11028 ) - Defer rollout persistence for fresh threads (`InitialHistory::New`): keep rollout events in memory and only materialize rollout file + state DB row on first `EventMsg::UserMessage`. - Keep precomputed rollout path available before materialization. - Change `thread/start` to build thread response from live config snapshot and optional precomputed path. - Improve pre-materialization behavior in app-server/TUI: clearer invalid-request errors for file-backed ops and a friendlier `/fork` “not ready yet” UX. - Update tests to match deferred semantics across start/read/archive/unarchive/fork/resume/review flows. - Improved resilience of user_shell test, which should be unrelated to this change but must be affected by timing changes For Reviewers: * The primary change is in recorder.rs * Most of the other changes were to fix up broken assumptions in existing tests Testing: * Manually tested CLI * Exercised app server paths by manually running IDE Extension with rebuilt CLI binary * Only user-visible change is that `/fork` in TUI generates visible error if used prior to first turn	2026-02-07 23:05:03 -08:00
Charley Cunningham	e6662d6387	app-server: treat null mode developer instructions as built-in defaults (#10983 ) ## Summary - make `turn/start` normalize `collaborationMode.settings.developer_instructions: null` to the built-in instructions for the selected mode - prevent app-server clients from accidentally clearing mode-switch developer instructions by sending `null` - document this behavior in the v2 protocol and app-server docs ## What changed - `codex-rs/app-server/src/codex_message_processor.rs` - added a small `normalize_turn_start_collaboration_mode` helper - in `turn_start`, apply normalization before `OverrideTurnContext` - `codex-rs/app-server/tests/suite/v2/turn_start.rs` - extended `turn_start_accepts_collaboration_mode_override_v2` to assert the outgoing request includes default-mode instruction text when the client sends `developer_instructions: null` - `codex-rs/app-server-protocol/src/protocol/v2.rs` - clarified `TurnStartParams.collaboration_mode` docs: `settings.developer_instructions: null` means use built-in mode instructions - regenerated schema fixture: - `codex-rs/app-server-protocol/schema/typescript/v2/TurnStartParams.ts` - docs: - `codex-rs/app-server/README.md` - `codex-rs/docs/codex_mcp_interface.md`	2026-02-07 12:59:41 -08:00
viyatb-oai	739908a12c	feat(core): add network constraints schema to requirements.toml (#10958 ) ## Summary Add `requirements.toml` schema support for admin-defined network constraints in the requirements layer example config: ``` [experimental_network] enabled = true allowed_domains = ["api.openai.com"] denied_domains = ["example.com"] ```	2026-02-07 19:48:24 +00:00
jif-oai	62605fa471	Add resume_agent collab tool (#10903 ) Summary - add the new resume_agent collab tool path through core, protocol, and the app server API, including the resume events - update the schema/TypeScript definitions plus docs so resume_agent appears in generated artifacts and README - note that resumed agents rehydrate rollout history without overwriting their base instructions Testing - Not run (not requested)	2026-02-07 17:31:45 +01:00
Michael Bolin	a118494323	feat: add support for allowed_web_search_modes in requirements.toml (#10964 ) This PR makes it possible to disable live web search via an enterprise config even if the user is running in `--yolo` mode (though cached web search will still be available). To do this, create `/etc/codex/requirements.toml` as follows: ```toml # "live" is not allowed; "disabled" is allowed even though not listed explicitly. allowed_web_search_modes = ["cached"] ``` Or set `requirements_toml_base64` MDM as explained on https://developers.openai.com/codex/security/#locations. ### Why - Enforce admin/MDM/`requirements.toml` constraints on web-search behavior, independent of user config and per-turn sandbox defaults. - Ensure per-turn config resolution and review-mode overrides never crash when constraints are present. ### What - Add `allowed_web_search_modes` to requirements parsing and surface it in app-server v2 `ConfigRequirements` (`allowedWebSearchModes`), with fixtures updated. - Define a requirements allowlist type (`WebSearchModeRequirement`) and normalize semantics: - `disabled` is always implicitly allowed (even if not listed). - An empty list is treated as `["disabled"]`. - Make `Config.web_search_mode` a `Constrained<WebSearchMode>` and apply requirements via `ConstrainedWithSource<WebSearchMode>`. - Update per-turn resolution (`resolve_web_search_mode_for_turn`) to: - Prefer `Live → Cached → Disabled` when `SandboxPolicy::DangerFullAccess` is active (subject to requirements), unless the user preference is explicitly `Disabled`. - Otherwise, honor the user’s preferred mode, falling back to an allowed mode when necessary. - Update TUI `/debug-config` and app-server mapping to display normalized `allowed_web_search_modes` (including implicit `disabled`). - Fix web-search integration tests to assert cached behavior under `SandboxPolicy::ReadOnly` (since `DangerFullAccess` legitimately prefers `live` when allowed).	2026-02-07 05:55:15 +00:00
Eric Traut	4d52428fa2	Fixed a flaky test (#10970 ) ## Summary Stabilize v2 review integration tests by making them hermetic with respect to model discovery. `app-server` review tests were intermittently timing out in CI (especially on Windows runners) because their test config allowed remote model refresh. During `thread/start`, the test process could issue live `/v1/models` requests, introducing external network latency and nondeterministic timing before review flow assertions. This change disables remote model fetching in the review test config helper used by these tests.	2026-02-06 21:26:26 -08:00
Javi	87ce50f118	app-server: print help message to console when starting websockets server (#10943 ) Follow-up to https://github.com/openai/codex/pull/10693 <img width="596" height="77" alt="image" src="https://github.com/user-attachments/assets/9140df70-01d1-4c5a-85ee-ca15a09a0e77" />	2026-02-07 00:18:42 +00:00
Ahmed Ibrahim	ba8b5d9018	Treat compaction failure as failure state (#10927 ) - Return compaction errors from local and remote compaction flows.\n- Stop turns/tasks when auto-compaction fails instead of continuing execution.	2026-02-06 13:51:46 -08:00
canvrno-oai	36c16e0c58	Add app configs to config.toml (#10822 ) Adds app configs to config.toml + tests	2026-02-06 10:29:08 -08:00
Eric Traut	4521a6e852	Removed "exec_policy" feature flag (#10851 ) This is no longer needed because it's on by default	2026-02-06 08:59:47 -08:00
jif-oai	aab61934af	Handle required MCP startup failures across components (#10902 ) Summary - add a `required` flag for MCP servers everywhere config/CLI data is touched so mandatory helpers can be round-tripped - have `codex exec` and `codex app-server` thread start/resume fail fast when required MCPs fail to initialize	2026-02-06 17:14:37 +01:00
Eric Traut	dd80e332c4	Removed the "remote_compaction" feature flag (#10840 ) This feature is always on now	2026-02-05 23:54:57 -08:00
Owen Lin	0d8b2b74c4	feat(app-server): turn/steer API (#10821 ) This PR adds a dedicated `turn/steer` API for appending user input to an in-flight turn. ## Motivation Currently, steering in the app is implemented by just calling `turn/start` while a turn is running. This has some really weird quirks: - Client gets back a new `turn.id`, even though streamed events/approvals remained tied to the original active turn ID. - All the various turn-level override params on `turn/start` do not apply to the "steer", and would only apply to the next real turn. - There can also be a race condition where the client thinks the turn is active but the server has already completed it, so there might be bugs if the client has baked in some client-specific behavior thinking it's a steer when in fact the server kicked off a new turn. This is particularly possible when running a client against a remote app-server. Having a dedicated `turn/steer` API eliminates all those quirks. `turn/steer` behavior: - Requires an active turn on threadId. Returns a JSON-RPC error if there is no active turn. - If expectedTurnId is provided, it must match the active turn (more useful when connecting to a remote app-server). - Does not emit `turn/started`. - Does not accept turn overrides (`cwd`, `model`, `sandbox`, etc.) or `outputSchema` to accurately reflect that these are not applied when steering.	2026-02-06 00:35:04 +00:00
Matthew Zeng	729b016515	Add stage field for experimental flags. (#10793 ) - [x] Add stage field for experimental flags.	2026-02-05 23:31:04 +00:00
xl-openai	43a7290f11	Sync app-server requirements API with refreshed cloud loader (#10815 ) configRequirements/read now returns updated cloud requirements after login.	2026-02-05 14:43:31 -08:00
Max Johnson	8473096efb	Add app-server transport layer with websocket support (#10693 ) - Adds --listen <URL> to codex app-server with two listen modes: - stdio:// (default, existing behavior) - ws://IP:PORT (new websocket transport) - Refactors message routing to be connection-aware: - Tracks per-connection session state (initialize/experimental capability) - Routes responses/errors to the originating connection - Broadcasts server notifications/requests to initialized connections - Updates initialization semantics to be per connection (not process-global), and updates app-server docs accordingly. - Adds websocket accept/read/write handling (JSON-RPC per text frame, ping/pong handling, connection lifecycle events). Testing - Unit tests for transport URL parsing and targeted response/error routing. - New websocket integration test validating: - per-connection initialization requirements - no cross-connection response leakage - same request IDs on different connections route independently.	2026-02-05 20:56:34 +00:00
Matthew Zeng	7e81f63698	[app-server] Add a method to list experimental features. (#10721 ) - [x] Add a method to list experimental features.	2026-02-05 20:04:01 +00:00
Owen Lin	3582b74d01	fix(auth): isolate chatgptAuthTokens concept to auth manager and app-server (#10423 ) So that the rest of the codebase (like TUI) don't need to be concerned whether ChatGPT auth was handled by Codex itself or passed in via app-server's external auth mode.	2026-02-05 10:46:06 -08:00
jif-oai	9ee746afd6	Leverage state DB metadata for thread summaries (#10621 ) Summary: - read conversation summaries and cwd info from the state DB when possible so we no longer rely on rollout files for metadata and avoid extra I/O - persist CLI version in thread metadata, surface it through summary builders, and add the necessary DB migration hooks - simplify thread listing by using enriched state DB data directly rather than reading rollout heads Testing: - Not run (not requested)	2026-02-05 16:39:11 +00:00
jif-oai	c67120f4a0	fix: flaky landlock (#10689 ) https://openai.slack.com/archives/C095U48JNL9/p1770243347893959	2026-02-05 10:30:18 +00:00
Dylan Hurd	fe8b474acd	fix(core,app-server) resume with different model (#10719 ) ## Summary When resuming with a different model, we should also append a developer message with the model instructions ## Testing - [x] Added unit tests	2026-02-05 00:40:05 -08:00
xl-openai	1e1146cd29	Reload cloud requirements after user login (#10725 ) Reload cloud requirements after user login so it could take effect immediately.	2026-02-05 00:27:16 -08:00

1 2 3 4 5 ...

358 commits