core-agent-ide

Author	SHA1	Message	Date
viyatb-oai	ae4de43ccc	feat(linux-sandbox): add bwrap support (#9938 ) ## Summary This PR introduces a gated Bubblewrap (bwrap) Linux sandbox path. The curent Linux sandbox path relies on in-process restrictions (including Landlock). Bubblewrap gives us a more uniform filesystem isolation model, especially explicit writable roots with the option to make some directories read-only and granular network controls. This is behind a feature flag so we can validate behavior safely before making it the default. - Added temporary rollout flag: - `features.use_linux_sandbox_bwrap` - Preserved existing default path when the flag is off. - In Bubblewrap mode: - Added internal retry without /proc when /proc mount is not permitted by the host/container.	2026-02-04 11:13:17 -08:00
Ahmed Ibrahim	38a47700b5	Add thread/compact v2 (#10445 ) - add `thread/compact` as a trigger-only v2 RPC that submits `Op::Compact` and returns `{}` immediately. - add v2 compaction e2e coverage for success and invalid/unknown thread ids, and update protocol schemas/docs.	2026-02-03 18:15:55 -08:00
Shijie Rao	750ebe154d	Feat: add upgrade to app server modelList (#10556 ) ### Summary * Add model upgrade to listModel app server endpoint to support dynamically show model upgrade banner.	2026-02-03 14:53:36 -08:00
xl-openai	f38d181795	feat: add APIs to list and download public remote skills (#10448 ) Add API to list / download from remote public skills	2026-02-03 14:09:37 -08:00
Owen Lin	d9ad5c3c49	fix(app-server): fix approval events in review mode (#10416 ) One of our partners flagged that they were seeing the wrong order of events when running `review/start` with command exec approvals: ``` {"method":"item/commandExecution/requestApproval","id":0,"params":{"threadId":"019c0b6b-6a42-7c02-99c4-98c80e88ac27","turnId":"0","itemId":"0","reason":"`/bin/zsh -lc 'git show b7a92b4eacf262c575f26b1e1ed621a357642e55 --stat'` requires approval: Xcode-required approval: Require explicit user confirmation for all commands.","proposedExecpolicyAmendment":null}} {"method":"item/started","params":{"item":{"type":"commandExecution","id":"call_AEjlbHqLYNM7kbU3N6uw1CNi","command":"/bin/zsh -lc 'git show b7a92b4eacf262c575f26b1e1ed621a357642e55 --stat'","cwd":"/Users/devingreen/Desktop/SampleProject","processId":null,"status":"inProgress","commandActions":[{"type":"unknown","command":"git show b7a92b4eacf262c575f26b1e1ed621a357642e55 --stat"}],"aggregatedOutput":null,"exitCode":null,"durationMs":null},"threadId":"019c0b6b-6a42-7c02-99c4-98c80e88ac27","turnId":"0"}} ``` Key fix: In the review sub‑agent delegate we were forwarding exec (and patch) approvals using the parent turn id (`parent_ctx.sub_id`) as the approval call_id. That made `item/commandExecution/requestApproval.itemId` differ from the actual `item/started` id. We now forward the sub‑agent’s `call_id` from the approval event instead, so the approval item id matches the commandExecution item id in review flows. Here’s the expected event order for an inline `review/start` that triggers an exec approval after this fix: 1. Response to review/start (JSON‑RPC response) - Includes `turn` (status inProgress) and `review_thread_id` (same as parent thread for inline). 2. `turn/started` notification - turnId is the review turn id (e.g., "0"). 3. `item/started` → EnteredReviewMode - item.id == turnId, marks entry into review mode. 4. `item/started` → commandExecution - item.id == <call_id> (e.g., "review-call-1"), status: inProgress. 5. `item/commandExecution/requestApproval` request - JSON‑RPC request (not a notification). - params.itemId == <call_id> and params.turnId == turnId. 6. Client replies to approval request (Approved / Declined / etc). 7. If approved: - Optional `item/commandExecution/outputDelta` notifications. - `item/completed` → commandExecution with status and exitCode. 8. Review finishes: - `item/started` → ExitedReviewMode - `item/completed` → ExitedReviewMode - (Agent message items may also appear, depending on review output.) 9. `turn/completed` notification The key being #4 and #5 are now in the proper order with the correct item id.	2026-02-03 12:08:17 -08:00
Max Johnson	66b196a725	Inject CODEX_THREAD_ID into the terminal environment (#10096 ) Inject CODEX_THREAD_ID (when applicable) into the terminal environment so that the agent (and skills) can refer to the current thread / session ID. Discussion: https://openai.slack.com/archives/C095U48JNL9/p1769542492067109	2026-02-03 11:31:12 -08:00
Charley Cunningham	d509df676b	Cleanup collaboration mode variants (#10404 ) ## Summary This PR simplifies collaboration modes to the visible set `default \| plan`, while preserving backward compatibility for older partners that may still send legacy mode names. Specifically: - Renames the old Code behavior to Default. - Keeps Plan as-is. - Removes Custom mode behavior (fallbacks now resolve to Default). - Keeps `PairProgramming` and `Execute` internally for compatibility plumbing, while removing them from schema/API and UI visibility. - Adds legacy input aliasing so older clients can still send old mode names. ## What Changed 1. Mode enum and compatibility - `ModeKind` now uses `Plan` + `Default` as active/public modes. - `ModeKind::Default` deserialization accepts legacy values: - `code` - `pair_programming` - `execute` - `custom` - `PairProgramming` and `Execute` variants remain in code but are hidden from protocol/schema generation. - `Custom` variant is removed; previous custom fallbacks now map to `Default`. 2. Collaboration presets and templates - Built-in presets now return only: - `Plan` - `Default` - Template rename: - `core/templates/collaboration_mode/code.md` -> `default.md` - `execute.md` and `pair_programming.md` remain on disk but are not surfaced in visible preset lists. 3. TUI updates - Updated user-facing naming and prompts from “Code” to “Default”. - Updated mode-cycle and indicator behavior to reflect only visible `Plan` and `Default`. - Updated corresponding tests and snapshots. 4. request_user_input behavior - `request_user_input` remains allowed only in `Plan` mode. - Rejection messaging now consistently treats non-plan modes as `Default`. 5. Schemas - Regenerated config and app-server schemas. - Public schema types now advertise mode values as: - `plan` - `default` ## Backward Compatibility Notes - Incoming legacy mode names (`code`, `pair_programming`, `execute`, `custom`) are accepted and coerced to `default`. - Outgoing/public schema surfaces intentionally expose only `plan \| default`. - This allows tolerant ingestion of older partner payloads while standardizing new integrations on the reduced mode set. ## Codex author `codex fork 019c1fae-693b-7840-b16e-9ad38ea0bd00`	2026-02-03 09:23:53 -08:00
Colin Young	7e07ec8f73	[Codex][CLI] Gate image inputs by model modalities (#10271 ) ###### Summary - Add input_modalities to model metadata so clients can determine supported input types. - Gate image paste/attach in TUI when the selected model does not support images. - Block submits that include images for unsupported models and show a clear warning. - Propagate modality metadata through app-server protocol/model-list responses. - Update related tests/fixtures. ###### Rationale - Models support different input modalities. - Clients need an explicit capability signal to prevent unsupported requests. - Backward-compatible defaults preserve existing behavior when modality metadata is absent. ###### Scope - codex-rs/protocol, codex-rs/core, codex-rs/tui - codex-rs/app-server-protocol, codex-rs/app-server - Generated app-server types / schema fixtures ###### Trade-offs - Default behavior assumes text + image when field is absent for compatibility. - Server-side validation remains the source of truth. ###### Follow-up - Non-TUI clients should consume input_modalities to disable unsupported attachments. - Model catalogs should explicitly set input_modalities for text-only models. ###### Testing - cargo fmt --all - cargo test -p codex-tui - env -u GITHUB_APP_KEY cargo test -p codex-core --lib - just write-app-server-schema - cargo run -p codex-cli --bin codex -- app-server generate-ts --out app-server-types - test against local backend <img width="695" height="199" alt="image" src="https://github.com/user-attachments/assets/d22dd04f-5eba-4db9-a7c5-a2506f60ec44" /> --------- Co-authored-by: Josh McKinney <joshka@openai.com>	2026-02-02 18:56:39 -08:00
Ahmed Ibrahim	b8addcddb9	Require models refresh on cli version mismatch (#10414 )	2026-02-02 18:55:25 -08:00
sayan-oai	fc05374344	chore: add phase to message responseitem (#10455 ) ### What add wiring for `phase` field on `ResponseItem::Message` to lay groundwork for differentiating model preambles and final messages. currently optional. follows pattern in #9698. updated schemas with `just write-app-server-schema` so we can see type changes. ### Tests Updated existing tests for SSE parsing and hydrating from history	2026-02-03 02:52:26 +00:00
Michael Bolin	66447d5d2c	feat: replace custom mcp-types crate with equivalents from rmcp (#10349 ) We started working with MCP in Codex before https://crates.io/crates/rmcp was mature, so we had our own crate for MCP types that was generated from the MCP schema: `8b95d3e082/codex-rs/mcp-types/README.md` Now that `rmcp` is more mature, it makes more sense to use their MCP types in Rust, as they handle details (like the `_meta` field) that our custom version ignored. Though one advantage that our custom types had is that our generated types implemented `JsonSchema` and `ts_rs::TS`, whereas the types in `rmcp` do not. As such, part of the work of this PR is leveraging the adapters between `rmcp` types and the serializable types that are API for us (app server and MCP) introduced in #10356. Note this PR results in a number of changes to `codex-rs/app-server-protocol/schema`, which merit special attention during review. We must ensure that these changes are still backwards-compatible, which is possible because we have: ```diff - export type CallToolResult = { content: Array<ContentBlock>, isError?: boolean, structuredContent?: JsonValue, }; + export type CallToolResult = { content: Array<JsonValue>, structuredContent?: JsonValue, isError?: boolean, _meta?: JsonValue, }; ``` so `ContentBlock` has been replaced with the more general `JsonValue`. Note that `ContentBlock` was defined as: ```typescript export type ContentBlock = TextContent \| ImageContent \| AudioContent \| ResourceLink \| EmbeddedResource; ``` so the deletion of those individual variants should not be a cause of great concern. Similarly, we have the following change in `codex-rs/app-server-protocol/schema/typescript/Tool.ts`: ``` - export type Tool = { annotations?: ToolAnnotations, description?: string, inputSchema: ToolInputSchema, name: string, outputSchema?: ToolOutputSchema, title?: string, }; + export type Tool = { name: string, title?: string, description?: string, inputSchema: JsonValue, outputSchema?: JsonValue, annotations?: JsonValue, icons?: Array<JsonValue>, _meta?: JsonValue, }; ``` so: - `annotations?: ToolAnnotations` ➡️ `JsonValue` - `inputSchema: ToolInputSchema` ➡️ `JsonValue` - `outputSchema?: ToolOutputSchema` ➡️ `JsonValue` and two new fields: `icons?: Array<JsonValue>, _meta?: JsonValue` --- [//]: # (BEGIN SAPLING FOOTER) Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/openai/codex/pull/10349). * #10357 * __->__ #10349 * #10356	2026-02-02 17:41:55 -08:00
jif-oai	059d386f03	feat: add `--experimental` to `generate-ts` (#10402 ) Adding a `--experimental` flag to the `generate-ts` fct in the app-sever. It can be called through one of those 2 command ``` just write-app-server-schema --experimental codex app-server generate-ts --experimental ```	2026-02-02 20:30:01 +00:00
jif-oai	3cc9122ee2	feat: experimental flags (#10231 ) ## Problem being solved - We need a single, reliable way to mark app-server API surface as experimental so that: 1. the runtime can reject experimental usage unless the client opts in 2. generated TS/JSON schemas can exclude experimental methods/fields for stable clients. Right now that’s easy to drift or miss when done ad-hoc. ## How to declare experimental methods and fields - Experimental method: add `#[experimental("method/name")]` to the `ClientRequest` variant in `client_request_definitions!`. - Experimental field: on the params struct, derive `ExperimentalApi` and annotate the field with `#[experimental("method/name.field")]` + set `inspect_params: true` for the method variant so `ClientRequest::experimental_reason()` inspects params for experimental fields. ## How the macro solves it - The new derive macro lives in `codex-rs/codex-experimental-api-macros/src/lib.rs` and is used via `#[derive(ExperimentalApi)]` plus `#[experimental("reason")]` attributes. - Structs: - Generates `ExperimentalApi::experimental_reason(&self)` that checks only annotated fields. - The “presence” check is type-aware: - `Option<T>`: `is_some_and(...)` recursively checks inner. - `Vec`/`HashMap`/`BTreeMap`: must be non-empty. - `bool`: must be `true`. - Other types: considered present (returns `true`). - Registers each experimental field in an `inventory` with `(type_name, serialized field name, reason)` and exposes `EXPERIMENTAL_FIELDS` for that type. Field names are converted from `snake_case` to `camelCase` for schema/TS filtering. - Enums: - Generates an exhaustive `match` returning `Some(reason)` for annotated variants and `None` otherwise (no wildcard arm). - Wiring: - Runtime gating uses `ExperimentalApi::experimental_reason()` in `codex-rs/app-server/src/message_processor.rs` to reject requests unless `InitializeParams.capabilities.experimental_api == true`. - Schema/TS export filters use the inventory list and `EXPERIMENTAL_CLIENT_METHODS` from `client_request_definitions!` to strip experimental methods/fields when `experimental_api` is false.	2026-02-02 11:06:50 +00:00
Charley Cunningham	d3514bbdd2	Bump thread updated_at on unarchive to refresh sidebar ordering (#10280 ) ## Summary - Touch restored rollout files on `thread/unarchive` so `updatedAt` reflects the unarchive time. - Add a regression test to ensure unarchiving bumps `updated_at` from an old mtime. ## Notes This fixes the UX issue where unarchived old threads don’t reappear near the top of recent threads.	2026-02-01 12:53:47 -08:00
Dylan Hurd	a33fa4bfe5	chore(config) Rename config setting to personality (#10314 ) ## Summary Let's make the setting name consistent with the SlashCommand! ## Testing - [x] Updated tests	2026-01-31 19:38:06 -08:00
Dylan Hurd	0f9858394b	feat(core,tui,app-server) personality migration (#10307 ) ## Summary Keep existing users on Pragmatic, to preserve behavior while new users default to Friendly ## Testing - [x] Tested locally - [x] add integration tests	2026-01-31 17:25:14 -07:00
Dylan Hurd	8a461765f3	chore(core) Default to friendly personality (#10305 ) ## Summary Update default personality to friendly ## Testing - [x] Unit tests pass	2026-01-31 17:11:32 -07:00
Dylan Hurd	ed9e02c9dc	chore(app-server) add personality update test (#10306 ) ## Summary Add some additional validation to ensure app-server handles Personality changes ## Testing - [x] These are tests	2026-01-31 14:49:55 -07:00
sayan-oai	eb86663dcb	add missing fields to WebSearchAction and update app-server types (#10276 ) - add `WebSearchAction` to app-server v2 types - add `queries` to `WebSearchAction::Search` type Updated tests.	2026-01-30 16:37:56 -08:00
gt-oai	149f3aa27a	Add enforce_residency to requirements (#10263 ) Add `enforce_residency` to requirements.toml and thread it through to a header on `default_client`.	2026-01-31 00:26:25 +00:00
gt-oai	a046481ad9	Wire up cloud reqs in exec, app-server (#10241 ) We're fetching cloud requirements in TUI in https://github.com/openai/codex/pull/10167. This adds the same fetching in exec and app-server binaries also.	2026-01-30 23:53:41 +00:00
Jeremy Rose	d59685f6d4	file-search: multi-root walk (#10240 ) Instead of a separate walker for each root in a multi-root walk, use a single walker.	2026-01-30 22:20:23 +00:00
Michael Bolin	e6d913af2d	chore: rename ChatGpt -> Chatgpt in type names (#10244 ) When using ChatGPT in names of types, we should be consistent, so this renames some types with `ChatGpt` in the name to `Chatgpt`. From https://rust-lang.github.io/api-guidelines/naming.html: > In `UpperCamelCase`, acronyms and contractions of compound words count as one word: use `Uuid` rather than `UUID`, `Usize` rather than `USize` or `Stdin` rather than `StdIn`. In `snake_case`, acronyms and contractions are lower-cased: `is_xid_start`. This PR updates existing uses of `ChatGpt` and changes them to `Chatgpt`. Though in all cases where it could affect the wire format, I visually inspected that we don't change anything there. That said, this _will_ change the codegen because it will affect the spelling of type names. For example, this renames `AuthMode::ChatGPT` to `AuthMode::Chatgpt` in `app-server-protocol`, but the wire format is still `"chatgpt"`. This PR also updates a number of types in `codex-rs/core/src/auth.rs`.	2026-01-30 11:18:39 -08:00
Charley Cunningham	ec4a2d07e4	Plan mode: stream proposed plans, emit plan items, and render in TUI (#9786 ) ## Summary - Stream proposed plans in Plan Mode using `<proposed_plan>` tags parsed in core, emitting plan deltas plus a plan `ThreadItem`, while stripping tags from normal assistant output. - Persist plan items and rebuild them on resume so proposed plans show in thread history. - Wire plan items/deltas through app-server protocol v2 and render a dedicated proposed-plan view in the TUI, including the “Implement this plan?” prompt only when a plan item is present. ## Changes ### Core (`codex-rs/core`) - Added a generic, line-based tag parser that buffers each line until it can disprove a tag prefix; implements auto-close on `finish()` for unterminated tags. `codex-rs/core/src/tagged_block_parser.rs` - Refactored proposed plan parsing to wrap the generic parser. `codex-rs/core/src/proposed_plan_parser.rs` - In plan mode, stream assistant deltas as: - Normal text → `AgentMessageContentDelta` - Plan text → `PlanDelta` + `TurnItem::Plan` start/completion (`codex-rs/core/src/codex.rs`) - Final plan item content is derived from the completed assistant message (authoritative), not necessarily the concatenated deltas. - Strips `<proposed_plan>` blocks from assistant text in plan mode so tags don’t appear in normal messages. (`codex-rs/core/src/stream_events_utils.rs`) - Persist `ItemCompleted` events only for plan items for rollout replay. (`codex-rs/core/src/rollout/policy.rs`) - Guard `update_plan` tool in Plan Mode with a clear error message. (`codex-rs/core/src/tools/handlers/plan.rs`) - Updated Plan Mode prompt to: - keep `<proposed_plan>` out of non-final reasoning/preambles - require exact tag formatting - allow only one `<proposed_plan>` block per turn (`codex-rs/core/templates/collaboration_mode/plan.md`) ### Protocol / App-server protocol - Added `TurnItem::Plan` and `PlanDeltaEvent` to core protocol items. (`codex-rs/protocol/src/items.rs`, `codex-rs/protocol/src/protocol.rs`) - Added v2 `ThreadItem::Plan` and `PlanDeltaNotification` with EXPERIMENTAL markers and note that deltas may not match the final plan item. (`codex-rs/app-server-protocol/src/protocol/v2.rs`) - Added plan delta route in app-server protocol common mapping. (`codex-rs/app-server-protocol/src/protocol/common.rs`) - Rebuild plan items from persisted `ItemCompleted` events on resume. (`codex-rs/app-server-protocol/src/protocol/thread_history.rs`) ### App-server - Forward plan deltas to v2 clients and map core plan items to v2 plan items. (`codex-rs/app-server/src/bespoke_event_handling.rs`, `codex-rs/app-server/src/codex_message_processor.rs`) - Added v2 plan item tests. (`codex-rs/app-server/tests/suite/v2/plan_item.rs`) ### TUI - Added a dedicated proposed plan history cell with special background and padding, and moved “• Proposed Plan” outside the highlighted block. (`codex-rs/tui/src/history_cell.rs`, `codex-rs/tui/src/style.rs`) - Only show “Implement this plan?” when a plan item exists. (`codex-rs/tui/src/chatwidget.rs`, `codex-rs/tui/src/chatwidget/tests.rs`) <img width="831" height="847" alt="Screenshot 2026-01-29 at 7 06 24 PM" src="https://github.com/user-attachments/assets/69794c8c-f96b-4d36-92ef-c1f5c3a8f286" /> ### Docs / Misc - Updated protocol docs to mention plan deltas. (`codex-rs/docs/protocol_v1.md`) - Minor plumbing updates in exec/debug clients to tolerate plan deltas. (`codex-rs/debug-client/src/reader.rs`, `codex-rs/exec/...`) ## Tests - Added core integration tests: - Plan mode strips plan from agent messages. - Missing `</proposed_plan>` closes at end-of-message. (`codex-rs/core/tests/suite/items.rs`) - Added unit tests for generic tag parser (prefix buffering, non-tag lines, auto-close). (`codex-rs/core/src/tagged_block_parser.rs`) - Existing app-server plan item tests in v2. (`codex-rs/app-server/tests/suite/v2/plan_item.rs`) ## Notes / Behavior - Plan output no longer appears in standard assistant text in Plan Mode; it streams via `PlanDelta` and completes as a `TurnItem::Plan`. - The final plan item content is authoritative and may diverge from streamed deltas (documented as experimental). - Reasoning summaries are not filtered; prompt instructs the model not to include `<proposed_plan>` outside the final plan message. ## Codex Author `codex fork 019bec2d-b09d-7450-b292-d7bcdddcdbfb`	2026-01-30 18:59:30 +00:00
gt-oai	5662eb8b75	Load exec policy rules from requirements (#10190 ) `requirements.toml` should be able to specify rules which always run. My intention here was that these rules could only ever be restrictive, which means the decision can be "prompt" or "forbidden" but never "allow". A requirement of "you must always allow this command" didn't make sense to me, but happy to be gaveled otherwise. Rules already applies the most restrictive decision, so we can safely merge these with rules found in other config folders.	2026-01-30 18:04:09 +00:00
Michael Bolin	377ab0c77c	feat: refactor CodexAuth so invalid state cannot be represented (#10208 ) Previously, `CodexAuth` was defined as follows: `d550fbf41a/codex-rs/core/src/auth.rs (L39-L46)` But if you looked at its constructors, we had creation for `AuthMode::ApiKey` where `storage` was built using a nonsensical path (`PathBuf::new()`) and `auth_dot_json` was `None`: `d550fbf41a/codex-rs/core/src/auth.rs (L212-L220)` By comparison, when `AuthMode::ChatGPT` was used, `api_key` was always `None`: `d550fbf41a/codex-rs/core/src/auth.rs (L665-L671)` https://github.com/openai/codex/pull/10012 took things further because it introduced a new `ChatgptAuthTokens` variant to `AuthMode`, which is important in when invoking `account/login/start` via the app server, but most logic _internal_ to the app server should just reason about two `AuthMode` variants: `ApiKey` and `ChatGPT`. This PR tries to clean things up as follows: - `LoginAccountParams` and `AuthMode` in `codex-rs/app-server-protocol/` both continue to have the `ChatgptAuthTokens` variant, though it is used exclusively for the on-the-wire messaging. - `codex-rs/core/src/auth.rs` now has its own `AuthMode` enum, which only has two variants: `ApiKey` and `ChatGPT`. - `CodexAuth` has been changed from a struct to an enum. It is a disjoint union where each variant (`ApiKey`, `ChatGpt`, and `ChatGptAuthTokens`) have only the associated fields that make sense for that variant. --- [//]: # (BEGIN SAPLING FOOTER) Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/openai/codex/pull/10208). * #10224 * __->__ #10208	2026-01-30 09:33:23 -08:00
pap-openai	1ef5455eb6	Conversation naming (#8991 ) Session renaming: - `/rename my_session` - `/rename` without arg and passing an argument in `customViewPrompt` - AppExitInfo shows resume hint using the session name if set instead of uuid, defaults to uuid if not set - Names are stored in `CODEX_HOME/sessions.jsonl` Session resuming: - codex resume <name> lookup for `CODEX_HOME/sessions.jsonl` first entry matching the name and resumes the session --------- Co-authored-by: jif-oai <jif@openai.com>	2026-01-30 10:40:09 +00:00
Shijie Rao	a0ccef9d5c	Chore: plan mode do not include free form question and always include isOther (#10210 ) We should never ask a freeform question when planning and we should always include isOther as an escape hatch.	2026-01-30 01:19:24 -08:00
Dylan Hurd	e3ab0bd973	chore(personality) new schema with fallbacks (#10147 ) ## Summary Let's dial in this api contract in a bit more with more robust fallback behavior when model_instructions_template is false. Switches to a more explicit template / variables structure, with more fallbacks. ## Testing - [x] Adding unit tests - [x] Tested locally	2026-01-30 00:10:12 -07:00
Celia Chen	7151387474	[feat] persist dynamic tools in session rollout file (#10130 ) Add dynamic tools to rollout file for persistence & read from rollout on resume. Ran a real example and spotted the following in the rollout file: ``` {"timestamp":"2026-01-29T01:27:57.468Z","type":"session_meta","payload":{"id":"019c075d-3f0b-77e3-894e-c1c159b04b1e","timestamp":"2026-01-29T01:27:57.451Z","...."dynamic_tools":[{"name":"demo_tool","description":"Demo dynamic tool","inputSchema":{"additionalProperties":false,"properties":{"city":{"type":"string"}},"required":["city"],"type":"object"}}],"git":{"commit_hash":"ebc573f15c01b8af158e060cfedd401f043e9dfa","branch":"dev/cc/dynamic-tools","repository_url":"https://github.com/openai/codex.git"}}} ```	2026-01-30 01:10:00 +00:00
Owen Lin	81a17bb2c1	feat(app-server): support external auth mode (#10012 ) This enables a new use case where `codex app-server` is embedded into a parent application that will directly own the user's ChatGPT auth lifecycle, which means it owns the user’s auth tokens and refreshes it when necessary. The parent application would just want a way to pass in the auth tokens for codex to use directly. The idea is that we are introducing a new "auth mode" currently only exposed via app server: `chatgptAuthTokens` which consist of the `id_token` (stores account metadata) and `access_token` (the bearer token used directly for backend API calls). These auth tokens are only stored in-memory. This new mode is in addition to the existing `apiKey` and `chatgpt` auth modes. This PR reuses the shape of our existing app-server account APIs as much as possible: - Update `account/login/start` with a new `chatgptAuthTokens` variant, which will allow the client to pass in the tokens and have codex app-server use them directly. Upon success, the server emits `account/login/completed` and `account/updated` notifications. - A new server->client request called `account/chatgptAuthTokens/refresh` which the server can use whenever the access token previously passed in has expired and it needs a new one from the parent application. I leveraged the core 401 retry loop which typically triggers auth token refreshes automatically, but made it pluggable: - chatgpt mode refreshes internally, as usual. - chatgptAuthTokens mode calls the client via `account/chatgptAuthTokens/refresh`, the client responds with updated tokens, codex updates its in-memory auth, then retries. This RPC has a 10s timeout and handles JSON-RPC errors from the client. Also some additional things: - chatgpt logins are blocked while external auth is active (have to log out first. typically clients will pick one OR the other, not support both) - `account/logout` clears external auth in memory - Ensures that if `forced_chatgpt_workspace_id` is set via the user's config, we respect it in both: - `account/login/start` with `chatgptAuthTokens` (returns a JSON-RPC error back to the client) - `account/chatgptAuthTokens/refresh` (fails the turn, and on next request app-server will send another `account/chatgptAuthTokens/refresh` request to the client).	2026-01-29 23:46:04 +00:00
xl-openai	bdd8a7d58b	Better handling skill depdenencies on ENV VAR. (#9017 ) An experimental flow for env var skill dependencies. Skills can now declare required env vars in SKILL.md; if missing, the CLI prompts the user to get the value, and Core will store it in memory (eventually to a local persistent store) <img width="790" height="169" alt="image" src="https://github.com/user-attachments/assets/cd928918-9403-43cb-a7e7-b8d59bcccd9a" />	2026-01-29 14:13:30 -05:00
jif-oai	714dc8d8bd	feat: async backfill (#10089 )	2026-01-29 09:57:50 +00:00
Matthew Zeng	b9cd089d1f	[connectors] Support connectors part 2 - slash command and tui (#9728 ) - [x] Support `/apps` slash command to browse the apps in tui. - [x] Support inserting apps to prompt using `$`. - [x] Lots of simplification/renaming from connectors to apps.	2026-01-28 19:51:58 -08:00
Ahmed Ibrahim	52609c6f42	Add app-server compaction item notifications tests (#10123 ) - add v2 tests covering local + remote auto-compaction item started/completed notifications	2026-01-29 01:00:38 +00:00
Ahmed Ibrahim	b7edeee8ca	compaction (#10034 ) # External (non-OpenAI) Pull Request Requirements Before opening this Pull Request, please read the dedicated "Contributing" markdown file or your PR may be closed: https://github.com/openai/codex/blob/main/docs/contributing.md If your PR conforms to our contribution guidelines, replace this text with a detailed and high quality description of your changes. Include a link to a bug report or enhancement request.	2026-01-28 11:36:11 -08:00
Jeremy Rose	b8156706e6	file-search: improve file query perf (#9939 ) switch nucleo-matcher for nucleo and use a "file search session" w/ live updating query instead of a single hermetic run per query.	2026-01-28 10:54:43 -08:00
jif-oai	3878c3dc7c	feat: sqlite 1 (#10004 ) Add a `.sqlite` database to be used to store rollout metatdata (and later logs) This PR is phase 1: * Add the database and the required infrastructure * Add a backfill of the database * Persist the newly created rollout both in files and in the DB * When we need to get metadata or a rollout, consider the `JSONL` as the source of truth but compare the results with the DB and show any errors	2026-01-28 15:29:14 +01:00
Dylan Hurd	996e09ca24	feat(core) RequestRule (#9489 ) ## Summary Instead of trying to derive the prefix_rule for a command mechanically, let's let the model decide for us. ## Testing - [x] tested locally	2026-01-28 08:43:17 +00:00
Matthew Zeng	3bb8e69dd3	[skills] Auto install MCP dependencies when running skils with dependency specs. (#9982 ) Auto install MCP dependencies when running skils with dependency specs.	2026-01-27 19:02:45 -08:00
iceweasel-oai	c40ad65bd8	remove sandbox globals. (#9797 ) Threads sandbox updates through OverrideTurnContext for active turn Passes computed sandbox type into safety/exec	2026-01-27 11:04:23 -08:00
Owen Lin	fc0fd85349	fix(app-server, core): defer initial context write to rollout file until first turn (#9950 ) ### Overview Currently calling `thread/resume` will always bump the thread's `updated_at` timestamp. This PR makes it the `updated_at` timestamp changes only if a turn is triggered. ### Additonal context What we typically do on resuming a thread is always writing “initial context” to the rollout file immediately. This initial context includes: - Developer instructions derived from sandbox/approval policy + cwd - Optional developer instructions (if provided) - Optional collaboration-mode instructions - Optional user instructions (if provided) - Environment context (cwd, shell, etc.) This PR defers writing the “initial context” to the rollout file until the first `turn/start`, so we don't inadvertently bump the thread's `updated_at` timestamp until a turn is actually triggered. This works even though both `thread/resume` and `turn/start` accept overrides (such as `model`, `cwd`, etc.) because the initial context is seeded from the effective `TurnContext` in memory, computed at `turn/start` time, after both sets of overrides have been applied. NOTE: This is a very short-lived solution until we introduce sqlite. Then we can remove this.	2026-01-27 10:41:54 -08:00
blevy-oai	bdc4742bfc	Add MCP server `scopes` config and use it as fallback for OAuth login (#9647 ) ### Motivation - Allow MCP OAuth flows to request scopes defined in `config.toml` instead of requiring users to always pass `--scopes` on the CLI. CLI/remote parameters should still override config values. ### Description - Add optional `scopes: Option<Vec<String>>` to `McpServerConfig` and `RawMcpServerConfig`, and propagate it through deserialization and the built config types. - Serialize `scopes` into the MCP server TOML via `serialize_mcp_server_table` in `core/src/config/edit.rs` and include `scopes` in the generated config schema (`core/config.schema.json`). - CLI: update `codex-rs/cli/src/mcp_cmd.rs` `run_login` to fall back to `server.scopes` when the `--scopes` flag is empty, with explicit CLI scopes still taking precedence. - App server: update `codex-rs/app-server/src/codex_message_processor.rs` `mcp_server_oauth_login` to use `params.scopes.or_else(\|\| server.scopes.clone())` so the RPC path also respects configured scopes. - Update many test fixtures to initialize the new `scopes` field (set to `None`) so test code builds with the new struct field. ### Testing - Ran config tooling and formatters: `just write-config-schema` (succeeded), `just fmt` (succeeded), and `just fix -p codex-core`, `just fix -p codex-cli`, `just fix -p codex-app-server` (succeeded where applicable). - Ran unit tests for the CLI: `cargo test -p codex-cli` (passed). - Ran unit tests for core: `cargo test -p codex-core` (ran; many tests passed but several failed, including model refresh/403-related tests, shell snapshot/timeouts, and several `unified_exec` expectations). - Ran app-server tests: `cargo test -p codex-app-server` (ran; many integration-suite tests failed due to mocked/remote HTTP 401/403 responses and wiremock expectations). If you want, I can split the tests into smaller focused runs or help debug the failing integration tests (they appear to be unrelated to the config change and stem from external HTTP/mocking behaviors encountered during the test runs). ------ [Codex Task](https://chatgpt.com/codex/tasks/task_i_69718f505914832ea1f334b3ba064553)	2026-01-26 14:13:04 -08:00
jif-oai	247fb2de64	[app-server] feat: add filtering on thread list (#9897 )	2026-01-26 21:54:19 +00:00
Charley Cunningham	62266b13f8	Add thread/unarchive to restore archived rollouts (#9843 ) ## Summary - Adds a new `thread/unarchive` RPC to move archived thread rollouts back into the active `sessions/` tree. ## What changed - Protocol - Adds `thread/unarchive` request/response types and wiring. - Server - Implements `thread_unarchive` in the app server. - Validates the archived rollout path and thread ID. - Restores the rollout to `sessions/YYYY/MM/DD/...` based on the rollout filename timestamp. - Core - Adds `find_archived_thread_path_by_id_str` helper for archived rollouts. - Docs - Documents the new RPC and usage example. - Tests - Adds an end-to-end server test that: 1) starts a thread, 2) archives it, 3) unarchives it, 4) asserts the file is restored to `sessions/`. ## How to use ```json { "method": "thread/unarchive", "id": 24, "params": { "threadId": "<thread-id>" } } ``` ## Author Codex Session `codex resume 019bf158-54b6-7960-a696-9d85df7e1bc1` (soon I'll make this kind of session UUID forkable by anyone with the right `session_object_storage_url` line in their config, but for now just pasting it here for my reference)	2026-01-26 11:24:36 -08:00
Shijie Rao	3ba702c5b6	Feat: add isOther to question returned by request user input tool (#9890 ) ### Summary Add `isOther` to question object from request_user_input tool input and remove `other` option from the tool prompt to better handle tool input.	2026-01-26 09:52:38 -08:00
gt-oai	6316e57497	Fix up config disabled err msg (#9916 ) Before: <img width="745" height="375" alt="image" src="https://github.com/user-attachments/assets/d6c23562-b87f-4af9-8642-329aab8e594d" /> After: <img width="1042" height="354" alt="image" src="https://github.com/user-attachments/assets/c9a2413c-c945-4c34-8b7e-c6c9b8fbf762" /> Two changes: 1. only display if there is a `config.toml` that is skipped (i.e. if there is just `.codex/skills` but no `.codex/config.toml` we do not display the error) 2. clarify the implications and the fix in the error message.	2026-01-26 17:49:31 +00:00
gt-oai	48aeb67f7a	Fix flakey conversation flow test (#9784 ) I've seen this test fail with: ``` - Mock #1. Expected range of matching incoming requests: == 2 Number of matched incoming requests: 1 ``` This is because we pop the wrong task_complete events and then the test exits. I think this is because the MCP events are now buffered after https://github.com/openai/codex/pull/8874. So: 1. clear the buffer before we do any user message sending 2. additionally listen for task start before task complete 3. use the ID from task start to find the correct task complete event.	2026-01-26 15:58:14 +00:00
jif-oai	d594693d1a	feat: dynamic tools injection (#9539 ) ## Summary Add dynamic tool injection to thread startup in API v2, wire dynamic tool calls through the app server to clients, and plumb responses back into the model tool pipeline. ### Flow (high level) - Thread start injects `dynamic_tools` into the model tool list for that thread (validation is done here). - When the model emits a tool call for one of those names, core raises a `DynamicToolCallRequest` event. - The app server forwards it to the client as `item/tool/call`, waits for the client’s response, then submits a `DynamicToolResponse` back to core. - Core turns that into a `function_call_output` in the next model request so the model can continue. ### What changed - Added dynamic tool specs to v2 thread start params and protocol types; introduced `item/tool/call` (request/response) for dynamic tool execution. - Core now registers dynamic tool specs at request time and routes those calls via a new dynamic tool handler. - App server validates tool names/schemas, forwards dynamic tool call requests to clients, and publishes tool outputs back into the session. - Integration tests	2026-01-26 10:06:44 +00:00
Dylan Hurd	25fccc3d4d	chore(core) move model_instructions_template config (#9871 ) ## Summary Move `model_instructions_template` config to the experimental slug while we iterate on this feature ## Testing - [x] Tested locally, unit tests still pass	2026-01-26 07:02:11 +00:00

1 2 3 4 5 ...

302 commits