Commit graph

213 commits

Author SHA1 Message Date
Ahmed Ibrahim
b8addcddb9
Require models refresh on cli version mismatch (#10414) 2026-02-02 18:55:25 -08:00
sayan-oai
fc05374344
chore: add phase to message responseitem (#10455)
### What

add wiring for `phase` field on `ResponseItem::Message` to lay
groundwork for differentiating model preambles and final messages.
currently optional.

follows pattern in #9698.

updated schemas with `just write-app-server-schema` so we can see type
changes.

### Tests
Updated existing tests for SSE parsing and hydrating from history
2026-02-03 02:52:26 +00:00
jif-oai
3cc9122ee2
feat: experimental flags (#10231)
## Problem being solved
- We need a single, reliable way to mark app-server API surface as
experimental so that:
  1. the runtime can reject experimental usage unless the client opts in
2. generated TS/JSON schemas can exclude experimental methods/fields for
stable clients.

Right now that’s easy to drift or miss when done ad-hoc.

## How to declare experimental methods and fields
- **Experimental method**: add `#[experimental("method/name")]` to the
`ClientRequest` variant in `client_request_definitions!`.
- **Experimental field**: on the params struct, derive `ExperimentalApi`
and annotate the field with `#[experimental("method/name.field")]` + set
`inspect_params: true` for the method variant so
`ClientRequest::experimental_reason()` inspects params for experimental
fields.

## How the macro solves it
- The new derive macro lives in
`codex-rs/codex-experimental-api-macros/src/lib.rs` and is used via
`#[derive(ExperimentalApi)]` plus `#[experimental("reason")]`
attributes.
- **Structs**:
- Generates `ExperimentalApi::experimental_reason(&self)` that checks
only annotated fields.
  - The “presence” check is type-aware:
    - `Option<T>`: `is_some_and(...)` recursively checks inner.
    - `Vec`/`HashMap`/`BTreeMap`: must be non-empty.
    - `bool`: must be `true`.
    - Other types: considered present (returns `true`).
- Registers each experimental field in an `inventory` with `(type_name,
serialized field name, reason)` and exposes `EXPERIMENTAL_FIELDS` for
that type. Field names are converted from `snake_case` to `camelCase`
for schema/TS filtering.
- **Enums**:
- Generates an exhaustive `match` returning `Some(reason)` for annotated
variants and `None` otherwise (no wildcard arm).
- **Wiring**:
- Runtime gating uses `ExperimentalApi::experimental_reason()` in
`codex-rs/app-server/src/message_processor.rs` to reject requests unless
`InitializeParams.capabilities.experimental_api == true`.
- Schema/TS export filters use the inventory list and
`EXPERIMENTAL_CLIENT_METHODS` from `client_request_definitions!` to
strip experimental methods/fields when `experimental_api` is false.
2026-02-02 11:06:50 +00:00
Charley Cunningham
d3514bbdd2
Bump thread updated_at on unarchive to refresh sidebar ordering (#10280)
## Summary
- Touch restored rollout files on `thread/unarchive` so `updatedAt`
reflects the unarchive time.
- Add a regression test to ensure unarchiving bumps `updated_at` from an
old mtime.

## Notes
This fixes the UX issue where unarchived old threads don’t reappear near
the top of recent threads.
2026-02-01 12:53:47 -08:00
Dylan Hurd
8a461765f3
chore(core) Default to friendly personality (#10305)
## Summary
Update default personality to friendly

## Testing
- [x] Unit tests pass
2026-01-31 17:11:32 -07:00
Dylan Hurd
ed9e02c9dc
chore(app-server) add personality update test (#10306)
## Summary
Add some additional validation to ensure app-server handles Personality
changes

## Testing
- [x] These are tests
2026-01-31 14:49:55 -07:00
Michael Bolin
e6d913af2d
chore: rename ChatGpt -> Chatgpt in type names (#10244)
When using ChatGPT in names of types, we should be consistent, so this
renames some types with `ChatGpt` in the name to `Chatgpt`. From
https://rust-lang.github.io/api-guidelines/naming.html:

> In `UpperCamelCase`, acronyms and contractions of compound words count
as one word: use `Uuid` rather than `UUID`, `Usize` rather than `USize`
or `Stdin` rather than `StdIn`. In `snake_case`, acronyms and
contractions are lower-cased: `is_xid_start`.

This PR updates existing uses of `ChatGpt` and changes them to
`Chatgpt`. Though in all cases where it could affect the wire format, I
visually inspected that we don't change anything there. That said, this
_will_ change the codegen because it will affect the spelling of type
names.

For example, this renames `AuthMode::ChatGPT` to `AuthMode::Chatgpt` in
`app-server-protocol`, but the wire format is still `"chatgpt"`.

This PR also updates a number of types in `codex-rs/core/src/auth.rs`.
2026-01-30 11:18:39 -08:00
Charley Cunningham
ec4a2d07e4
Plan mode: stream proposed plans, emit plan items, and render in TUI (#9786)
## Summary
- Stream proposed plans in Plan Mode using `<proposed_plan>` tags parsed
in core, emitting plan deltas plus a plan `ThreadItem`, while stripping
tags from normal assistant output.
- Persist plan items and rebuild them on resume so proposed plans show
in thread history.
- Wire plan items/deltas through app-server protocol v2 and render a
dedicated proposed-plan view in the TUI, including the “Implement this
plan?” prompt only when a plan item is present.

## Changes

### Core (`codex-rs/core`)
- Added a generic, line-based tag parser that buffers each line until it
can disprove a tag prefix; implements auto-close on `finish()` for
unterminated tags. `codex-rs/core/src/tagged_block_parser.rs`
- Refactored proposed plan parsing to wrap the generic parser.
`codex-rs/core/src/proposed_plan_parser.rs`
- In plan mode, stream assistant deltas as:
  - **Normal text** → `AgentMessageContentDelta`
  - **Plan text** → `PlanDelta` + `TurnItem::Plan` start/completion  
  (`codex-rs/core/src/codex.rs`)
- Final plan item content is derived from the completed assistant
message (authoritative), not necessarily the concatenated deltas.
- Strips `<proposed_plan>` blocks from assistant text in plan mode so
tags don’t appear in normal messages.
(`codex-rs/core/src/stream_events_utils.rs`)
- Persist `ItemCompleted` events only for plan items for rollout replay.
(`codex-rs/core/src/rollout/policy.rs`)
- Guard `update_plan` tool in Plan Mode with a clear error message.
(`codex-rs/core/src/tools/handlers/plan.rs`)
- Updated Plan Mode prompt to:  
  - keep `<proposed_plan>` out of non-final reasoning/preambles  
  - require exact tag formatting  
  - allow only one `<proposed_plan>` block per turn  
  (`codex-rs/core/templates/collaboration_mode/plan.md`)

### Protocol / App-server protocol
- Added `TurnItem::Plan` and `PlanDeltaEvent` to core protocol items.
(`codex-rs/protocol/src/items.rs`, `codex-rs/protocol/src/protocol.rs`)
- Added v2 `ThreadItem::Plan` and `PlanDeltaNotification` with
EXPERIMENTAL markers and note that deltas may not match the final plan
item. (`codex-rs/app-server-protocol/src/protocol/v2.rs`)
- Added plan delta route in app-server protocol common mapping.
(`codex-rs/app-server-protocol/src/protocol/common.rs`)
- Rebuild plan items from persisted `ItemCompleted` events on resume.
(`codex-rs/app-server-protocol/src/protocol/thread_history.rs`)

### App-server
- Forward plan deltas to v2 clients and map core plan items to v2 plan
items. (`codex-rs/app-server/src/bespoke_event_handling.rs`,
`codex-rs/app-server/src/codex_message_processor.rs`)
- Added v2 plan item tests.
(`codex-rs/app-server/tests/suite/v2/plan_item.rs`)

### TUI
- Added a dedicated proposed plan history cell with special background
and padding, and moved “• Proposed Plan” outside the highlighted block.
(`codex-rs/tui/src/history_cell.rs`, `codex-rs/tui/src/style.rs`)
- Only show “Implement this plan?” when a plan item exists.
(`codex-rs/tui/src/chatwidget.rs`,
`codex-rs/tui/src/chatwidget/tests.rs`)

<img width="831" height="847" alt="Screenshot 2026-01-29 at 7 06 24 PM"
src="https://github.com/user-attachments/assets/69794c8c-f96b-4d36-92ef-c1f5c3a8f286"
/>

### Docs / Misc
- Updated protocol docs to mention plan deltas.
(`codex-rs/docs/protocol_v1.md`)
- Minor plumbing updates in exec/debug clients to tolerate plan deltas.
(`codex-rs/debug-client/src/reader.rs`, `codex-rs/exec/...`)

## Tests
- Added core integration tests:
  - Plan mode strips plan from agent messages.
  - Missing `</proposed_plan>` closes at end-of-message.  
  (`codex-rs/core/tests/suite/items.rs`)
- Added unit tests for generic tag parser (prefix buffering, non-tag
lines, auto-close). (`codex-rs/core/src/tagged_block_parser.rs`)
- Existing app-server plan item tests in v2.
(`codex-rs/app-server/tests/suite/v2/plan_item.rs`)

## Notes / Behavior
- Plan output no longer appears in standard assistant text in Plan Mode;
it streams via `PlanDelta` and completes as a `TurnItem::Plan`.
- The final plan item content is authoritative and may diverge from
streamed deltas (documented as experimental).
- Reasoning summaries are not filtered; prompt instructs the model not
to include `<proposed_plan>` outside the final plan message.

## Codex Author
`codex fork 019bec2d-b09d-7450-b292-d7bcdddcdbfb`
2026-01-30 18:59:30 +00:00
Shijie Rao
a0ccef9d5c
Chore: plan mode do not include free form question and always include isOther (#10210)
We should never ask a freeform question when planning and we should
always include isOther as an escape hatch.
2026-01-30 01:19:24 -08:00
Dylan Hurd
e3ab0bd973
chore(personality) new schema with fallbacks (#10147)
## Summary
Let's dial in this api contract in a bit more with more robust fallback
behavior when model_instructions_template is false.

Switches to a more explicit template / variables structure, with more
fallbacks.

## Testing
- [x] Adding unit tests
- [x] Tested locally
2026-01-30 00:10:12 -07:00
Celia Chen
7151387474
[feat] persist dynamic tools in session rollout file (#10130)
Add dynamic tools to rollout file for persistence & read from rollout on
resume. Ran a real example and spotted the following in the rollout
file:
```
{"timestamp":"2026-01-29T01:27:57.468Z","type":"session_meta","payload":{"id":"019c075d-3f0b-77e3-894e-c1c159b04b1e","timestamp":"2026-01-29T01:27:57.451Z","...."dynamic_tools":[{"name":"demo_tool","description":"Demo dynamic tool","inputSchema":{"additionalProperties":false,"properties":{"city":{"type":"string"}},"required":["city"],"type":"object"}}],"git":{"commit_hash":"ebc573f15c01b8af158e060cfedd401f043e9dfa","branch":"dev/cc/dynamic-tools","repository_url":"https://github.com/openai/codex.git"}}}
```
2026-01-30 01:10:00 +00:00
Owen Lin
81a17bb2c1
feat(app-server): support external auth mode (#10012)
This enables a new use case where `codex app-server` is embedded into a
parent application that will directly own the user's ChatGPT auth
lifecycle, which means it owns the user’s auth tokens and refreshes it
when necessary. The parent application would just want a way to pass in
the auth tokens for codex to use directly.

The idea is that we are introducing a new "auth mode" currently only
exposed via app server: **`chatgptAuthTokens`** which consist of the
`id_token` (stores account metadata) and `access_token` (the bearer
token used directly for backend API calls). These auth tokens are only
stored in-memory. This new mode is in addition to the existing `apiKey`
and `chatgpt` auth modes.

This PR reuses the shape of our existing app-server account APIs as much
as possible:
- Update `account/login/start` with a new `chatgptAuthTokens` variant,
which will allow the client to pass in the tokens and have codex
app-server use them directly. Upon success, the server emits
`account/login/completed` and `account/updated` notifications.
- A new server->client request called
`account/chatgptAuthTokens/refresh` which the server can use whenever
the access token previously passed in has expired and it needs a new one
from the parent application.

I leveraged the core 401 retry loop which typically triggers auth token
refreshes automatically, but made it pluggable:
- **chatgpt** mode refreshes internally, as usual.
- **chatgptAuthTokens** mode calls the client via
`account/chatgptAuthTokens/refresh`, the client responds with updated
tokens, codex updates its in-memory auth, then retries. This RPC has a
10s timeout and handles JSON-RPC errors from the client.

Also some additional things:
- chatgpt logins are blocked while external auth is active (have to log
out first. typically clients will pick one OR the other, not support
both)
- `account/logout` clears external auth in memory
- Ensures that if `forced_chatgpt_workspace_id` is set via the user's
config, we respect it in both:
- `account/login/start` with `chatgptAuthTokens` (returns a JSON-RPC
error back to the client)
- `account/chatgptAuthTokens/refresh` (fails the turn, and on next
request app-server will send another `account/chatgptAuthTokens/refresh`
request to the client).
2026-01-29 23:46:04 +00:00
Matthew Zeng
b9cd089d1f
[connectors] Support connectors part 2 - slash command and tui (#9728)
- [x] Support `/apps` slash command to browse the apps in tui.
- [x] Support inserting apps to prompt using `$`.
- [x] Lots of simplification/renaming from connectors to apps.
2026-01-28 19:51:58 -08:00
Ahmed Ibrahim
52609c6f42
Add app-server compaction item notifications tests (#10123)
- add v2 tests covering local + remote auto-compaction item
started/completed notifications
2026-01-29 01:00:38 +00:00
Jeremy Rose
b8156706e6
file-search: improve file query perf (#9939)
switch nucleo-matcher for nucleo and use a "file search session" w/ live
updating query instead of a single hermetic run per query.
2026-01-28 10:54:43 -08:00
Dylan Hurd
996e09ca24
feat(core) RequestRule (#9489)
## Summary
Instead of trying to derive the prefix_rule for a command mechanically,
let's let the model decide for us.

## Testing
- [x] tested locally
2026-01-28 08:43:17 +00:00
Owen Lin
fc0fd85349
fix(app-server, core): defer initial context write to rollout file until first turn (#9950)
### Overview
Currently calling `thread/resume` will always bump the thread's
`updated_at` timestamp. This PR makes it the `updated_at` timestamp
changes only if a turn is triggered.

### Additonal context
What we typically do on resuming a thread is **always** writing “initial
context” to the rollout file immediately. This initial context includes:
- Developer instructions derived from sandbox/approval policy + cwd
- Optional developer instructions (if provided)
- Optional collaboration-mode instructions
- Optional user instructions (if provided)
- Environment context (cwd, shell, etc.)

This PR defers writing the “initial context” to the rollout file until
the first `turn/start`, so we don't inadvertently bump the thread's
`updated_at` timestamp until a turn is actually triggered.

This works even though both `thread/resume` and `turn/start` accept
overrides (such as `model`, `cwd`, etc.) because the initial context is
seeded from the effective `TurnContext` in memory, computed at
`turn/start` time, after both sets of overrides have been applied.

**NOTE**: This is a very short-lived solution until we introduce sqlite.
Then we can remove this.
2026-01-27 10:41:54 -08:00
jif-oai
247fb2de64
[app-server] feat: add filtering on thread list (#9897) 2026-01-26 21:54:19 +00:00
Charley Cunningham
62266b13f8
Add thread/unarchive to restore archived rollouts (#9843)
## Summary
- Adds a new `thread/unarchive` RPC to move archived thread rollouts
back into the active `sessions/` tree.

## What changed
- **Protocol**
  - Adds `thread/unarchive` request/response types and wiring.
- **Server**
  - Implements `thread_unarchive` in the app server.
  - Validates the archived rollout path and thread ID.
- Restores the rollout to `sessions/YYYY/MM/DD/...` based on the rollout
filename timestamp.
- **Core**
- Adds `find_archived_thread_path_by_id_str` helper for archived
rollouts.
- **Docs**
  - Documents the new RPC and usage example.
- **Tests**
  - Adds an end-to-end server test that:
    1) starts a thread,
    2) archives it,
    3) unarchives it,
    4) asserts the file is restored to `sessions/`.

## How to use
```json
{ "method": "thread/unarchive", "id": 24, "params": { "threadId": "<thread-id>" } }
```

## Author Codex Session

`codex resume 019bf158-54b6-7960-a696-9d85df7e1bc1` (soon I'll make this
kind of session UUID forkable by anyone with the right
`session_object_storage_url` line in their config, but for now just
pasting it here for my reference)
2026-01-26 11:24:36 -08:00
Shijie Rao
3ba702c5b6
Feat: add isOther to question returned by request user input tool (#9890)
### Summary
Add `isOther` to question object from request_user_input tool input and
remove `other` option from the tool prompt to better handle tool input.
2026-01-26 09:52:38 -08:00
gt-oai
48aeb67f7a
Fix flakey conversation flow test (#9784)
I've seen this test fail with:

```
 - Mock #1.
        	Expected range of matching incoming requests: == 2
        	Number of matched incoming requests: 1
```

This is because we pop the wrong task_complete events and then the test
exits. I think this is because the MCP events are now buffered after
https://github.com/openai/codex/pull/8874.

So:
1. clear the buffer before we do any user message sending
2. additionally listen for task start before task complete
3. use the ID from task start to find the correct task complete event.
2026-01-26 15:58:14 +00:00
jif-oai
d594693d1a
feat: dynamic tools injection (#9539)
## Summary
Add dynamic tool injection to thread startup in API v2, wire dynamic
tool calls through the app server to clients, and plumb responses back
into the model tool pipeline.

### Flow (high level)
- Thread start injects `dynamic_tools` into the model tool list for that
thread (validation is done here).
- When the model emits a tool call for one of those names, core raises a
`DynamicToolCallRequest` event.
- The app server forwards it to the client as `item/tool/call`, waits
for the client’s response, then submits a `DynamicToolResponse` back to
core.
- Core turns that into a `function_call_output` in the next model
request so the model can continue.

### What changed
- Added dynamic tool specs to v2 thread start params and protocol types;
introduced `item/tool/call` (request/response) for dynamic tool
execution.
- Core now registers dynamic tool specs at request time and routes those
calls via a new dynamic tool handler.
- App server validates tool names/schemas, forwards dynamic tool call
requests to clients, and publishes tool outputs back into the session.
- Integration tests
2026-01-26 10:06:44 +00:00
Dylan Hurd
25fccc3d4d
chore(core) move model_instructions_template config (#9871)
## Summary
Move `model_instructions_template` config to the experimental slug while
we iterate on this feature

## Testing
- [x] Tested locally, unit tests still pass
2026-01-26 07:02:11 +00:00
Dylan Hurd
031bafd1fb
feat(tui) /personality (#9718)
## Summary
Adds /personality selector in the TUI, which leverages the new core
interface in #9644

Notes:
- We are doing some of our own state management for model_info loading
here, but not sure if that's ideal. open to opinions on simpler
approach, but would like to avoid blocking on a larger refactor
- Right now, the `/personality` selector just hides when the model
doesn't support it. we can update this behavior down the line

## Testing
- [x] Tested locally
- [x] Added snapshot tests
2026-01-25 21:59:42 -08:00
Ahmed Ibrahim
58450ba2a1
Use collaboration mode masks without mutating base settings (#9806)
Keep an unmasked base collaboration mode and apply the active mask on
demand. Simplify the TUI mask helpers and update tests/docs to match the
mask contract.
2026-01-25 07:35:31 +00:00
jif-oai
83775f4df1
feat: ephemeral threads (#9765)
Add ephemeral threads capabilities. Only exposed through the
`app-server` v2

The idea is to disable the rollout recorder for those threads.
2026-01-24 14:57:40 +00:00
Ahmed Ibrahim
b3127e2eeb
Have a coding mode and only show coding and plan (#9802) 2026-01-23 19:28:49 -08:00
Ahmed Ibrahim
69cfc73dc6
change collaboration mode to struct (#9793)
Shouldn't cause behavioral change
2026-01-23 17:00:23 -08:00
Dylan Hurd
2b1ee24e11
feat(app-server) Expose personality (#9674)
### Motivation
Exposes a per-thread / per-turn `personality` override in the v2
app-server API so clients can influence model communication style at
thread/turn start. Ensures the override is passed into the session
configuration resolution so it becomes effective for subsequent turns
and headless runners.

### Testing
- [x] Add an integration-style test
`turn_start_accepts_personality_override_v2` in
`codex-rs/app-server/tests/suite/v2/turn_start.rs` that verifies a
`/personality` override results in a developer update message containing
`<personality_spec>` in the outbound model request.

------
[Codex
Task](https://chatgpt.com/codex/tasks/task_i_6971d646b1c08322a689a54d2649f3fe)
2026-01-22 18:00:20 -08:00
Matthew Zeng
a2c829a808
[connectors] Support connectors part 1 - App server & MCP (#9667)
In order to make Codex work with connectors, we add a built-in gateway
MCP that acts as a transparent proxy between the client and the
connectors. The gateway MCP collects actions that are accessible to the
user and sends them down to the user, when a connector action is chosen
to be called, the client invokes the action through the gateway MCP as
well.

 - [x] Add the system built-in gateway MCP to list and run connectors.
 - [x] Add the app server methods and protocol
2026-01-22 16:48:43 -08:00
Owen Lin
733cb68496
feat(app-server): support archived threads in thread/list (#9571) 2026-01-22 12:22:36 -08:00
Owen Lin
80240b3b67
feat(app-server): thread/read API (#9569) 2026-01-22 12:22:01 -08:00
pakrym-oai
b511c38ddb
Support end_turn flag (#9698)
Experimental flag that signals the end of the turn.
2026-01-22 17:27:48 +00:00
Dylan Hurd
96a72828be
feat(core) ModelInfo.model_instructions_template (#9597)
## Summary
#9555 is the start of a rename, so I'm starting to standardize here.
Sets up `model_instructions` templating with a strongly-typed object for
injecting a personality block into the model instructions.

## Testing
- [x] Added tests
- [x] Ran locally
2026-01-21 18:11:18 -08:00
Eric Traut
2ca9a56528
Add layered config.toml support to app server (#9510)
This PR adds support for chained (layered) config.toml file merging for
clients that use the app server interface. This feature already exists
for the TUI, but it does not work for GUI clients.

It does the following:
* Changes code paths for new thread, resume thread, and fork thread to
use the effective config based on the cwd.
* Updates the `config/read` API to accept an optional `cwd` parameter.
If specified, the API returns the effective config based on that cwd
path. Also optionally includes all layers including project config
files. If cwd is not specified, the API falls back on its older behavior
where it considers only the global (non-project) config files when
computing the effective config.

The changes in codex_message_processor.rs look deceptively large. They
mostly just involve moving existing blocks of code to a later point in
some functions so it can use the cwd to calculate the config.

This PR builds upon #9509 and should be reviewed and merged after that
PR.

Tested:
* Verified change with (dependent, as-yet-uncommitted) changes to IDE
Extension and confirmed correct behavior

The full fix requires additional changes in the IDE Extension code base,
but they depend on this PR.
2026-01-21 14:21:48 -08:00
Shijie Rao
3fcb40245e
Chore: update plan mode output in prompt (#9592)
### Summary
* Update plan prompt output
* Update requestUserInput response to be a single key value pair
`answer: String`.
2026-01-21 14:12:18 -08:00
charley-oai
0523a259c8
Reject ask user question tool in Execute and Custom (#9560)
## Summary
- Keep `request_user_input` in the tool list but reject it at runtime in
Execute/Custom modes with a clear model-facing error.
- Add a session accessor for current collaboration mode and enforce the
gate in the request_user_input handler.
- Update core/app-server tests to use Plan mode for success and add
Execute/Custom rejection coverage.
2026-01-20 18:32:17 -08:00
Ahmed Ibrahim
3a0eeb8edf
Show session header before configuration (#9568)
We were skipping if we know the model. We shouldn't
2026-01-21 02:13:54 +00:00
charley-oai
be9e55c5fc
Add total (non-partial) TextElement placeholder accessors (#9545)
## Summary
- Make `TextElement` placeholders private and add a text-backed accessor
to avoid assuming `Some`.
- Since they are optional in the protocol, we want to make sure any
accessors properly handle the None case (getting the placeholder using
the byte range in the text)
- Preserve placeholders during protocol/app-server conversions using the
accessor fallback.
- Update TUI composer/remap logic and tests to use the new
constructor/accessor.
2026-01-20 14:04:11 -08:00
Dylan Hurd
675f165c56
fix(core) Preserve base_instructions in SessionMeta (#9427)
## Summary
This PR consolidates base_instructions onto SessionMeta /
SessionConfiguration, so we ensure `base_instructions` is set once per
session and should be (mostly) immutable, unless:
- overridden by config on resume / fork
- sub-agent tasks, like review or collab


In a future PR, we should convert all references to `base_instructions`
to consistently used the typed struct, so it's less likely that we put
other strings there. See #9423. However, this PR is already quite
complex, so I'm deferring that to a follow-up.

## Testing
- [x] Added a resume test to assert that instructions are preserved. In
particular, `resume_switches_models_preserves_base_instructions` fails
against main.

Existing test coverage thats assert base instructions are preserved
across multiple requests in a session:
- Manual compact keeps baseline instructions:
core/tests/suite/compact.rs:199
- Auto-compact keeps baseline instructions:
core/tests/suite/compact.rs:1142
- Prompt caching reuses the same instructions across two requests:
core/tests/suite/prompt_caching.rs:150 and
core/tests/suite/prompt_caching.rs:157
- Prompt caching with explicit expected string across two requests:
core/tests/suite/prompt_caching.rs:213 and
core/tests/suite/prompt_caching.rs:222
- Resume with model switch keeps original instructions:
core/tests/suite/resume.rs:136
- Compact/resume/fork uses request 0 instructions for later expected
payloads: core/tests/suite/compact_resume_fork.rs:215
2026-01-19 21:59:36 -08:00
Shijie Rao
57ec3a8277
Feat: request user input tool (#9472)
### Summary
* Add `requestUserInput` tool that the model can use for gather
feedback/asking question mid turn.


### Tool input schema
```
{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "title": "requestUserInput input",
  "type": "object",
  "additionalProperties": false,
  "required": ["questions"],
  "properties": {
    "questions": {
      "type": "array",
      "description": "Questions to show the user (1-3). Prefer 1 unless multiple independent decisions block progress.",
      "minItems": 1,
      "maxItems": 3,
      "items": {
        "type": "object",
        "additionalProperties": false,
        "required": ["id", "header", "question"],
        "properties": {
          "id": {
            "type": "string",
            "description": "Stable identifier for mapping answers (snake_case)."
          },
          "header": {
            "type": "string",
            "description": "Short header label shown in the UI (12 or fewer chars)."
          },
          "question": {
            "type": "string",
            "description": "Single-sentence prompt shown to the user."
          },
          "options": {
            "type": "array",
            "description": "Optional 2-3 mutually exclusive choices. Put the recommended option first and suffix its label with \"(Recommended)\". Only include \"Other\" option if we want to include a free form option. If the question is free form in nature, do not include any option.",
            "minItems": 2,
            "maxItems": 3,
            "items": {
              "type": "object",
              "additionalProperties": false,
              "required": ["value", "label", "description"],
              "properties": {
                "value": {
                  "type": "string",
                  "description": "Machine-readable value (snake_case)."
                },
                "label": {
                  "type": "string",
                  "description": "User-facing label (1-5 words)."
                },
                "description": {
                  "type": "string",
                  "description": "One short sentence explaining impact/tradeoff if selected."
                }
              }
            }
          }
        }
      }
    }
  }
}
```

### Tool output schema
```
{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "title": "requestUserInput output",
  "type": "object",
  "additionalProperties": false,
  "required": ["answers"],
  "properties": {
    "answers": {
      "type": "object",
      "description": "Map of question id to user answer.",
      "additionalProperties": {
        "type": "object",
        "additionalProperties": false,
        "required": ["selected"],
        "properties": {
          "selected": {
            "type": "array",
            "items": { "type": "string" }
          },
          "other": {
            "type": ["string", "null"]
          }
        }
      }
    }
  }
}
```
2026-01-19 10:17:30 -08:00
Ahmed Ibrahim
f72f87fbee
Add collaboration modes test prompts (#9443)
# External (non-OpenAI) Pull Request Requirements

Before opening this Pull Request, please read the dedicated
"Contributing" markdown file or your PR may be closed:
https://github.com/openai/codex/blob/main/docs/contributing.md

If your PR conforms to our contribution guidelines, replace this text
with a detailed and high quality description of your changes.

Include a link to a bug report or enhancement request.
2026-01-18 11:39:08 -08:00
Dylan Hurd
80d7a5d7fe
chore(instructions) Remove unread SessionMeta.instructions field (#9423)
### Description
- Remove the now-unused `instructions` field from the session metadata
to simplify SessionMeta and stop propagating transient instruction text
through the rollout recorder API. This was only saving
user_instructions, and was never being read.
- Stop passing user instructions into the rollout writer at session
creation so the rollout header only contains canonical session metadata.

### Testing

- Ran `just fmt` which completed successfully.
- Ran `just fix -p codex-protocol`, `just fix -p codex-core`, `just fix
-p codex-app-server`, `just fix -p codex-tui`, and `just fix -p
codex-tui2` which completed (Clippy fixes applied) as part of
verification.
- Ran `cargo test -p codex-protocol` which passed (28 tests).
- Ran `cargo test -p codex-core` which showed failures in a small set of
tests (not caused by the protocol type change directly):
`default_client::tests::test_create_client_sets_default_headers`,
several `models_manager::manager::tests::refresh_available_models_*`,
and `shell_snapshot::tests::linux_sh_snapshot_includes_sections` (these
tests failed in this CI run).
- Ran `cargo test -p codex-app-server` which reported several failing
integration tests (including
`suite::codex_message_processor_flow::test_codex_jsonrpc_conversation_flow`,
`suite::output_schema::send_user_turn_*`, and
`suite::user_agent::get_user_agent_returns_current_codex_user_agent`).
- `cargo test -p codex-tui` and `cargo test -p codex-tui2` were
attempted but aborted due to disk space exhaustion (`No space left on
device`).

------
[Codex
Task](https://chatgpt.com/codex/tasks/task_i_696bd8ce632483228d298cf07c7eb41c)
2026-01-17 16:02:28 -08:00
Ahmed Ibrahim
8f0e0300d2
Expose collaboration presets (#9421)
Expose collaboration presets for clients

---------

Co-authored-by: Josh McKinney <joshka@openai.com>
2026-01-17 12:32:50 -08:00
Ahmed Ibrahim
146d54cede
Add collaboration_mode override to turns (#9408) 2026-01-16 21:51:25 -08:00
Anton Panasenko
c26fe64539
feat: show forked from session id in /status (#9330)
Summary:
- Add forked_from to SessionMeta/SessionConfiguredEvent and persist it
for forked sessions.
- Surface forked_from in /status for tui + tui2 and add snapshots.
2026-01-16 13:41:46 -08:00
Owen Lin
f1653dd4d3
feat(app-server, core): return threads by created_at or updated_at (#9247)
Add support for returning threads by either `created_at` OR `updated_at`
descending. Previously core always returned threads ordered by
`created_at`.

This PR:
- updates core to be able to list threads by `updated_at` OR
`created_at` descending based on what the caller wants
- also update `thread/list` in app-server to expose this (default to
`created_at` if not specified)

All existing codepaths (app-server, TUI) still default to `created_at`,
so no behavior change is expected with this PR.

**Implementation**
To sort by `updated_at` is a bit nontrivial (whereas `created_at` is
easy due to the way we structure the folders and filenames on disk,
which are all based on `created_at`).

The most naive way to do this without introducing a cache file or sqlite
DB (which we have to implement/maintain) is to scan files in reverse
`created_at` order on disk, and look at the file's mtime (last modified
timestamp according to the filesystem) until we reach `MAX_SCAN_FILES`
(currently set to 10,000). Then, we can return the most recent N
threads.

Based on some quick and dirty benchmarking on my machine with ~1000
rollout files, calling `thread/list` with limit 50, the `updated_at`
path is slower as expected due to all the I/O:
- updated-at: average 103.10 ms
- created-at: average 41.10 ms

Those absolute numbers aren't a big deal IMO, but we can certainly
optimize this in a followup if needed by introducing more state stored
on disk.

**Caveat**
There's also a limitation in that any files older than `MAX_SCAN_FILES`
will be excluded, which means if a user continues a REALLY old thread,
it's possible to not be included. In practice that should not be too big
of an issue.

If a user makes...
- 1000 rollouts/day → threads older than 10 days won't show up
- 100 rollouts/day → ~100 days

If this becomes a problem for some reason, even more motivation to
implement an updated_at cache.
2026-01-16 20:58:55 +00:00
charley-oai
1fa8350ae7
Add text element metadata to protocol, app server, and core (#9331)
The second part of breaking up PR
https://github.com/openai/codex/pull/9116

Summary:

- Add `TextElement` / `ByteRange` to protocol user inputs and user
message events with defaults.
- Thread `text_elements` through app-server v1/v2 request handling and
history rebuild.
- Preserve UI metadata only in user input/events (not `ContentItem`)
while keeping local image attachments in user events for rehydration.

Details:

- Protocol: `UserInput::Text` carries `text_elements`;
`UserMessageEvent` carries `text_elements` + `local_images`.
Serialization includes empty vectors for backward compatibility.
- app-server-protocol: v1 defines `V1TextElement` / `V1ByteRange` in
camelCase with conversions; v2 uses its own camelCase wrapper.
- app-server: v1/v2 input mapping includes `text_elements`; thread
history rebuilds include them.
- Core: user event emission preserves UI metadata while model history
stays clean; history replay round-trips the metadata.
2026-01-15 17:26:41 -08:00
Ahmed Ibrahim
a09711332a
Add migration_markdown in model_info (#9219)
Next step would be to clean Model Upgrade in model presets

---------

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: aibrahim-oai <219906144+aibrahim-oai@users.noreply.github.com>
2026-01-15 01:55:22 +00:00
charley-oai
4a9c2bcc5a
Add text element metadata to types (#9235)
Initial type tweaking PR to make the diff of
https://github.com/openai/codex/pull/9116 smaller

This should not change any behavior, just adds some fields to types
2026-01-14 16:41:50 -08:00