Commit graph

3066 commits

Author SHA1 Message Date
charley-oai
4a9c2bcc5a
Add text element metadata to types (#9235)
Initial type tweaking PR to make the diff of
https://github.com/openai/codex/pull/9116 smaller

This should not change any behavior, just adds some fields to types
2026-01-14 16:41:50 -08:00
Michael Bolin
3728db11b8
fix: eliminate unnecessary clone() for each SSE event (#9238)
Given how many SSE events we get, seems worth fixing.
2026-01-15 00:06:09 +00:00
viyatb-oai
e59e7d163d
fix: correct linux sandbox uid/gid mapping after unshare (#9234)
fixes https://github.com/openai/codex/issues/9233
## Summary
- capture effective uid/gid before unshare for user namespace maps
- pass captured ids into uid/gid map writer

## Testing
- just fmt
- just fix -p codex-linux-sandbox
- cargo test -p codex-linux-sandbox
2026-01-14 15:35:53 -08:00
sayan-oai
5e426ac270
add WebSearchMode enum (#9216)
### What
Add `WebSearchMode` enum (disabled, cached live, defaults to cached) to
config + V2 protocol. This enum takes precedence over legacy flags:
`web_search_cached`, `web_search_request`, and `tools.web_search`.

Keep `--search` as live.

### Tests
Added tests
2026-01-14 12:51:42 -08:00
Josh McKinney
27da8a68d3
fix(tui): disable double-press quit shortcut (#9220)
Disables the default Ctrl+C/Ctrl+D double-press quit UX (keeps the code
path behind a const) while we rethink the quit/interrupt flow.

Tests:
- just fmt
- cargo clippy --fix --all-features --tests --allow-dirty --allow-no-vcs
-p codex-tui
- cargo test -p codex-tui --lib
2026-01-14 20:28:18 +00:00
Josh McKinney
0471ddbe74
fix(tui2): align Steer submit keys (#9218)
- Remove legacy Ctrl+K queuing in tui2; Tab is the queue key.
- Make Enter queue when Steer is disabled and submit immediately when
Steer is enabled.
- Add Steer keybinding docs on both tui and tui2 chat composers.
2026-01-14 19:35:55 +00:00
pakrym-oai
e6d2ef432d
Rename hierarchical_agents to child_agents_md (#9215)
Clearer name
2026-01-14 19:14:24 +00:00
jif-oai
577e1fd1b2
feat: adding piped process to replace PTY when needed (#8797) 2026-01-14 18:44:04 +00:00
gt-oai
fe1e0da102
s/mcp_server_requirements/mcp_servers (#9212)
A simple `s/mcp_server_requirements/mcp_servers/g` for an unreleased
feature. @bolinfest correctly pointed out, it's already in
`requirements.toml` so the `_requirements` is redundant.
2026-01-14 18:41:52 +00:00
pakrym-oai
e958d0337e
Log headers in trace mode (#9214)
To enable:

```
export RUST_LOG="warn,codex_=trace"
```

Sample: 
```
Request completed method=POST url=https://chatgpt.com/backend-api/codex/responses status=200 OK headers={"date": "Wed, 14 Jan 2026 18:21:21 GMT", "transfer-encoding": "chunked", "connection": "keep-alive", "x-codex-plan-type": "business", "x-codex-primary-used-percent": "3", "x-codex-secondary-used-percent": "6", "x-codex-primary-window-minutes": "300", "x-codex-primary-over-secondary-limit-percent": "0", "x-codex-secondary-window-minutes": "10080", "x-codex-primary-reset-after-seconds": "9944", "x-codex-secondary-reset-after-seconds": "171121", "x-codex-primary-reset-at": "1768424824", "x-codex-secondary-reset-at": "1768586001", "x-codex-credits-has-credits": "False", "x-codex-credits-balance": "", "x-codex-credits-unlimited": "False", "x-models-etag": "W/\"7a7ffbc83c159dbd7a2a73aaa9c91b7a\"", "x-oai-request-id": "ffedcd30-6d8a-4c4d-be10-8ebb23c142c8", "x-envoy-upstream-service-time": "417", "x-openai-proxy-wasm": "v0.1", "cf-cache-status": "DYNAMIC", "set-cookie": "__cf_bm=xFKeaMbWNbKO5ZX.K5cJBhj34OA1QvnF_3nkdMThjlA-1768414881-1.0.1.1-uLpsE_BDkUfcmOMaeKVQmv_6_2ytnh_R3lO_il5N5K3YPQEkBo0cOMTdma6bK0Gz.hQYcIesFwKIJht1kZ9JKqAYYnjgB96hF4.sii2U3cE; path=/; expires=Wed, 14-Jan-26 18:51:21 GMT; domain=.chatgpt.com; HttpOnly; Secure; SameSite=None", "report-to": "{\"endpoints\":[{\"url\":\"https:\/\/a.nel.cloudflare.com\/report\/v4?s=4Kc7g4zUhKkIm3xHuB6ba4jyIUqqZ07ETwIPAYQASikRjA8JesbtUKDP9tSrZ5PnzWldaiSz5dZVQFI579LEsCMlMUSelTvmyQ8j4FbFDawi%2FprWZ5iRePiaSalr\"}],\"group\":\"cf-nel\",\"max_age\":604800}", "nel": "{\"success_fraction\":0.01,\"report_to\":\"cf-nel\",\"max_age\":604800}", "strict-transport-security": "max-age=31536000; includeSubDomains; preload", "x-content-type-options": "nosniff", "cross-origin-opener-policy": "same-origin-allow-popups", "referrer-policy": "strict-origin-when-cross-origin", "server": "cloudflare", "cf-ray": "9bdf270adc7aba3a-SEA"} version=HTTP/1.1
```
2026-01-14 18:38:12 +00:00
Ahmed Ibrahim
8e937fbba9
Get model on session configured (#9191)
- Don't try to precompute model unless you know it from `config`
- Block `/model` on session configured
- Queue messages until session configured
- show "loading" in status until session configured
2026-01-14 10:20:41 -08:00
Celia Chen
02f67bace8
fix: Emit response.completed immediately for Responses SSE (#9170)
we see windows test failures like this:
https://github.com/openai/codex/actions/runs/20930055601/job/60138344260.

The issue is that SSE connections sometimes remain open after the
completion event esp. for windows. We should emit the completion event
and return immediately. this is consistent with the protocol:

> The Model streams responses back in an SSE, which are collected until
"completed" message and the SSE terminates

from
https://github.com/openai/codex/blob/dev/cc/fix-windows-test/codex-rs/docs/protocol_v1.md#L37.

this helps us achieve parity with responses websocket logic here:
https://github.com/openai/codex/blob/dev/cc/fix-windows-test/codex-rs/codex-api/src/endpoint/responses_websocket.rs#L220-L227.
2026-01-14 10:05:00 -08:00
jif-oai
3d322fa9d8
feat: add collab prompt (#9208)
Adding a prompt for collab tools. This is only for internal use and the
prompt won't be gated for now as it is not stable yet.

The goal of this PR is to provide the tool required to iterate on the
prompt
2026-01-14 09:58:15 -08:00
jif-oai
6a939ed7a4
feat: emit events around collab tools (#9095)
Emit the following events around the collab tools. On the `app-server`
this will be under `item/started` and `item/completed`
```
#[derive(Debug, Clone, Deserialize, Serialize, PartialEq, JsonSchema, TS)]
pub struct CollabAgentSpawnBeginEvent {
    /// Identifier for the collab tool call.
    pub call_id: String,
    /// Thread ID of the sender.
    pub sender_thread_id: ThreadId,
    /// Initial prompt sent to the agent. Can be empty to prevent CoT leaking at the
    /// beginning.
    pub prompt: String,
}

#[derive(Debug, Clone, Deserialize, Serialize, PartialEq, JsonSchema, TS)]
pub struct CollabAgentSpawnEndEvent {
    /// Identifier for the collab tool call.
    pub call_id: String,
    /// Thread ID of the sender.
    pub sender_thread_id: ThreadId,
    /// Thread ID of the newly spawned agent, if it was created.
    pub new_thread_id: Option<ThreadId>,
    /// Initial prompt sent to the agent. Can be empty to prevent CoT leaking at the
    /// beginning.
    pub prompt: String,
    /// Last known status of the new agent reported to the sender agent.
    pub status: AgentStatus,
}

#[derive(Debug, Clone, Deserialize, Serialize, PartialEq, JsonSchema, TS)]
pub struct CollabAgentInteractionBeginEvent {
    /// Identifier for the collab tool call.
    pub call_id: String,
    /// Thread ID of the sender.
    pub sender_thread_id: ThreadId,
    /// Thread ID of the receiver.
    pub receiver_thread_id: ThreadId,
    /// Prompt sent from the sender to the receiver. Can be empty to prevent CoT
    /// leaking at the beginning.
    pub prompt: String,
}

#[derive(Debug, Clone, Deserialize, Serialize, PartialEq, JsonSchema, TS)]
pub struct CollabAgentInteractionEndEvent {
    /// Identifier for the collab tool call.
    pub call_id: String,
    /// Thread ID of the sender.
    pub sender_thread_id: ThreadId,
    /// Thread ID of the receiver.
    pub receiver_thread_id: ThreadId,
    /// Prompt sent from the sender to the receiver. Can be empty to prevent CoT
    /// leaking at the beginning.
    pub prompt: String,
    /// Last known status of the receiver agent reported to the sender agent.
    pub status: AgentStatus,
}

#[derive(Debug, Clone, Deserialize, Serialize, PartialEq, JsonSchema, TS)]
pub struct CollabWaitingBeginEvent {
    /// Thread ID of the sender.
    pub sender_thread_id: ThreadId,
    /// Thread ID of the receiver.
    pub receiver_thread_id: ThreadId,
    /// ID of the waiting call.
    pub call_id: String,
}

#[derive(Debug, Clone, Deserialize, Serialize, PartialEq, JsonSchema, TS)]
pub struct CollabWaitingEndEvent {
    /// Thread ID of the sender.
    pub sender_thread_id: ThreadId,
    /// Thread ID of the receiver.
    pub receiver_thread_id: ThreadId,
    /// ID of the waiting call.
    pub call_id: String,
    /// Last known status of the receiver agent reported to the sender agent.
    pub status: AgentStatus,
}

#[derive(Debug, Clone, Deserialize, Serialize, PartialEq, JsonSchema, TS)]
pub struct CollabCloseBeginEvent {
    /// Identifier for the collab tool call.
    pub call_id: String,
    /// Thread ID of the sender.
    pub sender_thread_id: ThreadId,
    /// Thread ID of the receiver.
    pub receiver_thread_id: ThreadId,
}

#[derive(Debug, Clone, Deserialize, Serialize, PartialEq, JsonSchema, TS)]
pub struct CollabCloseEndEvent {
    /// Identifier for the collab tool call.
    pub call_id: String,
    /// Thread ID of the sender.
    pub sender_thread_id: ThreadId,
    /// Thread ID of the receiver.
    pub receiver_thread_id: ThreadId,
    /// Last known status of the receiver agent reported to the sender agent before
    /// the close.
    pub status: AgentStatus,
}
```
2026-01-14 17:55:57 +00:00
Josh McKinney
4283a7432b
tui: double-press Ctrl+C/Ctrl+D to quit (#8936)
## Problem

Codex’s TUI quit behavior has historically been easy to trigger
accidentally and hard to reason
about.

- `Ctrl+C`/`Ctrl+D` could terminate the UI immediately, which is a
common key to press while trying
  to dismiss a modal, cancel a command, or recover from a stuck state.
- “Quit” and “shutdown” were not consistently separated, so some exit
paths could bypass the
  shutdown/cleanup work that should run before the process terminates.

This PR makes quitting both safer (harder to do by accident) and more
uniform across quit
gestures, while keeping the shutdown-first semantics explicit.

## Mental model

After this change, the system treats quitting as a UI request that is
coordinated by the app
layer.

- The UI requests exit via `AppEvent::Exit(ExitMode)`.
- `ExitMode::ShutdownFirst` is the normal user path: the app triggers
`Op::Shutdown`, continues
rendering while shutdown runs, and only ends the UI loop once shutdown
has completed.
- `ExitMode::Immediate` exists as an escape hatch (and as the
post-shutdown “now actually exit”
signal); it bypasses cleanup and should not be the default for
user-triggered quits.

User-facing quit gestures are intentionally “two-step” for safety:

- `Ctrl+C` and `Ctrl+D` no longer exit immediately.
- The first press arms a 1-second window and shows a footer hint (“ctrl
+ <key> again to quit”).
- Pressing the same key again within the window requests a
shutdown-first quit; otherwise the
  hint expires and the next press starts a fresh window.

Key routing remains modal-first:

- A modal/popup gets first chance to consume `Ctrl+C`.
- If a modal handles `Ctrl+C`, any armed quit shortcut is cleared so
dismissing a modal cannot
  prime a subsequent `Ctrl+C` to quit.
- `Ctrl+D` only participates in quitting when the composer is empty and
no modal/popup is active.

The design doc `docs/exit-confirmation-prompt-design.md` captures the
intended routing and the
invariants the UI should maintain.

## Non-goals

- This does not attempt to redesign modal UX or make modals uniformly
dismissible via `Ctrl+C`.
It only ensures modals get priority and that quit arming does not leak
across modal handling.
- This does not introduce a persistent confirmation prompt/menu for
quitting; the goal is to keep
  the exit gesture lightweight and consistent.
- This does not change the semantics of core shutdown itself; it changes
how the UI requests and
  sequences it.

## Tradeoffs

- Quitting via `Ctrl+C`/`Ctrl+D` now requires a deliberate second
keypress, which adds friction for
  users who relied on the old “instant quit” behavior.
- The UI now maintains a small time-bounded state machine for the armed
shortcut, which increases
  complexity and introduces timing-dependent behavior.

This design was chosen over alternatives (a modal confirmation prompt or
a long-lived “are you
sure” state) because it provides an explicit safety barrier while
keeping the flow fast and
keyboard-native.

## Architecture

- `ChatWidget` owns the quit-shortcut state machine and decides when a
quit gesture is allowed
  (idle vs cancellable work, composer state, etc.).
- `BottomPane` owns rendering and local input routing for modals/popups.
It is responsible for
consuming cancellation keys when a view is active and for
showing/expiring the footer hint.
- `App` owns shutdown sequencing: translating
`AppEvent::Exit(ShutdownFirst)` into `Op::Shutdown`
  and only terminating the UI loop when exit is safe.

This keeps “what should happen” decisions (quit vs interrupt vs ignore)
in the chat/widget layer,
while keeping “how it looks and which view gets the key” in the
bottom-pane layer.

## Observability

You can tell this is working by running the TUIs and exercising the quit
gestures:

- While idle: pressing `Ctrl+C` (or `Ctrl+D` with an empty composer and
no modal) shows a footer
hint for ~1 second; pressing again within that window exits via
shutdown-first.
- While streaming/tools/review are active: `Ctrl+C` interrupts work
rather than quitting.
- With a modal/popup open: `Ctrl+C` dismisses/handles the modal (if it
chooses to) and does not
arm a quit shortcut; a subsequent quick `Ctrl+C` should not quit unless
the user re-arms it.

Failure modes are visible as:

- Quits that happen immediately (no hint window) from `Ctrl+C`/`Ctrl+D`.
- Quits that occur while a modal is open and consuming `Ctrl+C`.
- UI termination before shutdown completes (cleanup skipped).

## Tests

- Updated/added unit and snapshot coverage in `codex-tui` and
`codex-tui2` to validate:
  - The quit hint appears and expires on the expected key.
- Double-press within the window triggers a shutdown-first quit request.
- Modal-first routing prevents quit bypass and clears any armed shortcut
when a modal consumes
    `Ctrl+C`.

These tests focus on the UI-level invariants and rendered output; they
do not attempt to validate
real terminal key-repeat timing or end-to-end process shutdown behavior.

---
Screenshot:
<img width="912" height="740" alt="Screenshot 2026-01-13 at 1 05 28 PM"
src="https://github.com/user-attachments/assets/18f3d22e-2557-47f2-a369-ae7a9531f29f"
/>
2026-01-14 17:42:52 +00:00
pakrym-oai
92472e7baa
Use current model for review (#9179)
Instead of having a hard-coded default review model, use the current
model for running `/review` unless one is specified in the config.

Also inherit current reasoning effort
2026-01-14 08:59:41 -08:00
viyatb-oai
e1447c3009
feat: add support for read-only bind mounts in the linux sandbox (#9112)
### Motivation

- Landlock alone cannot prevent writes to sensitive in-repo files like
`.git/` when the repo root is writable, so explicit mount restrictions
are required for those paths.
- The sandbox must set up any mounts before calling Landlock so Landlock
can still be applied afterwards and the two mechanisms compose
correctly.

### Description

- Add a new `linux-sandbox` helper `apply_read_only_mounts` in
`linux-sandbox/src/mounts.rs` that: unshares namespaces, maps uids/gids
when required, makes mounts private, bind-mounts targets, and remounts
them read-only.
- Wire the mount step into the sandbox flow by calling
`apply_read_only_mounts(...)` before network/seccomp and before applying
Landlock rules in `linux-sandbox/src/landlock.rs`.
2026-01-14 08:30:46 -08:00
jif-oai
bcd7858ced
feat: add auto refresh on thread listeners (#9105)
This PR is in the scope of multi-agent work. 

An agent (=thread) can now spawn other agents. Those other agents are
not attached to any clients. We need a way to make sure that the clients
are aware of the new threads to look at (for approval for example). This
PR adds a channel to the `ThreadManager` that pushes the ID of those
newly created agents such that the client (here the app-server) can also
subscribe to those ones.
2026-01-14 16:26:01 +00:00
jif-oai
32b1795ff4
chore: clamp min yield time for empty write_stdin (#9156)
After evals, 0 impact on performance
2026-01-14 16:25:40 +00:00
Ahmed Ibrahim
bdae0035ec
Render exec output deltas inline (#9194) 2026-01-14 08:11:12 -08:00
jif-oai
7532f34699
fix: drop double waiting header in TUI (#9145) 2026-01-14 09:52:34 +00:00
jif-oai
bc6d9ef6fc
feat: only source shell snapshot if the file exists (#9197) 2026-01-14 01:28:25 -08:00
jif-oai
dc3deaa3e7
feat: return an error if the image sent by the user is a bad image (#9146)
## Before
When we detect an `InvalidImageRequest`, we replace the image by a
placeholder and keep going

## Now
In such `InvalidImageRequest`, we check if the image is due to a user
message or a tool call output. For tool call output we still replace it
with a placeholder to avoid breaking the agentic loop bu tif this is
because of a user message, we send an error to the user
2026-01-14 09:07:45 +00:00
jif-oai
6fbb89e858
fix: shell snapshot clean-up (#9155)
Clean all shell snapshot files corresponding to sessions that have not
been updated in 7 days
Those files should never leak. The only known cases were it can leak are
during non graceful interrupt of the process (`kill -9, `panic`, OS
crash, ...)
2026-01-14 09:05:46 +00:00
jif-oai
258fc4b401
feat: add sourcing of rc files to shell snapshot (#9150) 2026-01-14 08:58:10 +00:00
Ahmed Ibrahim
b9ff4ec830
change api default model (#9188) 2026-01-13 22:33:34 -08:00
Michael Bolin
0c09dc3c03
feat: add threadId to MCP server messages (#9192)
This favors `threadId` instead of `conversationId` so we use the same
terms as https://developers.openai.com/codex/sdk/.

To test the local build:

```
cd codex-rs
cargo build --bin codex
npx -y @modelcontextprotocol/inspector ./target/debug/codex mcp-server
```

I sent:

```json
{
  "method": "tools/call",
  "params": {
    "name": "codex",
    "arguments": {
      "prompt": "favorite ls option?"
    },
    "_meta": {
      "progressToken": 0
    }
  }
}
```

and got:

```json
{
  "content": [
    {
      "type": "text",
      "text": "`ls -lah` (or `ls -alh`) — long listing, includes dotfiles, human-readable sizes."
    }
  ],
  "structuredContent": {
    "threadId": "019bbb20-bff6-7130-83aa-bf45ab33250e"
  }
}
```

and successfully used the `threadId` in the follow-up with the
`codex-reply` tool call:

```json
{
  "method": "tools/call",
  "params": {
    "name": "codex-reply",
    "arguments": {
      "prompt": "what is the long versoin",
      "threadId": "019bbb20-bff6-7130-83aa-bf45ab33250e"
    },
    "_meta": {
      "progressToken": 1
    }
  }
}
```

whose response also has the `threadId`:

```json
{
  "content": [
    {
      "type": "text",
      "text": "Long listing is `ls -l` (adds permissions, owner/group, size, timestamp)."
    }
  ],
  "structuredContent": {
    "threadId": "019bbb20-bff6-7130-83aa-bf45ab33250e"
  }
}
```

Fixes https://github.com/openai/codex/issues/3712.
2026-01-13 22:14:41 -08:00
sayan-oai
5675af5190
chore: clarify default shell for unified_exec (#8997)
The description of the `shell` arg for `exec_command` states the default
is `/bin/bash`, but AFAICT it's the user's default shell.

Default logic
[here](2a06d64bc9/codex-rs/core/src/tools/handlers/unified_exec.rs (L123)).

EDIT: #9004 has an alternative where we inform the model of the default
shell itself.
2026-01-13 21:37:41 -08:00
Eric Traut
31d9b6f4d2
Improve handling of config and rules errors for app server clients (#9182)
When an invalid config.toml key or value is detected, the CLI currently
just quits. This leaves the VSCE in a dead state.

This PR changes the behavior to not quit and bubble up the config error
to users to make it actionable. It also surfaces errors related to
"rules" parsing.

This allows us to surface these errors to users in the VSCE, like this:

<img width="342" height="129" alt="Screenshot 2026-01-13 at 4 29 22 PM"
src="https://github.com/user-attachments/assets/a79ffbe7-7604-400c-a304-c5165b6eebc4"
/>

<img width="346" height="244" alt="Screenshot 2026-01-13 at 4 45 06 PM"
src="https://github.com/user-attachments/assets/de874f7c-16a2-4a95-8c6d-15f10482e67b"
/>
2026-01-13 17:57:09 -08:00
Ahmed Ibrahim
5a82a72d93
Use offline cache for tui migrations (#9186) 2026-01-13 17:44:12 -08:00
Josh McKinney
ce49e92848
fix(tui): harden paste-burst state transitions (#9124)
User-facing symptom: On terminals that deliver pastes as rapid
KeyCode::Char/Enter streams (notably Windows), paste-burst transient
state
can leak into the next input. Users can see Enter insert a newline when
they meant to submit, or see characters appear late / handled through
the
wrong path.

System problem: PasteBurst is time-based. Clearing only the
classification window (e.g. via clear_window_after_non_char()) can erase
last_plain_char_time without emitting buffered text. If a buffer is
still
non-empty after that, flush_if_due() no longer has a timeout clock to
flush against, so the buffer can get "stuck" until another plain char
arrives.

This was surfaced while adding deterministic regression tests for
paste-burst behavior.

Fix: when disabling burst detection, defuse any in-flight burst state:
flush held/buffered text through handle_paste() (so it follows normal
paste integration), then clear timing and Enter suppression.

Document the rationale inline and update docs/tui-chat-composer.md so
"disable_paste_burst" matches the actual behavior.
2026-01-14 01:42:21 +00:00
Ahmed Ibrahim
4d787a2cc2
Renew cache ttl on etag match (#9174)
so we don't do unnecessary fetches
2026-01-14 01:21:41 +00:00
Ahmed Ibrahim
c96c26cf5b
[CODEX-4427] improve parsed commands (#8933)
**make command summaries more accurate by distinguishing
list/search/read operations across common CLI tools.**
- Added parsing helpers to centralize operand/flag handling (cd_target,
sed_read_path, first_non_flag_operand, single_non_flag_operand,
parse_grep_like, awk_data_file_operand, python_walks_files,
is_python_command) and reused them in summarize_main_tokens/shell
parsing.
- Newly parsed list-files commands: git ls-files, rg --files (incl.
rga/ripgrep-all), eza/exa, tree, du, python -c file-walks, plus fd/find
map to ListFiles when no query.
- Newly parsed search commands: git grep, grep/egrep/fgrep, ag/ack/pt,
rg/rga files-with-matches flags (-l/-L, --files-with-matches,
--files-without-match), with improved flag skipping to avoid
misclassifying args as paths.
- Newly parsed read commands: bat/batcat, less, more, awk <file>, and
more flexible sed -n range + file detection.
- refine “small formatting command” detection for awk/sed, handle cd
with -- or multiple operands, keep pipeline summaries focused on primary
command.
2026-01-13 16:59:33 -08:00
Ahmed Ibrahim
7e33ac7eb6
clean models manager (#9168)
Have only the following Methods:
- `list_models`: getting current available models
- `try_list_models`: sync version no refresh for tui use
- `get_default_model`: get the default model (should be tightened to
core and received on session configuration)
- `get_model_info`: get `ModelInfo` for a specific model (should be
tightened to core but used in tests)
- `refresh_if_new_etag`: trigger refresh on different etags

Also move the cache to its own struct
2026-01-13 16:55:33 -08:00
github-actions[bot]
ebbbee70c6
Update models.json (#9136)
Automated update of models.json.

Co-authored-by: aibrahim-oai <219906144+aibrahim-oai@users.noreply.github.com>
2026-01-13 16:46:24 -08:00
Michael Bolin
4c673086bc
fix: integration test for #9011 (#9166)
Adds an integration test for the new behavior introduced in
https://github.com/openai/codex/pull/9011. The work to create the test
setup was substantial enough that I thought it merited a separate PR.

This integration test spawns `codex` in TUI mode, which requires
spawning a PTY to run successfully, so I had to introduce quite a bit of
scaffolding in `run_codex_cli()`. I was surprised to discover that we
have not done this in our codebase before, so perhaps this should get
moved to a common location so it can be reused.

The test itself verifies that a malformed `rules` in `$CODEX_HOME`
prints a human-readable error message and exits nonzero.
2026-01-13 23:39:34 +00:00
Michael Bolin
2cd1a0a45e
fix: report an appropriate error in the TUI for malformed rules (#9011)
The underlying issue is that when we encountered an error starting a
conversation (any sort of error, though making `$CODEX_HOME/rules` a
file rather than folder was the example in #8803), then we were writing
the message to stderr, but this could be printed over by our UI
framework so the user would not see it. In general, we disallow the use
of `eprintln!()` in this part of the code for exactly this reason,
though this was suppressed by an `#[allow(clippy::print_stderr)]`.

This attempts to clean things up by changing `handle_event()` and
`handle_tui_event()` to return a `Result<AppRunControl>` instead of a
`Result<bool>`, which is a new type introduced in this PR (and depends
on `ExitReason`, also a new type):

```rust
#[derive(Debug)]
pub(crate) enum AppRunControl {
    Continue,
    Exit(ExitReason),
}

#[derive(Debug, Clone)]
pub enum ExitReason {
    UserRequested,
    Fatal(String),
}
```

This makes it possible to exit the primary control flow of the TUI with
richer information. This PR adds `ExitReason` to the existing
`AppExitInfo` struct and updates `handle_app_exit()` to print the error
and exit code `1` in the event of `ExitReason::Fatal`.

I tried to create an integration test for this, but it was a bit
involved, so I published it as a separate PR:
https://github.com/openai/codex/pull/9166. For this PR, please have
faith in my manual testing!

Fixes https://github.com/openai/codex/issues/8803.




---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/9011).
* #9166
* __->__ #9011
2026-01-13 23:21:32 +00:00
pakrym-oai
9f8d3c14ce
Fix flakiness in WebSocket tests (#9169)
The connection was being added to the list after the WebSocket response
was sent.

So the test can sometimes race and observe connections before the list
was updated.

After this change, connection and request is added to the list before
the response is sent.
2026-01-13 15:09:59 -08:00
xl-openai
89403c5e11
Allow close skill popup with esc. (#9165)
<img width="398" height="133" alt="image"
src="https://github.com/user-attachments/assets/3084e793-ce5b-4f92-ad60-4c73e65c21c5"
/>
<img width="242" height="86" alt="image"
src="https://github.com/user-attachments/assets/57dd5587-0aea-4a55-91b8-273702939cb2"
/>

You can now esc to quit the skill popup and submit the input as it is.
2026-01-13 15:02:44 -08:00
Marius Wichtner
3c711f3d16
Fix spinner/Esc interrupt when MCP startup completes mid-turn (#8661)
## **Problem**

Codex’s TUI uses a single “task running” indicator (spinner + Esc interrupt hint)
to communicate “the UI is busy”. In practice, “busy” can mean two different
things: an agent turn is running, or MCP servers are still starting up. Without a
clear contract, those lifecycles can interfere: startup completion can clear the
spinner while a turn is still in progress, or the UI can appear idle while MCP is
still booting. This is user-visible confusion during the most important moments
(startup and the first turn), so it was worth making the contract explicit and
guarding it.

## **Mental model**

`ChatWidget` is the UI-side adapter for the `codex_core::protocol` event stream.
It receives `EventMsg` events and updates two major UI surfaces: the transcript
(history/streaming cells) and the bottom pane (composer + status indicator).

The key concept after this change is that the bottom pane’s “task running”
indicator is treated as **derived UI-busy state**, not “agent is running”. It is
considered active while either:
- an agent turn is in progress (`TurnStarted` → completion/abort), or
- MCP startup is in progress (`McpStartupUpdate` → `McpStartupComplete`).

Those lifecycles are tracked independently, and the bottom-pane indicator is
defined as their union.

## **Non-goals**

- This does not introduce separate UI indicators for “turn busy” vs “MCP busy”.
- This does not change MCP startup behavior, ordering guarantees, or core
  protocol semantics.
- This does not rework unrelated status/header rendering or transcript layout.

## **Tradeoffs**

- The “one flag represents multiple lifecycles” approach remains lossy: it
  preserves correct “busy vs idle” semantics but cannot express *which* kind of
  busy is happening without further UI changes.
- The design keeps complexity low by keeping a single derived boolean, rather
  than adding a more expressive bottom-pane state machine. That’s chosen because
  it matches existing UX and minimizes churn while fixing the confusion.

## **Architecture**

- `codex-core` owns the actual lifecycles and emits `codex_core::protocol`
  events.
- `ChatWidget` owns the UI interpretation of those lifecycles. It is responsible
  for keeping the bottom pane’s derived “busy” state consistent with the event
  stream, and for updating the status header when MCP progress updates arrive.
- The bottom pane remains a dumb renderer of the single “task running” flag; it
  does not learn about MCP or agent turns directly.

## **Observability**

- When working: the spinner/Esc hint stays visible during MCP startup and does
  not disappear mid-turn when `McpStartupComplete` arrives; startup status
  headers can update without clearing “busy” for an active turn.
- When broken: you’ll see the spinner/hint flicker off while output is still
  streaming, or the UI appears idle while MCP startup status is still changing.

## **Tests**

- Adds/strengthens a regression test that asserts MCP startup completion does
  not clear the “task running” indicator for an active turn (in both `tui` and
  `tui2` variants).
- These tests prove the **contract** (“busy is the union of turn + startup”) at
  the UI boundary; they do not attempt to validate MCP startup ordering,
  real-world startup timing, or backend integration behavior.

Fixes #7017

Signed-off-by: 2mawi2 <2mawi2@users.noreply.github.com>
Co-authored-by: 2mawi2 <2mawi2@users.noreply.github.com>
Co-authored-by: Josh McKinney <joshka@openai.com>
2026-01-13 13:56:09 -08:00
Josh McKinney
141d2b5022
test(tui): add deterministic paste-burst tests (#9121)
Replace the old timing-dependent non-ASCII paste test with deterministic
coverage by forcing an active `PasteBurst` and asserting the exact flush
payload.

Add focused unit tests for `PasteBurst` transitions, and add short
"Behavior:" rustdoc notes on chat composer tests to make the state
machine contracts explicit.
2026-01-13 21:55:44 +00:00
Dylan Hurd
ebacd28817
fix(windows-sandbox-rs) bump SETUP_VERSION (#9134)
## Summary
Bumps the windows setup version, to re-trigger windows sandbox setup for
users in the experimental sandbox. We've seen some drift in the ACL
controls, amongst a few other changes. Hopefully this should fix #9062.

## Testing
- [x] Tested locally
2026-01-13 13:47:29 -08:00
Matthew Zeng
e25d2ab3bf
Fresh tooltips (#9130)
Fresh tooltips
2026-01-13 13:06:35 -08:00
Owen Lin
bde734fd1e
feat(app-server): add an --analytics-default-enabled flag (#9118)
Add a new `codex app-server --analytics-default-enabled` CLI flag that
controls whether analytics are enabled by default.

Analytics are disabled by default for app-server. Users have to
explicitly opt in
via the `analytics` section in the config.toml file.

However, for first-party use cases like the VSCode IDE extension, we
default analytics
to be enabled by default by setting this flag. Users can still opt out
by setting this
in their config.toml:

```toml
[analytics]
enabled = false
```

See https://developers.openai.com/codex/config-advanced/#metrics for
more details.
2026-01-13 11:59:39 -08:00
Josh McKinney
58e8f75b27
fix(tui): document paste-burst state machine (#9020)
Add a narrative doc and inline rustdoc explaining how `ChatComposer`
and `PasteBurst` compose into a single state machine on terminals that
lack reliable bracketed paste (notably Windows).

This documents the key states, invariants, and integration points
(`handle_input_basic`, `handle_non_ascii_char`, tick-driven flush) so
future changes are easier to reason about.
2026-01-13 11:48:31 -08:00
gt-oai
2651980bdf
Restrict MCP servers from requirements.toml (#9101)
Enterprises want to restrict the MCP servers their users can use.

Admins can now specify an allowlist of MCPs in `requirements.toml`. The
MCP servers are matched on both Name and Transport (local path or HTTP
URL) -- both must match to allow the MCP server. This prevents
circumventing the allowlist by renaming MCP servers in user config. (It
is still possible to replace the local path e.g. rewrite say
`/usr/local/github-mcp` with a nefarious MCP. We could allow hash
pinning in the future, but that would break updates. I also think this
represents a broader, out-of-scope problem.)

We introduce a new field to Constrained: "normalizer". In general, it is
a fn(T) -> T and applies when `Constrained<T>.set()` is called. In this
particular case, it disables MCP servers which do not match the
allowlist. An alternative solution would remove this and instead throw a
ConstraintError. That would stop Codex launching if any MCP server was
configured which didn't match. I think this is bad.

We currently reuse the enabled flag on MCP servers to disable them, but
don't propagate any information about why they are disabled. I'd like to
add that in a follow up PR, possibly by switching out enabled with an
enum.

In action:

```
# MCP server config has two MCPs. We are going to allowlist one of them.
➜  codex git:(gt/restrict-mcps) ✗ cat ~/.codex/config.toml | grep mcp_servers -A1
[mcp_servers.hello_world]
command = "hello-world-mcp"
--
[mcp_servers.docs]
command = "docs-mcp"

# Restrict the MCPs to the hello_world MCP.
➜  codex git:(gt/restrict-mcps) ✗ defaults read com.openai.codex requirements_toml_base64 | base64 -d
[mcp_server_allowlist.hello_world]
command = "hello-world-mcp"

# List the MCPs, observe hello_world is enabled and docs is disabled.
➜  codex git:(gt/restrict-mcps) ✗ just codex mcp list
cargo run --bin codex -- "$@"
    Finished `dev` profile [unoptimized + debuginfo] target(s) in 0.25s
     Running `target/debug/codex mcp list`
Name         Command          Args  Env  Cwd  Status    Auth
docs         docs-mcp         -     -    -    disabled  Unsupported
hello_world  hello-world-mcp  -     -    -    enabled   Unsupported

# Remove the restrictions.
➜  codex git:(gt/restrict-mcps) ✗ defaults delete com.openai.codex requirements_toml_base64

# Observe both MCPs are enabled.
➜  codex git:(gt/restrict-mcps) ✗ just codex mcp list
cargo run --bin codex -- "$@"
    Finished `dev` profile [unoptimized + debuginfo] target(s) in 0.25s
     Running `target/debug/codex mcp list`
Name         Command          Args  Env  Cwd  Status   Auth
docs         docs-mcp         -     -    -    enabled  Unsupported
hello_world  hello-world-mcp  -     -    -    enabled  Unsupported

# A new requirements that updates the command to one that does not match.
➜  codex git:(gt/restrict-mcps) ✗ cat ~/requirements.toml
[mcp_server_allowlist.hello_world]
command = "hello-world-mcp-v2"

# Use those requirements.
➜  codex git:(gt/restrict-mcps) ✗ defaults write com.openai.codex requirements_toml_base64 "$(base64 -i /Users/gt/requirements.toml)"

# Observe both MCPs are disabled.
➜  codex git:(gt/restrict-mcps) ✗ just codex mcp list
cargo run --bin codex -- "$@"
    Finished `dev` profile [unoptimized + debuginfo] target(s) in 0.75s
     Running `target/debug/codex mcp list`
Name         Command          Args  Env  Cwd  Status    Auth
docs         docs-mcp         -     -    -    disabled  Unsupported
hello_world  hello-world-mcp  -     -    -    disabled  Unsupported
```
2026-01-13 19:45:00 +00:00
Anton Panasenko
51d75bb80a
fix: drop session span at end of the session (#9126) 2026-01-13 11:36:00 -08:00
charley-oai
57ba758df5
Fix queued messages during /review (#9122)
Sending a message during /review interrupts the review, whereas during
normal operation, sending a message while the agent is running will
queue the message. This is unexpected behavior, and since /review
usually takes a while, it takes away a potentially useful operation.

Summary
- Treat review mode as an active task for message queuing so inputs
don’t inject into the running review turn.
- Prevents user submissions from rendering immediately in the transcript
while the review continues streaming.
- Keeps review UX consistent with normal “task running” behavior and
avoids accidental interrupt/replacement.

Notes
- This change only affects UI queuing logic; core review flow and task
lifecycle remain unchanged.
2026-01-13 11:23:22 -08:00
sayan-oai
40e2405998
add generated jsonschema for config.toml (#8956)
### What
Add JSON Schema generation for `config.toml`, with checked‑in
`docs/config.schema.json`. We can move the schema elsewhere if preferred
(and host it if there's demand).

Add fixture test to prevent drift and `just write-config-schema` to
regenerate on schema changes.

Generate MCP config schema from `RawMcpServerConfig` instead of
`McpServerConfig` because that is the runtime type used for
deserialization.

Populate feature flag values into generated schema so they can be
autocompleted.

### Tests
Added tests + regenerate script to prevent drift. Tested autocompletions
using generated jsonschema locally with Even Better TOML.



https://github.com/user-attachments/assets/5aa7cd39-520c-4a63-96fb-63798183d0bc
2026-01-13 10:22:51 -08:00
Devon Rifkin
fe03320791
ollama: default to Responses API for built-ins (#8798)
This is an alternate PR to solving the same problem as
<https://github.com/openai/codex/pull/8227>.

In this PR, when Ollama is used via `--oss` (or via `model_provider =
"ollama"`), we default it to use the Responses format. At runtime, we do
an Ollama version check, and if the version is older than when Responses
support was added to Ollama, we print out a warning.

Because there's no way of configuring the wire api for a built-in
provider, we temporarily add a new `oss_provider`/`model_provider`
called `"ollama-chat"` that will force the chat format.

Once the `"chat"` format is fully removed (see
<https://github.com/openai/codex/discussions/7782>), `ollama-chat` can
be removed as well

---------

Co-authored-by: Eric Traut <etraut@openai.com>
Co-authored-by: Michael Bolin <mbolin@openai.com>
2026-01-13 09:51:41 -08:00