Enhance pty utils:
* Support closing stdin
* Separate stderr and stdout streams so consumers can tell them apart
* Provide a compatibility helper that merges both streams back into a single combined stream
* Support specifying the terminal size for the pty, including on-demand resizes while the process is already running
* Support terminating the process while still consuming its outputs
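
As a rough illustration of the "merge both streams back" helper, here is a minimal sketch using only the standard library. The function name `combine_output` and the use of `std::sync::mpsc` channels with one forwarding thread per stream are assumptions made for this example; the actual pty utils may be built on a different channel type or an async runtime.

```rust
use std::sync::mpsc::{self, Receiver};
use std::thread;

/// Hypothetical helper: forwards chunks from separate stdout and stderr
/// receivers into one combined receiver. Ordering across the two sources is
/// whatever order the forwarding threads happen to deliver in.
fn combine_output(stdout: Receiver<Vec<u8>>, stderr: Receiver<Vec<u8>>) -> Receiver<Vec<u8>> {
    let (tx, rx) = mpsc::channel();
    for source in [stdout, stderr] {
        let tx = tx.clone();
        thread::spawn(move || {
            // Ends when the source sender is dropped or the consumer goes away.
            for chunk in source {
                if tx.send(chunk).is_err() {
                    break;
                }
            }
        });
    }
    // The original `tx` is dropped here, so `rx` closes once both
    // forwarding threads have finished.
    rx
}
```

Consumers that do not care which stream a chunk came from can read the combined receiver, while others keep the two receivers separate.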
- override startup tooltips with model availability NUX and persist
per-model show counts in config
- stop showing each model after four exposures and fall back to normal
tooltips
## Problem
Codex’s TUI quit behavior has historically been easy to trigger
accidentally and hard to reason
about.
- `Ctrl+C`/`Ctrl+D` could terminate the UI immediately, even though these
are common keys to press while trying
to dismiss a modal, cancel a command, or recover from a stuck state.
- “Quit” and “shutdown” were not consistently separated, so some exit
paths could bypass the
shutdown/cleanup work that should run before the process terminates.
This PR makes quitting both safer (harder to do by accident) and more
uniform across quit
gestures, while keeping the shutdown-first semantics explicit.
## Mental model
After this change, the system treats quitting as a UI request that is
coordinated by the app
layer.
- The UI requests exit via `AppEvent::Exit(ExitMode)`.
- `ExitMode::ShutdownFirst` is the normal user path: the app triggers
`Op::Shutdown`, continues
rendering while shutdown runs, and only ends the UI loop once shutdown
has completed.
- `ExitMode::Immediate` exists as an escape hatch (and as the
post-shutdown “now actually exit”
signal); it bypasses cleanup and should not be the default for
user-triggered quits.
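
The two exit modes can be sketched as follows. `ExitMode`, `AppEvent::Exit`, and `Op::Shutdown` are the names used in this description; `App`, `LoopControl`, `handle_event`, and `on_shutdown_complete` are hypothetical stand-ins for this example and are not the real types or signatures.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ExitMode {
    ShutdownFirst,
    Immediate,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum AppEvent {
    Exit(ExitMode),
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Op {
    Shutdown,
}

/// Hypothetical: what the UI loop should do after handling an event.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum LoopControl {
    KeepRendering,
    Stop,
}

/// Hypothetical stand-in for the app layer.
#[derive(Default)]
struct App {
    submitted_ops: Vec<Op>,
    shutdown_requested: bool,
}

impl App {
    fn handle_event(&mut self, event: AppEvent) -> LoopControl {
        match event {
            // Normal user path: request shutdown once, keep the UI alive.
            AppEvent::Exit(ExitMode::ShutdownFirst) => {
                if !self.shutdown_requested {
                    self.shutdown_requested = true;
                    self.submitted_ops.push(Op::Shutdown);
                }
                LoopControl::KeepRendering
            }
            // Escape hatch, and the post-shutdown "now actually exit" signal.
            AppEvent::Exit(ExitMode::Immediate) => LoopControl::Stop,
        }
    }

    /// Hypothetical hook: once shutdown has completed, the follow-up event
    /// is an immediate exit.
    fn on_shutdown_complete(&self) -> AppEvent {
        AppEvent::Exit(ExitMode::Immediate)
    }
}
```

In this sketch a repeated `ShutdownFirst` request is idempotent, which is an assumption rather than documented behavior.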
User-facing quit gestures are intentionally “two-step” for safety:
- `Ctrl+C` and `Ctrl+D` no longer exit immediately.
- The first press arms a 1-second window and shows a footer hint (“ctrl
+ <key> again to quit”).
- Pressing the same key again within the window requests a
shutdown-first quit; otherwise the
hint expires and the next press starts a fresh window.
Key routing remains modal-first:
- A modal/popup gets first chance to consume `Ctrl+C`.
- If a modal handles `Ctrl+C`, any armed quit shortcut is cleared so
dismissing a modal cannot
prime a subsequent `Ctrl+C` to quit.
- `Ctrl+D` only participates in quitting when the composer is empty and
no modal/popup is active.
The design doc `docs/exit-confirmation-prompt-design.md` captures the
intended routing and the
invariants the UI should maintain.
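
The two-step gesture described in this section amounts to a small time-bounded state machine. The sketch below is an illustration under assumptions: `QuitShortcut`, `QuitKey`, `Outcome`, and the method names are hypothetical, the current time is passed in explicitly to keep the logic testable, and pressing a different key while armed is assumed to start a fresh window.

```rust
use std::time::{Duration, Instant};

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum QuitKey {
    CtrlC,
    CtrlD,
}

#[derive(Debug, PartialEq, Eq)]
enum Outcome {
    /// First press: show the footer hint and arm the shortcut.
    ShowHint,
    /// Second press of the same key inside the window.
    RequestShutdownFirstExit,
}

/// Hypothetical quit-shortcut state: the armed key and its deadline.
struct QuitShortcut {
    armed: Option<(QuitKey, Instant)>,
    window: Duration,
}

impl QuitShortcut {
    fn new() -> Self {
        Self { armed: None, window: Duration::from_secs(1) }
    }

    /// Call only when no modal consumed the key (and, for Ctrl+D, when the
    /// composer is empty); that routing is outside this sketch.
    fn press(&mut self, key: QuitKey, now: Instant) -> Outcome {
        match self.armed {
            Some((armed_key, deadline)) if armed_key == key && now <= deadline => {
                self.armed = None;
                Outcome::RequestShutdownFirstExit
            }
            // Not armed, expired, or a different key: start a fresh window.
            _ => {
                self.armed = Some((key, now + self.window));
                Outcome::ShowHint
            }
        }
    }

    /// Call when a modal handled `Ctrl+C`, so dismissing it cannot prime a quit.
    fn clear_after_modal_consumed_key(&mut self) {
        self.armed = None;
    }
}
```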
## Non-goals
- This does not attempt to redesign modal UX or make modals uniformly
dismissible via `Ctrl+C`.
It only ensures modals get priority and that quit arming does not leak
across modal handling.
- This does not introduce a persistent confirmation prompt/menu for
quitting; the goal is to keep
the exit gesture lightweight and consistent.
- This does not change the semantics of core shutdown itself; it changes
how the UI requests and
sequences it.
## Tradeoffs
- Quitting via `Ctrl+C`/`Ctrl+D` now requires a deliberate second
keypress, which adds friction for
users who relied on the old “instant quit” behavior.
- The UI now maintains a small time-bounded state machine for the armed
shortcut, which increases
complexity and introduces timing-dependent behavior.
This design was chosen over alternatives (a modal confirmation prompt or
a long-lived “are you
sure” state) because it provides an explicit safety barrier while
keeping the flow fast and
keyboard-native.
## Architecture
- `ChatWidget` owns the quit-shortcut state machine and decides when a
quit gesture is allowed
(idle vs cancellable work, composer state, etc.).
- `BottomPane` owns rendering and local input routing for modals/popups.
It is responsible for
consuming cancellation keys when a view is active and for
showing/expiring the footer hint.
- `App` owns shutdown sequencing: translating
`AppEvent::Exit(ShutdownFirst)` into `Op::Shutdown`
and only terminating the UI loop when exit is safe.
This keeps “what should happen” decisions (quit vs interrupt vs ignore)
in the chat/widget layer,
while keeping “how it looks and which view gets the key” in the
bottom-pane layer.
## Observability
You can tell this is working by running the TUIs and exercising the quit
gestures:
- While idle: pressing `Ctrl+C` (or `Ctrl+D` with an empty composer and
no modal) shows a footer
hint for ~1 second; pressing again within that window exits via
shutdown-first.
- While streaming/tools/review are active: `Ctrl+C` interrupts work
rather than quitting.
- With a modal/popup open: `Ctrl+C` dismisses/handles the modal (if it
chooses to) and does not
arm a quit shortcut; a subsequent quick `Ctrl+C` should not quit unless
the user re-arms it.
Failure modes are visible as:
- Quits that happen immediately (no hint window) from `Ctrl+C`/`Ctrl+D`.
- Quits that occur while a modal is open and consuming `Ctrl+C`.
- UI termination before shutdown completes (cleanup skipped).
## Tests
- Updated/added unit and snapshot coverage in `codex-tui` and
`codex-tui2` to validate:
  - The quit hint appears and expires on the expected key.
  - Double-press within the window triggers a shutdown-first quit request.
  - Modal-first routing prevents quit bypass and clears any armed shortcut
    when a modal consumes `Ctrl+C`.
These tests focus on the UI-level invariants and rendered output; they
do not attempt to validate
real terminal key-repeat timing or end-to-end process shutdown behavior.
---
Screenshot:
<img width="912" height="740" alt="Screenshot 2026-01-13 at 1 05 28 PM"
src="https://github.com/user-attachments/assets/18f3d22e-2557-47f2-a369-ae7a9531f29f"
/>
When an invalid `config.toml` key or value is detected, the CLI currently
just quits. This leaves the VS Code extension (VSCE) in a dead state.
This PR changes the behavior so that the CLI no longer quits and instead
surfaces the config error to users so they can act on it. It also surfaces
errors related to "rules" parsing.
This allows us to surface these errors to users in the VSCE, like this:
<img width="342" height="129" alt="Screenshot 2026-01-13 at 4 29 22 PM"
src="https://github.com/user-attachments/assets/a79ffbe7-7604-400c-a304-c5165b6eebc4"
/>
<img width="346" height="244" alt="Screenshot 2026-01-13 at 4 45 06 PM"
src="https://github.com/user-attachments/assets/de874f7c-16a2-4a95-8c6d-15f10482e67b"
/>
Adds an integration test for the new behavior introduced in
https://github.com/openai/codex/pull/9011. The work to create the test
setup was substantial enough that I thought it merited a separate PR.
This integration test spawns `codex` in TUI mode, which requires
spawning a PTY to run successfully, so I had to introduce quite a bit of
scaffolding in `run_codex_cli()`. I was surprised to discover that we
have not done this in our codebase before, so perhaps this should get
moved to a common location so it can be reused.
The test itself verifies that a malformed `rules` in `$CODEX_HOME`
prints a human-readable error message and exits nonzero.
We're running into quite a bit of drag maintaining this test, since
every time we add fields to an EventMsg that happened to be dumped into
the `binary-size-log.jsonl` fixture, this test starts to fail. The fix
is usually to either manually update the `binary-size-log.jsonl` fixture
file, or update the `upgrade_event_payload_for_tests` function to map
the data in that file into something workable.
Eason says it's fine to delete this test, so let's just delete it.
## What?
Fixed error handling in `insert_history_lines_to_writer` where all
terminal operations were silently ignoring errors via `.ok()`.
## Why?
Silent I/O failures could leave the terminal in an inconsistent state
(e.g., scroll region not reset) with no way to debug. This violates Rust
error handling best practices.
## How?
- Changed function signature to return `io::Result<()>`
- Replaced all `.ok()` calls with `?` operator to propagate errors
- Added `tracing::warn!` in wrapper function for backward compatibility
- Updated 15 test call sites to handle Result with `.expect()`
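
The shape of this change can be sketched roughly as below. This is a simplified stand-in, not the real `insert_history_lines_to_writer`: the function names, the `&[&str]` parameter, and the plain `\r\n` writes are assumptions, and `eprintln!` is used in place of `tracing::warn!` so the example depends only on the standard library.

```rust
use std::io::{self, Write};

/// Simplified stand-in: every write uses `?`, so the first I/O failure is
/// returned to the caller instead of being discarded with `.ok()`.
fn write_history_lines<W: Write>(writer: &mut W, lines: &[&str]) -> io::Result<()> {
    for line in lines {
        writer.write_all(line.as_bytes())?;
        writer.write_all(b"\r\n")?;
    }
    writer.flush()
}

/// Wrapper that keeps the old "returns nothing" signature for existing
/// callers but logs the error (the PR uses `tracing::warn!` here).
fn write_history_lines_or_warn<W: Write>(writer: &mut W, lines: &[&str]) {
    if let Err(err) = write_history_lines(writer, lines) {
        eprintln!("failed to insert history lines: {err}");
    }
}
```

Tests that call the fallible function directly can then assert on the result, for example with `.expect("write should succeed")`.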
## Testing
- ✅ Pass all tests
## Type of Change
- [x] Bug fix (non-breaking change)
---------
Signed-off-by: Huaiwu Li <lhwzds@gmail.com>
Co-authored-by: Eric Traut <etraut@openai.com>
This adds `parsed_cmd: Vec<ParsedCommand>` to `ExecApprovalRequestEvent`
in the core protocol (`protocol/src/protocol.rs`), which is also what
this field is named on `ExecCommandBeginEvent`. Honestly, I don't love
the name (it sounds like a single command, but it is actually a list of
them), but I don't want to get distracted by a naming discussion right
now.
This also adds `parsed_cmd` to `ExecCommandApprovalParams` in
`codex-rs/app-server-protocol/src/protocol.rs`, so it will be available
via `codex app-server`, as well.
For consistency, I also updated `ExecApprovalElicitRequestParams` in
`codex-rs/mcp-server/src/exec_approval.rs` to include this field under
the name `codex_parsed_cmd`, as that struct already has a number of
special `codex_*` fields. Note this is the code for when Codex is used
as an MCP _server_ and therefore has to conform to the official spec for
an MCP elicitation type.
`ClientRequest::NewConversation` picks up the reasoning level from the user's defaults in `config.toml`, so it should be reported in `NewConversationResponse`.
Adding the `rollout_path` to the `NewConversationResponse` makes it so a
client can perform subsequent operations on a `(ConversationId,
PathBuf)` pair. #3353 will introduce support for `ArchiveConversation`.
---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/3352).
* #3353
* __->__ #3352
this dramatically improves time to run `cargo test -p codex-core` (~25x
speedup).
before:
```
cargo test -p codex-core 35.96s user 68.63s system 19% cpu 8:49.80 total
```
after:
```
cargo test -p codex-core 5.51s user 8.16s system 63% cpu 21.407 total
```
both tests measured "hot", i.e. on a 2nd run with no filesystem changes,
to exclude compile times.
approach inspired by [Delete Cargo Integration
Tests](https://matklad.github.io/2021/02/27/delete-cargo-integration-tests.html):
we move all test cases in `tests/` into a single suite in order to have a
single binary, as there is significant overhead for each test binary
executed, and because test execution is only parallelized within a single
binary.
We want to send an aggregated output of stderr and stdout so that consumers
don't have to combine stderr and stdout themselves, which sometimes loses
the ordering.
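
A minimal sketch of the idea, assuming chunks are tagged with their source as they arrive: alongside the per-stream buffers, an aggregated buffer is appended to in arrival order, so the interleaving never has to be reconstructed later. The `Stream` and `CapturedOutput` names and the `push` method are hypothetical and not taken from the codebase.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Stream {
    Stdout,
    Stderr,
}

/// Hypothetical capture buffer that keeps both the separate streams and an
/// aggregated view in the order chunks were received.
#[derive(Debug, Default)]
struct CapturedOutput {
    stdout: Vec<u8>,
    stderr: Vec<u8>,
    aggregated: Vec<u8>,
}

impl CapturedOutput {
    fn push(&mut self, stream: Stream, chunk: &[u8]) {
        match stream {
            Stream::Stdout => self.stdout.extend_from_slice(chunk),
            Stream::Stderr => self.stderr.extend_from_slice(chunk),
        }
        // Arrival order is preserved here, which concatenating the two
        // per-stream buffers afterwards could not recover.
        self.aggregated.extend_from_slice(chunk);
    }
}
```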
---------
Co-authored-by: Gabriel Peal <gpeal@users.noreply.github.com>
Codex created this PR from the following prompt:
> upgrade this entire repo to Rust 1.89. Note that this requires
updating codex-rs/rust-toolchain.toml as well as the workflows in
.github/. Make sure that things are "clippy clean" as this change will
likely uncover new Clippy errors. `just fmt` and `cargo clippy --tests`
are sufficient to check for correctness
Note this modifies a lot of lines because it folds nested `if`
statements using `&&`.
---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/2465).
* #2467
* __->__ #2465
Wait for newlines, then render markdown on a line by line basis. Word wrap it for the current terminal size and then spit it out line by line into the UI. Also adds tests and fixes some UI regressions.
We wait until we have an entire line (terminated by a newline), then format it with markdown and stream it into the UI. This increases time to first token but is the right thing to do with our current rendering model IMO. Also lets us add word wrapping!
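
The newline gating and wrapping can be sketched as below. This is an illustration under assumptions, not the actual implementation: `LineGate` and `wrap` are hypothetical names, markdown rendering is omitted, and the wrap is a naive greedy one that measures width in bytes rather than terminal display columns.

```rust
/// Hypothetical buffer that holds streamed deltas until a full line exists.
#[derive(Default)]
struct LineGate {
    pending: String,
}

impl LineGate {
    /// Appends a delta and returns every line completed by it,
    /// without the trailing newline.
    fn push(&mut self, delta: &str) -> Vec<String> {
        self.pending.push_str(delta);
        let mut lines = Vec::new();
        while let Some(idx) = self.pending.find('\n') {
            let rest = self.pending.split_off(idx + 1);
            let mut line = std::mem::replace(&mut self.pending, rest);
            line.pop(); // drop the '\n'
            lines.push(line);
        }
        lines
    }

    /// Flushes whatever is left when the stream ends.
    fn finish(&mut self) -> Option<String> {
        if self.pending.is_empty() {
            None
        } else {
            Some(std::mem::take(&mut self.pending))
        }
    }
}

/// Naive greedy word wrap; a blank input line stays a single blank line.
fn wrap(line: &str, width: usize) -> Vec<String> {
    let mut out = Vec::new();
    let mut current = String::new();
    for word in line.split_whitespace() {
        if !current.is_empty() && current.len() + 1 + word.len() > width {
            out.push(std::mem::take(&mut current));
        }
        if !current.is_empty() {
            current.push(' ');
        }
        current.push_str(word);
    }
    if !current.is_empty() || out.is_empty() {
        out.push(current);
    }
    out
}
```

Each completed line would be rendered and wrapped for the current terminal width before being handed to the UI, which is why nothing is shown until the first newline arrives.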
Stream the model's thoughts and responses instead of waiting for the whole
thing to come through. Very rough right now, but I'm making the risk call to push through.
As stated in `codex-rs/README.md`:
Today, Codex CLI is written in TypeScript and requires Node.js 22+ to
run it. For a number of users, this runtime requirement inhibits
adoption: they would be better served by a standalone executable. As
maintainers, we want Codex to run efficiently in a wide range of
environments with minimal overhead. We also want to take advantage of
operating system-specific APIs to provide better sandboxing, where
possible.
To that end, we are moving forward with a Rust implementation of Codex
CLI contained in this folder, which has the following benefits:
- The CLI compiles to small, standalone, platform-specific binaries.
- Can make direct, native calls to
[seccomp](https://man7.org/linux/man-pages/man2/seccomp.2.html) and
[landlock](https://man7.org/linux/man-pages/man7/landlock.7.html) in
order to support sandboxing on Linux.
- No runtime garbage collection, resulting in lower memory consumption
and better, more predictable performance.
Currently, the Rust implementation is materially behind the TypeScript
implementation in functionality, so continue to use the TypeScript
implementation for the time being. We will publish native executables via
GitHub Releases as soon as we feel the Rust version is usable.