core-agent-ide

Author	SHA1	Message	Date
pakrym-oai	bfd4e2112c	Disable very flaky tests (#11394 ) Collected from last 20 builds of main in https://github.com/openai/codex/commits/main/.	2026-02-10 18:50:11 -08:00
Josh McKinney	4283a7432b	tui: double-press Ctrl+C/Ctrl+D to quit (#8936 ) ## Problem Codex’s TUI quit behavior has historically been easy to trigger accidentally and hard to reason about. - `Ctrl+C`/`Ctrl+D` could terminate the UI immediately, which is a common key to press while trying to dismiss a modal, cancel a command, or recover from a stuck state. - “Quit” and “shutdown” were not consistently separated, so some exit paths could bypass the shutdown/cleanup work that should run before the process terminates. This PR makes quitting both safer (harder to do by accident) and more uniform across quit gestures, while keeping the shutdown-first semantics explicit. ## Mental model After this change, the system treats quitting as a UI request that is coordinated by the app layer. - The UI requests exit via `AppEvent::Exit(ExitMode)`. - `ExitMode::ShutdownFirst` is the normal user path: the app triggers `Op::Shutdown`, continues rendering while shutdown runs, and only ends the UI loop once shutdown has completed. - `ExitMode::Immediate` exists as an escape hatch (and as the post-shutdown “now actually exit” signal); it bypasses cleanup and should not be the default for user-triggered quits. User-facing quit gestures are intentionally “two-step” for safety: - `Ctrl+C` and `Ctrl+D` no longer exit immediately. - The first press arms a 1-second window and shows a footer hint (“ctrl + <key> again to quit”). - Pressing the same key again within the window requests a shutdown-first quit; otherwise the hint expires and the next press starts a fresh window. Key routing remains modal-first: - A modal/popup gets first chance to consume `Ctrl+C`. - If a modal handles `Ctrl+C`, any armed quit shortcut is cleared so dismissing a modal cannot prime a subsequent `Ctrl+C` to quit. - `Ctrl+D` only participates in quitting when the composer is empty and no modal/popup is active. The design doc `docs/exit-confirmation-prompt-design.md` captures the intended routing and the invariants the UI should maintain. ## Non-goals - This does not attempt to redesign modal UX or make modals uniformly dismissible via `Ctrl+C`. It only ensures modals get priority and that quit arming does not leak across modal handling. - This does not introduce a persistent confirmation prompt/menu for quitting; the goal is to keep the exit gesture lightweight and consistent. - This does not change the semantics of core shutdown itself; it changes how the UI requests and sequences it. ## Tradeoffs - Quitting via `Ctrl+C`/`Ctrl+D` now requires a deliberate second keypress, which adds friction for users who relied on the old “instant quit” behavior. - The UI now maintains a small time-bounded state machine for the armed shortcut, which increases complexity and introduces timing-dependent behavior. This design was chosen over alternatives (a modal confirmation prompt or a long-lived “are you sure” state) because it provides an explicit safety barrier while keeping the flow fast and keyboard-native. ## Architecture - `ChatWidget` owns the quit-shortcut state machine and decides when a quit gesture is allowed (idle vs cancellable work, composer state, etc.). - `BottomPane` owns rendering and local input routing for modals/popups. It is responsible for consuming cancellation keys when a view is active and for showing/expiring the footer hint. - `App` owns shutdown sequencing: translating `AppEvent::Exit(ShutdownFirst)` into `Op::Shutdown` and only terminating the UI loop when exit is safe. This keeps “what should happen” decisions (quit vs interrupt vs ignore) in the chat/widget layer, while keeping “how it looks and which view gets the key” in the bottom-pane layer. ## Observability You can tell this is working by running the TUIs and exercising the quit gestures: - While idle: pressing `Ctrl+C` (or `Ctrl+D` with an empty composer and no modal) shows a footer hint for ~1 second; pressing again within that window exits via shutdown-first. - While streaming/tools/review are active: `Ctrl+C` interrupts work rather than quitting. - With a modal/popup open: `Ctrl+C` dismisses/handles the modal (if it chooses to) and does not arm a quit shortcut; a subsequent quick `Ctrl+C` should not quit unless the user re-arms it. Failure modes are visible as: - Quits that happen immediately (no hint window) from `Ctrl+C`/`Ctrl+D`. - Quits that occur while a modal is open and consuming `Ctrl+C`. - UI termination before shutdown completes (cleanup skipped). ## Tests - Updated/added unit and snapshot coverage in `codex-tui` and `codex-tui2` to validate: - The quit hint appears and expires on the expected key. - Double-press within the window triggers a shutdown-first quit request. - Modal-first routing prevents quit bypass and clears any armed shortcut when a modal consumes `Ctrl+C`. These tests focus on the UI-level invariants and rendered output; they do not attempt to validate real terminal key-repeat timing or end-to-end process shutdown behavior. --- Screenshot: <img width="912" height="740" alt="Screenshot 2026-01-13 at 1 05 28 PM" src="https://github.com/user-attachments/assets/18f3d22e-2557-47f2-a369-ae7a9531f29f" />	2026-01-14 17:42:52 +00:00
Eric Traut	31d9b6f4d2	Improve handling of config and rules errors for app server clients (#9182 ) When an invalid config.toml key or value is detected, the CLI currently just quits. This leaves the VSCE in a dead state. This PR changes the behavior to not quit and bubble up the config error to users to make it actionable. It also surfaces errors related to "rules" parsing. This allows us to surface these errors to users in the VSCE, like this: <img width="342" height="129" alt="Screenshot 2026-01-13 at 4 29 22 PM" src="https://github.com/user-attachments/assets/a79ffbe7-7604-400c-a304-c5165b6eebc4" /> <img width="346" height="244" alt="Screenshot 2026-01-13 at 4 45 06 PM" src="https://github.com/user-attachments/assets/de874f7c-16a2-4a95-8c6d-15f10482e67b" />	2026-01-13 17:57:09 -08:00
Michael Bolin	4c673086bc	fix: integration test for #9011 (#9166 ) Adds an integration test for the new behavior introduced in https://github.com/openai/codex/pull/9011. The work to create the test setup was substantial enough that I thought it merited a separate PR. This integration test spawns `codex` in TUI mode, which requires spawning a PTY to run successfully, so I had to introduce quite a bit of scaffolding in `run_codex_cli()`. I was surprised to discover that we have not done this in our codebase before, so perhaps this should get moved to a common location so it can be reused. The test itself verifies that a malformed `rules` in `$CODEX_HOME` prints a human-readable error message and exits nonzero.	2026-01-13 23:39:34 +00:00
Owen Lin	ae2a084fae	chore: delete chatwidget::tests::binary_size_transcript_snapshot tui test (#6759 ) We're running into quite a bit of drag maintaining this test, since every time we add fields to an EventMsg that happened to be dumped into the `binary-size-log.jsonl` fixture, this test starts to fail. The fix is usually to either manually update the `binary-size-log.jsonl` fixture file, or update the `upgrade_event_payload_for_tests` function to map the data in that file into something workable. Eason says it's fine to delete this test, so let's just delete it	2025-11-17 11:11:41 -08:00
Huaiwu Li	9a638dbf4e	fix(tui): propagate errors in insert_history_lines_to_writer (#4266 ) ## What? Fixed error handling in `insert_history_lines_to_writer` where all terminal operations were silently ignoring errors via `.ok()`. ## Why? Silent I/O failures could leave the terminal in an inconsistent state (e.g., scroll region not reset) with no way to debug. This violates Rust error handling best practices. ## How? - Changed function signature to return `io::Result<()>` - Replaced all `.ok()` calls with `?` operator to propagate errors - Added `tracing::warn!` in wrapper function for backward compatibility - Updated 15 test call sites to handle Result with `.expect()` ## Testing - ✅ Pass all tests ## Type of Change - [x] Bug fix (non-breaking change) --------- Signed-off-by: Huaiwu Li <lhwzds@gmail.com> Co-authored-by: Eric Traut <etraut@openai.com>	2025-10-30 18:07:51 -07:00
Michael Bolin	995f5c3614	feat: add Vec<ParsedCommand> to ExecApprovalRequestEvent (#5222 ) This adds `parsed_cmd: Vec<ParsedCommand>` to `ExecApprovalRequestEvent` in the core protocol (`protocol/src/protocol.rs`), which is also what this field is named on `ExecCommandBeginEvent`. Honestly, I don't love the name (it sounds like a single command, but it is actually a list of them), but I don't want to get distracted by a naming discussion right now. This also adds `parsed_cmd` to `ExecCommandApprovalParams` in `codex-rs/app-server-protocol/src/protocol.rs`, so it will be available via `codex app-server`, as well. For consistency, I also updated `ExecApprovalElicitRequestParams` in `codex-rs/mcp-server/src/exec_approval.rs` to include this field under the name `codex_parsed_cmd`, as that struct already has a number of special `codex_*` fields. Note this is the code for when Codex is used as an MCP _server_ and therefore has to conform to the official spec for an MCP elicitation type.	2025-10-15 13:58:40 -07:00
Jeremy Rose	43b63ccae8	update composer + user message styling (#4240 ) Changes: - the composer and user messages now have a colored background that stretches the entire width of the terminal. - the prompt character was changed from a cyan `▌` to a bold `›`. - the "working" shimmer now follows the "dark gray" color of the terminal, better matching the terminal's color scheme \| Terminal + Background \| Screenshot \| \|------------------------------\|------------\| \| iTerm with dark bg \| <img width="810" height="641" alt="Screenshot 2025-09-25 at 11 44 52 AM" src="https://github.com/user-attachments/assets/1317e579-64a9-4785-93e6-98b0258f5d92" /> \| \| iTerm with light bg \| <img width="845" height="540" alt="Screenshot 2025-09-25 at 11 46 29 AM" src="https://github.com/user-attachments/assets/e671d490-c747-4460-af0b-3f8d7f7a6b8e" /> \| \| iTerm with color bg \| <img width="825" height="564" alt="Screenshot 2025-09-25 at 11 47 12 AM" src="https://github.com/user-attachments/assets/141cda1b-1164-41d5-87da-3be11e6a3063" /> \| \| Terminal.app with dark bg \| <img width="577" height="367" alt="Screenshot 2025-09-25 at 11 45 22 AM" src="https://github.com/user-attachments/assets/93fc4781-99f7-4ee7-9c8e-3db3cd854fe5" /> \| \| Terminal.app with light bg \| <img width="577" height="367" alt="Screenshot 2025-09-25 at 11 46 04 AM" src="https://github.com/user-attachments/assets/19bf6a3c-91e0-447b-9667-b8033f512219" /> \| \| Terminal.app with color bg \| <img width="577" height="367" alt="Screenshot 2025-09-25 at 11 45 50 AM" src="https://github.com/user-attachments/assets/dd7c4b5b-342e-4028-8140-f4e65752bd0b" /> \|	2025-09-26 16:35:56 -07:00
Thibault Sottiaux	c93e77b68b	feat: update default (#4076 ) Changes: - Default model and docs now use gpt-5-codex. - Disables the GPT-5 Codex NUX by default. - Keeps presets available for API key users.	2025-09-22 20:10:52 -07:00
Michael Bolin	9bbeb75361	feat: include reasoning_effort in NewConversationResponse (#3506 ) `ClientRequest::NewConversation` picks up the reasoning level from the user's defaults in `config.toml`, so it should be reported in `NewConversationResponse`.	2025-09-11 21:04:40 -07:00
Michael Bolin	2a76a08a9e	fix: include rollout_path in NewConversationResponse (#3352 ) Adding the `rollout_path` to the `NewConversationResponse` makes it so a client can perform subsequent operations on a `(ConversationId, PathBuf)` pair. #3353 will introduce support for `ArchiveConversation`. --- [//]: # (BEGIN SAPLING FOOTER) Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/openai/codex/pull/3352). * #3353 * __->__ #3352	2025-09-09 00:11:48 -07:00
Michael Bolin	91708bb031	fix: fix serde_as annotation and verify with test (#3170 ) I didn't do https://github.com/openai/codex/pull/3163 correctly the first time: now verified with a test.	2025-09-04 10:38:00 -07:00
Jeremy Rose	578ff09e17	prefer ratatui Stylized for constructing lines/spans (#3068 ) no functional change, just simplifying ratatui styling and adding guidance in AGENTS.md for future.	2025-09-02 23:19:54 +00:00
Jeremy Rose	e442ecedab	rework message styling (#2877 ) https://github.com/user-attachments/assets/cf07f62b-1895-44bb-b9c3-7a12032eb371	2025-09-02 17:29:58 +00:00
Jeremy Rose	17e5077507	do not show timeouts as "sandbox error"s (#2587 ) 🙅🫸 ``` ✗ Failed (exit -1) └ 🧪 cargo test --all-features -q sandbox error: command timed out ``` 😌👉 ``` ✗ Failed (exit -1) └ 🧪 cargo test --all-features -q error: command timed out ```	2025-08-25 17:52:23 -07:00
Jeremy Rose	32bbbbad61	test: faster test execution in codex-core (#2633 ) this dramatically improves time to run `cargo test -p codex-core` (~25x speedup). before: ``` cargo test -p codex-core 35.96s user 68.63s system 19% cpu 8:49.80 total ``` after: ``` cargo test -p codex-core 5.51s user 8.16s system 63% cpu 21.407 total ``` both tests measured "hot", i.e. on a 2nd run with no filesystem changes, to exclude compile times. approach inspired by [Delete Cargo Integration Tests](https://matklad.github.io/2021/02/27/delete-cargo-integration-tests.html), we move all test cases in tests/ into a single suite in order to have a single binary, as there is significant overhead for each test binary executed, and because test execution is only parallelized with a single binary.	2025-08-24 11:10:53 -07:00
Ahmed Ibrahim	957d44918d	send-aggregated output (#2364 ) We want to send an aggregated output of stderr and stdout so we don't have to aggregate it stderr+stdout as we lose order sometimes. --------- Co-authored-by: Gabriel Peal <gpeal@users.noreply.github.com>	2025-08-23 16:54:31 +00:00
Jeremy Rose	d994019f3f	tui: coalesce command output; show unabridged commands in transcript (#2590 ) https://github.com/user-attachments/assets/effec7c7-732a-4b61-a2ae-3cb297b6b19b	2025-08-22 16:32:31 -07:00
Jeremy Rose	e95cad1946	hide CoT by default; show headers in status indicator (#2316 ) Plan is for full CoT summaries to be visible in a "transcript view" when we implement that, but for now they're hidden. https://github.com/user-attachments/assets/e8a1b0ef-8f2a-48ff-9625-9c3c67d92cdb	2025-08-20 16:58:56 -07:00
Michael Bolin	50c48e88f5	chore: upgrade to Rust 1.89 (#2465 ) Codex created this PR from the following prompt: > upgrade this entire repo to Rust 1.89. Note that this requires updating codex-rs/rust-toolchain.toml as well as the workflows in .github/. Make sure that things are "clippy clean" as this change will likely uncover new Clippy errors. `just fmt` and `cargo clippy --tests` are sufficient to check for correctness Note this modifies a lot of lines because it folds nested `if` statements using `&&`. --- [//]: # (BEGIN SAPLING FOOTER) Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/openai/codex/pull/2465). * #2467 * __->__ #2465	2025-08-19 13:22:02 -07:00
Jeremy Rose	bb9ce3cb78	tui: standardize tree prefix glyphs to └ (#2274 ) Replace mixed `⎿` and `L` prefixes with `└` in TUI rendering. <img width="454" height="659" alt="Screenshot 2025-08-13 at 4 02 03 PM" src="https://github.com/user-attachments/assets/61c9c7da-830b-4040-bb79-a91be90870ca" />	2025-08-13 19:14:03 -04:00
easong-openai	6340acd885	Re-add markdown streaming (#2029 ) Wait for newlines, then render markdown on a line by line basis. Word wrap it for the current terminal size and then spit it out line by line into the UI. Also adds tests and fixes some UI regressions.	2025-08-12 17:37:28 -07:00
easong-openai	52e12f2b6c	Revert "Streaming markdown (#1920 )" (#1981 ) This reverts commit `2b7139859e`.	2025-08-08 01:38:39 +00:00
easong-openai	2b7139859e	Streaming markdown (#1920 ) We wait until we have an entire newline, then format it with markdown and stream in to the UI. This reduces time to first token but is the right thing to do with our current rendering model IMO. Also lets us add word wrapping!	2025-08-07 18:26:47 -07:00
easong-openai	906d449760	Stream model responses (#1810 ) Stream models thoughts and responses instead of waiting for the whole thing to come through. Very rough right now, but I'm making the risk call to push through.	2025-08-05 04:23:22 +00:00
Michael Bolin	31d0d7a305	feat: initial import of Rust implementation of Codex CLI in codex-rs/ (#629 ) As stated in `codex-rs/README.md`: Today, Codex CLI is written in TypeScript and requires Node.js 22+ to run it. For a number of users, this runtime requirement inhibits adoption: they would be better served by a standalone executable. As maintainers, we want Codex to run efficiently in a wide range of environments with minimal overhead. We also want to take advantage of operating system-specific APIs to provide better sandboxing, where possible. To that end, we are moving forward with a Rust implementation of Codex CLI contained in this folder, which has the following benefits: - The CLI compiles to small, standalone, platform-specific binaries. - Can make direct, native calls to [seccomp](https://man7.org/linux/man-pages/man2/seccomp.2.html) and [landlock](https://man7.org/linux/man-pages/man7/landlock.7.html) in order to support sandboxing on Linux. - No runtime garbage collection, resulting in lower memory consumption and better, more predictable performance. Currently, the Rust implementation is materially behind the TypeScript implementation in functionality, so continue to use the TypeScript implmentation for the time being. We will publish native executables via GitHub Releases as soon as we feel the Rust version is usable.	2025-04-24 13:31:40 -07:00

26 commits