core-agent-ide

Author	SHA1	Message	Date
Jeremy Rose	799364de87	Enable TUI notifications by default (#6633 ) ## Summary - default the `tui.notifications` setting to enabled so desktop notifications work out of the box - update configuration tests and documentation to reflect the new default ## Testing - `cargo test -p codex-core` (fails: `exec::tests::kill_child_process_group_kills_grandchildren_on_timeout` is flaky in this sandbox because the spawned grandchild process stays alive) - `cargo test -p codex-core exec::tests::kill_child_process_group_kills_grandchildren_on_timeout` (fails: same sandbox limitation as above) ------ [Codex Task](https://chatgpt.com/codex/tasks/task_i_69166f811144832c9e8aaf8ee2642373)	2025-11-14 09:28:09 -08:00
Celia Chen	526777c9b4	[App server] add mcp tool call item started/completed events (#6642 ) this PR does two things: 1. refactor `apply_bespoke_event_handling` into a separate file as it's getting kind of long; 2. add mcp tool call `item/started` and `item/completed` events. To roll out app server events asap we didn't properly migrate mcp core events to use TurnItem for mcp tool calls - this will be a follow-up PR. real events generated in log: ``` { "method": "codex/event/mcp_tool_call_end", "params": { "conversationId": "019a8021-26af-7c20-83db-21ca81e44d68", "id": "0", "msg": { "call_id": "call_7EjRQkD9HnfyMWf7tGrT9FKA", "duration": { "nanos": 92708, "secs": 0 }, "invocation": { "arguments": { "server": "" }, "server": "codex", "tool": "list_mcp_resources" }, "result": { "Ok": { "content": [ { "text": "{\"resources\":[]}", "type": "text" } ], "isError": false } }, "type": "mcp_tool_call_end" } } } { "method": "item/completed", "params": { "item": { "arguments": { "server": "" }, "error": null, "id": "call_7EjRQkD9HnfyMWf7tGrT9FKA", "result": { "content": [ { "text": "{\"resources\":[]}", "type": "text" } ], "structuredContent": null }, "server": "codex", "status": "completed", "tool": "list_mcp_resources", "type": "mcpToolCall" } } } ```	2025-11-14 08:08:43 -08:00
jif-oai	f17b392470	feat: cache tokenizer (#6609 )	2025-11-14 17:05:00 +01:00
jif-oai	63c8c01f40	feat: better UI for unified_exec (#6515 ) <img width="376" height="132" alt="Screenshot 2025-11-12 at 17 36 22" src="https://github.com/user-attachments/assets/ce693f0d-5ca0-462e-b170-c20811dcc8d5" />	2025-11-14 16:31:12 +01:00
jif-oai	4788fb179a	feat: add resume logs when doing /new (#6660 ) <img width="769" height="803" alt="Screenshot 2025-11-14 at 10 25 49" src="https://github.com/user-attachments/assets/12fbc21e-cab9-4d0a-a484-1aeb60219f96" />	2025-11-14 11:42:16 +01:00
pakrym-oai	6c384eb9c6	tests: replace mount_sse_once_match with mount_sse_once for SSE mocking (#6640 )	2025-11-13 18:04:05 -08:00
Ahmed Ibrahim	2a6e9b20df	Promote shared helpers for suite tests (#6460 ) ## Summary - add `TestCodex::submit_turn_with_policies` and extend the response helpers with reusable tool-call utilities - update the grep_files, read_file, list_dir, shell_serialization, and tools suites to rely on the shared helpers instead of local copies - make the list_dir helper return `anyhow::Result` so clippy no longer warns about `expect` ## Testing - `just fix -p codex-core` - `cargo test -p codex-core --test all suite::grep_files::grep_files_tool_collects_matches` - `cargo test -p codex-core suite::grep_files::grep_files_tool_collects_matches -- --ignored` (filter requests ignored tests so nothing runs, but the build stays clean) ------ [Codex Task](https://chatgpt.com/codex/tasks/task_i_69112d53abac83219813cab4d7cb6446)	2025-11-13 17:12:10 -08:00
Ahmed Ibrahim	f3c6b1334b	Use shared network gating helper in chat completion tests (#6461 ) ## Summary - replace the bespoke network check in the chat completion payload and SSE tests with the existing `skip_if_no_network!` helper so they follow the same gating convention as the rest of the suite ## Testing - `just fmt` ------ [Codex Task](https://chatgpt.com/codex/tasks/task_i_69112d4cb9f08321ba773e8ccf39778e)	2025-11-13 17:11:43 -08:00
Ahmed Ibrahim	9890ceb939	Avoid double truncation (#6631 ) 1. Avoid double truncation by giving 10% above the tool default constant 2. Add tests that fail when const = 1	2025-11-13 16:59:31 -08:00
pakrym-oai	7b027e7536	Revert "Revert "Overhaul shell detection and centralize command generation for unified exec"" (#6607 ) Reverts openai/codex#6606	2025-11-13 16:45:17 -08:00
Owen Lin	db2aa57d73	[app-server] small fixes for JSON schema export and one-of types (#6614 ) A partner is consuming our generated JSON schema bundle for app-server and identified a few issues: - not all polymorphic / one-of types have a type discriminator - `"$ref": "#/definitions/v2/SandboxPolicy"` is missing - "Option<>" is an invalid schema name, and also unnecessary This PR: - adds the type discriminator to the various types that are missing it except for `SessionSource` and `SubAgentSource` because they are serialized to disk (adding this would break backwards compat for resume), and they should not be necessary to consume for an integration with app-server. - removes the special handling in `export.rs` of various types like SandboxPolicy, which turned out to be unnecessary and incorrect - filters out `Option<>` which was auto-generated for request params that don't need a body For context, we currently pull in wayyy more types than we need through the `EventMsg` god object which we are not planning to expose in API v2 (this is how I suspect `SessionSource` and `SubAgentSource` are being pulled in). But until we have all the necessary v2 notifications in place that will allow us to remove `EventMsg`, we will keep exporting it for now.	2025-11-13 16:25:17 -08:00
Celia Chen	b8ec97c0ef	[App-server] add new v2 events:`item/reasoning/delta`, `item/agentMessage/delta` & `item/reasoning/summaryPartAdded` (#6559 ) core event to app server event mapping: 1. `codex/event/reasoning_content_delta` -> `item/reasoning/summaryTextDelta`. 2. `codex/event/reasoning_raw_content_delta` -> `item/reasoning/textDelta` 3. `codex/event/agent_message_content_delta` → `item/agentMessage/delta`. 4. `codex/event/agent_reasoning_section_break` -> `item/reasoning/summaryPartAdded`. Also added a change in core to pass down content index, summary index and item id from events. Tested with the `git checkout owen/app_server_test_client && cargo run -p codex-app-server-test-client -- send-message-v2 "hello"` and verified that new events are emitted correctly.	2025-11-14 00:25:01 +00:00
Dylan Hurd	2c1b693da4	chore(core) Consolidate apply_patch tests (#6545 ) ## Summary Consolidates our apply_patch tests into one suite, and ensures each test case tests the various ways the harness supports apply_patch: 1. Freeform custom tool call 2. JSON function tool 3. Simple shell call 4. Heredoc shell call There are a few test cases that are specific to a particular variant, I've left those alone. ## Testing - [x] This adds a significant number of tests	2025-11-13 15:52:39 -08:00
Dan Hernandez	b4a53aef47	feat: Add support for --add-dir to exec and TypeScript SDK (#6565 ) ## Summary Adds support for specifying additional directories in the TypeScript SDK through a new `additionalDirectories` option in `ThreadOptions`. ## Changes - Added `additionalDirectories` parameter to `ThreadOptions` interface - Updated `CodexExec` to accept and pass through additional directories via the `--config` flag for `sandbox_workspace_write.writable_roots` - Added comprehensive test coverage for the new functionality ## Test plan - Added test case that verifies `additionalDirectories` is correctly passed as repeated flags - Existing tests continue to pass --------- Co-authored-by: Claude <noreply@anthropic.com>	2025-11-13 13:47:10 -08:00
pakrym-oai	0792a7953d	Update default yield time (#6610 ) 10s for exec and 250ms for write_stdin	2025-11-13 10:24:41 -08:00
pakrym-oai	041d6ad902	Migrate prompt caching tests to test_codex (#6605 ) To hopefully fix the flakiness	2025-11-13 09:19:38 -08:00
pakrym-oai	e6995174c1	Revert "Overhaul shell detection and centralize command generation for unified exec" (#6606 ) Reverts openai/codex#6577	2025-11-13 08:43:00 -08:00
pakrym-oai	d28e912214	Overhaul shell detection and centralize command generation for unified exec (#6577 ) This fixes command display for unified exec. All `cd`s and `ls`es are now parsed. <img width="452" height="237" alt="image" src="https://github.com/user-attachments/assets/ce92d81f-f74c-485a-9b34-1eaa29290ec6" /> Deletes a ton of tests that were doing nothing from shell.rs. --------- Co-authored-by: Pavel Krymets <pavel@krymets.com>	2025-11-13 08:28:09 -08:00
Ahmed Ibrahim	ba74cee6f7	fix model picker wrapping (#6589 ) Previously the popup measured rows using the full content width while the renderer drew them with 2 columns of padding, so at certain widths the layout allocated too little vertical space and hid the third option. Now both desired_height and render call a shared helper that subtracts the padding before measuring, so the height we reserve always matches what we draw and the menu doesn't drop entries. https://github.com/user-attachments/assets/59058fd9-1e34-4325-b5fe-fc888dfcb6bc	2025-11-13 08:09:13 -08:00
jif-oai	2a417c47ac	feat: proxy context left after compaction (#6597 )	2025-11-13 16:54:03 +01:00
Dylan Hurd	8dcbd29edd	chore(core) Update prompt for gpt-5.1 (#6588 ) ## Summary Updates the prompt for GPT-5.1	2025-11-13 07:51:28 -08:00
pakrym-oai	34621166d5	Default to explicit medium reasoning for 5.1 (#6593 )	2025-11-13 07:58:42 +00:00
pakrym-oai	e3dd362c94	Reasoning level update (#6586 ) Automatically update reasoning levels when migrating between models	2025-11-13 06:24:36 +00:00
Ahmed Ibrahim	305fe73d83	copy for model migration nudge (#6585 )	2025-11-13 05:56:30 +00:00
Ahmed Ibrahim	e3aaee00c8	feat: show gpt mini (#6583 )	2025-11-13 05:21:00 +00:00
Ahmed Ibrahim	b1979b70a8	remove porcupine model slug (#6580 )	2025-11-13 04:43:31 +00:00
Eric Traut	73ed30d7e5	Avoid hang when tool's process spawns grandchild that shares stderr/stdout (#6575 ) We've received many reports of codex hanging when calling certain tools. [Here](https://github.com/openai/codex/issues/3204) is one example. This is likely a major cause. The problem occurs when `consume_truncated_output` waits for `stdout` and `stderr` to be closed once the child process terminates. This normally works fine, but it doesn't handle the case where the child has spawned grandchild processes that inherit `stdout` and `stderr`. The fix was originally written by @md-oai in [this PR](https://github.com/openai/codex/pull/1852), which has gone stale. I've copied the original fix (which looks sound to me) and added an integration test to prevent future regressions.	2025-11-12 20:08:12 -08:00
Ahmed Ibrahim	ad7eaa80f9	Change model picker to include gpt5.1 (#6569 ) - Change the presets - Change the tests that make sure we keep the list of tools updated - Filter out deprecated models	2025-11-12 19:44:53 -08:00
Ahmed Ibrahim	966d71c02a	Update subtitle of model picker as part of the nux (#6567 )	2025-11-12 18:30:43 -08:00
pakrym-oai	f97874093e	Set verbosity to low for 5.1 (#6568 ) And improve test coverage	2025-11-13 01:40:52 +00:00
Ahmed Ibrahim	e63ab0dd65	NUX for gpt5.1 (#6561 ) - Introducing a screen to inform users of model changes. - Config name is being passed to be able to reuse this component in the future for future models	2025-11-13 01:24:21 +00:00
Owen Lin	964220ac94	[app-server] feat: thread/resume supports history, path, and overrides (#6483 ) This updates `thread/resume` to be at parity with v1's `ResumeConversationParams`. Turns out history is useful for codex cloud and path is useful for the VSCode extension. And config overrides are always useful.	2025-11-12 22:02:43 +00:00
pakrym-oai	2f58e69997	Do not double encode request bodies in logging (#6558 )	2025-11-12 21:28:42 +00:00
pakrym-oai	ec69a4a810	Add gpt-5.1 model definitions (#6551 )	2025-11-12 12:44:36 -08:00
Eric Traut	ad09c138b9	Fixed status output to use auth information from AuthManager (#6529 ) This PR addresses https://github.com/openai/codex/issues/6360. The root problem is that the TUI was directly loading the `auth.json` file to access the auth information. It should instead be using the AuthManager, which records the current auth information. The `auth.json` file can be overwritten at any time by other instances of the CLI or extension, so its information can be out of sync with the current instance. The `/status` command should always report the auth information associated with the current instance. An alternative fix for this bug was submitted by @chojs23 in [this PR](https://github.com/openai/codex/pull/6495). That approach was only a partial fix.	2025-11-12 10:26:50 -08:00
jif-oai	e00eb50db3	feat: only wait for mutating tools for ghost commit (#6534 )	2025-11-12 18:16:32 +00:00
pakrym-oai	7d9ad3effd	Fix otel tests (#6541 ) Mount responses only once, remove unneeded retries and add a final assistant message to complete the turn.	2025-11-12 16:35:34 +00:00
Michael Bolin	c3a710ee14	chore: verify boolean values can be parsed as config overrides (#6516 ) This is important to ensure that this: ``` codex --enable unified_exec ``` and this: ``` codex --config features.unified_exec=true ``` are equivalent. Also that when it is passed programmatically: `807e2c27f0/codex-rs/app-server-protocol/src/protocol/v1.rs (L55)` then this should work for `config`: ```json {"features": {"shell_command_tool": true}} ``` though I believe also this: ```json {"features.shell_command_tool": true} ```	2025-11-12 08:19:16 -08:00
Michael Bolin	29364f3a9b	feat: shell_command tool (#6510 ) This adds support for a new variant of the shell tool behind a flag. To test, run `codex` with `--enable shell_command_tool`, which will register the tool with Codex under the name `shell_command` that accepts the following shape: ```python { command: str workdir: str \| None, timeout_ms: int \| None, with_escalated_permissions: bool \| None, justification: str \| None, } ``` This is comparable to the existing tool registered under `shell`/`container.exec`. The primary difference is that it accepts `command` as a `str` instead of a `str[]`. The `shell_command` tool executes by running `execvp(["bash", "-lc", command])`, though the exact arguments to `execvp(3)` depend on the user's default shell. The hypothesis is that this will simplify things for the model. For example, on Windows, instead of generating: ```json {"command": ["pwsh.exe", "-NoLogo", "-Command", "ls -Name"]} ``` The model could simply generate: ```json {"command": "ls -Name"} ``` As part of this change, I extracted some logic out of `user_shell.rs` as `Shell::derive_exec_args()` so that it can be reused in `codex-rs/core/src/tools/handlers/shell.rs`. Note the original code generated exec arg lists like: ```javascript ["bash", "-lc", command] ["zsh", "-lc", command] ["pwsh.exe", "-NoProfile", "-Command", command] ``` Using `-l` for Bash and Zsh, but then specifying `-NoProfile` for PowerShell seemed inconsistent to me, so I changed this in the new implementation while also adding a `use_login_shell: bool` option to make this explicit. If we decide to add a `login: bool` to `ShellCommandToolCallParams` like we have for unified exec: `807e2c27f0/codex-rs/core/src/tools/handlers/unified_exec.rs (L33-L34)` Then this should make it straightforward to support.	2025-11-12 08:18:57 -08:00
jif-oai	530db0ad73	feat: warning switch model on resume (#6507 ) <img width="1259" height="40" alt="Screenshot 2025-11-11 at 14 01 41" src="https://github.com/user-attachments/assets/48ead3d2-d89c-4d8a-a578-82d9663dbd88" />	2025-11-12 11:13:37 +00:00
Gabriel Peal	424bfecd0b	Re-add prettier log-level=warn to generate-ts (#6528 ) I added it in https://github.com/openai/codex/pull/6342 but it was removed in https://github.com/openai/codex/pull/5063/files#diff-e2aa6dad1e886b7765158a27aefd1be5de99baa71b44f6bc5ce3fe462b9ae5d3R135 as a result of a bad diamond merge	2025-11-11 21:30:01 -05:00
Lionel Cheng	eb1c651c00	Update full-auto description with on-request (#6523 ) This PR fixes #6522 by correcting the comment for `full-auto` in both `codex-rs/exec/src/cli.rs` and `codex-rs/tui/src/cli.rs` from `-a on-failure` to `-a on-request` to make it coherent with `codex-rs/tui/src/lib.rs:97-105`: ```rust pub async fn run_main( mut cli: Cli, codex_linux_sandbox_exe: Option<PathBuf>, ) -> std::io::Result<AppExitInfo> { let (sandbox_mode, approval_policy) = if cli.full_auto { ( Some(SandboxMode::WorkspaceWrite), Some(AskForApproval::OnRequest), ) ``` Running `just codex --help` or `just codex exec --help` should now yield the correct description of `full-auto` CLI argument. Signed-off-by: lionelchg <lionel.cheng@hotmail.fr>	2025-11-11 15:59:20 -08:00
Celia Chen	e357fc723d	[app-server] add item started/completed events for turn items (#6517 ) This one should be quite straightforward, as it's just a translation of TurnItem events we already emit to ThreadItem that app-server exposes to customers. To test, cp my change to owen/app_server_test_client and do the following: ``` cargo build -p codex-cli RUST_LOG=codex_app_server=info CODEX_BIN=target/debug/codex cargo run -p codex-app-server-test-client -- send-message-v2 "hello" ``` example event before (still kept there for backward compatibility): ``` { < "method": "codex/event/item_completed", < "params": { < "conversationId": "019a74cc-fad9-7ab3-83a3-f42827b7b074", < "id": "0", < "msg": { < "item": { < "Reasoning": { < "id": "rs_03d183492e07e20a016913a936eb8c81a1a7671a103fee8afc", < "raw_content": [], < "summary_text": [ < "Hey! What would you like to work on? I can explore the repo, run specific tests, or implement a change. Let's keep it short and straightforward. There's no need for a lengthy introduction or elaborate planning, just a friendly greeting and an open offer to help. I want to make sure the user feels welcomed and understood right from the start. It's all about keeping the tone friendly and concise!" < ] < } < }, < "thread_id": "019a74cc-fad9-7ab3-83a3-f42827b7b074", < "turn_id": "0", < "type": "item_completed" < } < } < } ``` after (v2): ``` < { < "method": "item/completed", < "params": { < "item": { < "id": "rs_03d183492e07e20a016913a936eb8c81a1a7671a103fee8afc", < "text": "Hey! What would you like to work on? I can explore the repo, run specific tests, or implement a change. Let's keep it short and straightforward. There's no need for a lengthy introduction or elaborate planning, just a friendly greeting and an open offer to help. I want to make sure the user feels welcomed and understood right from the start. It's all about keeping the tone friendly and concise!", < "type": "reasoning" < } < } < } ```	2025-11-11 22:43:24 +00:00
pakrym-oai	807e2c27f0	Add unified exec escalation handling and tests (#6492 ) Similar implementation to the shell tool	2025-11-11 08:19:35 -08:00
jif-oai	ad279eacdc	nit: logs to trace (#6503 )	2025-11-11 13:37:06 +00:00
jif-oai	052b052832	Enable ghost_commit feature by default (#6041 ) ## Summary - enable the ghost_commit feature flag by default ## Testing - just fmt ------ https://chatgpt.com/codex/tasks/task_i_6904ce2d0370832dbb3c2c09a90fb188	2025-11-11 09:20:46 +00:00
Celia Chen	6951872776	[hygiene][app-server] have a helper function for duplicate code in turn APIs (#6488 ) turn_start and turn_interrupt have some logic that can be shared. have a helper function for it.	2025-11-11 02:44:47 +00:00
pakrym-oai	bb7b0213a8	Colocate more of bash parsing (#6489 ) Move a few callsites that were detecting `bash -lc` into a shared helper.	2025-11-11 02:38:36 +00:00
pakrym-oai	6c36318bd8	Use codex-linux-sandbox in unified exec (#6480 ) Unified exec isn't working on Linux because we don't provide the correct arg0. The library we use for pty management doesn't allow setting arg0 separately from executable. Use the same aliasing strategy we use for `apply_patch` for `codex-linux-sandbox`. Use `#[ctor]` hack to dispatch codex-linux-sandbox calls. Addresses https://github.com/openai/codex/issues/6450	2025-11-10 17:17:09 -08:00
zhao-oai	930f81a17b	flip rate limit status bar (#6482 ) flipping rate limit status bar to match chat.com/codex/settings/usage <img width="848" height="420" alt="Screenshot 2025-11-10 at 4 53 41 PM" src="https://github.com/user-attachments/assets/e326db3f-4405-412d-9e62-337282ec9a35" />	2025-11-11 01:13:10 +00:00
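
The `shell_command` commit above (29364f3a9b) describes extracting `Shell::derive_exec_args()` and adding a `use_login_shell: bool` option so that login/profile behavior is explicit rather than differing per shell. The commit message does not show the resulting function, so the following is only a minimal sketch of the idea. `ShellKind`, the free-function signature, and the exact flag chosen for each case (in particular, treating "login" for PowerShell as "omit `-NoProfile`") are assumptions made for illustration, not the code in `codex-rs`.

```rust
/// Hypothetical stand-in for the shells named in the commit message.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ShellKind {
    Bash,
    Zsh,
    PowerShell,
}

/// Build the argv used to run `command` as a single string in the given shell.
///
/// Assumption: `use_login_shell` maps to `-l` for Bash/Zsh and to loading the
/// profile (no `-NoProfile`) for PowerShell. The real implementation may differ.
fn derive_exec_args(shell: ShellKind, command: &str, use_login_shell: bool) -> Vec<String> {
    let args: Vec<&str> = match (shell, use_login_shell) {
        (ShellKind::Bash, true) => vec!["bash", "-lc", command],
        (ShellKind::Bash, false) => vec!["bash", "-c", command],
        (ShellKind::Zsh, true) => vec!["zsh", "-lc", command],
        (ShellKind::Zsh, false) => vec!["zsh", "-c", command],
        (ShellKind::PowerShell, true) => vec!["pwsh.exe", "-Command", command],
        (ShellKind::PowerShell, false) => {
            vec!["pwsh.exe", "-NoProfile", "-Command", command]
        }
    };
    // Convert to owned strings so the result outlives `command`.
    args.into_iter().map(String::from).collect()
}
```

With a helper of this shape, a caller passes the command as one string (for example `derive_exec_args(ShellKind::PowerShell, "ls -Name", false)`) and the choice between a login and a non-login invocation is made in one place for every shell.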


1474 commits