core-agent-ide

Author	SHA1	Message	Date
pakrym-oai	018a2d2e50	Ignore unified_exec_respects_workdir_override (#6693 )	2025-11-14 15:00:31 -08:00
pakrym-oai	cfcc87a953	Order outputs before inputs (#6691 ) For better caching performance all output items should be rendered in the order they were produced before all new input items (for example, all function_call before all function_call_output).	2025-11-14 14:54:11 -08:00
Owen Lin	c3951e505d	feat: add app-server-test-client crate for internal use (#5391 ) For app-server development it's been helpful to be able to trigger some test flows end-to-end and print the JSON-RPC messages sent between client and server.	2025-11-14 12:39:58 -08:00
iceweasel-oai	abb7b79701	fix codex detection, add new security-focused smoketests. (#6682 ) Fix 'codex' detection to look for debug build, then release build, then installed. Adds more smoketests around security from @viyatb-oai	2025-11-14 12:08:59 -08:00
Ryan Lopopolo	936650001f	feat(ts-sdk): allow overriding CLI environment (#6648 ) ## Summary - add an `env` option for the TypeScript Codex client and plumb it into `CodexExec` so the CLI can run without inheriting `process.env` - extend the test spy to capture spawn environments, add coverage for the new option, and document how to use it ## Testing - `pnpm test` (fails: corepack cannot download pnpm because outbound network access is blocked in the sandbox) ------ [Codex Task](https://chatgpt.com/codex/tasks/task_i_6916b2d7c7548322a72d61d91a2dac85)	2025-11-14 19:44:19 +00:00
iceweasel-oai	37fba28ac3	templates and build step for validating/submitting winget package (#6485 )	2025-11-14 11:06:44 -08:00
pakrym-oai	4ba562d2dd	Add test timeout (#6612 ) Add an overall test timeout of 30s.	2025-11-14 09:30:37 -08:00
Jeremy Rose	799364de87	Enable TUI notifications by default (#6633 ) ## Summary - default the `tui.notifications` setting to enabled so desktop notifications work out of the box - update configuration tests and documentation to reflect the new default ## Testing - `cargo test -p codex-core` (fails: `exec::tests::kill_child_process_group_kills_grandchildren_on_timeout` is flaky in this sandbox because the spawned grandchild process stays alive) - `cargo test -p codex-core exec::tests::kill_child_process_group_kills_grandchildren_on_timeout` (fails: same sandbox limitation as above) ------ [Codex Task](https://chatgpt.com/codex/tasks/task_i_69166f811144832c9e8aaf8ee2642373)	2025-11-14 09:28:09 -08:00
jif-oai	4719cba19a	tmp: drop sccache for windows (#6673 )	2025-11-14 17:29:05 +01:00
Celia Chen	526777c9b4	[App server] add mcp tool call item started/completed events (#6642 ) this PR does two things: 1. refactor `apply_bespoke_event_handling` into a separate file as it's getting kind of long; 2. add mcp tool call `item/started` and `item/completed` events. To roll out app server events asap we didn't properly migrate mcp core events to use TurnItem for mcp tool calls - this will be a follow-up PR. real events generated in log: ``` { "method": "codex/event/mcp_tool_call_end", "params": { "conversationId": "019a8021-26af-7c20-83db-21ca81e44d68", "id": "0", "msg": { "call_id": "call_7EjRQkD9HnfyMWf7tGrT9FKA", "duration": { "nanos": 92708, "secs": 0 }, "invocation": { "arguments": { "server": "" }, "server": "codex", "tool": "list_mcp_resources" }, "result": { "Ok": { "content": [ { "text": "{\"resources\":[]}", "type": "text" } ], "isError": false } }, "type": "mcp_tool_call_end" } } } { "method": "item/completed", "params": { "item": { "arguments": { "server": "" }, "error": null, "id": "call_7EjRQkD9HnfyMWf7tGrT9FKA", "result": { "content": [ { "text": "{\"resources\":[]}", "type": "text" } ], "structuredContent": null }, "server": "codex", "status": "completed", "tool": "list_mcp_resources", "type": "mcpToolCall" } } } ```	2025-11-14 08:08:43 -08:00
jif-oai	f17b392470	feat: cache tokenizer (#6609 )	2025-11-14 17:05:00 +01:00
jif-oai	63c8c01f40	feat: better UI for unified_exec (#6515 ) <img width="376" height="132" alt="Screenshot 2025-11-12 at 17 36 22" src="https://github.com/user-attachments/assets/ce693f0d-5ca0-462e-b170-c20811dcc8d5" />	2025-11-14 16:31:12 +01:00
jif-oai	4788fb179a	feat: add resume logs when doing /new (#6660 ) <img width="769" height="803" alt="Screenshot 2025-11-14 at 10 25 49" src="https://github.com/user-attachments/assets/12fbc21e-cab9-4d0a-a484-1aeb60219f96" />	2025-11-14 11:42:16 +01:00
pakrym-oai	6c384eb9c6	tests: replace mount_sse_once_match with mount_sse_once for SSE mocking (#6640 )	2025-11-13 18:04:05 -08:00
Ahmed Ibrahim	2a6e9b20df	Promote shared helpers for suite tests (#6460 ) ## Summary - add `TestCodex::submit_turn_with_policies` and extend the response helpers with reusable tool-call utilities - update the grep_files, read_file, list_dir, shell_serialization, and tools suites to rely on the shared helpers instead of local copies - make the list_dir helper return `anyhow::Result` so clippy no longer warns about `expect` ## Testing - `just fix -p codex-core` - `cargo test -p codex-core --test all suite::grep_files::grep_files_tool_collects_matches` - `cargo test -p codex-core suite::grep_files::grep_files_tool_collects_matches -- --ignored` (filter requests ignored tests so nothing runs, but the build stays clean) ------ [Codex Task](https://chatgpt.com/codex/tasks/task_i_69112d53abac83219813cab4d7cb6446)	2025-11-13 17:12:10 -08:00
Ahmed Ibrahim	f3c6b1334b	Use shared network gating helper in chat completion tests (#6461 ) ## Summary - replace the bespoke network check in the chat completion payload and SSE tests with the existing `skip_if_no_network!` helper so they follow the same gating convention as the rest of the suite ## Testing - `just fmt` ------ [Codex Task](https://chatgpt.com/codex/tasks/task_i_69112d4cb9f08321ba773e8ccf39778e)	2025-11-13 17:11:43 -08:00
Ahmed Ibrahim	9890ceb939	Avoid double truncation (#6631 ) 1. Avoid double truncation by allowing 10% above the tool default constant 2. Add tests that fail when const = 1	2025-11-13 16:59:31 -08:00
pakrym-oai	7b027e7536	Revert "Revert "Overhaul shell detection and centralize command generation for unified exec"" (#6607 ) Reverts openai/codex#6606	2025-11-13 16:45:17 -08:00
Owen Lin	db2aa57d73	[app-server] small fixes for JSON schema export and one-of types (#6614 ) A partner is consuming our generated JSON schema bundle for app-server and identified a few issues: - not all polymorphic / one-of types have a type discriminator - `"$ref": "#/definitions/v2/SandboxPolicy"` is missing - "Option<>" is an invalid schema name, and also unnecessary This PR: - adds the type discriminator to the various types that are missing it except for `SessionSource` and `SubAgentSource` because they are serialized to disk (adding this would break backwards compat for resume), and they should not be necessary to consume for an integration with app-server. - removes the special handling in `export.rs` of various types like SandboxPolicy, which turned out to be unnecessary and incorrect - filters out `Option<>` which was auto-generated for request params that don't need a body For context, we currently pull in far more types than we need through the `EventMsg` god object which we are not planning to expose in API v2 (this is how I suspect `SessionSource` and `SubAgentSource` are being pulled in). But until we have all the necessary v2 notifications in place that will allow us to remove `EventMsg`, we will keep exporting it for now.	2025-11-13 16:25:17 -08:00
Celia Chen	b8ec97c0ef	[App-server] add new v2 events:`item/reasoning/delta`, `item/agentMessage/delta` & `item/reasoning/summaryPartAdded` (#6559 ) core event to app server event mapping: 1. `codex/event/reasoning_content_delta` -> `item/reasoning/summaryTextDelta`. 2. `codex/event/reasoning_raw_content_delta` -> `item/reasoning/textDelta`. 3. `codex/event/agent_message_content_delta` -> `item/agentMessage/delta`. 4. `codex/event/agent_reasoning_section_break` -> `item/reasoning/summaryPartAdded`. Also added a change in core to pass down content index, summary index and item id from events. Tested with `git checkout owen/app_server_test_client && cargo run -p codex-app-server-test-client -- send-message-v2 "hello"` and verified that new events are emitted correctly.	2025-11-14 00:25:01 +00:00
Dylan Hurd	2c1b693da4	chore(core) Consolidate apply_patch tests (#6545 ) ## Summary Consolidates our apply_patch tests into one suite, and ensures each test case tests the various ways the harness supports apply_patch: 1. Freeform custom tool call 2. JSON function tool 3. Simple shell call 4. Heredoc shell call There are a few test cases that are specific to a particular variant, I've left those alone. ## Testing - [x] This adds a significant number of tests	2025-11-13 15:52:39 -08:00
pakrym-oai	547be54ee8	Only list failed tests (#6619 ) Makes output easier to parse	2025-11-13 13:50:33 -08:00
Dan Hernandez	b4a53aef47	feat: Add support for --add-dir to exec and TypeScript SDK (#6565 ) ## Summary Adds support for specifying additional directories in the TypeScript SDK through a new `additionalDirectories` option in `ThreadOptions`. ## Changes - Added `additionalDirectories` parameter to `ThreadOptions` interface - Updated `CodexExec` to accept and pass through additional directories via the `--config` flag for `sandbox_workspace_write.writable_roots` - Added comprehensive test coverage for the new functionality ## Test plan - Added test case that verifies `additionalDirectories` is correctly passed as repeated flags - Existing tests continue to pass --------- Co-authored-by: Claude <noreply@anthropic.com>	2025-11-13 13:47:10 -08:00
Dan Hernandez	439bc5dbbe	Add AbortSignal support to TypeScript SDK (#6378 ) ## Summary Adds AbortSignal support to the TypeScript SDK for canceling thread execution using AbortController. ## Changes - Add `signal?: AbortSignal` property to `TurnOptions` type - Pass signal through Thread class methods to exec layer - Add signal parameter to `CodexExecArgs` - Leverage Node.js native `spawn()` signal support for automatic cancellation - Add comprehensive test coverage (6 tests covering all abort scenarios) ## Implementation The implementation uses Node.js's built-in AbortSignal support in `spawn()` (available since Node v15, SDK requires >=18), which automatically handles: - Checking if already aborted before starting - Killing the child process when abort is triggered - Emitting appropriate error events - All cleanup operations This is a one-line change to the core implementation (`signal: args.signal` passed to spawn), making it simple, reliable, and maintainable. ## Usage Example ```typescript import { Codex } from '@openai/codex-sdk'; const codex = new Codex({ apiKey: 'your-api-key' }); const thread = codex.startThread(); // Create AbortController const controller = new AbortController(); // Run with abort signal const resultPromise = thread.run("Your prompt here", { signal: controller.signal }); // Cancel anytime controller.abort('User requested cancellation'); ``` ## Testing All tests pass (23 total across SDK): - ✅ Aborts when signal is already aborted (both run and runStreamed) - ✅ Aborts during execution/iteration - ✅ Completes normally when not aborted - ✅ Backward compatible (signal is optional) Tests verified to fail correctly when signal support is removed (no false positives). --------- Co-authored-by: Claude <noreply@anthropic.com> Co-authored-by: pakrym-oai <pakrym@openai.com>	2025-11-13 13:35:42 -08:00
pakrym-oai	c95bd345ea	Enable close-stale-contributor-prs.yml workflow (#6615 ) Tested on https://github.com/openai/codex/pull/3036	2025-11-13 11:50:54 -08:00
pakrym-oai	0792a7953d	Update default yield time (#6610 ) 10s for exec and 250ms for write_stdin	2025-11-13 10:24:41 -08:00
pakrym-oai	6cda3de3a4	Close stale PRs workflow (#6594 ) Closes stale PRs from OpenAI employees.	2025-11-13 10:01:51 -08:00
pakrym-oai	041d6ad902	Migrate prompt caching tests to test_codex (#6605 ) To hopefully fix the flakiness	2025-11-13 09:19:38 -08:00
pakrym-oai	e6995174c1	Revert "Overhaul shell detection and centralize command generation for unified exec" (#6606 ) Reverts openai/codex#6577	2025-11-13 08:43:00 -08:00
pakrym-oai	d28e912214	Overhaul shell detection and centralize command generation for unified exec (#6577 ) This fixes command display for unified exec. All `cd`s and `ls`es are now parsed. <img width="452" height="237" alt="image" src="https://github.com/user-attachments/assets/ce92d81f-f74c-485a-9b34-1eaa29290ec6" /> Deletes a ton of tests that were doing nothing from shell.rs. --------- Co-authored-by: Pavel Krymets <pavel@krymets.com>	2025-11-13 08:28:09 -08:00
Ahmed Ibrahim	ba74cee6f7	fix model picker wrapping (#6589 ) Previously the popup measured rows using the full content width while the renderer drew them with 2 columns of padding, so at certain widths the layout allocated too little vertical space and hid the third option. Now both desired_height and render call a shared helper that subtracts the padding before measuring, so the height we reserve always matches what we draw and the menu doesn't drop entries. https://github.com/user-attachments/assets/59058fd9-1e34-4325-b5fe-fc888dfcb6bc	2025-11-13 08:09:13 -08:00
jif-oai	2a417c47ac	feat: proxy context left after compaction (#6597 )	2025-11-13 16:54:03 +01:00
Dylan Hurd	8dcbd29edd	chore(core) Update prompt for gpt-5.1 (#6588 ) ## Summary Updates the prompt for GPT-5.1	2025-11-13 07:51:28 -08:00
pakrym-oai	34621166d5	Default to explicit medium reasoning for 5.1 (#6593 )	2025-11-13 07:58:42 +00:00
pakrym-oai	e3dd362c94	Reasoning level update (#6586 ) Automatically update reasoning levels when migrating between models	2025-11-13 06:24:36 +00:00
Ahmed Ibrahim	305fe73d83	copy for model migration nudge (#6585 )	2025-11-13 05:56:30 +00:00
Ahmed Ibrahim	e3aaee00c8	feat: show gpt mini (#6583 )	2025-11-13 05:21:00 +00:00
Ahmed Ibrahim	b1979b70a8	remove porcupine model slug (#6580 )	2025-11-13 04:43:31 +00:00
Eric Traut	73ed30d7e5	Avoid hang when tool's process spawns grandchild that shares stderr/stdout (#6575 ) We've received many reports of codex hanging when calling certain tools. [Here](https://github.com/openai/codex/issues/3204) is one example. This is likely a major cause. The problem occurs when `consume_truncated_output` waits for `stdout` and `stderr` to be closed once the child process terminates. This normally works fine, but it doesn't handle the case where the child has spawned grandchild processes that inherit `stdout` and `stderr`. The fix was originally written by @md-oai in [this PR](https://github.com/openai/codex/pull/1852), which has gone stale. I've copied the original fix (which looks sound to me) and added an integration test to prevent future regressions.	2025-11-12 20:08:12 -08:00
Ahmed Ibrahim	ad7eaa80f9	Change model picker to include gpt5.1 (#6569 ) - Change the presets - Change the tests that make sure we keep the list of tools updated - Filter out deprecated models	2025-11-12 19:44:53 -08:00
Ahmed Ibrahim	966d71c02a	Update subtitle of model picker as part of the nux (#6567 )	2025-11-12 18:30:43 -08:00
pakrym-oai	f97874093e	Set verbosity to low for 5.1 (#6568 ) And improve test coverage	2025-11-13 01:40:52 +00:00
Ahmed Ibrahim	e63ab0dd65	NUX for gpt5.1 (#6561 ) - Introducing a screen to inform users of model changes. - Config name is being passed so that this component can be reused for future models	2025-11-13 01:24:21 +00:00
Owen Lin	964220ac94	[app-server] feat: thread/resume supports history, path, and overrides (#6483 ) This updates `thread/resume` to be at parity with v1's `ResumeConversationParams`. Turns out history is useful for codex cloud and path is useful for the VSCode extension. And config overrides are always useful.	2025-11-12 22:02:43 +00:00
pakrym-oai	2f58e69997	Do not double encode request bodies in logging (#6558 )	2025-11-12 21:28:42 +00:00
pakrym-oai	ec69a4a810	Add gpt-5.1 model definitions (#6551 )	2025-11-12 12:44:36 -08:00
Eric Traut	ad09c138b9	Fixed status output to use auth information from AuthManager (#6529 ) This PR addresses https://github.com/openai/codex/issues/6360. The root problem is that the TUI was directly loading the `auth.json` file to access the auth information. It should instead be using the AuthManager, which records the current auth information. The `auth.json` file can be overwritten at any time by other instances of the CLI or extension, so its information can be out of sync with the current instance. The `/status` command should always report the auth information associated with the current instance. An alternative fix for this bug was submitted by @chojs23 in [this PR](https://github.com/openai/codex/pull/6495). That approach was only a partial fix.	2025-11-12 10:26:50 -08:00
jif-oai	e00eb50db3	feat: only wait for mutating tools for ghost commit (#6534 )	2025-11-12 18:16:32 +00:00
pakrym-oai	7d9ad3effd	Fix otel tests (#6541 ) Mount responses only once, remove unneeded retries and add a final assistant message to complete the turn.	2025-11-12 16:35:34 +00:00
Michael Bolin	c3a710ee14	chore: verify boolean values can be parsed as config overrides (#6516 ) This is important to ensure that this: ``` codex --enable unified_exec ``` and this: ``` codex --config features.unified_exec=true ``` are equivalent. Also that when it is passed programmatically: `807e2c27f0/codex-rs/app-server-protocol/src/protocol/v1.rs (L55)` then this should work for `config`: ```json {"features": {"shell_command_tool": true}} ``` though I believe also this: ```json {"features.shell_command_tool": true} ```	2025-11-12 08:19:16 -08:00
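
Note on c3a710ee14: the commit message expects a dotted override such as `features.unified_exec=true` to land in the same place as the nested form `{"features": {"unified_exec": true}}`. The sketch below illustrates that equivalence only. It is not Codex's parser (the real handling lives in the Rust crates), and the names `parseScalar`, `setDotted`, `applyOverride`, and `expandDottedKeys` are hypothetical helpers introduced here for illustration.

```typescript
// Illustrative only: a toy model of dotted-key config overrides.
// None of these helpers exist in Codex; they are assumptions for this sketch.
type ConfigValue = boolean | number | string | ConfigObject;
interface ConfigObject {
  [key: string]: ConfigValue;
}

// Interpret the right-hand side of `key=value`: booleans first, then numbers,
// otherwise keep the raw string.
function parseScalar(raw: string): ConfigValue {
  if (raw === "true") return true;
  if (raw === "false") return false;
  const n = Number(raw);
  if (raw.trim() !== "" && !Number.isNaN(n)) return n;
  return raw;
}

// Walk (and create) intermediate tables for `a.b.c`, then assign the leaf.
function setDotted(target: ConfigObject, dottedKey: string, value: ConfigValue): void {
  const parts = dottedKey.split(".");
  const leaf = parts[parts.length - 1] as string;
  let node = target;
  for (const part of parts.slice(0, -1)) {
    const next = node[part];
    if (typeof next !== "object" || next === null) {
      const created: ConfigObject = {};
      node[part] = created;
      node = created;
    } else {
      node = next;
    }
  }
  node[leaf] = value;
}

// Apply one `key=value` override string to a config object.
function applyOverride(target: ConfigObject, override: string): ConfigObject {
  const eq = override.indexOf("=");
  if (eq < 0) throw new Error(`expected key=value, got: ${override}`);
  setDotted(target, override.slice(0, eq), parseScalar(override.slice(eq + 1)));
  return target;
}

// Expand a flat object with dotted keys into its nested form.
function expandDottedKeys(flat: ConfigObject): ConfigObject {
  const out: ConfigObject = {};
  for (const [key, value] of Object.entries(flat)) setDotted(out, key, value);
  return out;
}
```

Under these assumptions, `applyOverride({}, "features.unified_exec=true")` and `expandDottedKeys({ "features.unified_exec": true })` both produce a nested `features` table holding a real boolean rather than the string `"true"`, which is the property the commit sets out to verify.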

3718 commits