For better caching performance all output items should be rendered in
the order they were produced before all new input items (for example,
all function_call before all function_call_output).
For app-server development it's been helpful to be able to trigger some
test flows end-to-end and print the JSON-RPC messages sent between
client and server.
## Summary
- add an `env` option for the TypeScript Codex client and plumb it into
`CodexExec` so the CLI can run without inheriting `process.env`
- extend the test spy to capture spawn environments, add coverage for
the new option, and document how to use it
## Testing
- `pnpm test` *(fails: corepack cannot download pnpm because outbound
network access is blocked in the sandbox)*
------
[Codex
Task](https://chatgpt.com/codex/tasks/task_i_6916b2d7c7548322a72d61d91a2dac85)
## Summary
- default the `tui.notifications` setting to enabled so desktop
notifications work out of the box
- update configuration tests and documentation to reflect the new
default
## Testing
- `cargo test -p codex-core` *(fails:
`exec::tests::kill_child_process_group_kills_grandchildren_on_timeout`
is flaky in this sandbox because the spawned grandchild process stays
alive)*
- `cargo test -p codex-core
exec::tests::kill_child_process_group_kills_grandchildren_on_timeout`
*(fails: same sandbox limitation as above)*
------
[Codex
Task](https://chatgpt.com/codex/tasks/task_i_69166f811144832c9e8aaf8ee2642373)
## Summary
- add `TestCodex::submit_turn_with_policies` and extend the response
helpers with reusable tool-call utilities
- update the grep_files, read_file, list_dir, shell_serialization, and
tools suites to rely on the shared helpers instead of local copies
- make the list_dir helper return `anyhow::Result` so clippy no longer
warns about `expect`
## Testing
- `just fix -p codex-core`
- `cargo test -p codex-core --test all
suite::grep_files::grep_files_tool_collects_matches`
- `cargo test -p codex-core
suite::grep_files::grep_files_tool_collects_matches -- --ignored`
(filter requests ignored tests so nothing runs, but the build stays
clean)
------
[Codex
Task](https://chatgpt.com/codex/tasks/task_i_69112d53abac83219813cab4d7cb6446)
## Summary
- replace the bespoke network check in the chat completion payload and
SSE tests with the existing `skip_if_no_network!` helper so they follow
the same gating convention as the rest of the suite
## Testing
- `just fmt`
------
[Codex
Task](https://chatgpt.com/codex/tasks/task_i_69112d4cb9f08321ba773e8ccf39778e)
A partner is consuming our generated JSON schema bundle for app-server
and identified a few issues:
- not all polymorphic / one-of types have a type descriminator
- `"$ref": "#/definitions/v2/SandboxPolicy"` is missing
- "Option<>" is an invalid schema name, and also unnecessary
This PR:
- adds the type descriminator to the various types that are missing it
except for `SessionSource` and `SubAgentSource` because they are
serialized to disk (adding this would break backwards compat for
resume), and they should not be necessary to consume for an integration
with app-server.
- removes the special handling in `export.rs` of various types like
SandboxPolicy, which turned out to be unnecessary and incorrect
- filters out `Option<>` which was auto-generated for request params
that don't need a body
For context, we currently pull in wayyy more types than we need through
the `EventMsg` god object which we are **not** planning to expose in API
v2 (this is how I suspect `SessionSource` and `SubAgentSource` are being
pulled in). But until we have all the necessary v2 notifications in
place that will allow us to remove `EventMsg`, we will keep exporting it
for now.
core event to app server event mapping:
1. `codex/event/reasoning_content_delta` ->
`item/reasoning/summaryTextDelta`.
2. `codex/event/reasoning_raw_content_delta` ->
`item/reasoning/textDelta`
3. `codex/event/agent_message_content_delta` →
`item/agentMessage/delta`.
4. `codex/event/agent_reasoning_section_break` ->
`item/reasoning/summaryPartAdded`.
Also added a change in core to pass down content index, summary index
and item id from events.
Tested with the `git checkout owen/app_server_test_client && cargo run
-p codex-app-server-test-client -- send-message-v2 "hello"` and verified
that new events are emitted correctly.
## Summary
Consolidates our apply_patch tests into one suite, and ensures each test
case tests the various ways the harness supports apply_patch:
1. Freeform custom tool call
2. JSON function tool
3. Simple shell call
4. Heredoc shell call
There are a few test cases that are specific to a particular variant,
I've left those alone.
## Testing
- [x] This adds a significant number of tests
## Summary
Adds support for specifying additional directories in the TypeScript SDK
through a new `additionalDirectories` option in `ThreadOptions`.
## Changes
- Added `additionalDirectories` parameter to `ThreadOptions` interface
- Updated `CodexExec` to accept and pass through additional directories
via the `--config` flag for `sandbox_workspace_write.writable_roots`
- Added comprehensive test coverage for the new functionality
## Test plan
- Added test case that verifies `additionalDirectories` is correctly
passed as repeated flags
- Existing tests continue to pass
---------
Co-authored-by: Claude <noreply@anthropic.com>
## Summary
Adds AbortSignal support to the TypeScript SDK for canceling thread
execution using AbortController.
## Changes
- Add `signal?: AbortSignal` property to `TurnOptions` type
- Pass signal through Thread class methods to exec layer
- Add signal parameter to `CodexExecArgs`
- Leverage Node.js native `spawn()` signal support for automatic
cancellation
- Add comprehensive test coverage (6 tests covering all abort scenarios)
## Implementation
The implementation uses Node.js's built-in AbortSignal support in
`spawn()` (available since Node v15, SDK requires >=18), which
automatically handles:
- Checking if already aborted before starting
- Killing the child process when abort is triggered
- Emitting appropriate error events
- All cleanup operations
This is a one-line change to the core implementation (`signal:
args.signal` passed to spawn), making it simple, reliable, and
maintainable.
## Usage Example
```typescript
import { Codex } from '@openai/codex-sdk';
const codex = new Codex({ apiKey: 'your-api-key' });
const thread = codex.startThread();
// Create AbortController
const controller = new AbortController();
// Run with abort signal
const resultPromise = thread.run("Your prompt here", {
signal: controller.signal
});
// Cancel anytime
controller.abort('User requested cancellation');
```
## Testing
All tests pass (23 total across SDK):
- ✅ Aborts when signal is already aborted (both run and runStreamed)
- ✅ Aborts during execution/iteration
- ✅ Completes normally when not aborted
- ✅ Backward compatible (signal is optional)
Tests verified to fail correctly when signal support is removed (no
false positives).
---------
Co-authored-by: Claude <noreply@anthropic.com>
Co-authored-by: pakrym-oai <pakrym@openai.com>
This fixes command display for unified exec. All `cd`s and `ls`es are
now parsed.
<img width="452" height="237" alt="image"
src="https://github.com/user-attachments/assets/ce92d81f-f74c-485a-9b34-1eaa29290ec6"
/>
Deletes a ton of tests that were doing nothing from shell.rs.
---------
Co-authored-by: Pavel Krymets <pavel@krymets.com>
Previously the popup measured rows using the full content width while
the renderer drew them with 2 columns of padding, so at certain widths
the layout allocated too little vertical space and hid the third option.
Now both desired_height and render call a shared helper that subtracts
the padding before measuring, so the height we reserve always matches
what we draw and the menu doesn't drops entries.
https://github.com/user-attachments/assets/59058fd9-1e34-4325-b5fe-fc888dfcb6bc
We've received many reports of codex hanging when calling certain tools.
[Here](https://github.com/openai/codex/issues/3204) is one example. This
is likely a major cause. The problem occurs when
`consume_truncated_output` waits for `stdout` and `stderr` to be closed
once the child process terminates. This normally works fine, but it
doesn't handle the case where the child has spawned grandchild processes
that inherits `stdout` and `stderr`.
The fix was originally written by @md-oai in [this
PR](https://github.com/openai/codex/pull/1852), which has gone stale.
I've copied the original fix (which looks sound to me) and added an
integration test to prevent future regressions.
- Introducing a screen to inform users of model changes.
- Config name is being passed to be able to reuse this component in the
future for future models
This updates `thread/resume` to be at parity with v1's
`ResumeConversationParams`. Turns out history is useful for codex cloud
and path is useful for the VSCode extension. And config overrides are
always useful.
This PR addresses https://github.com/openai/codex/issues/6360. The root
problem is that the TUI was directly loading the `auth.json` file to
access the auth information. It should instead be using the AuthManager,
which records the current auth information. The `auth.json` file can be
overwritten at any time by other instances of the CLI or extension, so
its information can be out of sync with the current instance. The
`/status` command should always report the auth information associated
with the current instance.
An alternative fix for this bug was submitted by @chojs23 in [this
PR](https://github.com/openai/codex/pull/6495). That approach was only a
partial fix.
This is important to ensure that this:
```
codex --enable unified_exec
```
and this:
```
codex --config features.unified_exec=true
```
are equivalent. Also that when it is passed programmatically:
807e2c27f0/codex-rs/app-server-protocol/src/protocol/v1.rs (L55)
then this should work for `config`:
```json
{"features": {"shell_command_tool": true}}
```
though I believe also this:
```json
{"features.shell_command_tool": true}
```