- The total token used returned from the api doesn't account for the
reasoning items before the assistant message
- Account for those for auto compaction
- Add the encrypted reasoning effort in the common tests utils
- Add a test to make sure it works as expected
This introduces a new feature to Codex when it operates as an MCP
_client_ where if an MCP _server_ replies that it has an entry named
`"codex/sandbox-state"` in its _server capabilities_, then Codex will
send it an MCP notification with the following structure:
```json
{
"method": "codex/sandbox-state/update",
"params": {
"sandboxPolicy": {
"type": "workspace-write",
"network-access": false,
"exclude-tmpdir-env-var": false
"exclude-slash-tmp": false
},
"codexLinuxSandboxExe": null,
"sandboxCwd": "/Users/mbolin/code/codex2"
}
}
```
or with whatever values are appropriate for the initial `sandboxPolicy`.
**NOTE:** Codex _should_ continue to send the MCP server notifications
of the same format if these things change over the lifetime of the
thread, but that isn't wired up yet.
The result is that `shell-tool-mcp` can consume these values so that
when it calls `codex_core::exec::process_exec_tool_call()` in
`codex-rs/exec-server/src/posix/escalate_server.rs`, it is now sure to
call it with the correct values (whereas previously we relied on
hardcoded values).
While I would argue this is a supported use case within the MCP
protocol, the `rmcp` crate that we are using today does not support
custom notifications. As such, I had to patch it and I submitted it for
review, so hopefully it will be accepted in some form:
https://github.com/modelcontextprotocol/rust-sdk/pull/556
To test out this change from end-to-end:
- I ran `cargo build` in `~/code/codex2/codex-rs/exec-server`
- I built the fork of Bash in `~/code/bash/bash`
- I added the following to my `~/.codex/config.toml`:
```toml
# Use with `codex --disable shell_tool`.
[mcp_servers.execshell]
args = ["--bash", "/Users/mbolin/code/bash/bash"]
command = "/Users/mbolin/code/codex2/codex-rs/target/debug/codex-exec-mcp-server"
```
- From `~/code/codex2/codex-rs`, I ran `just codex --disable shell_tool`
- When the TUI started up, I verified that the sandbox mode is
`workspace-write`
- I ran `/mcp` to verify that the shell tool from the MCP is there:
<img width="1387" height="1400" alt="image"
src="https://github.com/user-attachments/assets/1a8addcc-5005-4e16-b59f-95cfd06fd4ab"
/>
- Then I asked it:
> what is the output of `gh issue list`
because this should be auto-approved with our existing dummy policy:
af63e6eccc/codex-rs/exec-server/src/posix.rs (L157-L164)
And it worked:
<img width="1387" height="1400" alt="image"
src="https://github.com/user-attachments/assets/7568d2f7-80da-4d68-86d0-c265a6f5e6c1"
/>
`process_exec_tool_call()` was taking `SandboxType` as a param, but in
practice, the only place it was constructed was in
`codex_message_processor.rs` where it was derived from the other
`sandbox_policy` param, so this PR inlines the logic that decides the
`SandboxType` into `process_exec_tool_call()`.
---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/7122).
* #7112
* __->__ #7122
Piping to codex fails to do anything useful and locks up the process.
We currently check for stdout, but not stdin
```
❯ echo foo|just c
cargo run --bin codex -- "$@"
Finished `dev` profile [unoptimized + debuginfo] target(s) in 0.21s
Running `target/debug/codex`
Error: stdin is not a terminal
error: Recipe `codex` failed on line 10 with exit code 1
```
The unified exec tool has a `login` option that defaults to `true`:
3bdcbc7292/codex-rs/core/src/tools/handlers/unified_exec.rs (L35-L36)
This updates the `ExecParams` for `shell-tool-mcp` to support the same
parameter. Note it is declared as `Option<bool>` to ensure it is marked
optional in the generated JSON schema.
## Summary
When incrementing the minor version, we should reset patch to 0, rather
than keeping it.
## Testing
- [x] tested locally with dry_run and `get_latest_release_version`
mocked out
---------
Co-authored-by: Michael Bolin <mbolin@openai.com>
https://github.com/openai/codex/pull/7005 introduced a new part of the
release process that added multiple files named `bash` in the `dist/`
folder used as the basis of the GitHub Release. I believe that all file
names in a GitHub Release have to be unique, which is why the recent
release build failed:
https://github.com/openai/codex/actions/runs/19577669780/job/56070183504
Based on the output of the **List** step, I believe these are the
appropriate artifacts to delete as a quick fix.
Add a `Declined` status for when we request an approval from the user
and the user declines. This allows us to distinguish from commands that
actually ran, but failed.
This behaves similarly to apply_patch / FileChange, which does the same
thing.
This adds a GitHub workflow for building a new npm module we are
experimenting with that contains an MCP server for running Bash
commands. The new workflow, `shell-tool-mcp`, is a dependency of the
general `release` workflow so that we continue to use one version number
for all artifacts across the project in one GitHub release.
`.github/workflows/shell-tool-mcp.yml` is the primary workflow
introduced by this PR, which does the following:
- builds the `codex-exec-mcp-server` and `codex-execve-wrapper`
executables for both arm64 and x64 versions of Mac and Linux (preferring
the MUSL version for Linux)
- builds Bash (dynamically linked) for a [comically] large number of
platforms (both x64 and arm64 for most) with a small patch specified by
`shell-tool-mcp/patches/bash-exec-wrapper.patch`:
- `debian-11`
- `debian-12`
- `ubuntu-20.04`
- `ubuntu-22.04`
- `ubuntu-24.04`
- `centos-9`
- `macos-13` (x64 only)
- `macos-14` (arm64 only)
- `macos-15` (arm64 only)
- builds the TypeScript for the [new] Node module declared in the
`shell-tool-mcp/` folder, which creates `bin/mcp-server.js`
- adds all of the native binaries to `shell-tool-mcp/vendor/` folder;
`bin/mcp-server.js` does a runtime check to determine which ones to
execute
- uses `npm pack` to create the `.tgz` for the module
- if `publish: true` is set, invokes the `npm publish` call with the
`.tgz`
The justification for building Bash for so many different operating
systems is because, since it is dynamically linked, we want to increase
our confidence that the version we build is compatible with the glibc
whatever OS we end up running on. (Note this is less of a concern with
`codex-exec-mcp-server` and `codex-execve-wrapper` on Linux, as they are
statically linked.)
This PR also introduces the code for the npm module in `shell-tool-mcp/`
(the proposed module name is `@openai/codex-shell-tool-mcp`). Initially,
I intended the module to be a single file of vanilla JavaScript (like
[`codex-cli/bin/codex.js`](ab5972d447/codex-cli/bin/codex.js)),
but some of the logic seemed a bit tricky, so I decided to port it to
TypeScript and add unit tests.
`shell-tool-mcp/src/index.ts` defines the `main()` function for the
module, which performs runtime checks to determine the clang triple to
find the path to the Rust executables within the `vendor/` folder
(`resolveTargetTriple()`). It uses a combination of `readOsRelease()`
and `resolveBashPath()` to determine the correct Bash executable to run
in the environment. Ultimately, it spawns a command like the following:
```
codex-exec-mcp-server \
--execve codex-execve-wrapper \
--bash custom-bash "$@"
```
Note `.github/workflows/shell-tool-mcp-ci.yml` defines a fairly standard
CI job for the module (`format`/`build`/`test`).
To test this PR, I pushed this branch to my personal fork of Codex and
ran the CI job there:
https://github.com/bolinfest/codex/actions/runs/19564311320
Admittedly, the graph looks a bit wild now:
<img width="5115" height="2969" alt="Screenshot 2025-11-20 at 11 44
58 PM"
src="https://github.com/user-attachments/assets/cc5ef306-efc1-4ed7-a137-5347e394f393"
/>
But when it finished, I was able to download `codex-shell-tool-mcp-npm`
from the **Artifacts** for the workflow in an empty temp directory,
unzip the `.zip` and then the `.tgz` inside it, followed by `xattr -rc
.` to remove the quarantine bits. Then I ran:
```shell
npx @modelcontextprotocol/inspector node /private/tmp/foobar4/package/bin/mcp-server.js
```
which launched the MCP Inspector and I was able to use it as expected!
This bodes well that this should work once the package is published to
npm:
```shell
npx @modelcontextprotocol/inspector npx @openai/codex-shell-tool-mcp
```
Also, to verify the package contains what I expect:
```shell
/tmp/foobar4/package$ tree
.
├── bin
│ └── mcp-server.js
├── package.json
├── README.md
└── vendor
├── aarch64-apple-darwin
│ ├── bash
│ │ ├── macos-14
│ │ │ └── bash
│ │ └── macos-15
│ │ └── bash
│ ├── codex-exec-mcp-server
│ └── codex-execve-wrapper
├── aarch64-unknown-linux-musl
│ ├── bash
│ │ ├── centos-9
│ │ │ └── bash
│ │ ├── debian-11
│ │ │ └── bash
│ │ ├── debian-12
│ │ │ └── bash
│ │ ├── ubuntu-20.04
│ │ │ └── bash
│ │ ├── ubuntu-22.04
│ │ │ └── bash
│ │ └── ubuntu-24.04
│ │ └── bash
│ ├── codex-exec-mcp-server
│ └── codex-execve-wrapper
├── x86_64-apple-darwin
│ ├── bash
│ │ └── macos-13
│ │ └── bash
│ ├── codex-exec-mcp-server
│ └── codex-execve-wrapper
└── x86_64-unknown-linux-musl
├── bash
│ ├── centos-9
│ │ └── bash
│ ├── debian-11
│ │ └── bash
│ ├── debian-12
│ │ └── bash
│ ├── ubuntu-20.04
│ │ └── bash
│ ├── ubuntu-22.04
│ │ └── bash
│ └── ubuntu-24.04
│ └── bash
├── codex-exec-mcp-server
└── codex-execve-wrapper
26 directories, 26 files
```
It stopped working (found zero duplicates) starting three days ago when
the model was switched from `gpt-5` to `gpt-5.1`. I'm not sure why it
stopped working. This is an attempt to get it working again by using the
default model for the codex action (which is presumably
`gpt-5.1-codex-max`).
### Summary
After #7022, we no longer need this warning. We should also clean up the
schema for the notification, but this is a quick fix to just stop the
behavior in the VSCE
## Testing
- [x] Ran locally
Previously, we were running into an issue where we would run the `shell`
tool call with a timeout of 10s, but it fired an elicitation asking for
user approval, the time the user took to respond to the elicitation was
counted agains the 10s timeout, so the `shell` tool call would fail with
a timeout error unless the user is very fast!
This PR addresses this issue by introducing a "stopwatch" abstraction
that is used to manage the timeout. The idea is:
- `Stopwatch::new()` is called with the _real_ timeout of the `shell`
tool call.
- `process_exec_tool_call()` is called with the `Cancellation` variant
of `ExecExpiration` because it should not manage its own timeout in this
case
- the `Stopwatch` expiration is wired up to the `cancel_rx` passed to
`process_exec_tool_call()`
- when an elicitation for the `shell` tool call is received, the
`Stopwatch` pauses
- because it is possible for multiple elicitations to arrive
concurrently, it keeps track of the number of "active pauses" and does
not resume until that counter goes down to zero
I verified that I can test the MCP server using
`@modelcontextprotocol/inspector` and specify `git status` as the
`command` with a timeout of 500ms and that the elicitation pops up and I
have all the time in the world to respond whereas previous to this PR,
that would not have been possible.
---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/6973).
* #7005
* __->__ #6973
* #6972
This updates `ExecParams` so that instead of taking `timeout_ms:
Option<u64>`, it now takes a more general cancellation mechanism,
`ExecExpiration`, which is an enum that includes a
`Cancellation(tokio_util::sync::CancellationToken)` variant.
If the cancellation token is fired, then `process_exec_tool_call()`
returns in the same way as if a timeout was exceeded.
This is necessary so that in #6973, we can manage the timeout logic
external to the `process_exec_tool_call()` because we want to "suspend"
the timeout when an elicitation from a human user is pending.
---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/6972).
* #7005
* #6973
* __->__ #6972
- Use /bin/sh instead of /bin/bash on FreeBSD/OpenBSD in the process
group timeout test to avoid command-not-found failures.
- Accept /usr/local/bin/bash as a valid SHELL path to match common
FreeBSD installations.
- Switch the shell serialization duration test to /bin/sh for improved
portability across Unix platforms.
With this change, `cargo test -p codex-core --lib` runs and passes on
FreeBSD.
Our Restricted Token contains 3 SIDs (Logon, Everyone, {WorkspaceWrite
Capability || ReadOnly Capability})
because it must include Everyone, that left us vulnerable to directories
that allow writes to Everyone. Even though those directories do not have
ACEs that enable our capability SIDs to write to them, they could still
be written to even in ReadOnly mode, or even in WorkspaceWrite mode if
they are outside of a writable root.
A solution to this is to explicitly add *Deny* ACEs to these
directories, always for the ReadOnly Capability SID, and for the
WorkspaceWrite SID if the directory is outside of a workspace root.
Under a restricted token, Windows always checks Deny ACEs before Allow
ACEs so even though our restricted token would allow a write to these
directories due to the Everyone SID, it fails first because of the Deny
ACE on the capability SID
second attempt to fix this test after
https://github.com/openai/codex/pull/6884. I think this flakiness is
happening because yield_time is too small for a 10,000 step loop in
python.
adding execpolicycheck tool onto codex cli
this is useful for validating policies (can be multiple) against
commands.
it will also surface errors in policy syntax:
<img width="1150" height="281" alt="Screenshot 2025-11-19 at 12 46
21 PM"
src="https://github.com/user-attachments/assets/8f99b403-564c-4172-acc9-6574a8d13dc3"
/>
this PR also changes output format when there's no match in the CLI.
instead of returning the raw string `noMatch`, we return
`{"noMatch":{}}`
this PR is a rewrite of: https://github.com/openai/codex/pull/6932 (due
to the numerous merge conflicts present in the original PR)
---------
Co-authored-by: Michael Bolin <mbolin@openai.com>
setting user shell timeout to an unreasonably high value since there
isn't an easy way to have a command run without timeouts
currently, user shell commands timeout is 10 seconds
## Summary
- TUI feedback note now only links to the bug-report template when the
category is bug/bad result.
- Good result/other feedback shows a thank-you+thread ID instead of
funneling people to file a bug.
- Added a helper + unit test so future changes keep the behavior
consistent.
## Testing
- just fmt
- just fix -p codex-tui
- cargo test -p codex-tui
Fixes#6839
### Description
- codex exec --json resume --last "<prompt>" bailed out because clap
treated the prompt as SESSION_ID. I removed the conflicts_with flag and
reinterpret that positional as a prompt when
--last is set, so the flow now keeps working in JSON mode.
(codex-rs/exec/src/cli.rs:84-104, codex-rs/exec/src/lib.rs:75-130)
- Added a regression test that exercises resume --last in JSON mode to
ensure the prompt is accepted and the rollout file is updated.
(codex-rs/exec/tests/suite/resume.rs:126-178)
### Testing
- just fmt
- cargo test -p codex-exec
- just fix -p codex-exec
- cargo test -p codex-exec
#6717
Signed-off-by: Dmitri Khokhlov <dkhokhlov@cribl.io>
I just noticed that in the VSCode / Codex extension when you type @ the
number of results is around 70:
- small video of searching for `mod.rs` inside `codex` repository:
https://github.com/user-attachments/assets/46e53d31-adff-465e-b32b-051c4c1c298c
- while in the CLI the number of results is currently of 8 which is
quite small:
<img width="615" height="439" alt="Screenshot 2025-11-20 at 09 42 04"
src="https://github.com/user-attachments/assets/1c6d12cb-3b1f-4d5b-9ad3-6b12975eaaec"
/>
I bumped it to 20. I had several cases where I wanted a file and did not
find it because the number of results was too small
Signed-off-by: lionel-oai <lionel@openai.com>
Co-authored-by: lionel-oai <lionel@openai.com>
## Summary
- add an explicit `override_usage` string to `AddArgs` so clap prints
`<NAME>` before the command/url choice, matching the actual parser and
docs
### Before
Usage: codex mcp add [OPTIONS] <COMMAND|--url <URL>> <NAME>
### After
Usage: codex mcp add [OPTIONS] <NAME> [--url <URL> | -- <COMMAND>...]
---------
Signed-off-by: kyuheon-kr <kyuheon.kr@gmail.com>