Commit graph

14 commits

Author SHA1 Message Date
zhao-oai
b1c918d8f7
feat: exec policy integration in shell mcp (#7609)
adding execpolicy support into the `posix` mcp

Co-authored-by: Michael Bolin <mbolin@openai.com>
2025-12-04 21:55:54 -08:00
Michael Bolin
6b5b9a687e
feat: support --version flag for @openai/codex-shell-tool-mcp (#7504)
I find it helpful to easily verify which version is running.

Tested:

```shell
~/code/codex3/codex-rs/exec-server$ cargo run --bin codex-exec-mcp-server -- --help
    Finished `dev` profile [unoptimized + debuginfo] target(s) in 0.19s
     Running `/Users/mbolin/code/codex3/codex-rs/target/debug/codex-exec-mcp-server --help`
Usage: codex-exec-mcp-server [OPTIONS]

Options:
      --execve <EXECVE_WRAPPER>  Executable to delegate execve(2) calls to in Bash
      --bash <BASH_PATH>         Path to Bash that has been patched to support execve() wrapping
  -h, --help                     Print help
  -V, --version                  Print version
~/code/codex3/codex-rs/exec-server$ cargo run --bin codex-exec-mcp-server -- --version
    Finished `dev` profile [unoptimized + debuginfo] target(s) in 0.17s
     Running `/Users/mbolin/code/codex3/codex-rs/target/debug/codex-exec-mcp-server --version`
codex-exec-server 0.0.0
```
2025-12-02 23:43:25 +00:00
Josh McKinney
ec49b56874
chore: add cargo-deny configuration (#7119)
- add GitHub workflow running cargo-deny on push/PR
- document cargo-deny allowlist with workspace-dep notes and advisory
ignores
- align workspace crates to inherit version/edition/license for
consistent checks
2025-11-24 12:22:18 -08:00
Michael Bolin
c6f68c9df8
feat: declare server capability in shell-tool-mcp (#7112)
This introduces a new feature to Codex when it operates as an MCP
_client_ where if an MCP _server_ replies that it has an entry named
`"codex/sandbox-state"` in its _server capabilities_, then Codex will
send it an MCP notification with the following structure:

```json
{
  "method": "codex/sandbox-state/update",
  "params": {
    "sandboxPolicy": {
      "type": "workspace-write",
      "network-access": false,
      "exclude-tmpdir-env-var": false
      "exclude-slash-tmp": false
    },
    "codexLinuxSandboxExe": null,
    "sandboxCwd": "/Users/mbolin/code/codex2"
  }
}
```

or with whatever values are appropriate for the initial `sandboxPolicy`.

**NOTE:** Codex _should_ continue to send the MCP server notifications
of the same format if these things change over the lifetime of the
thread, but that isn't wired up yet.

The result is that `shell-tool-mcp` can consume these values so that
when it calls `codex_core::exec::process_exec_tool_call()` in
`codex-rs/exec-server/src/posix/escalate_server.rs`, it is now sure to
call it with the correct values (whereas previously we relied on
hardcoded values).

While I would argue this is a supported use case within the MCP
protocol, the `rmcp` crate that we are using today does not support
custom notifications. As such, I had to patch it and I submitted it for
review, so hopefully it will be accepted in some form:

https://github.com/modelcontextprotocol/rust-sdk/pull/556

To test out this change from end-to-end:

- I ran `cargo build` in `~/code/codex2/codex-rs/exec-server`
- I built the fork of Bash in `~/code/bash/bash`
- I added the following to my `~/.codex/config.toml`:

```toml
# Use with `codex --disable shell_tool`.
[mcp_servers.execshell]
args = ["--bash", "/Users/mbolin/code/bash/bash"]
command = "/Users/mbolin/code/codex2/codex-rs/target/debug/codex-exec-mcp-server"
```

- From `~/code/codex2/codex-rs`, I ran `just codex --disable shell_tool`
- When the TUI started up, I verified that the sandbox mode is
`workspace-write`
- I ran `/mcp` to verify that the shell tool from the MCP is there:

<img width="1387" height="1400" alt="image"
src="https://github.com/user-attachments/assets/1a8addcc-5005-4e16-b59f-95cfd06fd4ab"
/>

- Then I asked it:

> what is the output of `gh issue list`

because this should be auto-approved with our existing dummy policy:


af63e6eccc/codex-rs/exec-server/src/posix.rs (L157-L164)

And it worked:

<img width="1387" height="1400" alt="image"
src="https://github.com/user-attachments/assets/7568d2f7-80da-4d68-86d0-c265a6f5e6c1"
/>
2025-11-21 16:11:01 -08:00
Michael Bolin
67975ed33a
refactor: inline sandbox type lookup in process_exec_tool_call (#7122)
`process_exec_tool_call()` was taking `SandboxType` as a param, but in
practice, the only place it was constructed was in
`codex_message_processor.rs` where it was derived from the other
`sandbox_policy` param, so this PR inlines the logic that decides the
`SandboxType` into `process_exec_tool_call()`.



---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/7122).
* #7112
* __->__ #7122
2025-11-21 22:53:05 +00:00
Michael Bolin
e8ef6d3c16
feat: support login as an option on shell-tool-mcp (#7120)
The unified exec tool has a `login` option that defaults to `true`:


3bdcbc7292/codex-rs/core/src/tools/handlers/unified_exec.rs (L35-L36)

This updates the `ExecParams` for `shell-tool-mcp` to support the same
parameter. Note it is declared as `Option<bool>` to ensure it is marked
optional in the generated JSON schema.
2025-11-21 22:14:41 +00:00
Michael Bolin
8e5f38c0f0
feat: waiting for an elicitation should not count against a shell tool timeout (#6973)
Previously, we were running into an issue where we would run the `shell`
tool call with a timeout of 10s, but it fired an elicitation asking for
user approval, the time the user took to respond to the elicitation was
counted agains the 10s timeout, so the `shell` tool call would fail with
a timeout error unless the user is very fast!

This PR addresses this issue by introducing a "stopwatch" abstraction
that is used to manage the timeout. The idea is:

- `Stopwatch::new()` is called with the _real_ timeout of the `shell`
tool call.
- `process_exec_tool_call()` is called with the `Cancellation` variant
of `ExecExpiration` because it should not manage its own timeout in this
case
- the `Stopwatch` expiration is wired up to the `cancel_rx` passed to
`process_exec_tool_call()`
- when an elicitation for the `shell` tool call is received, the
`Stopwatch` pauses
- because it is possible for multiple elicitations to arrive
concurrently, it keeps track of the number of "active pauses" and does
not resume until that counter goes down to zero

I verified that I can test the MCP server using
`@modelcontextprotocol/inspector` and specify `git status` as the
`command` with a timeout of 500ms and that the elicitation pops up and I
have all the time in the world to respond whereas previous to this PR,
that would not have been possible.

---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/6973).
* #7005
* __->__ #6973
* #6972
2025-11-20 16:45:38 -08:00
Michael Bolin
f56d1dc8fc
feat: update process_exec_tool_call() to take a cancellation token (#6972)
This updates `ExecParams` so that instead of taking `timeout_ms:
Option<u64>`, it now takes a more general cancellation mechanism,
`ExecExpiration`, which is an enum that includes a
`Cancellation(tokio_util::sync::CancellationToken)` variant.

If the cancellation token is fired, then `process_exec_tool_call()`
returns in the same way as if a timeout was exceeded.

This is necessary so that in #6973, we can manage the timeout logic
external to the `process_exec_tool_call()` because we want to "suspend"
the timeout when an elicitation from a human user is pending.








---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/6972).
* #7005
* #6973
* __->__ #6972
2025-11-20 16:29:57 -08:00
Michael Bolin
54e6e4ac32
fix: when displaying execv, show file instead of arg0 (#6966)
After merging https://github.com/openai/codex/pull/6958, I realized that
the `command` I was displaying was not quite right. Since we know it, we
should show the _exact_ program being executed (the first arg to
`execve(3)`) rather than `arg0` to be more precise.

Below is the same command I used to test
https://github.com/openai/codex/pull/6958, but now you can see it shows
`/Users/mbolin/.openai/bin/git` instead of just `git`.

<img width="1526" height="1444" alt="image"
src="https://github.com/user-attachments/assets/428128d1-c658-456e-a64e-fc6a0009cb34"
/>
2025-11-19 22:42:58 -08:00
Michael Bolin
e8af41de8a
fix: clean up elicitation used by exec-server (#6958)
Using appropriate message/title fields, I think this looks better now:

<img width="3370" height="3208" alt="image"
src="https://github.com/user-attachments/assets/e9bbf906-4ba8-4563-affc-62cdc6c97342"
/>

Though note that in the current version of the Inspector (`0.17.2`), you
cannot hit **Submit** until you fill out the field. I believe this is a
bug in the Inspector, as it does not properly handle the case when all
fields are optional. I put up a fix:

https://github.com/modelcontextprotocol/inspector/pull/926
2025-11-20 04:59:17 +00:00
zhao-oai
fb9849e1e3
migrating execpolicy -> execpolicy-legacy and execpolicy2 -> execpolicy (#6956) 2025-11-19 19:14:10 -08:00
Michael Bolin
13d378f2ce
chore: refactor exec-server to prepare it for standalone MCP use (#6944)
This PR reorganizes things slightly so that:

- Instead of a single multitool executable, `codex-exec-server`, we now
have two executables:
  - `codex-exec-mcp-server` to launch the MCP server
- `codex-execve-wrapper` is the `execve(2)` wrapper to use with the
`BASH_EXEC_WRAPPER` environment variable
- `BASH_EXEC_WRAPPER` must be a single executable: it cannot be a
command string composed of an executable with args (i.e., it no longer
adds the `escalate` subcommand, as before)
- `codex-exec-mcp-server` takes `--bash` and `--execve` as options.
Though if `--execve` is not specified, the MCP server will check the
directory containing `std::env::current_exe()` and attempt to use the
file named `codex-execve-wrapper` within it. In development, this works
out since these executables are side-by-side in the `target/debug`
folder.

With respect to testing, this also fixes an important bug in
`dummy_exec_policy()`, as I was using `ends_with()` as if it applied to
a `String`, but in this case, it is used with a `&Path`, so the
semantics are slightly different.

Putting this all together, I was able to test this by running the
following:

```
~/code/codex/codex-rs$ npx @modelcontextprotocol/inspector \
    ./target/debug/codex-exec-mcp-server --bash ~/code/bash/bash
```

If I try to run `git status` in `/Users/mbolin/code/codex` via the
`shell` tool from the MCP server:

<img width="1589" height="1335" alt="image"
src="https://github.com/user-attachments/assets/9db6aea8-7fbc-4675-8b1f-ec446685d6c4"
/>

then I get prompted with the following elicitation, as expected:

<img width="1589" height="1335" alt="image"
src="https://github.com/user-attachments/assets/21b68fe0-494d-4562-9bad-0ddc55fc846d"
/>

Though a current limitation is that the `shell` tool defaults to a
timeout of 10s, which means I only have 10s to respond to the
elicitation. Ideally, the time spent waiting for a response from a human
should not count against the timeout for the command execution. I will
address this in a subsequent PR.

---

Note `~/code/bash/bash` was created by doing:

```
cd ~/code
git clone https://github.com/bminor/bash
cd bash
git checkout a8a1c2fac029404d3f42cd39f5a20f24b6e4fe4b
<apply the patch below>
./configure
make
```

The patch:

```
diff --git a/execute_cmd.c b/execute_cmd.c
index 070f5119..d20ad2b9 100644
--- a/execute_cmd.c
+++ b/execute_cmd.c
@@ -6129,6 +6129,19 @@ shell_execve (char *command, char **args, char **env)
   char sample[HASH_BANG_BUFSIZ];
   size_t larray;

+  char* exec_wrapper = getenv("BASH_EXEC_WRAPPER");
+  if (exec_wrapper && *exec_wrapper && !whitespace (*exec_wrapper))
+    {
+      char *orig_command = command;
+
+      larray = strvec_len (args);
+
+      memmove (args + 2, args, (++larray) * sizeof (char *));
+      args[0] = exec_wrapper;
+      args[1] = orig_command;
+      command = exec_wrapper;
+    }
+
```
2025-11-19 16:38:14 -08:00
Michael Bolin
056c8f8279
fix: prepare ExecPolicy in exec-server for execpolicy2 cutover (#6888)
This PR introduces an extra layer of abstraction to prepare us for the
migration to execpolicy2:

- introduces a new trait, `EscalationPolicy`, whose `determine_action()`
method is responsible for producing the `EscalateAction`
- the existing `ExecPolicy` typedef is changed to return an intermediate
`ExecPolicyOutcome` instead of `EscalateAction`
- the default implementation of `EscalationPolicy`,
`McpEscalationPolicy`, composes `ExecPolicy`
- the `ExecPolicyOutcome` includes `codex_execpolicy2::Decision`, which
has a `Prompt` variant
- when `McpEscalationPolicy` gets `Decision::Prompt` back from
`ExecPolicy`, it prompts the user via an MCP elicitation and maps the
result into an `ElicitationAction`
- now that the end user can reply to an elicitation with `Decline` or
`Cancel`, we introduce a new variant, `EscalateAction::Deny`, which the
client handles by returning exit code `1` without running anything

Note the way the elicitation is created is still not quite right, but I
will fix that once we have things running end-to-end for real in a
follow-up PR.
2025-11-19 13:55:29 -08:00
Jeremy Rose
c1391b9f94
exec-server (#6630) 2025-11-19 00:20:19 +00:00