core-agent-ide/codex-rs/exec-server/tests/common/lib.rs

202 lines
6.5 KiB
Rust
Raw Normal View History

fix: change codex/sandbox-state/update from a notification to a request (#8142) Historically, `accept_elicitation_for_prompt_rule()` was flaky because we were using a notification to update the sandbox followed by a `shell` tool request that we expected to be subject to the new sandbox config, but because [rmcp](https://crates.io/crates/rmcp) MCP servers delegate each incoming message to a new Tokio task, messages are not guaranteed to be processed in order, so sometimes the `shell` tool call would run before the notification was processed. Prior to this PR, we relied on a generous `sleep()` between the notification and the request to reduce the change of the test flaking out. This PR implements a proper fix, which is to use a _request_ instead of a notification for the sandbox update so that we can wait for the response to the sandbox request before sending the request to the `shell` tool call. Previously, `rmcp` did not support custom requests, but I fixed that in https://github.com/modelcontextprotocol/rust-sdk/pull/590, which made it into the `0.12.0` release (see #8288). This PR updates `shell-tool-mcp` to expect `"codex/sandbox-state/update"` as a _request_ instead of a notification and sends the appropriate ack. Note this behavior is tied to our custom `codex/sandbox-state` capability, which Codex honors as an MCP client, which is why `core/src/mcp_connection_manager.rs` had to be updated as part of this PR, as well. This PR also updates the docs at `shell-tool-mcp/README.md`.
2025-12-18 15:32:01 -08:00
use codex_core::MCP_SANDBOX_STATE_METHOD;
use codex_core::SandboxState;
use codex_core::protocol::SandboxPolicy;
feat: introduce find_resource! macro that works with Cargo or Bazel (#8879) To support Bazelification in https://github.com/openai/codex/pull/8875, this PR introduces a new `find_resource!` macro that we use in place of our existing logic in tests that looks for resources relative to the compile-time `CARGO_MANIFEST_DIR` env var. To make this work, we plan to add the following to all `rust_library()` and `rust_test()` Bazel rules in the project: ``` rustc_env = { "BAZEL_PACKAGE": native.package_name(), }, ``` Our new `find_resource!` macro reads this value via `option_env!("BAZEL_PACKAGE")` so that the Bazel package _of the code using `find_resource!`_ is injected into the code expanded from the macro. (If `find_resource()` were a function, then `option_env!("BAZEL_PACKAGE")` would always be `codex-rs/utils/cargo-bin`, which is not what we want.) Note we only consider the `BAZEL_PACKAGE` value when the `RUNFILES_DIR` environment variable is set at runtime, indicating that the test is being run by Bazel. In this case, we have to concatenate the runtime `RUNFILES_DIR` with the compile-time `BAZEL_PACKAGE` value to build the path to the resource. In testing this change, I discovered one funky edge case in `codex-rs/exec-server/tests/common/lib.rs` where we have to _normalize_ (but not canonicalize!) the result from `find_resource!` because the path contains a `common/..` component that does not exist on disk when the test is run under Bazel, so it must be semantically normalized using the [`path-absolutize`](https://crates.io/crates/path-absolutize) crate before it is passed to `dotslash fetch`. Because this new behavior may be non-obvious, this PR also updates `AGENTS.md` to make humans/Codex aware that this API is preferred.
2026-01-07 18:06:08 -08:00
use codex_utils_cargo_bin::find_resource;
use rmcp::ClientHandler;
use rmcp::ErrorData as McpError;
use rmcp::RoleClient;
use rmcp::Service;
use rmcp::model::ClientCapabilities;
use rmcp::model::ClientInfo;
fix: change codex/sandbox-state/update from a notification to a request (#8142) Historically, `accept_elicitation_for_prompt_rule()` was flaky because we were using a notification to update the sandbox followed by a `shell` tool request that we expected to be subject to the new sandbox config, but because [rmcp](https://crates.io/crates/rmcp) MCP servers delegate each incoming message to a new Tokio task, messages are not guaranteed to be processed in order, so sometimes the `shell` tool call would run before the notification was processed. Prior to this PR, we relied on a generous `sleep()` between the notification and the request to reduce the change of the test flaking out. This PR implements a proper fix, which is to use a _request_ instead of a notification for the sandbox update so that we can wait for the response to the sandbox request before sending the request to the `shell` tool call. Previously, `rmcp` did not support custom requests, but I fixed that in https://github.com/modelcontextprotocol/rust-sdk/pull/590, which made it into the `0.12.0` release (see #8288). This PR updates `shell-tool-mcp` to expect `"codex/sandbox-state/update"` as a _request_ instead of a notification and sends the appropriate ack. Note this behavior is tied to our custom `codex/sandbox-state` capability, which Codex honors as an MCP client, which is why `core/src/mcp_connection_manager.rs` had to be updated as part of this PR, as well. This PR also updates the docs at `shell-tool-mcp/README.md`.
2025-12-18 15:32:01 -08:00
use rmcp::model::ClientRequest;
use rmcp::model::CreateElicitationRequestParams;
use rmcp::model::CreateElicitationResult;
fix: change codex/sandbox-state/update from a notification to a request (#8142) Historically, `accept_elicitation_for_prompt_rule()` was flaky because we were using a notification to update the sandbox followed by a `shell` tool request that we expected to be subject to the new sandbox config, but because [rmcp](https://crates.io/crates/rmcp) MCP servers delegate each incoming message to a new Tokio task, messages are not guaranteed to be processed in order, so sometimes the `shell` tool call would run before the notification was processed. Prior to this PR, we relied on a generous `sleep()` between the notification and the request to reduce the change of the test flaking out. This PR implements a proper fix, which is to use a _request_ instead of a notification for the sandbox update so that we can wait for the response to the sandbox request before sending the request to the `shell` tool call. Previously, `rmcp` did not support custom requests, but I fixed that in https://github.com/modelcontextprotocol/rust-sdk/pull/590, which made it into the `0.12.0` release (see #8288). This PR updates `shell-tool-mcp` to expect `"codex/sandbox-state/update"` as a _request_ instead of a notification and sends the appropriate ack. Note this behavior is tied to our custom `codex/sandbox-state` capability, which Codex honors as an MCP client, which is why `core/src/mcp_connection_manager.rs` had to be updated as part of this PR, as well. This PR also updates the docs at `shell-tool-mcp/README.md`.
2025-12-18 15:32:01 -08:00
use rmcp::model::CustomRequest;
use rmcp::model::ElicitationAction;
fix: change codex/sandbox-state/update from a notification to a request (#8142) Historically, `accept_elicitation_for_prompt_rule()` was flaky because we were using a notification to update the sandbox followed by a `shell` tool request that we expected to be subject to the new sandbox config, but because [rmcp](https://crates.io/crates/rmcp) MCP servers delegate each incoming message to a new Tokio task, messages are not guaranteed to be processed in order, so sometimes the `shell` tool call would run before the notification was processed. Prior to this PR, we relied on a generous `sleep()` between the notification and the request to reduce the change of the test flaking out. This PR implements a proper fix, which is to use a _request_ instead of a notification for the sandbox update so that we can wait for the response to the sandbox request before sending the request to the `shell` tool call. Previously, `rmcp` did not support custom requests, but I fixed that in https://github.com/modelcontextprotocol/rust-sdk/pull/590, which made it into the `0.12.0` release (see #8288). This PR updates `shell-tool-mcp` to expect `"codex/sandbox-state/update"` as a _request_ instead of a notification and sends the appropriate ack. Note this behavior is tied to our custom `codex/sandbox-state` capability, which Codex honors as an MCP client, which is why `core/src/mcp_connection_manager.rs` had to be updated as part of this PR, as well. This PR also updates the docs at `shell-tool-mcp/README.md`.
2025-12-18 15:32:01 -08:00
use rmcp::model::ServerResult;
use rmcp::service::RunningService;
use rmcp::transport::ConfigureCommandExt;
use rmcp::transport::TokioChildProcess;
use serde_json::json;
use std::collections::HashSet;
use std::path::Path;
use std::path::PathBuf;
use std::process::Stdio;
use std::sync::Arc;
use std::sync::Mutex;
use tokio::process::Command;
pub async fn create_transport<P>(
codex_home: P,
dotslash_cache: P,
) -> anyhow::Result<TokioChildProcess>
where
P: AsRef<Path>,
{
feat: introduce find_resource! macro that works with Cargo or Bazel (#8879) To support Bazelification in https://github.com/openai/codex/pull/8875, this PR introduces a new `find_resource!` macro that we use in place of our existing logic in tests that looks for resources relative to the compile-time `CARGO_MANIFEST_DIR` env var. To make this work, we plan to add the following to all `rust_library()` and `rust_test()` Bazel rules in the project: ``` rustc_env = { "BAZEL_PACKAGE": native.package_name(), }, ``` Our new `find_resource!` macro reads this value via `option_env!("BAZEL_PACKAGE")` so that the Bazel package _of the code using `find_resource!`_ is injected into the code expanded from the macro. (If `find_resource()` were a function, then `option_env!("BAZEL_PACKAGE")` would always be `codex-rs/utils/cargo-bin`, which is not what we want.) Note we only consider the `BAZEL_PACKAGE` value when the `RUNFILES_DIR` environment variable is set at runtime, indicating that the test is being run by Bazel. In this case, we have to concatenate the runtime `RUNFILES_DIR` with the compile-time `BAZEL_PACKAGE` value to build the path to the resource. In testing this change, I discovered one funky edge case in `codex-rs/exec-server/tests/common/lib.rs` where we have to _normalize_ (but not canonicalize!) the result from `find_resource!` because the path contains a `common/..` component that does not exist on disk when the test is run under Bazel, so it must be semantically normalized using the [`path-absolutize`](https://crates.io/crates/path-absolutize) crate before it is passed to `dotslash fetch`. Because this new behavior may be non-obvious, this PR also updates `AGENTS.md` to make humans/Codex aware that this API is preferred.
2026-01-07 18:06:08 -08:00
// `bash` is a test resource rather than a binary target, so we must use
// `find_resource!` to locate it instead of `cargo_bin()`.
let bash = find_resource!("../suite/bash")?;
// Need to ensure the artifact associated with the bash DotSlash file is
// available before it is run in a read-only sandbox.
let status = Command::new("dotslash")
.arg("--")
.arg("fetch")
.arg(bash.clone())
.env("DOTSLASH_CACHE", dotslash_cache.as_ref())
.status()
.await?;
assert!(status.success(), "dotslash fetch failed: {status:?}");
create_transport_with_shell_path(codex_home, dotslash_cache, bash).await
}
pub async fn create_transport_with_shell_path<P, Q, R>(
codex_home: P,
dotslash_cache: Q,
shell_path: R,
) -> anyhow::Result<TokioChildProcess>
where
P: AsRef<Path>,
Q: AsRef<Path>,
R: AsRef<Path>,
{
let mcp_executable = codex_utils_cargo_bin::cargo_bin("codex-exec-mcp-server")?;
let execve_wrapper = codex_utils_cargo_bin::cargo_bin("codex-execve-wrapper")?;
feat: introduce codex-utils-cargo-bin as an alternative to assert_cmd::Command (#8496) This PR introduces a `codex-utils-cargo-bin` utility crate that wraps/replaces our use of `assert_cmd::Command` and `escargot::CargoBuild`. As you can infer from the introduction of `buck_project_root()` in this PR, I am attempting to make it possible to build Codex under [Buck2](https://buck2.build) as well as `cargo`. With Buck2, I hope to achieve faster incremental local builds (largely due to Buck2's [dice](https://buck2.build/docs/insights_and_knowledge/modern_dice/) build strategy, as well as benefits from its local build daemon) as well as faster CI builds if we invest in remote execution and caching. See https://buck2.build/docs/getting_started/what_is_buck2/#why-use-buck2-key-advantages for more details about the performance advantages of Buck2. Buck2 enforces stronger requirements in terms of build and test isolation. It discourages assumptions about absolute paths (which is key to enabling remote execution). Because the `CARGO_BIN_EXE_*` environment variables that Cargo provides are absolute paths (which `assert_cmd::Command` reads), this is a problem for Buck2, which is why we need this `codex-utils-cargo-bin` utility. My WIP-Buck2 setup sets the `CARGO_BIN_EXE_*` environment variables passed to a `rust_test()` build rule as relative paths. `codex-utils-cargo-bin` will resolve these values to absolute paths, when necessary. --- [//]: # (BEGIN SAPLING FOOTER) Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/openai/codex/pull/8496). * #8498 * __->__ #8496
2025-12-23 19:29:32 -08:00
let transport = TokioChildProcess::new(Command::new(&mcp_executable).configure(|cmd| {
cmd.arg("--bash").arg(shell_path.as_ref());
feat: introduce codex-utils-cargo-bin as an alternative to assert_cmd::Command (#8496) This PR introduces a `codex-utils-cargo-bin` utility crate that wraps/replaces our use of `assert_cmd::Command` and `escargot::CargoBuild`. As you can infer from the introduction of `buck_project_root()` in this PR, I am attempting to make it possible to build Codex under [Buck2](https://buck2.build) as well as `cargo`. With Buck2, I hope to achieve faster incremental local builds (largely due to Buck2's [dice](https://buck2.build/docs/insights_and_knowledge/modern_dice/) build strategy, as well as benefits from its local build daemon) as well as faster CI builds if we invest in remote execution and caching. See https://buck2.build/docs/getting_started/what_is_buck2/#why-use-buck2-key-advantages for more details about the performance advantages of Buck2. Buck2 enforces stronger requirements in terms of build and test isolation. It discourages assumptions about absolute paths (which is key to enabling remote execution). Because the `CARGO_BIN_EXE_*` environment variables that Cargo provides are absolute paths (which `assert_cmd::Command` reads), this is a problem for Buck2, which is why we need this `codex-utils-cargo-bin` utility. My WIP-Buck2 setup sets the `CARGO_BIN_EXE_*` environment variables passed to a `rust_test()` build rule as relative paths. `codex-utils-cargo-bin` will resolve these values to absolute paths, when necessary. --- [//]: # (BEGIN SAPLING FOOTER) Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/openai/codex/pull/8496). * #8498 * __->__ #8496
2025-12-23 19:29:32 -08:00
cmd.arg("--execve").arg(&execve_wrapper);
cmd.env("CODEX_HOME", codex_home.as_ref());
cmd.env("DOTSLASH_CACHE", dotslash_cache.as_ref());
feat: introduce codex-utils-cargo-bin as an alternative to assert_cmd::Command (#8496) This PR introduces a `codex-utils-cargo-bin` utility crate that wraps/replaces our use of `assert_cmd::Command` and `escargot::CargoBuild`. As you can infer from the introduction of `buck_project_root()` in this PR, I am attempting to make it possible to build Codex under [Buck2](https://buck2.build) as well as `cargo`. With Buck2, I hope to achieve faster incremental local builds (largely due to Buck2's [dice](https://buck2.build/docs/insights_and_knowledge/modern_dice/) build strategy, as well as benefits from its local build daemon) as well as faster CI builds if we invest in remote execution and caching. See https://buck2.build/docs/getting_started/what_is_buck2/#why-use-buck2-key-advantages for more details about the performance advantages of Buck2. Buck2 enforces stronger requirements in terms of build and test isolation. It discourages assumptions about absolute paths (which is key to enabling remote execution). Because the `CARGO_BIN_EXE_*` environment variables that Cargo provides are absolute paths (which `assert_cmd::Command` reads), this is a problem for Buck2, which is why we need this `codex-utils-cargo-bin` utility. My WIP-Buck2 setup sets the `CARGO_BIN_EXE_*` environment variables passed to a `rust_test()` build rule as relative paths. `codex-utils-cargo-bin` will resolve these values to absolute paths, when necessary. --- [//]: # (BEGIN SAPLING FOOTER) Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/openai/codex/pull/8496). * #8498 * __->__ #8496
2025-12-23 19:29:32 -08:00
// Important: pipe stdio so rmcp can speak JSON-RPC over stdin/stdout
cmd.stdin(Stdio::piped());
cmd.stdout(Stdio::piped());
feat: introduce codex-utils-cargo-bin as an alternative to assert_cmd::Command (#8496) This PR introduces a `codex-utils-cargo-bin` utility crate that wraps/replaces our use of `assert_cmd::Command` and `escargot::CargoBuild`. As you can infer from the introduction of `buck_project_root()` in this PR, I am attempting to make it possible to build Codex under [Buck2](https://buck2.build) as well as `cargo`. With Buck2, I hope to achieve faster incremental local builds (largely due to Buck2's [dice](https://buck2.build/docs/insights_and_knowledge/modern_dice/) build strategy, as well as benefits from its local build daemon) as well as faster CI builds if we invest in remote execution and caching. See https://buck2.build/docs/getting_started/what_is_buck2/#why-use-buck2-key-advantages for more details about the performance advantages of Buck2. Buck2 enforces stronger requirements in terms of build and test isolation. It discourages assumptions about absolute paths (which is key to enabling remote execution). Because the `CARGO_BIN_EXE_*` environment variables that Cargo provides are absolute paths (which `assert_cmd::Command` reads), this is a problem for Buck2, which is why we need this `codex-utils-cargo-bin` utility. My WIP-Buck2 setup sets the `CARGO_BIN_EXE_*` environment variables passed to a `rust_test()` build rule as relative paths. `codex-utils-cargo-bin` will resolve these values to absolute paths, when necessary. --- [//]: # (BEGIN SAPLING FOOTER) Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/openai/codex/pull/8496). * #8498 * __->__ #8496
2025-12-23 19:29:32 -08:00
// Optional but very helpful while debugging:
cmd.stderr(Stdio::inherit());
}))?;
Ok(transport)
}
pub async fn write_default_execpolicy<P>(policy: &str, codex_home: P) -> anyhow::Result<()>
where
P: AsRef<Path>,
{
let policy_dir = codex_home.as_ref().join("rules");
tokio::fs::create_dir_all(&policy_dir).await?;
tokio::fs::write(policy_dir.join("default.rules"), policy).await?;
Ok(())
}
pub async fn notify_readable_sandbox<P, S>(
sandbox_cwd: P,
codex_linux_sandbox_exe: Option<PathBuf>,
service: &RunningService<RoleClient, S>,
fix: change codex/sandbox-state/update from a notification to a request (#8142) Historically, `accept_elicitation_for_prompt_rule()` was flaky because we were using a notification to update the sandbox followed by a `shell` tool request that we expected to be subject to the new sandbox config, but because [rmcp](https://crates.io/crates/rmcp) MCP servers delegate each incoming message to a new Tokio task, messages are not guaranteed to be processed in order, so sometimes the `shell` tool call would run before the notification was processed. Prior to this PR, we relied on a generous `sleep()` between the notification and the request to reduce the change of the test flaking out. This PR implements a proper fix, which is to use a _request_ instead of a notification for the sandbox update so that we can wait for the response to the sandbox request before sending the request to the `shell` tool call. Previously, `rmcp` did not support custom requests, but I fixed that in https://github.com/modelcontextprotocol/rust-sdk/pull/590, which made it into the `0.12.0` release (see #8288). This PR updates `shell-tool-mcp` to expect `"codex/sandbox-state/update"` as a _request_ instead of a notification and sends the appropriate ack. Note this behavior is tied to our custom `codex/sandbox-state` capability, which Codex honors as an MCP client, which is why `core/src/mcp_connection_manager.rs` had to be updated as part of this PR, as well. This PR also updates the docs at `shell-tool-mcp/README.md`.
2025-12-18 15:32:01 -08:00
) -> anyhow::Result<ServerResult>
where
P: AsRef<Path>,
S: Service<RoleClient> + ClientHandler,
{
let sandbox_state = SandboxState {
feat: make sandbox read access configurable with `ReadOnlyAccess` (#11387) `SandboxPolicy::ReadOnly` previously implied broad read access and could not express a narrower read surface. This change introduces an explicit read-access model so we can support user-configurable read restrictions in follow-up work, while preserving current behavior today. It also ensures unsupported backends fail closed for restricted-read policies instead of silently granting broader access than intended. ## What - Added `ReadOnlyAccess` in protocol with: - `Restricted { include_platform_defaults, readable_roots }` - `FullAccess` - Updated `SandboxPolicy` to carry read-access configuration: - `ReadOnly { access: ReadOnlyAccess }` - `WorkspaceWrite { ..., read_only_access: ReadOnlyAccess }` - Preserved existing behavior by defaulting current construction paths to `ReadOnlyAccess::FullAccess`. - Threaded the new fields through sandbox policy consumers and call sites across `core`, `tui`, `linux-sandbox`, `windows-sandbox`, and related tests. - Updated Seatbelt policy generation to honor restricted read roots by emitting scoped read rules when full read access is not granted. - Added fail-closed behavior on Linux and Windows backends when restricted read access is requested but not yet implemented there (`UnsupportedOperation`). - Regenerated app-server protocol schema and TypeScript artifacts, including `ReadOnlyAccess`. ## Compatibility / rollout - Runtime behavior remains unchanged by default (`FullAccess`). - API/schema changes are in place so future config wiring can enable restricted read access without another policy-shape migration.
2026-02-11 18:31:14 -08:00
sandbox_policy: SandboxPolicy::new_read_only_policy(),
codex_linux_sandbox_exe,
sandbox_cwd: sandbox_cwd.as_ref().to_path_buf(),
use_linux_sandbox_bwrap: false,
};
fix: change codex/sandbox-state/update from a notification to a request (#8142) Historically, `accept_elicitation_for_prompt_rule()` was flaky because we were using a notification to update the sandbox followed by a `shell` tool request that we expected to be subject to the new sandbox config, but because [rmcp](https://crates.io/crates/rmcp) MCP servers delegate each incoming message to a new Tokio task, messages are not guaranteed to be processed in order, so sometimes the `shell` tool call would run before the notification was processed. Prior to this PR, we relied on a generous `sleep()` between the notification and the request to reduce the change of the test flaking out. This PR implements a proper fix, which is to use a _request_ instead of a notification for the sandbox update so that we can wait for the response to the sandbox request before sending the request to the `shell` tool call. Previously, `rmcp` did not support custom requests, but I fixed that in https://github.com/modelcontextprotocol/rust-sdk/pull/590, which made it into the `0.12.0` release (see #8288). This PR updates `shell-tool-mcp` to expect `"codex/sandbox-state/update"` as a _request_ instead of a notification and sends the appropriate ack. Note this behavior is tied to our custom `codex/sandbox-state` capability, which Codex honors as an MCP client, which is why `core/src/mcp_connection_manager.rs` had to be updated as part of this PR, as well. This PR also updates the docs at `shell-tool-mcp/README.md`.
2025-12-18 15:32:01 -08:00
send_sandbox_state_update(sandbox_state, service).await
}
pub async fn notify_writable_sandbox_only_one_folder<P, S>(
writable_folder: P,
codex_linux_sandbox_exe: Option<PathBuf>,
service: &RunningService<RoleClient, S>,
fix: change codex/sandbox-state/update from a notification to a request (#8142) Historically, `accept_elicitation_for_prompt_rule()` was flaky because we were using a notification to update the sandbox followed by a `shell` tool request that we expected to be subject to the new sandbox config, but because [rmcp](https://crates.io/crates/rmcp) MCP servers delegate each incoming message to a new Tokio task, messages are not guaranteed to be processed in order, so sometimes the `shell` tool call would run before the notification was processed. Prior to this PR, we relied on a generous `sleep()` between the notification and the request to reduce the change of the test flaking out. This PR implements a proper fix, which is to use a _request_ instead of a notification for the sandbox update so that we can wait for the response to the sandbox request before sending the request to the `shell` tool call. Previously, `rmcp` did not support custom requests, but I fixed that in https://github.com/modelcontextprotocol/rust-sdk/pull/590, which made it into the `0.12.0` release (see #8288). This PR updates `shell-tool-mcp` to expect `"codex/sandbox-state/update"` as a _request_ instead of a notification and sends the appropriate ack. Note this behavior is tied to our custom `codex/sandbox-state` capability, which Codex honors as an MCP client, which is why `core/src/mcp_connection_manager.rs` had to be updated as part of this PR, as well. This PR also updates the docs at `shell-tool-mcp/README.md`.
2025-12-18 15:32:01 -08:00
) -> anyhow::Result<ServerResult>
where
P: AsRef<Path>,
S: Service<RoleClient> + ClientHandler,
{
let sandbox_state = SandboxState {
sandbox_policy: SandboxPolicy::WorkspaceWrite {
// Note that sandbox_cwd will already be included as a writable root
// when the sandbox policy is expanded.
writable_roots: vec![],
feat: make sandbox read access configurable with `ReadOnlyAccess` (#11387) `SandboxPolicy::ReadOnly` previously implied broad read access and could not express a narrower read surface. This change introduces an explicit read-access model so we can support user-configurable read restrictions in follow-up work, while preserving current behavior today. It also ensures unsupported backends fail closed for restricted-read policies instead of silently granting broader access than intended. ## What - Added `ReadOnlyAccess` in protocol with: - `Restricted { include_platform_defaults, readable_roots }` - `FullAccess` - Updated `SandboxPolicy` to carry read-access configuration: - `ReadOnly { access: ReadOnlyAccess }` - `WorkspaceWrite { ..., read_only_access: ReadOnlyAccess }` - Preserved existing behavior by defaulting current construction paths to `ReadOnlyAccess::FullAccess`. - Threaded the new fields through sandbox policy consumers and call sites across `core`, `tui`, `linux-sandbox`, `windows-sandbox`, and related tests. - Updated Seatbelt policy generation to honor restricted read roots by emitting scoped read rules when full read access is not granted. - Added fail-closed behavior on Linux and Windows backends when restricted read access is requested but not yet implemented there (`UnsupportedOperation`). - Regenerated app-server protocol schema and TypeScript artifacts, including `ReadOnlyAccess`. ## Compatibility / rollout - Runtime behavior remains unchanged by default (`FullAccess`). - API/schema changes are in place so future config wiring can enable restricted read access without another policy-shape migration.
2026-02-11 18:31:14 -08:00
read_only_access: Default::default(),
network_access: false,
// Disable writes to temp dir because this is a test, so
// writable_folder is likely also under /tmp and we want to be
// strict about what is writable.
exclude_tmpdir_env_var: true,
exclude_slash_tmp: true,
},
codex_linux_sandbox_exe,
sandbox_cwd: writable_folder.as_ref().to_path_buf(),
use_linux_sandbox_bwrap: false,
};
fix: change codex/sandbox-state/update from a notification to a request (#8142) Historically, `accept_elicitation_for_prompt_rule()` was flaky because we were using a notification to update the sandbox followed by a `shell` tool request that we expected to be subject to the new sandbox config, but because [rmcp](https://crates.io/crates/rmcp) MCP servers delegate each incoming message to a new Tokio task, messages are not guaranteed to be processed in order, so sometimes the `shell` tool call would run before the notification was processed. Prior to this PR, we relied on a generous `sleep()` between the notification and the request to reduce the change of the test flaking out. This PR implements a proper fix, which is to use a _request_ instead of a notification for the sandbox update so that we can wait for the response to the sandbox request before sending the request to the `shell` tool call. Previously, `rmcp` did not support custom requests, but I fixed that in https://github.com/modelcontextprotocol/rust-sdk/pull/590, which made it into the `0.12.0` release (see #8288). This PR updates `shell-tool-mcp` to expect `"codex/sandbox-state/update"` as a _request_ instead of a notification and sends the appropriate ack. Note this behavior is tied to our custom `codex/sandbox-state` capability, which Codex honors as an MCP client, which is why `core/src/mcp_connection_manager.rs` had to be updated as part of this PR, as well. This PR also updates the docs at `shell-tool-mcp/README.md`.
2025-12-18 15:32:01 -08:00
send_sandbox_state_update(sandbox_state, service).await
}
fix: change codex/sandbox-state/update from a notification to a request (#8142) Historically, `accept_elicitation_for_prompt_rule()` was flaky because we were using a notification to update the sandbox followed by a `shell` tool request that we expected to be subject to the new sandbox config, but because [rmcp](https://crates.io/crates/rmcp) MCP servers delegate each incoming message to a new Tokio task, messages are not guaranteed to be processed in order, so sometimes the `shell` tool call would run before the notification was processed. Prior to this PR, we relied on a generous `sleep()` between the notification and the request to reduce the change of the test flaking out. This PR implements a proper fix, which is to use a _request_ instead of a notification for the sandbox update so that we can wait for the response to the sandbox request before sending the request to the `shell` tool call. Previously, `rmcp` did not support custom requests, but I fixed that in https://github.com/modelcontextprotocol/rust-sdk/pull/590, which made it into the `0.12.0` release (see #8288). This PR updates `shell-tool-mcp` to expect `"codex/sandbox-state/update"` as a _request_ instead of a notification and sends the appropriate ack. Note this behavior is tied to our custom `codex/sandbox-state` capability, which Codex honors as an MCP client, which is why `core/src/mcp_connection_manager.rs` had to be updated as part of this PR, as well. This PR also updates the docs at `shell-tool-mcp/README.md`.
2025-12-18 15:32:01 -08:00
async fn send_sandbox_state_update<S>(
sandbox_state: SandboxState,
service: &RunningService<RoleClient, S>,
fix: change codex/sandbox-state/update from a notification to a request (#8142) Historically, `accept_elicitation_for_prompt_rule()` was flaky because we were using a notification to update the sandbox followed by a `shell` tool request that we expected to be subject to the new sandbox config, but because [rmcp](https://crates.io/crates/rmcp) MCP servers delegate each incoming message to a new Tokio task, messages are not guaranteed to be processed in order, so sometimes the `shell` tool call would run before the notification was processed. Prior to this PR, we relied on a generous `sleep()` between the notification and the request to reduce the change of the test flaking out. This PR implements a proper fix, which is to use a _request_ instead of a notification for the sandbox update so that we can wait for the response to the sandbox request before sending the request to the `shell` tool call. Previously, `rmcp` did not support custom requests, but I fixed that in https://github.com/modelcontextprotocol/rust-sdk/pull/590, which made it into the `0.12.0` release (see #8288). This PR updates `shell-tool-mcp` to expect `"codex/sandbox-state/update"` as a _request_ instead of a notification and sends the appropriate ack. Note this behavior is tied to our custom `codex/sandbox-state` capability, which Codex honors as an MCP client, which is why `core/src/mcp_connection_manager.rs` had to be updated as part of this PR, as well. This PR also updates the docs at `shell-tool-mcp/README.md`.
2025-12-18 15:32:01 -08:00
) -> anyhow::Result<ServerResult>
where
S: Service<RoleClient> + ClientHandler,
{
fix: change codex/sandbox-state/update from a notification to a request (#8142) Historically, `accept_elicitation_for_prompt_rule()` was flaky because we were using a notification to update the sandbox followed by a `shell` tool request that we expected to be subject to the new sandbox config, but because [rmcp](https://crates.io/crates/rmcp) MCP servers delegate each incoming message to a new Tokio task, messages are not guaranteed to be processed in order, so sometimes the `shell` tool call would run before the notification was processed. Prior to this PR, we relied on a generous `sleep()` between the notification and the request to reduce the change of the test flaking out. This PR implements a proper fix, which is to use a _request_ instead of a notification for the sandbox update so that we can wait for the response to the sandbox request before sending the request to the `shell` tool call. Previously, `rmcp` did not support custom requests, but I fixed that in https://github.com/modelcontextprotocol/rust-sdk/pull/590, which made it into the `0.12.0` release (see #8288). This PR updates `shell-tool-mcp` to expect `"codex/sandbox-state/update"` as a _request_ instead of a notification and sends the appropriate ack. Note this behavior is tied to our custom `codex/sandbox-state` capability, which Codex honors as an MCP client, which is why `core/src/mcp_connection_manager.rs` had to be updated as part of this PR, as well. This PR also updates the docs at `shell-tool-mcp/README.md`.
2025-12-18 15:32:01 -08:00
let response = service
.send_request(ClientRequest::CustomRequest(CustomRequest::new(
MCP_SANDBOX_STATE_METHOD,
Some(serde_json::to_value(sandbox_state)?),
)))
.await?;
fix: change codex/sandbox-state/update from a notification to a request (#8142) Historically, `accept_elicitation_for_prompt_rule()` was flaky because we were using a notification to update the sandbox followed by a `shell` tool request that we expected to be subject to the new sandbox config, but because [rmcp](https://crates.io/crates/rmcp) MCP servers delegate each incoming message to a new Tokio task, messages are not guaranteed to be processed in order, so sometimes the `shell` tool call would run before the notification was processed. Prior to this PR, we relied on a generous `sleep()` between the notification and the request to reduce the change of the test flaking out. This PR implements a proper fix, which is to use a _request_ instead of a notification for the sandbox update so that we can wait for the response to the sandbox request before sending the request to the `shell` tool call. Previously, `rmcp` did not support custom requests, but I fixed that in https://github.com/modelcontextprotocol/rust-sdk/pull/590, which made it into the `0.12.0` release (see #8288). This PR updates `shell-tool-mcp` to expect `"codex/sandbox-state/update"` as a _request_ instead of a notification and sends the appropriate ack. Note this behavior is tied to our custom `codex/sandbox-state` capability, which Codex honors as an MCP client, which is why `core/src/mcp_connection_manager.rs` had to be updated as part of this PR, as well. This PR also updates the docs at `shell-tool-mcp/README.md`.
2025-12-18 15:32:01 -08:00
Ok(response)
}
pub struct InteractiveClient {
pub elicitations_to_accept: HashSet<String>,
pub elicitation_requests: Arc<Mutex<Vec<CreateElicitationRequestParams>>>,
}
impl ClientHandler for InteractiveClient {
fn get_info(&self) -> ClientInfo {
let capabilities = ClientCapabilities::builder().enable_elicitation().build();
ClientInfo {
capabilities,
..Default::default()
}
}
fn create_elicitation(
&self,
request: CreateElicitationRequestParams,
_context: rmcp::service::RequestContext<RoleClient>,
) -> impl std::future::Future<Output = Result<CreateElicitationResult, McpError>> + Send + '_
{
self.elicitation_requests
.lock()
.unwrap()
.push(request.clone());
let message = match &request {
CreateElicitationRequestParams::FormElicitationParams { message, .. }
| CreateElicitationRequestParams::UrlElicitationParams { message, .. } => message,
};
let accept = self.elicitations_to_accept.contains(message);
async move {
if accept {
Ok(CreateElicitationResult {
action: ElicitationAction::Accept,
content: Some(json!({ "approve": true })),
})
} else {
Ok(CreateElicitationResult {
action: ElicitationAction::Decline,
content: None,
})
}
}
}
}