core-agent-ide

Author	SHA1	Message	Date
Michael Bolin	abbd74e2be	feat: make sandbox read access configurable with `ReadOnlyAccess` (#11387 ) `SandboxPolicy::ReadOnly` previously implied broad read access and could not express a narrower read surface. This change introduces an explicit read-access model so we can support user-configurable read restrictions in follow-up work, while preserving current behavior today. It also ensures unsupported backends fail closed for restricted-read policies instead of silently granting broader access than intended. ## What - Added `ReadOnlyAccess` in protocol with: - `Restricted { include_platform_defaults, readable_roots }` - `FullAccess` - Updated `SandboxPolicy` to carry read-access configuration: - `ReadOnly { access: ReadOnlyAccess }` - `WorkspaceWrite { ..., read_only_access: ReadOnlyAccess }` - Preserved existing behavior by defaulting current construction paths to `ReadOnlyAccess::FullAccess`. - Threaded the new fields through sandbox policy consumers and call sites across `core`, `tui`, `linux-sandbox`, `windows-sandbox`, and related tests. - Updated Seatbelt policy generation to honor restricted read roots by emitting scoped read rules when full read access is not granted. - Added fail-closed behavior on Linux and Windows backends when restricted read access is requested but not yet implemented there (`UnsupportedOperation`). - Regenerated app-server protocol schema and TypeScript artifacts, including `ReadOnlyAccess`. ## Compatibility / rollout - Runtime behavior remains unchanged by default (`FullAccess`). - API/schema changes are in place so future config wiring can enable restricted read access without another policy-shape migration.	2026-02-11 18:31:14 -08:00
Josh McKinney	34c88d10ea	deflake linux-sandbox NoNewPrivs timeout (#11245 ) Deflake `codex-linux-sandbox::all suite::landlock::test_no_new_privs_is_enabled`. CI has intermittently failed with `Sandbox(Timeout)` (exit 124) because the sandboxed `grep '^NoNewPrivs:' /proc/self/status` can run close to the short timeout budget. This updates only this test to use `LONG_TIMEOUT_MS`, which removes the near-threshold timeout behavior while keeping the rest of the suite unchanged. Refs (previous failures): - PR: https://github.com/openai/codex/actions/runs/21836764823/job/63009902779 - PR: https://github.com/openai/codex/actions/runs/21837427251/job/63012470353 - main: https://github.com/openai/codex/actions/runs/21830746538/job/62988079964 Validation: - Local: `cd codex-rs && cargo test -p codex-linux-sandbox` (non-Linux runs 0 tests)	2026-02-10 03:03:58 +00:00
Michael Bolin	383b45279e	feat: include NetworkConfig through ExecParams (#11105 ) This PR adds the following field to `Config`: ```rust pub network: Option<NetworkProxy>, ``` Though for the moment, it will always be initialized as `None` (this will be addressed in a subsequent PR). This PR does the work to thread `network` through to `execute_exec_env()`, `process_exec_tool_call()`, and `UnifiedExecRuntime.run()` to ensure it is available whenever we span a process.	2026-02-09 03:32:17 +00:00
viyatb-oai	ae4de43ccc	feat(linux-sandbox): add bwrap support (#9938 ) ## Summary This PR introduces a gated Bubblewrap (bwrap) Linux sandbox path. The curent Linux sandbox path relies on in-process restrictions (including Landlock). Bubblewrap gives us a more uniform filesystem isolation model, especially explicit writable roots with the option to make some directories read-only and granular network controls. This is behind a feature flag so we can validate behavior safely before making it the default. - Added temporary rollout flag: - `features.use_linux_sandbox_bwrap` - Preserved existing default path when the flag is off. - In Bubblewrap mode: - Added internal retry without /proc when /proc mount is not permitted by the host/container.	2026-02-04 11:13:17 -08:00
Max Johnson	66b196a725	Inject CODEX_THREAD_ID into the terminal environment (#10096 ) Inject CODEX_THREAD_ID (when applicable) into the terminal environment so that the agent (and skills) can refer to the current thread / session ID. Discussion: https://openai.slack.com/archives/C095U48JNL9/p1769542492067109	2026-02-03 11:31:12 -08:00
iceweasel-oai	c40ad65bd8	remove sandbox globals. (#9797 ) Threads sandbox updates through OverrideTurnContext for active turn Passes computed sandbox type into safety/exec	2026-01-27 11:04:23 -08:00
viyatb-oai	55bda1a0f2	revert: remove pre-Landlock bind mounts apply (#9300 ) Description This removes the pre‑Landlock read‑only bind‑mount step from the Linux sandbox so filesystem restrictions rely solely on Landlock again. `mounts.rs` is kept in place but left unused. The linux‑sandbox README is updated to match the new behavior and manual test expectations.	2026-01-15 09:47:57 -08:00
viyatb-oai	2259031d64	fix: fallback to Landlock-only when user namespaces unavailable and set PR_SET_NO_NEW_PRIVS early (#9250 ) fixes https://github.com/openai/codex/issues/9236 ### Motivation - Prevent sandbox setup from failing when unprivileged user namespaces are denied so Landlock-only protections can still be applied. - Ensure `PR_SET_NO_NEW_PRIVS` is set before installing seccomp and Landlock restrictions to avoid kernel `EPERM`/`LandlockRestrict` ordering issues. ### Description - Add `is_permission_denied` helper that detects `EPERM` / `PermissionDenied` from `CodexErr` to drive fallback logic. - In `apply_read_only_mounts` skip read-only bind-mount setup and return `Ok(())` when `unshare_user_and_mount_namespaces()` fails with permission-denied so Landlock rules can still be installed. - Add `set_no_new_privs()` and call it from `apply_sandbox_policy_to_current_thread` before installing seccomp filters and Landlock rules when disk or network access is restricted.	2026-01-14 22:24:34 -08:00
viyatb-oai	e1447c3009	feat: add support for read-only bind mounts in the linux sandbox (#9112 ) ### Motivation - Landlock alone cannot prevent writes to sensitive in-repo files like `.git/` when the repo root is writable, so explicit mount restrictions are required for those paths. - The sandbox must set up any mounts before calling Landlock so Landlock can still be applied afterwards and the two mechanisms compose correctly. ### Description - Add a new `linux-sandbox` helper `apply_read_only_mounts` in `linux-sandbox/src/mounts.rs` that: unshares namespaces, maps uids/gids when required, makes mounts private, bind-mounts targets, and remounts them read-only. - Wire the mount step into the sandbox flow by calling `apply_read_only_mounts(...)` before network/seccomp and before applying Landlock rules in `linux-sandbox/src/landlock.rs`.	2026-01-14 08:30:46 -08:00
Michael Bolin	642b7566df	fix: introduce AbsolutePathBuf as part of sandbox config (#7856 ) Changes the `writable_roots` field of the `WorkspaceWrite` variant of the `SandboxPolicy` enum from `Vec<PathBuf>` to `Vec<AbsolutePathBuf>`. This is helpful because now callers can be sure the value is an absolute path rather than a relative one. (Though when using an absolute path in a Seatbelt config policy, we still have to _canonicalize_ it first.) Because `writable_roots` can be read from a config file, it is important that we are able to resolve relative paths properly using the parent folder of the config file as the base path.	2025-12-12 15:25:22 -08:00
zhao-oai	e0fb3ca1db	refactoring with_escalated_permissions to use SandboxPermissions instead (#7750 ) helpful in the future if we want more granularity for requesting escalated permissions: e.g when running in readonly sandbox, model can request to escalate to a sandbox that allows writes	2025-12-10 17:18:48 +00:00
Michael Bolin	67975ed33a	refactor: inline sandbox type lookup in process_exec_tool_call (#7122 ) `process_exec_tool_call()` was taking `SandboxType` as a param, but in practice, the only place it was constructed was in `codex_message_processor.rs` where it was derived from the other `sandbox_policy` param, so this PR inlines the logic that decides the `SandboxType` into `process_exec_tool_call()`. --- [//]: # (BEGIN SAPLING FOOTER) Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/openai/codex/pull/7122). * #7112 * __->__ #7122	2025-11-21 22:53:05 +00:00
Michael Bolin	f56d1dc8fc	feat: update process_exec_tool_call() to take a cancellation token (#6972 ) This updates `ExecParams` so that instead of taking `timeout_ms: Option<u64>`, it now takes a more general cancellation mechanism, `ExecExpiration`, which is an enum that includes a `Cancellation(tokio_util::sync::CancellationToken)` variant. If the cancellation token is fired, then `process_exec_tool_call()` returns in the same way as if a timeout was exceeded. This is necessary so that in #6973, we can manage the timeout logic external to the `process_exec_tool_call()` because we want to "suspend" the timeout when an elicitation from a human user is pending. --- [//]: # (BEGIN SAPLING FOOTER) Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/openai/codex/pull/6972). * #7005 * #6973 * __->__ #6972	2025-11-20 16:29:57 -08:00
jif-oai	aa76003e28	chore: unify config crates (#5958 )	2025-10-30 10:28:32 +00:00
jif-oai	5e4f3bbb0b	chore: rework tools execution workflow (#5278 ) Re-work the tool execution flow. Read `orchestrator.rs` to understand the structure	2025-10-20 20:57:37 +01:00
jif-oai	be366a31ab	chore: clippy on redundant closure (#4058 ) Add redundant closure clippy rules and let Codex fix it by minimising FQP	2025-09-22 19:30:16 +00:00
Michael Bolin	8595237505	fix: ensure cwd for conversation and sandbox are separate concerns (#3874 ) Previous to this PR, both of these functions take a single `cwd`: `71038381aa/codex-rs/core/src/seatbelt.rs (L19-L25)` `71038381aa/codex-rs/core/src/landlock.rs (L16-L23)` whereas `cwd` and `sandbox_cwd` should be set independently (fixed in this PR). Added `sandbox_distinguishes_command_and_policy_cwds()` to `codex-rs/exec/tests/suite/sandbox.rs` to verify this.	2025-09-18 14:37:06 -07:00
pakrym-oai	863d9c237e	Include command output when sending timeout to model (#3576 ) Being able to see the output helps the model decide how to handle the timeout.	2025-09-14 14:38:26 -07:00
Jeremy Rose	32bbbbad61	test: faster test execution in codex-core (#2633 ) this dramatically improves time to run `cargo test -p codex-core` (~25x speedup). before: ``` cargo test -p codex-core 35.96s user 68.63s system 19% cpu 8:49.80 total ``` after: ``` cargo test -p codex-core 5.51s user 8.16s system 63% cpu 21.407 total ``` both tests measured "hot", i.e. on a 2nd run with no filesystem changes, to exclude compile times. approach inspired by [Delete Cargo Integration Tests](https://matklad.github.io/2021/02/27/delete-cargo-integration-tests.html), we move all test cases in tests/ into a single suite in order to have a single binary, as there is significant overhead for each test binary executed, and because test execution is only parallelized with a single binary.	2025-08-24 11:10:53 -07:00
Parker Thompson	a075424437	Added `allow-expect-in-tests` / `allow-unwrap-in-tests` (#2328 ) This PR: * Added the clippy.toml to configure allowable expect / unwrap usage in tests * Removed as many expect/allow lines as possible from tests * moved a bunch of allows to expects where possible Note: in integration tests, non `#[test]` helper functions are not covered by this so we had to leave a few lingering `expect(expect_used` checks around	2025-08-14 17:59:01 -07:00
Michael Bolin	08ed618f72	chore: introduce ConversationManager as a clearinghouse for all conversations (#2240 ) This PR does two things because after I got deep into the first one I started pulling on the thread to the second: - Makes `ConversationManager` the place where all in-memory conversations are created and stored. Previously, `MessageProcessor` in the `codex-mcp-server` crate was doing this via its `session_map`, but this is something that should be done in `codex-core`. - It unwinds the `ctrl_c: tokio::sync::Notify` that was threaded throughout our code. I think this made sense at one time, but now that we handle Ctrl-C within the TUI and have a proper `Op::Interrupt` event, I don't think this was quite right, so I removed it. For `codex exec` and `codex proto`, we now use `tokio::signal::ctrl_c()` directly, but we no longer make `Notify` a field of `Codex` or `CodexConversation`. Changes of note: - Adds the files `conversation_manager.rs` and `codex_conversation.rs` to `codex-core`. - `Codex` and `CodexSpawnOk` are no longer exported from `codex-core`: other crates must use `CodexConversation` instead (which is created via `ConversationManager`). - `core/src/codex_wrapper.rs` has been deleted in favor of `ConversationManager`. - `ConversationManager::new_conversation()` returns `NewConversation`, which is in line with the `new_conversation` tool we want to add to the MCP server. Note `NewConversation` includes `SessionConfiguredEvent`, so we eliminate checks in cases like `codex-rs/core/tests/client.rs` to verify `SessionConfiguredEvent` is the first event because that is now internal to `ConversationManager`. - Quite a bit of code was deleted from `codex-rs/mcp-server/src/message_processor.rs` since it no longer has to manage multiple conversations itself: it goes through `ConversationManager` instead. - `core/tests/live_agent.rs` has been deleted because I had to update a bunch of tests and all the tests in here were ignored, and I don't think anyone ever ran them, so this was just technical debt, at this point. - Removed `notify_on_sigint()` from `util.rs` (and in a follow-up, I hope to refactor the blandly-named `util.rs` into more descriptive files). - In general, I started replacing local variables named `codex` as `conversation`, where appropriate, though admittedly I didn't do it through all the integration tests because that would have added a lot of noise to this PR. --- [//]: # (BEGIN SAPLING FOOTER) Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/openai/codex/pull/2240). * #2264 * #2263 * __->__ #2240	2025-08-13 13:38:18 -07:00
pakrym-oai	0cf57e1f42	Include output truncation message in tool call results (#2183 ) To avoid model being confused about incomplete output.	2025-08-11 11:52:05 -07:00
Michael Bolin	cd5f9074af	feat: add /tmp by default (#1919 ) Replaces the `include_default_writable_roots` option on `sandbox_workspace_write` (that defaulted to `true`, which was slightly weird/annoying) with `exclude_tmpdir_env_var`, which defaults to `false`. Though perhaps more importantly `/tmp` is now enabled by default as part of `sandbox_mode = "workspace-write"`, though `exclude_slash_tmp = false` can be used to disable this.	2025-08-07 00:17:00 -07:00
Dylan	725dd6be6a	[approval_policy] Add OnRequest approval_policy (#1865 ) ## Summary A split-up PR of #1763 , stacked on top of a tools refactor #1858 to make the change clearer. From the previous summary: > Let's try something new: tell the model about the sandbox, and let it decide when it will need to break the sandbox. Some local testing suggests that it works pretty well with zero iteration on the prompt! ## Testing - [x] Added unit tests - [x] Tested locally and it appears to work smoothly!	2025-08-05 20:44:20 -07:00
Michael Bolin	92f3566d78	chore: introduce SandboxPolicy::WorkspaceWrite::include_default_writable_roots (#1785 ) Without this change, it is challenging to create integration tests to verify that the folders not included in `writable_roots` in `SandboxPolicy::WorkspaceWrite` are read-only because, by default, `get_writable_roots_with_cwd()` includes `TMPDIR`, which is where most integrationt tests do their work. This introduces a `use_exact_writable_roots` option to disable the default includes returned by `get_writable_roots_with_cwd()`. --- [//]: # (BEGIN SAPLING FOOTER) Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/openai/codex/pull/1785). * #1765 * __->__ #1785	2025-08-01 14:15:55 -07:00
aibrahim-oai	bc7beddaa2	feat: stream exec stdout events (#1786 ) ## Summary - stream command stdout as `ExecCommandStdout` events - forward streamed stdout to clients and ignore in human output processor - adjust call sites for new streaming API	2025-08-01 13:04:34 -07:00
Rene Leonhardt	82b0cebe8b	chore(rs): update dependencies (#1494 ) ### Chores - Update cargo dependencies - Remove unused cargo dependencies - Fix clippy warnings - Update Dockerfile (package.json requires node 22) - Let Dependabot update bun, cargo, devcontainers, docker, github-actions, npm (nix still not supported) ### TODO - Upgrade dependencies with breaking changes ```shell $ cargo update --verbose Unchanged crossterm v0.28.1 (available: v0.29.0) Unchanged schemars v0.8.22 (available: v1.0.4) ```	2025-07-10 11:08:16 -07:00
Michael Bolin	0776d78357	feat: redesign sandbox config (#1373 ) This is a major redesign of how sandbox configuration works and aims to fix https://github.com/openai/codex/issues/1248. Specifically, it replaces `sandbox_permissions` in `config.toml` (and the `-s`/`--sandbox-permission` CLI flags) with a "table" with effectively three variants: ```toml # Safest option: full disk is read-only, but writes and network access are disallowed. [sandbox] mode = "read-only" # The cwd of the Codex task is writable, as well as $TMPDIR on macOS. # writable_roots can be used to specify additional writable folders. [sandbox] mode = "workspace-write" writable_roots = [] # Optional, defaults to the empty list. network_access = false # Optional, defaults to false. # Disable sandboxing: use at your own risk!!! [sandbox] mode = "danger-full-access" ``` This should make sandboxing easier to reason about. While we have dropped support for `-s`, the way it works now is: - no flags => `read-only` - `--full-auto` => `workspace-write` - currently, there is no way to specify `danger-full-access` via a CLI flag, but we will revisit that as part of https://github.com/openai/codex/issues/1254 Outstanding issue: - As noted in the `TODO` on `SandboxPolicy::is_unrestricted()`, we are still conflating sandbox preferences with approval preferences in that case, which needs to be cleaned up.	2025-06-24 16:59:47 -07:00
Michael Bolin	9db53b33aa	fix: support arm64 build for Linux (#1225 ) Users were running into issues with glibc mismatches on arm64 linux. In the past, we did not provide a musl build for arm64 Linux because we had trouble getting the openssl dependency to build correctly. Though today I just tried the same trick in `Cargo.toml` that we were doing for `x86_64-unknown-linux-musl` (using `openssl-sys` with `features = ["vendored"]`), so I'm not sure what problem we had in the past the builds "just worked" today! Though one tweak that did have to be made is that the integration tests for Seccomp/Landlock empirically require longer timeouts on arm64 linux, or at least on the `ubuntu-24.04-arm` GitHub Runner. As such, we change the timeouts for arm64 in `codex-rs/linux-sandbox/tests/landlock.rs`. Though in solving this problem, I decided I needed a turnkey solution for testing the Linux build(s) from my Mac laptop, so this PR introduces `.devcontainer/Dockerfile` and `.devcontainer/devcontainer.json` to facilitate this. Detailed instructions are in `.devcontainer/README.md`. We will update `dotslash-config.json` and other release-related scripts in a follow-up PR.	2025-06-05 20:29:46 -07:00
Michael Bolin	89ef4efdcf	fix: overhaul how we spawn commands under seccomp/landlock on Linux (#1086 ) Historically, we spawned the Seatbelt and Landlock sandboxes in substantially different ways: For Seatbelt, we would run `/usr/bin/sandbox-exec` with our policy specified as an arg followed by the original command: `d1de7bb383/codex-rs/core/src/exec.rs (L147-L219)` For Landlock/Seccomp, we would do `tokio::runtime::Builder::new_current_thread()`, _invoke Landlock/Seccomp APIs to modify the permissions of that new thread_, and then spawn the command: `d1de7bb383/codex-rs/core/src/exec_linux.rs (L28-L49)` While it is neat that Landlock/Seccomp supports applying a policy to only one thread without having to apply it to the entire process, it requires us to maintain two different codepaths and is a bit harder to reason about. The tipping point was https://github.com/openai/codex/pull/1061, in which we had to start building up the `env` in an unexpected way for the existing Landlock/Seccomp approach to continue to work. This PR overhauls things so that we do similar things for Mac and Linux. It turned out that we were already building our own "helper binary" comparable to Mac's `sandbox-exec` as part of the `cli` crate: `d1de7bb383/codex-rs/cli/Cargo.toml (L10-L12)` We originally created this to build a small binary to include with the Node.js version of the Codex CLI to provide support for Linux sandboxing. Though the sticky bit is that, at this point, we still want to deploy the Rust version of Codex as a single, standalone binary rather than a CLI and a supporting sandboxing binary. To satisfy this goal, we use "the arg0 trick," in which we: * use `std::env::current_exe()` to get the path to the CLI that is currently running * use the CLI as the `program` for the `Command` * set `"codex-linux-sandbox"` as arg0 for the `Command` A CLI that supports sandboxing should check arg0 at the start of the program. If it is `"codex-linux-sandbox"`, it must invoke `codex_linux_sandbox::run_main()`, which runs the CLI as if it were `codex-linux-sandbox`. When acting as `codex-linux-sandbox`, we make the appropriate Landlock/Seccomp API calls and then use `execvp(3)` to spawn the original command, so do _replace_ the process rather than spawn a subprocess. Incidentally, we do this before starting the Tokio runtime, so the process should only have one thread when `execvp(3)` is called. Because the `core` crate that needs to spawn the Linux sandboxing is not a CLI in its own right, this means that every CLI that includes `core` and relies on this behavior has to (1) implement it and (2) provide the path to the sandboxing executable. While the path is almost always `std::env::current_exe()`, we needed to make this configurable for integration tests, so `Config` now has a `codex_linux_sandbox_exe: Option<PathBuf>` property to facilitate threading this through, introduced in https://github.com/openai/codex/pull/1089. This common pattern is now captured in `codex_linux_sandbox::run_with_sandbox()` and all of the `main.rs` functions that should use it have been updated as part of this PR. The `codex-linux-sandbox` crate added to the Cargo workspace as part of this PR now has the bulk of the Landlock/Seccomp logic, which makes `core` a bit simpler. Indeed, `core/src/exec_linux.rs` and `core/src/landlock.rs` were removed/ported as part of this PR. I also moved the unit tests for this code into an integration test, `linux-sandbox/tests/landlock.rs`, in which I use `env!("CARGO_BIN_EXE_codex-linux-sandbox")` as the value for `codex_linux_sandbox_exe` since `std::env::current_exe()` is not appropriate in that case.	2025-05-23 11:37:07 -07:00

30 commits