### Description
#### Summary
Introduces the core plumbing required for structured network approvals
#### What changed
- Added structured network policy decision modeling in core.
- Added approval payload/context types needed for network approval
semantics.
- Wired shell/unified-exec runtime plumbing to consume structured
decisions.
- Updated related core error/event surfaces for structured handling.
- Updated protocol plumbing used by core approval flow.
- Included small CLI debug sandbox compatibility updates needed by this
layer.
#### Why
establishes the minimal backend foundation for network approvals without
yet changing high-level orchestration or TUI behavior.
#### Notes
- Behavior remains constrained by existing requirements/config gating.
- Follow-up PRs in the stack handle orchestration, UX, and app-server
integration.
---------
Co-authored-by: Codex <199175422+chatgpt-codex-connector[bot]@users.noreply.github.com>
## Why
We currently carry multiple permission-related concepts directly on
`Config` for shell/unified-exec behavior (`approval_policy`,
`sandbox_policy`, `network`, `shell_environment_policy`,
`windows_sandbox_mode`).
Consolidating these into one in-memory struct makes permission handling
easier to reason about and sets up the next step: supporting named
permission profiles (`[permissions.PROFILE_NAME]`) without changing
behavior now.
This change is mostly mechanical: it updates existing callsites to go
through `config.permissions`, but it does not yet refactor those
callsites to take a single `Permissions` value in places where multiple
permission fields are still threaded separately.
This PR intentionally **does not** change the on-disk `config.toml`
format yet and keeps compatibility with legacy config keys.
## What Changed
- Introduced `Permissions` in `core/src/config/mod.rs`.
- Added `Config::permissions` and moved effective runtime permission
fields under it:
- `approval_policy`
- `sandbox_policy`
- `network`
- `shell_environment_policy`
- `windows_sandbox_mode`
- Updated config loading/building so these effective values are still
derived from the same existing config inputs and constraints.
- Updated Windows sandbox helpers/resolution to read/write via
`permissions`.
- Threaded the new field through all permission consumers across core
runtime, app-server, CLI/exec, TUI, and sandbox summary code.
- Updated affected tests to reference `config.permissions.*`.
- Renamed the struct/field from
`EffectivePermissions`/`effective_permissions` to
`Permissions`/`permissions` and aligned variable naming accordingly.
## Verification
- `just fix -p codex-core -p codex-tui -p codex-cli -p codex-app-server
-p codex-exec -p codex-utils-sandbox-summary`
- `cargo build -p codex-core -p codex-tui -p codex-cli -p
codex-app-server -p codex-exec -p codex-utils-sandbox-summary`
Reapply "Add app-server transport layer with websocket support" with
additional fixes from https://github.com/openai/codex/pull/11313/changes
to avoid deadlocking.
This reverts commit 47356ff83c.
## Summary
To avoid deadlocking when queues are full, we maintain separate tokio
tasks dedicated to incoming vs outgoing event handling
- split the app-server main loop into two tasks in
`run_main_with_transport`
- inbound handling (`transport_event_rx`)
- outbound handling (`outgoing_rx` + `thread_created_rx`)
- separate incoming and outgoing websocket tasks
## Validation
Integration tests, testing thoroughly e2e in codex app w/ >10 concurrent
requests
<img width="1365" height="979" alt="Screenshot 2026-02-10 at 2 54 22 PM"
src="https://github.com/user-attachments/assets/47ca2c13-f322-4e5c-bedd-25859cbdc45f"
/>
---------
Co-authored-by: jif-oai <jif@openai.com>
We are removing feature-gated shared crates from the `codex-rs`
workspace. `codex-common` grouped several unrelated utilities behind
`[features]`, which made dependency boundaries harder to reason about
and worked against the ongoing effort to eliminate feature flags from
workspace crates.
Splitting these utilities into dedicated crates under `utils/` aligns
this area with existing workspace structure and keeps each dependency
explicit at the crate boundary.
## What changed
- Removed `codex-rs/common` (`codex-common`) from workspace members and
workspace dependencies.
- Added six new utility crates under `codex-rs/utils/`:
- `codex-utils-cli`
- `codex-utils-elapsed`
- `codex-utils-sandbox-summary`
- `codex-utils-approval-presets`
- `codex-utils-oss`
- `codex-utils-fuzzy-match`
- Migrated the corresponding modules out of `codex-common` into these
crates (with tests), and added matching `BUILD.bazel` targets.
- Updated direct consumers to use the new crates instead of
`codex-common`:
- `codex-rs/cli`
- `codex-rs/tui`
- `codex-rs/exec`
- `codex-rs/app-server`
- `codex-rs/mcp-server`
- `codex-rs/chatgpt`
- `codex-rs/cloud-tasks`
- Updated workspace lockfile entries to reflect the new dependency graph
and removal of `codex-common`.
As of this PR, `SessionServices` retains a
`Option<StartedNetworkProxy>`, if appropriate.
Now the `network` field on `Config` is `Option<NetworkProxySpec>`
instead of `Option<NetworkProxy>`.
Over in `Session::new()`, we invoke `NetworkProxySpec::start_proxy()` to
create the `StartedNetworkProxy`, which is a new struct that retains the
`NetworkProxy` as well as the `NetworkProxyHandle`. (Note that `Drop` is
implemented for `NetworkProxyHandle` to ensure the proxies are shutdown
when it is dropped.)
The `NetworkProxy` from the `StartedNetworkProxy` is threaded through to
the appropriate places.
---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/11207).
* #11285
* __->__ #11207
Summary
- add a `required` flag for MCP servers everywhere config/CLI data is
touched so mandatory helpers can be round-tripped
- have `codex exec` and `codex app-server` thread start/resume fail fast
when required MCPs fail to initialize
- Adds --listen <URL> to codex app-server with two listen modes:
- stdio:// (default, existing behavior)
- ws://IP:PORT (new websocket transport)
- Refactors message routing to be connection-aware:
- Tracks per-connection session state (initialize/experimental
capability)
- Routes responses/errors to the originating connection
- Broadcasts server notifications/requests to initialized connections
- Updates initialization semantics to be per connection (not
process-global), and updates app-server docs accordingly.
- Adds websocket accept/read/write handling (JSON-RPC per text frame,
ping/pong handling, connection lifecycle events).
Testing
- Unit tests for transport URL parsing and targeted response/error
routing.
- New websocket integration test validating:
- per-connection initialization requirements
- no cross-connection response leakage
- same request IDs on different connections route independently.
So that the rest of the codebase (like TUI) don't need to be concerned
whether ChatGPT auth was handled by Codex itself or passed in via
app-server's external auth mode.
## Summary
This PR introduces a gated Bubblewrap (bwrap) Linux sandbox path. The
curent Linux sandbox path relies on in-process restrictions (including
Landlock). Bubblewrap gives us a more uniform filesystem isolation
model, especially explicit writable roots with the option to make some
directories read-only and granular network controls.
This is behind a feature flag so we can validate behavior safely before
making it the default.
- Added temporary rollout flag:
- `features.use_linux_sandbox_bwrap`
- Preserved existing default path when the flag is off.
- In Bubblewrap mode:
- Added internal retry without /proc when /proc mount is not permitted
by the host/container.
## Problem being solved
- We need a single, reliable way to mark app-server API surface as
experimental so that:
1. the runtime can reject experimental usage unless the client opts in
2. generated TS/JSON schemas can exclude experimental methods/fields for
stable clients.
Right now that’s easy to drift or miss when done ad-hoc.
## How to declare experimental methods and fields
- **Experimental method**: add `#[experimental("method/name")]` to the
`ClientRequest` variant in `client_request_definitions!`.
- **Experimental field**: on the params struct, derive `ExperimentalApi`
and annotate the field with `#[experimental("method/name.field")]` + set
`inspect_params: true` for the method variant so
`ClientRequest::experimental_reason()` inspects params for experimental
fields.
## How the macro solves it
- The new derive macro lives in
`codex-rs/codex-experimental-api-macros/src/lib.rs` and is used via
`#[derive(ExperimentalApi)]` plus `#[experimental("reason")]`
attributes.
- **Structs**:
- Generates `ExperimentalApi::experimental_reason(&self)` that checks
only annotated fields.
- The “presence” check is type-aware:
- `Option<T>`: `is_some_and(...)` recursively checks inner.
- `Vec`/`HashMap`/`BTreeMap`: must be non-empty.
- `bool`: must be `true`.
- Other types: considered present (returns `true`).
- Registers each experimental field in an `inventory` with `(type_name,
serialized field name, reason)` and exposes `EXPERIMENTAL_FIELDS` for
that type. Field names are converted from `snake_case` to `camelCase`
for schema/TS filtering.
- **Enums**:
- Generates an exhaustive `match` returning `Some(reason)` for annotated
variants and `None` otherwise (no wildcard arm).
- **Wiring**:
- Runtime gating uses `ExperimentalApi::experimental_reason()` in
`codex-rs/app-server/src/message_processor.rs` to reject requests unless
`InitializeParams.capabilities.experimental_api == true`.
- Schema/TS export filters use the inventory list and
`EXPERIMENTAL_CLIENT_METHODS` from `client_request_definitions!` to
strip experimental methods/fields when `experimental_api` is false.
When using ChatGPT in names of types, we should be consistent, so this
renames some types with `ChatGpt` in the name to `Chatgpt`. From
https://rust-lang.github.io/api-guidelines/naming.html:
> In `UpperCamelCase`, acronyms and contractions of compound words count
as one word: use `Uuid` rather than `UUID`, `Usize` rather than `USize`
or `Stdin` rather than `StdIn`. In `snake_case`, acronyms and
contractions are lower-cased: `is_xid_start`.
This PR updates existing uses of `ChatGpt` and changes them to
`Chatgpt`. Though in all cases where it could affect the wire format, I
visually inspected that we don't change anything there. That said, this
_will_ change the codegen because it will affect the spelling of type
names.
For example, this renames `AuthMode::ChatGPT` to `AuthMode::Chatgpt` in
`app-server-protocol`, but the wire format is still `"chatgpt"`.
This PR also updates a number of types in `codex-rs/core/src/auth.rs`.
Previously, `CodexAuth` was defined as follows:
d550fbf41a/codex-rs/core/src/auth.rs (L39-L46)
But if you looked at its constructors, we had creation for
`AuthMode::ApiKey` where `storage` was built using a nonsensical path
(`PathBuf::new()`) and `auth_dot_json` was `None`:
d550fbf41a/codex-rs/core/src/auth.rs (L212-L220)
By comparison, when `AuthMode::ChatGPT` was used, `api_key` was always
`None`:
d550fbf41a/codex-rs/core/src/auth.rs (L665-L671)https://github.com/openai/codex/pull/10012 took things further because
it introduced a new `ChatgptAuthTokens` variant to `AuthMode`, which is
important in when invoking `account/login/start` via the app server, but
most logic _internal_ to the app server should just reason about two
`AuthMode` variants: `ApiKey` and `ChatGPT`.
This PR tries to clean things up as follows:
- `LoginAccountParams` and `AuthMode` in `codex-rs/app-server-protocol/`
both continue to have the `ChatgptAuthTokens` variant, though it is used
exclusively for the on-the-wire messaging.
- `codex-rs/core/src/auth.rs` now has its own `AuthMode` enum, which
only has two variants: `ApiKey` and `ChatGPT`.
- `CodexAuth` has been changed from a struct to an enum. It is a
disjoint union where each variant (`ApiKey`, `ChatGpt`, and
`ChatGptAuthTokens`) have only the associated fields that make sense for
that variant.
---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/10208).
* #10224
* __->__ #10208
Session renaming:
- `/rename my_session`
- `/rename` without arg and passing an argument in `customViewPrompt`
- AppExitInfo shows resume hint using the session name if set instead of
uuid, defaults to uuid if not set
- Names are stored in `CODEX_HOME/sessions.jsonl`
Session resuming:
- codex resume <name> lookup for `CODEX_HOME/sessions.jsonl` first entry
matching the name and resumes the session
---------
Co-authored-by: jif-oai <jif@openai.com>
This enables a new use case where `codex app-server` is embedded into a
parent application that will directly own the user's ChatGPT auth
lifecycle, which means it owns the user’s auth tokens and refreshes it
when necessary. The parent application would just want a way to pass in
the auth tokens for codex to use directly.
The idea is that we are introducing a new "auth mode" currently only
exposed via app server: **`chatgptAuthTokens`** which consist of the
`id_token` (stores account metadata) and `access_token` (the bearer
token used directly for backend API calls). These auth tokens are only
stored in-memory. This new mode is in addition to the existing `apiKey`
and `chatgpt` auth modes.
This PR reuses the shape of our existing app-server account APIs as much
as possible:
- Update `account/login/start` with a new `chatgptAuthTokens` variant,
which will allow the client to pass in the tokens and have codex
app-server use them directly. Upon success, the server emits
`account/login/completed` and `account/updated` notifications.
- A new server->client request called
`account/chatgptAuthTokens/refresh` which the server can use whenever
the access token previously passed in has expired and it needs a new one
from the parent application.
I leveraged the core 401 retry loop which typically triggers auth token
refreshes automatically, but made it pluggable:
- **chatgpt** mode refreshes internally, as usual.
- **chatgptAuthTokens** mode calls the client via
`account/chatgptAuthTokens/refresh`, the client responds with updated
tokens, codex updates its in-memory auth, then retries. This RPC has a
10s timeout and handles JSON-RPC errors from the client.
Also some additional things:
- chatgpt logins are blocked while external auth is active (have to log
out first. typically clients will pick one OR the other, not support
both)
- `account/logout` clears external auth in memory
- Ensures that if `forced_chatgpt_workspace_id` is set via the user's
config, we respect it in both:
- `account/login/start` with `chatgptAuthTokens` (returns a JSON-RPC
error back to the client)
- `account/chatgptAuthTokens/refresh` (fails the turn, and on next
request app-server will send another `account/chatgptAuthTokens/refresh`
request to the client).
## Summary
- add `codex features enable <feature>` and `codex features disable
<feature>`
- persist feature flag changes to `config.toml` (respecting profile)
- print the under-development feature warning when enabling prerelease
features
- keep `features list` behavior unchanged and add unit/integration tests
## Testing
- cargo test -p codex-cli
Fix resume --last prompt parsing by dropping the clap conflict on the
codex resume subcommand so a positional prompt is accepted when --last
is set. This aligns interactive resume behavior with exec-mode logic and
avoids the “--last cannot be used with SESSION_ID” error.
This addresses #6717
### Motivation
- Allow MCP OAuth flows to request scopes defined in `config.toml`
instead of requiring users to always pass `--scopes` on the CLI.
CLI/remote parameters should still override config values.
### Description
- Add optional `scopes: Option<Vec<String>>` to `McpServerConfig` and
`RawMcpServerConfig`, and propagate it through deserialization and the
built config types.
- Serialize `scopes` into the MCP server TOML via
`serialize_mcp_server_table` in `core/src/config/edit.rs` and include
`scopes` in the generated config schema (`core/config.schema.json`).
- CLI: update `codex-rs/cli/src/mcp_cmd.rs` `run_login` to fall back to
`server.scopes` when the `--scopes` flag is empty, with explicit CLI
scopes still taking precedence.
- App server: update
`codex-rs/app-server/src/codex_message_processor.rs`
`mcp_server_oauth_login` to use `params.scopes.or_else(||
server.scopes.clone())` so the RPC path also respects configured scopes.
- Update many test fixtures to initialize the new `scopes` field (set to
`None`) so test code builds with the new struct field.
### Testing
- Ran config tooling and formatters: `just write-config-schema`
(succeeded), `just fmt` (succeeded), and `just fix -p codex-core`, `just
fix -p codex-cli`, `just fix -p codex-app-server` (succeeded where
applicable).
- Ran unit tests for the CLI: `cargo test -p codex-cli` (passed).
- Ran unit tests for core: `cargo test -p codex-core` (ran; many tests
passed but several failed, including model refresh/403-related tests,
shell snapshot/timeouts, and several `unified_exec` expectations).
- Ran app-server tests: `cargo test -p codex-app-server` (ran; many
integration-suite tests failed due to mocked/remote HTTP 401/403
responses and wiremock expectations).
If you want, I can split the tests into smaller focused runs or help
debug the failing integration tests (they appear to be unrelated to the
config change and stem from external HTTP/mocking behaviors encountered
during the test runs).
------
[Codex
Task](https://chatgpt.com/codex/tasks/task_i_69718f505914832ea1f334b3ba064553)
We've recently standardized a [feature maturity
model](https://developers.openai.com/codex/feature-maturity) that we're
using in our docs and support forums to communicate expectations to
users. This PR updates the internal stage names and descriptions to
match.
This change involves a simple internal rename and updates to a few
user-visible strings. No functional change.
## Summary
This PR makes a minor formatting adjustment to a `println!` message by
removing an extra empty line and explicitly using `\n` for clarity.
## Changes
- Adjusted console output formatting for the success message.
- No functional or behavioral changes.
The formatting of `codex features list` made it hard to follow. This PR
introduces column width math to make things nice.
Maybe slightly hard to machine-parse (since not a simple `\t`), but we
should introduce a `--json` option if that's really important.
You can see the before/after in the screenshot:
<img width="1119" height="932" alt="image"
src="https://github.com/user-attachments/assets/c99dce85-899a-4a2d-b4af-003938f5e1df"
/>
## Summary
- Retire the experimental TUI2 implementation and its feature flag.
- Remove TUI2-only config/schema/docs so the CLI stays on the
terminal-native path.
- Keep docs aligned with the legacy TUI while we focus on redraw-based
improvements.
## Customer impact
- Retires the TUI2 experiment and keeps Codex on the proven
terminal-native UI while we invest in redraw-based improvements to the
existing experience.
## Migration / compatibility
- If you previously set tui2-related options in config.toml, they are
now ignored and Codex continues using the existing terminal-native TUI
(no action required).
## Context
- What worked: a transcript-owned viewport delivered excellent resize
rewrap and high-fidelity copy (especially for code).
- Why stop: making that experience feel fully native across the
environment matrix (terminal emulator, OS, input modality, multiplexer,
font/theme, alt-screen behavior) creates a combinatorial explosion of
edge cases.
- What next: we are focusing on redraw-based improvements to the
existing terminal-native TUI so scrolling, selection, and copy remain
native while resize/redraw correctness improves.
## Testing
- just write-config-schema
- just fmt
- cargo clippy --fix --all-features --tests --allow-dirty --allow-no-vcs
-p codex-core
- cargo clippy --fix --all-features --tests --allow-dirty --allow-no-vcs
-p codex-cli
- cargo check
- cargo test -p codex-core
- cargo test -p codex-cli
Using terminal with TERM=dumb specifically mean that TUIs and the like
don't work. Ensure that codex doesn't run in these environments and exit
with odd errors like crossterm's "Error: The cursor position could not
be read within a normal duration"
---------
Co-authored-by: Josh McKinney <joshka@openai.com>
Continuation of breaking up this PR
https://github.com/openai/codex/pull/9116
## Summary
- Thread user text element ranges through TUI/TUI2 input, submission,
queueing, and history so placeholders survive resume/edit flows.
- Preserve local image attachments alongside text elements and rehydrate
placeholders when restoring drafts.
- Keep model-facing content shapes clean by attaching UI metadata only
to user input/events (no API content changes).
## Key Changes
- TUI/TUI2 composer now captures text element ranges, trims them with
text edits, and restores them when submission is suppressed.
- User history cells render styled spans for text elements and keep
local image paths for future rehydration.
- Initial chat widget bootstraps accept empty `initial_text_elements` to
keep initialization uniform.
- Protocol/core helpers updated to tolerate the new InputText field
shape without changing payloads sent to the API.
### What
Add `WebSearchMode` enum (disabled, cached live, defaults to cached) to
config + V2 protocol. This enum takes precedence over legacy flags:
`web_search_cached`, `web_search_request`, and `tools.web_search`.
Keep `--search` as live.
### Tests
Added tests
The underlying issue is that when we encountered an error starting a
conversation (any sort of error, though making `$CODEX_HOME/rules` a
file rather than folder was the example in #8803), then we were writing
the message to stderr, but this could be printed over by our UI
framework so the user would not see it. In general, we disallow the use
of `eprintln!()` in this part of the code for exactly this reason,
though this was suppressed by an `#[allow(clippy::print_stderr)]`.
This attempts to clean things up by changing `handle_event()` and
`handle_tui_event()` to return a `Result<AppRunControl>` instead of a
`Result<bool>`, which is a new type introduced in this PR (and depends
on `ExitReason`, also a new type):
```rust
#[derive(Debug)]
pub(crate) enum AppRunControl {
Continue,
Exit(ExitReason),
}
#[derive(Debug, Clone)]
pub enum ExitReason {
UserRequested,
Fatal(String),
}
```
This makes it possible to exit the primary control flow of the TUI with
richer information. This PR adds `ExitReason` to the existing
`AppExitInfo` struct and updates `handle_app_exit()` to print the error
and exit code `1` in the event of `ExitReason::Fatal`.
I tried to create an integration test for this, but it was a bit
involved, so I published it as a separate PR:
https://github.com/openai/codex/pull/9166. For this PR, please have
faith in my manual testing!
Fixes https://github.com/openai/codex/issues/8803.
---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/9011).
* #9166
* __->__ #9011
Add a new `codex app-server --analytics-default-enabled` CLI flag that
controls whether analytics are enabled by default.
Analytics are disabled by default for app-server. Users have to
explicitly opt in
via the `analytics` section in the config.toml file.
However, for first-party use cases like the VSCode IDE extension, we
default analytics
to be enabled by default by setting this flag. Users can still opt out
by setting this
in their config.toml:
```toml
[analytics]
enabled = false
```
See https://developers.openai.com/codex/config-advanced/#metrics for
more details.
Enterprises want to restrict the MCP servers their users can use.
Admins can now specify an allowlist of MCPs in `requirements.toml`. The
MCP servers are matched on both Name and Transport (local path or HTTP
URL) -- both must match to allow the MCP server. This prevents
circumventing the allowlist by renaming MCP servers in user config. (It
is still possible to replace the local path e.g. rewrite say
`/usr/local/github-mcp` with a nefarious MCP. We could allow hash
pinning in the future, but that would break updates. I also think this
represents a broader, out-of-scope problem.)
We introduce a new field to Constrained: "normalizer". In general, it is
a fn(T) -> T and applies when `Constrained<T>.set()` is called. In this
particular case, it disables MCP servers which do not match the
allowlist. An alternative solution would remove this and instead throw a
ConstraintError. That would stop Codex launching if any MCP server was
configured which didn't match. I think this is bad.
We currently reuse the enabled flag on MCP servers to disable them, but
don't propagate any information about why they are disabled. I'd like to
add that in a follow up PR, possibly by switching out enabled with an
enum.
In action:
```
# MCP server config has two MCPs. We are going to allowlist one of them.
➜ codex git:(gt/restrict-mcps) ✗ cat ~/.codex/config.toml | grep mcp_servers -A1
[mcp_servers.hello_world]
command = "hello-world-mcp"
--
[mcp_servers.docs]
command = "docs-mcp"
# Restrict the MCPs to the hello_world MCP.
➜ codex git:(gt/restrict-mcps) ✗ defaults read com.openai.codex requirements_toml_base64 | base64 -d
[mcp_server_allowlist.hello_world]
command = "hello-world-mcp"
# List the MCPs, observe hello_world is enabled and docs is disabled.
➜ codex git:(gt/restrict-mcps) ✗ just codex mcp list
cargo run --bin codex -- "$@"
Finished `dev` profile [unoptimized + debuginfo] target(s) in 0.25s
Running `target/debug/codex mcp list`
Name Command Args Env Cwd Status Auth
docs docs-mcp - - - disabled Unsupported
hello_world hello-world-mcp - - - enabled Unsupported
# Remove the restrictions.
➜ codex git:(gt/restrict-mcps) ✗ defaults delete com.openai.codex requirements_toml_base64
# Observe both MCPs are enabled.
➜ codex git:(gt/restrict-mcps) ✗ just codex mcp list
cargo run --bin codex -- "$@"
Finished `dev` profile [unoptimized + debuginfo] target(s) in 0.25s
Running `target/debug/codex mcp list`
Name Command Args Env Cwd Status Auth
docs docs-mcp - - - enabled Unsupported
hello_world hello-world-mcp - - - enabled Unsupported
# A new requirements that updates the command to one that does not match.
➜ codex git:(gt/restrict-mcps) ✗ cat ~/requirements.toml
[mcp_server_allowlist.hello_world]
command = "hello-world-mcp-v2"
# Use those requirements.
➜ codex git:(gt/restrict-mcps) ✗ defaults write com.openai.codex requirements_toml_base64 "$(base64 -i /Users/gt/requirements.toml)"
# Observe both MCPs are disabled.
➜ codex git:(gt/restrict-mcps) ✗ just codex mcp list
cargo run --bin codex -- "$@"
Finished `dev` profile [unoptimized + debuginfo] target(s) in 0.75s
Running `target/debug/codex mcp list`
Name Command Args Env Cwd Status Auth
docs docs-mcp - - - disabled Unsupported
hello_world hello-world-mcp - - - disabled Unsupported
```
Currently the callback URI for MCP authentication is dynamically
generated. More specifically, the callback URI is dynamic because the
port part of it is randomly chosen by the OS. This is not ideal as
callback URIs are recommended to be static and many authorization
servers do not support dynamic callback URIs.
This PR fixes that issue by exposing a new config option named
`mcp_oauth_callback_port`. When it is set, the callback URI is
constructed using this port rather than a random one chosen by the OS,
thereby making callback URI static.
Related issue: https://github.com/openai/codex/issues/8827
This PR configures Codex CLI so it can be built with
[Bazel](https://bazel.build) in addition to Cargo. The `.bazelrc`
includes configuration so that remote builds can be done using
[BuildBuddy](https://www.buildbuddy.io).
If you are familiar with Bazel, things should work as you expect, e.g.,
run `bazel test //... --keep-going` to run all the tests in the repo,
but we have also added some new aliases in the `justfile` for
convenience:
- `just bazel-test` to run tests locally
- `just bazel-remote-test` to run tests remotely (currently, the remote
build is for x86_64 Linux regardless of your host platform). Note we are
currently seeing the following test failures in the remote build, so we
still need to figure out what is happening here:
```
failures:
suite::compact::manual_compact_twice_preserves_latest_user_messages
suite::compact_resume_fork::compact_resume_after_second_compaction_preserves_history
suite::compact_resume_fork::compact_resume_and_fork_preserve_model_history_view
```
- `just build-for-release` to build release binaries for all
platforms/architectures remotely
To setup remote execution:
- [Create a buildbuddy account](https://app.buildbuddy.io/) (OpenAI
employees should also request org access at
https://openai.buildbuddy.io/join/ with their `@openai.com` email
address.)
- [Copy your API key](https://app.buildbuddy.io/docs/setup/) to
`~/.bazelrc` (add the line `build
--remote_header=x-buildbuddy-api-key=YOUR_KEY`)
- Use `--config=remote` in your `bazel` invocations (or add `common
--config=remote` to your `~/.bazelrc`, or use the `just` commands)
## CI
In terms of CI, this PR introduces `.github/workflows/bazel.yml`, which
uses Bazel to run the tests _locally_ on Mac and Linux GitHub runners
(we are working on supporting Windows, but that is not ready yet). Note
that the failures we are seeing in `just bazel-remote-test` do not occur
on these GitHub CI jobs, so everything in `.github/workflows/bazel.yml`
is green right now.
The `bazel.yml` uses extra config in `.github/workflows/ci.bazelrc` so
that macOS CI jobs build _remotely_ on Linux hosts (using the
`docker://docker.io/mbolin491/codex-bazel` Docker image declared in the
root `BUILD.bazel`) using cross-compilation to build the macOS
artifacts. Then these artifacts are downloaded locally to GitHub's macOS
runner so the tests can be executed natively. This is the relevant
config that enables this:
```
common:macos --config=remote
common:macos --strategy=remote
common:macos --strategy=TestRunner=darwin-sandbox,local
```
Because of the remote caching benefits we get from BuildBuddy, these new
CI jobs can be extremely fast! For example, consider these two jobs that
ran all the tests on Linux x86_64:
- Bazel 1m37s
https://github.com/openai/codex/actions/runs/20861063212/job/59940545209?pr=8875
- Cargo 9m20s
https://github.com/openai/codex/actions/runs/20861063192/job/59940559592?pr=8875
For now, we will continue to run both the Bazel and Cargo jobs for PRs,
but once we add support for Windows and running Clippy, we should be
able to cutover to using Bazel exclusively for PRs, which should still
speed things up considerably. We will probably continue to run the Cargo
jobs post-merge for commits that land on `main` as a sanity check.
Release builds will also continue to be done by Cargo for now.
Earlier attempt at this PR: https://github.com/openai/codex/pull/8832
Earlier attempt to add support for Buck2, now abandoned:
https://github.com/openai/codex/pull/8504
---------
Co-authored-by: David Zbarsky <dzbarsky@gmail.com>
Co-authored-by: Michael Bolin <mbolin@openai.com>
As explained in https://github.com/openai/codex/issues/8945 and
https://github.com/openai/codex/issues/8472, there are legitimate cases
where users expect processes spawned by Codex to inherit environment
variables such as `LD_LIBRARY_PATH` and `DYLD_LIBRARY_PATH`, where
failing to do so can cause significant performance issues.
This PR removes the use of
`codex_process_hardening::pre_main_hardening()` in Codex CLI (which was
added not in response to a known security issue, but because it seemed
like a prudent thing to do from a security perspective:
https://github.com/openai/codex/pull/4521), but we will continue to use
it in `codex-responses-api-proxy`. At some point, we probably want to
introduce a slightly different version of
`codex_process_hardening::pre_main_hardening()` in Codex CLI that
excludes said environment variables from the Codex process itself, but
continues to propagate them to subprocesses.
Historically we started with a CodexAuth that knew how to refresh it's
own tokens and then added AuthManager that did a different kind of
refresh (re-reading from disk).
I don't think it makes sense for both `CodexAuth` and `AuthManager` to
be mutable and contain behaviors.
Move all refresh logic into `AuthManager` and keep `CodexAuth` as a data
object.
Adds an optional `justification` parameter to the `prefix_rule()`
execpolicy DSL so policy authors can attach human-readable rationale to
a rule. That justification is propagated through parsing/matching and
can be surfaced to the model (or approval UI) when a command is blocked
or requires approval.
When a command is rejected (or gated behind approval) due to policy, a
generic message makes it hard for the model/user to understand what went
wrong and what to do instead. Allowing policy authors to supply a short
justification improves debuggability and helps guide the model toward
compliant alternatives.
Example:
```python
prefix_rule(
pattern = ["git", "push"],
decision = "forbidden",
justification = "pushing is blocked in this repo",
)
```
If Codex tried to run `git push origin main`, now the failure would
include:
```
`git push origin main` rejected: pushing is blocked in this repo
```
whereas previously, all it was told was:
```
execpolicy forbids this command
```
This PR introduces a `codex-utils-cargo-bin` utility crate that
wraps/replaces our use of `assert_cmd::Command` and
`escargot::CargoBuild`.
As you can infer from the introduction of `buck_project_root()` in this
PR, I am attempting to make it possible to build Codex under
[Buck2](https://buck2.build) as well as `cargo`. With Buck2, I hope to
achieve faster incremental local builds (largely due to Buck2's
[dice](https://buck2.build/docs/insights_and_knowledge/modern_dice/)
build strategy, as well as benefits from its local build daemon) as well
as faster CI builds if we invest in remote execution and caching.
See
https://buck2.build/docs/getting_started/what_is_buck2/#why-use-buck2-key-advantages
for more details about the performance advantages of Buck2.
Buck2 enforces stronger requirements in terms of build and test
isolation. It discourages assumptions about absolute paths (which is key
to enabling remote execution). Because the `CARGO_BIN_EXE_*` environment
variables that Cargo provides are absolute paths (which
`assert_cmd::Command` reads), this is a problem for Buck2, which is why
we need this `codex-utils-cargo-bin` utility.
My WIP-Buck2 setup sets the `CARGO_BIN_EXE_*` environment variables
passed to a `rust_test()` build rule as relative paths.
`codex-utils-cargo-bin` will resolve these values to absolute paths,
when necessary.
---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/8496).
* #8498
* __->__ #8496
### Summary
With codesigning on Mac, Windows and Linux, we should be able to safely
remove `features.rmcp_client` and `use_experimental_use_rmcp_client`
check from the codebase now.