core-agent-ide

Author	SHA1	Message	Date
Gabriel Peal	bd3ce98190	Bump rmcp to 0.15 (#11539 ) https://github.com/modelcontextprotocol/rust-sdk/pull/598 in 0.14 broke some MCP oauth (like Linear) and https://github.com/modelcontextprotocol/rust-sdk/pull/641 fixed it in 0.15	2026-02-11 22:04:17 -08:00
Josh McKinney	de59e550c0	test: deflake nextest child-process leak in MCP harnesses (#11263 ) ## Summary - add deterministic child-process cleanup to both test `McpProcess` helpers - keep Tokio `kill_on_drop(true)` but also reap via bounded `try_wait()` polling in `Drop` - document the failure mode and why this avoids nondeterministic `LEAK` flakes ## Why `cargo nextest` leak detection can intermittently report `LEAK` when a spawned server outlives test teardown, making CI flaky. ## Testing - `just fmt` - `cargo test -p codex-app-server` - `cargo test -p codex-mcp-server` ## Failing CI Reference - Original failing job: https://github.com/openai/codex/actions/runs/21845226299/job/63039443593?pr=11245	2026-02-10 03:43:24 +00:00
Matthew Zeng	9f1009540b	Upgrade rmcp to 0.14 (#10718 ) - [x] Upgrade rmcp to 0.14	2026-02-08 15:07:53 -08:00
jif-oai	d2394a2494	chore: nuke chat/completions API (#10157 )	2026-02-03 11:31:57 +00:00
Michael Bolin	66447d5d2c	feat: replace custom mcp-types crate with equivalents from rmcp (#10349 ) We started working with MCP in Codex before https://crates.io/crates/rmcp was mature, so we had our own crate for MCP types that was generated from the MCP schema: `8b95d3e082/codex-rs/mcp-types/README.md` Now that `rmcp` is more mature, it makes more sense to use their MCP types in Rust, as they handle details (like the `_meta` field) that our custom version ignored. Though one advantage that our custom types had is that our generated types implemented `JsonSchema` and `ts_rs::TS`, whereas the types in `rmcp` do not. As such, part of the work of this PR is leveraging the adapters between `rmcp` types and the serializable types that are API for us (app server and MCP) introduced in #10356. Note this PR results in a number of changes to `codex-rs/app-server-protocol/schema`, which merit special attention during review. We must ensure that these changes are still backwards-compatible, which is possible because we have: ```diff - export type CallToolResult = { content: Array<ContentBlock>, isError?: boolean, structuredContent?: JsonValue, }; + export type CallToolResult = { content: Array<JsonValue>, structuredContent?: JsonValue, isError?: boolean, _meta?: JsonValue, }; ``` so `ContentBlock` has been replaced with the more general `JsonValue`. Note that `ContentBlock` was defined as: ```typescript export type ContentBlock = TextContent \| ImageContent \| AudioContent \| ResourceLink \| EmbeddedResource; ``` so the deletion of those individual variants should not be a cause of great concern. Similarly, we have the following change in `codex-rs/app-server-protocol/schema/typescript/Tool.ts`: ``` - export type Tool = { annotations?: ToolAnnotations, description?: string, inputSchema: ToolInputSchema, name: string, outputSchema?: ToolOutputSchema, title?: string, }; + export type Tool = { name: string, title?: string, description?: string, inputSchema: JsonValue, outputSchema?: JsonValue, annotations?: JsonValue, icons?: Array<JsonValue>, _meta?: JsonValue, }; ``` so: - `annotations?: ToolAnnotations` ➡️ `JsonValue` - `inputSchema: ToolInputSchema` ➡️ `JsonValue` - `outputSchema?: ToolOutputSchema` ➡️ `JsonValue` and two new fields: `icons?: Array<JsonValue>, _meta?: JsonValue` --- [//]: # (BEGIN SAPLING FOOTER) Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/openai/codex/pull/10349). * #10357 * __->__ #10349 * #10356	2026-02-02 17:41:55 -08:00
Shijie Rao	57ec3a8277	Feat: request user input tool (#9472 ) ### Summary * Add `requestUserInput` tool that the model can use for gather feedback/asking question mid turn. ### Tool input schema ``` { "$schema": "http://json-schema.org/draft-07/schema#", "title": "requestUserInput input", "type": "object", "additionalProperties": false, "required": ["questions"], "properties": { "questions": { "type": "array", "description": "Questions to show the user (1-3). Prefer 1 unless multiple independent decisions block progress.", "minItems": 1, "maxItems": 3, "items": { "type": "object", "additionalProperties": false, "required": ["id", "header", "question"], "properties": { "id": { "type": "string", "description": "Stable identifier for mapping answers (snake_case)." }, "header": { "type": "string", "description": "Short header label shown in the UI (12 or fewer chars)." }, "question": { "type": "string", "description": "Single-sentence prompt shown to the user." }, "options": { "type": "array", "description": "Optional 2-3 mutually exclusive choices. Put the recommended option first and suffix its label with \"(Recommended)\". Only include \"Other\" option if we want to include a free form option. If the question is free form in nature, do not include any option.", "minItems": 2, "maxItems": 3, "items": { "type": "object", "additionalProperties": false, "required": ["value", "label", "description"], "properties": { "value": { "type": "string", "description": "Machine-readable value (snake_case)." }, "label": { "type": "string", "description": "User-facing label (1-5 words)." }, "description": { "type": "string", "description": "One short sentence explaining impact/tradeoff if selected." } } } } } } } } } ``` ### Tool output schema ``` { "$schema": "http://json-schema.org/draft-07/schema#", "title": "requestUserInput output", "type": "object", "additionalProperties": false, "required": ["answers"], "properties": { "answers": { "type": "object", "description": "Map of question id to user answer.", "additionalProperties": { "type": "object", "additionalProperties": false, "required": ["selected"], "properties": { "selected": { "type": "array", "items": { "type": "string" } }, "other": { "type": ["string", "null"] } } } } } } ```	2026-01-19 10:17:30 -08:00
zbarsky-openai	2a06d64bc9	feat: add support for building with Bazel (#8875 ) This PR configures Codex CLI so it can be built with [Bazel](https://bazel.build) in addition to Cargo. The `.bazelrc` includes configuration so that remote builds can be done using [BuildBuddy](https://www.buildbuddy.io). If you are familiar with Bazel, things should work as you expect, e.g., run `bazel test //... --keep-going` to run all the tests in the repo, but we have also added some new aliases in the `justfile` for convenience: - `just bazel-test` to run tests locally - `just bazel-remote-test` to run tests remotely (currently, the remote build is for x86_64 Linux regardless of your host platform). Note we are currently seeing the following test failures in the remote build, so we still need to figure out what is happening here: ``` failures: suite::compact::manual_compact_twice_preserves_latest_user_messages suite::compact_resume_fork::compact_resume_after_second_compaction_preserves_history suite::compact_resume_fork::compact_resume_and_fork_preserve_model_history_view ``` - `just build-for-release` to build release binaries for all platforms/architectures remotely To setup remote execution: - [Create a buildbuddy account](https://app.buildbuddy.io/) (OpenAI employees should also request org access at https://openai.buildbuddy.io/join/ with their `@openai.com` email address.) - [Copy your API key](https://app.buildbuddy.io/docs/setup/) to `~/.bazelrc` (add the line `build --remote_header=x-buildbuddy-api-key=YOUR_KEY`) - Use `--config=remote` in your `bazel` invocations (or add `common --config=remote` to your `~/.bazelrc`, or use the `just` commands) ## CI In terms of CI, this PR introduces `.github/workflows/bazel.yml`, which uses Bazel to run the tests _locally_ on Mac and Linux GitHub runners (we are working on supporting Windows, but that is not ready yet). Note that the failures we are seeing in `just bazel-remote-test` do not occur on these GitHub CI jobs, so everything in `.github/workflows/bazel.yml` is green right now. The `bazel.yml` uses extra config in `.github/workflows/ci.bazelrc` so that macOS CI jobs build _remotely_ on Linux hosts (using the `docker://docker.io/mbolin491/codex-bazel` Docker image declared in the root `BUILD.bazel`) using cross-compilation to build the macOS artifacts. Then these artifacts are downloaded locally to GitHub's macOS runner so the tests can be executed natively. This is the relevant config that enables this: ``` common:macos --config=remote common:macos --strategy=remote common:macos --strategy=TestRunner=darwin-sandbox,local ``` Because of the remote caching benefits we get from BuildBuddy, these new CI jobs can be extremely fast! For example, consider these two jobs that ran all the tests on Linux x86_64: - Bazel 1m37s https://github.com/openai/codex/actions/runs/20861063212/job/59940545209?pr=8875 - Cargo 9m20s https://github.com/openai/codex/actions/runs/20861063192/job/59940559592?pr=8875 For now, we will continue to run both the Bazel and Cargo jobs for PRs, but once we add support for Windows and running Clippy, we should be able to cutover to using Bazel exclusively for PRs, which should still speed things up considerably. We will probably continue to run the Cargo jobs post-merge for commits that land on `main` as a sanity check. Release builds will also continue to be done by Cargo for now. Earlier attempt at this PR: https://github.com/openai/codex/pull/8832 Earlier attempt to add support for Buck2, now abandoned: https://github.com/openai/codex/pull/8504 --------- Co-authored-by: David Zbarsky <dzbarsky@gmail.com> Co-authored-by: Michael Bolin <mbolin@openai.com>	2026-01-09 11:09:43 -08:00
jif-oai	1aed01e99f	renaming: task to turn (#8963 )	2026-01-09 17:31:17 +00:00
Michael Bolin	e61bae12e3	feat: introduce codex-utils-cargo-bin as an alternative to assert_cmd::Command (#8496 ) This PR introduces a `codex-utils-cargo-bin` utility crate that wraps/replaces our use of `assert_cmd::Command` and `escargot::CargoBuild`. As you can infer from the introduction of `buck_project_root()` in this PR, I am attempting to make it possible to build Codex under [Buck2](https://buck2.build) as well as `cargo`. With Buck2, I hope to achieve faster incremental local builds (largely due to Buck2's [dice](https://buck2.build/docs/insights_and_knowledge/modern_dice/) build strategy, as well as benefits from its local build daemon) as well as faster CI builds if we invest in remote execution and caching. See https://buck2.build/docs/getting_started/what_is_buck2/#why-use-buck2-key-advantages for more details about the performance advantages of Buck2. Buck2 enforces stronger requirements in terms of build and test isolation. It discourages assumptions about absolute paths (which is key to enabling remote execution). Because the `CARGO_BIN_EXE_` environment variables that Cargo provides are absolute paths (which `assert_cmd::Command` reads), this is a problem for Buck2, which is why we need this `codex-utils-cargo-bin` utility. My WIP-Buck2 setup sets the `CARGO_BIN_EXE_` environment variables passed to a `rust_test()` build rule as relative paths. `codex-utils-cargo-bin` will resolve these values to absolute paths, when necessary. --- [//]: # (BEGIN SAPLING FOOTER) Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/openai/codex/pull/8496). * #8498 * __->__ #8496	2025-12-23 19:29:32 -08:00
jif-oai	29381ba5c2	feat: add shell snapshot for shell command (#7786 )	2025-12-11 13:46:43 +00:00
Josh McKinney	ec49b56874	chore: add cargo-deny configuration (#7119 ) - add GitHub workflow running cargo-deny on push/PR - document cargo-deny allowlist with workspace-dep notes and advisory ignores - align workspace crates to inherit version/edition/license for consistent checks	2025-11-24 12:22:18 -08:00
pakrym-oai	767b66f407	Migrate coverage to shell_command (#7042 )	2025-11-21 03:44:00 +00:00
Michael Bolin	d9dbf48828	fix: separate `codex mcp` into `codex mcp-server` and `codex app-server` (#4471 ) This is a very large PR with some non-backwards-compatible changes. Historically, `codex mcp` (or `codex mcp serve`) started a JSON-RPC-ish server that had two overlapping responsibilities: - Running an MCP server, providing some basic tool calls. - Running the app server used to power experiences such as the VS Code extension. This PR aims to separate these into distinct concepts: - `codex mcp-server` for the MCP server - `codex app-server` for the "application server" Note `codex mcp` still exists because it already has its own subcommands for MCP management (`list`, `add`, etc.) The MCP logic continues to live in `codex-rs/mcp-server` whereas the refactored app server logic is in the new `codex-rs/app-server` folder. Note that most of the existing integration tests in `codex-rs/mcp-server/tests/suite` were actually for the app server, so all the tests have been moved with the exception of `codex-rs/mcp-server/tests/suite/mod.rs`. Because this is already a large diff, I tried not to change more than I had to, so `codex-rs/app-server/tests/common/mcp_process.rs` still uses the name `McpProcess` for now, but I will do some mechanical renamings to things like `AppServer` in subsequent PRs. While `mcp-server` and `app-server` share some overlapping functionality (like reading streams of JSONL and dispatching based on message types) and some differences (completely different message types), I ended up doing a bit of copypasta between the two crates, as both have somewhat similar `message_processor.rs` and `outgoing_message.rs` files for now, though I expect them to diverge more in the near future. One material change is that of the initialize handshake for `codex app-server`, as we no longer use the MCP types for that handshake. Instead, we update `codex-rs/protocol/src/mcp_protocol.rs` to add an `Initialize` variant to `ClientRequest`, which takes the `ClientInfo` object we need to update the `USER_AGENT_SUFFIX` in `codex-rs/app-server/src/message_processor.rs`. One other material change is in `codex-rs/app-server/src/codex_message_processor.rs` where I eliminated a use of the `send_event_as_notification()` method I am generally trying to deprecate (because it blindly maps an `EventMsg` into a `JSONNotification`) in favor of `send_server_notification()`, which takes a `ServerNotification`, as that is intended to be a custom enum of all notification types supported by the app server. So to make this update, I had to introduce a new variant of `ServerNotification`, `SessionConfigured`, which is a non-backwards compatible change with the old `codex mcp`, and clients will have to be updated after the next release that contains this PR. Note that `codex-rs/app-server/tests/suite/list_resume.rs` also had to be update to reflect this change. I introduced `codex-rs/utils/json-to-toml/src/lib.rs` as a small utility crate to avoid some of the copying between `mcp-server` and `app-server`.	2025-09-30 07:06:18 +00:00
Dylan	197f45a3be	[mcp-server] Expose fuzzy file search in MCP (#2677 ) ## Summary Expose a simple fuzzy file search implementation for mcp clients to work with ## Testing - [x] Tested locally	2025-09-29 12:19:09 -07:00
jif-oai	e5fe50d3ce	chore: unify cargo versions (#4044 ) Unify cargo versions at root	2025-09-22 16:47:01 +00:00
Michael Bolin	c172e8e997	feat: added SetDefaultModel to JSON-RPC server (#3512 ) This adds `SetDefaultModel`, which takes `model` and `reasoning_effort` as optional fields. If set, the field will overwrite what is in the user's `config.toml`. This reuses logic that was added to support the `/model` command in the TUI: https://github.com/openai/codex/pull/2799.	2025-09-11 23:44:17 -07:00
Eric Traut	e13b35ecb0	Simplify auth flow and reconcile differences between ChatGPT and API Key auth (#3189 ) This PR does the following: * Adds the ability to paste or type an API key. * Removes the `preferred_auth_method` config option. The last login method is always persisted in auth.json, so this isn't needed. * If OPENAI_API_KEY env variable is defined, the value is used to prepopulate the new UI. The env variable is otherwise ignored by the CLI. * Adds a new MCP server entry point "login_api_key" so we can implement this same API key behavior for the VS Code extension. <img width="473" height="140" alt="Screenshot 2025-09-04 at 3 51 04 PM" src="https://github.com/user-attachments/assets/c11bbd5b-8a4d-4d71-90fd-34130460f9d9" /> <img width="726" height="254" alt="Screenshot 2025-09-04 at 3 51 32 PM" src="https://github.com/user-attachments/assets/6cc76b34-309a-4387-acbc-15ee5c756db9" />	2025-09-11 09:16:34 -07:00
Michael Bolin	65f3528cad	feat: add UserInfo request to JSON-RPC server (#3428 ) This adds a simple endpoint that provides the email address encoded in `$CODEX_HOME/auth.json`. As noted, for now, we do not hit the server to verify this is the user's true email address.	2025-09-10 17:03:35 -07:00
Michael Bolin	44262d8fd8	fix: ensure output of codex-rs/mcp-types/generate_mcp_types.py matches codex-rs/mcp-types/src/lib.rs (#3439 ) https://github.com/openai/codex/pull/3395 updated `mcp-types/src/lib.rs` by hand, but that file is generated code that is produced by `mcp-types/generate_mcp_types.py`. Unfortunately, we do not have anything in CI to verify this right now, but I will address that in a subsequent PR. #3395 ended up introducing a change that added a required field when deserializing `InitializeResult`, breaking Codex when used as an MCP client, so the quick fix in #3436 was to make the new field `Optional` with `skip_serializing_if = "Option::is_none"`, but that did not address the problem that `mcp-types/generate_mcp_types.py` and `mcp-types/src/lib.rs` are out of sync. This PR gets things back to where they are in sync. It removes the custom `mcp_types::McpClientInfo` type that was added to `mcp-types/src/lib.rs` and forces us to use the generated `mcp_types::Implementation` type. Though this PR also updates `generate_mcp_types.py` to generate the additional `user_agent: Optional<String>` field on `Implementation` so that we can continue to specify it when Codex operates as an MCP server. However, this also requires us to specify `user_agent: None` when Codex operates as an MCP client. We may want to introduce our own `InitializeResult` type that is specific to when we run as a server to avoid this in the future, but my immediate goal is just to get things back in sync.	2025-09-10 16:14:41 -07:00
Eric Traut	acb28bf914	Improved resiliency of two auth-related tests (#3427 ) This PR improves two existing auth-related tests. They were failing when run in an environment where an `OPENAI_API_KEY` env variable was defined. The change makes them more resilient.	2025-09-10 11:46:02 -07:00
Gabriel Peal	8636bff46d	Set a user agent suffix when used as a mcp server (#3395 ) This automatically adds a user agent suffix whenever the CLI is used as a MCP server	2025-09-10 02:32:57 +00:00
Michael Bolin	ace14e8d36	feat: add ArchiveConversation to ClientRequest (#3353 ) Adds support for `ArchiveConversation` in the JSON-RPC server that takes a `(ConversationId, PathBuf)` pair and: - verifies the `ConversationId` corresponds to the rollout id at the `PathBuf` - if so, invokes `ConversationManager.remove_conversation(ConversationId)` - if the `CodexConversation` was in memory, send `Shutdown` and wait for `ShutdownComplete` with a timeout - moves the `.jsonl` file to `$CODEX_HOME/archived_sessions` --------- Co-authored-by: Gabriel Peal <gabriel@openai.com>	2025-09-09 11:39:00 -04:00
jif-oai	a9c68ea270	feat: Run cargo shear during CI (#3338 ) Run cargo shear as part of the CI to ensure no unused dependencies	2025-09-09 01:05:08 +00:00
Gabriel Peal	5c1416d99b	Add a getUserAgent MCP method (#3320 ) This will allow the extension to pass this user agent + a suffix for its requests	2025-09-08 13:30:13 -04:00
Ahmed Ibrahim	907d3dd348	MCP: add session resume + history listing; (#3185 ) # External (non-OpenAI) Pull Request Requirements Before opening this Pull Request, please read the dedicated "Contributing" markdown file or your PR may be closed: https://github.com/openai/codex/blob/main/docs/contributing.md If your PR conforms to our contribution guidelines, replace this text with a detailed and high quality description of your changes.	2025-09-04 23:44:18 +00:00
Dylan	82ed7bd285	[mcp-server] Update read config interface (#3093 ) ## Summary Follow-up to #3056 This PR updates the mcp-server interface for reading the config settings saved by the user. At risk of introducing _another_ Config struct, I think it makes sense to avoid tying our protocol to ConfigToml, as its become a bit unwieldy. GetConfigTomlResponse was a de-facto struct for this already - better to make it explicit, in my opinion. This is technically a breaking change of the mcp-server protocol, but given the previous interface was introduced so recently in #2725, and we have not yet even started to call it, I propose proceeding with the breaking change - but am open to preserving the old endpoint. ## Testing - [x] Added additional integration test coverage	2025-09-04 16:26:41 -07:00
Michael Bolin	f09170b574	chore: print stderr from MCP server to test output using eprintln! (#2849 ) Related to https://github.com/openai/codex/pull/2848, I don't see the stderr from `codex mcp` colocated with the other stderr from `test_shell_command_approval_triggers_elicitation()` when it fails even though we have `RUST_LOG=debug` set when we spawn `codex mcp`: `1e9e703b96/codex-rs/mcp-server/tests/common/mcp_process.rs (L65)` Let's try this new logic which should be more explicit.	2025-08-28 12:43:13 -07:00
Michael Bolin	1e9e703b96	chore: try to make it easier to debug the flakiness of test_shell_command_approval_triggers_elicitation (#2848 ) `test_shell_command_approval_triggers_elicitation()` is one of a number of integration tests that we have observed to be flaky on GitHub CI, so this PR tries to reduce the flakiness _and_ to provide us with more information when it flakes. Specifically: - Changed the command that we use to trigger the elicitation from `git init` to `python3 -c 'import pathlib; pathlib.Path(r"{}").touch()'` because running `git` seems more likely to invite variance. - Increased the timeout to wait for the task response from 10s to 20s. - Added more logging.	2025-08-28 12:33:33 -07:00
Dylan	0cec0770e2	[mcp-server] Add GetConfig endpoint (#2725 ) ## Summary Adds a GetConfig request to the MCP Protocol, so MCP clients can evaluate the resolved config.toml settings which the harness is using. ## Testing - [x] Added an end to end test of the endpoint	2025-08-27 09:59:03 -07:00
Eric Traut	dc42ec0eb4	Add AuthManager and enhance GetAuthStatus command (#2577 ) This PR adds a central `AuthManager` struct that manages the auth information used across conversations and the MCP server. Prior to this, each conversation and the MCP server got their own private snapshots of the auth information, and changes to one (such as a logout or token refresh) were not seen by others. This is especially problematic when multiple instances of the CLI are run. For example, consider the case where you start CLI 1 and log in to ChatGPT account X and then start CLI 2 and log out and then log in to ChatGPT account Y. The conversation in CLI 1 is still using account X, but if you create a new conversation, it will suddenly (and unexpectedly) switch to account Y. With the `AuthManager`, auth information is read from disk at the time the `ConversationManager` is constructed, and it is cached in memory. All new conversations use this same auth information, as do any token refreshes. The `AuthManager` is also used by the MCP server's GetAuthStatus command, which now returns the auth method currently used by the MCP server. This PR also includes an enhancement to the GetAuthStatus command. It now accepts two new (optional) input parameters: `include_token` and `refresh_token`. Callers can use this to request the in-use auth token and can optionally request to refresh the token. The PR also adds tests for the login and auth APIs that I recently added to the MCP server.	2025-08-22 13:10:11 -07:00
Michael Bolin	712bfa04ac	chore: move mcp-server/src/wire_format.rs to protocol/src/mcp_protocol.rs (#2423 ) The existing `wire_format.rs` should share more types with the `codex-protocol` crate (like `AskForApproval` instead of maintaining a parallel `CodexToolCallApprovalPolicy` enum), so this PR moves `wire_format.rs` into `codex-protocol`, renaming it as `mcp-protocol.rs`. We also de-dupe types, where appropriate. --- [//]: # (BEGIN SAPLING FOOTER) Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/openai/codex/pull/2423). * #2424 * __->__ #2423	2025-08-18 09:36:57 -07:00
Michael Bolin	a269754668	remove mcp-server/src/mcp_protocol.rs and the code that depends on it (#2360 )	2025-08-18 00:29:18 -07:00
Michael Bolin	eda50d8372	feat: introduce ClientRequest::SendUserTurn (#2345 ) This adds a new request type, `SendUserTurn`, that makes it possible to submit a `Op::UserTurn` operation (introduced in #2329) to a conversation. This PR also adds a new integration test that verifies that changing from `AskForApproval::UnlessTrusted` to `AskForApproval::Never` mid-conversation ensures that an elicitation is no longer sent for running `python3 -c print(42)`. --- [//]: # (BEGIN SAPLING FOOTER) Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/openai/codex/pull/2345). * __->__ #2345 * #2329 * #2343 * #2340 * #2338	2025-08-15 10:05:58 -07:00
Michael Bolin	265fd89e31	fix: try to fix flakiness in test_shell_command_approval_triggers_elicitation (#2344 ) I still see flakiness in `test_shell_command_approval_triggers_elicitation()` on occasion where `MockServer` claims it has not received all of its expected requests. I recently introduced a similar type of test in #2264, `test_codex_jsonrpc_conversation_flow()`, which I have not seen flake (yet!), so this PR pulls over two things I did in that test: - increased `worker_threads` from `2` to `4` - added an assertion to make sure the `task_complete` notification is received Honestly, I'm still not sure why `MockServer` claims it sometimes does not receive all its expected requests given that we assert that the final `JSONRPCResponse` is read on the stream, but let's give this a shot. Assuming this fixes things, my hypothesis is that the increase in `worker_threads` helps because perhaps there are async tasks in `MockServer` that do not reliably complete fully when there are not enough threads available? If that is correct, it seems like the test would still be flaky, though perhaps with lower frequency?	2025-08-15 09:17:20 -07:00
Michael Bolin	a62510e0ae	fix: verify notifications are sent with the conversationId set (#2278 ) This updates `CodexMessageProcessor` so that each notification it sends for a `EventMsg` from a `CodexConversation` such that: - The `params` always has an appropriate `conversationId` field. - The `method` is now includes the name of the `EventMsg` type rather than using `codex/event` as the `method` type for all notifications. (We currently prefix the method name with `codex/event/`, but I think that should go away once we formalize the notification schema in `wire_format.rs`.) As part of this, we update `test_codex_jsonrpc_conversation_flow()` to verify that the `task_finished` notification has made it through the system instead of sleeping for 5s and "hoping" the server finished processing the task. Note we have seen some flakiness in some of our other, similar integration tests, and I expect adding a similar check would help in those cases, as well.	2025-08-13 17:54:12 -07:00
Michael Bolin	e7bad650ff	feat: support traditional JSON-RPC request/response in MCP server (#2264 ) This introduces a new set of request types that our `codex mcp` supports. Note that these do not conform to MCP tool calls so that instead of having to send something like this: ```json { "jsonrpc": "2.0", "method": "tools/call", "id": 42, "params": { "name": "newConversation", "arguments": { "model": "gpt-5", "approvalPolicy": "on-request" } } } ``` we can send something like this: ```json { "jsonrpc": "2.0", "method": "newConversation", "id": 42, "params": { "model": "gpt-5", "approvalPolicy": "on-request" } } ``` Admittedly, this new format is not a valid MCP tool call, but we are OK with that right now. (That is, not everything we might want to request of `codex mcp` is something that is appropriate for an autonomous agent to do.) To start, this introduces four request types: - `newConversation` - `sendUserMessage` - `addConversationListener` - `removeConversationListener` The new `mcp-server/tests/codex_message_processor_flow.rs` shows how these can be used. The types are defined on the `CodexRequest` enum, so we introduce a new `CodexMessageProcessor` that is responsible for dealing with requests from this enum. The top-level `MessageProcessor` has been updated so that when `process_request()` is called, it first checks whether the request conforms to `CodexRequest` and dispatches it to `CodexMessageProcessor` if so. Note that I also decided to use `camelCase` for the on-the-wire format, as that seems to be the convention for MCP. For the moment, the new protocol is defined in `wire_format.rs` within the `mcp-server` crate, but in a subsequent PR, I will probably move it to its own crate to ensure the protocol has minimal dependencies and that we can codegen a schema from it. --- [//]: # (BEGIN SAPLING FOOTER) Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/openai/codex/pull/2264). * #2278 * __->__ #2264	2025-08-13 17:36:29 -07:00
aibrahim-oai	97ab8fb610	MCP: add conversation.create tool [Stack 2/2] (#1783 ) Introduce conversation.create handler (handle_create_conversation) and wire it in MessageProcessor. Stack: Top: #1783 Bottom: #1784 --------- Co-authored-by: Gabriel Peal <gpeal@users.noreply.github.com>	2025-08-01 22:18:36 +00:00
aibrahim-oai	ad0295b893	MCP server: route structured tool-call requests and expose mcp_protocol [Stack 2/3] (#1751 ) - Expose mcp_protocol from mcp-server for reuse in tests and callers. - In MessageProcessor, detect structured ToolCallRequestParams in tools/call and forward to a new handler. - Add handle_new_tool_calls scaffold (returns error for now). - Test helper: add send_send_user_message_tool_call to McpProcess to send ConversationSendMessage requests; This is the second PR in a stack. Stack: Final: #1686 Intermediate: #1751 First: #1750	2025-08-01 02:46:04 +00:00
aibrahim-oai	19bef7659f	Serializing the `eventmsg` type to snake_case (#1709 ) This was an abrupt change on our clients. We need to serialize as snake_case.	2025-07-28 10:26:27 -07:00
aibrahim-oai	5a0079fea2	Changing method in MCP notifications (#1684 ) - Changing the codex/event type	2025-07-26 10:35:49 -07:00
Michael Bolin	7af9cedbd7	fix: create separate test_support crates to eliminate #[allow(dead_code)] (#1667 ) Because of a quirk of how implementation tests work in Rust, we had a number of `#[allow(dead_code)]` annotations that were misleading because the functions _were_ being used, just not by all integration tests in a `tests/` folder, so when compiling the test that did not use the function, clippy would complain that it was unused. This fixes things by create a "test_support" crate under the `tests/` folder that is imported as a dev dependency for the respective crate.	2025-07-24 12:19:46 -07:00
aibrahim-oai	01c0896f0f	Adding interrupt Support to MCP (#1646 )	2025-07-22 20:33:49 +00:00
pakrym-oai	6d82907082	Add support for custom base instructions (#1645 ) Allows providing custom instructions file as a config parameter and custom instruction text via MCP tool call.	2025-07-22 09:42:22 -07:00
Gabriel Peal	710f728124	Add an elicitation for approve patch and refactor tool calls (#1642 ) 1. Added an elicitation for `approve-patch` which is very similar to `approve-exec`. 2. Extracted both elicitations to their own files to prevent `codex_tool_runner` from blowing up in size.	2025-07-22 02:58:41 -04:00
Michael Bolin	d49d802b06	test: add integration test for MCP server (#1633 ) This PR introduces a single integration test for `cargo mcp`, though it also introduces a number of reusable components so that it should be easier to introduce more integration tests going forward. The new test is introduced in `codex-rs/mcp-server/tests/elicitation.rs` and the reusable pieces are in `codex-rs/mcp-server/tests/common`. The test itself verifies new functionality around elicitations introduced in https://github.com/openai/codex/pull/1623 (and the fix introduced in https://github.com/openai/codex/pull/1629) by doing the following: - starts a mock model provider with canned responses for `/v1/chat/completions` - starts the MCP server with a `config.toml` to use that model provider (and `approval_policy = "untrusted"`) - sends the `codex` tool call which causes the mock model provider to request a shell call for `git init` - the MCP server sends an elicitation to the client to approve the request - the client replies to the elicitation with `"approved"` - the MCP server runs the command and re-samples the model, getting a `"finish_reason": "stop"` - in turn, the MCP server sends the final response to the original `codex` tool call - verifies that `git init` ran as expected To test: ``` cargo test shell_command_approval_triggers_elicitation ``` In writing this test, I discovered that `ExecApprovalResponse` does not conform to `ElicitResult`, so I added a TODO to fix that, since I think that should be updated in a separate PR. As it stands, this PR does not update any business logic, though it does make a number of members of the `mcp-server` crate `pub` so they can be used in the test. One additional learning from this PR is that `std::process::Command::cargo_bin()` from the `assert_cmd` trait is only available for `std::process::Command`, but we really want to use `tokio::process::Command` so that everything is async and we can leverage utilities like `tokio::time::timeout()`. The trick I came up with was to use `cargo_bin()` to locate the program, and then to use `std::process::Command::get_program()` when constructing the `tokio::process::Command`.	2025-07-21 10:27:07 -07:00

45 commits