core-agent-ide

Author	SHA1	Message	Date
Owen Lin	db4d2599b5	feat(core): plumb distinct approval ids for command approvals (#12051 ) zsh fork PR stack: - https://github.com/openai/codex/pull/12051 👈 - https://github.com/openai/codex/pull/12052 With upcoming support for a fork of zsh that allows us to intercept `execve` and run execpolicy checks for each subcommand as part of a `CommandExecution`, it will be possible for there to be multiple approval requests for a shell command like `/path/to/zsh -lc 'git status && rg \"TODO\" src && make test'`. To support that, this PR introduces a new `approval_id` field across core, protocol, and app-server so that we can associate approvals properly for subcommands.	2026-02-18 01:55:57 +00:00
viyatb-oai	b527ee2890	feat(core): add structured network approval plumbing and policy decision model (#11672 ) ### Description #### Summary Introduces the core plumbing required for structured network approvals #### What changed - Added structured network policy decision modeling in core. - Added approval payload/context types needed for network approval semantics. - Wired shell/unified-exec runtime plumbing to consume structured decisions. - Updated related core error/event surfaces for structured handling. - Updated protocol plumbing used by core approval flow. - Included small CLI debug sandbox compatibility updates needed by this layer. #### Why establishes the minimal backend foundation for network approvals without yet changing high-level orchestration or TUI behavior. #### Notes - Behavior remains constrained by existing requirements/config gating. - Follow-up PRs in the stack handle orchestration, UX, and app-server integration. --------- Co-authored-by: Codex <199175422+chatgpt-codex-connector[bot]@users.noreply.github.com>	2026-02-14 04:18:12 +00:00
Michael Bolin	66447d5d2c	feat: replace custom mcp-types crate with equivalents from rmcp (#10349 ) We started working with MCP in Codex before https://crates.io/crates/rmcp was mature, so we had our own crate for MCP types that was generated from the MCP schema: `8b95d3e082/codex-rs/mcp-types/README.md` Now that `rmcp` is more mature, it makes more sense to use their MCP types in Rust, as they handle details (like the `_meta` field) that our custom version ignored. Though one advantage that our custom types had is that our generated types implemented `JsonSchema` and `ts_rs::TS`, whereas the types in `rmcp` do not. As such, part of the work of this PR is leveraging the adapters between `rmcp` types and the serializable types that are API for us (app server and MCP) introduced in #10356. Note this PR results in a number of changes to `codex-rs/app-server-protocol/schema`, which merit special attention during review. We must ensure that these changes are still backwards-compatible, which is possible because we have: ```diff - export type CallToolResult = { content: Array<ContentBlock>, isError?: boolean, structuredContent?: JsonValue, }; + export type CallToolResult = { content: Array<JsonValue>, structuredContent?: JsonValue, isError?: boolean, _meta?: JsonValue, }; ``` so `ContentBlock` has been replaced with the more general `JsonValue`. Note that `ContentBlock` was defined as: ```typescript export type ContentBlock = TextContent \| ImageContent \| AudioContent \| ResourceLink \| EmbeddedResource; ``` so the deletion of those individual variants should not be a cause of great concern. Similarly, we have the following change in `codex-rs/app-server-protocol/schema/typescript/Tool.ts`: ``` - export type Tool = { annotations?: ToolAnnotations, description?: string, inputSchema: ToolInputSchema, name: string, outputSchema?: ToolOutputSchema, title?: string, }; + export type Tool = { name: string, title?: string, description?: string, inputSchema: JsonValue, outputSchema?: JsonValue, annotations?: JsonValue, icons?: Array<JsonValue>, _meta?: JsonValue, }; ``` so: - `annotations?: ToolAnnotations` ➡️ `JsonValue` - `inputSchema: ToolInputSchema` ➡️ `JsonValue` - `outputSchema?: ToolOutputSchema` ➡️ `JsonValue` and two new fields: `icons?: Array<JsonValue>, _meta?: JsonValue` --- [//]: # (BEGIN SAPLING FOOTER) Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/openai/codex/pull/10349). * #10357 * __->__ #10349 * #10356	2026-02-02 17:41:55 -08:00
Eric Traut	c4af707e09	Removed experimental "command risk assessment" feature (#7799 ) This experimental feature received lukewarm reception during internal testing. Removing from the code base.	2025-12-10 09:48:11 -08:00
zhao-oai	3d35cb4619	Refactor execpolicy fallback evaluation (#7544 ) ## Refactor of the `execpolicy` crate To illustrate why we need this refactor, consider an agent attempting to run `apple \| rm -rf ./`. Suppose `apple` is allowed by `execpolicy`. Before this PR, `execpolicy` would consider `apple` and `pear` and only render one rule match: `Allow`. We would skip any heuristics checks on `rm -rf ./` and immediately approve `apple \| rm -rf ./` to run. To fix this, we now thread a `fallback` evaluation function into `execpolicy` that runs when no `execpolicy` rules match a given command. In our example, we would run `fallback` on `rm -rf ./` and prevent `apple \| rm -rf ./` from being run without approval.	2025-12-03 23:39:48 -08:00
zhao-oai	e925a380dc	whitelist command prefix integration in core and tui (#7033 ) this PR enables TUI to approve commands and add their prefixes to an allowlist: <img width="708" height="605" alt="Screenshot 2025-11-21 at 4 18 07 PM" src="https://github.com/user-attachments/assets/56a19893-4553-4770-a881-becf79eeda32" /> note: we only show the option to whitelist the command when 1) command is not multi-part (e.g `git add -A && git commit -m 'hello world'`) 2) command is not already matched by an existing rule	2025-12-03 23:17:02 -08:00
Jeremy Rose	7561a6aaf0	support MCP elicitations (#6947 ) No support for request schema yet, but we'll at least show the message and allow accept/decline. <img width="823" height="551" alt="Screenshot 2025-11-21 at 2 44 05 PM" src="https://github.com/user-attachments/assets/6fbb892d-ca12-4765-921e-9ac4b217534d" />	2025-11-21 14:44:53 -08:00
Owen Lin	d6c30ed25e	[app-server] feat: v2 apply_patch approval flow (#6760 ) This PR adds the API V2 version of the apply_patch approval flow, which centers around `ThreadItem::FileChange`. This PR wires the new RPC (`item/fileChange/requestApproval`, V2 only) and related events (`item/started`, `item/completed` for `ThreadItem::FileChange`, which are emitted in both V1 and V2) through the app-server protocol. The new approval RPC is only sent when the user initiates a turn with the new `turn/start` API so we don't break backwards compatibility with VSCE. Similar to https://github.com/openai/codex/pull/6758, the approach I took was to make as few changes to the Codex core as possible, leveraging existing `EventMsg` core events, and translating those in app-server. I did have to add a few additional fields to `EventMsg::PatchApplyBegin` and `EventMsg::PatchApplyEnd`, but those were fairly lightweight. However, the `EventMsg`s emitted by core are the following: ``` 1) Auto-approved (no request for approval)  - EventMsg::PatchApplyBegin - EventMsg::PatchApplyEnd 2) Approved by user - EventMsg::ApplyPatchApprovalRequest - EventMsg::PatchApplyBegin - EventMsg::PatchApplyEnd 3) Declined by user - EventMsg::ApplyPatchApprovalRequest - EventMsg::PatchApplyBegin - EventMsg::PatchApplyEnd ``` For a request triggering an approval, this would result in: ``` item/fileChange/requestApproval item/started item/completed ``` which is different from the `ThreadItem::CommandExecution` flow introduced in https://github.com/openai/codex/pull/6758, which does the below and is preferable: ``` item/started item/commandExecution/requestApproval item/completed ``` To fix this, we leverage `TurnSummaryStore` on codex_message_processor to store a little bit of state, allowing us to fire `item/started` and `item/fileChange/requestApproval` whenever we receive the underlying `EventMsg::ApplyPatchApprovalRequest`, and no-oping when we receive the `EventMsg::PatchApplyBegin` later. This is much less invasive than modifying the order of EventMsg within core (I tried). The resulting payloads: ``` { "method": "item/started", "params": { "item": { "changes": [ { "diff": "Hello from Codex!\n", "kind": "add", "path": "/Users/owen/repos/codex/codex-rs/APPROVAL_DEMO.txt" } ], "id": "call_Nxnwj7B3YXigfV6Mwh03d686", "status": "inProgress", "type": "fileChange" } } } ``` ``` { "id": 0, "method": "item/fileChange/requestApproval", "params": { "grantRoot": null, "itemId": "call_Nxnwj7B3YXigfV6Mwh03d686", "reason": null, "threadId": "019a9e11-8295-7883-a283-779e06502c6f", "turnId": "1" } } ``` ``` { "id": 0, "result": { "decision": "accept" } } ``` ``` { "method": "item/completed", "params": { "item": { "changes": [ { "diff": "Hello from Codex!\n", "kind": "add", "path": "/Users/owen/repos/codex/codex-rs/APPROVAL_DEMO.txt" } ], "id": "call_Nxnwj7B3YXigfV6Mwh03d686", "status": "completed", "type": "fileChange" } } } ```	2025-11-19 20:13:31 -08:00
Owen Lin	cecbd5b021	[app-server] feat: add v2 command execution approval flow (#6758 ) This PR adds the API V2 version of the command‑execution approval flow for the shell tool. This PR wires the new RPC (`item/commandExecution/requestApproval`, V2 only) and related events (`item/started`, `item/completed`, and `item/commandExecution/delta`, which are emitted in both V1 and V2) through the app-server protocol. The new approval RPC is only sent when the user initiates a turn with the new `turn/start` API so we don't break backwards compatibility with VSCE. The approach I took was to make as few changes to the Codex core as possible, leveraging existing `EventMsg` core events, and translating those in app-server. I did have to add additional fields to `EventMsg::ExecCommandEndEvent` to capture the command's input so that app-server can statelessly transform these events to a `ThreadItem::CommandExecution` item for the `item/completed` event. Once we stabilize the API and it's complete enough for our partners, we can work on migrating the core to be aware of command execution items as a first-class concept. Note: We'll need followup work to make sure these APIs work for the unified exec tool, but will wait til that's stable and landed before doing a pass on app-server. Example payloads below: ``` { "method": "item/started", "params": { "item": { "aggregatedOutput": null, "command": "/bin/zsh -lc 'touch /tmp/should-trigger-approval'", "cwd": "/Users/owen/repos/codex/codex-rs", "durationMs": null, "exitCode": null, "id": "call_lNWWsbXl1e47qNaYjFRs0dyU", "parsedCmd": [ { "cmd": "touch /tmp/should-trigger-approval", "type": "unknown" } ], "status": "inProgress", "type": "commandExecution" } } } ``` ``` { "id": 0, "method": "item/commandExecution/requestApproval", "params": { "itemId": "call_lNWWsbXl1e47qNaYjFRs0dyU", "parsedCmd": [ { "cmd": "touch /tmp/should-trigger-approval", "type": "unknown" } ], "reason": "Need to create file in /tmp which is outside workspace sandbox", "risk": null, "threadId": "019a93e8-0a52-7fe3-9808-b6bc40c0989a", "turnId": "1" } } ``` ``` { "id": 0, "result": { "acceptSettings": { "forSession": false }, "decision": "accept" } } ``` ``` { "params": { "item": { "aggregatedOutput": null, "command": "/bin/zsh -lc 'touch /tmp/should-trigger-approval'", "cwd": "/Users/owen/repos/codex/codex-rs", "durationMs": 224, "exitCode": 0, "id": "call_lNWWsbXl1e47qNaYjFRs0dyU", "parsedCmd": [ { "cmd": "touch /tmp/should-trigger-approval", "type": "unknown" } ], "status": "completed", "type": "commandExecution" } } } ```	2025-11-18 00:23:54 +00:00
Eric Traut	d5853d9c47	Changes to sandbox command assessment feature based on initial experiment feedback (#6091 ) * Removed sandbox risk categories; feedback indicates that these are not that useful and "less is more" * Tweaked the assessment prompt to generate terser answers * Fixed bug in orchestrator that prevents this feature from being exposed in the extension	2025-11-01 14:52:23 -07:00
Owen Lin	89c00611c2	[app-server] remove serde(skip_serializing_if = "Option::is_none") annotations (#5939 ) We had this annotation everywhere in app-server APIs which made it so that fields get serialized as `field?: T`, meaning if the field as `None` we would omit the field in the payload. Removing this annotation changes it so that we return `field: T \| null` instead, which makes codex app-server's API more aligned with the convention of public OpenAI APIs like Responses. Separately, remove the `#[ts(optional_fields = nullable)]` annotations that were recently added which made all the TS types become `field?: T \| null` which is not great since clients need to handle undefined and null. I think generally it'll be best to have optional types be either: - `field: T \| null` (preferred, aligned with public OpenAI APIs) - `field?: T` where we have to, such as types generated from the MCP schema: https://github.com/modelcontextprotocol/modelcontextprotocol/blob/main/schema/2025-06-18/schema.ts (see changes to `mcp-types/`) I updated @etraut-openai's unit test to check that all generated TS types are one or the other, not both (so will error if we have a type that has `field?: T \| null`). I don't think there's currently a good use case for that - but we can always revisit.	2025-10-30 18:18:53 +00:00
Eric Traut	069a38a06c	Add missing "nullable" macro to protocol structs that contain optional fields (#5901 ) This PR addresses a current hole in the TypeScript code generation for the API server protocol. Fields that are marked as "Optional<>" in the Rust code are serialized such that the value is omitted when it is deserialized — appearing as `undefined`, but the TS type indicates (incorrectly) that it is always defined but possibly `null`. This can lead to subtle errors that the TypeScript compiler doesn't catch. The fix is to include the `#[ts(optional_fields = nullable)]` macro for all protocol structs that contain one or more `Optional<>` fields. This PR also includes a new test that validates that all TS protocol code containing "\| null" in its type is marked optional ("?") to catch cases where `#[ts(optional_fields = nullable)]` is omitted.	2025-10-29 12:09:47 -07:00
Eric Traut	f8af4f5c8d	Added model summary and risk assessment for commands that violate sandbox policy (#5536 ) This PR adds support for a model-based summary and risk assessment for commands that violate the sandbox policy and require user approval. This aids the user in evaluating whether the command should be approved. The feature works by taking a failed command and passing it back to the model and asking it to summarize the command, give it a risk level (low, medium, high) and a risk category (e.g. "data deletion" or "data exfiltration"). It uses a new conversation thread so the context in the existing thread doesn't influence the answer. If the call to the model fails or takes longer than 5 seconds, it falls back to the current behavior. For now, this is an experimental feature and is gated by a config key `experimental_sandbox_command_assessment`. Here is a screen shot of the approval prompt showing the risk assessment and summary. <img width="723" height="282" alt="image" src="https://github.com/user-attachments/assets/4597dd7c-d5a0-4e9f-9d13-414bd082fd6b" />	2025-10-24 15:23:44 -07:00

13 commits