core-agent-ide

Author	SHA1	Message	Date
Eric Traut	f6fdfbeb98	Update Codex docs success link (#12805 ) Fix a stale documentation link in the sign-in flow	2026-02-25 12:02:41 -08:00
Ahmed Ibrahim	3f30746237	Add simple realtime text logs (#12807 ) Update realtime debug logs to include the actual text payloads in both input and output paths. - In `core/src/realtime_conversation.rs`: - `handle_start`: add extracted assistant text output to the `[realtime-text]` debug log. - `handle_text`: add incoming text input (`params.text`) to the `[realtime-text]` debug log. No tests were run (per request).	2026-02-25 12:01:48 -08:00
Owen Lin	a0fd94bde6	feat(app-server): add ThreadItem::DynamicToolCall (#12732 ) Previously, clients would call `thread/start` with dynamic_tools set, and when a model invokes a dynamic tool, it would just make the server->client `item/tool/call` request and wait for the client's response to complete the tool call. This works, but it doesn't have an `item/started` or `item/completed` event. Now we are doing this: - [new] emit `item/started` with `DynamicToolCall` populated with the call arguments - send an `item/tool/call` server request - [new] once the client responds, emit `item/completed` with `DynamicToolCall` populated with the response. Also, with `persistExtendedHistory: true`, dynamic tool calls are now reconstructable in `thread/read` and `thread/resume` as `ThreadItem::DynamicToolCall`.	2026-02-25 12:00:10 -08:00
Rasmus Rygaard	73eaebbd1c	Propagate session ID when compacting (#12802 ) We propagate the session ID when sending inference requests, but we don't do the same for compaction requests. This makes it hard to link compaction requests to their session when debugging.	2026-02-25 19:17:38 +00:00
Michael Bolin	648a420cbf	fix: enforce sandbox envelope for zsh fork execution (#12800 ) ## Why Zsh fork execution was still able to bypass the `WorkspaceWrite` model in edge cases because the fork path reconstructed command execution without preserving sandbox wrappers, and command extraction only accepted shell invocations in a narrow positional shape. This can allow commands to run with broader filesystem access than expected, which breaks the sandbox safety model. ## What changed - Preserved the sandboxed `ExecRequest` produced by `attempt.env_for(...)` when entering the zsh fork path in [`unix_escalation.rs`](https://github.com/openai/codex/blob/main/codex-rs/core/src/tools/runtimes/shell/unix_escalation.rs). - Updated `CoreShellCommandExecutor` to execute the sandboxed command and working directory captured from `attempt.env_for(...)`, instead of re-running a freshly reconstructed shell command. - Made zsh-fork script extraction robust to wrapped invocations by scanning command arguments for `-c`/`-lc` rather than only matching the first positional form. - Added unit tests in `unix_escalation.rs` to lock in wrapper-tolerant parsing behavior and keep unsupported shell forms rejected. - Tightened the regression in [`skill_approval.rs`](https://github.com/openai/codex/blob/main/codex-rs/core/tests/suite/skill_approval.rs): - `shell_zsh_fork_still_enforces_workspace_write_sandbox` now uses an explicit `WorkspaceWrite` policy with `exclude_tmpdir_env_var: true` and `exclude_slash_tmp: true`. - The test attempts to write to `/tmp/...`, which is only reliably outside writable roots with those explicit exclusions set. ## Verification - Added and passed the new unit tests around `extract_shell_script` parsing behavior with wrapped command shapes. 
- `extract_shell_script_supports_wrapped_command_prefixes` - `extract_shell_script_rejects_unsupported_shell_invocation` - Verified the regression with the focused integration test: `shell_zsh_fork_still_enforces_workspace_write_sandbox`. ## Manual Testing Prior to this change, if I ran Codex via: ``` just codex --config zsh_path=/Users/mbolin/code/codex2/codex-rs/app-server/tests/suite/zsh --enable shell_zsh_fork ``` and asked: ``` what is the output of /bin/ps ``` it would run it, even though the default sandbox should prevent the agent from running `/bin/ps` because it is setuid on MacOS. But with this change, I now see the expected failure because it is blocked by the sandbox: ``` /bin/ps exited with status 1 and produced no output in this environment. ```	2026-02-25 11:05:27 -08:00
pakrym-oai	9d7013eab0	Handle websocket timeout (#12791 ) Websockets sometimes time out with a 400 error; ensure we retry when that happens.	2026-02-25 10:31:37 -08:00
jif-oai	7b39e76a66	Revert "fix(bazel): replace askama templates with include_str! in memories" (#12795 ) Reverts openai/codex#11778	2026-02-25 18:06:17 +00:00
Ahmed Ibrahim	947092283a	Add app-server v2 thread realtime API (#12715 ) Add experimental `thread/realtime/*` v2 requests and notifications, then route app-server realtime events through that thread-scoped surface with integration coverage. --------- Co-authored-by: Codex <noreply@openai.com>	2026-02-25 09:59:10 -08:00
Curtis 'Fjord' Hawthorne	0543d0a022	Promote js_repl to experimental with Node requirement (#12712 ) ## Summary - Promote `js_repl` to an experimental feature that users can enable from `/experimental`. - Add `js_repl` experimental metadata, including the Node prerequisite and activation guidance. - Add regression coverage for the feature metadata and the `/experimental` popup. ## What Changed - Changed `Feature::JsRepl` from `Stage::UnderDevelopment` to `Stage::Experimental`. - Added experimental metadata for `js_repl` in `core/src/features.rs`: - name: `JavaScript REPL` - description: calls out interactive website debugging, inline JavaScript execution, and the required Node version (`>= v24.13.1`) - announcement: tells users to enable it, then start a new chat or restart Codex - Added a core unit test that verifies: - `js_repl` is experimental - `js_repl` is disabled by default - the hardcoded Node version in the description matches `node-version.txt` - Added a TUI test that opens the `/experimental` popup and verifies the rendered `js_repl` entry includes the Node requirement text. ## Testing - `just fmt` - `cargo test -p codex-tui` - `cargo test -p codex-core` (unit-test phase passed; stopped during the long `tests/all.rs` integration suite)	2026-02-25 09:44:52 -08:00
mcgrew-oai	9a393c9b6f	feat(network-proxy): add embedded OTEL policy audit logging (#12046 ) PR Summary This PR adds embedded-only OTEL policy audit logging for `codex-network-proxy` and threads audit metadata from `codex-core` into managed proxy startup. ### What changed - Added structured audit event emission in `network_policy.rs` with target `codex_otel.network_proxy`. - Emitted: - `codex.network_proxy.domain_policy_decision` once per domain-policy evaluation. - `codex.network_proxy.block_decision` for non-domain denies. - Added required policy/network fields, RFC3339 UTC millisecond `event.timestamp`, and fallback defaults (`http.request.method="none"`, `client.address="unknown"`). - Added non-domain deny audit emission in HTTP/SOCKS handlers for mode-guard and proxy-state denies, including unix-socket deny paths. - Added `REASON_UNIX_SOCKET_UNSUPPORTED` and used it for unsupported unix-socket auditing. - Added `NetworkProxyAuditMetadata` to runtime/state, re-exported from `lib.rs` and `state.rs`. - Added `start_proxy_with_audit_metadata(...)` in core config, with `start_proxy()` delegating to default metadata. - Wired metadata construction in `codex.rs` from session/auth context, including originator sanitization for OTEL-safe tagging. - Updated `network-proxy/README.md` with embedded-mode audit schema and behavior notes. - Refactored HTTP block-audit emission to a small local helper to reduce duplication. - Preserved existing unix-socket proxy-disabled host/path behavior for responses and blocked history while using an audit-only endpoint override (`server.address="unix-socket"`, `server.port=0`). ### Explicit exclusions - No standalone proxy OTEL startup work. - No `main.rs` binary wiring. - No `standalone_otel.rs`. - No standalone docs/tests. ### Tests - Extended `network_policy.rs` tests for event mapping, metadata propagation, fallbacks, timestamp format, and target prefix. - Extended HTTP tests to assert unix-socket deny block audit events. 
- Extended SOCKS tests to cover deny emission from handler deny branches. - Added/updated core tests to verify audit metadata threading into managed proxy state. ### Validation run - `just fmt` - `cargo test -p codex-network-proxy` ✅ - `cargo test -p codex-core` ran with one unrelated flaky timeout (`shell_snapshot::tests::snapshot_shell_does_not_inherit_stdin`), and the test passed when rerun directly ✅ --------- Co-authored-by: viyatb-oai <viyatb@openai.com>	2026-02-25 11:46:37 -05:00
jif-oai	8362b79cb4	feat: fix sqlite home (#12787 )	2026-02-25 15:52:55 +00:00
jif-oai	01f25a7b96	chore: unify max depth parameter (#12770 ) Users were confused	2026-02-25 15:20:24 +00:00
mcgrew-oai	bccce0d75f	otel: add host.name resource attribute to logs/traces via gethostname (#12352 ) PR Summary This PR adds the OpenTelemetry `host.name` resource attribute to Codex OTEL exports so every OTEL log (and trace, via the shared resource) carries the machine hostname. What changed - Added `host.name` to the shared OTEL `Resource` in `/Users/michael.mcgrew/code/codex/codex-rs/otel/src/otel_provider.rs` - This applies to both: - OTEL logs (`SdkLoggerProvider`) - OTEL traces (`SdkTracerProvider`) - Hostname is now resolved via `gethostname::gethostname()` (best-effort) - Value is trimmed - Empty values are omitted (non-fatal) - Added focused unit tests for: - including `host.name` when present - omitting `host.name` when missing/empty Why - `host.name` is host/process metadata and belongs on the OTEL `resource`, not per-event attributes. - Attaching it in the shared resource is the smallest change that guarantees coverage across all exported OTEL logs/traces. Scope / Non-goals - No public API changes - No changes to metrics behavior (this PR only updates log/trace resource metadata) Dependency updates - Added `gethostname` as a workspace dependency and `codex-otel` dependency - `Cargo.lock` updated accordingly - `MODULE.bazel.lock` unchanged after refresh/check Validation - `just fmt` - `cargo test -p codex-otel` - `just bazel-lock-update` - `just bazel-lock-check`	2026-02-25 09:54:45 -05:00
jif-oai	8d49e0d0c4	nit: migration (#12772 )	2026-02-25 13:56:52 +00:00
jif-oai	e4bfa763f6	feat: record memory usage (#12761 )	2026-02-25 13:48:40 +00:00
jif-oai	5441130e0a	feat: adding stream parser (#12666 ) Add a stream parser to extract citations (and others) from a stream. This supports cases where markers are split across different tokens. Codex never managed to make this code work, so everything was done manually. Please review carefully and do not touch this part of the code without a very clear understanding of it.	2026-02-25 13:27:58 +00:00
jif-oai	5a9a5b51b2	feat: add large stack test macro (#12768 ) This PR adds the macro `#[large_stack_test]`, which spawns the test in a dedicated tokio runtime with a larger stack. It is useful for tests that need the full recursion on the harness (which is now too deep for Windows, for example).	2026-02-25 13:19:21 +00:00
jif-oai	bcd6e68054	Display pending child-thread approvals in TUI (#12767 ) Summary - propagate approval policy from parent to spawned agents and drop the Never override so sub-agents respect the caller’s request - refresh the pending-approval list whenever events arrive or the active thread changes and surface the list above the composer for inactive threads - add widgets, helpers, and tests covering the new pending-thread approval UI state	2026-02-25 11:40:11 +00:00
Michael Bolin	93efcfd50d	feat: record whether a skill script is approved for the session (#12756 ) ## Why `unix_escalation.rs` checks a session-scoped approval cache before prompting again for an execve-intercepted skill script. Without also recording `ReviewDecision::ApprovedForSession`, that cache never gets populated, so the same skill script can still trigger repeated approval prompts within one session. ## What Changed - Add `execve_session_approvals` to `SessionServices` so the session can track approved skill script paths. - Record the script path when a skill-script prompt returns `ReviewDecision::ApprovedForSession`, but only for the skill-script path rather than broader prefix-rule approvals. - Reuse the cached approval on later execve callbacks by treating an already-approved skill script as `Decision::Allow`. --- [//]: # (BEGIN SAPLING FOOTER) Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/openai/codex/pull/12756). * #12758 * __->__ #12756	2026-02-25 10:17:22 +00:00
alexsong-oai	6d6570d89d	Support external agent config detect and import (#12660 ) Migration Behavior * Config * Migrates settings.json into config.toml * Only adds fields when config.toml is missing, or when those fields are missing from the existing file * Supported mappings: env -> shell_environment_policy sandbox.enabled = true -> sandbox_mode = "workspace-write" * Skills * Copies home and repo .claude/skills into .agents/skills * Existing skill directories are not overwritten * SKILL.md content is rewritten from Claude-related terms to Codex * AgentsMd * Repo only * Migrates CLAUDE.md into AGENTS.md * Detect/import only proceed when AGENTS.md is missing or present but empty * Content is rewritten from Claude-related terms to Codex	2026-02-25 02:11:51 -08:00
jif-oai	f46b767b7e	feat: add search term to thread list (#12578 ) Add `searchTerm` to `thread/list`, which searches for a match in thread titles (the condition being `searchTerm` $$\in$$ `title`, i.e. the title contains the search term).	2026-02-25 09:59:41 +00:00
jif-oai	a046849438	fix: flaky test due to second-resolution for thread ordering (#12692 )	2026-02-25 09:59:25 +00:00
jif-oai	10c04e11b8	feat: add service name to app-server (#12319 ) Add a service name to the app-server so that the app can use its own service name. This is set at the thread level because we might later make the app-server a singleton on the computer.	2026-02-25 09:51:42 +00:00
Celia Chen	6a3233da64	Surface skill permission profiles in zsh-fork exec approvals (#12753 ) ## Summary - Preserve each skill’s raw permissions block as a permission_profile on SkillMetadata during skill loading. - Keep compiling that same metadata into the existing runtime Permissions object, so current enforcement behavior stays intact. - When zsh-fork intercepts execution of a script that belongs to a skill, include the skill’s permission_profile in the exec approval request. - This lets approval UIs show the extra filesystem access the skill declared when prompting for approval.	2026-02-25 01:23:10 -08:00
Michael Bolin	c4ec6be4ab	fix: keep shell escalation exec paths absolute (#12750 ) ## Why In the `shell_zsh_fork` flow, `codex-shell-escalation` receives the executable path exactly as the shell passed it to `execve()`. That path is not guaranteed to be absolute. For commands such as `./scripts/hello-mbolin.sh`, if the shell was launched with a different `workdir`, resolving the intercepted `file` against the server process working directory makes policy checks and skill matching inspect the wrong executable. This change pushes that fix a step further by keeping the normalized path typed as `AbsolutePathBuf` throughout the rest of the escalation pipeline. That makes the absolute-path invariant explicit, so later code cannot accidentally treat the resolved executable path as an arbitrary `PathBuf`. ## What Changed - record the wrapper process working directory as an `AbsolutePathBuf` - update the escalation protocol so `workdir` is explicitly absolute while `file` remains the raw intercepted exec path - resolve a relative intercepted `file` against the request `workdir` as soon as the server receives the request - thread `AbsolutePathBuf` through `EscalationPolicy`, `CoreShellActionProvider`, and command normalization helpers so the resolved executable path stays type-checked as absolute - replace the `path-absolutize` dependency in `codex-shell-escalation` with `codex-utils-absolute-path` - add a regression test that covers a relative `file` with a distinct `workdir` ## Verification - `cargo test -p codex-shell-escalation`	2026-02-24 23:52:36 -08:00
Michael Bolin	59398125f6	feat: zsh-fork forces scripts/*/ for skills to trigger a prompt (#12730 ) Direct skill-script matches force `Decision::Prompt`, so skill-backed scripts require explicit approval before they run. (Note "allow for session" is not supported in this PR, but will be done in a follow-up.) In the process of implementing this, I fixed an important bug: `ShellZshFork` is supposed to keep ordinary allowed execs on the client-side `Run` path so later `execve()` calls are still intercepted and reviewed. After the shell-escalation port, `Decision::Allow` still mapped to `Escalate`, which moved `zsh` to server-side execution too early. That broke the intended flow for skill-backed scripts and made the approval prompt depend on the wrong execution path. ## What changed - In `codex-rs/core/src/tools/runtimes/shell/unix_escalation.rs`, `Decision::Allow` now returns `Run` unless escalation is actually required. - Removed the zsh-specific `argv[0]` fallback. With the `Allow -> Run` fix in place, zsh's later `execve()` of the script is intercepted normally, so the skill match happens on the script path itself. - Kept the skill-path handling in `determine_action()` focused on the direct `program` match path. 
## Verification - Updated `shell_zsh_fork_prompts_for_skill_script_execution` in `codex-rs/core/tests/suite/skill_approval.rs` (gated behind `cfg(unix)`) to: - run under `SandboxPolicy::new_workspace_write_policy()` instead of `DangerFullAccess` - assert the approval command contains only the script path - assert the approved run returns both stdout and stderr markers in the shell output - Ran `cargo test -p codex-core shell_zsh_fork_prompts_for_skill_script_execution -- --nocapture` ## Manual Testing Run the dev build: ``` just codex --config zsh_path=/Users/mbolin/code/codex2/codex-rs/app-server/tests/suite/zsh --enable shell_zsh_fork ``` I have created `/Users/mbolin/.agents/skills/mbolin-test-skill` with: ``` ├── scripts │ └── hello-mbolin.sh └── SKILL.md ``` The skill: ``` --- name: mbolin-test-skill description: Used to exercise various features of skills. --- When this skill is invoked, run the `hello-mbolin.sh` script and report the output. ``` The script: ``` set -e # Note this script will fail if run with network disabled. curl --location openai.com ``` Use `$mbolin-test-skill` to invoke the skill manually and verify that I get prompted to run `hello-mbolin.sh`. --- [//]: # (BEGIN SAPLING FOOTER) Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/openai/codex/pull/12730). * #12750 * __->__ #12730	2026-02-24 23:51:26 -08:00
viyatb-oai	c086b36b58	feat(ui): add network approval persistence plumbing (#12358 ) ## Summary - add TUI approval options for persistent network host rules - add app-server v2 approval payload plumbing for network approval context + proposed network policy amendments - add app-server handling to translate `applyNetworkPolicyAmendment` decisions back into core review decisions - update docs/test client output and generated app-server schemas/types	2026-02-25 07:06:19 +00:00
Curtis 'Fjord' Hawthorne	9501669a24	tests(js_repl): remove node-related skip paths from js_repl tests (#12185 ) ## Summary Remove js_repl/node test-skip paths and make Node setup explicit in CI so js_repl tests always run instead of silently skipping. ## Why We had multiple “expediency” skip paths that let js_repl tests pass without actually exercising Node-backed behavior. This reduced CI signal and hid runtime/environment regressions. ## What changed ### CI - Added Node setup using `codex-rs/node-version.txt` in: - `.github/workflows/rust-ci.yml` - `.github/workflows/bazel.yml` - Added a Unix PATH copy step in Bazel workflow to expose the setup-node binary in common paths. ### js_repl test harness - Added explicit js_repl sandbox test configuration helpers in: - `codex-rs/core/src/tools/js_repl/mod.rs` - `codex-rs/core/src/tools/handlers/js_repl.rs` - Added Linux arg0 dispatch glue for js_repl tests so sandbox subprocess entrypoint behavior is correct under Linux test execution. ### Removed skip behavior - Deleted runtime guard function and early-return skips in js_repl tests (`can_run_js_repl_runtime_tests` and related per-test short-circuits). - Removed view_image integration test skip behavior: - dropped `skip_if_no_network!(Ok(()))` - removed “skip on Node missing/too old” branch after js_repl output inspection. ## Impact - js_repl/node tests now consistently execute and fail loudly when the environment is not correctly provisioned. - CI has stronger signal for js_repl regressions instead of false green from conditional skips. ## Testing - `cargo test -p codex-core` (locally) to validate js_repl unit/integration behavior with skips removed. - CI expected to surface any remaining environment/runtime gaps directly (rather than masking them). 
#### [git stack](https://github.com/magus/git-stack-cli) - ✅ `1` https://github.com/openai/codex/pull/12300 - ✅ `2` https://github.com/openai/codex/pull/12275 - ✅ `3` https://github.com/openai/codex/pull/12205 - ✅ `4` https://github.com/openai/codex/pull/12407 - ✅ `5` https://github.com/openai/codex/pull/12372 - 👉 `6` https://github.com/openai/codex/pull/12185 - ⏳ `7` https://github.com/openai/codex/pull/10673	2026-02-24 22:52:14 -08:00
Michael Bolin	ddfa032eb8	fix: chatwidget was not honoring approval_id for an ExecApprovalRequestEvent (#12746 ) ## Why `ExecApprovalRequestEvent` can carry a distinct `approval_id` for subcommand approvals, including the `execve`-intercepted zsh-fork path. The session registers the pending approval callback under `approval_id` when one is present, but `ChatWidget` was stashing `call_id` in the approval modal state. When the user approved the command in the TUI, the response was sent back with the wrong identifier, so the pending approval could not be matched and the approval callback would not resolve. Note `approval_id` was introduced in https://github.com/openai/codex/pull/12051. ## What changed - In `tui/src/chatwidget.rs`, `ChatWidget` now uses `ExecApprovalRequestEvent::effective_approval_id()` when constructing `ApprovalRequest::Exec`. - That preserves the existing behavior for normal shell and `unified_exec` approvals, where `approval_id` is absent and the effective id still falls back to `call_id`. - For subcommand approvals that provide a distinct `approval_id`, the TUI now sends back the same key that `Session::request_command_approval()` registered. ## Verification - Traced the approval flow end to end to confirm the same effective approval id is now used on both sides of the round trip: - `Session::request_command_approval()` registers the pending callback under `approval_id.unwrap_or(call_id)`. - `ChatWidget` now emits `Op::ExecApproval` with that same effective id.	2026-02-24 22:27:05 -08:00
Curtis 'Fjord' Hawthorne	6cb2f02ef8	feat: update Docker image digest to reflect #12205 (#12372 ) This is a clone of #12371 for easier rebasing/testing. #### [git stack](https://github.com/magus/git-stack-cli) - ✅ `1` https://github.com/openai/codex/pull/12407 - 👉 `2` https://github.com/openai/codex/pull/12372 - ⏳ `3` https://github.com/openai/codex/pull/12185 - ⏳ `4` https://github.com/openai/codex/pull/10673 Co-authored-by: Michael Bolin <mbolin@openai.com>	2026-02-24 22:19:46 -08:00
Celia Chen	1151972fb2	feat: add experimental additionalPermissions to v2 command execution approval requests (#12737 ) This adds additionalPermissions to the app-server v2 item/commandExecution/requestApproval payload as an experimental field. The field is now exposed on CommandExecutionRequestApprovalParams and is populated from the existing core approval event when a command requests additional sandbox permissions. This PR also contains changes that let server requests support the experimental API. Tested with a real app-server test client: sample payload with experimental flag off: ``` { < "id": 0, < "method": "item/commandExecution/requestApproval", < "params": { < "command": "/bin/zsh -lc 'mkdir -p ~/some/test && touch ~/some/test/file'", < "commandActions": [ < { < "command": "mkdir -p '~/some/test'", < "type": "unknown" < }, < { < "command": "touch '~/some/test/file'", < "type": "unknown" < } < ], < "cwd": "/Users/celia/code/codex/codex-rs", < "itemId": "call_QLp0LWkQ1XkU6VW9T2vUZFWB", < "proposedExecpolicyAmendment": [ < "mkdir", < "-p", < "~/some/test" < ], < "reason": "Do you want to allow creating ~/some/test/file outside the workspace?", < "threadId": "019c9309-e209-7d82-a01b-dcf9556a354d", < "turnId": "019c9309-e27a-7f33-834f-6011e795c2d6" < } < } ``` with experimental flag on: ``` < { < "id": 0, < "method": "item/commandExecution/requestApproval", < "params": { < "additionalPermissions": { < "fileSystem": null, < "macos": null, < "network": true < }, < "command": "/bin/zsh -lc 'install -D /dev/null ~/some/test/file'", < "commandActions": [ < { < "command": "install -D /dev/null '~/some/test/file'", < "type": "unknown" < } < ], < "cwd": "/Users/celia/code/codex/codex-rs", < "itemId": "call_K3U4b3dRbj3eMCqslmncbGsq", < "proposedExecpolicyAmendment": [ < "install", < "-D" < ], < "reason": "Do you want to allow creating the file at ~/some/test/file outside the workspace sandbox?", < "threadId": "019c9303-3a8e-76e1-81bf-d67ac446d892", < "turnId": 
"019c9303-3af1-7143-88a1-73132f771234" < } < } ```	2026-02-25 05:16:35 +00:00
Curtis 'Fjord' Hawthorne	8f3f2c3c02	tests(js_repl): stabilize CI runtime test execution (#12407 ) ## Summary Stabilize `js_repl` runtime test setup in CI and move tool-facing `js_repl` behavior coverage into integration tests. This is a test/CI change only. No production `js_repl` behavior change is intended. ## Why - Bazel test sandboxes (especially on macOS) could resolve a different `node` than the one installed by `actions/setup-node`, which caused `js_repl` runtime/version failures. - `js_repl` runtime tests depend on platform-specific sandbox/test-harness behavior, so they need explicit gating in a base-stability commit. - Several tests in the `js_repl` unit test module were actually black-box/tool-level behavior tests and fit better in the integration suite. ## Changes - Add `actions/setup-node` to the Bazel and Rust `Tests` workflows, using the exact version pinned in the repo’s Node version file. - In Bazel (non-Windows), pass `CODEX_JS_REPL_NODE_PATH=$(which node)` into test env so `js_repl` uses the `actions/setup-node` runtime inside Bazel tests. - Add a new integration test suite for `js_repl` tool behavior and register it in the core integration test suite module. - Move black-box `js_repl` behavior tests into the integration suite (persistence/TLA, builtin tool invocation, recursive self-call rejection, `process` isolation, blocked builtin imports). - Keep white-box manager/kernel tests in the `js_repl` unit test module. - Gate `js_repl` runtime tests to run only on macOS and only when a usable Node runtime is available (skip on other platforms / missing Node in this commit). ## Impact - Reduces `js_repl` CI failures caused by Node resolution drift in Bazel. - Improves test organization by separating tool-facing behavior tests from white-box manager/kernel tests. - Keeps the base commit stable while expanding `js_repl` runtime coverage. 
#### [git stack](https://github.com/magus/git-stack-cli) - ✅ `1` https://github.com/openai/codex/pull/12372 - 👉 `2` https://github.com/openai/codex/pull/12407 - ⏳ `3` https://github.com/openai/codex/pull/12185 - ⏳ `4` https://github.com/openai/codex/pull/10673	2026-02-24 21:04:34 -08:00
Celia Chen	16ca527c80	chore: migrate additional permissions to PermissionProfile (#12731 ) This PR replaces the old `additional_permissions.fs_read/fs_write` shape with a shared `PermissionProfile` model and wires it through the command approval, sandboxing, protocol, and TUI layers. The schema is adopted from the `SkillManifestPermissions`, which is also refactored to use this unified struct. This helps us easily expose permission profiles in app server/core as a follow-up.	2026-02-25 03:35:28 +00:00
sayan-oai	e6bb5d8553	chore: change catalog mode to enum (#12656 ) make presence of custom catalog more clear by changing to enum instead of bool.	2026-02-24 19:33:32 -08:00
Curtis 'Fjord' Hawthorne	125fbec317	Fix js_repl view_image attachments in nested tool calls (#12725 ) ## Summary - Fix `js_repl` so `await codex.tool("view_image", { path })` actually attaches the image to the active turn when called from inside the JS REPL. - Restore the behavior expected by the existing `js_repl` image-attachment test. - This is a follow-up to [#12553](https://github.com/openai/codex/pull/12553), which changed `view_image` to return structured image content. ## Root Cause - [#12553](https://github.com/openai/codex/pull/12553) changed `view_image` from directly injecting a pending user image message to returning structured `function_call_output` content items. - The nested tool-call bridge inside `js_repl` serialized that tool response back to the JS runtime, but it did not mirror returned image content into the active turn. - As a result, `view_image` appeared to succeed inside `js_repl`, but no `input_image` was actually attached for the outer turn. ## What Changed - Updated the nested tool-call path in `js_repl` to inspect function tool responses for structured content items. - When a nested tool response includes `input_image` content, `js_repl` now injects a corresponding user `Message` into the active turn before returning the raw tool result back to the JS runtime. - Kept the normal JSON result flow intact, so `codex.tool(...)` still returns the original tool output object to JavaScript. ## Why - `js_repl` documentation and tests already assume that `view_image` can be used from inside the REPL to attach generated images to the model. - Without this fix, the nested call path silently dropped that attachment behavior.	2026-02-24 18:23:53 -08:00
sayan-oai	74e112ea09	add AWS_LC_SYS_NO_JITTER_ENTROPY=1 to release musl build step to unblock releases (#12720 ) linux musl build steps in `rust-release.yml` are [currently broken](https://github.com/openai/codex/actions/runs/22367312571) because of linking issues due to ubsan-calling types (`jitterentropy`) leaking into the build. add `AWS_LC_SYS_NO_JITTER_ENTROPY=1` to the musl build step to avoid linking those ubsan-calling types. this is a temporary fix; we still need to clean up ubsan usage upstream so these types don't leak into release-build steps at all. codex's more thorough explanation below: [pr 9859](https://github.com/openai/codex/pull/9859) added [MITM init](https://github.com/openai/codex/pull/9859/changes#diff-db782967007060c5520651633e1ea21681d64be21f2b791d3d84519860245b97R62-R68) in network-proxy, which wires in cert generation code (rcgen/rustls). this didn't bump/change dep versions, but it changed symbol reachability at link time. for musl builds, that made aws-lc-sys’s jitterentropy objects get pulled into the final link. those objects contain UBSan calls (__ubsan_handle_). musl release linking is static (-linux-musl-gcc, -nodefaultlibs) and does not link a musl UBSan runtime, so link fails with undefined __ubsan_*. before, our custom musl CI UBSan steps (install libubsan1, RUSTC_WRAPPER + LD_PRELOAD, partial flag scrubbing) masked some sanitizer issues. after this pr, more aws-lc code became link-reachable, and that band-aid wasn't enough.	2026-02-24 18:11:04 -08:00
Michael Bolin	e88f74d140	feat: pass helper executable paths via Arg0DispatchPaths (#12719 ) ## Why `codex-rs/core/src/tools/runtimes/shell/unix_escalation.rs` previously located `codex-execve-wrapper` by scanning `PATH` and sibling directories. That lookup is brittle and can select the wrong binary when the runtime environment differs from startup assumptions. We already pass `codex-linux-sandbox` from `codex-arg0`; `codex-execve-wrapper` should use the same startup-driven path plumbing. ## What changed - Introduced `Arg0DispatchPaths` in `codex-arg0` to carry both helper executable paths: - `codex_linux_sandbox_exe` - `main_execve_wrapper_exe` - Updated `arg0_dispatch_or_else()` to pass `Arg0DispatchPaths` to top-level binaries and preserve helper paths created in `prepend_path_entry_for_codex_aliases()`. - Threaded `Arg0DispatchPaths` through entrypoints in `cli`, `exec`, `tui`, `app-server`, and `mcp-server`. - Added `main_execve_wrapper_exe` to core configuration plumbing (`Config`, `ConfigOverrides`, and `SessionServices`). - Updated zsh-fork shell escalation to consume the configured `main_execve_wrapper_exe` and removed path-sniffing fallback logic. - Updated app-server config reload paths so reloaded configs keep the same startup-provided helper executable paths. ## References - [`Arg0DispatchPaths` definition](`e355b43d5c/codex-rs/arg0/src/lib.rs (L20-L24)`) - [`arg0_dispatch_or_else()` forwarding both paths](`e355b43d5c/codex-rs/arg0/src/lib.rs (L145-L176)`) - [zsh-fork escalation using configured wrapper path](`e355b43d5c/codex-rs/core/src/tools/runtimes/shell/unix_escalation.rs (L109-L150)`) ## Testing - `cargo check -p codex-arg0 -p codex-core -p codex-exec -p codex-tui -p codex-mcp-server -p codex-app-server` - `cargo test -p codex-arg0` - `cargo test -p codex-core tools::runtimes::shell::unix_escalation:: -- --nocapture`	2026-02-24 17:44:38 -08:00
Michael Bolin	448fb6ac22	fix: clarify the value of SkillMetadata.path (#12729 ) Rename `SkillMetadata.path` to `SkillMetadata.path_to_skills_md` for clarity. Would ideally change the type to `AbsolutePathBuf`, but that can be done later.	2026-02-24 17:15:54 -08:00
Curtis 'Fjord' Hawthorne	63c2ac96cd	fix(js_repl): surface uncaught kernel errors and reset cleanly (#12636 ) ## Summary Improve `js_repl` behavior when the Node kernel hits a process-level failure (for example, an uncaught exception or unhandled Promise rejection). Instead of only surfacing a generic `js_repl kernel exited unexpectedly` after stdout EOF, `js_repl` now returns a clearer exec error for the active request, then resets the kernel cleanly. ## Why Some sandbox-denied operations can trigger Node errors that become process-level failures (for example, an unhandled EventEmitter `'error'` event). In that case: - the kernel process exits, - the host sees stdout EOF, - the user gets a generic kernel-exit error, - and the next request can briefly race with stale kernel state. This change improves that failure mode without monkeypatching Node APIs. ## Changes ### Kernel-side (`js_repl` Node process) - Add process-level handlers for: - `uncaughtException` - `unhandledRejection` - When one of these fires: - best-effort emit a normal `exec_result` error for the active exec - include actionable guidance to catch/handle async errors (including Promise rejections and EventEmitter `'error'` events) - exit intentionally so the host can reset/restart the kernel ### Host-side (`JsReplManager`) - Clear dead kernel state as soon as the stdout reader observes unexpected kernel exit/EOF. - This lets the next `js_repl` exec start a fresh kernel instead of hitting a stale broken-pipe path. ### Tests - Add regression coverage for: - uncaught async exception -> exec error + kernel recovery on next exec - Update forced-kernel-exit test to validate recovery behavior (next exec restarts cleanly) ## Impact - Better user-facing error for kernel crashes caused by uncaught/unhandled async failures. - Cleaner recovery behavior after kernel exit. 
## Validation - `cargo test -p codex-core --lib tools::js_repl::tests::js_repl_uncaught_exception_returns_exec_error_and_recovers -- --exact` - `cargo test -p codex-core --lib tools::js_repl::tests::js_repl_forced_kernel_exit_recovers_on_next_exec -- --exact` - `just fmt`	2026-02-24 17:12:02 -08:00
Max Johnson	5163850025	codex-rs/app-server: graceful websocket restart on Ctrl-C (#12517 ) ## Summary - add graceful websocket app-server restart on Ctrl-C by draining until no assistant turns are running - stop the websocket acceptor and disconnect existing connections once the drain condition is met - add a websocket integration test that verifies Ctrl-C waits for an in-flight turn before exit ## Verification - `cargo check -p codex-app-server --quiet` - `cargo test -p codex-app-server --test all suite::v2::connection_handling_websocket` - I (maxj) tested remote and local Codex.app --------- Co-authored-by: Codex <noreply@openai.com>	2026-02-24 16:27:59 -08:00
Michael Bolin	3d356723c4	fix: make EscalateServer public and remove shell escalation wrappers (#12724 ) ## Why `codex-shell-escalation` exposed a `codex-core`-specific adapter layer (`ShellActionProvider`, `ShellPolicyFactory`, and `run_escalate_server`) that existed only to bridge `codex-core` to `EscalateServer`. That indirection increased API surface and obscured crate ownership without adding behavior. This change moves orchestration into `codex-core` so boundaries are clearer: `codex-shell-escalation` provides reusable escalation primitives, and `codex-core` provides shell-tool policy decisions. Admittedly, @pakrym rightfully requested this sort of cleanup as part of https://github.com/openai/codex/pull/12649, though this avoids moving all of `codex-shell-escalation` into `codex-core`. ## What changed - Made `EscalateServer` public and exported it from `shell-escalation`. - Removed the adapter layer from `shell-escalation`: - deleted `shell-escalation/src/unix/core_shell_escalation.rs` - removed exports for `ShellActionProvider`, `ShellPolicyFactory`, `EscalationPolicyFactory`, and `run_escalate_server` - Updated `core/src/tools/runtimes/shell/unix_escalation.rs` to: - create `Stopwatch`/cancellation in `codex-core` - instantiate `EscalateServer` directly - implement `EscalationPolicy` directly on `CoreShellActionProvider` Net effect: same escalation flow with fewer wrappers and a smaller public API. ## Verification - Manually reviewed the old vs. new escalation call flow to confirm timeout/cancellation behavior and approval policy decisions are preserved while removing wrapper types.	2026-02-24 16:20:08 -08:00
Eric Traut	8da40c9251	Raise image byte estimate for compaction token accounting (#12717 ) Increase `IMAGE_BYTES_ESTIMATE` from 340 bytes to 7,373 bytes so the existing 4-bytes/token heuristic yields an image estimate of ~1,844 tokens instead of ~85. This makes auto-compaction more conservative for image-heavy transcripts and avoids underestimating context usage, which can otherwise cause compaction to fail when there is not enough free context remaining. The new value was chosen because that's the image resolution cap used for our latest models. Follow-up to [#12419](https://github.com/openai/codex/pull/12419). Refs [#11845](https://github.com/openai/codex/issues/11845).	2026-02-24 16:11:38 -08:00
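The arithmetic in the `IMAGE_BYTES_ESTIMATE` commit above can be checked with a small sketch. Only the 4-bytes/token heuristic and the two byte values come from the commit message; the function name, the extra constant names, and the choice to round up are assumptions made for illustration and may not match the actual code in `codex-core`.

```rust
// Values taken from the commit message.
const BYTES_PER_TOKEN: u64 = 4;
const OLD_IMAGE_BYTES_ESTIMATE: u64 = 340;
const NEW_IMAGE_BYTES_ESTIMATE: u64 = 7_373;

/// Hypothetical helper: approximate token count for a byte length,
/// rounding up (the rounding mode is an assumption).
fn approx_tokens(bytes: u64) -> u64 {
    (bytes + BYTES_PER_TOKEN - 1) / BYTES_PER_TOKEN
}
```

Under these assumptions, `approx_tokens(OLD_IMAGE_BYTES_ESTIMATE)` gives 85 and `approx_tokens(NEW_IMAGE_BYTES_ESTIMATE)` gives 1844, matching the figures quoted in the commit.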
pakrym-oai	5571a022eb	Add app-server event tracing (#12695 ) To help with debugging	2026-02-24 14:45:50 -08:00
Won Park	ee1520e79e	feat(tui) - /copy (#12613 ) # /copy! /copy allows you to copy the latest complete message from Codex on the TUI.	2026-02-24 14:17:01 -08:00
zuxin-oai	61cd3a9700	fix: temp remove citation (#12711 ) - temp remove citation	2026-02-24 22:07:30 +00:00
Jeremy Rose	fefdc03b25	revert audio scope (#12700 )	2026-02-24 13:38:28 -08:00
daveaitel-openai	dcab40123f	Agent jobs (spawn_agents_on_csv) + progress UI (#10935 ) ## Summary - Add agent job support: spawn a batch of sub-agents from CSV, auto-run, auto-export, and store results in SQLite. - Simplify workflow: remove run/resume/get-status/export tools; spawn is deterministic and completes in one call. - Improve exec UX: stable, single-line progress bar with ETA; suppress sub-agent chatter in exec. ## Why Enables map-reduce style workflows over arbitrarily large repos using the existing Codex orchestrator. This addresses review feedback about overly complex job controls and non-deterministic monitoring. ## Demo (progress bar) ``` ./codex-rs/target/debug/codex exec \ --enable collab \ --enable sqlite \ --full-auto \ --progress-cursor \ -c agents.max_threads=16 \ -C /Users/daveaitel/code/codex \ - <<'PROMPT' Create /tmp/agent_job_progress_demo.csv with columns: path,area and 30 rows: path = item-01..item-30, area = test. Then call spawn_agents_on_csv with: - csv_path: /tmp/agent_job_progress_demo.csv - instruction: "Run `python - <<'PY'` to sleep a random 0.3–1.2s, then output JSON with keys: path, score (int). Set score = 1." - output_csv_path: /tmp/agent_job_progress_demo_out.csv PROMPT ``` ## Review feedback addressed - Auto-start jobs on spawn; removed run/resume/status/export tools. - Auto-export on success. - More descriptive tool spec + clearer prompts. - Avoid deadlocks on spawn failure; pending/running handled safely. - Progress bar no longer scrolls; stable single-line redraw. ## Tests - `cd codex-rs && cargo test -p codex-exec` - `cd codex-rs && cargo build -p codex-cli`	2026-02-24 21:00:19 +00:00
Eric Traut	bd192b54cd	Honor `project_root_markers` when discovering `AGENTS.md` (#12639 ) Fixes #12128 The docs indicate that `project_root_markers` are used to discover the project root for local config as well as `AGENTS.md`. It looks like it was never wired up to support the latter. Summary - resolve project docs by walking to the configured `project_root_markers` (or defaults) instead of assuming the Git root, while honoring CLI overrides and handling malformed configs - fall back to the project’s canonical path chain and add a test that makes sure custom markers upstream of `.git` are respected	2026-02-24 12:55:48 -08:00
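The marker-based root discovery described in the `project_root_markers` commit above can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: `find_project_root` is a hypothetical name, and picking the nearest ancestor that contains any marker is an assumption about the lookup order.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical helper: returns the nearest ancestor of `start`
/// (including `start` itself) that contains any entry named in
/// `markers`, or `None` if no ancestor does.
fn find_project_root(start: &Path, markers: &[&str]) -> Option<PathBuf> {
    start
        .ancestors()
        // A directory qualifies as soon as one marker exists inside it.
        .find(|dir| markers.iter().any(|m| dir.join(m).exists()))
        .map(Path::to_path_buf)
}
```

With a custom marker configured, a directory above the one holding `.git` can be selected as the project root, which is the case the commit's new test is described as covering.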
Ahmed Ibrahim	b6ab2214e3	Add TUI realtime conversation mode (#12687 ) - Add a hidden `realtime_conversation` feature flag and `/realtime` slash command for start/stop live voice sessions. - Reuse transcription composer/footer UI for live metering, stream mic audio, play assistant audio, render realtime user text events, and force-close on feature disable. --------- Co-authored-by: Codex <noreply@openai.com>	2026-02-24 12:54:30 -08:00
Michael Bolin	3b5fc7547e	refactor: remove unused seatbelt unix socket arg (#12707 ) https://github.com/openai/codex/pull/12052 introduced an `allowed_unix_socket_paths` parameter to `create_seatbelt_command_args()`, but https://github.com/openai/codex/pull/12649 removed the abstraction that #12052 introduced, so this parameter is no longer necessary as it is always an empty slice.	2026-02-24 12:30:26 -08:00

4032 commits