Commit graph

3891 commits

Author SHA1 Message Date
Yaroslav Volovich
dca9c40dd5
test(app-server): wait for turn/completed in turn_start tests (#12376)
## Summary
- switch a few app-server `turn_start` tests from
`codex/event/task_complete` waits to `turn/completed` waits
- avoid matching unrelated/background `task_complete` events
- keep this flaky test fix separate from the /title feature PR

## Why
On Windows ARM CI, these tests can return early after observing a
generic `codex/event/task_complete` notification from another task. That
can leave the mock Responses server with fewer calls than expected and
fail the test with a wiremock verification mismatch.

Using `turn/completed` matches the app-server turn lifecycle
notification the tests actually care about.

## Validation
- `cargo test -p codex-app-server
turn_start_updates_sandbox_and_cwd_between_turns_v2 -- --nocapture`
- `cargo test -p codex-app-server turn_start_exec_approval_ --
--nocapture`
- `just fmt`
2026-02-20 21:15:21 -08:00
Michael Bolin
48af93399e
feat: use OAI Responses API MessagePhase type directly in App Server v2 (#12422)
https://github.com/openai/codex/pull/10455 introduced the `phase` field,
and then https://github.com/openai/codex/pull/12072 introduced a
`MessagePhase` type in `v2.rs` that paralleled the `MessagePhase` type
in `codex-rs/protocol/src/models.rs`.

The app server protocol prefers `camelCase` while the Responses API uses
`snake_case`, so this meant we had two versions of `MessagePhase` with
different serialization rules. When the app server protocol refers to
types from the Responses API, we use the wire format of the the
Responses API even though it is inconsistent with the app server API.

This PR deletes `MessagePhase` from `v2.rs` and consolidates on the
Responses API version to eliminate confusion.
2026-02-20 20:43:36 -08:00
Michael Bolin
a73efab8dd
fix: address flakiness in thread_resume_rejoins_running_thread_even_with_override_mismatch (#12381)
## Why
`thread/resume` responses for already-running threads can be reported as
`Idle` even while a turn is still in progress. This is caused by a
timing window where the runtime watch state has not yet observed the
running-thread transition, so API clients can receive stale status
information at resume time.

Possibly related: https://github.com/openai/codex/pull/11786

## What
- Add a shared status normalization helper, `resolve_thread_status`, in
`codex-rs/app-server/src/thread_status.rs` that resolves
`Idle`/`NotLoaded` to `Active { active_flags: [] }` when an in-progress
turn is known.
- Reuse this helper across thread response paths in
`codex-rs/app-server/src/codex_message_processor.rs` (including
`thread/start`, `thread/unarchive`, `thread/read`, `thread/resume`,
`thread/fork`, and review/thread-started notification responses).
- In `handle_pending_thread_resume_request`, use both the in-memory
`active_turn_snapshot` and the resumed rollout turns to decide whether a
turn is in progress before resolving thread status for the response.
- Extend `thread_status` tests to validate the new status-resolution
behavior directly.

## Verification
- `cargo test -p codex-app-server
suite::v2::thread_resume::thread_resume_rejoins_running_thread_even_with_override_mismatch`
2026-02-20 20:36:04 -08:00
Ahmed Ibrahim
b237f7cbb1
Add experimental realtime websocket backend prompt override (#12418)
- add top-level `experimental_realtime_ws_backend_prompt` config key
(experimental / do not use) and include it in config schema
- apply the override only to `Op::RealtimeConversation` websocket
`backend_prompt`, with config + realtime tests
2026-02-20 20:10:51 -08:00
Charley Cunningham
4c1744afb2
Improve Plan mode reasoning selection flow (#12303)
Addresses https://github.com/openai/codex/issues/11013

## Summary
- add a Plan implementation path in the TUI that lets users choose
reasoning before switching to Default mode and implementing
- add Plan-mode reasoning scope handling (Plan-only override vs
all-modes default), including config/schema/docs plumbing for
`plan_mode_reasoning_effort`
- remove the hardcoded Plan preset medium default and make the reasoning
popup reflect the active Plan override as `(current)`
- split the collaboration-mode switch notification UI hint into #12307
to keep this diff focused

If I have `plan_mode_reasoning_effort = "medium"` set in my
`config.toml`:
<img width="699" height="127" alt="Screenshot 2026-02-20 at 6 59 37 PM"
src="https://github.com/user-attachments/assets/b33abf04-6b7a-49ed-b2e9-d24b99795369"
/>

If I don't have `plan_mode_reasoning_effort` set in my `config.toml`:
<img width="704" height="129" alt="Screenshot 2026-02-20 at 7 01 51 PM"
src="https://github.com/user-attachments/assets/88a086d4-d2f1-49c7-8be4-f6f0c0fa1b8d"
/>

## Codex author
`codex resume 019c78a2-726b-7fe3-adac-3fa4523dcc2a`
2026-02-20 20:08:56 -08:00
Ahmed Ibrahim
7ae5d88016
Add experimental realtime websocket URL override (#12416)
- add top-level `experimental_realtime_ws_base_url` config key
(experimental / do not use) and include it in config schema
- apply the override only to `Op::RealtimeConversation` websocket
transport, with config + realtime tests
2026-02-20 19:51:20 -08:00
Rohan Godha
0644ba7b7e
fix(nix): include libcap dependency on linux builds (#12415)
commit 923f931121 introduced a dependency
on `libcap`. This PR fixes the nix build by including `libcap` in nix's
build inputs

issue number: #12102. @etraut-openai gave me permission to open pr 

Testing:
running `nix run .#codex-rs` works on both macos (aarch64) and nixos
(x86-64)
2026-02-20 19:32:15 -08:00
Ahmed Ibrahim
6817f0be8a
Wire realtime api to core (#12268)
- Introduce `RealtimeConversationManager` for realtime API management 
- Add `op::conversation` to start conversation, insert audio, insert
text, and close conversation.
- emit conversation lifecycle and realtime events.
- Move shared realtime payload types into codex-protocol and add core
e2e websocket tests for start/replace/transport-close paths.

Things to consider:
- Should we use the same `op::` and `Events` channel to carry audio? I
think we should try this simple approach and later we can create
separate one if the channels got congested.
- Sending text updates to the client: we can start simple and later
restrict that.
- Provider auth isn't wired for now intentionally
2026-02-20 19:06:35 -08:00
natea-oai
936e744c93
Add field to Thread object for the latest rename set for a given thread (#12301)
Exposes through the app server updated names set for a thread. This
enables other surfaces to use the core as the source of truth for thread
naming. `threadName` is gathered using the helper functions used to
interact with `session_index.jsonl`, and is hydrated in:
- `thread/list`
- `thread/read`
- `thread/resume`
- `thread/unarchive`
- `thread/rollback`

We don't do this for `thread/start` and `thread/fork`.
2026-02-20 18:26:57 -08:00
Michael Bolin
53bcfaf42d
fix: explicitly list name collisions in JSON schema generation (#12406)
## Why

JSON schema codegen was silently resolving naming collisions by
appending numeric suffixes (for example `...2`, `...3`). That makes the
generated schema names unstable: removing an earlier colliding type can
cause a later type to be renumbered, which is a breaking change for
consumers that referenced the old generated name.

This PR makes those collisions explicit and reviewable.

Though note that once we remove `v1` from the codegen, we will no longer
support naming collisions. Or rather, naming collisions will have to be
handled explicitly rather than the numeric suffix approach.

## What Changed

- In `codex-rs/app-server-protocol/src/export.rs`, replaced implicit
numeric suffix collision handling for generated variant titles with
explicit special-case maps.
- Added a panic when a collision occurs without an entry in the map, so
new collisions fail loudly instead of silently renaming generated schema
types.
- Added the currently required special cases so existing generated names
remain stable.
- Extended the same approach to numbered `definitions` / `$defs`
collisions (for example `MessagePhase2`-style names) so those are also
explicitly tracked.

## Verification

- Ran targeted generator-path test:
- `cargo test -p codex-app-server-protocol
generate_json_filters_experimental_fields_and_methods -- --nocapture`


---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/12406).
* #12408
* __->__ #12406
2026-02-20 17:51:53 -08:00
Matthew Zeng
36a2a9fdbb
[apps] Bump MCP tool call timeout. (#12405)
- [x] Bump MCP tool call timeout.
2026-02-20 17:35:07 -08:00
Felipe Coury
a5d0757ed1
fix(tui): queued-message edit shortcut unreachable in some terminals (#12240)
## Problem

The TUI's "edit queued message" shortcut (Alt+Up) is either silently
swallowed or recognized as another key combination by Apple Terminal,
Warp, and VSCode's integrated terminal on macOS. Users in those
environments see the hint but pressing the keys does nothing.

## Mental model

When a model turn is in progress the user can still type follow-up
messages. These are queued and displayed below the composer with a hint
line showing how to pop the most recent one back into the editor. The
hint text and the actual key handler must agree on which shortcut is
used, and that shortcut must actually reach the TUI—i.e. it must not be
intercepted by the host terminal.

Three terminals are known to intercept Alt+Up: Apple Terminal (remaps it
to cursor movement), Warp (consumes it for its own command palette), and
VSCode (maps it to "move line up"). For these we use Shift+Left instead.

<p align="center">
<img width="283" height="182" alt="image"
src="https://github.com/user-attachments/assets/4a9c5d13-6e47-4157-bb41-28b4ce96a914"
/>
</p>

| macOS Native Terminal | Warp | VSCode Terminal |
|---|---|---|
| <img width="1557" height="1010" alt="SCR-20260219-kigi"
src="https://github.com/user-attachments/assets/f4ff52f8-119e-407b-a3f3-52f564c36d70"
/> | <img width="1479" height="1261" alt="SCR-20260219-krrf"
src="https://github.com/user-attachments/assets/5807d7c4-17ae-4a2b-aa27-238fd49d90fd"
/> | <img width="1612" height="1312" alt="SCR-20260219-ksbz"
src="https://github.com/user-attachments/assets/1cedb895-6966-4d63-ac5f-0eea0f7057e8"
/> |

## Non-goals

- Making the binding user-configurable at runtime (deferred to a broader
keybinding-config effort).
- Remapping any other shortcuts that might be terminal-specific.

## Tradeoffs

- **Exhaustive match instead of a wildcard default.** The
`queued_message_edit_binding_for_terminal` function explicitly lists
every `TerminalName` variant. This is intentional: adding a new terminal
to the enum will produce a compile error, forcing the author to decide
which binding that terminal should use.

- **Binding lives on `ChatWidget`, hint lives on `QueuedUserMessages`.**
The key event handler that actually acts on the press is in
`ChatWidget`, but the rendered hint text is inside `QueuedUserMessages`.
These are kept in sync by `ChatWidget` calling
`bottom_pane.set_queued_message_edit_binding(self.queued_message_edit_binding)`
during construction. A mismatch would show the wrong hint but would not
lose data.

## Architecture

```mermaid
  graph TD
      TI["terminal_info().name"] --> FN["queued_message_edit_binding_for_terminal(name)"]
      FN --> KB["KeyBinding"]
      KB --> CW["ChatWidget.queued_message_edit_binding<br/><i>key event matching</i>"]
      KB --> BP["BottomPane.set_queued_message_edit_binding()"]
      BP --> QUM["QueuedUserMessages.edit_binding<br/><i>rendered in hint line</i>"]

      subgraph "Special terminals (Shift+Left)"
          AT["Apple Terminal"]
          WT["Warp"]
          VS["VSCode"]
      end

      subgraph "Default (Alt+Up)"
          GH["Ghostty"]
          IT["iTerm2"]
          OT["Others…"]
      end

      AT --> FN
      WT --> FN
      VS --> FN
      GH --> FN
      IT --> FN
      OT --> FN
```

No new crates or public API surface. The only cross-crate dependency
added is `codex_core::terminal::{TerminalName, terminal_info}`, which
already existed for telemetry.

## Observability

No new logging. Terminal detection already emits a `tracing::debug!` log
line at startup with the detected terminal name, which is sufficient to
diagnose binding mismatches.

## Tests

- Existing `alt_up_edits_most_recent_queued_message` test is preserved
and explicitly sets the Alt+Up binding to isolate from the host
terminal.
- New parameterized async tests verify Shift+Left works for Apple
Terminal, Warp, and VSCode.
- A sync unit test asserts the mapping table covers the three special
terminals (Shift+Left) and that iTerm2 still gets Alt+Up.
  
Fixes #4490
2026-02-20 16:56:41 -08:00
Matthew Zeng
4ebdddaa34
[apps] Fix gateway url. (#12403)
- [x] Fix connectors gateway url.
2026-02-21 00:47:15 +00:00
Charley Cunningham
021e39b303
Show model/reasoning hint when switching modes (#12307)
## Summary
- show an info message when switching collaboration modes changes the
effective model or reasoning
- include the target mode in the message (for example `... for Plan
mode.`)
- add TUI tests for model-change and reasoning-only change notifications
on mode switch

<img width="715" height="184" alt="Screenshot 2026-02-20 at 2 01 40 PM"
src="https://github.com/user-attachments/assets/18d1beb3-ab87-4e1c-9ada-a10218520420"
/>
2026-02-20 15:22:10 -08:00
sayan-oai
65b9fe8f30
clarify model_catalog_json only applied on startup (#12379)
# External (non-OpenAI) Pull Request Requirements

Before opening this Pull Request, please read the dedicated
"Contributing" markdown file or your PR may be closed:
https://github.com/openai/codex/blob/main/docs/contributing.md

If your PR conforms to our contribution guidelines, replace this text
with a detailed and high quality description of your changes.

Include a link to a bug report or enhancement request.
2026-02-20 15:04:36 -08:00
viyatb-oai
64f3827d10
Move sanitizer into codex-secrets (#12306)
## Summary
- move the sanitizer implementation into `codex-secrets`
(`secrets/src/sanitizer.rs`) and re-export `redact_secrets`
- switch `codex-core` to depend on/import `codex-secrets` for sanitizer
usage
- remove the old `utils/sanitizer` crate wiring and refresh lockfiles

## Testing
- `just fmt`
- `cargo test -p codex-secrets`
- `cargo test -p codex-core --no-run`
- `cargo clippy -p codex-secrets -p codex-core --all-targets
--all-features -- -D warnings`
- `just bazel-lock-update`
- `just bazel-lock-check`

## Notes
- not run: `cargo test --all-features` (full workspace suite)
2026-02-20 22:47:54 +00:00
pakrym-oai
1bb7989b20
Add ability to attach extra files to feedback (#12370)
Allow clients to provide extra files.
2026-02-20 22:26:14 +00:00
derekf-oai
9176f09cb8
docs: use --locked when installing cargo-nextest (#12377)
## What

Updates the optional `cargo-nextest` install command in
`docs/install.md`:

- `cargo install cargo-nextest` -> `cargo install --locked
cargo-nextest`

## Why

The current docs command can fail during source install because recent
`cargo-nextest` releases intentionally require `--locked`.

Repro (macOS, but likely not platform-specific):
- `cargo install cargo-nextest`
- Fails with a compile error from `locked-tripwire` indicating:
  - `Nextest does not support being installed without --locked`
  - suggests `cargo install --locked cargo-nextest`

Using the locked command succeeds:
- `cargo install --locked cargo-nextest`

## How

Single-line docs change in `docs/install.md` to match current
`cargo-nextest` install requirements.

## Validation

- Reproduced failure locally using a temporary `CARGO_HOME` directory
(clean Cargo home)
- Example command used: `CARGO_HOME=/tmp/cargo-home-test cargo install
cargo-nextest`
- Confirmed success with `cargo install --locked cargo-nextest`
2026-02-20 14:12:13 -08:00
Matthew Zeng
354e7fedd2
[apps] Enforce simple logo url format. (#12374)
- [x] Enforce simple logo url format when loading apps directory to save
bandwidth.
2026-02-20 22:05:55 +00:00
viyatb-oai
60c2b7beca
core tests: use hermetic mock server in review suite (#12291)
## Summary
- switch the review test SSE mock helper to use the shared hermetic mock
server setup
- ensure review tests always have a default `/v1/models` stub during
Codex session bootstrap
- remove the race that caused intermittent `/v1/models` connection
failures and flaky ETag refresh assertions

## Testing
- `just fmt`
- `cargo test -p codex-core --test all
refresh_models_on_models_etag_mismatch_and_avoid_duplicate_models_fetch`
- `cargo test -p codex-core --test all
review_uses_custom_review_model_from_config`
- repeated both targeted tests 5x in a loop
- `cargo clippy -p codex-core --tests -- -D warnings`
2026-02-20 12:50:12 -08:00
Max Johnson
6b1091fc92
app-server: harden disconnect cleanup paths (#12218)
Hardens codex-rs/app-server connection lifecycle and outbound routing
for websocket clients. Fixes some FUD I was having

- Added per-connection disconnect signaling (CancellationToken) for
websocket transports.
- Split websocket handling into independent inbound/outbound tasks
coordinated by cancellation.
- Changed outbound routing so websocket connections use non-blocking
try_send; slow/full websocket writers are disconnected instead of
stalling broadcast delivery.
- Kept stdio behavior blocking-on-send (no forced disconnect) so local
stdio clients are not dropped when queues are temporarily full.
- Simplified outbound router flow by removing deferred
pending_closed_connections handling.
- Added guards to drop incoming response/notification/error messages
from unknown connections.
- Fixed listener teardown race in thread listener tasks using a
listener_generation check so stale tasks do not clear newer listeners.

Fixes
https://linear.app/openai/issue/CODEX-4966/multiclient-handle-slow-notification-consumers

  ## Tests

  Added/updated transport tests covering:

  - broadcast does not block on a slow/full websocket connection
  - stdio connection waits instead of disconnecting on full queue

I (maxj) have tested manually and will retest before landing
2026-02-20 20:35:16 +00:00
colby-oai
d3cf8bd0fa
fix(core): require approval for destructive MCP tool calls (#12353)
Summary
- ensure destructive tool annotations short-circuit to require approval
- simplify approval logic to only require read/write + open-world when
destructive is false
- update the unit test to cover the new destructive behavior

Testing
- Not run (not requested)
2026-02-20 12:12:16 -08:00
Matthew Zeng
aa121a115e
[apps] Implement apps configs. (#12086)
- [x] Implement apps configs.
2026-02-20 12:05:21 -08:00
jif-oai
5034d4bd89
feat: add config allow_login_shell (#12312) 2026-02-20 20:02:24 +00:00
Curtis 'Fjord' Hawthorne
67e802e26b
ci(bazel): install Node from node-version.txt in remote image (#12205)
## Summary
Install Node in the Bazel remote execution image using the version
pinned in `codex-rs/node-version.txt`.

## Why
`js_repl` tests run under Bazel remote execution and require a modern
Node runtime. Runner-level `setup-node` does not guarantee Node is
available (or recent enough) inside the remote worker container.

## What changed
- Updated `.github/workflows/Dockerfile.bazel` to install Node from
official tarballs at image build time.
- Added `xz-utils` for extracting `.tar.xz` archives.
- Copied `codex-rs/node-version.txt` into the image build context and
used it as the single source of truth for Node version.
- Added architecture mapping for multi-arch builds:
  - `amd64 -> x64`
  - `arm64 -> arm64`
- Verified install during image build with:
  - `node --version`
  - `npm --version`

## Impact
- Bazel remote workers should now have the required Node version
available for `js_repl` tests.
- Keeps Node version synchronized with repo policy via
`codex-rs/node-version.txt`.

## Testing
- Verified Dockerfile changes and build steps locally (build-time
commands are deterministic and fail fast on unsupported arch/version
fetch issues).

## Follow-up
- Rebuild and publish the Bazel runner image for both `linux/amd64` and
`linux/arm64`.
- Update image digests in `rbe.bzl` to roll out this runtime update in
CI.


#### [git stack](https://github.com/magus/git-stack-cli)
-  `1` https://github.com/openai/codex/pull/12300
-  `2` https://github.com/openai/codex/pull/12275
- 👉 `3` https://github.com/openai/codex/pull/12205
-  `4` https://github.com/openai/codex/pull/12185
-  `5` https://github.com/openai/codex/pull/10673
2026-02-20 11:51:17 -08:00
daniel-oai
f08cf8d65f
CODEX-4927: Surface local login entitlement denials in browser (#12289)
## Problem
Users without Codex access can hit a confusing local login loop. In the
denial case, the callback could fall through to generic behavior
(including a plain "Missing authorization code" page) instead of clearly
explaining that access was denied.

<img width="842" height="464" alt="Screenshot 2026-02-19 at 11 43 45 PM"
src="https://github.com/user-attachments/assets/f7a25e1d-e480-4ac2-b0ff-8bfe31003e66"
/>
<img width="842" height="464" alt="Screenshot 2026-02-19 at 11 44 53 PM"
src="https://github.com/user-attachments/assets/8a4fe6e4-b27b-483c-9f0c-60164933221d"
/>


## Scope
This PR improves local login error clarity only. It does not change
entitlement policy, RBAC rules, or who is allowed to use Codex.

## What Changed
- The local OAuth callback handler now parses `error` and
`error_description` on `/auth/callback` and exits the callback loop with
a real failure.
- Callback failures render a branded local Codex error page instead of a
generic/plain page.
- `access_denied` + `missing_codex_entitlement` is now mapped to an
explicit user-facing message telling the user Codex is not enabled for
their workspace and to contact their workspace administrator for access.
- Unknown OAuth callback errors continue to use a generic error page
while preserving the OAuth error code/details for debugging.
- Added the login error page template to Bazel assets so the local
binary can render it in Bazel builds.

## Non-goals
- No TUI onboarding/toast changes in this PR.
- No backend entitlement or policy changes.

## Tests
- Added an end-to-end `codex-login` test for `access_denied` +
`missing_codex_entitlement` and verified the page shows the actionable
admin guidance.
- Added an end-to-end `codex-login` test for a generic `access_denied`
reason to verify we keep a generic fallback page/message.
2026-02-20 11:35:28 -08:00
Curtis 'Fjord' Hawthorne
097620218d
js_repl: remove codex.state helper references (#12275)
## Summary

This PR removes `codex.state` from the `js_repl` helper surface and
removes all corresponding documentation/instruction references.

## Motivation

Top-level bindings in `js_repl` now persist across cells, so the extra
`codex.state` helper is redundant and adds unnecessary API/docs surface.

## Changes

- Removed the long-lived `state` object from the Node kernel helper
wiring.
- Stopped exposing `codex.state` (and `context.state`) during `js_repl`
execution.
- Updated user-facing `js_repl` docs to remove `codex.state`.
- Updated generated instruction text and related test expectations to
list only:
  - `codex.tmpDir`
  - `codex.tool(name, args?)`


#### [git stack](https://github.com/magus/git-stack-cli)
-  `1` https://github.com/openai/codex/pull/12300
- 👉 `2` https://github.com/openai/codex/pull/12275
-  `3` https://github.com/openai/codex/pull/12205
-  `4` https://github.com/openai/codex/pull/12185
-  `5` https://github.com/openai/codex/pull/10673
2026-02-20 11:20:45 -08:00
viyatb-oai
28c0089060
fix(network-proxy): add unix socket allow-all and update seatbelt rules (#11368)
## Summary
Adds support for a Unix socket escape hatch so we can bypass socket
allowlisting when explicitly enabled.

## Description
* added a new flag, `network.dangerously_allow_all_unix_sockets` as an
explicit escape hatch
* In codex-network-proxy, enabling that flag now allows any absolute
Unix socket path from x-unix-socket instead of requiring each path to be
explicitly allowlisted. Relative paths are still rejected.
* updated the macOS seatbelt path in core so it enforces the same Unix
socket behavior:
  * allowlisted sockets generate explicit network* subpath rules
  * allow-all generates a broad network* (subpath "/") rule

---------

Co-authored-by: Codex <199175422+chatgpt-codex-connector[bot]@users.noreply.github.com>
2026-02-20 10:56:57 -08:00
Curtis 'Fjord' Hawthorne
73fd939296
js_repl: block wrapped payload prefixes in grammar (#12300)
## Summary

Tighten the `js_repl` freeform Lark grammar to block the most common
malformed payload wrappers before they reach runtime validation.

## What Changed

- Replaced the overly permissive `js_repl` freeform grammar (`start:
/[\s\S]*/`) with a structured grammar that still supports:
  - plain JS source
  - optional first-line `// codex-js-repl:` pragma followed by JS source
- Added grammar-level filtering for common bad payload shapes by
rejecting inputs whose first significant token starts with:
  - `{` (JSON object wrapper like `{"code":"..."}`)
  - `"` (quoted code string)
  - `` ``` `` (markdown code fences)
- Implemented the grammar without regex lookahead/lookbehind because the
API-side Lark regex engine does not support look-around.
- Added a unit test to validate the grammar shape and guard against
reintroducing unsupported lookaround.

## Why

`js_repl` is a freeform tool, but the model sometimes emits wrapped
payloads (JSON, quoted strings, markdown fences) instead of raw
JavaScript. We already reject those at runtime, but this change moves
the constraint into the tool grammar so the model is less likely to
generate invalid tool-call payloads in the first place.

## Testing

- `cargo test -p codex-core
js_repl_freeform_grammar_blocks_common_non_js_prefixes`
- `cargo test -p codex-core parse_freeform_args_rejects_`

## Notes

- This intentionally over-blocks a few uncommon valid JS starts (for
example top-level `{ ... }` blocks or top-level quoted directives like
`"use strict";`) in exchange for preventing the common wrapped-payload
mistakes.



#### [git stack](https://github.com/magus/git-stack-cli)
- 👉 `1` https://github.com/openai/codex/pull/12300
-  `2` https://github.com/openai/codex/pull/12275
-  `3` https://github.com/openai/codex/pull/12205
-  `4` https://github.com/openai/codex/pull/12185
-  `5` https://github.com/openai/codex/pull/10673
2026-02-20 10:47:07 -08:00
viyatb-oai
e8afaed502
Refactor network approvals to host/protocol/port scope (#12140)
## Summary
Simplify network approvals by removing per-attempt proxy correlation and
moving to session-level approval dedupe keyed by (host, protocol, port).
Instead of encoding attempt IDs into proxy credentials/URLs, we now
treat approvals as a destination policy decision.

- Concurrent calls to the same destination share one approval prompt.
- Different destinations (or same host on different ports) get separate
prompts.
- Allow once approves the current queued request group only.
- Allow for session caches that (host, protocol, port) and auto-allows
future matching requests.
- Never policy continues to deny without prompting.

Example:
- 3 calls: 
  - a.com (line 443)
  - b.com (line 443)
  - a.com (line 443)
=> 2 prompts total (a, b), second a waits on the first decision.
- a.com:80 is treated separately from a.com line 443

## Testing
- `just fmt` (in `codex-rs`)
- `cargo test -p codex-core tools::network_approval::tests`
- `cargo test -p codex-core` (unit tests pass; existing
integration-suite failures remain in this environment)
2026-02-20 10:39:55 -08:00
Max Johnson
41f15bf07b
app-server: add JSON tracing logs (#12287)
- add `LOG_FORMAT=json` support for app-server tracing logs via
`tracing_subscriber`'s built-in JSON formatter
- keep the default human-readable format unchanged and keep `RUST_LOG`
filtering behavior
- document the env var and update lockfile
2026-02-20 10:10:51 -08:00
pakrym-oai
86803ca9bf
Reuse connection between turns (#12294)
Add a pool of one to the model client to reuse connections across turns.
2026-02-20 10:09:46 -08:00
jif-oai
035c4c30bb
fix: nick name at thread/read (#12347) 2026-02-20 17:53:51 +00:00
Yaroslav Volovich
5b71246001
fix: simplify macOS sleep inhibitor FFI (#12340)
Summary
- simplify the macOS sleep inhibitor FFI by replacing `dlopen` / `dlsym`
/ `transmute` with normal IOKit extern calls and `SAFETY` comments
- switch to cfg-selected platform implementations
(`imp::SleepInhibitor`) instead of `Box<dyn ...>`
- check in minimal IOKit bindings generated with `bindgen` and include
them from the macOS backend
- enable direct IOKit linkage in Bazel macOS builds by registering
`IOKit` in the Bazel `osx.framework(...)` toolchain extension list
- update `Cargo.lock` and `MODULE.bazel.lock` after removing the
build-time `bindgen` dependency path

Testing
- `just fmt`
- `cargo clippy -p codex-utils-sleep-inhibitor --all-targets -- -D
warnings`
- `cargo test -p codex-utils-sleep-inhibitor`
- `bazel test //codex-rs/utils/sleep-inhibitor:all --test_output=errors`
- `just bazel-lock-update`
- `just bazel-lock-check`

Context
- follow-up to #11711 addressing Ryan's review comments
- `bindgen` is used to generate the checked-in bindings file, but not at
build time
2026-02-20 09:52:21 -08:00
jif-oai
fd67aba114
feat: do not enqueue phase 2 if not necessary (#12344) 2026-02-20 17:21:45 +00:00
jif-oai
5a30cd3f92
feat: better agent picker in TUI (#12332)
<img width="486" height="112" alt="Screenshot 2026-02-20 at 15 04 52"
src="https://github.com/user-attachments/assets/0d744f58-d902-4638-aeaf-27e7389ccd73"
/>
2026-02-20 15:40:34 +00:00
jif-oai
4d60c803ba
feat: cleaner TUI for sub-agents (#12327)
<img width="760" height="496" alt="Screenshot 2026-02-20 at 14 31 25"
src="https://github.com/user-attachments/assets/1983b825-bb47-417e-9925-6f727af56765"
/>
2026-02-20 15:26:33 +00:00
colby-oai
2036a5f5e0
Add MCP server context to otel tool_result logs (#12267)
Summary
- capture the origin for each configured MCP server and expose it via
the connection manager
- plumb MCP server name/origin into tool logging and emit
codex.tool_result events with those fields
- add unit coverage for origin parsing and extend OTEL tests to assert
empty MCP fields for non-MCP tools
- currently not logging full urls or url paths to prevent logging
potentially sensitive data

Testing
- Not run (not requested)
2026-02-20 10:26:19 -05:00
jif-oai
ede561b5d1
disable collab for phase 2 (#12326) 2026-02-20 14:51:17 +00:00
jif-oai
595665de35
chore: better agent names (#12328) 2026-02-20 14:51:09 +00:00
jif-oai
0f9eed3a6f
feat: add nick name to sub-agents (#12320)
Adding random nick name to sub-agents. Used for UX

At the same time, also storing and wiring the role of the sub-agent
2026-02-20 14:39:49 +00:00
jif-oai
03ff04cd65
chore: nit explorer (#12315) 2026-02-20 11:24:28 +00:00
jif-oai
a7632f68a6
Set memories phase reasoning effort constants (#12309)
## Summary
- add reasoning effort constants for the memories phase one and phase
two agents
- wire the constants into phase1 request creation and phase2 agent
configuration so the default efforts are always applied

## Testing
- Not run (not requested)
2026-02-20 09:25:35 +00:00
zuxin-oai
e747a8eb74
memories: add rollout_summary_file header to raw memories and tune prompts (#12221)
## Summary
- Add `rollout_summary_file: <generated>.md` to each thread header in
`raw_memories.md` so Phase 2 can reliably reference the canonical
rollout summary filename.
- Update the memory prompts/templates (`stage_one_system`,
`consolidation`, `read_path`) for the new task-oriented raw-memory /
MEMORY.md schema and stronger consolidation guidance.

## Details
- `codex-rs/core/src/memories/storage.rs`
- Writes the generated `rollout_summary_file` path into the per-thread
metadata header when rebuilding `raw_memories.md`.
- `codex-rs/core/src/memories/tests.rs`
- Verifies the canonical `rollout_summary_file` header is present and
ordered after `updated_at`/`cwd` in `raw_memories.md`.
- Verifies task-structured raw-memory content is preserved while the
canonical header is added.
- `codex-rs/core/templates/memories/*.md`
- Updates the stage-1 raw-memory format to task-grouped sections
(`task`, `task_group`, `task_outcome`).
- Updates Phase 2 consolidation guidance around recency (`updated_at`),
task-oriented `MEMORY.md` blocks, and richer evidence-backed
consolidation.
- Tweaks the quick memory pass wording to emphasize topics/workflows in
addition to keywords.

## Testing
- `cargo test -p codex-core memories`
2026-02-20 09:13:35 +00:00
Matthew Zeng
18bd6d2d71
[apps] Store apps tool cache in disk to reduce startup time. (#11822)
We now write MCP tools from installed apps to disk cache so that they
can be picked up instantly at startup. We still do a fresh fetch from
remote MCP server but it's non blocking unless there's a cache miss.

- [x] Store apps tool cache in disk to reduce startup time.
2026-02-19 22:06:51 -08:00
Max Johnson
b06f91c4fe
app-server: improve thread resume rejoin flow (#11776)
thread/resume response includes latest turn with all items, in band so
no events are stale or lost

Testing
- e2e tested using app-server-test-client using flow described in
"Testing Thread Rejoin Behavior" in
codex-rs/app-server-test-client/README.md
- e2e tested in codex desktop by reconnecting to a running turn
2026-02-20 05:29:05 +00:00
Michael Bolin
366ecaf17a
app-server: fix flaky list_apps_returns_connectors_with_accessible_flags test (#12286)
## Why

`app/list` emits `app/list/updated` after whichever async load finishes
first (directory connectors or accessible tools). This test assumed the
directory-backed update always arrived first because it injected a tools
delay, but that assumption is not stable when the process-global Codex
Apps tools cache is already warm. In that case the accessible-tools path
can return immediately and the first notification shape flips, which
makes the assertion flaky.

Relevant code paths:

-
[`codex-rs/app-server/src/codex_message_processor.rs`](13ec97d72e/codex-rs/app-server/src/codex_message_processor.rs (L4949-L5034))
(concurrent loads + per-load `app/list/updated` notifications)
-
[`codex-rs/core/src/mcp_connection_manager.rs`](13ec97d72e/codex-rs/core/src/mcp_connection_manager.rs (L1182-L1197))
(Codex Apps tools cache hit path)

## What Changed

Updated
`suite::v2::app_list::list_apps_returns_connectors_with_accessible_flags`
in `codex-rs/app-server/tests/suite/v2/app_list.rs` to accept either
valid first `app/list/updated` payload:

- the directory-first snapshot
- the accessible-tools-first snapshot

The test still keeps the later assertions strict:

- the second `app/list/updated` notification must be the fully merged
result
- the final `app/list` response must match the same merged result

I also added an inline comment explaining why the first notification is
intentionally order-insensitive.

## Verification

- `cargo test -p codex-app-server`
2026-02-20 02:27:18 +00:00
Michael Bolin
4fa304306b
tests: centralize in-flight turn cleanup helper (#12271)
## Why

Several tests intentionally exercise behavior while a turn is still
active. The cleanup sequence for those tests (`turn/interrupt` + waiting
for `codex/event/turn_aborted`) was duplicated across files, which made
the rationale easy to lose and the pattern easy to apply inconsistently.

This change centralizes that cleanup in one place with a single
explanatory doc comment.

## What Changed

### Added shared helper

In `codex-rs/app-server/tests/common/mcp_process.rs`:

- Added `McpProcess::interrupt_turn_and_wait_for_aborted(...)`.
- Added a doc comment explaining why explicit interrupt + terminal wait
is required for tests that intentionally leave a turn in-flight.

### Migrated call sites

Replaced duplicated interrupt/aborted blocks with the helper in:

- `codex-rs/app-server/tests/suite/v2/thread_resume.rs`
  - `thread_resume_rejects_history_when_thread_is_running`
  - `thread_resume_rejects_mismatched_path_when_thread_is_running`
- `codex-rs/app-server/tests/suite/v2/turn_start_zsh_fork.rs`
  - `turn_start_shell_zsh_fork_executes_command_v2`
-
`turn_start_shell_zsh_fork_subcommand_decline_marks_parent_declined_v2`
- `codex-rs/app-server/tests/suite/v2/turn_steer.rs`
  - `turn_steer_returns_active_turn_id`

### Existing cleanup retained

In `codex-rs/app-server/tests/suite/v2/turn_start.rs`:

- `turn_start_accepts_local_image_input` continues to explicitly wait
for `turn/completed` so the turn lifecycle is fully drained before test
exit.

## Verification

- `cargo test -p codex-app-server`
2026-02-20 01:47:34 +00:00
xl-openai
e4456840f5
skill-creator: lazy-load PyYAML in frontmatter parsing (#12080)
init-skill should work even without PyYAML
2026-02-19 15:09:12 -08:00
mjr-openai
3293538e12
Update pnpm versions to fix cve-2026-24842 (#12009)
Update pnpm versions to resolve CVE-2026-24842
2026-02-19 14:27:55 -08:00