Commit graph

3066 commits

Author SHA1 Message Date
Max Johnson
fb0aaf94de
codex-rs: fix thread resume rejoin semantics (#11756)
## Summary
- always rejoin an in-memory running thread on `thread/resume`, even
when overrides are present
- reject `thread/resume` when `history` is provided for a running thread
- reject `thread/resume` when `path` mismatches the running thread
rollout path
- warn (but do not fail) on override mismatches for running threads
- add more `thread_resume` integration tests and fixes; including
restart-based resume-with-overrides coverage
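
The rules in this summary can be sketched as a single decision function. This is a minimal illustration, not the app-server code: `ResumeRequest`, `RunningThread`, `ResumeOutcome`, and their fields are hypothetical names, and a model override stands in for the general case of "overrides".

```rust
use std::path::PathBuf;

/// Hypothetical view of a `thread/resume` request (field names are illustrative).
pub struct ResumeRequest {
    pub has_history: bool,
    pub path: Option<PathBuf>,
    pub model_override: Option<String>,
}

/// Hypothetical state of a thread that is already running in memory.
pub struct RunningThread {
    pub rollout_path: PathBuf,
    pub model: String,
}

#[derive(Debug, PartialEq)]
pub enum ResumeOutcome {
    /// Rejoin the running thread; warnings are reported but do not fail the call.
    Rejoin { warnings: Vec<String> },
    /// Refuse the request.
    Reject(String),
}

pub fn resume_running_thread(req: &ResumeRequest, running: &RunningThread) -> ResumeOutcome {
    // `history` is never accepted for a thread that is already running.
    if req.has_history {
        return ResumeOutcome::Reject("history provided for a running thread".into());
    }
    // A supplied `path` must match the running thread's rollout path.
    if let Some(path) = &req.path {
        if path != &running.rollout_path {
            return ResumeOutcome::Reject("path mismatches the running rollout path".into());
        }
    }
    // Override mismatches only produce warnings; the thread is still rejoined.
    let mut warnings = Vec::new();
    if let Some(model) = &req.model_override {
        if model != &running.model {
            warnings.push(format!("model override {model} differs from running {}", running.model));
        }
    }
    ResumeOutcome::Rejoin { warnings }
}
```

In this sketch a request with a mismatched override still returns `Rejoin`, carrying one warning, while a request with `history` set is rejected before any other check runs.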

## Validation
- `just fmt`
- `cargo test -p codex-app-server --test all thread_resume`
- manual test with app-server-test-client
https://github.com/openai/codex/pull/11755
- manual test both stdio and websocket in app
2026-02-13 23:09:58 +00:00
Jeremy Rose
e4f8263798
[app-server] add fuzzyFileSearch/sessionCompleted (#11773)
This lets the client know when to stop showing a spinner.
2026-02-13 15:08:14 -08:00
pash-openai
a5e8e69d18
turn metadata followups (#11782)
some trivial simplifications from #11677
2026-02-13 14:59:16 -08:00
Charley Cunningham
26a7cd21e2
tui: preserve remote image attachments across resume/backtrack (#10590)
## Summary
This PR makes app-server-provided image URLs first-class attachments in
TUI, so they survive resume/backtrack/history recall and are resubmitted
correctly.

<img width="715" height="491" alt="Screenshot 2026-02-12 at 8 27 08 PM"
src="https://github.com/user-attachments/assets/226cbd35-8f0c-4e51-a13e-459ef5dd1927"
/>

Can delete the attached image upon backtracking:
<img width="716" height="301" alt="Screenshot 2026-02-12 at 8 27 31 PM"
src="https://github.com/user-attachments/assets/4558d230-f1bd-4eed-a093-8e1ab9c6db27"
/>

In both history and composer, remote images are rendered as normal
`[Image #N]` placeholders, with numbering unified with local images.

## What changed
- Plumb remote image URLs through TUI message state:
  - `UserHistoryCell`
  - `BacktrackSelection`
  - `ChatComposerHistory::HistoryEntry`
  - `ChatWidget::UserMessage`
- Show remote images as placeholder rows inside the composer box (above
textarea), and in history cells.
- Support keyboard selection/deletion for remote image rows in composer
(`Up`/`Down`, `Delete`/`Backspace`).
- Preserve remote-image-only turns in local composer history (Up/Down
recall), including restore after backtrack.
- Ensure submit/queue/backtrack resubmit include remote images in model
input (`UserInput::Image`), and keep request shape stable for
remote-image-only turns.
- Keep image numbering contiguous across remote + local images:
  - remote images occupy `[Image #1]..[Image #M]`
  - local images start at `[Image #M+1]`
  - deletion renumbers consistently.
- In protocol conversion, increment shared image index for remote images
too, so mixed remote/local image tags stay in a single sequence.
- Simplify restore logic to trust in-memory attachment order (no
placeholder-number parsing path).
- Backtrack/replay rollback handling now queues trims through
`AppEvent::ApplyThreadRollback` and syncs transcript overlay/deferred
lines after trims, so overlay/transcript state stays consistent.
- Trim trailing blank rendered lines from user history rendering to
avoid oversized blank padding.
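
The contiguous numbering rule above (remote images first, local images continuing the same sequence) can be sketched as follows. `image_placeholders` is a hypothetical helper written for illustration, not a function from the TUI.

```rust
/// Build `[Image #N]` placeholders for `remote` remote images followed by
/// `local` local images, sharing one contiguous sequence.
/// Hypothetical helper; the TUI's real data structures are richer.
pub fn image_placeholders(remote: usize, local: usize) -> (Vec<String>, Vec<String>) {
    // Remote images occupy [Image #1]..[Image #M].
    let remote_labels = (1..=remote).map(|n| format!("[Image #{n}]")).collect();
    // Local images start at [Image #M+1].
    let local_labels = (remote + 1..=remote + local)
        .map(|n| format!("[Image #{n}]"))
        .collect();
    (remote_labels, local_labels)
}
```

Deleting a remote image then amounts to recomputing the labels with `remote - 1`, which shifts every later label down by one, in line with "deletion renumbers consistently".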

## Docs + tests
- Updated: `docs/tui-chat-composer.md` (remote image flow,
selection/deletion, numbering offsets)
- Added/updated tests across `tui/src/chatwidget/tests.rs`,
`tui/src/app.rs`, `tui/src/app_backtrack.rs`, `tui/src/history_cell.rs`,
and `tui/src/bottom_pane/chat_composer.rs`
- Added snapshot coverage for remote image composer states, including
deleting the first of two remote images.

## Validation
- `just fmt`
- `cargo test -p codex-tui`

## Codex author
`codex fork 019c2636-1571-74a1-8471-15a3b1c3f49d`
2026-02-13 14:54:06 -08:00
Max Johnson
395729910c
rmcp-client: fix auth crash (#11692)
Don't load auth tokens if bearer token is present. This fixes a crash I
was getting on Linux:

```
2026-02-12T23:26:24.999408Z DEBUG session_init: codex_core::codex: Configuring session: model=gpt-5.3-codex-spark; provider=ModelProviderInfo { name: "OpenAI", base_url: None, env_key: None, env_key_instructions: None, experimental_bearer_token: None, wire_api: Responses, query_params: None, http_headers: Some({"version": "0.0.0"}), env_http_headers: Some({"OpenAI-Project": "OPENAI_PROJECT", "OpenAI-Organization": "OPENAI_ORGANIZATION"}), request_max_retries: None, stream_max_retries: None, stream_idle_timeout_ms: None, requires_openai_auth: true, supports_websockets: true }
2026-02-12T23:26:24.999799Z TRACE session_init: codex_keyring_store: keyring.load start, service=Codex MCP Credentials, account=codex_apps|20398391ad12d90b

thread 'tokio-runtime-worker' (96190) has overflowed its stack
fatal runtime error: stack overflow, aborting
    Finished `dev` profile [unoptimized + debuginfo] target(s) in 1.35s
```
2026-02-13 14:32:01 -08:00
pash-openai
6c0a924203
turn metadata: per-turn non-blocking (#11677) 2026-02-13 12:48:29 -08:00
Alex Kwiatkowski
a4bb59884b
fix(nix): use correct version from Cargo.toml in flake build (#11770)
## Summary

- When building via `nix build`, the binary reports `codex-cli 0.0.0`
because the workspace `Cargo.toml` uses `0.0.0` as a placeholder on
`main`. This causes the update checker to always prompt users to upgrade
even when running the latest code.
- Reads the version from `codex-rs/Cargo.toml` at flake evaluation time
using `builtins.fromTOML` and patches it into the workspace `Cargo.toml`
before cargo builds via `postPatch`.
- On release commits (e.g. tag `rust-v0.101.0`), the real version is
used as-is. On `main` branch builds, falls back to
`0.0.0-dev+<shortRev>` (or `0.0.0-dev+dirty`), which the update
checker's `parse_version` ignores — suppressing the spurious upgrade
prompt.

| Scenario | Cargo.toml version | Nix `version` | Binary reports | Upgrade nag? |
|---|---|---|---|---|
| Release commit (e.g. `rust-v0.101.0`) | `0.101.0` | `0.101.0` | `codex-cli 0.101.0` | Only if newer exists |
| Main branch (committed) | `0.0.0` | `0.0.0-dev+b934ffc` | `codex-cli 0.0.0-dev+b934ffc` | No |
| Main branch (uncommitted) | `0.0.0` | `0.0.0-dev+dirty` | `codex-cli 0.0.0-dev+dirty` | No |

## Test plan

- [ ] `nix build` from `main` branch and verify `codex --version`
reports `0.0.0-dev+<shortRev>` instead of `0.0.0`
- [ ] Verify the update checker does not show a spurious upgrade prompt
for dev builds
- [ ] Confirm that on a release commit where `Cargo.toml` has a real
version, the binary reports that version correctly
2026-02-13 12:19:25 -08:00
alexsong-oai
e71760fc64
support app usage analytics (#11687)
Emit app-mentioned and app-used events, deduplicated by `(turn_id, connector_id)`.

Example event params:
{
    "event_type": "codex_app_used",
    "connector_id": "asdk_app_xxx",
    "thread_id": "019c5527-36d4-xxx",
    "turn_id": "019c552c-cd17-xxx",
    "app_name": "Slack (OpenAI Internal)",
    "product_client_id": "codex_cli_rs",
    "invoke_type": "explicit",
    "model_slug": "gpt-5.3-codex"
}
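
Deduplication by `(turn_id, connector_id)` can be sketched with a set of seen keys. `AppUsageDeduper` is a hypothetical type for illustration, and keeping one instance per event type is an assumption rather than something the commit states.

```rust
use std::collections::HashSet;

/// Hypothetical deduper: remembers which (turn_id, connector_id) pairs
/// have already produced an event.
#[derive(Default)]
pub struct AppUsageDeduper {
    seen: HashSet<(String, String)>,
}

impl AppUsageDeduper {
    /// Returns `true` only the first time a pair is seen, so the caller
    /// emits at most one event per connector per turn.
    pub fn should_emit(&mut self, turn_id: &str, connector_id: &str) -> bool {
        // `HashSet::insert` returns false when the value was already present.
        self.seen.insert((turn_id.to_owned(), connector_id.to_owned()))
    }
}
```

The same connector used again in a later turn has a different key, so it is reported again.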
2026-02-13 12:00:16 -08:00
Curtis 'Fjord' Hawthorne
a02342c9e1
Add js_repl kernel crash diagnostics (#11666)
## Summary

This PR improves `js_repl` crash diagnostics so kernel failures are
debuggable without weakening timeout/reset guarantees.

## What Changed

- Added bounded kernel stderr capture and truncation logic (line + byte
caps).
- Added structured kernel snapshots (`pid`, exit status, stderr tail)
for failure paths.
- Enriched model-visible kernel-failure errors with a structured
diagnostics payload:
  - `js_repl diagnostics: {...}`
  - Included only for likely kernel-failure write/EOF cases.
- Improved logging around kernel write failures, unexpected exits, and
kill/wait paths.
- Added/updated unit tests for:
  - UTF-8-safe truncation
  - stderr tail bounds
  - structured diagnostics shape/truncation
  - conditional diagnostics emission
  - timeout kill behavior
  - forced kernel-failure diagnostics
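
As an illustration of the UTF-8-safe truncation mentioned above, a byte cap can be applied without splitting a multi-byte character by backing up to the nearest character boundary. `truncate_utf8` is a hypothetical name; the commit's own helper may differ.

```rust
/// Return the longest prefix of `s` that is at most `max_bytes` long and
/// ends on a UTF-8 character boundary. Hypothetical helper for illustration.
pub fn truncate_utf8(s: &str, max_bytes: usize) -> &str {
    if s.len() <= max_bytes {
        return s;
    }
    // Walk backwards until the cut point no longer lands inside a character.
    // Index 0 is always a boundary, so the loop terminates.
    let mut end = max_bytes;
    while !s.is_char_boundary(end) {
        end -= 1;
    }
    &s[..end]
}
```

For example, `truncate_utf8("héllo", 2)` returns `"h"`, because byte index 2 falls in the middle of the two-byte `é`.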

## Why

Before this, failures like broken pipe / unexpected kernel exit often
surfaced as generic errors with little context. This change preserves
existing behavior but adds actionable diagnostics while keeping output
bounded.

## Scope

- Code changes are limited to:
  - `/Users/fjord/code/codex-jsrepl-seq/codex-rs/core/src/tools/js_repl/mod.rs`

## Validation

- `cargo clippy -p codex-core --all-targets -- -D warnings`
- Targeted `codex-core` js_repl unit tests (including new
diagnostics/timeout coverage)
- Tried starting a long running js_repl command (sleep for 10 minutes),
verified error output was as expected after killing the node process.

#### [git stack](https://github.com/magus/git-stack-cli)
- 👉 `1` https://github.com/openai/codex/pull/11666
-  `2` https://github.com/openai/codex/pull/10673
-  `3` https://github.com/openai/codex/pull/10670
2026-02-13 11:57:11 -08:00
Matthew Zeng
8468871e2b
[apps] Improve app listing filtering. (#11697)
- [x] If an installed app is not on the app listing, remove it from the
final list.
2026-02-13 11:54:16 -08:00
jif-oai
c54a4ec078
chore: mini (#11772)
https://github.com/openai/codex/issues/11764
2026-02-13 19:30:49 +00:00
zuxin-oai
b934ffcaaa
Update read_path prompt (#11763)
## Summary

- Created branch zuxin/read-path-update from main.
- Copied codex-rs/core/templates/memories/read_path.md from the current
branch.
- Committed the content change.

## Testing
Not run (content copy + commit only).
2026-02-13 18:34:54 +00:00
Eric Traut
b98c810328
Report syntax errors in rules file (#11686)
Currently, if there are syntax errors detected in the starlark rules
file, the entire policy is silently ignored by the CLI. The app server
correctly emits a message that can be displayed in a GUI.

This PR changes the CLI (both the TUI and non-interactive exec) to fail
when the rules file can't be parsed. It then prints out an error message
and exits with a non-zero exit code. This is consistent with the
handling of errors in the config file.

This addresses #11603
2026-02-13 10:33:40 -08:00
Yaroslav Volovich
32da5eb358
feat(tui): prevent macOS idle sleep while turns run (#11711)
## Summary
- add a shared `codex-core` sleep inhibitor that uses native macOS IOKit
assertions (`IOPMAssertionCreateWithName` / `IOPMAssertionRelease`)
instead of spawning `caffeinate`
- wire sleep inhibition to turn lifecycle in `tui` (`TurnStarted`
enables; `TurnComplete` and abort/error finalization disable)
- gate this behavior behind a `/experimental` feature toggle
(`[features].prevent_idle_sleep`) instead of a dedicated `[tui]` config
flag
- expose the toggle in `/experimental` on macOS; keep it under
development on other platforms
- keep behavior no-op on non-macOS targets

<img width="1326" height="577" alt="image"
src="https://github.com/user-attachments/assets/73fac06b-97ae-46a2-800a-30f9516cf8a3"
/>

## Testing
- `cargo check -p codex-core -p codex-tui`
- `cargo test -p codex-core sleep_inhibitor::tests -- --nocapture`
- `cargo test -p codex-core
tui_config_missing_notifications_field_defaults_to_enabled --
--nocapture`
- `cargo test -p codex-core prevent_idle_sleep_is_ -- --nocapture`

## Semantics and API references
- This PR targets `caffeinate -i` semantics: prevent *idle system sleep*
while allowing display idle sleep.
- `caffeinate -i` mapping in Apple open source (`assertionMap`):
  - `kIdleAssertionFlag -> kIOPMAssertionTypePreventUserIdleSystemSleep`
  - Source: https://github.com/apple-oss-distributions/PowerManagement/blob/PowerManagement-1846.60.12/caffeinate/caffeinate.c#L52-L54
- Apple IOKit docs for assertion types and API:
  - https://developer.apple.com/documentation/iokit/iopmlib_h/iopmassertiontypes
  - https://developer.apple.com/documentation/iokit/1557092-iopmassertioncreatewithname
  - https://developer.apple.com/library/archive/qa/qa1340/_index.html

## Codex Electron vs this PR (full stack path)
- Codex Electron app requests sleep blocking with
`powerSaveBlocker.start("prevent-app-suspension")`:
  - https://github.com/openai/codex/blob/main/codex/codex-vscode/electron/src/electron-message-handler.ts
- Electron maps that string to Chromium wake lock type
`kPreventAppSuspension`:
  - https://github.com/electron/electron/blob/main/shell/browser/api/electron_api_power_save_blocker.cc
- Chromium macOS backend maps wake lock types to IOKit assertion
constants and calls IOKit:
  - `kPreventAppSuspension -> kIOPMAssertionTypeNoIdleSleep`
  - `kPreventDisplaySleep / kPreventDisplaySleepAllowDimming -> kIOPMAssertionTypeNoDisplaySleep`
  - https://github.com/chromium/chromium/blob/main/services/device/wake_lock/power_save_blocker/power_save_blocker_mac.cc

## Why this PR uses a different macOS constant name
- This PR uses `"PreventUserIdleSystemSleep"` directly, via
`IOPMAssertionCreateWithName`, in
`codex-rs/core/src/sleep_inhibitor.rs`.
- Apple’s IOKit header documents `kIOPMAssertionTypeNoIdleSleep` as
deprecated and recommends `kIOPMAssertPreventUserIdleSystemSleep` /
`kIOPMAssertionTypePreventUserIdleSystemSleep`:
  - https://github.com/apple-oss-distributions/IOKitUser/blob/IOKitUser-100222.60.2/pwr_mgt.subproj/IOPMLib.h#L1000-L1030
- So Chromium and this PR use different constant names but get
semantically equivalent idle-system-sleep prevention behavior.

## Future platform support
The architecture is intentionally set up for multi-platform extensions:
- UI code (`tui`) only calls `SleepInhibitor::set_turn_running(...)` on
turn lifecycle boundaries.
- Platform-specific behavior is isolated in
`codex-rs/core/src/sleep_inhibitor.rs` behind `cfg(...)` blocks.
- Feature exposure is centralized in `core/src/features.rs` and surfaced
via `/experimental`.
- Adding new OS backends should not require additional TUI wiring; only
the backend internals and feature stage metadata need to change.
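
The layering described above can be sketched as a small wrapper that the UI drives through `set_turn_running(...)`, with platform work hidden behind a backend. Everything here other than the method name `set_turn_running` is an assumption made for illustration: the `SleepBackend` trait, the `CountingBackend` stand-in, and the field names are hypothetical, and the real module uses `cfg(...)` blocks rather than a trait.

```rust
/// Hypothetical platform backend. A macOS implementation would create and
/// release an IOKit power assertion here; other platforms can stay no-op.
pub trait SleepBackend {
    fn acquire(&mut self);
    fn release(&mut self);
}

/// Stand-in backend that only counts calls, so the logic can be exercised
/// without touching any OS API.
#[derive(Default)]
pub struct CountingBackend {
    pub acquired: u32,
    pub released: u32,
}

impl SleepBackend for CountingBackend {
    fn acquire(&mut self) {
        self.acquired += 1;
    }
    fn release(&mut self) {
        self.released += 1;
    }
}

pub struct SleepInhibitor<B: SleepBackend> {
    backend: B,
    feature_enabled: bool, // mirrors the feature toggle
    holding: bool,         // whether an assertion is currently held
}

impl<B: SleepBackend> SleepInhibitor<B> {
    pub fn new(backend: B, feature_enabled: bool) -> Self {
        Self { backend, feature_enabled, holding: false }
    }

    /// Called on turn lifecycle boundaries; idempotent for repeated calls.
    pub fn set_turn_running(&mut self, running: bool) {
        let want = self.feature_enabled && running;
        if want == self.holding {
            return;
        }
        if want {
            self.backend.acquire();
        } else {
            self.backend.release();
        }
        self.holding = want;
    }

    pub fn backend(&self) -> &B {
        &self.backend
    }
}
```

With this shape, a new OS backend only has to implement the two backend methods; the caller's turn-start and turn-complete wiring stays unchanged.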

Potential follow-up implementations:
- Windows:
  - Add a backend using Win32 power APIs
    (`SetThreadExecutionState(ES_CONTINUOUS | ES_SYSTEM_REQUIRED)` as
    baseline).
  - Optionally move to `PowerCreateRequest` / `PowerSetRequest` /
    `PowerClearRequest` for richer assertion semantics.
- Linux:
  - Add a backend using logind inhibitors over D-Bus
    (`org.freedesktop.login1.Manager.Inhibit` with `what="sleep"`).
  - Keep a no-op fallback where logind/D-Bus is unavailable.

This PR keeps the cross-platform API surface minimal so future PRs can
add Windows/Linux support incrementally with low churn.

---------

Co-authored-by: jif-oai <jif@openai.com>
2026-02-13 10:31:39 -08:00
jif-oai
851fcc377b
feat: switch on dying sub-agents (#11477)
[codex-generated]

## Problem

When a sub-agent thread emits `ShutdownComplete`, the TUI switches back
to the primary thread.
That was also happening for user-requested exits (for example `Ctrl+C`),
which could prevent a
clean app exit and unexpectedly resurrect the main thread.

## Mental model

The app has one primary thread and one active thread. A non-primary
active thread shutting down
usually means "agent died, fail back to primary," but during
`ExitMode::ShutdownFirst` shutdown
means "the user is exiting," not "recover this session."

## Non-goals

No change to thread lifecycle, thread-manager ownership, or shutdown
protocol wire format.
No behavioral changes to non-shutdown events.

## Tradeoffs

This adds a small local marker (`pending_shutdown_exit_thread_id`)
instead of inferring intent
from event timing. It is deterministic and simple, but relies on
correctly setting and clearing
that marker around exit.

## Architecture

`App` tracks which thread is intentionally being shut down for exit.
`active_non_primary_shutdown_target` centralizes failover eligibility
for `ShutdownComplete` and
skips failover when shutdown matches the pending-exit thread.
`handle_active_thread_event` handles non-primary failover before generic
forwarding and clears the
pending-exit marker only when the matching active thread completes
shutdown.
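
A minimal sketch of the eligibility check described above, under simplified assumptions: `ThreadId`, `Event`, and the `App` fields other than `pending_shutdown_exit_thread_id` are hypothetical stand-ins, and the real method's signature may differ.

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct ThreadId(pub u64);

/// Simplified event type; only the shutdown case matters here.
pub enum Event {
    ShutdownComplete,
    Other,
}

pub struct App {
    pub primary_thread_id: Option<ThreadId>,
    pub active_thread_id: Option<ThreadId>,
    /// Thread that is intentionally being shut down because the user is exiting.
    pub pending_shutdown_exit_thread_id: Option<ThreadId>,
}

impl App {
    /// Returns `(closed_thread, primary_thread)` when the TUI should fail
    /// over to the primary thread, and `None` otherwise.
    pub fn active_non_primary_shutdown_target(&self, event: &Event) -> Option<(ThreadId, ThreadId)> {
        if !matches!(event, Event::ShutdownComplete) {
            return None; // non-shutdown events never trigger failover
        }
        let active = self.active_thread_id?;
        let primary = self.primary_thread_id?;
        if active == primary {
            return None; // the primary thread itself shut down
        }
        if self.pending_shutdown_exit_thread_id == Some(active) {
            return None; // the user is exiting; do not resurrect the primary thread
        }
        Some((active, primary))
    }
}
```

The marker only suppresses failover when it names the active thread, so a pending exit recorded for some other thread still lets a dying sub-agent fail back to the primary.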

## Observability

User-facing info/error messages continue to indicate whether failover to
the main thread succeeded.
The shutdown-intent path is now explicitly documented inline for easier
debugging.

## Tests

Added targeted tests for `active_non_primary_shutdown_target` covering
non-shutdown events,
primary-thread shutdown, non-primary shutdown failover, pending exit on
active thread (no failover),
and pending exit for another thread (still failover).

Validated with:
- `cargo test -p codex-tui` (pass)

---------

Co-authored-by: Josh McKinney <joshka@openai.com>
2026-02-13 18:29:03 +00:00
iceweasel-oai
99466f1f90
sandbox NUX metrics update (#11667)
just updating metrics to match the NUX tweaks we made this week.
2026-02-13 10:01:47 -08:00
Michael Bolin
2383978a2c
fix: reduce flakiness of compact_resume_after_second_compaction_preserves_history (#11663)
## Why
`compact_resume_after_second_compaction_preserves_history` has been
intermittently flaky in Windows CI.

The test had two one-shot request matchers in the second compact/resume
phase that could overlap, and it waited for the first `Warning` event
after compaction. In practice, that made the test sensitive to
platform/config-specific prompt shape and unrelated warning timing.

## What Changed
- Hardened the second compaction matcher in
`codex-rs/core/tests/suite/compact_resume_fork.rs` so it accepts
expected compact-request variants while explicitly excluding the
`AFTER_SECOND_RESUME` payload.
- Updated `compact_conversation()` to wait for the specific compaction
warning (`COMPACT_WARNING_MESSAGE`) rather than any `Warning` event.
- Added an inline comment explaining why the matcher is intentionally
broad but disjoint from the follow-up resume matcher.

## Test Plan
- `cargo test -p codex-core --test all
suite::compact_resume_fork::compact_resume_after_second_compaction_preserves_history
-- --exact`
- Repeated the same test in a loop (40 runs) to check for local
nondeterminism.
2026-02-13 09:51:22 -08:00
Max Johnson
f687b074ca
app-server-test-client websocket client and thread tools (#11755)
- add websocket endpoint mode with default ws://127.0.0.1:4222 while
keeping stdio codex-bin path compatibility
- add thread-resume (follow stream) and thread-list commands for manual
thread lifecycle testing
- quickstart docs
2026-02-13 17:34:35 +00:00
Anton Panasenko
38c442ca7f
core: limit search_tool_bm25 to Apps and clarify discovery guidance (#11669)
## Summary
- Limit `search_tool_bm25` indexing to `codex_apps` tools only, so
non-Apps MCP servers are no longer discoverable through this search
path.
- Move search-tool discovery guidance into the `search_tool_bm25` tool
description (via template include) instead of injecting it as a separate
developer message.
- Update Apps discovery guidance wording to clarify when to use
`search_tool_bm25` for Apps-backed systems (for example Slack, Google
Drive, Jira, Notion) and when to call tools directly.
- Remove dead `core` helper code (`filter_codex_apps_mcp_tools` and
`codex_apps_connector_id`) that is no longer used after the
tool-selection refactor.
- Update `core` search-tool tests to assert codex-apps-only behavior and
to validate guidance from the tool description.

## Validation
-  `just fmt`
-  `cargo test -p codex-core search_tool`
- ⚠️ `cargo test -p codex-core` was attempted, but the run repeatedly
stalled on
`tools::js_repl::tests::js_repl_can_attach_image_via_view_image_tool`.

## Tickets
- None
2026-02-13 09:32:46 -08:00
jif-oai
c0749c349f
Fix memories output schema requirements (#11748)
Summary
- make the phase1 memories schema require `rollout_slug` while still
allowing it to be `null`
- update the corresponding test to check the required fields and
nullable type list

Testing
- Not run (not requested)
2026-02-13 16:17:21 +00:00
jif-oai
561fc14045
chore: move explorer to spark (#11745) 2026-02-13 16:13:24 +00:00
jif-oai
db66d827be
feat: add slug in name (#11739) 2026-02-13 15:24:03 +00:00
jif-oai
bc80a4a8ed
feat: increase windows workers stack (#11736)
Switched arg0 runtime initialization from tokio::runtime::Runtime::new()
to an explicit multi-thread builder that sets the thread stack size to
16MiB.

This is only for Windows for now, but we might need to do this for other
platforms in the future. It is required because Codex has become quite
large and Windows tends to consume stack a little faster (this is a known
issue, even though everyone seems to have a different theory about why).
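
The idea of giving worker threads a larger stack can be illustrated with the standard library alone. This sketch swaps in `std::thread::Builder::stack_size` for the tokio runtime builder the commit describes (tokio's builder exposes a comparable `thread_stack_size` setting); the recursive workload and the function names are hypothetical.

```rust
use std::thread;

/// 16 MiB, matching the stack size named in the commit message.
const STACK_SIZE: usize = 16 * 1024 * 1024;

/// Deliberately stack-hungry recursion: each frame holds a 4 KiB buffer.
fn recurse(depth: u64) -> u64 {
    let frame = std::hint::black_box([0u8; 4096]);
    if depth == 0 {
        u64::from(frame[0])
    } else {
        // Using `frame` after the call keeps the buffer alive across it.
        1 + recurse(depth - 1) + u64::from(frame[4095])
    }
}

/// Run the recursion on a thread created with an explicit stack size.
pub fn run_with_large_stack(depth: u64) -> u64 {
    thread::Builder::new()
        .stack_size(STACK_SIZE)
        .spawn(move || recurse(depth))
        .expect("failed to spawn worker thread")
        .join()
        .expect("worker thread panicked")
}
```

Calling `run_with_large_stack(500)` uses a couple of megabytes of stack at minimum, which is the kind of workload where the default size for spawned threads can become too small.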
2026-02-13 15:16:57 +00:00
jif-oai
e00080cea3
feat: memories config (#11731) 2026-02-13 14:18:15 +00:00
jif-oai
36541876f4
chore: streamline phase 2 (#11712) 2026-02-13 13:21:11 +00:00
jif-oai
feae389942
Lower missing rollout log level (#11722)
Fix this: https://github.com/openai/codex/issues/11634
2026-02-13 12:59:17 +00:00
jif-oai
e5e40e2d4b
feat: add token usage on memories (#11618)
Add aggregated token usage metrics on phase 1 of memories
2026-02-13 09:31:20 +00:00
viyatb-oai
2bced810da
feat(network-proxy): structured policy signaling and attempt correlation to core (#11662)
## Summary
When network requests were blocked, downstream code often had to infer
ask vs deny from free-form response text. That was brittle and led to
incorrect approval behavior.
This PR fixes the proxy side so blocked decisions are structured and
request metadata survives reliably.

## Description
- Blocked proxy responses now carry consistent structured policy
decision data.
- Request attempt metadata is preserved across proxy env paths
(including ALL_PROXY flows).
- Header stripping was tightened so we still remove unsafe forwarding
headers, but keep metadata needed for policy handling.
- Block messages were clarified (for example, allowlist miss vs explicit
deny).
- Added unified violation log entries so policy failures can be
inspected in one place.
- Added/updated tests for these behaviors.

---------

Co-authored-by: Codex <199175422+chatgpt-codex-connector[bot]@users.noreply.github.com>
2026-02-13 09:01:11 +00:00
Dylan Hurd
e6e4c5fa3a
chore(core) Restrict model-suggested rules (#11671)
## Summary
If the model suggests a bad rule, don't show it to the user. This does
not impact the parsing of existing rules, just the ones we show.

## Testing
- [x] Added unit tests
- [x] Ran locally
2026-02-12 23:57:53 -08:00
Josh McKinney
1e75173ebd
Point Codex App tooltip links to app landing page (#11515)
### Motivation
- Ensure the in-TUI Codex App call-to-action opens the app landing page
variant `https://chatgpt.com/codex?app-landing-page=true` so users reach
the intended landing experience.

### Description
- Update tooltip constants in `codex-rs/tui/src/tooltips.rs` to replace
`https://chatgpt.com/codex` with
`https://chatgpt.com/codex?app-landing-page=true` for the PAID and OTHER
tooltip variants.

### Testing
- Ran `just fmt` in `codex-rs` and `cargo test -p codex-tui`, and the
test suite completed successfully.

------
[Codex
Task](https://chatgpt.com/codex/tasks/task_i_698d20cf6f088329bb82b07d3ce76e61)
2026-02-12 23:35:57 -08:00
sayan-oai
abeafbdca1
fix: dont show NUX for upgrade-target models that are hidden (#11679)
Don't show NUX for models marked with `visibility:hide`.

Tested locally
2026-02-12 20:29:22 -08:00
Matthew Zeng
f93037f55d
[apps] Fix app loading logic. (#11518)
When `app/list` is called with `force_refetch=True`, we should seed the
results with what is already cached instead of starting from an empty
list. Otherwise when we send app/list/updated events, the client will
first see an empty list of accessible apps and then get the updated one.
2026-02-13 03:55:10 +00:00
Dylan Hurd
35692e99c1
chore(approvals) More approvals scenarios (#11660)
## Summary
Add some additional tests to the approvals flow.

## Testing
- [x] these are tests
2026-02-12 19:54:54 -08:00
acrognale-oai
ebe359b876
Add cwd as an optional field to thread/list (#11651)
Adds the ability to filter app-server `thread/list` by `cwd`.
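
An optional filter of this kind might look like the sketch below. `ThreadSummary` and `filter_by_cwd` are hypothetical names, and exact path equality is an assumption; the commit does not say how paths are compared.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical list entry; the real `thread/list` items carry more fields.
pub struct ThreadSummary {
    pub id: String,
    pub cwd: PathBuf,
}

/// Keep every thread when `cwd` is `None`; otherwise keep only threads
/// whose working directory equals the requested one.
pub fn filter_by_cwd<'a>(threads: &'a [ThreadSummary], cwd: Option<&Path>) -> Vec<&'a ThreadSummary> {
    threads
        .iter()
        .filter(|t| cwd.map_or(true, |wanted| t.cwd.as_path() == wanted))
        .collect()
}
```

Because the field is optional, existing callers that omit `cwd` keep getting the unfiltered list.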
2026-02-13 02:05:04 +00:00
Eric Traut
537102e657
Added a test to verify that feature flags that are enabled by default are stable (#11275)
We've had a few cases recently where someone enabled a feature flag for
a feature that's still under development or experimental. This test
should prevent this.
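
The invariant such a test enforces can be sketched as a check over a feature table. `Stage`, `FeatureSpec`, and `unstable_defaults` are hypothetical names for illustration; the real feature registry may be shaped differently.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Stage {
    UnderDevelopment,
    Experimental,
    Stable,
}

/// Hypothetical feature table entry.
pub struct FeatureSpec {
    pub key: &'static str,
    pub stage: Stage,
    pub default_enabled: bool,
}

/// Return the keys of features that are on by default without being stable.
/// A guard test would assert that this list is empty.
pub fn unstable_defaults(features: &[FeatureSpec]) -> Vec<&'static str> {
    features
        .iter()
        .filter(|f| f.default_enabled && f.stage != Stage::Stable)
        .map(|f| f.key)
        .collect()
}
```

Returning the offending keys rather than a boolean lets the failing test name exactly which flag was enabled too early.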
2026-02-12 17:53:15 -08:00
Jeremy Rose
9cf7a07281
feat(shell-tool-mcp): add patched zsh build pipeline (#11668)
## Summary
- add `shell-tool-mcp/patches/zsh-exec-wrapper.patch` against upstream
zsh `77045ef899e53b9598bebc5a41db93a548a40ca6`
- add `zsh-linux` and `zsh-darwin` jobs to
`.github/workflows/shell-tool-mcp.yml`
- stage zsh binaries under `artifacts/vendor/<target>/zsh/<variant>/zsh`
- include zsh artifact jobs in `package.needs`
- mark staged zsh binaries executable during packaging

## Notes
- zsh source is cloned from `https://git.code.sf.net/p/zsh/code`
- workflow pins zsh commit `77045ef899e53b9598bebc5a41db93a548a40ca6`
- zsh build runs `./Util/preconfig` before `./configure`

## Validation
- parsed workflow YAML locally (`yaml-ok`)
- validated zsh patch applies cleanly with `git apply --check` on a
fresh zsh clone
2026-02-13 01:34:48 +00:00
Josh McKinney
fc073c9c5b
Remove git commands from dangerous command checks (#11510)
### Motivation

- Git subcommand matching was being classified as "dangerous" and caused
benign developer workflows (for example `git push --force-with-lease`)
to be blocked by the preflight policy.
- The change aligns behavior with the intent to reserve the dangerous
checklist for truly destructive shell ops (e.g. `rm -rf`) and avoid
surprising developer-facing blocks.

### Description

- Remove git-specific subcommand checks from
`is_dangerous_to_call_with_exec` in
`codex-rs/shell-command/src/command_safety/is_dangerous_command.rs`,
leaving only explicit `rm` and `sudo` passthrough checks.
- Deleted the git-specific helper logic that classified `reset`,
`branch`-delete, `push` (force/delete/refspec) and `clean --force` as
dangerous.
- Updated unit tests in the same file to assert that various `git
reset`/`git branch`/`git push`/`git clean` variants are no longer
classified as dangerous.
- Kept `find_git_subcommand` (used by safe-command classification)
intact so safe/unsafe parsing elsewhere remains functional.
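
After this change the check reduces to roughly the shape below. This is a simplified sketch under stated assumptions: the function name is hypothetical, and the specific `rm` flags matched here (`-f`, `-rf`) are an assumption, not a confirmed copy of the code in `is_dangerous_command.rs`.

```rust
/// Sketch of a dangerous-command check that keeps only an explicit `rm`
/// rule and a `sudo` passthrough. The matched flags are assumptions.
pub fn is_dangerous_sketch(command: &[String]) -> bool {
    match command.first().map(String::as_str) {
        // Forced removal is treated as dangerous.
        Some("rm") => matches!(command.get(1).map(String::as_str), Some("-f" | "-rf")),
        // `sudo <cmd>` is as dangerous as `<cmd>` itself.
        Some("sudo") => is_dangerous_sketch(&command[1..]),
        // Everything else, including all git subcommands, is not flagged here.
        _ => false,
    }
}
```

Under this sketch `git push --force-with-lease` is no longer flagged, while `sudo rm -rf /tmp/x` still is.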

### Testing

- Ran formatter with `just fmt` successfully.  
- Ran unit tests with `cargo test -p codex-shell-command` and all tests
passed (`144 passed; 0 failed`).

------
[Codex
Task](https://chatgpt.com/codex/tasks/task_i_698d19dedb4883299c3ceb5bbc6a0dcf)
2026-02-13 01:33:02 +00:00
Charley Cunningham
f24669d444
Persist complete TurnContextItem state via canonical conversion (#11656)
## Summary

This PR delivers the first small, shippable step toward model-visible
state diffing by making
`TurnContextItem` more complete and standardizing how it is built.

Specifically, it:
- Adds persisted network context to `TurnContextItem`.
- Introduces a single canonical `TurnContext -> TurnContextItem`
conversion path.
- Routes existing rollout write sites through that canonical conversion
helper.

No context injection/diff behavior changes are included in this PR.

## Why this change

The design goal is to make `TurnContextItem` the canonical source of
truth for context-diff
decisions.
Before this PR:
- `TurnContextItem` did not include all TurnContext-derived environment
inputs needed for v1
completeness.
- Construction was duplicated at multiple write sites.

This PR addresses both with a minimal, reviewable change.

## Changes

### 1) Extend `TurnContextItem` with network state
- Added `TurnContextNetworkItem { allowed_domains, denied_domains }`.
- Added `network: Option<TurnContextNetworkItem>` to `TurnContextItem`.
- Kept backward compatibility by making the new field optional and
skipped when absent.

Files:
- `codex-rs/protocol/src/protocol.rs`

### 2) Canonical conversion helper
- Added `TurnContext::to_turn_context_item(collaboration_mode)` in core.
- Added internal helper to derive network fields from
`config_layer_stack.requirements().network`.

Files:
- `codex-rs/core/src/codex.rs`

### 3) Use canonical conversion at rollout write sites
- Replaced ad hoc `TurnContextItem { ... }` construction with
`to_turn_context_item(...)` in:
  - sampling request path
  - compaction path

Files:
- `codex-rs/core/src/codex.rs`
- `codex-rs/core/src/compact.rs`

### 4) Update fixtures/tests for new optional field
- Updated existing `TurnContextItem` literals in tests to include
`network: None`.
- Added protocol tests for:
  - deserializing old payloads with no `network`
  - serializing when `network` is present

Files:
- `codex-rs/core/tests/suite/resume_warning.rs`
- No replay/diff logic changes.
- Persisted rollout `TurnContextItem` now carries additional network
context when available.
- Older rollout lines without `network` remain readable.
2026-02-12 17:22:44 -08:00
canvrno-oai
46b2da35d5
Add new apps_mcp_gateway (#11630)
Adds a new apps_mcp_gateway flag to route Apps MCP calls through
https://api.openai.com/v1/connectors/mcp/ when enabled, while keeping
legacy MCP routing as default.
2026-02-12 16:54:11 -08:00
Matthew Zeng
c37560069a
[apps] Add is_enabled to app info. (#11417)
- [x] Add is_enabled to app info and the response of `app/list`.
- [x] Update TUI to have Enable/Disable button on the app detail page.
2026-02-13 00:30:52 +00:00
Owen Lin
8d97b5c246
fix(app-server): surface more helpful errors for json-rpc (#11638)
Propagate client JSON-RPC errors for app-server request callbacks.
Previously a number of possible errors were collapsed to `channel
closed`. Now we should be able to see the underlying client error.

### Summary
This change stops masking client JSON-RPC error responses as generic
callback cancellation in app-server server->client request flows.

Previously, when the client responded with a JSON-RPC error, we removed
the callback entry but did not send anything to the waiting oneshot
receiver. Waiters then observed channel closure (for example, auth
refresh request canceled: channel closed), which hid the actual client
error.

Now, client JSON-RPC errors are forwarded through the callback channel
and handled explicitly by request consumers.
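
The difference between a dropped callback and a forwarded error can be sketched with a standard-library channel standing in for the oneshot channel. `JsonRpcError`, `ClientResult`, and `describe_refresh` are hypothetical names; the message strings follow the ones quoted in this description.

```rust
use std::sync::mpsc;

/// Hypothetical, trimmed-down JSON-RPC error payload.
#[derive(Debug, PartialEq)]
pub struct JsonRpcError {
    pub code: i64,
    pub message: String,
}

/// What the waiter receives: a successful result or the client's error.
pub type ClientResult = Result<String, JsonRpcError>;

/// Turn the outcome of waiting on the callback channel into a message.
pub fn describe_refresh(rx: mpsc::Receiver<ClientResult>) -> String {
    match rx.recv() {
        Ok(Ok(_)) => "auth refresh succeeded".to_string(),
        // The client replied with an error and it was forwarded to the waiter.
        Ok(Err(e)) => format!(
            "auth refresh request failed: code={} message={}",
            e.code, e.message
        ),
        // The sender was dropped without sending anything.
        Err(_) => "auth refresh request canceled: channel closed".to_string(),
    }
}
```

Sending `Err(JsonRpcError { .. })` through the channel yields the "failed" message with the client's code, whereas dropping the sender without sending reproduces the old "channel closed" report.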

### User-visible behavior
- External auth refresh now surfaces real client JSON-RPC errors when
provided.
- True transport/callback-drop cases still report
canceled/channel-closed semantics.

### Example: client JSON-RPC error is now propagated (not masked as
"canceled")

When app-server asks the client to refresh ChatGPT auth tokens, it sends
a server->client JSON-RPC request like:

```json
{
  "id": 42,
  "method": "account/chatgptAuthTokens/refresh",
  "params": {
    "reason": "unauthorized",
    "previousAccountId": "org-abc"
  }
}
```

If the client cannot refresh and responds with a JSON-RPC error:
```json
{
  "id": 42,
  "error": {
    "code": -32000,
    "message": "refresh failed",
    "data": null
  }
}
```

app-server now forwards that error through the callback path and
surfaces:
`auth refresh request failed: code=-32000 message=refresh failed`

Previously, this same case could be reported as:
`auth refresh request canceled: channel closed`
2026-02-13 00:14:55 +00:00
Michael Bolin
2825ac85a8
app-server: stabilize detached review start on Windows (#11646)
## Why

`review_start_with_detached_delivery_returns_new_thread_id` has been
failing on Windows CI. The failure mode is a process crash
(`tokio-runtime-worker` stack overflow) during detached review setup,
which causes EOF in the test harness.

This test is intended to validate detached review thread identity, not
shell snapshot behavior. We also still want detached review to avoid
unnecessary rollout-path rediscovery when the parent thread is already
loaded.

## What Changed

- Updated detached review startup in
`codex-rs/app-server/src/codex_message_processor.rs`:
  - `start_detached_review` now receives the loaded parent thread.
  - It prefers `parent_thread.rollout_path()`.
- It falls back to `find_thread_path_by_id_str(...)` only if the
in-memory path is unavailable.
- Hardened the review test fixture in
`codex-rs/app-server/tests/suite/v2/review.rs` by setting
`shell_snapshot = false` in test config, so this test no longer depends
on unrelated Windows PowerShell snapshot initialization.
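
The "prefer the in-memory path, fall back to lookup" order described for `start_detached_review` can be sketched as follows. This is an assumption-laden illustration: `LoadedThread`, its field, and `resolve_rollout_path` are hypothetical, and the closure stands in for `find_thread_path_by_id_str(...)`, whose real signature is not shown in this description.

```rust
use std::path::PathBuf;

/// Hypothetical stand-in for a parent thread that is already loaded in memory.
struct LoadedThread {
    rollout_path: Option<PathBuf>,
}

impl LoadedThread {
    /// Returns the rollout path if the loaded thread knows it.
    fn rollout_path(&self) -> Option<PathBuf> {
        self.rollout_path.clone()
    }
}

/// Prefers the path held by the loaded parent thread and runs the (assumed
/// more expensive) lookup only when the in-memory path is unavailable.
fn resolve_rollout_path<F>(parent: &LoadedThread, find_on_disk: F) -> Option<PathBuf>
where
    F: FnOnce() -> Option<PathBuf>,
{
    // `or_else` is lazy: the closure is not called when a path is already present.
    parent.rollout_path().or_else(find_on_disk)
}
```

Because `or_else` evaluates its closure lazily, the rediscovery step is skipped entirely whenever the parent thread already carries its path.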

## Verification

- `cargo test -p codex-app-server`
- Verified
`suite::v2::review::review_start_with_detached_delivery_returns_new_thread_id`
passes locally.

## Notes

- Related context: rollout-path lookup behavior changed in #10532.
2026-02-12 16:12:44 -08:00
Michael Bolin
aef4af1079
app-server tests: disable shell_snapshot for review suite (#11657)
## Why


`suite::v2::review::review_start_with_detached_delivery_returns_new_thread_id`
was failing on Windows CI due to an unrelated process crash during shell
snapshot initialization (`tokio-runtime-worker` stack overflow).

This review test suite validates review API behavior and should not
depend on shell snapshot behavior. Keeping shell snapshot enabled in
this fixture made the test flaky for reasons outside the scenario under
test.

## What Changed

- Updated the review suite test config in
`codex-rs/app-server/tests/suite/v2/review.rs` to set:
  - `shell_snapshot = false`

This keeps the review tests focused on review behavior by disabling
shell snapshot initialization in this fixture.

## Verification

- `cargo test -p codex-app-server`
- Confirmed the previously failing Windows CI job for this test now
passes on this PR.
2026-02-12 23:56:43 +00:00
Curtis 'Fjord' Hawthorne
0dcfc59171
Add js_repl_tools_only model and routing restrictions (#10671)
#### [git stack](https://github.com/magus/git-stack-cli)
-  `1` https://github.com/openai/codex/pull/10674
-  `2` https://github.com/openai/codex/pull/10672
- 👉 `3` https://github.com/openai/codex/pull/10671
-  `4` https://github.com/openai/codex/pull/10673
-  `5` https://github.com/openai/codex/pull/10670
2026-02-12 15:41:05 -08:00
Wendy Jiao
a7ce2a1c31
Remove absolute path in rollout_summary (#11622)
2026-02-12 23:32:41 +00:00
Celia Chen
dfd1e199a0
[feat] add seatbelt permission files (#11639)
Add a seatbelt permission extension abstraction in the form of permission
files for seatbelt profiles. This should complement our current sandbox policy.
2026-02-12 23:30:22 +00:00
Owen Lin
76256a8cec
fix: skip review_start_with_detached_delivery_returns_new_thread_id on windows (#11645)
2026-02-12 15:12:57 -08:00
Michael Bolin
a4cc1a4a85
feat: introduce Permissions (#11633)
## Why
We currently carry multiple permission-related concepts directly on
`Config` for shell/unified-exec behavior (`approval_policy`,
`sandbox_policy`, `network`, `shell_environment_policy`,
`windows_sandbox_mode`).

Consolidating these into one in-memory struct makes permission handling
easier to reason about and sets up the next step: supporting named
permission profiles (`[permissions.PROFILE_NAME]`) without changing
behavior now.

This change is mostly mechanical: it updates existing callsites to go
through `config.permissions`, but it does not yet refactor those
callsites to take a single `Permissions` value in places where multiple
permission fields are still threaded separately.

This PR intentionally **does not** change the on-disk `config.toml`
format yet and keeps compatibility with legacy config keys.

## What Changed
- Introduced `Permissions` in `core/src/config/mod.rs`.
- Added `Config::permissions` and moved effective runtime permission
fields under it:
  - `approval_policy`
  - `sandbox_policy`
  - `network`
  - `shell_environment_policy`
  - `windows_sandbox_mode`
- Updated config loading/building so these effective values are still
derived from the same existing config inputs and constraints.
- Updated Windows sandbox helpers/resolution to read/write via
`permissions`.
- Threaded the new field through all permission consumers across core
runtime, app-server, CLI/exec, TUI, and sandbox summary code.
- Updated affected tests to reference `config.permissions.*`.
- Renamed the struct/field from
`EffectivePermissions`/`effective_permissions` to
`Permissions`/`permissions` and aligned variable naming accordingly.
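
The shape of the consolidation can be sketched as below. Only the struct name, the `Config::permissions` field, and the five field names come from this description; every type here (`ApprovalPolicy`, `SandboxPolicy`, `ShellEnvironmentPolicy`, the `Option<String>` fields, the `model` field) and the `needs_approval` helper are simplified placeholders, not the real definitions in `core/src/config/mod.rs`.

```rust
// Placeholder types for illustration; the real ones are richer.
#[derive(Debug, Clone, PartialEq)]
enum ApprovalPolicy {
    OnRequest,
    Never,
}

#[derive(Debug, Clone, PartialEq)]
enum SandboxPolicy {
    ReadOnly,
    WorkspaceWrite,
}

#[derive(Debug, Clone, PartialEq, Default)]
struct ShellEnvironmentPolicy {
    inherit_all: bool,
}

/// Groups the effective runtime permission fields that used to sit on `Config`.
#[derive(Debug, Clone, PartialEq)]
struct Permissions {
    approval_policy: ApprovalPolicy,
    sandbox_policy: SandboxPolicy,
    network: Option<String>,              // placeholder type
    shell_environment_policy: ShellEnvironmentPolicy,
    windows_sandbox_mode: Option<String>, // placeholder type
}

#[derive(Debug, Clone)]
struct Config {
    model: String, // unrelated setting, shown only for contrast
    permissions: Permissions,
}

/// Hypothetical callsite: before this change it would have read
/// `config.approval_policy`; now it goes through `config.permissions`.
fn needs_approval(config: &Config) -> bool {
    config.permissions.approval_policy != ApprovalPolicy::Never
}
```

Callsites change mechanically from `config.<field>` to `config.permissions.<field>`, which matches the description of this PR as a mostly mechanical update.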

## Verification
- `just fix -p codex-core -p codex-tui -p codex-cli -p codex-app-server
-p codex-exec -p codex-utils-sandbox-summary`
- `cargo build -p codex-core -p codex-tui -p codex-cli -p
codex-app-server -p codex-exec -p codex-utils-sandbox-summary`
2026-02-12 14:42:54 -08:00
xl-openai
d7cb70ed26
Better error message for model limit hit. (#11636)
<img width="553" height="147" alt="image"
src="https://github.com/user-attachments/assets/f04cdebd-608a-4055-a413-fae92aaf04e5"
/>
2026-02-12 14:10:30 -08:00
Dylan Hurd
4668feb43a
chore(core) Deprecate approval_policy: on-failure (#11631)
## Summary
In an effort to start simplifying our sandbox setup, we're deprecating
the `on-failure` value of `approval_policy`. In general, it performs worse
than `on-request`, and we're focusing on making fewer sandbox configurations
perform much better.
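
One way a deprecated value can stay accepted while still warning the user is sketched below. This is not the code from this PR: the enum lists only the two values named above (other policy values are omitted), and `parse_approval_policy`, `deprecation_notice`, and the notice wording are hypothetical.

```rust
/// Only the two values named in this description; other values are omitted.
#[derive(Debug, Clone, Copy, PartialEq)]
enum ApprovalPolicy {
    OnFailure,
    OnRequest,
}

/// Parses the config string; the deprecated value is still accepted.
fn parse_approval_policy(raw: &str) -> Option<ApprovalPolicy> {
    match raw {
        "on-failure" => Some(ApprovalPolicy::OnFailure),
        "on-request" => Some(ApprovalPolicy::OnRequest),
        _ => None,
    }
}

/// Returns a notice for deprecated values (wording is illustrative only).
fn deprecation_notice(policy: ApprovalPolicy) -> Option<&'static str> {
    match policy {
        ApprovalPolicy::OnFailure => {
            Some("approval_policy \"on-failure\" is deprecated; consider \"on-request\"")
        }
        ApprovalPolicy::OnRequest => None,
    }
}
```

Keeping parsing and the notice separate means existing configs continue to load while the warning can be surfaced wherever the caller chooses.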

## Testing
- [x] Tested locally
- [x] Existing tests pass
2026-02-12 13:23:30 -08:00