Commit graph

4710 commits

Author SHA1 Message Date
Eric Traut
281b0eae8b
Don't allow model_supports_reasoning_summaries to disable reasoning (#11833)
The `model_supports_reasoning_summaries` config option was originally
added so users could enable reasoning for custom models (models that
codex doesn't know about). This is how it was documented in the source,
but its implementation didn't match. It was implemented such that it can
also be used to disable reasoning for models that otherwise support
reasoning. This leads to bad behavior for some reasoning models like
`gpt-5.3-codex`. Diagnosing this is difficult, and it has led to many
support issues.

This PR changes the handling of `model_supports_reasoning_summaries` so
it matches its original documented behavior. If it is set to false, it
is a no-op. That is, it never disables reasoning for models that are
known to support reasoning. It can still be used for its intended
purpose -- to enable reasoning for unknown models.
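
The intended semantics can be sketched as below. This is a minimal illustration, not the actual codex code: the function name and both parameters are hypothetical.

```rust
/// Hypothetical helper (not from the codex source): decide whether reasoning
/// is enabled for a model.
/// `known_support` is what codex already knows about the model;
/// `config_override` is the value of `model_supports_reasoning_summaries`, if set.
fn reasoning_enabled(known_support: bool, config_override: Option<bool>) -> bool {
    // `Some(true)` can only turn reasoning on for unknown models.
    // `Some(false)` and `None` are no-ops and never disable a known model.
    known_support || config_override == Some(true)
}
```

With this shape, setting the option to false for a model that is known to support reasoning leaves reasoning enabled.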
2026-02-17 07:19:28 -08:00
jif-oai
4ab44e2c5c
feat: add --compact mode to just log (#11994)
Summary:
- add a `--compact` flag to the logs client to suppress thread/target
info
- format rows and timestamps differently when compact mode is enabled so
only hour time, level, and message remain
2026-02-17 14:21:26 +00:00
jif-oai
31d4bfdde0
feat: add --search to just log (#11995)
Summary
- extend the log client to accept an optional `--search` substring
filter when querying codex-state logs
- propagate the filter through `LogQuery` and apply it in
`push_log_filters` via `INSTR(message, ...)`
- add an integration test that exercises the new search filtering
behavior
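
A rough sketch of how an optional substring filter might be pushed onto a query. The function below is hypothetical and only mirrors the idea of `push_log_filters`; the `> 0` comparison is an assumption based on SQLite's `INSTR` returning 0 when the substring is absent.

```rust
/// Hypothetical sketch (not the real `push_log_filters`): append a substring
/// predicate and its bound parameter when a search term is present.
fn push_search_filter(
    predicates: &mut Vec<String>,
    params: &mut Vec<String>,
    search: Option<&str>,
) {
    if let Some(needle) = search {
        // SQLite's INSTR returns a 1-based position, or 0 when not found.
        predicates.push("INSTR(message, ?) > 0".to_string());
        params.push(needle.to_string());
    }
}
```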

Testing
- Not run (not requested)
2026-02-17 14:19:52 +00:00
jif-oai
56cd85cd4b
nit: wording multi-agent (#11986) 2026-02-17 11:45:59 +00:00
jif-oai
5ae84197b2
Exit early when session initialization fails (#11908)
Summary
- wait for the initial session startup loop to finish and handle exit
before waiting for the first message in fresh sessions
- propagate AppRunControl::Exit to return immediately when
initialization fails
2026-02-17 11:22:30 +00:00
Dylan Hurd
fcf16e97a6
fix(ci) Fix shell-tool-mcp.yml (#11969)
## Summary
We're seeing failures for shell-tool-mcp.yml during git checkouts. This
is a quick attempt to unblock releases - we should revisit this build
pipeline since we've hit a number of errors.
2026-02-17 11:13:18 +00:00
jif-oai
77f74a5c17
fix: race in js repl (#11922)
js_repl_reset previously raced with in-flight/new js_repl executions
because reset() could clear exec_tool_calls without synchronizing with
execute(). In that window, a running exec could lose its per-exec
tool-call context, and subsequent kernel RunTool messages would fail
with js_repl exec context not found. The fix serializes reset and
execute on the same exec_lock, so reset cannot run concurrently with
exec setup/teardown. We also keep the timeout path safe by performing
reset steps inline while execute() already holds the lock, avoiding
re-entrant lock acquisition. A regression test now verifies that reset
waits for the exec lock and does not clear tool-call state early.
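
The locking pattern described above can be sketched with std primitives. This is an illustration only: the `Repl` type, its field, and the method signatures are hypothetical, and the real code is async rather than built on `std::sync::Mutex`.

```rust
use std::collections::HashMap;
use std::sync::Mutex;

/// Hypothetical stand-in for the js_repl manager; all names are illustrative.
struct Repl {
    // One lock guards the per-exec state, so reset and execute cannot interleave.
    exec_lock: Mutex<HashMap<String, u32>>,
}

impl Repl {
    fn new() -> Self {
        Repl { exec_lock: Mutex::new(HashMap::new()) }
    }

    /// Reset steps for callers that already hold the lock.
    fn reset_locked(calls: &mut HashMap<String, u32>) {
        calls.clear();
    }

    /// Public reset: blocks until any running execute releases the lock.
    fn reset(&self) {
        let mut calls = self.exec_lock.lock().unwrap();
        Self::reset_locked(&mut calls);
    }

    /// Holds the lock across setup and teardown. On timeout it resets inline
    /// instead of calling `reset()`, which would try to re-acquire the lock.
    /// Returns whether the exec context was still present at teardown.
    fn execute(&self, exec_id: &str, timed_out: bool) -> bool {
        let mut calls = self.exec_lock.lock().unwrap();
        calls.insert(exec_id.to_string(), 0);
        if timed_out {
            Self::reset_locked(&mut calls);
            return false;
        }
        calls.remove(exec_id).is_some()
    }
}
```

Splitting the reset body into a helper that takes the already-locked state is one common way to avoid re-entrant acquisition.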
2026-02-17 11:06:14 +00:00
jif-oai
b994b52994
Hide /debug slash commands from popup menu (#11974)
Summary
- filter command popup builtins to remove any `/debug*` entries so they
stay usable but are not listed
- added regression tests to ensure the popup hides debug commands while
dispatch still resolves them
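
A small sketch of the split between listing and dispatch. The function names and the command names used in any test are hypothetical; only the `/debug*` filtering rule comes from the summary above.

```rust
/// Hypothetical sketch: the popup lists every builtin except `debug*` entries.
fn popup_commands<'a>(builtins: &[&'a str]) -> Vec<&'a str> {
    builtins
        .iter()
        .copied()
        .filter(|name| !name.starts_with("debug"))
        .collect()
}

/// Dispatch still resolves any builtin by name, including hidden ones.
fn resolve<'a>(builtins: &[&'a str], input: &str) -> Option<&'a str> {
    let name = input.strip_prefix('/')?;
    builtins.iter().copied().find(|candidate| *candidate == name)
}
```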
2026-02-17 10:30:17 +00:00
jif-oai
846464e869
fix: js_repl reset hang by clearing exec tool calls without waiting (#11932)
Remove the waiting loop in `reset` so it no longer blocks on potentially
hanging exec tool calls, and add `clear_all_exec_tool_calls_map` to drain
the map and notify waiters so that `reset` completes immediately.
2026-02-17 08:40:54 +00:00
Dylan Hurd
0fbe10a807
fix(core) exec_policy parsing fixes (#11951)
## Summary
Fixes a few things in our exec_policy handling of prefix_rules:
1. Correctly match redirects specifically for exec_policy parsing. i.e.
if you have `prefix_rule(["echo"], decision="allow")` then `echo hello >
output.txt` should match - this should fix #10321
2. If there already exists any rule that would match our prefix rule
(not just a prompt), then drop it, since it won't do anything.


## Testing
- [x] Updated unit tests, added approvals ScenarioSpecs
2026-02-16 23:11:59 -08:00
Fouad Matin
02e9006547
add(core): safety check downgrade warning (#11964)
Add per-turn notice when a request is downgraded to a fallback model due
to cyber safety checks.

**Changes**

- codex-api: Emit a ServerModel event based on the openai-model response
header and/or response payload (SSE + WebSocket), including when the
model changes mid-stream.
- core: When the server-reported model differs from the requested model,
emit a single per-turn warning explaining the reroute to gpt-5.2 and
directing users to Trusted Access verification and the cyber safety
explainer.
- app-server (v2): Surface these cyber model-routing warnings as
synthetic userMessage items with text prefixed by Warning: (and document
this behavior).
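
The "single per-turn warning" behavior in core could look roughly like this. The type, method, and message text are hypothetical and stand in for whatever the real turn state does.

```rust
/// Hypothetical per-turn state; field and method names are illustrative.
#[derive(Default)]
struct TurnState {
    warned: bool,
}

impl TurnState {
    /// Returns a warning the first time the server-reported model differs
    /// from the requested one in this turn, and `None` on every later event.
    fn on_server_model(&mut self, requested: &str, reported: &str) -> Option<String> {
        if reported == requested || self.warned {
            return None;
        }
        self.warned = true;
        Some(format!("Warning: request for {requested} was rerouted to {reported}"))
    }
}
```

Creating a fresh `TurnState` per turn is what would make the notice per-turn rather than per-session in this sketch.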
2026-02-16 22:13:36 -08:00
Eric Traut
08f689843f
Fixed screen reader regression in CLI (#11860)
The `tui.animations` switch should gate all animations in the TUI, but a
recent change introduced a regression that didn't include the gate. This
makes it difficult to use the TUI with a screen reader.

This fix addresses #11856
2026-02-16 18:17:52 -08:00
Fouad Matin
b37555dd75
add(feedback): over-refusal / safety check (#11948)
Add new feedback option for "Over-refusal / safety check"
2026-02-16 16:24:47 -08:00
Dylan Hurd
19afbc35c1
chore(core) rm Feature::RequestRule (#11866)
## Summary
This feature is now reasonably stable, so let's remove its feature flag to
simplify our upcoming iterations here.

## Testing 
- [x] Existing tests pass
2026-02-16 22:30:23 +00:00
Matthew Zeng
5b421bba34
[apps] Fix app mention syntax. (#11894)
- [x] Fix app mention syntax.
2026-02-16 22:01:49 +00:00
jif-oai
beb5cb4f48
Rename collab modules to multi agents (#11939)
Summary
- rename the `collab` handlers and UI files to `multi_agents` to match
the new naming
- update module references and specs so the handlers and TUI widgets
consistently use the renamed files
- keep the existing functionality while aligning file and module names
with the multi-agent terminology
2026-02-16 19:05:13 +00:00
jif-oai
af434b4f71
feat: drop MCP managing tools if no MCP servers (#11900)
Drop the MCP managing tools when there are no MCP servers, to save context.

Addresses https://github.com/openai/codex/issues/11049
2026-02-16 18:40:45 +00:00
Vaibhav Srivastav
cef7fbc494
docs: mention Codex app in README intro (#11926)
Add mention of the app in the README.
2026-02-16 17:35:05 +01:00
jif-oai
e47045c806
feat: add customizable roles for multi-agents (#11917)
The idea is to have two families of agents:

1. Built-in agents that are packaged directly with Codex.
2. User-defined agents that are declared in the `agents_config.toml` file. It
can reference config files that override the agent config. It looks like
this:
```
version = 1

[agents.explorer]
description = """Use `explorer` for all codebase questions.
Explorers are fast and authoritative.
Always prefer them over manual search or file reading.
Rules:
- Ask explorers first and precisely.
- Do not re-read or re-search code they cover.
- Trust explorer results without verification.
- Run explorers in parallel when useful.
- Reuse existing explorers for related questions."""
config_file = "explorer.toml"
```
2026-02-16 16:29:32 +00:00
jif-oai
50aea4b0dc
nit: memory storage (#11924) 2026-02-16 16:18:53 +00:00
jif-oai
e41536944e
chore: rename collab feature flag key to multi_agent (#11918)
Summary
- rename the collab feature key to multi_agent while keeping the Feature
enum unchanged
- add legacy alias support so both "multi_agent" and "collab" map to the
same feature
- cover the alias behavior with a new unit test
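
A minimal sketch of the alias lookup. The enum variant name `Collab` and the function name are assumptions made for illustration; only the two key strings come from the summary above.

```rust
/// Assumed variant name; the summary only says the Feature enum is unchanged.
#[derive(Debug, PartialEq)]
enum Feature {
    Collab,
}

/// Hypothetical lookup: the new key and the legacy alias map to the same feature.
fn feature_for_key(key: &str) -> Option<Feature> {
    match key {
        "multi_agent" | "collab" => Some(Feature::Collab),
        _ => None,
    }
}
```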
2026-02-16 15:28:31 +00:00
gt-oai
b3095679ed
Allow hooks to error (#11615)
Allow hooks to return errors. 

We should do this before introducing more hook types, or we'll have to
migrate them all.
2026-02-16 14:11:05 +00:00
jif-oai
825a4af42f
feat: use shell policy in shell snapshot (#11759)
Honor `shell_environment_policy.set` even after a shell snapshot
2026-02-16 09:11:00 +00:00
Anton Panasenko
1d95656149
bazel: fix snapshot parity for tests/*.rs rust_test targets (#11893)
## Summary
- make `rust_test` targets generated from `tests/*.rs` use Cargo-style
crate names (file stem) so snapshot names match Cargo (`all__...`
instead of Bazel-derived names)
- split lib vs `tests/*.rs` test env wiring in `codex_rust_crate` to
keep existing lib snapshot behavior while applying Bazel
runfiles-compatible workspace root for `tests/*.rs`
- compute the `tests/*.rs` snapshot workspace root from package depth so
`insta` resolves committed snapshots under Bazel `--noenable_runfiles`

## Validation
- `bazelisk test //codex-rs/core:core-all-test
--test_arg=suite::compact:: --cache_test_results=no`
- `bazelisk test //codex-rs/core:core-all-test
--test_arg=suite::compact_remote:: --cache_test_results=no`
2026-02-16 07:11:59 +00:00
sayan-oai
bdea9974d9
fix: only emit unknown model warning on user turns (#11884)
###### Context
The unknown model warning added in #11690 has
[issues](https://github.com/openai/codex/actions/runs/22047424710/job/63700733887)
on Ubuntu runners because we potentially emit it on all new turns,
including ones with intentionally fake models (e.g., `mock-model` in a
test).

###### Fix
Change the warning to only emit on user turns and review turns.

###### Tests
CI now passes on Ubuntu and still passes locally.
2026-02-15 21:18:35 -08:00
Anton Panasenko
02abd9a8ea
feat: persist and restore codex app's tools after search (#11780)
### What changed
1. Removed per-turn MCP selection reset in `core/src/tasks/mod.rs`.
2. Added `SessionState::set_mcp_tool_selection(Vec<String>)` in
`core/src/state/session.rs` for authoritative restore behavior (deduped,
order-preserving, empty clears).
3. Added rollout parsing in `core/src/codex.rs` to recover
`active_selected_tools` from prior `search_tool_bm25` outputs:
   - tracks matching `call_id`s
   - parses function output text JSON
   - extracts `active_selected_tools`
   - latest valid payload wins
   - malformed/non-matching payloads are ignored
4. Applied restore logic to resumed and forked startup paths in
`core/src/codex.rs`.
5. Updated instruction text to session/thread scope in
`core/templates/search_tool/tool_description.md`.
6. Expanded tests in `core/tests/suite/search_tool.rs`, plus unit
coverage in:
   - `core/src/codex.rs`
   - `core/src/state/session.rs`

### Behavior after change
1. Search activates matched tools.
2. Additional searches union into active selection.
3. Selection survives new turns in the same thread.
4. Resume/fork restores selection from rollout history.
5. Separate threads do not inherit selection unless forked.
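
The restore and union behavior can be sketched as follows. `set_mcp_tool_selection` is named in the change list, but its body here is a guess at the described contract (deduped, order-preserving, empty clears); the `merge` method and the `selected` field are hypothetical.

```rust
/// Hypothetical, simplified session state.
#[derive(Default)]
struct SessionState {
    selected: Vec<String>,
}

impl SessionState {
    /// Authoritative restore: replaces the selection, dropping duplicates
    /// while keeping first-seen order. An empty input clears the selection.
    fn set_mcp_tool_selection(&mut self, tools: Vec<String>) {
        self.selected.clear();
        self.merge(tools);
    }

    /// Hypothetical helper: a later search unions its matches into the
    /// active selection without reordering existing entries.
    fn merge(&mut self, tools: Vec<String>) {
        for tool in tools {
            if !self.selected.contains(&tool) {
                self.selected.push(tool);
            }
        }
    }
}
```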
2026-02-15 19:18:41 -08:00
sayan-oai
060a320e7d
fix: show user warning when using default fallback metadata (#11690)
### What
It's currently unclear when the harness falls back to the default,
generic `ModelInfo`. This happens when the `remote_models` feature is
disabled or the model is truly unknown, and can lead to bad performance
and issues in the harness.

Add a user-facing warning when this happens so they are aware when their
setup is broken.

### Tests
Added tests, tested locally.
2026-02-15 18:46:05 -08:00
Charley Cunningham
85034b189e
core: snapshot tests for compaction requests, post-compaction layout, some additional compaction tests (#11487)
This PR keeps compaction context-layout test coverage separate from
runtime compaction behavior changes, so runtime logic review can stay
focused.

## Included
- Adds reusable context snapshot helpers in
`core/tests/common/context_snapshot.rs` for rendering model-visible
request/history shapes.
- Standardizes helper naming for readability:
  - `format_request_input_snapshot`
  - `format_response_items_snapshot`
  - `format_labeled_requests_snapshot`
  - `format_labeled_items_snapshot`
- Expands snapshot coverage for both local and remote compaction flows:
  - pre-turn auto-compaction
  - pre-turn failure/context-window-exceeded paths
  - mid-turn continuation compaction
  - manual `/compact` with and without prior user turns
- Captures both sides where relevant:
  - compaction request shape
  - post-compaction history layout shape
- Adds/uses shared request-inspection helpers so assertions target
structured request content instead of ad-hoc JSON string parsing.
- Aligns snapshots/assertions to current behavior and leaves explicit
`TODO(ccunningham)` notes where behavior is known and intentionally
deferred.

## Not Included
- No runtime compaction logic changes.
- No model-visible context/state behavior changes.
2026-02-14 19:57:10 -08:00
Charley Cunningham
fce4ad9cf4
Add process_uuid to sqlite logs (#11534)
## Summary
This PR is the first slice of the per-session `/feedback` logging work:
it adds a process-unique identifier to SQLite log rows.

It does **not** change `/feedback` sourcing behavior yet.

## Changes
- Add migration `0009_logs_process_id.sql` to extend `logs` with:
  - `process_uuid TEXT`
  - `idx_logs_process_uuid` index
- Extend state log models:
  - `LogEntry.process_uuid: Option<String>`
  - `LogRow.process_uuid: Option<String>`
- Stamp each log row with a stable per-process UUID in the sqlite log
layer:
  - generated once per process as `pid:<pid>:<uuid>`
- Update sqlite log insert/query paths to persist and read
`process_uuid`:
  - `INSERT INTO logs (..., process_uuid, ...)`
  - `SELECT ..., process_uuid, ... FROM logs`
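
A sketch of a once-per-process identifier in the `pid:<pid>:<uuid>` shape. Assumption: a nanosecond timestamp stands in for a real UUID so the example needs only the standard library; the function name is hypothetical.

```rust
use std::sync::OnceLock;
use std::time::{SystemTime, UNIX_EPOCH};

/// Hypothetical sketch: build the identifier on first use and reuse it for
/// every later log row emitted by this process.
fn process_uuid() -> &'static str {
    static ID: OnceLock<String> = OnceLock::new();
    ID.get_or_init(|| {
        // Stand-in for a UUID (assumption): nanoseconds since the Unix epoch.
        let nanos = SystemTime::now()
            .duration_since(UNIX_EPOCH)
            .map(|d| d.as_nanos())
            .unwrap_or(0);
        format!("pid:{}:{:032x}", std::process::id(), nanos)
    })
}
```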

## Why
App-server runs many sessions in one process. This change provides a
process-scoping primitive we need for follow-up `/feedback` work, so
threadless/process-level logs can be associated with the emitting
process without mixing across processes.

## Non-goals in this PR
- No `/feedback` transport/source changes
- No attachment size changes
- No sqlite retention/trim policy changes

## Testing
- `just fmt`
- CI will run the full checks
2026-02-14 17:27:22 -08:00
viyatb-oai
db6aa80195
fix(core): add linux bubblewrap sandbox tag (#11767)
## Summary
- add a distinct `linux_bubblewrap` sandbox tag when the Linux
bubblewrap pipeline feature is enabled
- thread the bubblewrap feature flag into sandbox tag generation for:
  - turn metadata header emission
  - tool telemetry metric tags and after-tool-use hooks
- add focused unit tests for `sandbox_tag` precedence and Linux
bubblewrap behavior

## Validation
- `just fmt`
- `cargo clippy -p codex-core --all-targets`
- `cargo test -p codex-core sandbox_tags::tests`
- started `cargo test -p codex-core` and stopped it per request

Co-authored-by: Codex <199175422+chatgpt-codex-connector[bot]@users.noreply.github.com>
2026-02-14 19:00:01 +00:00
Dylan Hurd
ebceb71db6
feat(tui) Permissions update history item (#11550)
## Summary
We should document in the tui when you switch permissions!

## Testing
- [x] Added unit tests
- [x] Tested locally
2026-02-13 23:44:27 -08:00
viyatb-oai
3164670101
feat(tui): render structured network approval prompts in approval overlay (#11674)
### Description
#### Summary
Adds the TUI UX layer for structured network approvals

#### What changed
- Updated approval overlay to display network-specific approval context
(host/protocol).
- Added/updated TUI wiring so approval prompts show correct network
messaging.
- Added tests covering the new approval overlay behavior.

#### Why
Core orchestration can now request structured network approvals; this
ensures users see clear, contextual prompts in the TUI.

#### Notes
- UX behavior activates only when network approval context is present.

---------

Co-authored-by: Codex <199175422+chatgpt-codex-connector[bot]@users.noreply.github.com>
2026-02-13 22:38:36 -08:00
viyatb-oai
b527ee2890
feat(core): add structured network approval plumbing and policy decision model (#11672)
### Description
#### Summary
Introduces the core plumbing required for structured network approvals

#### What changed
- Added structured network policy decision modeling in core.
- Added approval payload/context types needed for network approval
semantics.
- Wired shell/unified-exec runtime plumbing to consume structured
decisions.
- Updated related core error/event surfaces for structured handling.
- Updated protocol plumbing used by core approval flow.
- Included small CLI debug sandbox compatibility updates needed by this
layer.

#### Why
Establishes the minimal backend foundation for network approvals without
yet changing high-level orchestration or TUI behavior.

#### Notes
- Behavior remains constrained by existing requirements/config gating.
- Follow-up PRs in the stack handle orchestration, UX, and app-server
integration.

---------

Co-authored-by: Codex <199175422+chatgpt-codex-connector[bot]@users.noreply.github.com>
2026-02-14 04:18:12 +00:00
Eric Traut
854e91e422
Fixed help text for mcp and mcp-server CLI commands (#11813)
Also removed the "[experimental]" tag since these have been stable for
many months

This addresses #11812
2026-02-13 20:16:22 -08:00
Charley Cunningham
67e577da53
Handle model-switch base instructions after compaction (#11659)
Strip trailing <model_switch> during model-switch compaction request,
and append <model_switch> after model switch compaction
2026-02-13 19:02:53 -08:00
alexsong-oai
8156c57234
add perf metrics for connectors load (#11803) 2026-02-13 18:15:07 -08:00
Josh McKinney
de93cef5b7
bazel: enforce MODULE.bazel.lock sync with Cargo.lock (#11790)
## Why this change

When Cargo dependencies change, it is easy to end up with an unexpected
local diff in
`MODULE.bazel.lock` after running Bazel. That creates noisy working
copies and pushes lockfile fixes
later in the cycle. This change addresses that pain point directly.

## What this change enforces

The expected invariant is: after dependency updates, `MODULE.bazel.lock`
is already in sync with
Cargo resolution. In practice, running `bazel mod deps` should not
mutate the lockfile in a clean
state. If it does, the dependency update is incomplete.

## How this is enforced

This change adds a single lockfile check script that snapshots
`MODULE.bazel.lock`, runs
`bazel mod deps`, and fails if the file changes. The same check is wired
into local workflow
commands (`just bazel-lock-update` and `just bazel-lock-check`) and into
Bazel CI (Linux x86_64 job)
so drift is caught early and consistently. The developer documentation
is updated in
`codex-rs/docs/bazel.md` and `AGENTS.md` to make the expected flow
explicit.
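
The snapshot-run-compare idea behind the check can be sketched as below. The real check is a script that runs `bazel mod deps`; this Rust function is hypothetical, and its `regenerate` closure merely stands in for that command.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Hypothetical sketch: snapshot the lockfile, run a regeneration step, and
/// report whether the file content changed.
fn lockfile_drifted<F>(lockfile: &Path, regenerate: F) -> io::Result<bool>
where
    F: FnOnce() -> io::Result<()>,
{
    let before = fs::read(lockfile)?;
    regenerate()?; // stand-in for running `bazel mod deps`
    let after = fs::read(lockfile)?;
    Ok(before != after)
}
```

A caller would treat `Ok(true)` as a failed check, since a clean tree should leave the lockfile untouched.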

`MODULE.bazel.lock` is also refreshed in this PR to match the current
Cargo dependency resolution.

## Expected developer workflow

After changing `Cargo.toml` or `Cargo.lock`, run `just
bazel-lock-update`, then run
`just bazel-lock-check`, and include any resulting `MODULE.bazel.lock`
update in the same change.

## Testing

Ran `just bazel-lock-check` locally.
2026-02-14 02:11:19 +00:00
Celia Chen
5b6911cb1b
feat(skills): add permission profiles from openai.yaml metadata (#11658)
## Summary

This PR adds support for skill-level permissions in .codex/openai.yaml
and wires that through the skill loading pipeline.

## What’s included

1. Added a new permissions section for skills (network, filesystem, and
macOS-related access).
2. Implemented permission parsing/normalization and translation into
runtime permission profiles.
3. Threaded the new permission profile through SkillMetadata and loader
flow.

## Follow-up

A follow-up PR will connect these permission profiles to actual sandbox
enforcement and add user approval prompts for executing binaries/scripts
from skill directories.


## Example
`openai.yaml` snippet:
```
  permissions:
    network: true
    fs_read:
      - "./data"
      - "./data"
    fs_write:
      - "./output"
    macos_preferences: "readwrite"
    macos_automation:
      - "com.apple.Notes"
    macos_accessibility: true
    macos_calendar: true
```

compiled skill permission profile metadata (macOS): 
```
SkillPermissionProfile {
      sandbox_policy: SandboxPolicy::WorkspaceWrite {
          writable_roots: vec![
              AbsolutePathBuf::try_from("/ABS/PATH/TO/SKILL/output").unwrap(),
          ],
          read_only_access: ReadOnlyAccess::Restricted {
              include_platform_defaults: true,
              readable_roots: vec![
                  AbsolutePathBuf::try_from("/ABS/PATH/TO/SKILL/data").unwrap(),
              ],
          },
          network_access: true,
          exclude_tmpdir_env_var: false,
          exclude_slash_tmp: false,
      },
      // Truncated for readability; actual generated profile is longer.
      macos_seatbelt_permission_file: r#"
  (allow user-preference-write)
  (allow appleevent-send
      (appleevent-destination "com.apple.Notes"))
  (allow mach-lookup (global-name "com.apple.axserver"))
  (allow mach-lookup (global-name "com.apple.CalendarAgent"))
  ...
  "#.to_string(),
```
2026-02-14 01:43:44 +00:00
Curtis 'Fjord' Hawthorne
0d76d029b7
Fix js_repl in-flight tool-call waiter race (#11800)
## Summary

This PR fixes a race in `js_repl` tool-call draining that could leave an
exec waiting indefinitely for in-flight tool calls to finish.

The fix is in:

- `/Users/fjord/code/codex-jsrepl-seq/codex-rs/core/src/tools/js_repl/mod.rs`

## Problem

`js_repl` tracks in-flight tool calls per exec and waits for them to
drain on completion/timeout/cancel paths.
The previous wait logic used a check-then-wait pattern with `Notify`
that could miss a wakeup:

1. Observe `in_flight > 0`
2. Drop lock
3. Register wait (`notified().await`)

If `notify_waiters()` happened between (2) and (3), the waiter could
sleep until another notification that never comes.

## What changed

- Updated all exec-tool-call wait loops to create an owned notification
future while holding the lock:
  - use `Arc<Notify>::notified_owned()` instead of cloning notify and
    awaiting later.
- Applied this consistently to:
  - `wait_for_exec_tool_calls`
  - `wait_for_all_exec_tool_calls`
  - `wait_for_exec_tool_calls_map`

This preserves existing behavior while eliminating the lost-wakeup
window.
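
The same lost-wakeup hazard and its cure can be shown with std primitives. Note the swap: this sketch uses `std::sync::Condvar`, not tokio's `Notify`, and every name in it is hypothetical. The point it shares with the fix is that the check and the wait registration happen under one lock.

```rust
use std::sync::{Arc, Condvar, Mutex};
use std::thread;

/// Hypothetical in-flight counter, analogous to per-exec tool-call tracking.
struct InFlight {
    count: Mutex<usize>,
    drained: Condvar,
}

impl InFlight {
    /// Called by a finishing tool call; wakes waiters when the count hits zero.
    fn finish_one(&self) {
        let mut count = self.count.lock().unwrap();
        *count -= 1;
        if *count == 0 {
            self.drained.notify_all();
        }
    }

    /// `Condvar::wait` releases the mutex and blocks atomically, so no
    /// notification can slip in between the check and the wait.
    fn wait_until_drained(&self) {
        let mut count = self.count.lock().unwrap();
        while *count > 0 {
            count = self.drained.wait(count).unwrap();
        }
    }
}

/// Races `calls` finisher threads against one waiter; returns the final count.
fn run_demo(calls: usize) -> usize {
    let state = Arc::new(InFlight { count: Mutex::new(calls), drained: Condvar::new() });
    let workers: Vec<_> = (0..calls)
        .map(|_| {
            let state = Arc::clone(&state);
            thread::spawn(move || state.finish_one())
        })
        .collect();
    state.wait_until_drained();
    for worker in workers {
        worker.join().unwrap();
    }
    let remaining = *state.count.lock().unwrap();
    remaining
}
```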

## Test coverage

Added a regression test:

- `wait_for_exec_tool_calls_map_drains_inflight_calls_without_hanging`

The test repeatedly races waiter/finisher tasks and asserts bounded
completion to catch hangs.

## Impact

- No API changes.
- No user-facing behavior changes intended.
- Improves reliability of exec lifecycle boundaries when tool calls are
still in flight.


#### [git stack](https://github.com/magus/git-stack-cli)
-  `1` https://github.com/openai/codex/pull/11796
- 👉 `2` https://github.com/openai/codex/pull/11800
-  `3` https://github.com/openai/codex/pull/10673
-  `4` https://github.com/openai/codex/pull/10670
2026-02-14 01:24:52 +00:00
Curtis 'Fjord' Hawthorne
6cbb489e6e
Fix js_repl view_image test runtime panic (#11796)
## Summary
Fixes a flaky/panicking `js_repl` image-path test by running it on a
multi-thread Tokio runtime and tightening assertions to focus on real
behavior.

## Problem
`js_repl_can_attach_image_via_view_image_tool` in
`/Users/fjord/code/codex-jsrepl-seq/codex-rs/core/src/tools/js_repl/mod.rs`
can panic under the single-thread test runtime with:

`can call blocking only when running on the multi-threaded runtime`

It also asserted a brittle user-facing text string.

## Changes
1. Updated the test runtime to:
   `#[tokio::test(flavor = "multi_thread", worker_threads = 2)]`
2. Removed the brittle `"attached local image path"` string assertion.
3. Kept the concrete side-effect assertions:
   - tool call succeeds
   - image is actually injected into pending input (`InputImage` with
     `data:image/png;base64,...`)

## Why this is safe
This is test-only behavior. No production runtime code paths are
changed.

## Validation
- Ran:
`cargo test -p codex-core
tools::js_repl::tests::js_repl_can_attach_image_via_view_image_tool --
--nocapture`
- Result: pass


#### [git stack](https://github.com/magus/git-stack-cli)
- 👉 `1` https://github.com/openai/codex/pull/11796
-  `2` https://github.com/openai/codex/pull/11800
-  `3` https://github.com/openai/codex/pull/10673
-  `4` https://github.com/openai/codex/pull/10670
2026-02-14 01:11:13 +00:00
Josh McKinney
067f8b1be0
fix(protocol): make local image test Bazel-friendly (#11799)
Fixes Bazel build failure in //codex-rs/protocol:protocol-unit-tests.

The test used include_bytes! to read a PNG from codex-core assets; Cargo
can read it,
but Bazel sandboxing can't, so the crate fails to compile.

This change inlines a tiny valid PNG in the test to keep it hermetic.

Related regression: #10590 (cc: @charley-oai)
2026-02-14 00:53:15 +00:00
sayan-oai
6b466df146
fix: send unfiltered models over model/list (#11793)
### What
To unblock model filtering in VSCE, change the `model/list` app-server
endpoint to send all models plus a `showInPicker` visibility field, so
filtering can be done in VSCE if desired.

### Tests
Updated tests.
2026-02-13 16:26:32 -08:00
Max Johnson
fb0aaf94de
codex-rs: fix thread resume rejoin semantics (#11756)
## Summary
- always rejoin an in-memory running thread on `thread/resume`, even
when overrides are present
- reject `thread/resume` when `history` is provided for a running thread
- reject `thread/resume` when `path` mismatches the running thread
rollout path
- warn (but do not fail) on override mismatches for running threads
- add more `thread_resume` integration tests and fixes; including
restart-based resume-with-overrides coverage

## Validation
- `just fmt`
- `cargo test -p codex-app-server --test all thread_resume`
- manual test with app-server-test-client
https://github.com/openai/codex/pull/11755
- manual test both stdio and websocket in app
2026-02-13 23:09:58 +00:00
Jeremy Rose
e4f8263798
[app-server] add fuzzyFileSearch/sessionCompleted (#11773)
This lets the client know when to stop showing a spinner.
2026-02-13 15:08:14 -08:00
pash-openai
a5e8e69d18
turn metadata followups (#11782)
Some trivial simplifications from #11677.
2026-02-13 14:59:16 -08:00
Charley Cunningham
26a7cd21e2
tui: preserve remote image attachments across resume/backtrack (#10590)
## Summary
This PR makes app-server-provided image URLs first-class attachments in
TUI, so they survive resume/backtrack/history recall and are resubmitted
correctly.

<img width="715" height="491" alt="Screenshot 2026-02-12 at 8 27 08 PM"
src="https://github.com/user-attachments/assets/226cbd35-8f0c-4e51-a13e-459ef5dd1927"
/>

Can delete the attached image upon backtracking:
<img width="716" height="301" alt="Screenshot 2026-02-12 at 8 27 31 PM"
src="https://github.com/user-attachments/assets/4558d230-f1bd-4eed-a093-8e1ab9c6db27"
/>

In both history and composer, remote images are rendered as normal
`[Image #N]` placeholders, with numbering unified with local images.

## What changed
- Plumb remote image URLs through TUI message state:
  - `UserHistoryCell`
  - `BacktrackSelection`
  - `ChatComposerHistory::HistoryEntry`
  - `ChatWidget::UserMessage`
- Show remote images as placeholder rows inside the composer box (above
textarea), and in history cells.
- Support keyboard selection/deletion for remote image rows in composer
(`Up`/`Down`, `Delete`/`Backspace`).
- Preserve remote-image-only turns in local composer history (Up/Down
recall), including restore after backtrack.
- Ensure submit/queue/backtrack resubmit include remote images in model
input (`UserInput::Image`), and keep request shape stable for
remote-image-only turns.
- Keep image numbering contiguous across remote + local images:
  - remote images occupy `[Image #1]..[Image #M]`
  - local images start at `[Image #M+1]`
  - deletion renumbers consistently.
- In protocol conversion, increment shared image index for remote images
too, so mixed remote/local image tags stay in a single sequence.
- Simplify restore logic to trust in-memory attachment order (no
placeholder-number parsing path).
- Backtrack/replay rollback handling now queues trims through
`AppEvent::ApplyThreadRollback` and syncs transcript overlay/deferred
lines after trims, so overlay/transcript state stays consistent.
- Trim trailing blank rendered lines from user history rendering to
avoid oversized blank padding.

## Docs + tests
- Updated: `docs/tui-chat-composer.md` (remote image flow,
selection/deletion, numbering offsets)
- Added/updated tests across `tui/src/chatwidget/tests.rs`,
`tui/src/app.rs`, `tui/src/app_backtrack.rs`, `tui/src/history_cell.rs`,
and `tui/src/bottom_pane/chat_composer.rs`
- Added snapshot coverage for remote image composer states, including
deleting the first of two remote images.

## Validation
- `just fmt`
- `cargo test -p codex-tui`

## Codex author
`codex fork 019c2636-1571-74a1-8471-15a3b1c3f49d`
2026-02-13 14:54:06 -08:00
Max Johnson
395729910c
rmcp-client: fix auth crash (#11692)
Don't load auth tokens if bearer token is present. This fixes a crash I
was getting on Linux:

```
2026-02-12T23:26:24.999408Z DEBUG session_init: codex_core::codex: Configuring session: model=gpt-5.3-codex-spark; provider=ModelProviderInfo { name: "OpenAI", base_url: None, env_key: None, env_key_instructions: None, experimental_bearer_token: None, wire_api: Responses, query_params: None, http_headers: Some({"version": "0.0.0"}), env_http_headers: Some({"OpenAI-Project": "OPENAI_PROJECT", "OpenAI-Organization": "OPENAI_ORGANIZATION"}), request_max_retries: None, stream_max_retries: None, stream_idle_timeout_ms: None, requires_openai_auth: true, supports_websockets: true }
2026-02-12T23:26:24.999799Z TRACE session_init: codex_keyring_store: keyring.load start, service=Codex MCP Credentials, account=codex_apps|20398391ad12d90b

thread 'tokio-runtime-worker' (96190) has overflowed its stack
fatal runtime error: stack overflow, aborting
    Finished `dev` profile [unoptimized + debuginfo] target(s) in 1.35s
```
2026-02-13 14:32:01 -08:00
pash-openai
6c0a924203
turn metadata: per-turn non-blocking (#11677) 2026-02-13 12:48:29 -08:00
Alex Kwiatkowski
a4bb59884b
fix(nix): use correct version from Cargo.toml in flake build (#11770)
## Summary

- When building via `nix build`, the binary reports `codex-cli 0.0.0`
because the workspace `Cargo.toml` uses `0.0.0` as a placeholder on
`main`. This causes the update checker to always prompt users to upgrade
even when running the latest code.
- Reads the version from `codex-rs/Cargo.toml` at flake evaluation time
using `builtins.fromTOML` and patches it into the workspace `Cargo.toml`
before cargo builds via `postPatch`.
- On release commits (e.g. tag `rust-v0.101.0`), the real version is
used as-is. On `main` branch builds, falls back to
`0.0.0-dev+<shortRev>` (or `0.0.0-dev+dirty`), which the update
checker's `parse_version` ignores — suppressing the spurious upgrade
prompt.

| Scenario | Cargo.toml version | Nix `version` | Binary reports | Upgrade nag? |
|---|---|---|---|---|
| Release commit (e.g. `rust-v0.101.0`) | `0.101.0` | `0.101.0` | `codex-cli 0.101.0` | Only if newer exists |
| Main branch (committed) | `0.0.0` | `0.0.0-dev+b934ffc` | `codex-cli 0.0.0-dev+b934ffc` | No |
| Main branch (uncommitted) | `0.0.0` | `0.0.0-dev+dirty` | `codex-cli 0.0.0-dev+dirty` | No |
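
The version selection summarized in the table can be restated as a short function. The real logic lives in the Nix flake; this Rust version is a hypothetical restatement whose name and parameters are assumptions.

```rust
/// Hypothetical restatement of the flake's version choice.
/// `cargo_version` is the workspace version read from `codex-rs/Cargo.toml`;
/// `short_rev` is the short commit hash, or `None` for an uncommitted tree.
fn nix_version(cargo_version: &str, short_rev: Option<&str>) -> String {
    if cargo_version != "0.0.0" {
        // Release commits carry a real version, which is used as-is.
        return cargo_version.to_string();
    }
    // Placeholder version on `main`: fall back to a dev build identifier.
    format!("0.0.0-dev+{}", short_rev.unwrap_or("dirty"))
}
```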

## Test plan

- [ ] `nix build` from `main` branch and verify `codex --version`
reports `0.0.0-dev+<shortRev>` instead of `0.0.0`
- [ ] Verify the update checker does not show a spurious upgrade prompt
for dev builds
- [ ] Confirm that on a release commit where `Cargo.toml` has a real
version, the binary reports that version correctly
2026-02-13 12:19:25 -08:00
Eric Traut
ffef5ce5de
Improve GitHub issue deduplication reliability by introducing a stage… (#11769)
…d two-pass Codex search strategy with deterministic fallback behavior,
and remove an obsolete prompt file that was no longer used.

### Changes
- Updated `workflows/issue-deduplicator.yml`:
  - Added richer issue input fields (`state`, `updatedAt`, `labels`) for
    model context.
  - Added two candidate pools:
    - `codex-existing-issues-all.json` (`--state all`)
    - `codex-existing-issues-open.json` (`--state open`)
  - Added body truncation during JSON preparation to reduce prompt noise.
  - Added **Pass 1** Codex run over all issues.
  - Added normalization/validation step for Pass 1 output:
    - tolerant JSON parsing
    - self-issue filtering
    - deduplication
    - cap to 5 results
  - Added **Pass 2 fallback** Codex run over open issues only, triggered
    only when Pass 1 has no usable matches.
  - Added normalization/validation step for Pass 2 output (same
    filtering/dedup/cap behavior).
  - Added final deterministic selector:
    - prefer pass 2 if it finds matches
    - otherwise use pass 1
    - otherwise return no matches
  - Added observability logs:
    - pool sizes
    - per-pass parse/match status
    - final pass selected and final duplicate count
  - Kept public issue-comment format unchanged.
  - Added comment documenting that prompt text now lives inline in
    workflow.

- Deleted obsolete file:
  - `/prompts/issue-deduplicator.txt`
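
The normalization and final selection steps can be sketched as two small functions. The real logic lives in workflow scripts; the names, the use of plain issue numbers, and the string labels below are hypothetical.

```rust
/// Hypothetical normalization: drop the issue itself, remove duplicates
/// while keeping order, and cap the result at 5 entries.
fn normalize(candidates: &[u64], self_issue: u64) -> Vec<u64> {
    let mut out: Vec<u64> = Vec::new();
    for &number in candidates {
        if number != self_issue && !out.contains(&number) {
            out.push(number);
        }
    }
    out.truncate(5);
    out
}

/// Hypothetical final selector: prefer pass 2 if it found matches,
/// otherwise use pass 1, otherwise report no matches.
fn select_final(pass1: Vec<u64>, pass2: Vec<u64>) -> (&'static str, Vec<u64>) {
    if !pass2.is_empty() {
        ("pass2", pass2)
    } else if !pass1.is_empty() {
        ("pass1", pass1)
    } else {
        ("none", Vec::new())
    }
}
```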

### Behavior Impact
- Better duplicate recall when broad search fails by retrying against
active issues only.
- More deterministic/noise-resistant output handling.
- No change to workflow trigger conditions, permissions, or issue
comment structure.
2026-02-13 12:01:07 -08:00