Commit graph

3066 commits

Author SHA1 Message Date
Ahmed Ibrahim
1478a88eb0
Add collaboration developer instructions (#9424)
- Add additional instructions when they are available
- Make sure to update them on change either UserInput or UserTurn
2026-01-18 01:31:14 +00:00
Dylan Hurd
80d7a5d7fe
chore(instructions) Remove unread SessionMeta.instructions field (#9423)
### Description
- Remove the now-unused `instructions` field from the session metadata
to simplify SessionMeta and stop propagating transient instruction text
through the rollout recorder API. This was only saving
user_instructions, and was never being read.
- Stop passing user instructions into the rollout writer at session
creation so the rollout header only contains canonical session metadata.

### Testing

- Ran `just fmt` which completed successfully.
- Ran `just fix -p codex-protocol`, `just fix -p codex-core`, `just fix
-p codex-app-server`, `just fix -p codex-tui`, and `just fix -p
codex-tui2` which completed (Clippy fixes applied) as part of
verification.
- Ran `cargo test -p codex-protocol` which passed (28 tests).
- Ran `cargo test -p codex-core` which showed failures in a small set of
tests (not caused by the protocol type change directly):
`default_client::tests::test_create_client_sets_default_headers`,
several `models_manager::manager::tests::refresh_available_models_*`,
and `shell_snapshot::tests::linux_sh_snapshot_includes_sections` (these
tests failed in this CI run).
- Ran `cargo test -p codex-app-server` which reported several failing
integration tests (including
`suite::codex_message_processor_flow::test_codex_jsonrpc_conversation_flow`,
`suite::output_schema::send_user_turn_*`, and
`suite::user_agent::get_user_agent_returns_current_codex_user_agent`).
- `cargo test -p codex-tui` and `cargo test -p codex-tui2` were
attempted but aborted due to disk space exhaustion (`No space left on
device`).

------
[Codex
Task](https://chatgpt.com/codex/tasks/task_i_696bd8ce632483228d298cf07c7eb41c)
2026-01-17 16:02:28 -08:00
Dylan Hurd
bffe9b33e9
chore(core) Create instructions module (#9422)
## Summary
We have a variety of things we refer to as instructions in the code
base: our current canonical terms are:
- base instructions (raw string)
- developer instructions (has a type in protocol)
- user instructions

We also have `instructions` floating around in various places. We should
standardize on the above, and start using types to prevent them from
ending up in the wrong place. There will be additional PRs, but I'm
going to keep these small so we can easily follow them!

## Testing
- [x] Tests pass, this is purely a file move
2026-01-17 16:01:26 -08:00
Ahmed Ibrahim
8f0e0300d2
Expose collaboration presets (#9421)
Expose collaboration presets for clients

---------

Co-authored-by: Josh McKinney <joshka@openai.com>
2026-01-17 12:32:50 -08:00
Alex Hornby
b877a2041e
fix unified_exec::tests::unified_exec_timeouts to use a more unique match value (#9414)
Fix unified_exec_timeouts to use a unique variable value rather than
"codex" which was causing false positives when running tests locally
(presumably from my bash prompts). Discovered while running tests to
validate another change.

Fixes https://github.com/openai/codex/issues/9413

Test Plan:

Ran test locally on my fedora 43 x86_64 machine with:
```
cd codex/cargo-rs
cargo nextest run --all-features --no-fail-fast unified_exec::tests::unified_exec_timeouts
```

Before, unified_exec_timeouts fails:
```
    Finished `test` profile [unoptimized + debuginfo] target(s) in 0.38s
────────────
 Nextest run ID fa2b4949-a66c-408c-8002-32c52c70ec4f with nextest profile: default
    Starting 1 test across 107 binaries (3211 tests skipped)
        FAIL [   5.667s] codex-core unified_exec::tests::unified_exec_timeouts
  stdout ───

    running 1 test
    test unified_exec::tests::unified_exec_timeouts ... FAILED

    failures:

    failures:
        unified_exec::tests::unified_exec_timeouts

    test result: FAILED. 0 passed; 1 failed; 0 ignored; 0 measured; 774 filtered out; finished in 5.66s

  stderr ───

    thread 'unified_exec::tests::unified_exec_timeouts' (459601) panicked at core/src/unified_exec/mod.rs:381:9:
    timeout too short should yield incomplete output
    note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

────────────
     Summary [   5.677s] 1 test run: 0 passed, 1 failed, 3211 skipped
        FAIL [   5.667s] codex-core unified_exec::tests::unified_exec_timeouts
error: test run failed

```

After, works:
```
    Finished `test` profile [unoptimized + debuginfo] target(s) in 0.34s
────────────
 Nextest run ID f49e9004-e30b-4049-b0ff-283b543a1cd7 with nextest profile: default
    Starting 1 test across 107 binaries (3211 tests skipped)
        SLOW [> 15.000s] codex-core unified_exec::tests::unified_exec_timeouts
        PASS [  17.666s] codex-core unified_exec::tests::unified_exec_timeouts
────────────
     Summary [  17.676s] 1 test run: 1 passed (1 slow), 3211 skipped
```
2026-01-17 09:05:53 -08:00
Ahmed Ibrahim
764f3c7d03
fix(tui) Defer backtrack trim until rollback confirms (#9401)
Document the backtrack/rollback state machine and invariants between the
transcript overlay, in-flight “live tail”, and core thread state (tui + tui2).

Also adjust behavior for correctness:
- Track a single pending rollback and block additional rollbacks until core responds.
- Defer trimming transcript cells until ThreadRolledBack for the active session.
- Clear the guard on ThreadRollbackFailed so the user can retry.
- After a confirmed trim, schedule a one-shot scrollback refresh on the next draw.
- Clear stale pending rollback state when switching sessions.

---------

Co-authored-by: Josh McKinney <joshka@openai.com>
2026-01-17 06:29:41 +00:00
Fouad Matin
93a5e0fe1c
fix(codex-api): treat invalid_prompt as non-retryable (#9400)
**Goal**: Prevent response.failed events with `invalid_prompt` from
being treated as retryable errors so the UI shows the actual error
message instead of continually retrying.

**Before**: Codex would continue to retry despite the prompt being
marked as disallowed
**After**: Codex will stop retrying once prompt is marked disallowed
2026-01-16 22:22:08 -08:00
Ahmed Ibrahim
146d54cede
Add collaboration_mode override to turns (#9408) 2026-01-16 21:51:25 -08:00
xl-openai
ad8bf59cbf
Support enable/disable skill via config/api. (#9328)
In config.toml:
```
[[skills.config]]
path = "/Users/xl/.codex/skills/my_skill/SKILL.md"
enabled = false
```

API:
skills/list, skills/config/write
2026-01-16 20:22:05 -08:00
Ahmed Ibrahim
246f506551
Introduce collaboration modes (#9340)
- Merge `model` and `reasoning_effort` under collaboration modes.
- Add additional instructions for custom collaboration mode
- Default to Custom to not change behavior
2026-01-17 00:28:22 +00:00
Anton Panasenko
c26fe64539
feat: show forked from session id in /status (#9330)
Summary:
- Add forked_from to SessionMeta/SessionConfiguredEvent and persist it
for forked sessions.
- Surface forked_from in /status for tui + tui2 and add snapshots.
2026-01-16 13:41:46 -08:00
Owen Lin
f1653dd4d3
feat(app-server, core): return threads by created_at or updated_at (#9247)
Add support for returning threads by either `created_at` OR `updated_at`
descending. Previously core always returned threads ordered by
`created_at`.

This PR:
- updates core to be able to list threads by `updated_at` OR
`created_at` descending based on what the caller wants
- also update `thread/list` in app-server to expose this (default to
`created_at` if not specified)

All existing codepaths (app-server, TUI) still default to `created_at`,
so no behavior change is expected with this PR.

**Implementation**
To sort by `updated_at` is a bit nontrivial (whereas `created_at` is
easy due to the way we structure the folders and filenames on disk,
which are all based on `created_at`).

The most naive way to do this without introducing a cache file or sqlite
DB (which we have to implement/maintain) is to scan files in reverse
`created_at` order on disk, and look at the file's mtime (last modified
timestamp according to the filesystem) until we reach `MAX_SCAN_FILES`
(currently set to 10,000). Then, we can return the most recent N
threads.

Based on some quick and dirty benchmarking on my machine with ~1000
rollout files, calling `thread/list` with limit 50, the `updated_at`
path is slower as expected due to all the I/O:
- updated-at: average 103.10 ms
- created-at: average 41.10 ms

Those absolute numbers aren't a big deal IMO, but we can certainly
optimize this in a followup if needed by introducing more state stored
on disk.

**Caveat**
There's also a limitation in that any files older than `MAX_SCAN_FILES`
will be excluded, which means if a user continues a REALLY old thread,
it's possible to not be included. In practice that should not be too big
of an issue.

If a user makes...
- 1000 rollouts/day → threads older than 10 days won't show up
- 100 rollouts/day → ~100 days

If this becomes a problem for some reason, even more motivation to
implement an updated_at cache.
2026-01-16 20:58:55 +00:00
Anton Panasenko
e893e83eb9
feat: /fork the current session instead of opening session picker (#9385)
Implemented /fork to fork the current session directly (no picker),
handling it via a new ForkCurrentSession app event in both tui and tui2.
Updated slash command descriptions/tooltips and adjusted the fork tests
accordingly. Removed the unused in-session fork picker event.
2026-01-16 11:28:52 -08:00
viyatb-oai
f89a40a849
chore: upgrade to Rust 1.92.0 (#8860)
**Summary**
- Upgrade Rust toolchain used by CI to 1.92.0.
- Address new clippy `derivable_impls` warnings by deriving `Default`
for enums across protocol, core, backend openapi models, and
windows-sandbox setup.
- Tidy up related test/config behavior (originator header handling, env
override cleanup) and remove a now-unused assignment in TUI/TUI2 render
layout.

**Testing**
- `just fmt`
- `just fix -p codex-tui`
- `just fix -p codex-tui2`
- `just fix -p codex-windows-sandbox`
- `cargo test -p codex-tui`
- `cargo test -p codex-tui2`
- `cargo test -p codex-windows-sandbox`
- `cargo test -p codex-core --test all`
- `cargo test -p codex-app-server --test all`
- `cargo test -p codex-mcp-server --test all`
- `cargo test --all-features`
2026-01-16 11:12:52 -08:00
jif-oai
e650d4b02c
feat: tool call duration metric (#9364) 2026-01-16 18:33:14 +01:00
Ahmed Ibrahim
ebdd8795e9
Turn-state sticky routing per turn (#9332)
- capture the header from SSE/WS handshakes, store it per
ModelClientSession using `Oncelock`, echo it on turn-scoped requests,
and add SSE+WS integration tests for within-turn persistence +
cross-turn reset.

- keep `x-codex-turn-state` sticky within a user turn to maintain
routing continuity for retries/tool follow-ups.
2026-01-16 09:30:11 -08:00
Jeremy Rose
4125c825f9
add codex cloud list (#9324)
for listing cloud tasks.
2026-01-16 08:56:38 -08:00
Eric Traut
9147df0e60
Made codex exec resume --last consistent with codex resume --last (#9352)
PR #9245 made `codex resume --last` honor cwd, but I forgot to make the
same change for `codex exec resume --last`. This PR fixes the
inconsistency.

This addresses #8700
2026-01-16 08:53:47 -08:00
Matthew Zeng
131590066e
[device-auth] Add device code auth as a standalone option when headless environment is detected. (#9333) 2026-01-16 08:35:03 -08:00
jif-oai
2691e1ce21
fix: flaky tests (#9373) 2026-01-16 17:24:41 +01:00
jif-oai
1668ca726f
chore: close pipe on non-pty processes (#9369)
Closing the STDIN of piped process when starting them to avoid commands
like `rg` to wait for content on STDIN and hangs for ever
2026-01-16 15:54:32 +01:00
jif-oai
7905e99d03
prompt collab (#9367) 2026-01-16 15:12:41 +01:00
jif-oai
7fc49697dd
feat: CODEX_CI (#9366) 2026-01-16 13:52:16 +00:00
jif-oai
c576756c81
feat: collab wait multiple IDs (#9294) 2026-01-16 12:05:04 +01:00
jif-oai
c1ac5223e1
feat: run user commands under user snapshot (#9357)
The initial goal is for user snapshots to have access to aliases etc
2026-01-16 11:49:28 +01:00
jif-oai
f5b3e738fb
feat: propagate approval request of unsubscribed threads (#9232)
A thread can now be spawned by another thread. In order to process the
approval requests of such sub-threads, we need to detect those event and
show them in the TUI.

This is a temporary solution while the UX is being figured out. This PR
should be reverted once done
2026-01-16 11:23:01 +01:00
Ahmed Ibrahim
0cce6ebd83
rename model turn to sampling request (#9336)
We have two type of turns now: model and user turns. It's always
confusing to refer to either. Model turn is basically a sampling
request.
2026-01-16 10:06:24 +01:00
Eric Traut
1fc72c647f
Fix token estimate during compaction (#9337)
This addresses #9287
2026-01-15 19:48:11 -08:00
Michael Bolin
99f47d6e9a
fix(mcp): include threadId in both content and structuredContent in CallToolResult (#9338) 2026-01-15 18:33:11 -08:00
Thanh Nguyen
a6324ab34b
fix(tui): only show 'Worked for' separator when actual work was performed (#8958)
Fixes #7919.

This PR addresses a TUI display bug where the "Worked for" separator
would appear prematurely during the planning stage.

**Changes:**
- Added `had_work_activity` flag to `ChatWidget` to track if actual work
(exec commands, MCP tool calls, patches) was performed in the current
turn.
- Updated `handle_streaming_delta` to only display the
`FinalMessageSeparator` if both `needs_final_message_separator` AND
`had_work_activity` are true.
- Updated `handle_exec_end_now`, `handle_patch_apply_end_now`, and
`handle_mcp_end_now` to set `had_work_activity = true`.

**Verification:**
- Ran `cargo test -p codex-tui` to ensure no regressions.
- Manual verification confirms the separator now only appears after
actual work is completed.

---------

Co-authored-by: Josh McKinney <joshka@openai.com>
2026-01-16 01:41:43 +00:00
Dylan Hurd
3cabb24210
chore(windows) Enable Powershell UTF8 feature (#9195)
## Summary
We've received a lot of positive feedback about this feature, so we're
going to enable it by default.
2026-01-16 01:29:12 +00:00
charley-oai
1fa8350ae7
Add text element metadata to protocol, app server, and core (#9331)
The second part of breaking up PR
https://github.com/openai/codex/pull/9116

Summary:

- Add `TextElement` / `ByteRange` to protocol user inputs and user
message events with defaults.
- Thread `text_elements` through app-server v1/v2 request handling and
history rebuild.
- Preserve UI metadata only in user input/events (not `ContentItem`)
while keeping local image attachments in user events for rehydration.

Details:

- Protocol: `UserInput::Text` carries `text_elements`;
`UserMessageEvent` carries `text_elements` + `local_images`.
Serialization includes empty vectors for backward compatibility.
- app-server-protocol: v1 defines `V1TextElement` / `V1ByteRange` in
camelCase with conversions; v2 uses its own camelCase wrapper.
- app-server: v1/v2 input mapping includes `text_elements`; thread
history rebuilds include them.
- Core: user event emission preserves UI metadata while model history
stays clean; history replay round-trips the metadata.
2026-01-15 17:26:41 -08:00
Yuvraj Angad Singh
004a74940a
fix: send non-null content on elicitation Accept (#9196)
## Summary

- When a user accepts an MCP elicitation request, send `content:
Some(json!({}))` instead of `None`
- MCP servers that use elicitation expect content to be present when
action is Accept
- This matches the expected behavior shown in tests at
`exec-server/tests/common/lib.rs:171`

## Root Cause

In `codex-rs/core/src/codex.rs`, the `resolve_elicitation` function
always sent `content: None`:

```rust
let response = ElicitationResponse {
    action,
    content: None,  // Always None, even for Accept
};
```

## Fix

Send an empty object when accepting:

```rust
let content = match action {
    ElicitationAction::Accept => Some(serde_json::json!({})),
    ElicitationAction::Decline | ElicitationAction::Cancel => None,
};
```

## Test plan

- [x] Code compiles with `cargo check -p codex-core`
- [x] Formatted with `just fmt`
- [ ] Integration test `accept_elicitation_for_prompt_rule` (requires
MCP server binary)

Fixes #9053
2026-01-15 14:20:57 -08:00
Ahmed Ibrahim
749b58366c
Revert empty paste image handling (#9318)
Revert #9049 behavior so empty paste events no longer trigger a
clipboard image read.
2026-01-15 14:16:09 -08:00
pap-openai
d886a8646c
remove needs_follow_up error log (#9272) 2026-01-15 21:20:54 +00:00
sayan-oai
169201b1b5
[search] allow explicitly disabling web search (#9249)
moving `web_search` rollout serverside, so need a way to explicitly
disable search + signal eligibility from the client.

- Add `x‑oai‑web‑search‑eligible` header that signifies whether the
request can have web search.
- Only attach the `web_search` tool when the resolved `WebSearchMode` is
`Live` or `Cached`.
2026-01-15 11:28:57 -08:00
xl-openai
42fa4c237f
Support SKILL.toml file. (#9125)
We’re introducing a new SKILL.toml to hold skill metadata so Codex can
deliver a richer Skills experience.

Initial focus is the interface block:
```
[interface]
display_name = "Optional user-facing name"
short_description = "Optional user-facing description"
icon_small = "./assets/small-400px.png"
icon_large = "./assets/large-logo.svg"
brand_color = "#3B82F6"
default_prompt = "Optional surrounding prompt to use the skill with"
```

All fields are exposed via the app server API.
display_name and short_description are consumed by the TUI.
2026-01-15 11:20:04 -08:00
Eric Traut
5f10548772
Revert recent styling change for input prompt placeholder text (#9307)
A recent change in commit ccba737d26 modified the styling of the
placeholder text (e.g. "Implement {feature}") in the input box of the
CLI, changing it from non-italic to italic. I think this was likely
unintentional. It results in a bad display appearance on some terminal
emulators, and several users have complained about it.

This change switches back to non-italic styling, restoring the older
behavior.

It addresses #9262
2026-01-15 10:58:12 -08:00
jif-oai
da44569fef
nit: clean unified exec background processes (#9304)
To fix the occurences where the End event is received after the listener
stopped listenning
2026-01-15 18:34:33 +00:00
jif-oai
393a5a0311
chore: better orchestrator prompt (#9301) 2026-01-15 18:11:43 +00:00
viyatb-oai
55bda1a0f2
revert: remove pre-Landlock bind mounts apply (#9300)
**Description**

This removes the pre‑Landlock read‑only bind‑mount step from the Linux
sandbox so filesystem restrictions rely solely on Landlock again.
`mounts.rs` is kept in place but left unused. The linux‑sandbox README
is updated to match the new behavior and manual test expectations.
2026-01-15 09:47:57 -08:00
李琼羽
b4d240c3ae
fix(exec): improve stdin prompt decoding (#9151)
Fixes #8733.

- Read prompt from stdin as raw bytes and decode more helpfully.
- Strip UTF-8 BOM; decode UTF-16LE/UTF-16BE when a BOM is present.
- For other non-UTF8 input, fail with an actionable message (offset +
iconv hint).

Tests: `cargo test -p codex-exec`.
2026-01-15 09:29:05 -08:00
gt-oai
f6df1596eb
Propagate MCP disabled reason (#9207)
Indicate why MCP servers are disabled when they are disabled by
requirements:

```
➜  codex git:(main) ✗ just codex mcp list
cargo run --bin codex -- "$@"
    Finished `dev` profile [unoptimized + debuginfo] target(s) in 0.27s
     Running `target/debug/codex mcp list`
Name         Command          Args  Env  Cwd  Status                                                                  Auth
docs         docs-mcp         -     -    -    disabled: requirements (MDM com.openai.codex:requirements_toml_base64)  Unsupported
hello_world  hello-world-mcp  -     -    -    disabled: requirements (MDM com.openai.codex:requirements_toml_base64)  Unsupported

➜  codex git:(main) ✗ just c
cargo run --bin codex -- "$@"
    Finished `dev` profile [unoptimized + debuginfo] target(s) in 0.90s
     Running `target/debug/codex`
╭─────────────────────────────────────────────╮
│ >_ OpenAI Codex (v0.0.0)                    │
│                                             │
│ model:     gpt-5.2 xhigh   /model to change │
│ directory: ~/code/codex/codex-rs            │
╰─────────────────────────────────────────────╯

/mcp

🔌  MCP Tools

  • No MCP tools available.

  • docs (disabled)
    • Reason: requirements (MDM com.openai.codex:requirements_toml_base64)

  • hello_world (disabled)
    • Reason: requirements (MDM com.openai.codex:requirements_toml_base64)
```
2026-01-15 17:24:00 +00:00
Eric Traut
ae96a15312
Changed codex resume --last to honor the current cwd (#9245)
This PR changes `codex resume --last` to work consistently with `codex
resume`. Namely, it filters based on the cwd when selecting the last
session. It also supports the `--all` modifier as an override.

This addresses #8700
2026-01-15 17:05:08 +00:00
jif-oai
3fc487e0e0
feat: basic tui for event emission (#9209) 2026-01-15 15:53:02 +00:00
jif-oai
faeb08c1e1
feat: add interrupt capabilities to send_input (#9276) 2026-01-15 14:59:07 +00:00
jif-oai
05b960671d
feat: add agent roles to collab tools (#9275)
Add `agent_type` parameter to the collab tool `spawn_agent` that
contains a preset to apply on the config when spawning this agent
2026-01-15 13:33:52 +00:00
jif-oai
bad4c12b9d
feat: collab tools app-server event mapping (#9213) 2026-01-15 09:03:26 +00:00
viyatb-oai
2259031d64
fix: fallback to Landlock-only when user namespaces unavailable and set PR_SET_NO_NEW_PRIVS early (#9250)
fixes https://github.com/openai/codex/issues/9236

### Motivation
- Prevent sandbox setup from failing when unprivileged user namespaces
are denied so Landlock-only protections can still be applied.
- Ensure `PR_SET_NO_NEW_PRIVS` is set before installing seccomp and
Landlock restrictions to avoid kernel `EPERM`/`LandlockRestrict`
ordering issues.

### Description
- Add `is_permission_denied` helper that detects `EPERM` /
`PermissionDenied` from `CodexErr` to drive fallback logic.
- In `apply_read_only_mounts` skip read-only bind-mount setup and return
`Ok(())` when `unshare_user_and_mount_namespaces()` fails with
permission-denied so Landlock rules can still be installed.
- Add `set_no_new_privs()` and call it from
`apply_sandbox_policy_to_current_thread` before installing seccomp
filters and Landlock rules when disk or network access is restricted.
2026-01-14 22:24:34 -08:00
Ahmed Ibrahim
a09711332a
Add migration_markdown in model_info (#9219)
Next step would be to clean Model Upgrade in model presets

---------

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: aibrahim-oai <219906144+aibrahim-oai@users.noreply.github.com>
2026-01-15 01:55:22 +00:00