Commit graph

2793 commits

Author SHA1 Message Date
jif-oai
66554abfb9
sec: fix version of time to prevent vulnerability (#10876)
RUSTSEC-2026-0009
2026-02-06 12:10:07 +01:00
Eric Traut
dd80e332c4
Removed the "remote_compaction" feature flag (#10840)
This feature is always on now
2026-02-05 23:54:57 -08:00
Eric Traut
f61226d32a
Personality setting is no longer available in experimental menu (#10852)
This PR removes the inaccurate "Disable in /experimental." statement now
that the "personality" feature flag is no longer experimental.

This addresses #10850
2026-02-05 22:19:09 -08:00
Eric Traut
e5c1a2d6fb
Log an event (info only) when we receive a file watcher event (#10843) 2026-02-05 20:24:16 -08:00
Ahmed Ibrahim
048e0f3888
Gate app tooltips to macOS (#10784)
- Gate app promo tips to macOS and use non-app copy elsewhere.
2026-02-05 19:18:08 -08:00
Anton Panasenko
4ee039744e
feat: expose detailed metrics to runtime metrics (#10699) 2026-02-05 18:22:30 -08:00
gt-oai
d74fa8edd1
Print warning when config does not meet requirements (#10792)

Currently, if a config value is set that fails the requirements, we exit
Codex.

Now, instead of this, we print a warning and default to a
requirements-permitting value.
2026-02-06 01:12:44 +00:00
Owen Lin
0d8b2b74c4
feat(app-server): turn/steer API (#10821)
This PR adds a dedicated `turn/steer` API for appending user input to an
in-flight turn.

## Motivation
Currently, steering in the app is implemented by just calling
`turn/start` while a turn is running. This has some really weird quirks:
- Client gets back a new `turn.id`, even though streamed
events/approvals remained tied to the original active turn ID.
- All the various turn-level override params on `turn/start` do not
apply to the "steer", and would only apply to the next real turn.
- There can also be a race condition where the client thinks the turn is
active but the server has already completed it, so there might be bugs
if the client has baked in some client-specific behavior thinking it's a
steer when in fact the server kicked off a new turn. This is
particularly possible when running a client against a remote app-server.

Having a dedicated `turn/steer` API eliminates all those quirks.

`turn/steer` behavior:
- Requires an active turn on threadId. Returns a JSON-RPC error if there
is no active turn.
- If expectedTurnId is provided, it must match the active turn (more
useful when connecting to a remote app-server).
- Does not emit `turn/started`.
- Does not accept turn overrides (`cwd`, `model`, `sandbox`, etc.) or
`outputSchema` to accurately reflect that these are not applied when
steering.
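
The `turn/steer` rules above can be sketched as a small validation function. This is a minimal illustration under stated assumptions, not the app-server's code: `ThreadState`, `SteerError`, and `steer` are hypothetical names, and the real API returns a JSON-RPC error rather than a Rust enum.

```rust
// Hypothetical types for illustration only; not the app-server's definitions.
#[derive(Debug, PartialEq)]
pub enum SteerError {
    /// No turn is currently running on the thread.
    NoActiveTurn,
    /// The caller's expected turn id does not match the active turn.
    TurnIdMismatch { expected: String, active: String },
}

pub struct ThreadState {
    pub active_turn_id: Option<String>,
    pub pending_input: Vec<String>,
}

/// Appends `input` to the in-flight turn and returns the id of that turn.
/// No new turn id is created, which mirrors "does not emit `turn/started`".
pub fn steer(
    thread: &mut ThreadState,
    expected_turn_id: Option<&str>,
    input: &str,
) -> Result<String, SteerError> {
    // Steering requires an active turn on the thread.
    let active = thread
        .active_turn_id
        .clone()
        .ok_or(SteerError::NoActiveTurn)?;

    // If the caller supplied an expected turn id, it must match.
    if let Some(expected) = expected_turn_id {
        if expected != active.as_str() {
            return Err(SteerError::TurnIdMismatch {
                expected: expected.to_string(),
                active,
            });
        }
    }

    thread.pending_input.push(input.to_string());
    Ok(active)
}
```

Passing the expected turn id lets a remote client detect the case where the server has already completed the turn it intended to steer.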
2026-02-06 00:35:04 +00:00
Matthew Zeng
729b016515
Add stage field for experimental flags. (#10793)
- [x] Add stage field for experimental flags.
2026-02-05 23:31:04 +00:00
Noah Jorgensen
dcea972db8
updates: use brew api for version check (#10809)
## Problem

`codex` currently prompts you to update via `brew upgrade --cask codex`
but the brew api does not return the new version

## Solution

`codex-rs/tui/src/updates.rs` was using the [latest cask in
github](https://github.com/Homebrew/homebrew-cask/blob/HEAD/Casks/c/codex.rb)
but this does not agree with the brew api, which leads to the issue
above. Instead we use the [brew api json
endpoint](https://github.com/Homebrew/homebrew-cask/blob/HEAD/Casks/c/codex.rb)
to ensure our version check agrees with the upgrade command.
2026-02-05 15:12:27 -08:00
pakrym-oai
dbe47ea01a
Send beta header with websocket connects (#10727) 2026-02-05 15:05:02 -08:00
sayan-oai
378f1cabe8
go back to auto-enabling web_search for azure (#10820)
###### What
Remove special-casing that prevented auto-enabling `web_search` for
Azure model provider users. Addresses #10071, #10257.

###### Why
Azure fixed their Responses API implementation; `web_search` is now
supported on models it wasn't before (like `gpt-5.1-codex-max`).

This request now works:
```
curl "$AZURE_API_ENDPOINT" -H "Content-Type: application/json" -H "Authorization: Bearer $AZURE_API_KEY" -d '{
  "model": "gpt-5.1-codex-max",
  "tools": [
    { "type": "web_search" }
  ],
  "tool_choice": "auto",
  "input": "Find the sunrise time in Paris today and cite the source."
}'
```

###### Tests
Tested with above curl, removed Azure-specific tests.
2026-02-05 14:57:07 -08:00
xl-openai
43a7290f11
Sync app-server requirements API with refreshed cloud loader (#10815)
configRequirements/read now returns updated cloud requirements after
login.
2026-02-05 14:43:31 -08:00
Max Johnson
8473096efb
Add app-server transport layer with websocket support (#10693)
- Adds `--listen <URL>` to `codex app-server` with two listen modes:
  - `stdio://` (default, existing behavior)
  - `ws://IP:PORT` (new websocket transport)
- Refactors message routing to be connection-aware:
  - Tracks per-connection session state (initialize/experimental
    capability)
  - Routes responses/errors to the originating connection
  - Broadcasts server notifications/requests to initialized connections
- Updates initialization semantics to be per connection (not
  process-global), and updates app-server docs accordingly.
- Adds websocket accept/read/write handling (JSON-RPC per text frame,
  ping/pong handling, connection lifecycle events).

Testing

- Unit tests for transport URL parsing and targeted response/error
  routing.
- New websocket integration test validating:
  - per-connection initialization requirements
  - no cross-connection response leakage
  - same request IDs on different connections route independently.
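
As a rough illustration of the two listen modes described above, transport URL parsing might look like the following. This is a sketch under assumptions: `ListenTarget` and `parse_listen_url` are hypothetical names, and the actual parser may accept other forms or report errors differently.

```rust
use std::net::SocketAddr;

// Hypothetical type for illustration; not the app-server's definition.
#[derive(Debug, PartialEq)]
pub enum ListenTarget {
    /// `stdio://`, the default and pre-existing behavior.
    Stdio,
    /// `ws://IP:PORT`, the websocket transport.
    WebSocket(SocketAddr),
}

/// Parses a `--listen` value into a transport target.
pub fn parse_listen_url(url: &str) -> Result<ListenTarget, String> {
    if url == "stdio://" {
        return Ok(ListenTarget::Stdio);
    }
    if let Some(rest) = url.strip_prefix("ws://") {
        // Only a literal IP:PORT is accepted in this sketch (no hostnames).
        return rest
            .parse::<SocketAddr>()
            .map(ListenTarget::WebSocket)
            .map_err(|e| format!("invalid ws address {:?}: {}", rest, e));
    }
    Err(format!("unsupported listen URL: {}", url))
}
```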
2026-02-05 20:56:34 +00:00
jif-oai
428a9f6035
feat: wait for backfill to be ready (#10790) 2026-02-05 20:45:16 +00:00
pap-openai
529b539564
Add analytics for /rename and /fork (#10655) 2026-02-05 20:18:29 +00:00
Matthew Zeng
7e81f63698
[app-server] Add a method to list experimental features. (#10721)
- [x] Add a method to list experimental features.
2026-02-05 20:04:01 +00:00
jif-oai
ddd09a9368
fix: announcement in prio (#10783) 2026-02-05 19:57:57 +00:00
sayan-oai
5fdf6f5efa
chore: rm web-search-eligible header (#10660)
default-enablement of web_search is now client-side, no need to send
eligibility headers to backend.

Tested locally, headers no longer sent.

will wait for corresponding backend change to deploy before merging
2026-02-05 11:48:34 -08:00
iceweasel-oai
901d5b8fd6
add sandbox policy and sandbox name to codex.tool.call metrics (#10711)
This will give visibility into the comparative success rate of the
Windows sandbox implementations compared to other platforms.
2026-02-05 11:42:12 -08:00
Owen Lin
3582b74d01
fix(auth): isolate chatgptAuthTokens concept to auth manager and app-server (#10423)
So that the rest of the codebase (like the TUI) doesn't need to be
concerned with whether ChatGPT auth was handled by Codex itself or
passed in via app-server's external auth mode.
2026-02-05 10:46:06 -08:00
Owen Lin
5c0fd62ff1
fix(tui): fix resume_picker_orders_by_updated_at test (#10769)
I think this was due to https://github.com/openai/codex/issues/10752
landing and not rebased on top of
9ee746afd6
2026-02-05 18:03:10 +00:00
Felipe Coury
22545bf206
feat(tui): add sortable resume picker with created/updated timestamp toggle (#10752)
## Summary

- Add sorting support to the resume session picker with Tab key toggle
- Sessions can now be sorted by either creation time or last updated
time
- Display the current sort mode in the picker header
- Default to sorting by creation time (most recent first)

## Changes

- Add `sort_key` field to `PickerState` to track current sort order
- Pass sort key to `RolloutRecorder::list_threads()` for proper backend
sorting
- Add Tab key handler to toggle between `CreatedAt` and `UpdatedAt`
sorting
- Show current sort mode ("Created at" / "Updated at") in header
- Add "Tab to toggle sort" keyboard hint
- Intelligently hide secondary date column when terminal is narrow
- Reload session list when sort order changes

## Test plan

- [x] Unit tests for sort key toggle functionality
- [x] Snapshot tests updated for new header format
- [x] Test that Tab key triggers reload with new sort key
- [x] Test column visibility adapts to narrow terminals
2026-02-05 09:08:31 -08:00
Felipe Coury
b0e5a6305b
feat(tui): add /statusline command for interactive status line configuration (#10546)
## Summary
- Adds a new `/statusline` command to configure TUI footer status line
- Introduces reusable `MultiSelectPicker` component with keyboard
navigation, optional ordering and toggle support
- Implements a status line setup modal that persists configuration to
`config.toml`

  ## Status Line Items
  The following items can be displayed in the status line:
  - **Model**: Current model name (with optional reasoning level)
  - **Context**: Remaining/used context window percentage
  - **Rate Limits**: 5-day and weekly usage limits
  - **Git**: Current branch (with optimized lookups)
  - **Tokens**: Used tokens, input/output token counts
  - **Session**: Session ID (full or shortened prefix)
  - **Paths**: Current directory, project root
  - **Version**: Codex version

  ## Features
  - Live preview while configuring status line items
  - Fuzzy search filtering in the picker
  - Intelligent truncation when items don't fit
  - Items gracefully omit when data is unavailable
  - Configuration persists to `config.toml`
  - Validates and warns about invalid status line items

  ## Test plan
  - [x] Run `/statusline` and verify picker UI appears
  - [x] Toggle items on/off and verify live preview updates
  - [x] Confirm selection persists after restart
  - [x] Verify truncation behavior with many items selected
  - [x] Test git branch detection in and out of git repos

---------

Co-authored-by: Josh McKinney <joshka@openai.com>
2026-02-05 08:50:21 -08:00
gt-oai
3b54fd7336
Add hooks implementation and wire up to notify (#9691)
This introduces a `Hooks` service. It registers hooks from config and
dispatches hook events at runtime.

N.B. The hook config is not wired up to this yet. But for legacy
reasons, we wire up `notify` from config and power it using hooks now.
Nothing about the `notify` interface has changed.

I'd start by reviewing `hooks/types.rs`

Some things to note:
  - hook names subject to change
  - no hook result yet
  - stopping semantics yet to be introduced
  - additional hooks yet to be introduced
2026-02-05 16:49:35 +00:00
jif-oai
9ee746afd6
Leverage state DB metadata for thread summaries (#10621)
Summary:
- read conversation summaries and cwd info from the state DB when
possible so we no longer rely on rollout files for metadata and avoid
extra I/O
- persist CLI version in thread metadata, surface it through summary
builders, and add the necessary DB migration hooks
- simplify thread listing by using enriched state DB data directly
rather than reading rollout heads

Testing:
- Not run (not requested)
2026-02-05 16:39:11 +00:00
jif-oai
68e82e5dc9
nit: add DB version in discrepancy recording (#10762) 2026-02-05 16:24:18 +00:00
jif-oai
901215e310
feat: repair DB in case of missing lines (#10751) 2026-02-05 16:21:49 +00:00
jif-oai
41f3b1ba0b
feat: add memory tool (#10637)
Add a tool for memory to retrieve a full memory based on the memory ID
2026-02-05 16:16:31 +00:00
jif-oai
fe1cbd0f38
chore: handle shutdown correctly in tui (#10756) 2026-02-05 16:07:50 +00:00
jif-oai
d337b51741
feat: wire ephemeral in codex exec (#10758) 2026-02-05 15:49:57 +00:00
jif-oai
4033f905c6
feat: resumable backfill (#10745)
## Summary

This PR makes SQLite rollout backfill resumable and repeatable instead
of one-shot-on-db-create.

## What changed

- Added a persisted backfill state table:
  - state/migrations/0008_backfill_state.sql
  - Tracks status (pending|running|complete), last_watermark, and
    last_success_at.
- Added backfill state model/types in codex-state:
  - BackfillState, BackfillStatus (state/src/model/backfill_state.rs)
- Added runtime APIs to manage backfill lifecycle/progress:
  - get_backfill_state
  - mark_backfill_running
  - checkpoint_backfill
  - mark_backfill_complete
- Updated core startup behavior:
  - Backfill now runs whenever state is not Complete (not only when DB
    file is newly created).
- Reworked backfill execution:
  - Collect rollout files, derive deterministic watermark per path, sort,
    resume from last_watermark.
  - Process in batches (BACKFILL_BATCH_SIZE = 200), checkpoint after each
    batch.
  - Mark complete with last_success_at at the end.
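
The resume-from-watermark flow listed above can be illustrated with an in-memory sketch. Everything here is an assumption made for illustration: the types only loosely mirror `BackfillState` and `BackfillStatus`, the watermark is taken to be the path string itself, `max_batches` exists only to simulate an interrupted process, and nothing is persisted to SQLite.

```rust
// Illustrative stand-ins; the real model lives in codex-state.
#[derive(Debug, Clone, PartialEq)]
pub enum BackfillStatus {
    Pending,
    Running,
    Complete,
}

#[derive(Debug, Clone, PartialEq)]
pub struct BackfillState {
    pub status: BackfillStatus,
    pub last_watermark: Option<String>,
}

/// Processes rollout paths in sorted batches, checkpointing after each batch.
/// Stops early after `max_batches` batches to simulate a process exit.
/// Returns the paths processed by this call. `batch_size` must be non-zero.
pub fn run_backfill(
    state: &mut BackfillState,
    paths: &[&str],
    batch_size: usize,
    max_batches: usize,
) -> Vec<String> {
    // Backfill runs whenever the state is not Complete.
    if state.status == BackfillStatus::Complete {
        return Vec::new();
    }
    state.status = BackfillStatus::Running;

    // Keep only paths past the last checkpoint, then sort deterministically.
    let mut pending: Vec<String> = paths
        .iter()
        .map(|p| p.to_string())
        .filter(|w| state.last_watermark.as_ref().map_or(true, |last| w > last))
        .collect();
    pending.sort();

    let mut processed = Vec::new();
    for (i, batch) in pending.chunks(batch_size).enumerate() {
        if i == max_batches {
            // Simulated interruption: status stays Running, watermark is kept.
            return processed;
        }
        processed.extend(batch.iter().cloned());
        // Checkpoint: record the highest watermark handled so far.
        state.last_watermark = batch.last().cloned();
    }

    state.status = BackfillStatus::Complete;
    processed
}
```

Calling `run_backfill` again after an interrupted run picks up at the first path after `last_watermark` instead of starting from scratch.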

## Why

Previous behavior could leave users permanently partially backfilled if
the process exited during initial async backfill. This change allows
safe continuation across restarts and avoids restarting from scratch.
2026-02-05 14:34:34 +00:00
iceweasel-oai
f2ffc4e5d0
Include real OS info in metrics. (#10425)
Calculates a hashed user ID from either the auth user ID or the API key.
Also correctly populates the OS.

These will make our metrics more useful and powerful for analysis.
2026-02-05 06:30:31 -08:00
jif-oai
040ecee715
Update explorer role default model (#10748)
Summary
- switch the explorer role in core agent configuration to use
`gpt-5.1-codex-mini` as the default model override
- leave other role defaults untouched

Testing
- Not run (not requested)
2026-02-05 13:51:53 +00:00
pap-openai
b2424cb635
adding fork information (UI) when forking (#10246)
- shows `/fork` command that ran in prev session
- shows `session forked from name (uuid) || uuid (if name is not set)` as an event in new session
2026-02-05 13:24:55 +00:00
jif-oai
aa46b5cf99
nit: backfill stronger (#10738) 2026-02-05 12:30:16 +00:00
jif-oai
97582ac52d
Allow user shell commands to run alongside active turns (#10513)
Summary
- refactor user shell command execution into a shared helper and add
modes for standalone vs active-turn execution
- run user shell commands asynchronously when a turn is already active
so they don’t replace or abort the current turn
- extend the tests to cover the new behavior and add the generated Codex
environment manifest

Testing
- Not run (not requested)
2026-02-05 11:11:00 +00:00
jif-oai
c67120f4a0
fix: flaky landlock (#10689)
https://openai.slack.com/archives/C095U48JNL9/p1770243347893959
2026-02-05 10:30:18 +00:00
Ashutosh Kumar Singh
7b28b350e1
fix(tui): flush input buffer on init to prevent early exit on Windows (#10729)
Fixes #10661.

### Problem
On Windows, the sign-in menu can exit immediately if the OS-level input
buffer contains trailing characters (like the Enter key from running the
command).

### Solution
**Flush Input Buffer on Init**: Use `FlushConsoleInputBuffer` on Windows
(and `tcflush` on Unix) in `ui::init()` to discard any input captured before
the TUI was ready.

Verified by @CodebyAmbrose in #10661.
2026-02-05 00:59:32 -08:00
Dylan Hurd
fe8b474acd
fix(core,app-server) resume with different model (#10719)
## Summary
When resuming with a different model, we should also append a developer
message with the model instructions

## Testing
- [x] Added unit tests
2026-02-05 00:40:05 -08:00
xl-openai
1e1146cd29
Reload cloud requirements after user login (#10725)
Reload cloud requirements after user login so it could take effect
immediately.
2026-02-05 00:27:16 -08:00
Charley Cunningham
dc7007beaa
Fix remote compaction estimator/payload instruction small mismatch (#10692)
## Summary
This PR fixes a deterministic mismatch in remote compaction where
pre-trim estimation and the `/v1/responses/compact` payload could use
different base instructions.

Before this change:
- pre-trim estimation used model-derived instructions
(`model_info.get_model_instructions(...)`)
- compact payload used session base instructions
(`sess.get_base_instructions()`)

After this change:
- remote pre-trim estimation and compact payload both use the same
`BaseInstructions` instance from session state.

## Changes
- Added a shared estimator entry point in `ContextManager`:
  - `estimate_token_count_with_base_instructions(&self, base_instructions: &BaseInstructions) -> Option<i64>`
- Kept `estimate_token_count(&TurnContext)` as a thin wrapper that
  resolves model/personality instructions and delegates to the new helper.
- Updated remote compaction flow to fetch base instructions once and
  reuse it for both:
  - trim preflight estimation
  - compact request payload construction
- Added regression coverage for parity and behavior:
  - unit test verifying explicit-base estimator behavior
  - integration test proving remote compaction uses session override
    instructions and trims accordingly

## Why this matters
This removes a deterministic divergence source where pre-trim could
think the request fits while the actual compact request exceeded context
because its instructions were longer/different.

## Scope
In scope:
- estimator/payload base-instructions parity in remote compaction

Out of scope:
- retry-on-`context_length_exceeded`
- compaction threshold/headroom policy changes
- broader trimming policy changes

## Codex author:
`codex fork 019c2b24-c2df-7b31-a482-fb8cf7a28559`
2026-02-04 23:24:06 -08:00
Ahmed Ibrahim
cd5f49a619
Make steer stable by default (#10690)
Promotes the Steer feature from Experimental to Stable and enables it by
default.

## What is Steer mode?

Steer mode changes how message submission works in the TUI:

- **With Steer enabled (new default)**:
  - `Enter` submits messages immediately, even when a task is running
  - `Tab` queues messages when a task is running (allows building up a
    queue)

- **With Steer disabled (old behavior)**:
  - `Enter` queues messages when a task is running
  - This preserves the previous "queue while a task is running" behavior

## How Steer vs Queue work

The key difference is in the submission behavior:

1. **Steer mode** (`steer_enabled = true`):
   - Enter → `InputResult::Submitted` → sends immediately via
     `submit_user_message()`
   - Tab → `InputResult::Queued` → queues via `queue_user_message()` if a
     task is running
   - This gives users direct control: Enter for immediate submission, Tab
     for queuing

2. **Queue mode** (`steer_enabled = false`, previous default):
   - Enter → `InputResult::Queued` → always queues when a task is running
   - Tab → `InputResult::Queued` → queues when a task is running
   - This preserves the original behavior where Enter respects the running
     task queue
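
The key-to-result mapping above can be written as a small decision table. This is a simplified sketch, not the `ChatComposer` code: `Key` and `resolve_key` are hypothetical, `InputResult` is reduced to the two variants named in the description, and treating every key as an immediate submit when no task is running is an assumption the description does not confirm.

```rust
// Hypothetical, simplified types for illustration only.
#[derive(Debug, PartialEq, Clone, Copy)]
pub enum Key {
    Enter,
    Tab,
}

#[derive(Debug, PartialEq, Clone, Copy)]
pub enum InputResult {
    Submitted,
    Queued,
}

/// Maps a key press to a submission result for the two modes described above.
pub fn resolve_key(key: Key, steer_enabled: bool, task_running: bool) -> InputResult {
    match (key, steer_enabled, task_running) {
        // Assumption: with no task running there is nothing to queue behind.
        (_, _, false) => InputResult::Submitted,
        // Steer mode: Enter submits immediately even while a task runs.
        (Key::Enter, true, true) => InputResult::Submitted,
        // Queue mode: Enter queues while a task runs.
        (Key::Enter, false, true) => InputResult::Queued,
        // Tab queues while a task runs in both modes.
        (Key::Tab, _, true) => InputResult::Queued,
    }
}
```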

## Implementation details

The behavior is controlled in
`ChatComposer::handle_key_event_without_popup()`:
- When `steer_enabled` is true, Enter calls `handle_submission(false)`
(submit immediately)
- When `steer_enabled` is false, Enter calls `handle_submission(true)`
(queue)

See `codex-rs/tui/src/bottom_pane/chat_composer.rs` for the
implementation.

## Documentation

For more details on the chat composer behavior, see:
- [TUI Chat Composer documentation](docs/tui-chat-composer.md)
- Feature flag definition: `codex-rs/core/src/features.rs`
2026-02-04 23:12:59 -08:00
Charley Cunningham
41b4962b0a
Sync collaboration mode naming across Default prompt, tools, and TUI (#10666)
## Summary
- add shared `ModeKind` helpers for display names, TUI visibility, and
`request_user_input` availability
- derive TUI mode filtering/labels from shared `ModeKind` metadata
instead of local hardcoded matches
- derive `request_user_input` availability text and unavailable error
mode names from shared mode metadata
- replace hardcoded known mode names in the Default collaboration-mode
template with `{{KNOWN_MODE_NAMES}}` and fill it from
`TUI_VISIBLE_COLLABORATION_MODES`
- add regression tests for mode metadata sync and placeholder
replacement

## Notes
- `cargo test -p codex-core` integration target (`tests/all`) still
shows pre-existing env-specific failures in this environment due to missing
`test_stdio_server` binary resolution; core unit tests are green.

## Codex author
`codex resume 019c26ff-dfe7-7173-bc04-c9e1fff1e447`
2026-02-04 23:03:28 -08:00
Dylan Hurd
e482978261
fix(core) switching model appends model instructions (#10651)
## Summary
When switching models, we should append the instructions of the new
model to the conversation as a developer message.

## Test
- [x] Adds a unit test
2026-02-05 05:50:38 +00:00
Dylan Hurd
a05aadfa1b
chore(config) Default Personality Pragmatic (#10705)
## Summary
Switch back to Pragmatic personality

## Testing
- [x] Updated unit tests
2026-02-04 21:22:47 -08:00
sayan-oai
4ed8d74aab
fix: ensure status indicator present earlier in exec path (#10700)
ensure status indicator present in all classifications of exec tool.
fixes indicator disappearing after preambles, will look into using
`phase` to avoid this class of error in a few hours.

commands parsed as unknown faced this issue

tested locally, added test for specific failure flow
2026-02-05 03:56:50 +00:00
Josh McKinney
d876f3b94f
fix(tui): restore working shimmer after preamble output (#10701)
## Problem
When a turn streamed a preamble line before any tool activity,
`ChatWidget` hid the status row while committing streamed lines and did
not restore it until a later event (commonly `ExecCommandBegin`). During
that idle gap, the UI looked finished even though the turn was still
active.

## Mental model
The bottom status row and transcript stream are separate progress
affordances:
- transcript stream shows committed output
- status row (spinner/shimmer + header) shows liveness of an active turn

While stream output is actively committing, hiding the status row is
acceptable to avoid redundant visual noise. Once stream controllers go
idle, an active turn must restore the status row immediately so liveness
remains visible across preamble-to-tool gaps.

## Non-goals
- No changes to streaming chunking policy or pacing.
- No changes to final completion behavior (status still hides when task
actually ends).
- No refactor of status lifecycle ownership between `ChatWidget` and
`BottomPane`.

## Tradeoffs
- We keep the existing behavior of hiding the status row during active
stream commits.
- We add explicit restoration on the idle boundary when the task is
still running.
- This introduces one extra status update on idle transitions, which is
small overhead but makes liveness semantics consistent.

## Architecture
`run_commit_tick_with_scope` in `chatwidget.rs` now documents and
enforces a two-phase contract:
1. For each committed streamed cell, hide status and append transcript
output.
2. If controllers are present and all idle, restore status iff task is
still running, preserving the current header.

This keeps status ownership in `ChatWidget` while relying on
`BottomPane` helpers:
- `hide_status_indicator()` during active stream commits
- `ensure_status_indicator()` +
`set_status_header(current_status_header)` at stream-idle boundary
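
The two-phase contract described above can be modelled with plain state. This is a toy sketch, not the `ChatWidget` implementation: `Ui` and `commit_tick` are hypothetical, and the controller and task state are passed in as booleans rather than queried from `BottomPane`.

```rust
// Toy model for illustration; field and function names are hypothetical.
pub struct Ui {
    pub status_visible: bool,
    pub status_header: String,
    pub transcript: Vec<String>,
}

/// One commit tick: hide status while committing, restore it at the idle boundary.
pub fn commit_tick(
    ui: &mut Ui,
    committed_cells: &[&str],
    controllers_present: bool,
    all_idle: bool,
    task_running: bool,
) {
    // Phase 1: for each committed streamed cell, hide status and append output.
    for cell in committed_cells {
        ui.status_visible = false;
        ui.transcript.push((*cell).to_string());
    }

    // Phase 2: once controllers are idle, restore status only if the task is
    // still running. The header is left untouched, so it is preserved.
    if controllers_present && all_idle && task_running {
        ui.status_visible = true;
    }
}
```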

Documentation pass additions:
- Clarified the function-level contract and lifecycle intent in
`run_commit_tick_with_scope`.
- Added an explicit regression snapshot test comment describing the
failing sequence.

## Observability
Signal that the fix is present:
- In the preamble-idle state, rendered output still includes
`• Working (… esc to interrupt)`.
- New snapshot:
`codex_tui__chatwidget__tests__preamble_keeps_working_status.snap`.

Debug path for future regressions:
- Start at `run_commit_tick_with_scope` for hide/restore transitions.
- Verify `bottom_pane.is_task_running()` at idle transition.
- Confirm `current_status_header` continuity when status is recreated.
- Use the new snapshot and targeted test sequence to reproduce
deterministic preamble-idle behavior.

## Tests
- Updated regression assertion:
  - `streaming_final_answer_keeps_task_running_state` now expects status
    widget to remain present while turn is running.
- Renamed/updated behavioral regression:
  - `preamble_keeps_status_indicator_visible_until_exec_begin`.
- Added snapshot regression coverage:
  - `preamble_keeps_working_status_snapshot`.
- Snapshot file:
  `tui/src/chatwidget/snapshots/codex_tui__chatwidget__tests__preamble_keeps_working_status.snap`.

Commands run:
- `just fmt`
- `cargo test -p codex-tui preamble_keeps_status_indicator_visible_until_exec_begin`
- `cargo test -p codex-tui preamble_keeps_working_status_snapshot`

## Risks / Inconsistencies
- Status visibility policy is still split across multiple event paths
(`commit tick`, `turn complete`, `exec begin`), so future regressions
can reintroduce ordering gaps.
- Restoration depends on `is_task_running()` correctness; if task
lifecycle flags drift, status behavior will drift too.
- Snapshot proves rendered state, not animation cadence; cadence still
relies on frame scheduling behavior elsewhere.
2026-02-04 19:28:13 -08:00
Dylan Hurd
73f32840c6
chore(core) personality migration tests (#10650)
## Summary
Adds additional tests for personality edge cases

## Testing
- [x] These are tests
2026-02-04 19:03:14 -08:00
gt-oai
1f47e08d66
Cloud Requirements: increase timeout and retries (#10631)
Add retries and an increased-length timeout for loading Cloud
Requirements.

Co-authored-by: alexsong-oai <alexsong@openai.com>
2026-02-05 01:52:12 +00:00