This PR removes the inaccurate "Disable in /experimental." statement now
that the "personality" feature flag is no longer experimental.
This addresses #10850
<img width="1019" height="284" alt="Screenshot 2026-02-05 at 23 34 08"
src="https://github.com/user-attachments/assets/19ec3ce1-3c3b-40f5-b251-a31d964bf3bb"
/>
Currently, if a config value is set that fails the requirements, we exit
Codex.
Now, instead of this, we print a warning and default to a
requirements-permitting value.
This PR adds a dedicated `turn/steer` API for appending user input to an
in-flight turn.
## Motivation
Currently, steering in the app is implemented by just calling
`turn/start` while a turn is running. This has some really weird quirks:
- Client gets back a new `turn.id`, even though streamed
events/approvals remained tied to the original active turn ID.
- All the various turn-level override params on `turn/start` do not
apply to the "steer", and would only apply to the next real turn.
- There can also be a race condition where the client thinks the turn is
active but the server has already completed it, so there might be bugs
if the client has baked in some client-specific behavior thinking it's a
steer when in fact the server kicked off a new turn. This is
particularly possible when running a client against a remote app-server.
Having a dedicated `turn/steer` API eliminates all those quirks.
`turn/steer` behavior:
- Requires an active turn on threadId. Returns a JSON-RPC error if there
is no active turn.
- If expectedTurnId is provided, it must match the active turn (more
useful when connecting to a remote app-server).
- Does not emit `turn/started`.
- Does not accept turn overrides (`cwd`, `model`, `sandbox`, etc.) or
`outputSchema` to accurately reflect that these are not applied when
steering.
###### What
Remove special-casing that prevented auto-enabling `web_search` for
Azure model provider users. Addresses #10071, #10257.
###### Why
Azure fixed their responsesapi implementation; `web_search` is now
supported on models it wasn't before (like `gpt-5.1-codex-max`).
This request now works:
```
curl "$AZURE_API_ENDPOINT" -H "Content-Type: application/json" -H "Authorization: Bearer $AZURE_API_KEY" -d '{
"model": "gpt-5.1-codex-max",
"tools": [
{ "type": "web_search" }
],
"tool_choice": "auto",
"input": "Find the sunrise time in Paris today and cite the source."
}'
```
###### Tests
Tested with above curl, removed Azure-specific tests.
- Adds --listen <URL> to codex app-server with two listen modes:
- stdio:// (default, existing behavior)
- ws://IP:PORT (new websocket transport)
- Refactors message routing to be connection-aware:
- Tracks per-connection session state (initialize/experimental
capability)
- Routes responses/errors to the originating connection
- Broadcasts server notifications/requests to initialized connections
- Updates initialization semantics to be per connection (not
process-global), and updates app-server docs accordingly.
- Adds websocket accept/read/write handling (JSON-RPC per text frame,
ping/pong handling, connection lifecycle events).
Testing
- Unit tests for transport URL parsing and targeted response/error
routing.
- New websocket integration test validating:
- per-connection initialization requirements
- no cross-connection response leakage
- same request IDs on different connections route independently.
default-enablement of web_search is now client-side, no need to send
eligibility headers to backend.
Tested locally, headers no longer sent.
will wait for corresponding backend change to deploy before merging
So that the rest of the codebase (like TUI) don't need to be concerned
whether ChatGPT auth was handled by Codex itself or passed in via
app-server's external auth mode.
## Summary
- Add sorting support to the resume session picker with Tab key toggle
- Sessions can now be sorted by either creation time or last updated
time
- Display the current sort mode in the picker header
- Default to sorting by creation time (most recent first)
## Changes
- Add `sort_key` field to `PickerState` to track current sort order
- Pass sort key to `RolloutRecorder::list_threads()` for proper backend
sorting
- Add Tab key handler to toggle between `CreatedAt` and `UpdatedAt`
sorting
- Show current sort mode ("Created at" / "Updated at") in header
- Add "Tab to toggle sort" keyboard hint
- Intelligently hide secondary date column when terminal is narrow
- Reload session list when sort order changes
## Test plan
- [x] Unit tests for sort key toggle functionality
- [x] Snapshot tests updated for new header format
- [x] Test that Tab key triggers reload with new sort key
- [x] Test column visibility adapts to narrow terminals
## Summary
- Adds a new `/statusline` command to configure TUI footer status line
- Introduces reusable `MultiSelectPicker` component with keyboard
navigation, optional ordering and toggle support
- Implement status line setup modal that persist configuration to
config.toml
## Status Line Items
The following items can be displayed in the status line:
- **Model**: Current model name (with optional reasoning level)
- **Context**: Remaining/used context window percentage
- **Rate Limits**: 5-day and weekly usage limits
- **Git**: Current branch (with optimized lookups)
- **Tokens**: Used tokens, input/output token counts
- **Session**: Session ID (full or shortened prefix)
- **Paths**: Current directory, project root
- **Version**: Codex version
## Features
- Live preview while configuring status line items
- Fuzzy search filtering in the picker
- Intelligent truncation when items don't fit
- Items gracefully omit when data is unavailable
- Configuration persists to `config.toml`
- Validates and warns about invalid status line items
## Test plan
- [x] Run `/statusline` and verify picker UI appears
- [x] Toggle items on/off and verify live preview updates
- [x] Confirm selection persists after restart
- [x] Verify truncation behavior with many items selected
- [x] Test git branch detection in and out of git repos
---------
Co-authored-by: Josh McKinney <joshka@openai.com>
This introduces a `Hooks` service. It registers hooks from config and
dispatches hook events at runtime.
N.B. The hook config is not wired up to this yet. But for legacy
reasons, we wire up `notify` from config and power it using hooks now.
Nothing about the `notify` interface has changed.
I'd start by reviewing `hooks/types.rs`
Some things to note:
- hook names subject to change
- no hook result yet
- stopping semantics yet to be introduced
- additional hooks yet to be introduced
Summary:
- read conversation summaries and cwd info from the state DB when
possible so we no longer rely on rollout files for metadata and avoid
extra I/O
- persist CLI version in thread metadata, surface it through summary
builders, and add the necessary DB migration hooks
- simplify thread listing by using enriched state DB data directly
rather than reading rollout heads
Testing:
- Not run (not requested)
## Summary
This PR makes SQLite rollout backfill resumable and repeatable instead
of one-shot-on-db-create.
## What changed
- Added a persisted backfill state table:
- state/migrations/0008_backfill_state.sql
- Tracks status (pending|running|complete), last_watermark, and
last_success_at.
- Added backfill state model/types in codex-state:
- BackfillState, BackfillStatus (state/src/model/backfill_state.rs)
- Added runtime APIs to manage backfill lifecycle/progress:
- get_backfill_state
- mark_backfill_running
- checkpoint_backfill
- mark_backfill_complete
- Updated core startup behavior:
- Backfill now runs whenever state is not Complete (not only when DB
file is newly created).
- Reworked backfill execution:
- Collect rollout files, derive deterministic watermark per path, sort,
resume from last_watermark.
- Process in batches (BACKFILL_BATCH_SIZE = 200), checkpoint after each
batch.
- Mark complete with last_success_at at the end.
## Why
Previous behavior could leave users permanently partially backfilled if
the process exited during initial async backfill. This change allows
safe continuation across restarts and avoids restarting from scratch.
calculated a hashed user ID from either auth user id or API key
Also correctly populates OS.
These will make our metrics more useful and powerful for analysis.
Summary
- switch the explorer role in core agent configuration to use
`gpt-5.1-codex-mini` as the default model override
- leave other role defaults untouched
Testing
- Not run (not requested)
Summary
- refactor user shell command execution into a shared helper and add
modes for standalone vs active-turn execution
- run user shell commands asynchronously when a turn is already active
so they don’t replace or abort the current turn
- extend the tests to cover the new behavior and add the generated Codex
environment manifest
Testing
- Not run (not requested)
Fixes#10661.
### Problem
On Windows, the sign-in menu can exit immediately if the OS-level input
buffer contains trailing characters (like the Enter key from running the
command).
### Solution
**Flush Input Buffer on Init**: Use FlushConsoleInputBuffer on Windows
(and cflush on Unix) in ui::init() to discard any input captured before
the TUI was ready.
Verified by @CodebyAmbrose in #10661.
## Summary
When resuming with a different model, we should also append a developer
message with the model instructions
## Testing
- [x] Added unit tests
## Summary
This PR fixes a deterministic mismatch in remote compaction where
pre-trim estimation and the `/v1/responses/compact` payload could use
different base instructions.
Before this change:
- pre-trim estimation used model-derived instructions
(`model_info.get_model_instructions(...)`)
- compact payload used session base instructions
(`sess.get_base_instructions()`)
After this change:
- remote pre-trim estimation and compact payload both use the same
`BaseInstructions` instance from session state.
## Changes
- Added a shared estimator entry point in `ContextManager`:
- `estimate_token_count_with_base_instructions(&self, base_instructions:
&BaseInstructions) -> Option<i64>`
- Kept `estimate_token_count(&TurnContext)` as a thin wrapper that
resolves model/personality instructions and delegates to the new helper.
- Updated remote compaction flow to fetch base instructions once and
reuse it for both:
- trim preflight estimation
- compact request payload construction
- Added regression coverage for parity and behavior:
- unit test verifying explicit-base estimator behavior
- integration test proving remote compaction uses session override
instructions and trims accordingly
## Why this matters
This removes a deterministic divergence source where pre-trim could
think the request fits while the actual compact request exceeded context
because its instructions were longer/different.
## Scope
In scope:
- estimator/payload base-instructions parity in remote compaction
Out of scope:
- retry-on-`context_length_exceeded`
- compaction threshold/headroom policy changes
- broader trimming policy changes
## Codex author:
`codex fork 019c2b24-c2df-7b31-a482-fb8cf7a28559`
Promotes the Steer feature from Experimental to Stable and enables it by
default.
## What is Steer mode?
Steer mode changes how message submission works in the TUI:
- **With Steer enabled (new default)**:
- `Enter` submits messages immediately, even when a task is running
- `Tab` queues messages when a task is running (allows building up a
queue)
- **With Steer disabled (old behavior)**:
- `Enter` queues messages when a task is running
- This preserves the previous "queue while a task is running" behavior
## How Steer vs Queue work
The key difference is in the submission behavior:
1. **Steer mode** (`steer_enabled = true`):
- Enter → `InputResult::Submitted` → sends immediately via
`submit_user_message()`
- Tab → `InputResult::Queued` → queues via `queue_user_message()` if a
task is running
- This gives users direct control: Enter for immediate submission, Tab
for queuing
2. **Queue mode** (`steer_enabled = false`, previous default):
- Enter → `InputResult::Queued` → always queues when a task is running
- Tab → `InputResult::Queued` → queues when a task is running
- This preserves the original behavior where Enter respects the running
task queue
## Implementation details
The behavior is controlled in
`ChatComposer::handle_key_event_without_popup()`:
- When `steer_enabled` is true, Enter calls `handle_submission(false)`
(submit immediately)
- When `steer_enabled` is false, Enter calls `handle_submission(true)`
(queue)
See `codex-rs/tui/src/bottom_pane/chat_composer.rs` for the
implementation.
## Documentation
For more details on the chat composer behavior, see:
- [TUI Chat Composer documentation](docs/tui-chat-composer.md)
- Feature flag definition: `codex-rs/core/src/features.rs`
## Summary
- add shared `ModeKind` helpers for display names, TUI visibility, and
`request_user_input` availability
- derive TUI mode filtering/labels from shared `ModeKind` metadata
instead of local hardcoded matches
- derive `request_user_input` availability text and unavailable error
mode names from shared mode metadata
- replace hardcoded known mode names in the Default collaboration-mode
template with `{{KNOWN_MODE_NAMES}}` and fill it from
`TUI_VISIBLE_COLLABORATION_MODES`
- add regression tests for mode metadata sync and placeholder
replacement
## Notes
- `cargo test -p codex-core` integration target (`tests/all`) still
shows pre-existing env-specific failures in this environment due missing
`test_stdio_server` binary resolution; core unit tests are green.
## Codex author
`codex resume 019c26ff-dfe7-7173-bc04-c9e1fff1e447`
## Summary
When switching models, we should append the instructions of the new
model to the conversation as a developer message.
## Test
- [x] Adds a unit test
ensure status indicator present in all classifications of exec tool.
fixes indicator disappearing after preambles, will look into using
`phase` to avoid this class of error in a few hours.
commands parsed as unknown faced this issue
tested locally, added test for specific failure flow
## Problem
When a turn streamed a preamble line before any tool activity,
`ChatWidget` hid the status row while committing streamed lines and did
not restore it until a later event (commonly `ExecCommandBegin`). During
that idle gap, the UI looked finished even though the turn was still
active.
## Mental model
The bottom status row and transcript stream are separate progress
affordances:
- transcript stream shows committed output
- status row (spinner/shimmer + header) shows liveness of an active turn
While stream output is actively committing, hiding the status row is
acceptable to avoid redundant visual noise. Once stream controllers go
idle, an active turn must restore the status row immediately so liveness
remains visible across preamble-to-tool gaps.
## Non-goals
- No changes to streaming chunking policy or pacing.
- No changes to final completion behavior (status still hides when task
actually ends).
- No refactor of status lifecycle ownership between `ChatWidget` and
`BottomPane`.
## Tradeoffs
- We keep the existing behavior of hiding the status row during active
stream commits.
- We add explicit restoration on the idle boundary when the task is
still running.
- This introduces one extra status update on idle transitions, which is
small overhead but makes liveness semantics consistent.
## Architecture
`run_commit_tick_with_scope` in `chatwidget.rs` now documents and
enforces a two-phase contract:
1. For each committed streamed cell, hide status and append transcript
output.
2. If controllers are present and all idle, restore status iff task is
still running, preserving the current header.
This keeps status ownership in `ChatWidget` while relying on
`BottomPane` helpers:
- `hide_status_indicator()` during active stream commits
- `ensure_status_indicator()` +
`set_status_header(current_status_header)` at stream-idle boundary
Documentation pass additions:
- Clarified the function-level contract and lifecycle intent in
`run_commit_tick_with_scope`.
- Added an explicit regression snapshot test comment describing the
failing sequence.
## Observability
Signal that the fix is present:
- In the preamble-idle state, rendered output still includes `• Working
(… esc to interrupt)`.
- New snapshot:
`codex_tui__chatwidget__tests__preamble_keeps_working_status.snap`.
Debug path for future regressions:
- Start at `run_commit_tick_with_scope` for hide/restore transitions.
- Verify `bottom_pane.is_task_running()` at idle transition.
- Confirm `current_status_header` continuity when status is recreated.
- Use the new snapshot and targeted test sequence to reproduce
deterministic preamble-idle behavior.
## Tests
- Updated regression assertion:
- `streaming_final_answer_keeps_task_running_state` now expects status
widget to remain present while turn is running.
- Renamed/updated behavioral regression:
- `preamble_keeps_status_indicator_visible_until_exec_begin`.
- Added snapshot regression coverage:
- `preamble_keeps_working_status_snapshot`.
- Snapshot file:
`tui/src/chatwidget/snapshots/codex_tui__chatwidget__tests__preamble_keeps_working_status.snap`.
Commands run:
- `just fmt`
- `cargo test -p codex-tui
preamble_keeps_status_indicator_visible_until_exec_begin`
- `cargo test -p codex-tui preamble_keeps_working_status_snapshot`
## Risks / Inconsistencies
- Status visibility policy is still split across multiple event paths
(`commit tick`, `turn complete`, `exec begin`), so future regressions
can reintroduce ordering gaps.
- Restoration depends on `is_task_running()` correctness; if task
lifecycle flags drift, status behavior will drift too.
- Snapshot proves rendered state, not animation cadence; cadence still
relies on frame scheduling behavior elsewhere.