Commit graph

287 commits

Author SHA1 Message Date
Michael Bolin
bfff0c729f
config: enforce enterprise feature requirements (#13388)
## Why

Enterprises can already constrain approvals, sandboxing, and web search
through `requirements.toml` and MDM, but feature flags were still only
configurable as managed defaults. That meant an enterprise could suggest
feature values, but it could not actually pin them.

This change closes that gap and makes enterprise feature requirements
behave like the other constrained settings. The effective feature set
now stays consistent with enterprise requirements during config load,
when config writes are validated, and when runtime code mutates feature
flags later in the session.

It also tightens the runtime API for managed features. `ManagedFeatures`
now follows the same constraint-oriented shape as `Constrained<T>`
instead of exposing panic-prone mutation helpers, and production code
can no longer construct it through an unconstrained `From<Features>`
path.

The PR also hardens the `compact_resume_fork` integration coverage on
Windows. After the feature-management changes,
`compact_resume_after_second_compaction_preserves_history` was
overflowing the libtest/Tokio thread stacks on Windows, so the test now
uses an explicit larger-stack harness as a pragmatic mitigation. That
may not be the ideal root-cause fix, and it merits a parallel
investigation into whether part of the async future chain should be
boxed to reduce stack pressure instead.

## What Changed

Enterprises can now pin feature values in `requirements.toml` with the
requirements-side `features` table:

```toml
[features]
personality = true
unified_exec = false
```

Only canonical feature keys are allowed in the requirements `features`
table; omitted keys remain unconstrained.

- Added a requirements-side pinned feature map to
`ConfigRequirementsToml`, threaded it through source-preserving
requirements merge and normalization in `codex-config`, and made the
TOML surface use `[features]` (while still accepting legacy
`[feature_requirements]` for compatibility).
- Exposed `featureRequirements` from `configRequirements/read`,
regenerated the JSON/TypeScript schema artifacts, and updated the
app-server README.
- Wrapped the effective feature set in `ManagedFeatures`, backed by
`ConstrainedWithSource<Features>`, and changed its API to mirror
`Constrained<T>`: `can_set(...)`, `set(...) -> ConstraintResult<()>`,
and result-returning `enable` / `disable` / `set_enabled` helpers.
- Removed the legacy-usage and bulk-map passthroughs from
`ManagedFeatures`; callers that need those behaviors now mutate a plain
`Features` value and reapply it through `set(...)`, so the constrained
wrapper remains the enforcement boundary.
- Removed the production loophole for constructing unconstrained
`ManagedFeatures`. Non-test code now creates it through the configured
feature-loading path, and `impl From<Features> for ManagedFeatures` is
restricted to `#[cfg(test)]`.
- Rejected legacy feature aliases in enterprise feature requirements,
and return a load error when a pinned combination cannot survive
dependency normalization.
- Validated config writes against enterprise feature requirements before
persisting changes, including explicit conflicting writes and
profile-specific feature states that normalize into invalid
combinations.
- Updated runtime and TUI feature-toggle paths to use the constrained
setter API and to persist or apply the effective post-constraint value
rather than the requested value.
- Updated the `core_test_support` Bazel target to include the bundled
core model-catalog fixtures in its runtime data, so helper code that
resolves `core/models.json` through runfiles works in remote Bazel test
environments.
- Renamed the core config test coverage to emphasize that effective
feature values are normalized at runtime, while conflicting persisted
config writes are rejected.
- Ran `compact_resume_after_second_compaction_preserves_history` inside
an explicit 8 MiB test thread and Tokio runtime worker stack, following
the existing larger-stack integration-test pattern, to keep the Windows
`compact_resume_fork` test slice from aborting while a parallel
investigation continues into whether some of the underlying async
futures should be boxed.

## Verification

- `cargo test -p codex-config`
- `cargo test -p codex-core feature_requirements_ -- --nocapture`
- `cargo test -p codex-core
load_requirements_toml_produces_expected_constraints -- --nocapture`
- `cargo test -p codex-core
compact_resume_after_second_compaction_preserves_history -- --nocapture`
- `cargo test -p codex-core compact_resume_fork -- --nocapture`
- Re-ran the built `codex-core` `tests/all` binary with
`RUST_MIN_STACK=262144` for
`compact_resume_after_second_compaction_preserves_history` to confirm
the explicit-stack harness fixes the deterministic low-stack repro.
- `cargo test -p codex-core`
- This still fails locally in unrelated integration areas that expect
the `codex` / `test_stdio_server` binaries or hit existing `search_tool`
wiremock mismatches.

## Docs

`developers.openai.com/codex` should document the requirements-side
`[features]` table for enterprise and MDM-managed configuration,
including that it only accepts canonical feature keys and that
conflicting config writes are rejected.
2026-03-04 04:40:22 +00:00
pash-openai
07e532dcb9
app-server service tier plumbing (plus some cleanup) (#13334)
followup to https://github.com/openai/codex/pull/13212 to expose fast
tier controls to app server
(majority of this PR is generated schema jsons - actual code is +69 /
-35 and +24 tests )

- add service tier fields to the app-server protocol surfaces used by
thread lifecycle, turn start, config, and session configured events
- thread service tier through the app-server message processor and core
thread config snapshots
- allow runtime config overrides to carry service tier for app-server
callers

cleanup:
- Removing useless "legacy" code supporting "standard" - we moved to
None | "fast", so "standard" is not needed.
2026-03-03 02:35:09 -08:00
pash-openai
2f5b01abd6
add fast mode toggle (#13212)
- add a local Fast mode setting in codex-core (similar to how model id
is currently stored on disk locally)
- send `service_tier=priority` on requests when Fast is enabled
- add `/fast` in the TUI and persist it locally
- feature flag
2026-03-02 20:29:33 -08:00
jif-oai
f8838fd6f3
feat: enable ma through /agent (#13246)
<img width="639" height="139" alt="Screenshot 2026-03-02 at 16 06 41"
src="https://github.com/user-attachments/assets/c006fcec-c1e7-41ce-bb84-c121d5ffb501"
/>

Then
<img width="372" height="37" alt="Screenshot 2026-03-02 at 16 06 49"
src="https://github.com/user-attachments/assets/aa4ad703-e7e7-4620-9032-f5cd4f48ff79"
/>
2026-03-02 18:37:29 +00:00
jif-oai
9a42a56d8f
chore: /multiagent alias for /agent (#13249)
Add a `/mutli-agents` alias for `/agent` and update the wording
2026-03-02 16:51:54 +00:00
jif-oai
2b38b4e03b
feat: approval for sub-agent in the TUI (#12995)
<img width="766" height="290" alt="Screenshot 2026-02-27 at 10 50 48"
src="https://github.com/user-attachments/assets/3bc96cd9-ed2c-4d67-a317-8f7b60abbbb1"
/>
2026-02-28 14:07:07 +01:00
Ahmed Ibrahim
ec6f6aacbf
Add model availability NUX tooltips (#13021)
- override startup tooltips with model availability NUX and persist
per-model show counts in config
- stop showing each model after four exposures and fall back to normal
tooltips
2026-02-27 17:14:06 -08:00
Ahmed Ibrahim
f90e97e414
Add realtime audio device picker (#12850)
## Summary
- add a dedicated /audio picker for realtime microphone and speaker
selection
- persist realtime audio choices and prompt to restart only local audio
when voice is live
- add snapshot coverage for the new picker surfaces

## Validation
- cargo test -p codex-tui
- cargo insta accept
- just fix -p codex-tui
- just fmt
2026-02-26 17:27:44 -08:00
Charley Cunningham
c1afb8815a
tui: use thread_id for resume/fork cwd resolution (#12727)
## Summary
- make resume/fork targets explicit and typed as `SessionTarget { path,
thread_id }` (non-optional `thread_id`)
- resolve `thread_id` centrally via `resolve_session_thread_id(...)`:
- use CLI input directly when it is a UUID (`--resume <uuid>` / `--fork
<uuid>`)
- otherwise read `thread_id` from rollout `SessionMeta` for path-based
selections (picker, `--resume-last`, name-based resume/fork)
- use `thread_id` to read cwd from SQLite first during resume/fork cwd
resolution
- keep rollout fallback for cwd resolution when SQLite is unavailable or
does not return thread metadata (`TurnContext` tail, then `SessionMeta`)
- keep the resume picker open when a selected row has unreadable session
metadata, and show an inline recoverable error instead of aborting the
TUI

## Why
This removes ad-hoc rollout filename parsing and makes resume/fork
target identity explicit. The resume/fork cwd check can use indexed
SQLite lookup by `thread_id` in the common path, while preserving
rollout-based fallback behavior. It also keeps malformed legacy rows
recoverable in the picker instead of letting a selection failure unwind
the app.

## Notes
- minimal TUI-only change; no schema/protocol changes
- includes TUI test coverage for SQLite cwd precedence when `thread_id`
is available
- includes TUI regression coverage for picker inline error rendering /
non-fatal unreadable session rows

## Codex author
`codex resume 019c9205-7f8b-7173-a2a2-f082d4df3de3`
2026-02-26 12:52:31 -08:00
jif-oai
f0a85ded18
fix: ctrl c sub agent (#12911) 2026-02-26 17:06:20 +00:00
Michael Bolin
14116ade8d
feat: include available decisions in command approval requests (#12758)
Command-approval clients currently infer which choices to show from
side-channel fields like `networkApprovalContext`,
`proposedExecpolicyAmendment`, and `additionalPermissions`. That makes
the request shape harder to evolve, and it forces each client to
replicate the server's heuristics instead of receiving the exact
decision list for the prompt.

This PR introduces a mapping between `CommandExecutionApprovalDecision`
and `codex_protocol::protocol::ReviewDecision`:

```rust
impl From<CoreReviewDecision> for CommandExecutionApprovalDecision {
    fn from(value: CoreReviewDecision) -> Self {
        match value {
            CoreReviewDecision::Approved => Self::Accept,
            CoreReviewDecision::ApprovedExecpolicyAmendment {
                proposed_execpolicy_amendment,
            } => Self::AcceptWithExecpolicyAmendment {
                execpolicy_amendment: proposed_execpolicy_amendment.into(),
            },
            CoreReviewDecision::ApprovedForSession => Self::AcceptForSession,
            CoreReviewDecision::NetworkPolicyAmendment {
                network_policy_amendment,
            } => Self::ApplyNetworkPolicyAmendment {
                network_policy_amendment: network_policy_amendment.into(),
            },
            CoreReviewDecision::Abort => Self::Cancel,
            CoreReviewDecision::Denied => Self::Decline,
        }
    }
}
```

And updates `CommandExecutionRequestApprovalParams` to have a new field:

```rust
available_decisions: Option<Vec<CommandExecutionApprovalDecision>>
```

when, if specified, should make it easier for clients to display an
appropriate list of options in the UI.

This makes it possible for `CoreShellActionProvider::prompt()` in
`unix_escalation.rs` to specify the `Vec<ReviewDecision>` directly,
adding support for `ApprovedForSession` when approving a skill script,
which was previously missing in the TUI.

Note this results in a significant change to `exec_options()` in
`approval_overlay.rs`, as the displayed options are now derived from
`available_decisions: &[ReviewDecision]`.

## What Changed

- Add `available_decisions` to
[`ExecApprovalRequestEvent`](de00e932dd/codex-rs/protocol/src/approvals.rs (L111-L175)),
including helpers to derive the legacy default choices when older
senders omit the field.
- Map `codex_protocol::protocol::ReviewDecision` to app-server
`CommandExecutionApprovalDecision` and expose the ordered list as
experimental `availableDecisions` in
[`CommandExecutionRequestApprovalParams`](de00e932dd/codex-rs/app-server-protocol/src/protocol/v2.rs (L3798-L3807)).
- Thread optional `available_decisions` through the core approval path
so Unix shell escalation can explicitly request `ApprovedForSession` for
session-scoped approvals instead of relying on client heuristics.
[`unix_escalation.rs`](de00e932dd/codex-rs/core/src/tools/runtimes/shell/unix_escalation.rs (L194-L214))
- Update the TUI approval overlay to build its buttons from the ordered
decision list, while preserving the legacy fallback when
`available_decisions` is missing.
- Update the app-server README, test client output, and generated schema
artifacts to document and surface the new field.

## Testing

- Add `approval_overlay.rs` coverage for explicit decision lists,
including the generic `ApprovedForSession` path and network approval
options.
- Update `chatwidget/tests.rs` and app-server protocol tests to populate
the new optional field and keep older event shapes working.

## Developers Docs

- If we document `item/commandExecution/requestApproval` on
[developers.openai.com/codex](https://developers.openai.com/codex), add
experimental `availableDecisions` as the preferred source of approval
choices and note that older servers may omit it.
2026-02-26 01:10:46 +00:00
Charley Cunningham
2f4d6ded1d
Enable request_user_input in Default mode (#12735)
## Summary
- allow `request_user_input` in Default collaboration mode as well as
Plan
- update the Default-mode instructions to prefer assumptions first and
use `request_user_input` only when a question is unavoidable
- update request_user_input and app-server tests to match the new
Default-mode behavior
- refactor collaboration-mode availability plumbing into
`CollaborationModesConfig` for future mode-related flags

## Codex author
`codex resume 019c9124-ed28-7c13-96c6-b916b1c97d49`
2026-02-25 15:20:46 -08:00
jif-oai
bcd6e68054
Display pending child-thread approvals in TUI (#12767)
Summary
- propagate approval policy from parent to spawned agents and drop the
Never override so sub-agents respect the caller’s request
- refresh the pending-approval list whenever events arrive or the active
thread changes and surface the list above the composer for inactive
threads
- add widgets, helpers, and tests covering the new pending-thread
approval UI state

![Uploading Screenshot 2026-02-25 at 11.02.18.png…]()
2026-02-25 11:40:11 +00:00
Won Park
ca556fa313
ctrl-L (clears terminal but does not start a new chat) (#12628)
# ctrl-L

- Clears your terminal window
- Does not start a new chat
2026-02-24 10:03:42 -08:00
jif-oai
0679e70bfc
fix: replay after /agent (#12663)
Filter the events after a`/agent` replay to prevent replaying decision
events
2026-02-24 12:08:38 +00:00
sayan-oai
7e46e5b9c2
chore: rm hardcoded PRESETS list (#12650)
rm `PRESETS` list harcoded in `model_presets` as we now have bundled
`models.json` with equivalent info.

update logic to rely on bundled models instead, update tests.
2026-02-23 22:35:51 -08:00
Jeremy Rose
855e275591
voice transcription (#3381)
Adds voice transcription on press-and-hold of spacebar.


https://github.com/user-attachments/assets/85039314-26f3-46d1-a83b-8c4a4a1ecc21

---------

Co-authored-by: Codex <199175422+chatgpt-codex-connector[bot]@users.noreply.github.com>
Co-authored-by: David Zbarsky <zbarsky@openai.com>
2026-02-23 22:15:18 +00:00
Won Park
a606e85859
tweaked /clear to support clear + new chat, also fix minor bug for macos terminal (#12520)
# /clear feature! 

Use /clear to start a new chat with Codex on a clean terminal!
2026-02-23 09:11:05 -08:00
jif-oai
829d1080f6
feat: keep dead agents in the agent picker (#12570) 2026-02-23 12:58:55 +00:00
Won Park
82d3c9ed76
feat(tui) /clear (#12444)
# /clear feature! 

/clear will clear your terminal while preserving the context/state of
the thread.
2026-02-21 22:06:56 -08:00
Felipe Coury
c4f1af7a86
feat(tui): syntax highlighting via syntect with theme picker (#11447)
## Summary

Adds syntax highlighting to the TUI for fenced code blocks in markdown
responses and file diffs, plus a `/theme` command with live preview and
persistent theme selection. Uses syntect (~250 grammars, 32 bundled
themes, ~1 MB binary cost) — the same engine behind `bat`, `delta`, and
`xi-editor`. Includes guardrails for large inputs, graceful fallback to
plain text, and SSH-aware clipboard integration for the `/copy` command.

<img width="1554" height="1014" alt="image"
src="https://github.com/user-attachments/assets/38737a79-8717-4715-b857-94cf1ba59b85"
/>

<img width="2354" height="1374" alt="image"
src="https://github.com/user-attachments/assets/25d30a00-c487-4af8-9cb6-63b0695a4be7"
/>

## Problem

Code blocks in the TUI (markdown responses and file diffs) render
without syntax highlighting, making it hard to scan code at a glance.
Users also have no way to pick a color theme that matches their terminal
aesthetic.

## Mental model

The highlighting system has three layers:

1. **Syntax engine** (`render::highlight`) -- a thin wrapper around
syntect + two-face. It owns a process-global `SyntaxSet` (~250 grammars)
and a `RwLock<Theme>` that can be swapped at runtime. All public entry
points accept `(code, lang)` and return ratatui `Span`/`Line` vectors or
`None` when the language is unrecognized or the input exceeds safety
guardrails.

2. **Rendering consumers** -- `markdown_render` feeds fenced code blocks
through the engine; `diff_render` highlights Add/Delete content as a
whole file and Update hunks per-hunk (preserving parser state across
hunk lines). Both callers fall back to plain unstyled text when the
engine returns `None`.

3. **Theme lifecycle** -- at startup the config's `tui.theme` is
resolved to a syntect `Theme` via `set_theme_override`. At runtime the
`/theme` picker calls `set_syntax_theme` to swap themes live; on cancel
it restores the snapshot taken at open. On confirm it persists `[tui]
theme = "..."` to config.toml.

## Non-goals

- Inline diff highlighting (word-level change detection within a line).
- Semantic / LSP-backed highlighting.
- Theme authoring tooling; users supply standard `.tmTheme` files.

## Tradeoffs

| Decision | Upside | Downside |
| ------------------------------------------------ |
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
|
-----------------------------------------------------------------------------------------------------------------------
|
| syntect over tree-sitter / arborium | ~1 MB binary increase for ~250
grammars + 32 themes; battle-tested crate powering widely-used tools
(`bat`, `delta`, `xi-editor`). tree-sitter would add ~12 MB for 20-30
languages or ~35 MB for full coverage. | Regex-based; less structurally
accurate than tree-sitter for some languages (e.g. language injections
like JS-in-HTML). |
| Global `RwLock<Theme>` | Enables live `/theme` preview without
threading Theme through every call site | Lock contention risk
(mitigated: reads vastly outnumber writes, single UI thread) |
| Skip background / italic / underline from themes | Terminal BG
preserved, avoids ugly rendering on some themes | Themes that rely on
these properties lose fidelity |
| Guardrails: 512 KB / 10k lines | Prevents pathological stalls on huge
diffs or pastes | Very large files render without color |

## Architecture

```
config.toml  ─[tui.theme]─>  set_theme_override()  ─>  THEME (RwLock)
                                                              │
                  ┌───────────────────────────────────────────┘
                  │
  markdown_render ─── highlight_code_to_lines(code, lang) ─> Vec<Line>
  diff_render     ─── highlight_code_to_styled_spans(code, lang) ─> Option<Vec<Vec<Span>>>
                  │
                  │   (None ⇒ plain text fallback)
                  │
  /theme picker   ─── set_syntax_theme(theme)    // live preview swap
                  ─── current_syntax_theme()      // snapshot for cancel
                  ─── resolve_theme_by_name(name) // lookup by kebab-case
```

Key files:

- `tui/src/render/highlight.rs` -- engine, theme management, guardrails
- `tui/src/diff_render.rs` -- syntax-aware diff line wrapping
- `tui/src/theme_picker.rs` -- `/theme` command builder
- `tui/src/bottom_pane/list_selection_view.rs` -- side content panel,
callbacks
- `core/src/config/types.rs` -- `Tui::theme` field
- `core/src/config/edit.rs` -- `syntax_theme_edit()` helper

## Observability

- `tracing::warn` when a configured theme name cannot be resolved.
- `Config::startup_warnings` surfaces the same message as a TUI banner.
- `tracing::error` when persisting theme selection fails.

## Tests

- Unit tests in `highlight.rs`: language coverage, fallback behavior,
CRLF stripping, style conversion, guardrail enforcement, theme name
mapping exhaustiveness.
- Unit tests in `diff_render.rs`: snapshot gallery at multiple terminal
sizes (80x24, 94x35, 120x40), syntax-highlighted wrapping, large-diff
guardrail, rename-to-different-extension highlighting, parser state
preservation across hunk lines.
- Unit tests in `theme_picker.rs`: preview rendering (wide + narrow),
dim overlay on deletions, subtitle truncation, cancel-restore, fallback
for unavailable configured theme.
- Unit tests in `list_selection_view.rs`: side layout geometry, stacked
fallback, buffer clearing, cancel/selection-changed callbacks.
- Integration test in `lib.rs`: theme warning uses the final
(post-resume) config.

## Cargo Deny: Unmaintained Dependency Exceptions

This PR adds two `cargo deny` advisory exceptions for transitive
dependencies pulled in by `syntect v5.3.0`:

| Advisory | Crate | Status |
|----------|-------|--------|
| RUSTSEC-2024-0320 | `yaml-rust` | Unmaintained (maintainer
unreachable) |
| RUSTSEC-2025-0141 | `bincode` | Unmaintained (development ceased;
v1.3.3 considered complete) |

**Why this is safe in our usage:**

- Neither advisory describes a known security vulnerability. Both are
"unmaintained" notices only.
- `bincode` is used by syntect to deserialize pre-compiled syntax sets.
Again, these are **static vendored artifacts** baked into the binary at
build time. No user-supplied bincode data is ever deserialized. - Attack
surface is zero for both crates; exploitation would require a
supply-chain compromise of our own build artifacts.
- These exceptions can be removed when syntect migrates to `yaml-rust2`
and drops `bincode`, or when alternative crates are available upstream.
2026-02-21 20:26:58 -08:00
Michael Bolin
1af2a37ada
chore: remove codex-core public protocol/shell re-exports (#12432)
## Why

`codex-rs/core/src/lib.rs` re-exported a broad set of types and modules
from `codex-protocol` and `codex-shell-command`. That made it easy for
workspace crates to import those APIs through `codex-core`, which in
turn hides dependency edges and makes it harder to reduce compile-time
coupling over time.

This change removes those public re-exports so call sites must import
from the source crates directly. Even when a crate still depends on
`codex-core` today, this makes dependency boundaries explicit and
unblocks future work to drop `codex-core` dependencies where possible.

## What Changed

- Removed public re-exports from `codex-rs/core/src/lib.rs` for:
- `codex_protocol::protocol` and related protocol/model types (including
`InitialHistory`)
  - `codex_protocol::config_types` (`protocol_config_types`)
- `codex_shell_command::{bash, is_dangerous_command, is_safe_command,
parse_command, powershell}`
- Migrated workspace Rust call sites to import directly from:
  - `codex_protocol::protocol`
  - `codex_protocol::config_types`
  - `codex_protocol::models`
  - `codex_shell_command`
- Added explicit `Cargo.toml` dependencies (`codex-protocol` /
`codex-shell-command`) in crates that now import those crates directly.
- Kept `codex-core` internal modules compiling by using `pub(crate)`
aliases in `core/src/lib.rs` (internal-only, not part of the public
API).
- Updated the two utility crates that can already drop a `codex-core`
dependency edge entirely:
  - `codex-utils-approval-presets`
  - `codex-utils-cli`

## Verification

- `cargo test -p codex-utils-approval-presets`
- `cargo test -p codex-utils-cli`
- `cargo check --workspace --all-targets`
- `just clippy`
2026-02-20 23:45:35 -08:00
Charley Cunningham
4c1744afb2
Improve Plan mode reasoning selection flow (#12303)
Addresses https://github.com/openai/codex/issues/11013

## Summary
- add a Plan implementation path in the TUI that lets users choose
reasoning before switching to Default mode and implementing
- add Plan-mode reasoning scope handling (Plan-only override vs
all-modes default), including config/schema/docs plumbing for
`plan_mode_reasoning_effort`
- remove the hardcoded Plan preset medium default and make the reasoning
popup reflect the active Plan override as `(current)`
- split the collaboration-mode switch notification UI hint into #12307
to keep this diff focused

If I have `plan_mode_reasoning_effort = "medium"` set in my
`config.toml`:
<img width="699" height="127" alt="Screenshot 2026-02-20 at 6 59 37 PM"
src="https://github.com/user-attachments/assets/b33abf04-6b7a-49ed-b2e9-d24b99795369"
/>

If I don't have `plan_mode_reasoning_effort` set in my `config.toml`:
<img width="704" height="129" alt="Screenshot 2026-02-20 at 7 01 51 PM"
src="https://github.com/user-attachments/assets/88a086d4-d2f1-49c7-8be4-f6f0c0fa1b8d"
/>

## Codex author
`codex resume 019c78a2-726b-7fe3-adac-3fa4523dcc2a`
2026-02-20 20:08:56 -08:00
jif-oai
5a30cd3f92
feat: better agent picker in TUI (#12332)
<img width="486" height="112" alt="Screenshot 2026-02-20 at 15 04 52"
src="https://github.com/user-attachments/assets/0d744f58-d902-4638-aeaf-27e7389ccd73"
/>
2026-02-20 15:40:34 +00:00
sayan-oai
d54999d006
client side modelinfo overrides (#12101)
TL;DR
Add top-level `model_catalog_json` config support so users can supply a
local model catalog override from a JSON file path (including adding new
models) without backend changes.

### Problem
Codex previously had no clean client-side way to replace/overlay model
catalog data for local testing of model metadata and new model entries.

### Fix
- Add top-level `model_catalog_json` config field (JSON file path).
- Apply catalog entries when resolving `ModelInfo`:
  1. Base resolved model metadata (remote/fallback)
  2. Catalog overlay from `model_catalog_json`
3. Existing global top-level overrides (`model_context_window`,
`model_supports_reasoning_summaries`, etc.)

### Note
Will revisit per-field overrides in a follow-up

### Tests
Added tests
2026-02-19 10:38:57 -08:00
jif-oai
0dcf8d9c8f
Enable default status line indicators in TUI config (#12015)
Default statusline to something
<img width="307" height="83" alt="Screenshot 2026-02-17 at 18 16 12"
src="https://github.com/user-attachments/assets/44e16153-0aa2-4c1a-9b4a-02e2feb8b7f6"
/>
2026-02-18 09:51:15 +00:00
Charley Cunningham
709e2133bb
tui: exit session on Ctrl+C in cwd change prompt (#12040)
## Summary
- change the cwd-change prompt (shown when resuming/forking across
different directories) so `Ctrl+C`/`Ctrl+D` exits the session instead of
implicitly selecting "Use session directory"
- introduce explicit prompt and resolver exit outcomes so this intent is
propagated cleanly through both startup resume/fork and in-app `/resume`
flows
- add a unit test that verifies `Ctrl+C` exits rather than selecting an
option

## Why
Previously, pressing `Ctrl+C` on this prompt silently picked one of the
options, which made it hard to abort. This aligns the prompt with the
expected quit behavior.

## Codex author
`codex resume 019c6d39-bbfb-7dc3-8008-1388a054e86d`
2026-02-17 14:48:12 -08:00
sayan-oai
41800fc876
chore: rm remote models fflag (#11699)
rm `remote_models` feature flag.

We see issues like #11527 when a user has `remote_models` disabled, as
we always use the default fallback `ModelInfo`. This causes issues with
model performance.

Builds on #11690, which helps by warning the user when they are using
the default fallback. This PR will make that happen much less frequently
as an accidental consequence of disabling `remote_models`.
2026-02-17 11:43:16 -08:00
jif-oai
5ae84197b2
Exit early when session initialization fails (#11908)
Summary
- wait for the initial session startup loop to finish and handle exit
before waiting for the first message in fresh sessions
- propagate AppRunControl::Exit to return immediately when
initialization fails
2026-02-17 11:22:30 +00:00
Charley Cunningham
26a7cd21e2
tui: preserve remote image attachments across resume/backtrack (#10590)
## Summary
This PR makes app-server-provided image URLs first-class attachments in
TUI, so they survive resume/backtrack/history recall and are resubmitted
correctly.

<img width="715" height="491" alt="Screenshot 2026-02-12 at 8 27 08 PM"
src="https://github.com/user-attachments/assets/226cbd35-8f0c-4e51-a13e-459ef5dd1927"
/>

Can delete the attached image upon backtracking:
<img width="716" height="301" alt="Screenshot 2026-02-12 at 8 27 31 PM"
src="https://github.com/user-attachments/assets/4558d230-f1bd-4eed-a093-8e1ab9c6db27"
/>

In both history and composer, remote images are rendered as normal
`[Image #N]` placeholders, with numbering unified with local images.

## What changed
- Plumb remote image URLs through TUI message state:
  - `UserHistoryCell`
  - `BacktrackSelection`
  - `ChatComposerHistory::HistoryEntry`
  - `ChatWidget::UserMessage`
- Show remote images as placeholder rows inside the composer box (above
textarea), and in history cells.
- Support keyboard selection/deletion for remote image rows in composer
(`Up`/`Down`, `Delete`/`Backspace`).
- Preserve remote-image-only turns in local composer history (Up/Down
recall), including restore after backtrack.
- Ensure submit/queue/backtrack resubmit include remote images in model
input (`UserInput::Image`), and keep request shape stable for
remote-image-only turns.
- Keep image numbering contiguous across remote + local images:
  - remote images occupy `[Image #1]..[Image #M]`
  - local images start at `[Image #M+1]`
  - deletion renumbers consistently.
- In protocol conversion, increment shared image index for remote images
too, so mixed remote/local image tags stay in a single sequence.
- Simplify restore logic to trust in-memory attachment order (no
placeholder-number parsing path).
- Backtrack/replay rollback handling now queues trims through
`AppEvent::ApplyThreadRollback` and syncs transcript overlay/deferred
lines after trims, so overlay/transcript state stays consistent.
- Trim trailing blank rendered lines from user history rendering to
avoid oversized blank padding.

## Docs + tests
- Updated: `docs/tui-chat-composer.md` (remote image flow,
selection/deletion, numbering offsets)
- Added/updated tests across `tui/src/chatwidget/tests.rs`,
`tui/src/app.rs`, `tui/src/app_backtrack.rs`, `tui/src/history_cell.rs`,
and `tui/src/bottom_pane/chat_composer.rs`
- Added snapshot coverage for remote image composer states, including
deleting the first of two remote images.

## Validation
- `just fmt`
- `cargo test -p codex-tui`

## Codex author
`codex fork 019c2636-1571-74a1-8471-15a3b1c3f49d`
2026-02-13 14:54:06 -08:00
Matthew Zeng
8468871e2b
[apps] Improve app listing filtering. (#11697)
- [x] If an installed app is not on the app listing, remove it from the
final list.
2026-02-13 11:54:16 -08:00
jif-oai
851fcc377b
feat: switch on dying sub-agents (#11477)
[codex-generated]

## Updated PR Description (Ready To Paste)

## Problem

When a sub-agent thread emits `ShutdownComplete`, the TUI switches back
to the primary thread.
That was also happening for user-requested exits (for example `Ctrl+C`),
which could prevent a
clean app exit and unexpectedly resurrect the main thread.

## Mental model

The app has one primary thread and one active thread. A non-primary
active thread shutting down
usually means "agent died, fail back to primary," but during
`ExitMode::ShutdownFirst` shutdown
means "the user is exiting," not "recover this session."

## Non-goals

No change to thread lifecycle, thread-manager ownership, or shutdown
protocol wire format.
No behavioral changes to non-shutdown events.

## Tradeoffs

This adds a small local marker (`pending_shutdown_exit_thread_id`)
instead of inferring intent
from event timing. It is deterministic and simple, but relies on
correctly setting and clearing
that marker around exit.

## Architecture

`App` tracks which thread is intentionally being shut down for exit.
`active_non_primary_shutdown_target` centralizes failover eligibility
for `ShutdownComplete` and
skips failover when shutdown matches the pending-exit thread.
`handle_active_thread_event` handles non-primary failover before generic
forwarding and clears the
pending-exit marker only when the matching active thread completes
shutdown.

## Observability

User-facing info/error messages continue to indicate whether failover to
the main thread succeeded.
The shutdown-intent path is now explicitly documented inline for easier
debugging.

## Tests

Added targeted tests for `active_non_primary_shutdown_target` covering
non-shutdown events,
primary-thread shutdown, non-primary shutdown failover, pending exit on
active thread (no failover),
and pending exit for another thread (still failover).

Validated with:
- `cargo test -p codex-tui` (pass)

---------

Co-authored-by: Josh McKinney <joshka@openai.com>
2026-02-13 18:29:03 +00:00
iceweasel-oai
99466f1f90
sandbox NUX metrics update (#11667)
just updating metrics to match the NUX tweaks we made this week.
2026-02-13 10:01:47 -08:00
sayan-oai
abeafbdca1
fix: dont show NUX for upgrade-target models that are hidden (#11679)
dont show NUX for models marked with `visibility:hide`.

Tested locally
2026-02-12 20:29:22 -08:00
Matthew Zeng
c37560069a
[apps] Add is_enabled to app info. (#11417)
- [x] Add is_enabled to app info and the response of `app/list`.
- [x] Update TUI to have Enable/Disable button on the app detail page.
2026-02-13 00:30:52 +00:00
Michael Bolin
a4cc1a4a85
feat: introduce Permissions (#11633)
## Why
We currently carry multiple permission-related concepts directly on
`Config` for shell/unified-exec behavior (`approval_policy`,
`sandbox_policy`, `network`, `shell_environment_policy`,
`windows_sandbox_mode`).

Consolidating these into one in-memory struct makes permission handling
easier to reason about and sets up the next step: supporting named
permission profiles (`[permissions.PROFILE_NAME]`) without changing
behavior now.

This change is mostly mechanical: it updates existing callsites to go
through `config.permissions`, but it does not yet refactor those
callsites to take a single `Permissions` value in places where multiple
permission fields are still threaded separately.

This PR intentionally **does not** change the on-disk `config.toml`
format yet and keeps compatibility with legacy config keys.

## What Changed
- Introduced `Permissions` in `core/src/config/mod.rs`.
- Added `Config::permissions` and moved effective runtime permission
fields under it:
  - `approval_policy`
  - `sandbox_policy`
  - `network`
  - `shell_environment_policy`
  - `windows_sandbox_mode`
- Updated config loading/building so these effective values are still
derived from the same existing config inputs and constraints.
- Updated Windows sandbox helpers/resolution to read/write via
`permissions`.
- Threaded the new field through all permission consumers across core
runtime, app-server, CLI/exec, TUI, and sandbox summary code.
- Updated affected tests to reference `config.permissions.*`.
- Renamed the struct/field from
`EffectivePermissions`/`effective_permissions` to
`Permissions`/`permissions` and aligned variable naming accordingly.

## Verification
- `just fix -p codex-core -p codex-tui -p codex-cli -p codex-app-server
-p codex-exec -p codex-utils-sandbox-summary`
- `cargo build -p codex-core -p codex-tui -p codex-cli -p
codex-app-server -p codex-exec -p codex-utils-sandbox-summary`
2026-02-12 14:42:54 -08:00
iceweasel-oai
5c3ca73914
add a slash command to grant sandbox read access to inaccessible directories (#11512)
There is an edge case where a directory is not readable by the sandbox.
In practice, we've seen very little of it, but it can happen so this
slash command unlocks users when it does.

Future idea is to make this a tool that the agent knows about so it can
be more integrated.
2026-02-12 12:48:36 -08:00
Owen Lin
efc8d45750
feat(app-server): experimental flag to persist extended history (#11227)
This PR adds an experimental `persist_extended_history` bool flag to
app-server thread APIs so rollout logs can retain a richer set of
EventMsgs for non-lossy Thread > Turn > ThreadItems reconstruction (i.e.
on `thread/resume`).

### Motivation
Today, our rollout recorder only persists a small subset (e.g. user
message, reasoning, assistant message) of `EventMsg` types, dropping a
good number (like command exec, file change, etc.) that are important
for reconstructing full item history for `thread/resume`, `thread/read`,
and `thread/fork`.

Some clients want to be able to resume a thread without lossiness. This
lossiness is primarily a UI thing, since what the model sees are
`ResponseItem` and not `EventMsg`.

### Approach
This change introduces an opt-in `persist_full_history` flag to preserve
those events when you start/resume/fork a thread (defaults to `false`).

This is done by adding an `EventPersistenceMode` to the rollout
recorder:
- `Limited` (existing behavior, default)
- `Extended` (new opt-in behavior)

In `Extended` mode, persist additional `EventMsg` variants needed for
non-lossy app-server `ThreadItem` reconstruction. We now store the
following ThreadItems that we didn't before:
- web search
- command execution
- patch/file changes
- MCP tool calls
- image view calls
- collab tool outcomes
- context compaction
- review mode enter/exit

For **command executions** in particular, we truncate the output using
the existing `truncate_text` from core to store an upper bound of 10,000
bytes, which is also the default value for truncating tool outputs shown
to the model. This keeps the size of the rollout file and command
execution items returned over the wire reasonable.

And we also persist `EventMsg::Error` which we can now map back to the
Turn's status and populates the Turn's error metadata.

#### Updates to EventMsgs
To truly make `thread/resume` non-lossy, we also needed to persist the
`status` on `EventMsg::CommandExecutionEndEvent` and
`EventMsg::PatchApplyEndEvent`. Previously it was not obvious whether a
command failed or was declined (similar for apply_patch). These
EventMsgs were never persisted before so I made it a required field.
2026-02-12 19:34:22 +00:00
Michael Bolin
abbd74e2be
feat: make sandbox read access configurable with ReadOnlyAccess (#11387)
`SandboxPolicy::ReadOnly` previously implied broad read access and could
not express a narrower read surface.
This change introduces an explicit read-access model so we can support
user-configurable read restrictions in follow-up work, while preserving
current behavior today.

It also ensures unsupported backends fail closed for restricted-read
policies instead of silently granting broader access than intended.

## What

- Added `ReadOnlyAccess` in protocol with:
  - `Restricted { include_platform_defaults, readable_roots }`
  - `FullAccess`
- Updated `SandboxPolicy` to carry read-access configuration:
  - `ReadOnly { access: ReadOnlyAccess }`
  - `WorkspaceWrite { ..., read_only_access: ReadOnlyAccess }`
- Preserved existing behavior by defaulting current construction paths
to `ReadOnlyAccess::FullAccess`.
- Threaded the new fields through sandbox policy consumers and call
sites across `core`, `tui`, `linux-sandbox`, `windows-sandbox`, and
related tests.
- Updated Seatbelt policy generation to honor restricted read roots by
emitting scoped read rules when full read access is not granted.
- Added fail-closed behavior on Linux and Windows backends when
restricted read access is requested but not yet implemented there
(`UnsupportedOperation`).
- Regenerated app-server protocol schema and TypeScript artifacts,
including `ReadOnlyAccess`.

## Compatibility / rollout

- Runtime behavior remains unchanged by default (`FullAccess`).
- API/schema changes are in place so future config wiring can enable
restricted read access without another policy-shape migration.
2026-02-11 18:31:14 -08:00
iceweasel-oai
87279de434
Promote Windows Sandbox (#11341)
1. Move Windows Sandbox NUX to right after trust directory screen
2. Don't offer read-only as an option in Sandbox NUX.
Elevated/Legacy/Quit
3. Don't allow new untrusted directories. It's trust or quit
4. move experimental sandbox features to `[windows]
sandbox="elevated|unelevatd"`
5. Copy tweaks = elevated -> default, non-elevated -> non-admin
2026-02-11 11:48:33 -08:00
Michael Bolin
476c1a7160
Remove test-support feature from codex-core and replace it with explicit test toggles (#11405)
## Why

`codex-core` was being built in multiple feature-resolved permutations
because test-only behavior was modeled as crate features. For a large
crate, those permutations increase compile cost and reduce cache reuse.

## Net Change

- Removed the `test-support` crate feature and related feature wiring so
`codex-core` no longer needs separate feature shapes for test consumers.
- Standardized cross-crate test-only access behind
`codex_core::test_support`.
- External test code now imports helpers from
`codex_core::test_support`.
- Underlying implementation hooks are kept internal (`pub(crate)`)
instead of broadly public.

## Outcome

- Fewer `codex-core` build permutations.
- Better incremental cache reuse across test targets.
- No intended production behavior change.
2026-02-10 22:44:02 -08:00
Charley Cunningham
8b46c0ce00
tui: queue non-pending rollback trims in app-event order (#11373)
## Summary

This PR fixes TUI transcript-sync behavior for
`EventMsg::ThreadRolledBack` and makes rollback application order
deterministic.

Previously, rollback handling depended on `pending_rollback`:

- if `pending_rollback` was set (local backtrack), TUI trimmed correctly
- otherwise, replayed/external rollbacks were either ignored or could be
applied at the wrong time relative to queued transcript inserts

This change keeps the local backtrack path intact and routes non-pending
rollbacks through the app event queue so rollback trims are applied in
FIFO order with transcript cell inserts.

## What changed

- Added/used `trim_transcript_cells_drop_last_n_user_turns(...)` for
rollback-by-`num_turns` semantics.
- Renamed rollback app event:
- `AppEvent::ApplyReplayedThreadRollback` ->
`AppEvent::ApplyThreadRollback`
- Replay path (`ChatWidget`) now emits `ApplyThreadRollback`.
- Live non-pending rollback path (`App::handle_backtrack_event`) now
emits `ApplyThreadRollback` instead of trimming immediately.
- App-level event handler applies `ApplyThreadRollback` after queued
`InsertHistoryCell` events and schedules redraw only when a trim
occurred.
- When a trim occurs with an overlay open, TUI now syncs transcript
overlay committed cells, clamps backtrack preview selection, and clears
stale `deferred_history_lines` so closed overlays do not re-append
rolled-back lines.
- Clarified inline comments around the `pending_rollback` branch so
future readers can reason about why there are two paths.

## Why queueing matters

During resume/replay, transcript cells are populated via queued
`InsertHistoryCell` app events. If a rollback is applied immediately
outside that queue, it can run against an incomplete transcript and
under-trim. Queueing non-pending rollbacks ensures consistent ordering
and correct final transcript state.

## Behavior by rollback source

- `pending_rollback = Some(...)` (local backtrack requested by this
TUI):
  - use `finish_pending_backtrack()` and the stored selection boundary
- `pending_rollback = None` (replay/external/non-local rollback):
- enqueue `AppEvent::ApplyThreadRollback { num_turns }` and trim in
app-event order

## Tests

Added/updated tests covering ordering and semantics:

-
`app_backtrack::tests::trim_drop_last_n_user_turns_applies_rollback_semantics`
- `app_backtrack::tests::trim_drop_last_n_user_turns_allows_overflow`
- `app::tests::replayed_initial_messages_apply_rollback_in_queue_order`
-
`app::tests::live_rollback_during_replay_is_applied_in_app_event_order`
-
`app::tests::queued_rollback_syncs_overlay_and_clears_deferred_history`
- `chatwidget::tests::replayed_thread_rollback_emits_ordered_app_event`

Validation run:

- `just fmt`
- `cargo test -p codex-tui`
2026-02-10 18:53:43 -08:00
jif-oai
847a6092e6
fix: reduce usage of open_if_present (#11344) 2026-02-10 19:25:07 +00:00
Matthew Zeng
48e415bdef
[apps] Improve app installation flow. (#11249)
- [x] Add buttons to start the installation flow and verify installation
completes.
- [x] Hard refresh apps list when the /apps view opens.
2026-02-10 17:59:43 +00:00
Michael Bolin
44ebf4588f
feat: retain NetworkProxy, when appropriate (#11207)
As of this PR, `SessionServices` retains a
`Option<StartedNetworkProxy>`, if appropriate.

Now the `network` field on `Config` is `Option<NetworkProxySpec>`
instead of `Option<NetworkProxy>`.

Over in `Session::new()`, we invoke `NetworkProxySpec::start_proxy()` to
create the `StartedNetworkProxy`, which is a new struct that retains the
`NetworkProxy` as well as the `NetworkProxyHandle`. (Note that `Drop` is
implemented for `NetworkProxyHandle` to ensure the proxies are shutdown
when it is dropped.)

The `NetworkProxy` from the `StartedNetworkProxy` is threaded through to
the appropriate places.


---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/11207).
* #11285
* __->__ #11207
2026-02-10 02:09:23 -08:00
jif-oai
13de744296
fix: do not show closed agents in /agent (#11175) 2026-02-09 11:25:31 +00:00
Matthew Zeng
45b7763c3f
[apps] Improve app loading. (#10994)
There are two concepts of apps that we load in the harness:

- Directory apps, which is all the apps that the user can install.
- Accessible apps, which is what the user actually installed and can be
$ inserted and be used by the model. These are extracted from the tools
that are loaded through the gateway MCP.

Previously we wait for both sets of apps before returning the full apps
list. Which causes many issues because accessible apps won't be
available to the UI or the model if directory apps aren't loaded or
failed to load.

In this PR we are separating them so that accessible apps can be loaded
separately and are instantly available to be shown in the UI and to be
provided in model context. We also added an app-server event so that
clients can subscribe to also get accessible apps without being blocked
on the full app list.

- [x] Separate accessible apps and directory apps loading.
- [x] `app/list` request will also emit `app/list/updated` notifications
that app-server clients can subscribe. Which allows clients to get
accessible apps list to render in the $ menu without being blocked by
directory apps.
- [x] Cache both accessible and directory apps with 1 hour TTL to avoid
reloading them when creating new threads.
- [x] TUI improvements to redraw $ menu and /apps menu when app list is
updated.
2026-02-08 15:24:56 -08:00
Eric Traut
b3de6c7f2b
Defer persistence of rollout file (#11028)
- Defer rollout persistence for fresh threads (`InitialHistory::New`):
keep rollout events in memory and only materialize rollout file + state
DB row on first `EventMsg::UserMessage`.
- Keep precomputed rollout path available before materialization.
- Change `thread/start` to build thread response from live config
snapshot and optional precomputed path.
- Improve pre-materialization behavior in app-server/TUI: clearer
invalid-request errors for file-backed ops and a friendlier `/fork` “not
ready yet” UX.
- Update tests to match deferred semantics across
start/read/archive/unarchive/fork/resume/review flows.
- Improved resilience of user_shell test, which should be unrelated to
this change but must be affected by timing changes

For Reviewers:
* The primary change is in recorder.rs
* Most of the other changes were to fix up broken assumptions in
existing tests

Testing:
* Manually tested CLI
* Exercised app server paths by manually running IDE Extension with
rebuilt CLI binary
* Only user-visible change is that `/fork` in TUI generates visible
error if used prior to first turn
2026-02-07 23:05:03 -08:00
alexsong-oai
daeef06bec
add originator to otel (#10826) 2026-02-06 15:13:56 -08:00
pap-openai
529b539564
Add analytics for /rename and /fork (#10655) 2026-02-05 20:18:29 +00:00