core-agent-ide

Author	SHA1	Message	Date
Michael Bolin	bfff0c729f	config: enforce enterprise feature requirements (#13388 ) ## Why Enterprises can already constrain approvals, sandboxing, and web search through `requirements.toml` and MDM, but feature flags were still only configurable as managed defaults. That meant an enterprise could suggest feature values, but it could not actually pin them. This change closes that gap and makes enterprise feature requirements behave like the other constrained settings. The effective feature set now stays consistent with enterprise requirements during config load, when config writes are validated, and when runtime code mutates feature flags later in the session. It also tightens the runtime API for managed features. `ManagedFeatures` now follows the same constraint-oriented shape as `Constrained<T>` instead of exposing panic-prone mutation helpers, and production code can no longer construct it through an unconstrained `From<Features>` path. The PR also hardens the `compact_resume_fork` integration coverage on Windows. After the feature-management changes, `compact_resume_after_second_compaction_preserves_history` was overflowing the libtest/Tokio thread stacks on Windows, so the test now uses an explicit larger-stack harness as a pragmatic mitigation. That may not be the ideal root-cause fix, and it merits a parallel investigation into whether part of the async future chain should be boxed to reduce stack pressure instead. ## What Changed Enterprises can now pin feature values in `requirements.toml` with the requirements-side `features` table: ```toml [features] personality = true unified_exec = false ``` Only canonical feature keys are allowed in the requirements `features` table; omitted keys remain unconstrained. - Added a requirements-side pinned feature map to `ConfigRequirementsToml`, threaded it through source-preserving requirements merge and normalization in `codex-config`, and made the TOML surface use `[features]` (while still accepting legacy `[feature_requirements]` for compatibility). - Exposed `featureRequirements` from `configRequirements/read`, regenerated the JSON/TypeScript schema artifacts, and updated the app-server README. - Wrapped the effective feature set in `ManagedFeatures`, backed by `ConstrainedWithSource<Features>`, and changed its API to mirror `Constrained<T>`: `can_set(...)`, `set(...) -> ConstraintResult<()>`, and result-returning `enable` / `disable` / `set_enabled` helpers. - Removed the legacy-usage and bulk-map passthroughs from `ManagedFeatures`; callers that need those behaviors now mutate a plain `Features` value and reapply it through `set(...)`, so the constrained wrapper remains the enforcement boundary. - Removed the production loophole for constructing unconstrained `ManagedFeatures`. Non-test code now creates it through the configured feature-loading path, and `impl From<Features> for ManagedFeatures` is restricted to `#[cfg(test)]`. - Rejected legacy feature aliases in enterprise feature requirements, and return a load error when a pinned combination cannot survive dependency normalization. - Validated config writes against enterprise feature requirements before persisting changes, including explicit conflicting writes and profile-specific feature states that normalize into invalid combinations. - Updated runtime and TUI feature-toggle paths to use the constrained setter API and to persist or apply the effective post-constraint value rather than the requested value. - Updated the `core_test_support` Bazel target to include the bundled core model-catalog fixtures in its runtime data, so helper code that resolves `core/models.json` through runfiles works in remote Bazel test environments. - Renamed the core config test coverage to emphasize that effective feature values are normalized at runtime, while conflicting persisted config writes are rejected. - Ran `compact_resume_after_second_compaction_preserves_history` inside an explicit 8 MiB test thread and Tokio runtime worker stack, following the existing larger-stack integration-test pattern, to keep the Windows `compact_resume_fork` test slice from aborting while a parallel investigation continues into whether some of the underlying async futures should be boxed. ## Verification - `cargo test -p codex-config` - `cargo test -p codex-core feature_requirements_ -- --nocapture` - `cargo test -p codex-core load_requirements_toml_produces_expected_constraints -- --nocapture` - `cargo test -p codex-core compact_resume_after_second_compaction_preserves_history -- --nocapture` - `cargo test -p codex-core compact_resume_fork -- --nocapture` - Re-ran the built `codex-core` `tests/all` binary with `RUST_MIN_STACK=262144` for `compact_resume_after_second_compaction_preserves_history` to confirm the explicit-stack harness fixes the deterministic low-stack repro. - `cargo test -p codex-core` - This still fails locally in unrelated integration areas that expect the `codex` / `test_stdio_server` binaries or hit existing `search_tool` wiremock mismatches. ## Docs `developers.openai.com/codex` should document the requirements-side `[features]` table for enterprise and MDM-managed configuration, including that it only accepts canonical feature keys and that conflicting config writes are rejected.	2026-03-04 04:40:22 +00:00
Celia Chen	e6773f856c	Feat: Preserve network access on read-only sandbox policies (#13409 ) ## Summary `PermissionProfile.network` could not be preserved when additional or compiled permissions resolved to `SandboxPolicy::ReadOnly`, because `ReadOnly` had no network_access field. This change makes read-only + network enabled representable directly and threads that through the protocol, app-server v2 mirror, and permission- merging logic. ## What changed - Added `network_access: bool` to `SandboxPolicy::ReadOnly` in the core protocol and app-server v2 protocol. - Kept backward compatibility by defaulting the new field to false, so legacy read-only payloads still deserialize unchanged. - Updated `has_full_network_access()` and sandbox summaries to respect read-only network access. - Preserved PermissionProfile.network when: - compiling skill permission profiles into sandbox policies - normalizing additional permissions - merging additional permissions into existing sandbox policies - Updated the approval overlay to show network in the rendered permission rule when requested. - Regenerated app-server schema fixtures for the new v2 wire shape.	2026-03-04 02:41:57 +00:00
rhan-oai	e951ef4374	[feedback] diagnostics (#13292 ) - added header logic to display diagnostics on cli - added logic for collecting env vars <img width="606" height="327" alt="Screenshot 2026-03-03 at 3 49 31 PM" src="https://github.com/user-attachments/assets/05e78c56-8cb3-47fa-abaf-3e57f1fdd8e2" /> <img width="690" height="353" alt="Screenshot 2026-03-02 at 6 47 54 PM" src="https://github.com/user-attachments/assets/e470b559-13f4-44d9-897f-bc398943c6d1" />	2026-03-03 16:34:11 -08:00
Charley Cunningham	299b8ac445	tui: align pending steers with core acceptance (#12868 ) ## Summary - submit `Enter` steers immediately while a turn is already running instead of routing them through `queued_user_messages` - keep those submitted steers visible in the footer as `pending_steers` until core records them as a user message or aborts the turn - reconcile pending steers on `ItemCompleted(UserMessage)`, not `RawResponseItem` - emit user-message item lifecycle for leftover pending input at task finish, then remove the TUI `TurnComplete` fallback - keep `queued_user_messages` for actual queued drafts, rendered below pending steers ## Problem While the assistant was generating, pressing `Enter` could send the input into `queued_user_messages`. That queue only drains after the turn ends, so ordinary steers behaved like queued drafts instead of landing at the next core sampling boundary. The first version of this fix also used `RawResponseItem` to decide when a steer had landed. Review feedback was that this is the wrong abstraction for client behavior. There was also a late edge case in core: if pending steer input was accepted after the final sampling decision but before `TurnComplete`, core would record that user message into history at task finish without emitting `ItemStarted(UserMessage)` / `ItemCompleted(UserMessage)`. TUI had a fallback to paper over that gap locally. ## Approach - `Enter` during an active turn now submits a normal `Op::UserTurn` immediately - TUI keeps a local pending-steer preview instead of rendering that user message into history immediately - when core records the steer as `ItemCompleted(UserMessage)`, TUI matches and removes the corresponding pending preview, then renders the committed user message - core now emits the same user-message lifecycle when `on_task_finished(...)` drains leftover pending user input, before `TurnComplete` - with that lifecycle gap closed in core, TUI no longer needs to flush pending steers into history on `TurnComplete` - if the turn is interrupted, pending steers and queued drafts are both restored into the composer, with pending steers first ## Notes - `Tab` still uses the real queued-message path - `queued_user_messages` and `pending_steers` are separate state with separate semantics - the pending-steer matching key is built directly from `UserInput` - this removes the new TUI dependency on `RawResponseItem` ## Validation - `just fmt` - `cargo test -p codex-core task_finish_emits_turn_item_lifecycle_for_leftover_pending_user_input -- --nocapture` - `cargo test -p codex-tui`	2026-03-03 15:31:52 -08:00
Charley Cunningham	c4bd0aa3b9	app-server: source /feedback logs from sqlite at trace level (#12969 ) ## Summary - write app-server SQLite logs at TRACE level when SQLite is enabled - source app-server `/feedback` log attachments from SQLite for the requested thread when available - flush buffered SQLite log writes before `/feedback` queries them so newly emitted events are not lost behind the async inserter - include same-process threadless SQLite rows in those `/feedback` logs so the attachment matches the process-wide feedback buffer more closely - keep the existing in-memory ring buffer fallback unchanged, including when the SQLite query returns no rows ## Details - add a byte-bounded `query_feedback_logs` helper in `codex-state` so `/feedback` does not fetch all rows before truncating - scope SQLite feedback logs to the requested thread plus threadless rows from the same `process_uuid` - format exported SQLite feedback lines with the log level prefix to better match the in-memory feedback formatter - add an explicit `LogDbLayer::flush()` control path and await it in app-server before querying SQLite for feedback logs - pass optional SQLite log bytes through `codex-feedback` as the `codex-logs.log` attachment override - leave TUI behavior unchanged apart from the updated `upload_feedback` call signature - add regression coverage for: - newest-within-budget ordering - excluding oversized newest rows - including same-process threadless rows - keeping the newest suffix across mixed thread and threadless rows - matching the feedback formatter shape aside from span prefixes - falling back to the in-memory snapshot when SQLite returns no logs - flushing buffered SQLite rows before querying ## Follow-up - SQLite feedback exports still do not reproduce span prefixes like `feedback-thread{thread_id=...}:`; there is a `TODO(ccunningham)` in `codex-rs/state/src/log_db.rs` for that follow-up. ## Testing - `cd codex-rs && cargo test -p codex-state` - `cd codex-rs && cargo test -p codex-app-server` - `cd codex-rs && just fmt`	2026-03-03 11:17:06 -08:00
pash-openai	07e532dcb9	app-server service tier plumbing (plus some cleanup) (#13334 ) followup to https://github.com/openai/codex/pull/13212 to expose fast tier controls to app server (majority of this PR is generated schema jsons - actual code is +69 / -35 and +24 tests ) - add service tier fields to the app-server protocol surfaces used by thread lifecycle, turn start, config, and session configured events - thread service tier through the app-server message processor and core thread config snapshots - allow runtime config overrides to carry service tier for app-server callers cleanup: - Removing useless "legacy" code supporting "standard" - we moved to None \| "fast", so "standard" is not needed.	2026-03-03 02:35:09 -08:00
bwanner-oai	6deb72c04b	Renaming Team to Business plan during TUI onboarding (#13313 ) Team is referred to as "Business"	2026-03-02 23:13:29 -08:00
pash-openai	2f5b01abd6	add fast mode toggle (#13212 ) - add a local Fast mode setting in codex-core (similar to how model id is currently stored on disk locally) - send `service_tier=priority` on requests when Fast is enabled - add `/fast` in the TUI and persist it locally - feature flag	2026-03-02 20:29:33 -08:00
rakan-oai	56cc2c71f4	tui: preserve kill buffer across submit and slash-command clears (#12006 ) ## Problem Before this change, composer paths that cleared the textarea after submit or slash-command dispatch also cleared the textarea kill buffer. That meant a user could `Ctrl+K` part of a draft, trigger a composer action that cleared the visible draft, and then lose the ability to `Ctrl+Y` the killed text back. This was especially awkward for workflows where the user wants to temporarily remove text, run a composer action such as changing reasoning level or dispatching a slash command, and then restore the killed text into the now-empty draft. ## Mental model This change separates visible draft state from editing-history state. The visible draft includes the current textarea contents and text elements that should be cleared when the composer submits or dispatches a command. The kill buffer is different: it represents the most recent killed text and should survive those composer-driven clears so the user can still yank it back afterward. After this change, submit and slash-command dispatch still clear the visible textarea contents, but they no longer erase the most recent kill. ## Non-goals This does not implement a multi-entry kill ring or change the semantics of `Ctrl+K` and `Ctrl+Y` beyond preserving the existing yank target across these clears. It also does not change how submit, slash-command parsing, prompt expansion, or attachment handling work, except that those flows no longer discard the textarea kill buffer as a side effect of clearing the draft. ## Tradeoffs The main tradeoff is that clearing the visible textarea is no longer equivalent to fully resetting all editing state. That is intentional here, because submit and slash-command dispatch are composer actions, not requests to forget the user's most recent kill. The benefit is better editing continuity. The cost is that callers must understand that full-buffer replacement resets visible draft state but not the kill buffer. ## Architecture The behavioral change is in `TextArea`: full-buffer replacement now rebuilds text and elements without clearing `kill_buffer`. `ChatComposer` already clears the textarea after successful submit and slash-command dispatch by calling into those textarea replacement paths. With this change, those existing composer flows inherit the new behavior automatically: the visible draft is cleared, but the last killed text remains available for `Ctrl+Y`. The tests cover both layers: - `TextArea` verifies that the kill buffer survives full-buffer replacement. - `ChatComposer` verifies that it survives submit. - `ChatComposer` also verifies that it survives slash-command dispatch. ## Observability There is no dedicated logging for kill-buffer preservation. The most direct way to reason about the behavior is to inspect textarea-wide replacement paths and confirm whether they treat the kill buffer as visible-buffer state or as editing-history state. If this regresses in the future, the likely failure mode is simple and user-visible: `Ctrl+Y` stops restoring text after submit or slash-command clears even though ordinary kill/yank still works within a single uninterrupted draft. ## Tests Added focused regression coverage for the new contract: - `kill_buffer_persists_across_set_text` - `kill_buffer_persists_after_submit` - `kill_buffer_persists_after_slash_command_dispatch` Local verification: - `just fmt` - `cargo test -p codex-tui` --------- Co-authored-by: Josh McKinney <joshka@openai.com>	2026-03-03 02:06:08 +00:00
Celia Chen	0bb152b01d	chore: remove SkillMetadata.permissions and derive skill sandboxing from permission_profile (#13061 ) ## Summary This change removes the compiled permissions field from skill metadata and keeps permission_profile as the single source of truth. Skill loading no longer compiles skill permissions eagerly. Instead, the zsh-fork skill escalation path compiles `skill.permission_profile` when it needs to determine the sandbox to apply for a skill script. ## Behavior change For skills that declare: ``` permissions: {} ``` we now treat that the same as having no skill permissions override, instead of creating and using a default readonly sandbox. This change makes the behavior more intuitive: - only non-empty skill permission profiles affect sandboxing - omitting permissions and writing permissions: {} now mean the same thing - skill metadata keeps a single permissions representation instead of storing derived state too Overall, this makes skill sandbox behavior easier to understand and more predictable.	2026-03-03 01:29:53 +00:00
Ahmed Ibrahim	b20b6aa46f	Update realtime websocket API (#13265 ) - migrate the realtime websocket transport to the new session and handoff flow - make the realtime model configurable in config.toml and use API-key auth for the websocket --------- Co-authored-by: Codex <noreply@openai.com>	2026-03-02 16:05:40 -08:00
jif-oai	f8838fd6f3	feat: enable ma through `/agent` (#13246 ) <img width="639" height="139" alt="Screenshot 2026-03-02 at 16 06 41" src="https://github.com/user-attachments/assets/c006fcec-c1e7-41ce-bb84-c121d5ffb501" /> Then <img width="372" height="37" alt="Screenshot 2026-03-02 at 16 06 49" src="https://github.com/user-attachments/assets/aa4ad703-e7e7-4620-9032-f5cd4f48ff79" />	2026-03-02 18:37:29 +00:00
Charley Cunningham	7979ce453a	tui: restore draft footer hints (#13202 ) ## Summary - restore `Tab to queue` when a draft is present and the agent is running - keep draft-idle footers passive by showing the normal footer or status line instead of `? for shortcuts` - align footer snapshot coverage with the updated draft footer behavior ## Codex author `codex resume 019c7f1c-43aa-73e0-97c7-40f457396bb0` --------- Co-authored-by: Codex <noreply@openai.com>	2026-03-02 10:26:13 -08:00
jif-oai	9a42a56d8f	chore: `/multiagent` alias for `/agent` (#13249 ) Add a `/mutli-agents` alias for `/agent` and update the wording	2026-03-02 16:51:54 +00:00
jif-oai	2a5bcc053f	fix: esc in `/agent` (#13131 ) Fix https://github.com/openai/codex/issues/13093	2026-03-02 15:49:06 +00:00
jif-oai	b08bdd91e3	fix: `/status` when sub-agent (#13130 ) Fix https://github.com/openai/codex/issues/13066	2026-03-02 11:57:15 +00:00
gabec-openai	9685e7d6d1	Improve subagent contrast in TUI (#13197 ) ## Summary - raise contrast for subagent transcript labels and fallback states - remove low-contrast dim styling from role tags and error details - make the closed-agent picker dot readable in dark theme ## Validation - just fmt - just fix -p codex-tui - cargo test -p codex-tui Co-authored-by: Codex <noreply@openai.com>	2026-03-02 12:16:49 +01:00
Ahmed Ibrahim	0aeb55bf08	Record realtime close marker on replacement (#13058 ) ## Summary - record a realtime close developer message when a new realtime session replaces an active one - assert the replacement marker through the mocked responses request path --------- Co-authored-by: Codex <noreply@openai.com> Co-authored-by: Charles Cunningham <ccunningham@openai.com>	2026-03-01 13:54:12 -08:00
xl-openai	752402c4fe	feat: load from plugins (#12864 ) Support loading plugins. Plugins can now be enabled via [plugins.<name>] in config.toml. They are loaded as first-class entities through PluginsManager, and their default skills/ and .mcp.json contributions are integrated into the existing skills and MCP flows.	2026-03-01 10:50:56 -08:00
jif-oai	74e5150b1e	fix: package `models.json` for Bazel tests (#13129 )	2026-02-28 17:21:02 +01:00
jif-oai	3bfee6fcb5	nit: ignore `resume_startup_does_not_consume_model_availability_nux_c… (#13128 )	2026-02-28 14:50:41 +01:00
jif-oai	2b38b4e03b	feat: approval for sub-agent in the TUI (#12995 ) <img width="766" height="290" alt="Screenshot 2026-02-27 at 10 50 48" src="https://github.com/user-attachments/assets/3bc96cd9-ed2c-4d67-a317-8f7b60abbbb1" />	2026-02-28 14:07:07 +01:00
Eric Traut	e6032eb0b7	Fix CLI feedback link (#13086 ) Addresses #12967 About a month ago, I updated the Github bug report templates to accommodate the (at the time) new Codex app. The `/feedback` code path in the CLI was referencing one of the old templates, and I didn't realize it at the time. This PR updates the link so users don't get an empty bug template when using `/feedback`.	2026-02-27 19:02:40 -07:00
Ahmed Ibrahim	ec6f6aacbf	Add model availability NUX tooltips (#13021 ) - override startup tooltips with model availability NUX and persist per-model show counts in config - stop showing each model after four exposures and fall back to normal tooltips	2026-02-27 17:14:06 -08:00
Eric Traut	61c42396ab	Keep large-paste placeholders intact during file completion (#13070 ) Addresses https://github.com/openai/codex/issues/13040 Fixes a regression in 0.106.0 introduced in https://github.com/openai/codex/pull/9393 Summary - replace only the active completion range so unrelated text elements (e.g., large-paste placeholders) stay atomic and can still expand - add a regression test verifying large paste placeholders persist through completions and submit - could not fetch issue details via GitHub API because network access is disabled in this sandboxed environment	2026-02-27 17:19:11 -07:00
Felipe Coury	c3c75878e8	fix(tui): theme-aware diff backgrounds with fallback behavior (#13037 ) ## Problem The TUI diff renderer uses hardcoded background palettes for insert/delete lines that don't respect the user's chosen syntax theme. When a theme defines `markup.inserted` / `markup.deleted` scope backgrounds (the convention used by GitHub, Solarized, Monokai, and most VS Code themes), those colors are ignored — the diff always renders with the same green/red tints regardless of theme selection. Separately, ANSI-16 terminals (and Windows Terminal sessions misreported as ANSI-16) rendered diff backgrounds as full-saturation blocks that obliterated syntax token colors, making highlighted diffs unreadable. ## Mental model Diff backgrounds are resolved in three layers: 1. Color level detection — `diff_color_level_for_terminal()` maps the raw `supports-color` probe + Windows Terminal heuristics to a `DiffColorLevel` (TrueColor / Ansi256 / Ansi16). Windows Terminal gets promoted from Ansi16 to TrueColor when `WT_SESSION` is present. 2. Background resolution — `resolve_diff_backgrounds()` queries the active syntax theme for `markup.inserted`/`markup.deleted` (falling back to `diff.inserted`/`diff.deleted`), then overlays those on top of the hardcoded palette. For ANSI-256, theme RGB values are quantized to the nearest xterm-256 index. For ANSI-16, backgrounds are `None` (foreground-only). 3. Style composition — The resolved `ResolvedDiffBackgrounds` is threaded through every call to `style_add`, `style_del`, `style_sign_`, and `style_line_bg_for`, which decide how to compose foreground+background for each line kind and theme variant. A new `RichDiffColorLevel` type (a subset of `DiffColorLevel` without Ansi16) encodes the invariant "we have enough depth for tinted backgrounds" at the type level, so background-producing functions have exhaustive matches without unreachable arms. ## Non-goals - No change to gutter (line number column) styling — gutter backgrounds still use the hardcoded palette. - No per-token scope background resolution — this is line-level background only; syntax token colors come from the existing `highlight_code_to_styled_spans` path. - No dark/light theme auto-switching from scope backgrounds — `DiffTheme` is still determined by querying the terminal's background color. ## Tradeoffs - Theme trust vs. visual safety:* When a theme defines scope backgrounds, we trust them unconditionally for rich color levels. A badly authored theme could produce illegible combinations. The fallback for `None` backgrounds (foreground-only) is intentionally conservative. - Quantization quality: ANSI-256 quantization uses perceptual distance across indices 16–255, skipping system colors. The result is approximate — a subtle theme tint may land on a noticeably different xterm index. - Single-query caching: `resolve_diff_backgrounds` is called once per `render_change` invocation (i.e., once per file in a diff). If the theme changes mid-render (live preview), the next file picks up the new backgrounds. ## Architecture Files changed: \| File \| Role \| \|---\|---\| \| `tui/src/render/highlight.rs` \| New: `DiffScopeBackgroundRgbs`, `diff_scope_background_rgbs()`, scope extraction helpers \| \| `tui/src/diff_render.rs` \| New: `RichDiffColorLevel`, `ResolvedDiffBackgrounds`, `resolve_diff_backgrounds`, `quantize_rgb_to_ansi256`, Windows Terminal promotion; modified: all style helpers to accept/thread `ResolvedDiffBackgrounds` \| The scope-extraction code lives in `highlight.rs` because it uses `syntect::highlighting::Highlighter` and the theme singleton. The resolution and quantization logic lives in `diff_render.rs` because it depends on diff-specific types (`DiffTheme`, `DiffColorLevel`, ratatui `Color`). ## Observability No runtime logging was added. The most useful debugging aid is the `diff_color_level_for_terminal` function, which is pure and fully unit-tested — to diagnose a color-depth mismatch, log its four inputs (`StdoutColorLevel`, `TerminalName`, `WT_SESSION` presence, `FORCE_COLOR` presence). Scope resolution can be tested by loading a custom `.tmTheme` with known `markup.inserted` / `markup.deleted` backgrounds and checking the diff output in a truecolor terminal. ## Tests - Windows Terminal promotion:* 7 unit tests cover every branch of `diff_color_level_for_terminal` (ANSI-16 promotion, `WT_SESSION` unconditional promotion, `FORCE_COLOR` suppression, conservative `Unknown` level). - ANSI-16 foreground-only: Tests verify that `style_add`, `style_del`, `style_sign_`, `style_line_bg_for`, and `style_gutter_for` all return `None` backgrounds on ANSI-16. - Scope resolution:* Tests verify `markup.` preference over `diff.`, `None` when no scope matches, bundled theme resolution, and custom `.tmTheme` round-trip. - Quantization: Test verifies ANSI-256 quantization of a known RGB triple. - Insta snapshots: 2 new snapshot tests (`ansi16_insert_delete_no_background`, `theme_scope_background_resolution`) lock visual output.	2026-02-27 16:44:56 -07:00
Jeremy Rose	bc0a5843df	Align TUI voice transcription audio with 4o ASR (#13030 ) ## Summary - switch TUI push-to-talk transcription requests to `gpt-4o-mini-transcribe` - prefer 24 kHz mono `i16` microphone configs and normalize voice input to 24 kHz mono before upload/send - add unit coverage for the new downmix/resample path ## Testing - `just fmt` - `cargo test -p codex-tui`	2026-02-27 18:22:48 +00:00
Felipe Coury	3b5996f988	fix(tui): promote windows terminal diff ansi16 to truecolor (#13016 ) ## Summary - Promote ANSI-16 to truecolor for diff rendering when running inside Windows Terminal - Respect explicit `FORCE_COLOR` override, skipping promotion when set - Extract a pure `diff_color_level_for_terminal` function for testability - Strip background tints from ANSI-16 diff output, rendering add/delete lines with foreground color only - Introduce `RichDiffColorLevel` to type-safely restrict background fills to truecolor and ansi256 ## Problem Windows Terminal fully supports 24-bit (truecolor) rendering but often does not provide the usual TERM metadata (`TERM`, `TERM_PROGRAM`, `COLORTERM`) in `cmd.exe`/PowerShell sessions. In those environments, `supports-color` can report only ANSI-16 support. The diff renderer therefore falls back to a 16-color palette, producing washed-out, hard-to-read diffs. The screenshots below demonstrate that both PowerShell and cmd.exe don't set any `TERM` environment variables. \| PowerShell \| cmd.exe \| \|---\|---\| \| <img width="2032" height="1162" alt="SCR-20260226-nfvy" src="https://github.com/user-attachments/assets/59e968cc-4add-4c7b-a415-07163297e86a" /> \| <img width="2032" height="1162" alt="SCR-20260226-nfyc" src="https://github.com/user-attachments/assets/d06b3e39-bf91-4ce3-9705-82bf9563a01b" /> \| ## Mental model `StdoutColorLevel` (from `supports-color`) is the _detected_ capability. `DiffColorLevel` is the _intended_ capability for diff rendering. A new intermediary — `diff_color_level_for_terminal` — maps one to the other and is the single place where terminal-specific overrides live. Windows Terminal is detected two independent ways: the `TerminalName` parsed by `terminal_info()` and the raw presence of `WT_SESSION`. When `WT_SESSION` is present and `FORCE_COLOR` is not set, we promote unconditionally to truecolor. When `WT_SESSION` is absent but `TerminalName::WindowsTerminal` is detected, we promote only the ANSI-16 level (not `Unknown`). A single override helper — `has_force_color_override()` — checks whether `FORCE_COLOR` is set. When it is, both the `WT_SESSION` fast-path and the `TerminalName`-based promotion are suppressed, preserving explicit user intent. \| PowerShell \| cmd.exe \| WSL \| Bash for Windows \| \|---\|---\|---\|---\| \| ![SCR-20260226-msrh](https://github.com/user-attachments/assets/0f6297a6-4241-4dbf-b7ff-cf02da8941b0) \| ![SCR-20260226-nbao](https://github.com/user-attachments/assets/bb5ff8a9-903c-4677-a2de-1f6e1f34b18e) \| ![SCR-20260226-nbej](https://github.com/user-attachments/assets/26ecec2c-a7e9-410a-8702-f73995b490a6) \| ![SCR-20260226-nbkz](https://github.com/user-attachments/assets/80c4bf9a-3b41-40e1-bc87-f5c565f96075) \| ## Non-goals - This does not change color detection for anything outside the diff renderer (e.g. the chat widget, markdown rendering). - This does not add a user-facing config knob; `FORCE_COLOR` already serves that role. ## Tradeoffs - The `has_wt_session` signal is intentionally kept separate from `TerminalName::WindowsTerminal`. `terminal_info()` is derived with `TERM_PROGRAM` precedence, so it can differ from raw `WT_SESSION`. - Real-world validation in this issue: in both `cmd.exe` and PowerShell, `TERM`/`TERM_PROGRAM`/`COLORTERM` were absent, so TERM-based capability hints were unavailable in those sessions. - Checking `FORCE_COLOR` for presence rather than parsing its value is a simplification. In practice `supports-color` has already parsed it, so our check is a coarse "did the user set _anything_?" gate. The effective color level still comes from `supports-color`. - When `WT_SESSION` is present without `FORCE_COLOR`, we promote to truecolor regardless of `stdout_level` (including `Unknown`). This is aggressive but correct: `WT_SESSION` is a strong signal that we're in Windows Terminal. - ANSI-16 add/delete backgrounds (bright green/red) overpower syntax-highlighted token colors, making diffs harder to read. Foreground-only cues (colored text, gutter signs) preserve readability on low-color terminals. ## Architecture ``` stdout_color_level() ──┐ terminal_info().name ──┤ WT_SESSION presence ──┼──▶ diff_color_level_for_terminal() ──▶ DiffColorLevel FORCE_COLOR presence ──┘ │ ▼ RichDiffColorLevel::from_diff_color_level() │ ┌──────────┴──────────┐ │ Some(TrueColor\|256) │ → bg tints │ None (Ansi16) │ → fg only └─────────────────────┘ ``` `diff_color_level()` is the environment-reading entry point; it gathers the four runtime signals and delegates to the pure, testable `diff_color_level_for_terminal()`. ## Observability No new logs or metrics. Incorrect color selection is immediately visible as broken diff rendering; the test suite covers the decision matrix exhaustively. ## Tests Six new unit tests exercise every branch of `diff_color_level_for_terminal`: \| Test \| Inputs \| Expected \| \|------\|--------\|----------\| \| `windows_terminal_promotes_ansi16_to_truecolor_for_diffs` \| Ansi16 + WindowsTerminal name \| TrueColor \| \| `wt_session_promotes_ansi16_to_truecolor_for_diffs` \| Ansi16 + WT_SESSION only \| TrueColor \| \| `non_windows_terminal_keeps_ansi16_diff_palette` \| Ansi16 + WezTerm \| Ansi16 \| \| `wt_session_promotes_unknown_color_level_to_truecolor` \| Unknown + WT_SESSION \| TrueColor \| \| `explicit_force_override_keeps_ansi16_on_windows_terminal` \| Ansi16 + WindowsTerminal + FORCE_COLOR \| Ansi16 \| \| `explicit_force_override_keeps_ansi256_on_windows_terminal` \| Ansi256 + WT_SESSION + FORCE_COLOR \| Ansi256 \| \| `ansi16_add_style_uses_foreground_only` \| Dark + Ansi16 \| fg=Green, bg=None \| \| (and any other new snapshot/assertion tests from commits d757fee and d7c78b3) \| \| \| ## Test plan - [x] Verify all new unit tests pass (`cargo test -p codex-tui --lib`) - [x] On Windows Terminal: confirm diffs render with truecolor backgrounds - [x] On Windows Terminal with `FORCE_COLOR` set: confirm promotion is disabled and output follows the forced `supports-color` level - [x] On macOS/Linux terminals: confirm no behavior change Fixes https://github.com/openai/codex/issues/12904 Fixes https://github.com/openai/codex/issues/12890 Fixes https://github.com/openai/codex/issues/12912 Fixes https://github.com/openai/codex/issues/12840	2026-02-27 10:45:59 -07:00
Michael Bolin	d09a7535ed	fix: use AbsolutePathBuf for permission profile file roots (#12970 ) ## Why `PermissionProfile` should describe filesystem roots as absolute paths at the type level. Using `PathBuf` in `FileSystemPermissions` made the shared type too permissive and blurred together three different deserialization cases: - skill metadata in `agents/openai.yaml`, where relative paths should resolve against the skill directory - app-server API payloads, where callers should have to send absolute paths - local tool-call payloads for commands like `shell_command` and `exec_command`, where `additional_permissions.file_system` may legitimately be relative to the command `workdir` This change tightens the shared model without regressing the existing local command flow. ## What Changed - changed `protocol::models::FileSystemPermissions` and the app-server `AdditionalFileSystemPermissions` mirror to use `AbsolutePathBuf` - wrapped skill metadata deserialization in `AbsolutePathBufGuard`, so relative permission roots in `agents/openai.yaml` resolve against the containing skill directory - kept app-server/API deserialization strict, so relative `additionalPermissions.fileSystem.*` paths are rejected at the boundary - restored cwd/workdir-relative deserialization for local tool-call payloads by parsing `shell`, `shell_command`, and `exec_command` arguments under an `AbsolutePathBufGuard` rooted at the resolved command working directory - simplified runtime additional-permission normalization so it only canonicalizes and deduplicates absolute roots instead of trying to recover relative ones later - updated the app-server schema fixtures, `app-server/README.md`, and the affected transport/TUI tests to match the final behavior	2026-02-27 17:42:52 +00:00
Michael Bolin	e6cd75a684	notify: include client in legacy hook payload (#12968 ) ## Why The `notify` hook payload did not identify which Codex client started the turn. That meant downstream notification hooks could not distinguish between completions coming from the TUI and completions coming from app-server clients such as VS Code or Xcode. Now that the Codex App provides its own desktop notifications, it would be nice to be able to filter those out. This change adds that context without changing the existing payload shape for callers that do not know the client name, and keeps the new end-to-end test cross-platform. ## What changed - added an optional top-level `client` field to the legacy `notify` JSON payload - threaded that value through `core` and `hooks`; the internal session and turn state now carries it as `app_server_client_name` - set the field to `codex-tui` for TUI turns - captured `initialize.clientInfo.name` in the app server and applied it to subsequent turns before dispatching hooks - replaced the notify integration test hook with a `python3` script so the test does not rely on Unix shell permissions or `bash` - documented the new field in `docs/config.md` ## Testing - `cargo test -p codex-hooks` - `cargo test -p codex-tui` - `cargo test -p codex-app-server suite::v2::initialize::turn_start_notify_payload_includes_initialize_client_name -- --exact --nocapture` - `cargo test -p codex-core` (`src/lib.rs` passed; `core/tests/all.rs` still has unrelated existing failures in this environment) ## Docs The public config reference on `developers.openai.com/codex` should mention that the legacy `notify` payload may include a top-level `client` field. The TUI reports `codex-tui`, and the app server reports `initialize.clientInfo.name` when it is available.	2026-02-26 22:27:34 -08:00
Ahmed Ibrahim	4d180ae428	Add model availability NUX metadata (#12972 ) - replace show_nux with structured availability_nux model metadata - expose availability NUX data through the app-server model API - update shared fixtures and tests for the new field	2026-02-26 22:02:57 -08:00
Eric Traut	cee009d117	Add oauth_resource handling for MCP login flows (#12866 ) Addresses bug https://github.com/openai/codex/issues/12589 Builds on community PR #12763. This adds `oauth_resource` support for MCP `streamable_http` servers and wires it through the relevant config and login paths. It fixes the bug where the configured OAuth resource was not reliably included in the authorization request, causing MCP login to omit the expected `resource` parameter.	2026-02-26 20:10:12 -08:00
Ahmed Ibrahim	f90e97e414	Add realtime audio device picker (#12850 ) ## Summary - add a dedicated /audio picker for realtime microphone and speaker selection - persist realtime audio choices and prompt to restart only local audio when voice is live - add snapshot coverage for the new picker surfaces ## Validation - cargo test -p codex-tui - cargo insta accept - just fix -p codex-tui - just fmt	2026-02-26 17:27:44 -08:00
Celia Chen	90cc4e79a2	feat: add local date/timezone to turn environment context (#12947 ) ## Summary This PR includes the session's local date and timezone in the model-visible environment context and persists that data in `TurnContextItem`. ## What changed - captures the current local date and IANA timezone when building a turn context, with a UTC fallback if the timezone lookup fails - includes current_date and timezone in the serialized <environment_context> payload - stores those fields on TurnContextItem so they survive rollout/history handling, subagent review threads, and resume flows - treats date/timezone changes as environment updates, so prompt caching and context refresh logic do not silently reuse stale time context - updates tests to validate the new environment fields without depending on a single hardcoded environment-context string ## test built a local build and saw it in the rollout file: ``` {"timestamp":"2026-02-26T21:39:50.737Z","type":"response_item","payload":{"type":"message","role":"user","content":[{"type":"input_text","text":"<environment_context>\n <shell>zsh</shell>\n <current_date>2026-02-26</current_date>\n <timezone>America/Los_Angeles</timezone>\n</environment_context>"}]}} ```	2026-02-26 23:17:35 +00:00
Ahmed Ibrahim	a0e86c69fe	Add realtime audio device config (#12849 ) ## Summary - add top-level realtime audio config for microphone and speaker selection - apply configured devices when starting realtime capture and playback - keep missing-device behavior on the system default fallback path ## Validation - just write-config-schema - cargo test -p codex-core realtime_audio - cargo test -p codex-tui - just fix -p codex-core - just fix -p codex-tui - just fmt --------- Co-authored-by: Codex <noreply@openai.com>	2026-02-26 15:08:21 -08:00
pakrym-oai	951a389654	Allow clients not to send summary as an option (#12950 ) Summary is a required parameter on UserTurn. Ideally we'd like the core to decide the appropriate summary level. Make the summary optional and don't send it when not needed.	2026-02-26 14:37:38 -08:00
Charley Cunningham	c1afb8815a	tui: use thread_id for resume/fork cwd resolution (#12727 ) ## Summary - make resume/fork targets explicit and typed as `SessionTarget { path, thread_id }` (non-optional `thread_id`) - resolve `thread_id` centrally via `resolve_session_thread_id(...)`: - use CLI input directly when it is a UUID (`--resume <uuid>` / `--fork <uuid>`) - otherwise read `thread_id` from rollout `SessionMeta` for path-based selections (picker, `--resume-last`, name-based resume/fork) - use `thread_id` to read cwd from SQLite first during resume/fork cwd resolution - keep rollout fallback for cwd resolution when SQLite is unavailable or does not return thread metadata (`TurnContext` tail, then `SessionMeta`) - keep the resume picker open when a selected row has unreadable session metadata, and show an inline recoverable error instead of aborting the TUI ## Why This removes ad-hoc rollout filename parsing and makes resume/fork target identity explicit. The resume/fork cwd check can use indexed SQLite lookup by `thread_id` in the common path, while preserving rollout-based fallback behavior. It also keeps malformed legacy rows recoverable in the picker instead of letting a selection failure unwind the app. ## Notes - minimal TUI-only change; no schema/protocol changes - includes TUI test coverage for SQLite cwd precedence when `thread_id` is available - includes TUI regression coverage for picker inline error rendering / non-fatal unreadable session rows ## Codex author `codex resume 019c9205-7f8b-7173-a2a2-f082d4df3de3`	2026-02-26 12:52:31 -08:00
pakrym-oai	717cbe354f	Remove noisy log (#12929 ) This log message floods logs on windows	2026-02-26 11:34:14 -08:00
pakrym-oai	ba41e84a50	Use model catalog default for reasoning summary fallback (#12873 ) ## Summary - make `Config.model_reasoning_summary` optional so unset means use model default - resolve the optional config value to a concrete summary when building `TurnContext` - add protocol support for `default_reasoning_summary` in model metadata ## Validation - `cargo test -p codex-core --lib client::tests -- --nocapture` --------- Co-authored-by: Codex <noreply@openai.com>	2026-02-26 09:31:13 -08:00
jif-oai	f0a85ded18	fix: ctrl c sub agent (#12911 )	2026-02-26 17:06:20 +00:00
pash-openai	6acede5a28	tui: restore visible line numbers for hidden file links (#12870 ) we recently changed file linking so the model uses markdown links when it wants something to be clickable. This works well across the GUI surfaces because they can render markdown cleanly and use the full absolute path in the anchor target. A previous pass hid the absolute path in the TUI (and only showed the label), but that also meant we could lose useful location info when the model put the line number or range in the anchor target instead of the label. This follow-up keeps the TUI behavior simple while making local file links feel closer to the old TUI file reference style. key changes: - Local markdown file links in the TUI keep the old file-ref feel: code styling, no underline, no visible absolute path. - If the hidden local anchor target includes a location suffix and the label does not already include one, we append that suffix to the visible label. - This works for single lines, line/column references, and ranges. - If the label already includes the location, we leave it alone. - normal web links keep the old TUI markdown-link behavior some examples: - `[foo.rs](/abs/path/foo.rs)` renders as `foo.rs` - `[foo.rs](/abs/path/foo.rs:45)` renders as `foo.rs:45` - `[foo.rs](/abs/path/foo.rs:45:3-48:9)` renders as `foo.rs:45:3-48:9` - `[foo.rs:45](/abs/path/foo.rs:45)` stays `foo.rs:45` - `[docs](https://example.com/docs)` still renders like a normal web link how it looks: <img width="732" height="813" alt="Screenshot 2026-02-26 at 9 27 55 AM" src="https://github.com/user-attachments/assets/d51bf236-653a-4e83-96e4-9427f0804471" />	2026-02-26 10:29:54 +00:00
Eric Traut	28bfbb8f2b	Enforce user input length cap (#12823 ) Currently there is no bound on the length of a user message submitted in the TUI or through the app server interface. That means users can paste many megabytes of text, which can lead to bad performance, hangs, and crashes. In extreme cases, it can lead to a [kernel panic](https://github.com/openai/codex/issues/12323). This PR limits the length of a user input to 2**20 (about 1M) characters. This value was chosen because it fills the entire context window on the latest models, so accepting longer inputs wouldn't make sense anyway. Summary - add a shared `MAX_USER_INPUT_TEXT_CHARS` constant in codex-protocol and surface it in TUI and app server code - block oversized submissions in the TUI submit flow and emit error history cells when validation fails - reject heavy app-server requests with JSON-RPC `-32602` and structured `input_too_large` data, plus document the behavior Testing - ran the IDE extension with this change and verified that when I attempt to paste a user message that's several MB long, it correctly reports an error instead of crashing or making my computer hot.	2026-02-25 22:23:51 -08:00
pash-openai	9a96b6f509	Hide local file link destinations in TUI markdown (#12705 ) ## Summary - hide appended destinations for local path-style markdown links in the TUI renderer - keep web links rendering with their visible destination and style link labels consistently - add markdown renderer tests and a snapshot for the new file-link output ## Testing - just fmt - cargo test -p codex-tui <img width="1120" height="968" alt="image" src="https://github.com/user-attachments/assets/490e8eda-ae47-4231-89fa-b254a1f83eed" />	2026-02-26 05:28:37 +00:00
Michael Bolin	14116ade8d	feat: include available decisions in command approval requests (#12758 ) Command-approval clients currently infer which choices to show from side-channel fields like `networkApprovalContext`, `proposedExecpolicyAmendment`, and `additionalPermissions`. That makes the request shape harder to evolve, and it forces each client to replicate the server's heuristics instead of receiving the exact decision list for the prompt. This PR introduces a mapping between `CommandExecutionApprovalDecision` and `codex_protocol::protocol::ReviewDecision`: ```rust impl From<CoreReviewDecision> for CommandExecutionApprovalDecision { fn from(value: CoreReviewDecision) -> Self { match value { CoreReviewDecision::Approved => Self::Accept, CoreReviewDecision::ApprovedExecpolicyAmendment { proposed_execpolicy_amendment, } => Self::AcceptWithExecpolicyAmendment { execpolicy_amendment: proposed_execpolicy_amendment.into(), }, CoreReviewDecision::ApprovedForSession => Self::AcceptForSession, CoreReviewDecision::NetworkPolicyAmendment { network_policy_amendment, } => Self::ApplyNetworkPolicyAmendment { network_policy_amendment: network_policy_amendment.into(), }, CoreReviewDecision::Abort => Self::Cancel, CoreReviewDecision::Denied => Self::Decline, } } } ``` And updates `CommandExecutionRequestApprovalParams` to have a new field: ```rust available_decisions: Option<Vec<CommandExecutionApprovalDecision>> ``` when, if specified, should make it easier for clients to display an appropriate list of options in the UI. This makes it possible for `CoreShellActionProvider::prompt()` in `unix_escalation.rs` to specify the `Vec<ReviewDecision>` directly, adding support for `ApprovedForSession` when approving a skill script, which was previously missing in the TUI. Note this results in a significant change to `exec_options()` in `approval_overlay.rs`, as the displayed options are now derived from `available_decisions: &[ReviewDecision]`. ## What Changed - Add `available_decisions` to [`ExecApprovalRequestEvent`](`de00e932dd/codex-rs/protocol/src/approvals.rs (L111-L175)`), including helpers to derive the legacy default choices when older senders omit the field. - Map `codex_protocol::protocol::ReviewDecision` to app-server `CommandExecutionApprovalDecision` and expose the ordered list as experimental `availableDecisions` in [`CommandExecutionRequestApprovalParams`](`de00e932dd/codex-rs/app-server-protocol/src/protocol/v2.rs (L3798-L3807)`). - Thread optional `available_decisions` through the core approval path so Unix shell escalation can explicitly request `ApprovedForSession` for session-scoped approvals instead of relying on client heuristics. [`unix_escalation.rs`](`de00e932dd/codex-rs/core/src/tools/runtimes/shell/unix_escalation.rs (L194-L214)`) - Update the TUI approval overlay to build its buttons from the ordered decision list, while preserving the legacy fallback when `available_decisions` is missing. - Update the app-server README, test client output, and generated schema artifacts to document and surface the new field. ## Testing - Add `approval_overlay.rs` coverage for explicit decision lists, including the generic `ApprovedForSession` path and network approval options. - Update `chatwidget/tests.rs` and app-server protocol tests to populate the new optional field and keep older event shapes working. ## Developers Docs - If we document `item/commandExecution/requestApproval` on [developers.openai.com/codex](https://developers.openai.com/codex), add experimental `availableDecisions` as the preferred source of approval choices and note that older servers may omit it.	2026-02-26 01:10:46 +00:00
Celia Chen	4f45668106	Revert "Add skill approval event/response (#12633 )" (#12811 ) This reverts commit https://github.com/openai/codex/pull/12633. We no longer need this PR, because we favor sending normal exec command approval server request with `additional_permissions` of skill permissions instead	2026-02-26 01:02:42 +00:00
Ahmed Ibrahim	e76b1a2853	Remove steer feature flag (#12026 ) All code should go in the direction that steer is enabled --------- Co-authored-by: Codex <noreply@openai.com>	2026-02-25 15:41:42 -08:00
Charley Cunningham	2f4d6ded1d	Enable request_user_input in Default mode (#12735 ) ## Summary - allow `request_user_input` in Default collaboration mode as well as Plan - update the Default-mode instructions to prefer assumptions first and use `request_user_input` only when a question is unavoidable - update request_user_input and app-server tests to match the new Default-mode behavior - refactor collaboration-mode availability plumbing into `CollaborationModesConfig` for future mode-related flags ## Codex author `codex resume 019c9124-ed28-7c13-96c6-b916b1c97d49`	2026-02-25 15:20:46 -08:00
sayan-oai	d45ffd5830	make 5.3-codex visible in cli for api users (#12808 ) 5.3-codex released in api, mark it visible for API users via bundled `models.json`.	2026-02-25 13:01:40 -08:00
Eric Traut	f6fdfbeb98	Update Codex docs success link (#12805 ) Fix a stale documentation link in the sign-in flow	2026-02-25 12:02:41 -08:00
Owen Lin	a0fd94bde6	feat(app-server): add ThreadItem::DynamicToolCall (#12732 ) Previously, clients would call `thread/start` with dynamic_tools set, and when a model invokes a dynamic tool, it would just make the server->client `item/tool/call` request and wait for the client's response to complete the tool call. This works, but it doesn't have an `item/started` or `item/completed` event. Now we are doing this: - [new] emit `item/started` with `DynamicToolCall` populated with the call arguments - send an `item/tool/call` server request - [new] once the client responds, emit `item/completed` with `DynamicToolCall` populated with the response. Also, with `persistExtendedHistory: true`, dynamic tool calls are now reconstructable in `thread/read` and `thread/resume` as `ThreadItem::DynamicToolCall`.	2026-02-25 12:00:10 -08:00

1 2 3 4 5 ...

1255 commits