Commit graph

1331 commits

Author SHA1 Message Date
Charley Cunningham
bc24017d64
Add Smart Approvals guardian review across core, app-server, and TUI (#13860)
## Summary
- add `approvals_reviewer = "user" | "guardian_subagent"` as the runtime
control for who reviews approval requests
- route Smart Approvals guardian review through core for command
execution, file changes, managed-network approvals, MCP approvals, and
delegated/subagent approval flows
- expose guardian review in app-server with temporary unstable
`item/autoApprovalReview/{started,completed}` notifications carrying
`targetItemId`, `review`, and `action`
- update the TUI so Smart Approvals can be enabled from `/experimental`,
aligned with the matching `/approvals` mode, and surfaced clearly while
reviews are pending or resolved

## Runtime model
This PR does not introduce a new `approval_policy`.

Instead:
- `approval_policy` still controls when approval is needed
- `approvals_reviewer` controls who reviewable approval requests are
routed to:
  - `user`
  - `guardian_subagent`

`guardian_subagent` is a carefully prompted reviewer subagent that
gathers relevant context and applies a risk-based decision framework
before approving or denying the request.

The `smart_approvals` feature flag is a rollout/UI gate. Core runtime
behavior keys off `approvals_reviewer`.

When Smart Approvals is enabled from the TUI, it also switches the
current `/approvals` settings to the matching Smart Approvals mode so
users immediately see guardian review in the active thread:
- `approval_policy = on-request`
- `approvals_reviewer = guardian_subagent`
- `sandbox_mode = workspace-write`

Users can still change `/approvals` afterward.

Config-load behavior stays intentionally narrow:
- plain `smart_approvals = true` in `config.toml` remains just the
rollout/UI gate and does not auto-set `approvals_reviewer`
- the deprecated `guardian_approval = true` alias migration does
backfill `approvals_reviewer = "guardian_subagent"` in the same scope
when that reviewer is not already configured there, so old configs
preserve their original guardian-enabled behavior

ARC remains a separate safety check. For MCP tool approvals, ARC
escalations now flow into the configured reviewer instead of always
bypassing guardian and forcing manual review.

## Config stability
The runtime reviewer override is stable, but the config-backed
app-server protocol shape is still settling.

- `thread/start`, `thread/resume`, and `turn/start` keep stable
`approvalsReviewer` overrides
- the config-backed `approvals_reviewer` exposure returned via
`config/read` (including profile-level config) is now marked
`[UNSTABLE]` / experimental in the app-server protocol until we are more
confident in that config surface

## App-server surface
This PR intentionally keeps the guardian app-server shape narrow and
temporary.

It adds generic unstable lifecycle notifications:
- `item/autoApprovalReview/started`
- `item/autoApprovalReview/completed`

with payloads of the form:
- `{ threadId, turnId, targetItemId, review, action? }`

`review` is currently:
- `{ status, riskScore?, riskLevel?, rationale? }`
- where `status` is one of `inProgress`, `approved`, `denied`, or
`aborted`

`action` carries the guardian action summary payload from core when
available. This lets clients render temporary standalone pending-review
UI, including parallel reviews, even when the underlying tool item has
not been emitted yet.

These notifications are explicitly documented as `[UNSTABLE]` and
expected to change soon.

This PR does **not** persist guardian review state onto `thread/read`
tool items. The intended follow-up is to attach guardian review state to
the reviewed tool item lifecycle instead, which would improve
consistency with manual approvals and allow thread history / reconnect
flows to replay guardian review state directly.

## TUI behavior
- `/experimental` exposes the rollout gate as `Smart Approvals`
- enabling it in the TUI enables the feature and switches the current
session to the matching Smart Approvals `/approvals` mode
- disabling it in the TUI clears the persisted `approvals_reviewer`
override when appropriate and returns the session to default manual
review when the effective reviewer changes
- `/approvals` still exposes the reviewer choice directly
- the TUI renders:
- pending guardian review state in the live status footer, including
parallel review aggregation
  - resolved approval/denial state in history

## Scope notes
This PR includes the supporting core/runtime work needed to make Smart
Approvals usable end-to-end:
- shell / unified-exec / apply_patch / managed-network / MCP guardian
review
- delegated/subagent approval routing into guardian review
- guardian review risk metadata and action summaries for app-server/TUI
- config/profile/TUI handling for `smart_approvals`, `guardian_approval`
alias migration, and `approvals_reviewer`
- a small internal cleanup of delegated approval forwarding to dedupe
fallback paths and simplify guardian-vs-parent approval waiting (no
intended behavior change)

Out of scope for this PR:
- redesigning the existing manual approval protocol shapes
- persisting guardian review state onto app-server `ThreadItem`s
- delegated MCP elicitation auto-review (the current delegated MCP
guardian shim only covers the legacy `RequestUserInput` path)

---------

Co-authored-by: Codex <noreply@openai.com>
2026-03-13 15:27:00 -07:00
Ahmed Ibrahim
36dfb84427
Stabilize multi-agent feature flag (#14622)
- make multi_agent stable and enabled by default
- update feature and tool-spec coverage to match the new default

---------

Co-authored-by: Codex <noreply@openai.com>
2026-03-13 14:38:15 -07:00
Won Park
6720caf778
Slash copy osc52 wsl support (#13201)
This PR is a followup to the /copy feature to support WSL and SSH!
2026-03-13 14:00:58 -07:00
canvrno-oai
914f7c7317
Override local apps settings with requirements.toml settings (#14304)
This PR changes app and connector enablement when `requirements.toml` is
present locally or via remote configuration.

For apps.* entries:
- `enabled = false` in `requirements.toml` overrides the user’s local
`config.toml` and forces the app to be disabled.
- `enabled = true` in `requirements.toml` does not re-enable an app the
user has disabled in config.toml.

This behavior applies whether or not the user has an explicit entry for
that app in `config.toml`. It also applies to cloud-managed policies and
configurations when the admin sets the override through
`requirements.toml`.

Scenarios tested and verified:
- Remote managed, user config (present) override
- Admin-defined policies & configurations include a connector override:
  `[apps.<appID>]
enabled = false`
- User's config.toml has the same connector configured with `enabled =
true`
  - TUI/App should show connector as disabled
  - Connector should be unavailable for use in the composer
  
- Remote managed, user config (absent) override
- Admin-defined policies & configurations include a connector override:
  `[apps.<appID>]
enabled = false`
  - User's config.toml has no entry for the the same connector
  - TUI/App should show connector as disabled
  - Connector should be unavailable for use in the composer
  
- Locally managed, user config (present) override
  - Local requirements.toml includes a connector override:
  `[apps.<appID>]
enabled = false`
- User's config.toml has the same connector configured with `enabled =
true`
  - TUI/App should show connector as disabled
  - Connector should be unavailable for use in the composer

- Locally managed, user config (absent) override
  - Local requirements.toml includes a connector override:
  `[apps.<appID>]
enabled = false`
  - User's config.toml has no entry for the the same connector
  - TUI/App should show connector as disabled
  - Connector should be unavailable for use in the composer




<img width="1446" height="753" alt="image"
src="https://github.com/user-attachments/assets/61c714ca-dcca-4952-8ad2-0afc16ff3835"
/>
<img width="595" height="233" alt="image"
src="https://github.com/user-attachments/assets/7c8ab147-8fd7-429a-89fb-591c21c15621"
/>
2026-03-13 12:40:24 -07:00
Ahmed Ibrahim
d58620c852
Use subagents naming in the TUI (#14618)
- rename user-facing TUI multi-agent wording to subagents
- rename the surfaced slash command to `subagents` and update
tests/snapshots

Co-authored-by: Codex <noreply@openai.com>
2026-03-13 19:08:38 +00:00
Eric Traut
9dba7337f2
Start TUI on embedded app server (#14512)
This PR is part of the effort to move the TUI on top of the app server.
In a previous PR, we introduced an in-process app server and moved
`exec` on top of it.

For the TUI, we want to do the migration in stages. The app server
doesn't currently expose all of the functionality required by the TUI,
so we're going to need to support a hybrid approach as we make the
transition.

This PR changes the TUI initialization to instantiate an in-process app
server and access its `AuthManager` and `ThreadManager` rather than
constructing its own copies. It also adds a placeholder TUI event
handler that will eventually translate app server events into TUI
events. App server notifications are accepted but ignored for now. It
also adds proper shutdown of the app server when the TUI terminates.
2026-03-13 12:04:41 -06:00
Celia Chen
0c60eea4a5
feat: support skill-scoped managed network domain overrides in skill config (#14522)
## Summary

This lets skill loading split `permissions.network` into two distinct
pieces:

- `permissions.network.enabled` still feeds the skill
`PermissionProfile` and remains the coarse gate for whether the skill
can use network access at all.
- `permissions.network.allowed_domains` and
`permissions.network.denied_domains` are lifted into a new
`SkillManagedNetworkOverride` so managed-network sessions can start
per-skill scoped proxies with the right domain overrides.

The change also updates `SkillMetadata` construction sites and adds
loader tests covering YAML parsing plus normalization of the network
gate vs. domain override fields.

## Follow-up
A PR that uses the network_override to spin up a skill-specific proxy if
network_override is not none.
2026-03-13 04:45:14 +00:00
Jack Mousseau
7c7e267501
Simplify permissions available in request permissions tool (#14529) 2026-03-12 21:13:17 -07:00
Josh McKinney
6912da84a8
client: extend custom CA handling across HTTPS and websocket clients (#14239)
## Stacked PRs

This work is now effectively split across two steps:

- #14178: add custom CA support for browser and device-code login flows,
docs, and hermetic subprocess tests
- #14239: extend that shared custom CA handling across Codex HTTPS
clients and secure websocket TLS

Note: #14240 was merged into this branch while it was stacked on top of
this PR. This PR now subsumes that websocket follow-up and should be
treated as the combined change.

Builds on top of #14178.

## Problem

Custom CA support landed first in the login path, but the real
requirement is broader. Codex constructs outbound TLS clients in
multiple places, and both HTTPS and secure websocket paths can fail
behind enterprise TLS interception if they do not honor
`CODEX_CA_CERTIFICATE` or `SSL_CERT_FILE` consistently.

This PR broadens the shared custom-CA logic beyond login and applies the
same policy to websocket TLS, so the enterprise-proxy story is no longer
split between “HTTPS works” and “websockets still fail”.

## What This Delivers

Custom CA support is no longer limited to login. Codex outbound HTTPS
clients and secure websocket connections can now honor the same
`CODEX_CA_CERTIFICATE` / `SSL_CERT_FILE` configuration, so enterprise
proxy/intercept setups work more consistently end-to-end.

For users and operators, nothing new needs to be configured beyond the
same CA env vars introduced in #14178. The change is that more of Codex
now respects them, including websocket-backed flows that were previously
still using default trust roots.

I also manually validated the proxy path locally with mitmproxy using:
`CODEX_CA_CERTIFICATE=~/.mitmproxy/mitmproxy-ca-cert.pem
HTTPS_PROXY=http://127.0.0.1:8080 just codex`
with mitmproxy installed via `brew install mitmproxy` and configured as
the macOS system proxy.

## Mental model

`codex-client` is now the owner of shared custom-CA policy for outbound
TLS client construction. Reqwest callers start from the builder
configuration they already need, then pass that builder through
`build_reqwest_client_with_custom_ca(...)`. Websocket callers ask the
same module for a rustls client config when a custom CA bundle is
configured.

The env precedence is the same everywhere:
- `CODEX_CA_CERTIFICATE` wins
- otherwise fall back to `SSL_CERT_FILE`
- otherwise use system roots

The helper is intentionally narrow. It loads every usable certificate
from the configured PEM bundle into the appropriate root store and
returns either a configured transport or a typed error that explains
what went wrong.

## Non-goals

This does not add handshake-level integration tests against a live TLS
endpoint. It does not validate that the configured bundle forms a
meaningful certificate chain. It also does not try to force every
transport in the repo through one abstraction; it extends the shared CA
policy across the reqwest and websocket paths that actually needed it.

## Tradeoffs

The main tradeoff is centralizing CA behavior in `codex-client` while
still leaving adoption up to call sites. That keeps the implementation
additive and reviewable, but it means the rule "outbound Codex TLS that
should honor enterprise roots must use the shared helper" is still
partly enforced socially rather than by types.

For websockets, the shared helper only builds an explicit rustls config
when a custom CA bundle is configured. When no override env var is set,
websocket callers still use their ordinary default connector path.

## Architecture

`codex-client::custom_ca` now owns CA bundle selection, PEM
normalization, mixed-section parsing, certificate extraction, typed
CA-loading errors, and optional rustls client-config construction for
websocket TLS.

The affected consumers now call into that shared helper directly rather
than carrying login-local CA behavior:
- backend-client
- cloud-tasks
- RMCP client paths that use `reqwest`
- TUI voice HTTP paths
- `codex-core` default reqwest client construction
- `codex-api` websocket clients for both responses and realtime
websocket connections

The subprocess CA probe, env-sensitive integration tests, and shared PEM
fixtures also live in `codex-client`, which is now the actual owner of
the behavior they exercise.

## Observability

The shared CA path logs:
- which environment variable selected the bundle
- which path was loaded
- how many certificates were accepted
- when `TRUSTED CERTIFICATE` labels were normalized
- when CRLs were ignored
- where client construction failed

Returned errors remain user-facing and include the relevant env var,
path, and remediation hint. That same error model now applies whether
the failure surfaced while building a reqwest client or websocket TLS
configuration.

## Tests

Pure unit tests in `codex-client` cover env precedence and PEM
normalization behavior. Real client construction remains in subprocess
tests so the suite can control process env and avoid the macOS seatbelt
panic path that motivated the hermetic test split.

The subprocess coverage verifies:
- `CODEX_CA_CERTIFICATE` precedence over `SSL_CERT_FILE`
- fallback to `SSL_CERT_FILE`
- single-cert and multi-cert bundles
- malformed and empty-file errors
- OpenSSL `TRUSTED CERTIFICATE` handling
- CRL tolerance for well-formed CRL sections

The websocket side is covered by the existing `codex-api` / `codex-core`
websocket test suites plus the manual mitmproxy validation above.

---------

Co-authored-by: Ivan Zakharchanka <3axap4eHko@gmail.com>
Co-authored-by: Codex <noreply@openai.com>
2026-03-13 00:59:26 +00:00
Eric Traut
7f2ca502f5
Updated out-of-date tip about availability on free and go plans (#14471)
This addresses #14464
2026-03-12 09:12:38 -06:00
Matthew Zeng
23e55d7668
[elicitation] User-friendly tool call messages. (#14403)
- [x] Add a curated set of tool call messages and human-readable tool
param names.
2026-03-12 00:35:21 -07:00
Matthew Zeng
ba5b94287e
[apps] Add tool_suggest tool. (#14287)
- [x] Add tool_suggest tool.
- [x] Move chatgpt/src/connectors.rs and core/src/connectors.rs into a
dedicated mod so that we have all the logic and global cache in one
place.
- [x] Update TUI app link view to support rendering the installation
view for mcp elicitation.

---------

Co-authored-by: Shaqayeq <shaqayeq@openai.com>
Co-authored-by: Eric Traut <etraut@openai.com>
Co-authored-by: pakrym-oai <pakrym@openai.com>
Co-authored-by: Ahmed Ibrahim <aibrahim@openai.com>
Co-authored-by: guinness-oai <guinness@openai.com>
Co-authored-by: Eugene Brevdo <ebrevdo@users.noreply.github.com>
Co-authored-by: Charlie Guo <cguo@openai.com>
Co-authored-by: Fouad Matin <fouad@openai.com>
Co-authored-by: Fouad Matin <169186268+fouad-openai@users.noreply.github.com>
Co-authored-by: xl-openai <xl@openai.com>
Co-authored-by: alexsong-oai <alexsong@openai.com>
Co-authored-by: Owen Lin <owenlin0@gmail.com>
Co-authored-by: sdcoffey <stevendcoffey@gmail.com>
Co-authored-by: Codex <noreply@openai.com>
Co-authored-by: Won Park <won@openai.com>
Co-authored-by: Dylan Hurd <dylan.hurd@openai.com>
Co-authored-by: celia-oai <celia@openai.com>
Co-authored-by: gabec-openai <gabec@openai.com>
Co-authored-by: joeytrasatti-openai <joey.trasatti@openai.com>
Co-authored-by: Leo Shimonaka <leoshimo@openai.com>
Co-authored-by: Rasmus Rygaard <rasmus@openai.com>
Co-authored-by: maja-openai <163171781+maja-openai@users.noreply.github.com>
Co-authored-by: pash-openai <pash@openai.com>
Co-authored-by: Josh McKinney <joshka@openai.com>
2026-03-11 22:06:59 -07:00
Owen Lin
5bc82c5b93
feat(app-server): propagate traces across tasks and core ops (#14387)
## Summary

This PR keeps app-server RPC request trace context alive for the full
lifetime of the work that request kicks off (e.g. for `thread/start`,
this is `app-server rpc handler -> tokio background task -> core op
submissions`). Previously we lose trace lineage once the request handler
returns or hands work off to background tasks.

This approach is especially relevant for `thread/start` and other RPC
handlers that run in a non-blocking way. In the near future we'll most
likely want to make all app-server handlers run in a non-blocking way by
default, and only queue operations that must operate in order (e.g.
thread RPCs per thread?), so we want to make sure tracing in app-server
just generally works.

Depends on https://github.com/openai/codex/pull/14300

**Before**
<img width="155" height="207" alt="image"
src="https://github.com/user-attachments/assets/c9487459-36f1-436c-beb7-fafeb40737af"
/>


**After**
<img width="299" height="337" alt="image"
src="https://github.com/user-attachments/assets/727392b2-d072-4427-9dc4-0502d8652dea"
/>

## What changed

- Keep request-scoped trace context around until we send the final
response or error, or the connection closes.
- Thread that trace context through detached `thread/start` work so
background startup stays attached to the originating request.
- Pass request trace context through to downstream core operations,
including:
  - thread creation
  - resume/fork flows
  - turn submission
  - review
  - interrupt
  - realtime conversation operations
- Add tracing tests that verify:
  - remote W3C trace context is preserved for `thread/start`
  - remote W3C trace context is preserved for `turn/start`
  - downstream core spans stay under the originating request span
  - request-scoped tracing state is cleaned up correctly
- Clean up shutdown behavior so detached background tasks and spawned
threads are drained before process exit.
2026-03-11 20:18:31 -07:00
Ahmed Ibrahim
bf5e997b31
Include spawn agent model metadata in app-server items (#14410)
- add model and reasoning effort to app-server collab spawn items and
notifications
- regenerate app-server protocol schemas for the new fields

---------

Co-authored-by: Codex <noreply@openai.com>
2026-03-11 19:25:21 -07:00
Josh McKinney
f548309797
Keep agent-switch word-motion keys out of draft editing (#14376)
## Summary
- only trigger multi-agent fast-switch shortcuts when the composer is
empty
- keep the Option+b/f fallback for terminals that encode Option+arrow
that way
- document why the empty-composer gate preserves expected word-wise
editing behavior

## Testing
- just fmt
- cargo test -p codex-tui

Co-authored-by: Codex <noreply@openai.com>
2026-03-11 14:52:40 -07:00
gabec-openai
180a5820fc Add keyboard based fast switching between agents in TUI (#13923) 2026-03-11 12:33:10 -07:00
pash-openai
da74da6684 render local file links from target paths (#13857)
Co-authored-by: Josh McKinney <joshka@openai.com>
2026-03-11 12:33:09 -07:00
Ahmed Ibrahim
285b3a5143 Show spawned agent model and effort in TUI (#14273)
- include the requested sub-agent model and reasoning effort in the
spawn begin event\n- render that metadata next to the spawned agent name
and role in the TUI transcript

---------

Co-authored-by: Codex <noreply@openai.com>
2026-03-11 12:33:09 -07:00
Leo Shimonaka
889b4796fc feat: Add additional macOS Sandbox Permissions for Launch Services, Contacts, Reminders (#14155)
Add additional macOS Sandbox Permissions levers for the following:

- Launch Services
- Contacts
- Reminders
2026-03-11 12:33:09 -07:00
Won Park
722e8f08e1 unifying all image saves to /tmp to bug-proof (#14149)
image-gen feature will have the model saving to /tmp by default + at all
times
2026-03-11 12:33:08 -07:00
Matthew Zeng
f2d66fadd8 add(core): arc_monitor (#13936)
## Summary
- add ARC monitor support for MCP tool calls by serializing MCP approval
requests into the ARC action shape and sending the relevant
conversation/policy context to the `/api/codex/safety/arc` endpoint
- route ARC outcomes back into MCP approval flow so `ask-user` falls
back to a user prompt and `steer-model` blocks the tool call, with
guardian/ARC tests covering the new request shape
- update the TUI approval copy from “Approve Once” to “Allow” / “Allow
for this session” and refresh the related
  snapshots

---------

Co-authored-by: Fouad Matin <fouad@openai.com>
Co-authored-by: Fouad Matin <169186268+fouad-openai@users.noreply.github.com>
2026-03-11 12:33:08 -07:00
Eric Traut
f9cba5cb16 Log ChatGPT user ID for feedback tags (#13901)
There are some bug investigations that currently require us to ask users
for their user ID even though they've already uploaded logs and session
details via `/feedback`. This frustrates users and increases the time
for diagnosis.

This PR includes the ChatGPT user ID in the metadata uploaded for
`/feedback` (both the TUI and app-server).
2026-03-11 12:33:07 -07:00
Ahmed Ibrahim
2e24be2134
Use realtime transcript for handoff context (#14132)
- collect input/output transcript deltas into active handoff transcript
state
- attach and clear that transcript on each handoff, and regenerate
schema/tests
2026-03-09 22:30:03 -07:00
Matthew Zeng
566e4cee4b
[apps] Fix apps enablement condition. (#14011)
- [x] Fix apps enablement condition to check both the feature flag and
that the user is not an API key user.
2026-03-09 22:25:43 -07:00
xl-openai
0c33af7746
feat: support disabling bundled system skills (#13792)
Support disable bundled system skills with a config:

[skills.bundled]
enabled = false
2026-03-09 22:02:53 -07:00
pash-openai
63597d1b2d
tui: only show fast status for gpt-5.4 (#14135) 2026-03-09 21:12:05 -07:00
Andrei Eternal
244b2d53f4
start of hooks engine (#13276)
(Experimental)

This PR adds a first MVP for hooks, with SessionStart and Stop

The core design is:

- hooks live in a dedicated engine under codex-rs/hooks
- each hook type has its own event-specific file
- hook execution is synchronous and blocks normal turn progression while
running
- matching hooks run in parallel, then their results are aggregated into
a normalized HookRunSummary

On the AppServer side, hooks are exposed as operational metadata rather
than transcript-native items:

- new live notifications: hook/started, hook/completed
- persisted/replayed hook results live on Turn.hookRuns
- we intentionally did not add hook-specific ThreadItem variants

Hooks messages are not persisted, they remain ephemeral. The context
changes they add are (they get appended to the user's prompt)
2026-03-10 04:11:31 +00:00
sayan-oai
a3cd9f16f5
sort plugins first in menu (#14163)
we want plugin mentions to show up before others, like apps and skills.

updated tests.
2026-03-10 03:51:16 +00:00
sayan-oai
a5af11211a
make dollar-mention always clarify item category (skill, app, plugin) (#14147)
#### What

###### Context + Problem

With the introduction of plugins, we now have one more type of
`$`-mentionable item in the TUI's popup menu on `$`. Apps, skills, and
plugins can all have the same user-facing name, and we attempt to
distinguish with a category tag suffix, like `[App]`. This has a few
problems:

- We decide to show tags by the text that will be inserted into the
conversation, not the actual user-visible text, so two visibly-identical
entries can have no clarifying category tag suffix
- The category tag is a suffix and commonly gets cut off by long
descriptions
- The skill category tag is currently only displayed on repo skills as
`[Repo]`, which is confusing to most users
- The plugin category tag is currently `[<marketplace-name>]`, which is
also confusing to most users

###### Solution
- **Always** show a **prefix** category tag that is `[Skill]`, `[App]`,
or `[Plugin]`. No conditional rendering or copy.

Before:
<img width="801" height="153" alt="image"
src="https://github.com/user-attachments/assets/448e06e7-2af8-4c14-9804-ed1ca17cf514"
/>

After:
<img width="800" height="118" alt="image"
src="https://github.com/user-attachments/assets/57895b41-06fe-4d92-887b-68704c5a15fd"
/>

I also feel this clarifies the results at-a-glance while you scroll:


https://github.com/user-attachments/assets/cbdd5840-53d9-4656-812c-6e816755e1fd

### Tests
Added + updated tests (including snapshots), tested locally
2026-03-09 19:35:11 -07:00
Dylan Hurd
d241dc598c
feat(core) Persist request_permission data across turns (#14009)
## Summary
request_permissions flows should support persisting results for the
session.

Open Question: Still deciding if we need within-turn approvals - this
adds complexity but I could see it being useful

## Testing
- [x] Updated unit tests

---------

Co-authored-by: Codex <noreply@openai.com>
2026-03-09 14:36:38 -07:00
Won Park
42f20a6845
pass on save info to model + ui tweaks (#14123)
Passing on more information to the model for context purposes, to
streamline image-identification.
2026-03-09 20:10:15 +00:00
Ahmed Ibrahim
e03e9b63ea
Stabilize guardian approval coverage (#14103)
## Summary
- align the guardian permission test with the actual sandbox policy it
widens and use a slightly larger Windows-only timeout budget
- expose the additional-permissions normalization helper to the guardian
test module
- replace the guardian popup snapshot assertion with targeted string
assertions

## Why this fixes the flake
This group was carrying two separate sources of drift. The guardian core
test widened derived sandbox policies without updating the source
sandbox policy, and it used a Windows command/timeout combination that
was too tight on slower runners. Separately, the TUI test was
snapshotting the full popup even though unrelated feature text changes
were the only thing moving. The new assertions keep coverage on the
guardian entry itself while removing unrelated snapshot churn.
2026-03-09 11:23:20 -07:00
xl-openai
c1f3ef16ec
fix(plugin): Also load curated plugins for TUI. (#14050)
Also run maybe_start_curated_repo_sync_for_config at TUI start time.
2026-03-09 11:05:02 -07:00
Charley Cunningham
f23fcd6ced
guardian initial feedback / tweaks (#13897)
## Summary
- remove the remaining model-visible guardian-specific `on-request`
prompt additions so enabling the feature does not change the main
approval-policy instructions
- neutralize user-facing guardian wording to talk about automatic
approval review / approval requests rather than a second reviewer or
only sandbox escalations
- tighten guardian retry-context handling so agent-authored
`justification` stays in the structured action JSON and is not also
injected as raw retry context
- simplify guardian review plumbing in core by deleting dead
prompt-append paths and trimming some request/transcript setup code

## Notable Changes
- delete the dead `permissions/approval_policy/guardian.md` append path
and stop threading `guardian_approval_enabled` through model-facing
developer-instruction builders
- rename the experimental feature copy to `Automatic approval review`
and update the `/experimental` snapshot text accordingly
- make approval-review status strings generic across shell, patch,
network, and MCP review types
- forward real sandbox/network retry reasons for shell and unified-exec
guardian review, but do not pass agent-authored justification as raw
retry context
- simplify `guardian.rs` by removing the one-field request wrapper,
deduping reasoning-effort selection, and cleaning up transcript entry
collection

## Testing
- `just fmt`
- full validation left to CI

---------

Co-authored-by: Codex <noreply@openai.com>
2026-03-09 09:25:24 -07:00
Dylan Hurd
06f82c123c
feat(tui) render request_permissions calls (#14004)
## Summary
Adds support for tui rendering of request_permission calls

<img width="724" height="245" alt="Screenshot 2026-03-08 at 9 04 07 PM"
src="https://github.com/user-attachments/assets/e1997825-a496-4bfb-bbda-43d0006460a5"
/>


## Testing
- [x] Added snapshot test
2026-03-09 04:24:04 +00:00
Jack Mousseau
e6b93841c5
Add request permissions tool (#13092)
Adds a built-in `request_permissions` tool and wires it through the
Codex core, protocol, and app-server layers so a running turn can ask
the client for additional permissions instead of relying on a static
session policy.

The new flow emits a `RequestPermissions` event from core, tracks the
pending request by call ID, forwards it through app-server v2 as an
`item/permissions/requestApproval` request, and resumes the tool call
once the client returns an approved subset of the requested permission
profile.
2026-03-08 20:23:06 -07:00
Charley Cunningham
4ad3b59de3
tui: clarify pending steer follow-ups (#13841)
## Summary
- split the pending input preview into labeled pending-steer and queued
follow-up sections
- explain that pending steers submit after the next tool call and that
Esc can interrupt and send them immediately
- treat Esc as an interrupt-plus-resubmit path when pending steers
exist, with updated TUI snapshots and tests

Queues and steers:
<img width="1038" height="263" alt="Screenshot 2026-03-07 at 10 17
17 PM"
src="https://github.com/user-attachments/assets/4ef433ef-27a3-4b7c-ad69-2046f6eb89e6"
/>

After pressing Esc:
<img width="1046" height="320" alt="Screenshot 2026-03-07 at 10 17
21 PM"
src="https://github.com/user-attachments/assets/0f4d89e0-b6b9-486a-9f04-b6021f169ba7"
/>

## Codex author
`codex resume 019cc6f4-2cca-7803-b717-8264526dbd97`

---------

Co-authored-by: Codex <noreply@openai.com>
2026-03-08 20:13:21 -07:00
Celia Chen
340f9c9ecb
app-server: include experimental skill metadata in exec approval requests (#13929)
## Summary

This change surfaces skill metadata on command approval requests so
app-server clients can tell when an approval came from a skill script
and identify the originating `SKILL.md`.

- add `skill_metadata` to exec approval events in the shared protocol
- thread skill metadata through core shell escalation and delegated
approval handling for skill-triggered approvals
- expose the field in app-server v2 as experimental `skillMetadata`
- regenerate the JSON/TypeScript schemas and cover the new field in
protocol, transport, core, and TUI tests

## Why

Skill-triggered approvals already carry skill context inside core, but
app-server clients could not see which skill caused the prompt. Sending
the skill metadata with the approval request makes it possible for
clients to present better approval UX and connect the prompt back to the
relevant skill definition.


## example event in app-server-v2
verified that we see this event when experimental api is on:
```
< {
<   "id": 11,
<   "method": "item/commandExecution/requestApproval",
<   "params": {
<     "additionalPermissions": {
<       "fileSystem": null,
<       "macos": {
<         "accessibility": false,
<         "automations": {
<           "bundle_ids": [
<             "com.apple.Notes"
<           ]
<         },
<         "calendar": false,
<         "preferences": "read_only"
<       },
<       "network": null
<     },
<     "approvalId": "25d600ee-5a3c-4746-8d17-e2e61fb4c563",
<     "availableDecisions": [
<       "accept",
<       "acceptForSession",
<       "cancel"
<     ],
<     "command": "/Applications/ChatGPT.app/Contents/Resources/CodexAppServer_CodexAppServerBundledSkills.bundle/Contents/Resources/skills/apple-notes/scripts/notes_info",
<     "commandActions": [
<       {
<         "command": "/Applications/ChatGPT.app/Contents/Resources/CodexAppServer_CodexAppServerBundledSkills.bundle/Contents/Resources/skills/apple-notes/scripts/notes_info",
<         "type": "unknown"
<       }
<     ],
<     "cwd": "/Applications/ChatGPT.app/Contents/Resources/CodexAppServer_CodexAppServerBundledSkills.bundle/Contents/Resources/skills/apple-notes",
<     "itemId": "call_jZp3xFpNg4D8iKAD49cvEvZy",
<     "skillMetadata": {
<       "pathToSkillsMd": "/Applications/ChatGPT.app/Contents/Resources/CodexAppServer_CodexAppServerBundledSkills.bundle/Contents/Resources/skills/apple-notes/SKILL.md"
<     },
<     "threadId": "019ccc10-b7d3-7ff2-84fe-3a75e7681e69",
<     "turnId": "019ccc10-b848-76f1-81b3-4a1fa225493f"
<   }
< }`
```

& verified that this is the event when experimental api is off:
```
< {
<   "id": 13,
<   "method": "item/commandExecution/requestApproval",
<   "params": {
<     "approvalId": "5fbbf776-261b-4cf8-899b-c125b547f2c0",
<     "availableDecisions": [
<       "accept",
<       "acceptForSession",
<       "cancel"
<     ],
<     "command": "/Applications/ChatGPT.app/Contents/Resources/CodexAppServer_CodexAppServerBundledSkills.bundle/Contents/Resources/skills/apple-notes/scripts/notes_info",
<     "commandActions": [
<       {
<         "command": "/Applications/ChatGPT.app/Contents/Resources/CodexAppServer_CodexAppServerBundledSkills.bundle/Contents/Resources/skills/apple-notes/scripts/notes_info",
<         "type": "unknown"
<       }
<     ],
<     "cwd": "/Users/celia/code/codex/codex-rs",
<     "itemId": "call_OV2DHzTgYcbYtWaTTBWlocOt",
<     "threadId": "019ccc16-2a2b-7be1-8500-e00d45b892d4",
<     "turnId": "019ccc16-2a8e-7961-98ec-649600e7d06a"
<   }
< }
```
2026-03-08 18:07:46 -07:00
Matthew Zeng
a4a9536fd7
[elicitations] Support always allow option for mcp tool calls. (#13807)
- [x] Support always allow option for mcp tool calls, writes to
config.toml.
- [x] Fix config hot-reload after starting a new thread for TUI.
2026-03-08 01:46:40 +00:00
sayan-oai
590cfa6176
chore: use @plugin instead of $plugin for plaintext mentions (#13921)
change plaintext plugin-mentions from `$plugin` to `@plugin`, ensure TUI
can correctly decode these from history.

tested locally, added/updated tests.
2026-03-08 01:36:39 +00:00
Eric Traut
e8d7ede83c
Fix TUI context window display before first TokenCount (#13896)
The TUI was showing the raw configured `model_context_window` until the
first
`TokenCount` event arrived, even though core had already emitted the
effective
runtime window on `TurnStarted`. This made the footer, status-line
context
window, and `/status` output briefly inconsistent for models/configs
where the
effective window differs from the configured value, such as the
`gpt-5.4`
1,000,000-token override reported in #13623.

Update the TUI to cache `TurnStarted.model_context_window` immediately
so
pre-token-count displays use the runtime effective window, and add
regression
coverage for the startup path.

---------

Co-authored-by: Charles Cunningham <ccunningham@openai.com>
Co-authored-by: Codex <noreply@openai.com>
2026-03-07 17:01:47 -07:00
Dylan Hurd
92f7541624
fix(ci) fix guardian ci (#13911)
## Summary
#13910 was merged with some unused imports, let's fix this

## Testing
- [x] Let's make sure CI is green

---------

Co-authored-by: Charles Cunningham <ccunningham@openai.com>
Co-authored-by: Codex <noreply@openai.com>
2026-03-07 23:34:56 +00:00
Charley Cunningham
e84ee33cc0
Add guardian approval MVP (#13692)
## Summary
- add the guardian reviewer flow for `on-request` approvals in command,
patch, sandbox-retry, and managed-network approval paths
- keep guardian behind `features.guardian_approval` instead of exposing
a public `approval_policy = guardian` mode
- route ordinary `OnRequest` approvals to the guardian subagent when the
feature is enabled, without changing the public approval-mode surface

## Public model
- public approval modes stay unchanged
- guardian is enabled via `features.guardian_approval`
- when that feature is on, `approval_policy = on-request` keeps the same
approval boundaries but sends those approval requests to the guardian
reviewer instead of the user
- `/experimental` only persists the feature flag; it does not rewrite
`approval_policy`
- CLI and app-server no longer expose a separate `guardian` approval
mode in this PR

## Guardian reviewer
- the reviewer runs as a normal subagent and reuses the existing
subagent/thread machinery
- it is locked to a read-only sandbox and `approval_policy = never`
- it does not inherit user/project exec-policy rules
- it prefers `gpt-5.4` when the current provider exposes it, otherwise
falls back to the parent turn's active model
- it fail-closes on timeout, startup failure, malformed output, or any
other review error
- it currently auto-approves only when `risk_score < 80`

## Review context and policy
- guardian mirrors `OnRequest` approval semantics rather than
introducing a separate approval policy
- explicit `require_escalated` requests follow the same approval surface
as `OnRequest`; the difference is only who reviews them
- managed-network allowlist misses that enter the approval flow are also
reviewed by guardian
- the review prompt includes bounded recent transcript history plus
recent tool call/result evidence
- transcript entries and planned-action strings are truncated with
explicit `<guardian_truncated ... />` markers so large payloads stay
bounded
- apply-patch reviews include the full patch content (without
duplicating the structured `changes` payload)
- the guardian request layout is snapshot-tested using the same
model-visible Responses request formatter used elsewhere in core

## Guardian network behavior
- the guardian subagent inherits the parent session's managed-network
allowlist when one exists, so it can use the same approved network
surface while reviewing
- exact session-scoped network approvals are copied into the guardian
session with protocol/port scope preserved
- those copied approvals are now seeded before the guardian's first turn
is submitted, so inherited approvals are available during any immediate
review-time checks

## Out of scope / follow-ups
- the sandbox-permission validation split was pulled into a separate PR
and is not part of this diff
- a future follow-up can enable `serde_json` preserve-order in
`codex-core` and then simplify the guardian action rendering further

---------

Co-authored-by: Codex <noreply@openai.com>
2026-03-07 05:40:10 -08:00
jif-oai
cf143bf71e
feat: simplify DB further (#13771) 2026-03-07 03:48:36 -08:00
Eric Traut
8df4d9b3b2
Add Fast mode status-line indicator (#13670)
Addresses feature request #13660

Adds new option to `/statusline` so the status line can display "fast
on" or "fast off"

Summary
- introduce a `FastMode` status-line item so `/statusline` can render
explicit `Fast on`/`Fast off` text for the service tier
- wire the item into the picker metadata and resolve its string from
`ChatWidget` without adding any unrelated `thread-name` logic or storage
changes
- ensure the refresh paths keep the cached footer in sync when the
service tier (fast mode) changes

Testing
- Manually tested

Here's what it looks like when enabled:

<img width="366" height="75" alt="image"
src="https://github.com/user-attachments/assets/7f992d2b-6dab-49ed-aa43-ad496f56f193"
/>
2026-03-07 00:42:08 -07:00
viyatb-oai
25fa974166
fix: support managed network allowlist controls (#12752)
## Summary
- treat `requirements.toml` `allowed_domains` and `denied_domains` as
managed network baselines for the proxy
- in restricted modes by default, build the effective runtime policy
from the managed baseline plus user-configured allowlist and denylist
entries, so common hosts can be pre-approved without blocking later user
expansion
- add `experimental_network.managed_allowed_domains_only = true` to pin
the effective allowlist to managed entries, ignore user allowlist
additions, and hard-deny non-managed domains without prompting
- apply `managed_allowed_domains_only` anywhere managed network
enforcement is active, including full access, while continuing to
respect denied domains from all sources
- add regression coverage for merged-baseline behavior, managed-only
behavior, and full-access managed-only enforcement

## Behavior
Assuming `requirements.toml` defines both
`experimental_network.allowed_domains` and
`experimental_network.denied_domains`.

### Default mode
- By default, the effective allowlist is
`experimental_network.allowed_domains` plus user or persisted allowlist
additions.
- By default, the effective denylist is
`experimental_network.denied_domains` plus user or persisted denylist
additions.
- Allowlist misses can go through the network approval flow.
- Explicit denylist hits and local or private-network blocks are still
hard-denied.
- When `experimental_network.managed_allowed_domains_only = true`, only
managed `allowed_domains` are respected, user allowlist additions are
ignored, and non-managed domains are hard-denied without prompting.
- Denied domains continue to be respected from all sources.

### Full access
- With managed requirements present, the effective allowlist is pinned
to `experimental_network.allowed_domains`.
- With managed requirements present, the effective denylist is pinned to
`experimental_network.denied_domains`.
- There is no allowlist-miss approval path in full access.
- Explicit denylist hits are hard-denied.
- `experimental_network.managed_allowed_domains_only = true` now also
applies in full access, so managed-only behavior remains in effect
anywhere managed network enforcement is active.
2026-03-06 17:52:54 -08:00
Owen Lin
289ed549cf
chore(otel): rename OtelManager to SessionTelemetry (#13808)
## Summary
This is a purely mechanical refactor of `OtelManager` ->
`SessionTelemetry` to better convey what the struct is doing. No
behavior change.

## Why

`OtelManager` ended up sounding much broader than what this type
actually does. It doesn't manage OTEL globally; it's the session-scoped
telemetry surface for emitting log/trace events and recording metrics
with consistent session metadata (`app_version`, `model`, `slug`,
`originator`, etc.).

`SessionTelemetry` is a more accurate name, and updating the call sites
makes that boundary a lot easier to follow.

## Validation

- `just fmt`
- `cargo test -p codex-otel`
- `cargo test -p codex-core`
2026-03-06 16:23:30 -08:00
Ruslan Nigmatullin
5b04cc657f
utils/pty: add streaming spawn and terminal sizing primitives (#13695)
Enhance pty utils:
* Support closing stdin
* Separate stderr and stdout streams to allow consumers differentiate them
* Provide compatibility helper to merge both streams back into combined one
* Support specifying terminal size for pty, including on-demand resizes while process is already running
* Support terminating the process while still consuming its outputs
2026-03-06 15:13:12 -08:00
sayan-oai
8a54d3caaa
feat: structured plugin parsing (#13711)
#### What

Add structured `@plugin` parsing and TUI support for plugin mentions.

- Core: switch from plain-text `@display_name` parsing to structured
`plugin://...` mentions via `UserInput::Mention` and
`[$...](plugin://...)` links in text, same pattern as apps/skills.
- TUI: add plugin mention popup, autocomplete, and chips when typing
`$`. Load plugin capability summaries and feed them into the composer;
plugin mentions appear alongside skills and apps.
- Generalize mention parsing to a sigil parameter, still defaults to `$`

<img width="797" height="119" alt="image"
src="https://github.com/user-attachments/assets/f0fe2658-d908-4927-9139-73f850805ceb"
/>

Builds on #13510. Currently clients have to build their own `id` via
`plugin@marketplace` and filter plugins to show by `enabled`, but we
will add `id` and `available` as fields returned from `plugin/list`
soon.

####Tests

Added tests, verified locally.
2026-03-06 11:08:36 -08:00
Owen Lin
3449e00bc9
feat(otel, core): record turn TTFT and TTFM metrics in codex-core (#13630)
### Summary
This adds turn-level latency metrics for the first model output and the
first completed agent message.
- `codex.turn.ttft.duration_ms` starts at turn start and records on the
first output signal we see from the model. That includes normal
assistant text, reasoning deltas, and non-text outputs like tool-call
items.
- `codex.turn.ttfm.duration_ms` also starts at turn start, but it
records when the first agent message finishes streaming rather than when
its first delta arrives.

### Implementation notes
The timing is tracked in codex-core, not app-server, so the definition
stays consistent across CLI, TUI, and app-server clients.

I reused the existing turn lifecycle boundary that already drives
`codex.turn.e2e_duration_ms`, stored the turn start timestamp in turn
state, and record each metric once per turn.

I also wired the new metric names into the OTEL runtime metrics summary
so they show up in the same in-memory/debug snapshot path as the
existing timing metrics.
2026-03-06 10:23:48 -08:00