This is a significant change to how layers of configuration are applied.
In particular, the `ConfigLayerStack` now has two important fields:
- `layers: Vec<ConfigLayerEntry>`
- `requirements: ConfigRequirements`
We merge `TomlValue`s across the layers, but they are subject to
`ConfigRequirements` before creating a `Config`.
How I would review this PR:
- start with `codex-rs/app-server-protocol/src/protocol/v2.rs` and note
the new variants added to the `ConfigLayerSource` enum:
`LegacyManagedConfigTomlFromFile` and `LegacyManagedConfigTomlFromMdm`
- note that `ConfigLayerSource` now has a `precedence()` method and
implements `PartialOrd`
- `codex-rs/core/src/config_loader/layer_io.rs` is responsible for
loading "admin" preferences from `/etc/codex/managed_config.toml` and
MDM. Because `/etc/codex/managed_config.toml` is now deprecated in favor
of `/etc/codex/requirements.toml` and `/etc/codex/config.toml`, we now
include some extra information on the `LoadedConfigLayers` returned in
`layer_io.rs`.
- `codex-rs/core/src/config_loader/mod.rs` has major changes to
`load_config_layers_state()`, which is what produces `ConfigLayerStack`.
The docstring has the new specification and describes the various layers
that will be loaded and the precedence order.
- It uses the information from `LoaderOverrides` "twice," both in the
spirit of legacy support:
- We use one instances to derive an instance of `ConfigRequirements`.
Currently, the only field in `managed_config.toml` that contributes to
`ConfigRequirements` is `approval_policy`. This PR introduces
`Constrained::allow_only()` to support this.
- We use a clone of `LoaderOverrides` to derive
`ConfigLayerSource::LegacyManagedConfigTomlFromFile` and
`ConfigLayerSource::LegacyManagedConfigTomlFromMdm` layers, as
appropriate. As before, this ends up being a "best effort" at enterprise
controls, but is enforcement is not guaranteed like it is for
`ConfigRequirements`.
- Now we only create a "user" layer if `$CODEX_HOME/config.toml` exists.
(Previously, a user layer was always created for `ConfigLayerStack`.)
- Similarly, we only add a "session flags" layer if there are CLI
overrides.
- `config_loader/state.rs` contains the updated implementation for
`ConfigLayerStack`. Note the public API is largely the same as before,
but the implementation is quite different. We leverage the fact that
`ConfigLayerSource` is now `PartialOrd` to ensure layers are in the
correct order.
- A `Config` constructed via `ConfigBuilder.build()` will use
`load_config_layers_state()` to create the `ConfigLayerStack` and use
the associated `ConfigRequirements` when constructing the `Config`
object.
- That said, a `Config` constructed via
`Config::load_from_base_config_with_overrides()` does _not_ yet use
`ConfigBuilder`, so it creates a `ConfigRequirements::default()` instead
of loading a proper `ConfigRequirements`. I will fix this in a
subsequent PR.
Then the following files are mostly test changes:
```
codex-rs/app-server/tests/suite/v2/config_rpc.rs
codex-rs/core/src/config/service.rs
codex-rs/core/src/config_loader/tests.rs
```
Again, because we do not always include "user" and "session flags"
layers when the contents are empty, `ConfigLayerStack` sometimes has
fewer layers than before (and the precedence order changed slightly),
which is the main reason integration tests changed.
1. Remove PUBLIC skills and introduce SYSTEM skills embedded in the
binary and installed into $CODEX_HOME/skills/.system at startup.
2. Skills are now always enabled (feature flag removed).
3. Update skills/list to accept forceReload and plumb it through (not
used by clients yet).
This attempts to tighten up the types related to "config layers."
Currently, `ConfigLayerEntry` is defined as follows:
bef36f4ae7/codex-rs/core/src/config_loader/state.rs (L19-L25)
but the `source` field is a bit of a lie, as:
- for `ConfigLayerName::Mdm`, it is
`"com.openai.codex/config_toml_base64"`
- for `ConfigLayerName::SessionFlags`, it is `"--config"`
- for `ConfigLayerName::User`, it is `"config.toml"` (just the file
name, not the path to the `config.toml` on disk that was read)
- for `ConfigLayerName::System`, it seems like it is usually
`/etc/codex/managed_config.toml` in practice, though on Windows, it is
`%CODEX_HOME%/managed_config.toml`:
bef36f4ae7/codex-rs/core/src/config_loader/layer_io.rs (L84-L101)
All that is to say, in three out of the four `ConfigLayerName`, `source`
is a `PathBuf` that is not an absolute path (or even a true path).
This PR tries to uplevel things by eliminating `source` from
`ConfigLayerEntry` and turning `ConfigLayerName` into a disjoint union
named `ConfigLayerSource` that has the appropriate metadata for each
variant, favoring the use of `AbsolutePathBuf` where appropriate:
```rust
pub enum ConfigLayerSource {
/// Managed preferences layer delivered by MDM (macOS only).
#[serde(rename_all = "camelCase")]
#[ts(rename_all = "camelCase")]
Mdm { domain: String, key: String },
/// Managed config layer from a file (usually `managed_config.toml`).
#[serde(rename_all = "camelCase")]
#[ts(rename_all = "camelCase")]
System { file: AbsolutePathBuf },
/// Session-layer overrides supplied via `-c`/`--config`.
SessionFlags,
/// User config layer from a file (usually `config.toml`).
#[serde(rename_all = "camelCase")]
#[ts(rename_all = "camelCase")]
User { file: AbsolutePathBuf },
}
```
1. Adds SkillScope::Public end-to-end (core + protocol) and loads skills
from the public cache directory
2. Improves repo skill discovery by searching upward for the nearest
.codex/skills within a git repo
3. Deduplicates skills by name with deterministic ordering to avoid
duplicates across sources
4. Fixes garbled “Skill errors” overlay rendering by preventing pending
history lines from being injected during the modal
5. Updates the project docs “Skills” intro wording to avoid hardcoded
paths
We want to rely on server-side auto-compaction instead of having the
client trigger context compaction manually. This API was stubbed as a
placeholder and never implemented.
## Problem
When generating JSON schemas on Windows, the `codex app-server
generate-json-schema` command fails with a filename error:
```text
Error: Failed to write JSON schema for Option<()>
Caused by:
0: Failed to write .\Option<()>.json
1: The filename, directory name, or volume label syntax is incorrect. (os error 123)
```
This occurs because Windows doesn't allow certain characters in
filenames, specifically the angle brackets **<>** used in the
**Option<()>** type name.
## Root Cause
The schema generation process attempts to create individual JSON files
for each schema definition, including `Option<()>`. However, the
characters `<` and `>` are invalid in Windows filenames, causing the
file creation to fail.
## Solution
The fix extends the existing `IGNORED_DEFINITIONS` constant (which was
already being used in the **bundle generation**) to also skip
`Option<()>` when generating individual JSON schema files. This
maintains consistency with the existing behavior where `Option<()>` is
excluded from the bundled schema.
---
close#7479
refactor the way we load and manage skills:
1. Move skill discovery/caching into SkillsManager and reuse it across
sessions.
2. Add the skills/list API (Op::ListSkills/SkillsListResponse) to fetch
skills for one or more cwds. Also update app-server for VSCE/App;
3. Trigger skills/list during session startup so UIs preload skills and
handle errors immediately.
Changes the `writable_roots` field of the `WorkspaceWrite` variant of
the `SandboxPolicy` enum from `Vec<PathBuf>` to `Vec<AbsolutePathBuf>`.
This is helpful because now callers can be sure the value is an absolute
path rather than a relative one. (Though when using an absolute path in
a Seatbelt config policy, we still have to _canonicalize_ it first.)
Because `writable_roots` can be read from a config file, it is important
that we are able to resolve relative paths properly using the parent
folder of the config file as the base path.
Our previous design of cancellation endpoint is not idempotent, which
caused a bunch of flaky tests. Make app server just returned a not_found
status instead of throwing an error if the login id is not found. Keep
V1 endpoint behavior the same.
### Summary
* Added `mcpServer/oauthLogin` in app server for supporting in session
MCP server login
* Added `McpServerOauthLoginParams` and `McpServerOauthLoginResponse` to
support above method with response returning the auth URL for consumer
to open browser or display accordingly.
* Added `McpServerOauthLoginCompletedNotification` which the app server
would emit on MCP server login success or failure (i.e. timeout).
* Refactored rmcp-client oath_login to have the ability on starting a
auth server which the codex_message_processor uses for in-session auth.
VSCE renders `codex/event/stream_error` (automatically retried, e.g.
`"Reconnecting... 1/n"`) and `codex/event/error` (terminal errors)
differently, so add `will_retry` on ErrorNotification to indicate this.
When we are writing to config using `config/value/write` or
`config/batchWrite`, it always require a `config/read` before it right
now in order to get the correct file path to write to. make this
optional so we read from the default user config file if this is not
passed in.
- Introduce `openai_models` in `/core`
- Move `PRESETS` under it
- Move `ModelPreset`, `ModelUpgrade`, `ReasoningEffortPreset`,
`ReasoningEffortPreset`, and `ReasoningEffortPreset` to `protocol`
- Introduce `Op::ListModels` and `EventMsg::AvailableModels`
Next steps:
- migrate `app-server` and `tui` to use the introduced Operation
This item is sent as a turn notification instead: `turn/plan/updated`,
similar to Turn diffs (which is `turn/diff/updated`).
We treat these concepts as ephemeral compared to Items which are usually
persisted.
### Summary
Added `mcp/servers/list` which is equivalent to `/mcp` slash command in
CLI for response. This will be used in VSCE MCP settings to show log in
status, available tools etc.
The problem with using `serde(flatten)` on Turn status is that it
conditionally serializes the `error` field, which is not the pattern we
want in API v2 where all fields on an object should always be returned.
```
#[derive(Serialize, Deserialize, Debug, Clone, PartialEq, JsonSchema, TS)]
#[serde(rename_all = "camelCase")]
#[ts(export_to = "v2/")]
pub struct Turn {
pub id: String,
/// Only populated on a `thread/resume` response.
/// For all other responses and notifications returning a Turn,
/// the items field will be an empty list.
pub items: Vec<ThreadItem>,
#[serde(flatten)]
pub status: TurnStatus,
}
#[derive(Serialize, Deserialize, Debug, Clone, PartialEq, JsonSchema, TS)]
#[serde(tag = "status", rename_all = "camelCase")]
#[ts(tag = "status", export_to = "v2/")]
pub enum TurnStatus {
Completed,
Interrupted,
Failed { error: TurnError },
InProgress,
}
```
serializes to:
```
{
"id": "turn-123",
"items": [],
"status": "completed"
}
{
"id": "turn-123",
"items": [],
"status": "failed",
"error": {
"message": "Tool timeout",
"codexErrorInfo": null
}
}
```
Instead we want:
```
{
"id": "turn-123",
"items": [],
"status": "completed",
"error": null
}
{
"id": "turn-123",
"items": [],
"status": "failed",
"error": {
"message": "Tool timeout",
"codexErrorInfo": null
}
}
```
This is the V2 version of `EventMsg::TurnDiff`.
I decided to expose this as a `turn/*` notification as opposed to an
Item to make it more explicit that the diff is accumulated throughout a
turn (every `apply_patch` call updates the running diff). Also, I don't
think it's worth persisting this diff as an Item because it can always
be recomputed from the actual `FileChange` Items.
Add `thread_id` and `turn_id` to `item/started`, `item/completed`, and
`error` notifications. Otherwise the client will have a hard time
knowing which thread & turn (if multiple threads are running in
parallel) a new item/error is for.
Also add `thread_id` to `turn/started` and `turn/completed`.
Add a `Declined` status for when we request an approval from the user
and the user declines. This allows us to distinguish from commands that
actually ran, but failed.
This behaves similarly to apply_patch / FileChange, which does the same
thing.
This PR adds the API V2 version of the apply_patch approval flow, which
centers around `ThreadItem::FileChange`.
This PR wires the new RPC (`item/fileChange/requestApproval`, V2 only)
and related events (`item/started`, `item/completed` for
`ThreadItem::FileChange`, which are emitted in both V1 and V2) through
the app-server
protocol. The new approval RPC is only sent when the user initiates a
turn with the new `turn/start` API so we don't break backwards
compatibility with VSCE.
Similar to https://github.com/openai/codex/pull/6758, the approach I
took was to make as few changes to the Codex core as possible,
leveraging existing `EventMsg` core events, and translating those in
app-server. I did have to add a few additional fields to
`EventMsg::PatchApplyBegin` and `EventMsg::PatchApplyEnd`, but those
were fairly lightweight.
However, the `EventMsg`s emitted by core are the following:
```
1) Auto-approved (no request for approval)
- EventMsg::PatchApplyBegin
- EventMsg::PatchApplyEnd
2) Approved by user
- EventMsg::ApplyPatchApprovalRequest
- EventMsg::PatchApplyBegin
- EventMsg::PatchApplyEnd
3) Declined by user
- EventMsg::ApplyPatchApprovalRequest
- EventMsg::PatchApplyBegin
- EventMsg::PatchApplyEnd
```
For a request triggering an approval, this would result in:
```
item/fileChange/requestApproval
item/started
item/completed
```
which is different from the `ThreadItem::CommandExecution` flow
introduced in https://github.com/openai/codex/pull/6758, which does the
below and is preferable:
```
item/started
item/commandExecution/requestApproval
item/completed
```
To fix this, we leverage `TurnSummaryStore` on codex_message_processor
to store a little bit of state, allowing us to fire `item/started` and
`item/fileChange/requestApproval` whenever we receive the underlying
`EventMsg::ApplyPatchApprovalRequest`, and no-oping when we receive the
`EventMsg::PatchApplyBegin` later.
This is much less invasive than modifying the order of EventMsg within
core (I tried).
The resulting payloads:
```
{
"method": "item/started",
"params": {
"item": {
"changes": [
{
"diff": "Hello from Codex!\n",
"kind": "add",
"path": "/Users/owen/repos/codex/codex-rs/APPROVAL_DEMO.txt"
}
],
"id": "call_Nxnwj7B3YXigfV6Mwh03d686",
"status": "inProgress",
"type": "fileChange"
}
}
}
```
```
{
"id": 0,
"method": "item/fileChange/requestApproval",
"params": {
"grantRoot": null,
"itemId": "call_Nxnwj7B3YXigfV6Mwh03d686",
"reason": null,
"threadId": "019a9e11-8295-7883-a283-779e06502c6f",
"turnId": "1"
}
}
```
```
{
"id": 0,
"result": {
"decision": "accept"
}
}
```
```
{
"method": "item/completed",
"params": {
"item": {
"changes": [
{
"diff": "Hello from Codex!\n",
"kind": "add",
"path": "/Users/owen/repos/codex/codex-rs/APPROVAL_DEMO.txt"
}
],
"id": "call_Nxnwj7B3YXigfV6Mwh03d686",
"status": "completed",
"type": "fileChange"
}
}
}
```
Expand the rate-limit cache/TUI: store credit snapshots alongside
primary and secondary windows, render “Credits” when the backend reports
they exist (unlimited vs rounded integer balances)
This PR allows clients to render historical messages when resuming a
thread via `thread/resume` by reading from the list of `EventMsg`
payloads loaded from the rollout, and then transforming them into Turns
and ThreadItems to be returned on the `Thread` object.
This is implemented by leveraging `SessionConfiguredNotification` which
returns this list of `EventMsg` objects when resuming a conversation,
and then applying a stateful `ThreadHistoryBuilder` that parses from
this EventMsg log and transforms it into Turns and ThreadItems.
Note that we only persist a subset of `EventMsg`s in a rollout as
defined in `policy.rs`, so we lose fidelity whenever we resume a thread
compared to when we streamed the thread's turns originally. However,
this behavior is at parity with the legacy API.
## Summary
On app-server startup, detect whether the experimental sandbox is
enabled, and send a notification .
**Note**
New conversations will not respect the feature because we [ignore cli
overrides in
NewConversation](a75321a64c/codex-rs/app-server/src/codex_message_processor.rs (L1237-L1252)).
However, this should be okay, since we don't actually use config for
this, we use a [global
variable](87cce88f48/codex-rs/core/src/safety.rs (L105-L110)).
We should carefully unwind this setup at some point.
## Testing
- [ ] In progress: testing locally
---------
Co-authored-by: jif-oai <jif@openai.com>
# External (non-OpenAI) Pull Request Requirements
Before opening this Pull Request, please read the dedicated
"Contributing" markdown file or your PR may be closed:
https://github.com/openai/codex/blob/main/docs/contributing.md
If your PR conforms to our contribution guidelines, replace this text
with a detailed and high quality description of your changes.
Include a link to a bug report or enhancement request.
This adds the following fields to `ThreadStartResponse` and
`ThreadResumeResponse`:
```rust
pub model: String,
pub model_provider: String,
pub cwd: PathBuf,
pub approval_policy: AskForApproval,
pub sandbox: SandboxPolicy,
pub reasoning_effort: Option<ReasoningEffort>,
```
This is important because these fields are optional in
`ThreadStartParams` and `ThreadResumeParams`, so the caller needs to be
able to determine what values were ultimately used to start/resume the
conversation. (Though note that any of these could be changed later
between turns in the conversation.)
Though to get this information reliably, it must be read from the
internal `SessionConfiguredEvent` that is created in response to the
start of a conversation. Because `SessionConfiguredEvent` (as defined in
`codex-rs/protocol/src/protocol.rs`) did not have all of these fields, a
number of them had to be added as part of this PR.
Because `SessionConfiguredEvent` is referenced in many tests, test
instances of `SessionConfiguredEvent` had to be updated, as well, which
is why this PR touches so many files.