Commit graph

104 commits

Author SHA1 Message Date
Owen Lin
d6c30ed25e
[app-server] feat: v2 apply_patch approval flow (#6760)
This PR adds the API V2 version of the apply_patch approval flow, which
centers around `ThreadItem::FileChange`.

This PR wires the new RPC (`item/fileChange/requestApproval`, V2 only)
and related events (`item/started`, `item/completed` for
`ThreadItem::FileChange`, which are emitted in both V1 and V2) through
the app-server
protocol. The new approval RPC is only sent when the user initiates a
turn with the new `turn/start` API so we don't break backwards
compatibility with VSCE.

Similar to https://github.com/openai/codex/pull/6758, the approach I
took was to make as few changes to the Codex core as possible,
leveraging existing `EventMsg` core events, and translating those in
app-server. I did have to add a few additional fields to
`EventMsg::PatchApplyBegin` and `EventMsg::PatchApplyEnd`, but those
were fairly lightweight.

However, the `EventMsg`s emitted by core are the following:
```
1) Auto-approved (no request for approval)

- EventMsg::PatchApplyBegin
- EventMsg::PatchApplyEnd

2) Approved by user
- EventMsg::ApplyPatchApprovalRequest
- EventMsg::PatchApplyBegin
- EventMsg::PatchApplyEnd

3) Declined by user
- EventMsg::ApplyPatchApprovalRequest
- EventMsg::PatchApplyBegin
- EventMsg::PatchApplyEnd
```

For a request triggering an approval, this would result in:
```
item/fileChange/requestApproval
item/started
item/completed
```

which is different from the `ThreadItem::CommandExecution` flow
introduced in https://github.com/openai/codex/pull/6758, which does the
below and is preferable:
```
item/started
item/commandExecution/requestApproval
item/completed
```

To fix this, we leverage `TurnSummaryStore` on codex_message_processor
to store a little bit of state, allowing us to fire `item/started` and
`item/fileChange/requestApproval` whenever we receive the underlying
`EventMsg::ApplyPatchApprovalRequest`, and no-oping when we receive the
`EventMsg::PatchApplyBegin` later.

This is much less invasive than modifying the order of EventMsg within
core (I tried).

The resulting payloads:
```
{
  "method": "item/started",
  "params": {
    "item": {
      "changes": [
        {
          "diff": "Hello from Codex!\n",
          "kind": "add",
          "path": "/Users/owen/repos/codex/codex-rs/APPROVAL_DEMO.txt"
        }
      ],
      "id": "call_Nxnwj7B3YXigfV6Mwh03d686",
      "status": "inProgress",
      "type": "fileChange"
    }
  }
}
```

```
{
  "id": 0,
  "method": "item/fileChange/requestApproval",
  "params": {
    "grantRoot": null,
    "itemId": "call_Nxnwj7B3YXigfV6Mwh03d686",
    "reason": null,
    "threadId": "019a9e11-8295-7883-a283-779e06502c6f",
    "turnId": "1"
  }
}
```

```
{
  "id": 0,
  "result": {
    "decision": "accept"
  }
}
```

```
{
  "method": "item/completed",
  "params": {
    "item": {
      "changes": [
        {
          "diff": "Hello from Codex!\n",
          "kind": "add",
          "path": "/Users/owen/repos/codex/codex-rs/APPROVAL_DEMO.txt"
        }
      ],
      "id": "call_Nxnwj7B3YXigfV6Mwh03d686",
      "status": "completed",
      "type": "fileChange"
    }
  }
}
```
2025-11-19 20:13:31 -08:00
Celia Chen
72a1453ac5
Revert "[core] add optional status_code to error events (#6865)" (#6955)
This reverts commit c2ec477d93.

# External (non-OpenAI) Pull Request Requirements

Before opening this Pull Request, please read the dedicated
"Contributing" markdown file or your PR may be closed:
https://github.com/openai/codex/blob/main/docs/contributing.md

If your PR conforms to our contribution guidelines, replace this text
with a detailed and high quality description of your changes.

Include a link to a bug report or enhancement request.
2025-11-20 01:26:14 +00:00
Celia Chen
c2ec477d93
[core] add optional status_code to error events (#6865)
We want to better uncover error status code for clients. Add an optional
status_code to error events (thread error, error, stream error) so app
server could uncover the status code from the client side later.

in event log:
```
< {
<   "method": "codex/event/stream_error",
<   "params": {
<     "conversationId": "019a9a32-f576-7292-9711-8e57e8063536",
<     "id": "0",
<     "msg": {
<       "message": "Reconnecting... 5/5",
<       "status_code": 401,
<       "type": "stream_error"
<     }
<   }
< }
< {
<   "method": "codex/event/error",
<   "params": {
<     "conversationId": "019a9a32-f576-7292-9711-8e57e8063536",
<     "id": "0",
<     "msg": {
<       "message": "exceeded retry limit, last status: 401 Unauthorized, request id: 9a0cb03a485067f7-SJC",
<       "status_code": 401,
<       "type": "error"
<     }
<   }
< }
```
2025-11-19 19:51:21 +00:00
Michael Bolin
a75321a64c
fix: add more fields to ThreadStartResponse and ThreadResumeResponse (#6847)
This adds the following fields to `ThreadStartResponse` and
`ThreadResumeResponse`:

```rust
    pub model: String,
    pub model_provider: String,
    pub cwd: PathBuf,
    pub approval_policy: AskForApproval,
    pub sandbox: SandboxPolicy,
    pub reasoning_effort: Option<ReasoningEffort>,
```

This is important because these fields are optional in
`ThreadStartParams` and `ThreadResumeParams`, so the caller needs to be
able to determine what values were ultimately used to start/resume the
conversation. (Though note that any of these could be changed later
between turns in the conversation.)

Though to get this information reliably, it must be read from the
internal `SessionConfiguredEvent` that is created in response to the
start of a conversation. Because `SessionConfiguredEvent` (as defined in
`codex-rs/protocol/src/protocol.rs`) did not have all of these fields, a
number of them had to be added as part of this PR.

Because `SessionConfiguredEvent` is referenced in many tests, test
instances of `SessionConfiguredEvent` had to be updated, as well, which
is why this PR touches so many files.
2025-11-18 21:18:43 -08:00
Ahmed Ibrahim
ddcc60a085
Update defaults to gpt-5.1 (#6652)
## Summary
- update documentation, example configs, and automation defaults to
reference gpt-5.1 / gpt-5.1-codex
- bump the CLI and core configuration defaults, model presets, and error
messaging to the new models while keeping the model-family/tool coverage
for legacy slugs
- refresh tests, fixtures, and TUI snapshots so they expect the upgraded
defaults

## Testing
- `cargo test -p codex-core
config::tests::test_precedence_fixture_with_gpt5_profile`


------
[Codex
Task](https://chatgpt.com/codex/tasks/task_i_6916c5b3c2b08321ace04ee38604fc6b)
2025-11-17 17:40:11 -08:00
Owen Lin
cecbd5b021
[app-server] feat: add v2 command execution approval flow (#6758)
This PR adds the API V2 version of the command‑execution approval flow
for the shell tool.

This PR wires the new RPC (`item/commandExecution/requestApproval`, V2
only) and related events (`item/started`, `item/completed`, and
`item/commandExecution/delta`, which are emitted in both V1 and V2)
through the app-server
protocol. The new approval RPC is only sent when the user initiates a
turn with the new `turn/start` API so we don't break backwards
compatibility with VSCE.

The approach I took was to make as few changes to the Codex core as
possible, leveraging existing `EventMsg` core events, and translating
those in app-server. I did have to add additional fields to
`EventMsg::ExecCommandEndEvent` to capture the command's input so that
app-server can statelessly transform these events to a
`ThreadItem::CommandExecution` item for the `item/completed` event.

Once we stabilize the API and it's complete enough for our partners, we
can work on migrating the core to be aware of command execution items as
a first-class concept.

**Note**: We'll need followup work to make sure these APIs work for the
unified exec tool, but will wait til that's stable and landed before
doing a pass on app-server.

Example payloads below:
```
{
  "method": "item/started",
  "params": {
    "item": {
      "aggregatedOutput": null,
      "command": "/bin/zsh -lc 'touch /tmp/should-trigger-approval'",
      "cwd": "/Users/owen/repos/codex/codex-rs",
      "durationMs": null,
      "exitCode": null,
      "id": "call_lNWWsbXl1e47qNaYjFRs0dyU",
      "parsedCmd": [
        {
          "cmd": "touch /tmp/should-trigger-approval",
          "type": "unknown"
        }
      ],
      "status": "inProgress",
      "type": "commandExecution"
    }
  }
}
```

```
{
  "id": 0,
  "method": "item/commandExecution/requestApproval",
  "params": {
    "itemId": "call_lNWWsbXl1e47qNaYjFRs0dyU",
    "parsedCmd": [
      {
        "cmd": "touch /tmp/should-trigger-approval",
        "type": "unknown"
      }
    ],
    "reason": "Need to create file in /tmp which is outside workspace sandbox",
    "risk": null,
    "threadId": "019a93e8-0a52-7fe3-9808-b6bc40c0989a",
    "turnId": "1"
  }
}
```

```
{
  "id": 0,
  "result": {
    "acceptSettings": {
      "forSession": false
    },
    "decision": "accept"
  }
}
```

```
{
  "params": {
    "item": {
      "aggregatedOutput": null,
      "command": "/bin/zsh -lc 'touch /tmp/should-trigger-approval'",
      "cwd": "/Users/owen/repos/codex/codex-rs",
      "durationMs": 224,
      "exitCode": 0,
      "id": "call_lNWWsbXl1e47qNaYjFRs0dyU",
      "parsedCmd": [
        {
          "cmd": "touch /tmp/should-trigger-approval",
          "type": "unknown"
        }
      ],
      "status": "completed",
      "type": "commandExecution"
    }
  }
}
```
2025-11-18 00:23:54 +00:00
jif-oai
63c8c01f40
feat: better UI for unified_exec (#6515)
<img width="376" height="132" alt="Screenshot 2025-11-12 at 17 36 22"
src="https://github.com/user-attachments/assets/ce693f0d-5ca0-462e-b170-c20811dcc8d5"
/>
2025-11-14 16:31:12 +01:00
pakrym-oai
6c384eb9c6
tests: replace mount_sse_once_match with mount_sse_once for SSE mocking (#6640) 2025-11-13 18:04:05 -08:00
Dan Hernandez
b4a53aef47
feat: Add support for --add-dir to exec and TypeScript SDK (#6565)
## Summary

Adds support for specifying additional directories in the TypeScript SDK
through a new `additionalDirectories` option in `ThreadOptions`.

## Changes

- Added `additionalDirectories` parameter to `ThreadOptions` interface
- Updated `CodexExec` to accept and pass through additional directories
via the `--config` flag for `sandbox_workspace_write.writable_roots`
- Added comprehensive test coverage for the new functionality

## Test plan

- Added test case that verifies `additionalDirectories` is correctly
passed as repeated flags
- Existing tests continue to pass

---------

Co-authored-by: Claude <noreply@anthropic.com>
2025-11-13 13:47:10 -08:00
Andrew Nikolin
131c384361
Fix warning message phrasing (#6446)
Small fix for sentence phrasing in the warning message

Co-authored-by: AndrewNikolin <877163+AndrewNikolin@users.noreply.github.com>
2025-11-09 22:12:28 -08:00
Ahmed Ibrahim
c8ebb2a0dc
Add warning on compact (#6052)
This PR introduces the ability for `core` to send `warnings` as it can
send `errors. It also sends a warning on compaction.

<img width="811" height="187" alt="image"
src="https://github.com/user-attachments/assets/0947a42d-b720-420d-b7fd-115f8a65a46a"
/>
2025-10-31 13:27:33 -07:00
pakrym-oai
815ae4164a
[exec] Add MCP tool arguments and results (#5899)
Extends mcp_tool_call item to include arguments and results.
2025-10-29 14:23:57 -07:00
Abhishek Bhardwaj
89591e4246
feature: Add "!cmd" user shell execution (#2471)
feature: Add "!cmd" user shell execution

This change lets users run local shell commands directly from the TUI by
prefixing their input with ! (e.g. !ls). Output is truncated to keep the
exec cell usable, and Ctrl-C cleanly
  interrupts long-running commands (e.g. !sleep 10000).

**Summary of changes**

- Route Op::RunUserShellCommand through a dedicated UserShellCommandTask
(core/src/tasks/user_shell.rs), keeping the task logic out of codex.rs.
- Reuse the existing tool router: the task constructs a ToolCall for the
local_shell tool and relies on ShellHandler, so no manual MCP tool
lookup is required.
- Emit exec lifecycle events (ExecCommandBegin/ExecCommandEnd) so the
TUI can show command metadata, live output, and exit status.

**End-to-end flow**

  **TUI handling**

1. ChatWidget::submit_user_message (TUI) intercepts messages starting
with !.
2. Non-empty commands dispatch Op::RunUserShellCommand { command };
empty commands surface a help hint.
3. No UserInput items are created, so nothing is enqueued for the model.

  **Core submission loop**
4. The submission loop routes the op to handlers::run_user_shell_command
(core/src/codex.rs).
5. A fresh TurnContext is created and Session::spawn_user_shell_command
enqueues UserShellCommandTask.

  **Task execution**
6. UserShellCommandTask::run emits TaskStartedEvent, formats the
command, and prepares a ToolCall targeting local_shell.
  7. ToolCallRuntime::handle_tool_call dispatches to ShellHandler.

  **Shell tool runtime**
8. ShellHandler::run_exec_like launches the process via the unified exec
runtime, honoring sandbox and shell policies, and emits
ExecCommandBegin/End.
9. Stdout/stderr are captured for the UI, but the task does not turn the
resulting ToolOutput into a model response.

  **Completion**
10. After ExecCommandEnd, the task finishes without an assistant
message; the session marks it complete and the exec cell displays the
final output.

  **Conversation context**

- The command and its output never enter the conversation history or the
model prompt; the flow is local-only.
  - Only exec/task events are emitted for UI rendering.

**Demo video**


https://github.com/user-attachments/assets/fcd114b0-4304-4448-a367-a04c43e0b996
2025-10-29 00:31:20 -07:00
Josh McKinney
66a4b89822
feat(tui): clarify Windows auto mode requirements (#5568)
## Summary
- Coerce Windows `workspace-write` configs back to read-only, surface
the forced downgrade in the approvals popup,
  and funnel users toward WSL or Full Access.
- Add WSL installation instructions to the Auto preset on Windows while
keeping the preset available for other
  platforms.
- Skip the trust-on-first-run prompt on native Windows so new folders
remain read-only without additional
  confirmation.
- Expose a structured sandbox policy resolution from config to flag
Windows downgrades and adjust tests (core,
exec, TUI) to reflect the new behavior; provide a Windows-only approvals
snapshot.

  ## Testing
  - cargo fmt
- cargo test -p codex-core
config::tests::add_dir_override_extends_workspace_writable_roots
- cargo test -p codex-exec
suite::resume::exec_resume_preserves_cli_configuration_overrides
- cargo test -p codex-tui
chatwidget::tests::approvals_selection_popup_snapshot
- cargo test -p codex-tui
approvals_popup_includes_wsl_note_for_auto_mode
  - cargo test -p codex-tui windows_skips_trust_prompt
  - just fix -p codex-core
  - just fix -p codex-tui
2025-10-28 01:19:32 +00:00
jif-oai
f522aafb7f
chore: drop approve all (#5503)
Not needed anymore
2025-10-22 16:55:06 +01:00
jif-oai
774892c6d7
feat: add auto-approval for codex exec (#5043) 2025-10-15 19:03:54 +01:00
pakrym-oai
60f9e85c16
Set codex SDK TypeScript originator (#4894)
## Summary
- ensure the TypeScript SDK sets CODEX_INTERNAL_ORIGINATOR_OVERRIDE to
codex_sdk_ts when spawning the Codex CLI
- extend the responses proxy test helper to capture request headers for
assertions
- add coverage that verifies Codex threads launched from the TypeScript
SDK send the codex_sdk_ts originator header

## Testing
- Not Run (not requested)


------
https://chatgpt.com/codex/tasks/task_i_68e561b125248320a487f129093d16e7
2025-10-07 14:06:41 -07:00
pakrym-oai
35a770e871
Simplify request body assertions (#4845)
We'll have a lot more test like these
2025-10-07 09:56:39 +01:00
pakrym-oai
191d620707
Use response helpers when mounting SSE test responses (#4783)
## Summary
- replace manual wiremock SSE mounts in the compact suite with the
shared response helpers
- simplify the exec auth_env integration test by using the
mount_sse_once_match helper
- rely on mount_sse_sequence plus server request collection to replace
the bespoke SeqResponder utility in tests

## Testing
- just fmt

------
https://chatgpt.com/codex/tasks/task_i_68e2e238f2a88320a337f0b9e4098093
2025-10-05 21:58:16 +00:00
pakrym-oai
aecbe0f333
Add helper for response created SSE events in tests (#4758)
## Summary
- add a reusable `ev_response_created` helper that builds
`response.created` SSE events for integration tests
- update the exec and core integration suites to use the new helper
instead of repeating manual JSON literals
- keep the streaming fixtures consistent by relying on the shared helper
in every touched test

## Testing
- `just fmt`


------
https://chatgpt.com/codex/tasks/task_i_68e1fe885bb883208aafffb94218da61
2025-10-05 21:11:43 +00:00
Michael Bolin
042d4d55d9
feat: codex exec writes only the final message to stdout (#4644)
This updates `codex exec` so that, by default, most of the agent's
activity is written to stderr so that only the final agent message is
written to stdout. This makes it easier to pipe `codex exec` into
another tool without extra filtering.

I introduced `#![deny(clippy::print_stdout)]` to help enforce this
change and renamed the `ts_println!()` macro to `ts_msg()` because (1)
it no longer calls `println!()` and (2), `ts_eprintln!()` seemed too
long of a name.

While here, this also adds `-o` as an alias for `--output-last-message`.

Fixes https://github.com/openai/codex/issues/1670
2025-10-03 16:22:12 +00:00
jif-oai
33d3ecbccc
chore: refactor tool handling (#4510)
# Tool System Refactor

- Centralizes tool definitions and execution in `core/src/tools/*`:
specs (`spec.rs`), handlers (`handlers/*`), router (`router.rs`),
registry/dispatch (`registry.rs`), and shared context (`context.rs`).
One registry now builds the model-visible tool list and binds handlers.
- Router converts model responses to tool calls; Registry dispatches
with consistent telemetry via `codex-rs/otel` and unified error
handling. Function, Local Shell, MCP, and experimental `unified_exec`
all flow through this path; legacy shell aliases still work.
- Rationale: reduce per‑tool boilerplate, keep spec/handler in sync, and
make adding tools predictable and testable.

Example: `read_file`
- Spec: `core/src/tools/spec.rs` (see `create_read_file_tool`,
registered by `build_specs`).
- Handler: `core/src/tools/handlers/read_file.rs` (absolute `file_path`,
1‑indexed `offset`, `limit`, `L#: ` prefixes, safe truncation).
- E2E test: `core/tests/suite/read_file.rs` validates the tool returns
the requested lines.

## Next steps:
- Decompose `handle_container_exec_with_params` 
- Add parallel tool calls
2025-10-03 13:21:06 +01:00
pakrym-oai
c405d8c06c
Rename assistant message to agent message and fix item type field naming (#4610)
Naming cleanup
2025-10-02 15:07:14 -07:00
pakrym-oai
2f6fb37d72
Support CODEX_API_KEY for codex exec (#4615)
Allows to set API key per invocation of `codex exec`
2025-10-02 09:59:45 -07:00
Michael Bolin
5881c0d6d4
fix: remove mcp-types from app server protocol (#4537)
We continue the separation between `codex app-server` and `codex
mcp-server`.

In particular, we introduce a new crate, `codex-app-server-protocol`,
and migrate `codex-rs/protocol/src/mcp_protocol.rs` into it, renaming it
`codex-rs/app-server-protocol/src/protocol.rs`.

Because `ConversationId` was defined in `mcp_protocol.rs`, we move it
into its own file, `codex-rs/protocol/src/conversation_id.rs`, and
because it is referenced in a ton of places, we have to touch a lot of
files as part of this PR.

We also decide to get away from proper JSON-RPC 2.0 semantics, so we
also introduce `codex-rs/app-server-protocol/src/jsonrpc_lite.rs`, which
is basically the same `JSONRPCMessage` type defined in `mcp-types`
except with all of the `"jsonrpc": "2.0"` removed.

Getting rid of `"jsonrpc": "2.0"` makes our serialization logic
considerably simpler, as we can lean heavier on serde to serialize
directly into the wire format that we use now.
2025-10-01 02:16:26 +00:00
pakrym-oai
7fc3edf8a7
Remove legacy codex exec --json format (#4525)
`codex exec --json` now maps to the behavior of `codex exec
--experimental-json` with new event and item shapes.

Thread events:
- thread.started
- turn.started
- turn.completed
- turn.failed
- item.started
- item.updated
- item.completed

Item types: 
- assistant_message
- reasoning
- command_execution
- file_change
- mcp_tool_call
- web_search
- todo_list
- error

Sample output:

<details>
`codex exec "list my assigned github issues"  --json | jq`

```
{
  "type": "thread.started",
  "thread_id": "01999ce5-f229-7661-8570-53312bd47ea3"
}
{
  "type": "turn.started"
}
{
  "type": "item.completed",
  "item": {
    "id": "item_0",
    "item_type": "reasoning",
    "text": "**Planning to list assigned GitHub issues**"
  }
}
{
  "type": "item.started",
  "item": {
    "id": "item_1",
    "item_type": "mcp_tool_call",
    "server": "github",
    "tool": "search_issues",
    "status": "in_progress"
  }
}
{
  "type": "item.completed",
  "item": {
    "id": "item_1",
    "item_type": "mcp_tool_call",
    "server": "github",
    "tool": "search_issues",
    "status": "completed"
  }
}
{
  "type": "item.completed",
  "item": {
    "id": "item_2",
    "item_type": "reasoning",
    "text": "**Organizing final message structure**"
  }
}
{
  "type": "item.completed",
  "item": {
    "id": "item_3",
    "item_type": "assistant_message",
    "text": "**Assigned Issues**\n- openai/codex#3267 – “stream error: stream disconnected before completion…” (bug) – last update 2025-09-08\n- openai/codex#3257 – “You've hit your usage limit. Try again in 4 days 20 hours 9 minutes.” – last update 2025-09-23\n- openai/codex#3054 – “reqwest SSL panic (library has no ciphers)” (bug) – last update 2025-09-03\n- openai/codex#3051 – “thread 'main' panicked at linux-sandbox/src/linux_run_main.rs:53:5:” (bug) – last update 2025-09-10\n- openai/codex#3004 – “Auto-compact when approaching context limit” (enhancement) – last update 2025-09-26\n- openai/codex#2916 – “Feature request: Add OpenAI service tier support for cost optimization” – last update 2025-09-12\n- openai/codex#1581 – “stream error: stream disconnected before completion: stream closed before response.complete; retrying...” (bug) – last update 2025-09-17"
  }
}
{
  "type": "turn.completed",
  "usage": {
    "input_tokens": 34785,
    "cached_input_tokens": 12544,
    "output_tokens": 560
  }
}
```

</details>
2025-09-30 17:21:37 -07:00
pakrym-oai
a534356fe1
Wire up web search item (#4511)
Add handling for web search events.
2025-09-30 12:01:17 -07:00
pakrym-oai
ea82f86662
Rename conversation to thread in codex exec (#4482) 2025-09-29 20:18:30 -07:00
pakrym-oai
a8edc57740
Add MCP tool call item to codex exec (#4481)
No arguments/results for now.
```
{
  "type": "item.started",
  "item": {
    "id": "item_1",
    "item_type": "mcp_tool_call",
    "server": "github",
    "tool": "search_issues",
    "status": "in_progress"
  }
}
{
  "type": "item.completed",
  "item": {
    "id": "item_1",
    "item_type": "mcp_tool_call",
    "server": "github",
    "tool": "search_issues",
    "status": "completed"
  }
}
```
2025-09-29 19:45:11 -07:00
pakrym-oai
4a80059b1b
Add turn.failed and rename session created to thread started (#4478)
Don't produce completed when turn failed.
2025-09-29 18:38:04 -07:00
vishnu-oai
04c1782e52
OpenTelemetry events (#2103)
### Title

## otel

Codex can emit [OpenTelemetry](https://opentelemetry.io/) **log events**
that
describe each run: outbound API requests, streamed responses, user
input,
tool-approval decisions, and the result of every tool invocation. Export
is
**disabled by default** so local runs remain self-contained. Opt in by
adding an
`[otel]` table and choosing an exporter.

```toml
[otel]
environment = "staging"   # defaults to "dev"
exporter = "none"          # defaults to "none"; set to otlp-http or otlp-grpc to send events
log_user_prompt = false    # defaults to false; redact prompt text unless explicitly enabled
```

Codex tags every exported event with `service.name = "codex-cli"`, the
CLI
version, and an `env` attribute so downstream collectors can distinguish
dev/staging/prod traffic. Only telemetry produced inside the
`codex_otel`
crate—the events listed below—is forwarded to the exporter.

### Event catalog

Every event shares a common set of metadata fields: `event.timestamp`,
`conversation.id`, `app.version`, `auth_mode` (when available),
`user.account_id` (when available), `terminal.type`, `model`, and
`slug`.

With OTEL enabled Codex emits the following event types (in addition to
the
metadata above):

- `codex.api_request`
  - `cf_ray` (optional)
  - `attempt`
  - `duration_ms`
  - `http.response.status_code` (optional)
  - `error.message` (failures)
- `codex.sse_event`
  - `event.kind`
  - `duration_ms`
  - `error.message` (failures)
  - `input_token_count` (completion only)
  - `output_token_count` (completion only)
  - `cached_token_count` (completion only, optional)
  - `reasoning_token_count` (completion only, optional)
  - `tool_token_count` (completion only)
- `codex.user_prompt`
  - `prompt_length`
  - `prompt` (redacted unless `log_user_prompt = true`)
- `codex.tool_decision`
  - `tool_name`
  - `call_id`
- `decision` (`approved`, `approved_for_session`, `denied`, or `abort`)
  - `source` (`config` or `user`)
- `codex.tool_result`
  - `tool_name`
  - `call_id`
  - `arguments`
  - `duration_ms` (execution time for the tool)
  - `success` (`"true"` or `"false"`)
  - `output`

### Choosing an exporter

Set `otel.exporter` to control where events go:

- `none` – leaves instrumentation active but skips exporting. This is
the
  default.
- `otlp-http` – posts OTLP log records to an OTLP/HTTP collector.
Specify the
  endpoint, protocol, and headers your collector expects:

  ```toml
  [otel]
  exporter = { otlp-http = {
    endpoint = "https://otel.example.com/v1/logs",
    protocol = "binary",
    headers = { "x-otlp-api-key" = "${OTLP_TOKEN}" }
  }}
  ```

- `otlp-grpc` – streams OTLP log records over gRPC. Provide the endpoint
and any
  metadata headers:

  ```toml
  [otel]
  exporter = { otlp-grpc = {
    endpoint = "https://otel.example.com:4317",
    headers = { "x-otlp-meta" = "abc123" }
  }}
  ```

If the exporter is `none` nothing is written anywhere; otherwise you
must run or point to your
own collector. All exporters run on a background batch worker that is
flushed on
shutdown.

If you build Codex from source the OTEL crate is still behind an `otel`
feature
flag; the official prebuilt binaries ship with the feature enabled. When
the
feature is disabled the telemetry hooks become no-ops so the CLI
continues to
function without the extra dependencies.

---------

Co-authored-by: Anton Panasenko <apanasenko@openai.com>
2025-09-29 11:30:55 -07:00
pakrym-oai
cc1b21e47f
Add turn started/completed events and correct exit code on error (#4309)
Adds new event for session completed that includes usage. Also ensures
we return 1 on failures.
```
{
  "type": "session.created",
  "session_id": "019987a7-93e7-7b20-9e05-e90060e411ea"
}
{
  "type": "turn.started"
}
...
{
  "type": "turn.completed",
  "usage": {
    "input_tokens": 78913,
    "cached_input_tokens": 65280,
    "output_tokens": 1099
  }
}
```
2025-09-26 16:21:50 -07:00
pakrym-oai
ea095e30c1
Add todo-list tool support (#4255)
Adds a 1-per-turn todo-list item and item.updated event

```jsonl
{"type":"item.started","item":{"id":"item_6","item_type":"todo_list","items":[{"text":"Record initial two-step plan  now","completed":false},{"text":"Update progress to next step","completed":false}]}}
{"type":"item.updated","item":{"id":"item_6","item_type":"todo_list","items":[{"text":"Record initial two-step plan  now","completed":true},{"text":"Update progress to next step","completed":false}]}}
{"type":"item.completed","item":{"id":"item_6","item_type":"todo_list","items":[{"text":"Record initial two-step plan  now","completed":true},{"text":"Update progress to next step","completed":false}]}}
```
2025-09-26 09:35:47 -07:00
pakrym-oai
8e3a048fec
Add codex exec testing helpers (#4254)
Add a shortcut to create working directories and run codex exec with
fake server.
2025-09-25 17:12:45 -07:00
pakrym-oai
67aab04c66
[codex exec] Add item.started and support it for command execution (#4250)
Adds a new `item.started` event to `codex exec` and implements it for
command_execution item type.

```jsonl
{"type":"session.created","session_id":"019982d1-75f0-7920-b051-e0d3731a5ed8"}
{"type":"item.completed","item":{"id":"item_0","item_type":"reasoning","text":"**Executing commands securely**\n\nI'm thinking about how the default harness typically uses \"bash -lc,\" while historically \"bash\" is what we've been using. The command should be executed as a string in our CLI, so using \"bash -lc 'echo hello'\" is optimal but calling \"echo hello\" directly feels safer. The sandbox makes sure environment variables like CODEX_SANDBOX_NETWORK_DISABLED=1 are set, so I won't ask for approval. I just need to run \"echo hello\" and correctly present the output."}}
{"type":"item.completed","item":{"id":"item_1","item_type":"reasoning","text":"**Preparing for tool calls**\n\nI realize that I need to include a preamble before making any tool calls. So, I'll first state the preamble in the commentary channel, then proceed with the tool call. After that, I need to present the final message along with the output. It's possible that the CLI will show the output inline, but I must ensure that I present the result clearly regardless. Let's move forward and get this organized!"}}
{"type":"item.completed","item":{"id":"item_2","item_type":"assistant_message","text":"Running `echo` to confirm shell access and print output."}}
{"type":"item.started","item":{"id":"item_3","item_type":"command_execution","command":"bash -lc echo hello","aggregated_output":"","exit_code":null,"status":"in_progress"}}
{"type":"item.completed","item":{"id":"item_3","item_type":"command_execution","command":"bash -lc echo hello","aggregated_output":"hello\n","exit_code":0,"status":"completed"}}
{"type":"item.completed","item":{"id":"item_4","item_type":"assistant_message","text":"hello"}}
```
2025-09-25 22:25:02 +00:00
Jeremy Rose
4a5f05c136
make tests pass cleanly in sandbox (#4067)
This changes the reqwest client used in tests to be sandbox-friendly,
and skips a bunch of other tests that don't work inside the
sandbox/without network.
2025-09-25 13:11:14 -07:00
pakrym-oai
344d4a1d68
Add explicit codex exec events (#4177)
This pull request add a new experimental format of JSON output.

You can try it using `codex exec --experimental-json`.

Design takes a lot of inspiration from Responses API items and stream
format.

# Session and items
Each invocation of `codex exec` starts or resumes a session. 

Session contains multiple high-level item types:
1. Assistant message 
2. Assistant thinking 
3. Command execution 
4. File changes
5. To-do lists
6. etc.

# Events 
Session and items are going through their life cycles which is
represented by events.

Session is `session.created` or `session.resumed`
Items are `item.added`, `item.updated`, `item.completed`,
`item.require_approval` (or other item types like `item.output_delta`
when we need streaming).

So a typical session can look like:

<details>

```
{
  "type": "session.created",
  "session_id": "01997dac-9581-7de3-b6a0-1df8256f2752"
}
{
  "type": "item.completed",
  "item": {
    "id": "itm_0",
    "item_type": "assistant_message",
    "text": "I’ll locate the top-level README and remove its first line. Then I’ll show a quick summary of what changed."
  }
}
{
  "type": "item.completed",
  "item": {
    "id": "itm_1",
    "item_type": "command_execution",
    "command": "bash -lc ls -la | sed -n '1,200p'",
    "aggregated_output": "pyenv: cannot rehash: /Users/pakrym/.pyenv/shims isn't writable\ntotal 192\ndrwxr-xr-x@  33 pakrym  staff   1056 Sep 24 14:36 .\ndrwxr-xr-x   41 pakrym  staff   1312 Sep 24 09:17 ..\n-rw-r--r--@   1 pakrym  staff      6 Jul  9 16:16 .codespellignore\n-rw-r--r--@   1 pakrym  staff    258 Aug 13 09:40 .codespellrc\ndrwxr-xr-x@   5 pakrym  staff    160 Jul 23 08:26 .devcontainer\n-rw-r--r--@   1 pakrym  staff   6148 Jul 22 10:03 .DS_Store\ndrwxr-xr-x@  15 pakrym  staff    480 Sep 24 14:38 .git\ndrwxr-xr-x@  12 pakrym  staff    384 Sep  2 16:00 .github\n-rw-r--r--@   1 pakrym  staff    778 Jul  9 16:16 .gitignore\ndrwxr-xr-x@   3 pakrym  staff     96 Aug 11 09:37 .husky\n-rw-r--r--@   1 pakrym  staff    104 Jul  9 16:16 .npmrc\n-rw-r--r--@   1 pakrym  staff     96 Sep  2 08:52 .prettierignore\n-rw-r--r--@   1 pakrym  staff    170 Jul  9 16:16 .prettierrc.toml\ndrwxr-xr-x@   5 pakrym  staff    160 Sep 14 17:43 .vscode\ndrwxr-xr-x@   2 pakrym  staff     64 Sep 11 11:37 2025-09-11\n-rw-r--r--@   1 pakrym  staff   5505 Sep 18 09:28 AGENTS.md\n-rw-r--r--@   1 pakrym  staff     92 Sep  2 08:52 CHANGELOG.md\n-rw-r--r--@   1 pakrym  staff   1145 Jul  9 16:16 cliff.toml\ndrwxr-xr-x@  11 pakrym  staff    352 Sep 24 13:03 codex-cli\ndrwxr-xr-x@  38 pakrym  staff   1216 Sep 24 14:38 codex-rs\ndrwxr-xr-x@  18 pakrym  staff    576 Sep 23 11:01 docs\n-rw-r--r--@   1 pakrym  staff   2038 Jul  9 16:16 flake.lock\n-rw-r--r--@   1 pakrym  staff   1434 Jul  9 16:16 flake.nix\n-rw-r--r--@   1 pakrym  staff  10926 Jul  9 16:16 LICENSE\ndrwxr-xr-x@ 465 pakrym  staff  14880 Jul 15 07:36 node_modules\n-rw-r--r--@   1 pakrym  staff    242 Aug  5 08:25 NOTICE\n-rw-r--r--@   1 pakrym  staff    578 Aug 14 12:31 package.json\n-rw-r--r--@   1 pakrym  staff    498 Aug 11 09:37 pnpm-lock.yaml\n-rw-r--r--@   1 pakrym  staff     58 Aug 11 09:37 pnpm-workspace.yaml\n-rw-r--r--@   1 pakrym  staff   2402 Jul  9 16:16 PNPM.md\n-rw-r--r--@   1 pakrym  staff   4393 Sep 12 14:36 README.md\ndrwxr-xr-x@   4 pakrym  staff    128 Sep 18 09:28 scripts\ndrwxr-xr-x@   2 pakrym  staff     64 Sep 11 11:34 tmp\n",
    "exit_code": 0,
    "status": "completed"
  }
}
{
  "type": "item.completed",
  "item": {
    "id": "itm_2",
    "item_type": "reasoning",
    "text": "**Reviewing README.md file**\n\nI've located the README.md file at the root, and it’s 4393 bytes. Now, I need to remove the first line, but first, I should check its content to make sure I’m patching it correctly. I’ll use sed to display the first 20 lines. By reviewing those lines, I can determine exactly what needs to be removed before I proceed with the editing. Let's do this carefully!"
  }
}
{
  "type": "item.completed",
  "item": {
    "id": "itm_3",
    "item_type": "command_execution",
    "command": "bash -lc sed -n '1,40p' README.md",
    "aggregated_output": "<h1 align=\"center\">OpenAI Codex CLI</h1>\n\n<p align=\"center\"><code>npm i -g @openai/codex</code><br />or <code>brew install codex</code></p>\n\n<p align=\"center\"><strong>Codex CLI</strong> is a coding agent from OpenAI that runs locally on your computer.\n</br>\n</br>If you want Codex in your code editor (VS Code, Cursor, Windsurf), <a href=\"https://developers.openai.com/codex/ide\">install in your IDE</a>\n</br>If you are looking for the <em>cloud-based agent</em> from OpenAI, <strong>Codex Web</strong>, go to <a href=\"https://chatgpt.com/codex\">chatgpt.com/codex</a></p>\n\n<p align=\"center\">\n  <img src=\"./.github/codex-cli-splash.png\" alt=\"Codex CLI splash\" width=\"80%\" />\n  </p>\n\n---\n\n## Quickstart\n\n### Installing and running Codex CLI\n\nInstall globally with your preferred package manager. If you use npm:\n\n```shell\nnpm install -g @openai/codex\n```\n\nAlternatively, if you use Homebrew:\n\n```shell\nbrew install codex\n```\n\nThen simply run `codex` to get started:\n\n```shell\ncodex\n```\n\n<details>\n<summary>You can also go to the <a href=\"https://github.com/openai/codex/releases/latest\">latest GitHub Release</a> and download the appropriate binary for your platform.</summary>\n\n",
    "exit_code": 0,
    "status": "completed"
  }
}
{
  "type": "item.completed",
  "item": {
    "id": "itm_4",
    "item_type": "reasoning",
    "text": "**Updating README file**\n\nThe user wants to update the README by removing the first line, which is an HTML h1 tag. I'll proceed to delete that line while keeping the rest intact. To do this, I need to use the apply_patch tool and create a patch in the diff format. I’ll take care to include context lines around the change. After removing the header, I’m considering whether to delete the following blank line as well to keep it tidy."
  }
}
{
  "type": "item.completed",
  "item": {
    "id": "itm_5",
    "item_type": "reasoning",
    "text": "**Crafting the patch**\n\nThe user has asked to remove just the first line, without mentioning the blank lines. Technically, leaving a blank line at the start is acceptable, but it's not ideal. Ideally, I'd prefer to remove the first line while maintaining the subsequent blank line. However, if I can also remove that blank line, it could tidy things up even more. Still, since the user's request is strictly to remove only the first line, I'll stick to that and craft the patch accordingly."
  }
}
{
  "type": "item.completed",
  "item": {
    "id": "itm_6",
    "item_type": "assistant_message",
    "text": "Applying the change by removing the first line from `README.md`."
  }
}
{
  "type": "item.completed",
  "item": {
    "id": "itm_7",
    "item_type": "file_change",
    "changes": [
      {
        "path": "/Users/pakrym/code/codex/README.md",
        "kind": "update"
      }
    ],
    "status": "completed"
  }
}
{
  "type": "item.completed",
  "item": {
    "id": "itm_8",
    "item_type": "reasoning",
    "text": "**Validating README Update**\n\nI updated the README.md and now I need to validate that the first line has been successfully removed. To do this, I'll check the first 10 lines of the document to ensure everything looks correct. It’s always good to double-check these kinds of changes, so I’m going to take that extra step to make sure the update was executed properly. Let's see what those initial lines say!"
  }
}
{
  "type": "item.completed",
  "item": {
    "id": "itm_9",
    "item_type": "command_execution",
    "command": "bash -lc sed -n '1,20p' README.md",
    "aggregated_output": "\n<p align=\"center\"><code>npm i -g @openai/codex</code><br />or <code>brew install codex</code></p>\n\n<p align=\"center\"><strong>Codex CLI</strong> is a coding agent from OpenAI that runs locally on your computer.\n</br>\n</br>If you want Codex in your code editor (VS Code, Cursor, Windsurf), <a href=\"https://developers.openai.com/codex/ide\">install in your IDE</a>\n</br>If you are looking for the <em>cloud-based agent</em> from OpenAI, <strong>Codex Web</strong>, go to <a href=\"https://chatgpt.com/codex\">chatgpt.com/codex</a></p>\n\n<p align=\"center\">\n  <img src=\"./.github/codex-cli-splash.png\" alt=\"Codex CLI splash\" width=\"80%\" />\n  </p>\n\n---\n\n## Quickstart\n\n### Installing and running Codex CLI\n\nInstall globally with your preferred package manager. If you use npm:\n\n",
    "exit_code": 0,
    "status": "completed"
  }
}
{
  "type": "item.completed",
  "item": {
    "id": "itm_10",
    "item_type": "assistant_message",
    "text": "- Updated `README.md:1` by removing the opening H1 line.\n- The README now starts with the install instructions paragraph.\n\nWant me to also remove the leading blank line at the top?"
  }
}
```

</details>

The idea is to give users fully formatted items they can use directly in
their rendering/application logic and avoid having them building up
items manually based on events (unless they want to for streaming).

This PR implements only the `item.completed` payload for some event
types, more event types and item types to come.

---------

Co-authored-by: Michael Bolin <mbolin@openai.com>
2025-09-25 17:47:09 +00:00
pakrym-oai
fdb8dadcae
Add exec output-schema parameter (#4079)
Adds structured output to `exec` via the `--structured-output`
parameter.
2025-09-23 13:59:16 -07:00
jif-oai
be366a31ab
chore: clippy on redundant closure (#4058)
Add redundant closure clippy rules and let Codex fix it by minimising
FQP
2025-09-22 19:30:16 +00:00
pakrym-oai
14a115d488
Add non_sandbox_test helper (#3880)
Makes tests shorter
2025-09-22 14:50:41 +00:00
pakrym-oai
9b18875a42
Use helpers instead of fixtures (#3888)
Move to using test helper method everywhere.
2025-09-19 06:46:25 -07:00
Michael Bolin
8595237505
fix: ensure cwd for conversation and sandbox are separate concerns (#3874)
Previous to this PR, both of these functions take a single `cwd`:


71038381aa/codex-rs/core/src/seatbelt.rs (L19-L25)


71038381aa/codex-rs/core/src/landlock.rs (L16-L23)

whereas `cwd` and `sandbox_cwd` should be set independently (fixed in
this PR).

Added `sandbox_distinguishes_command_and_policy_cwds()` to
`codex-rs/exec/tests/suite/sandbox.rs` to verify this.
2025-09-18 14:37:06 -07:00
Ahmed Ibrahim
a30e5e40ee
enable-resume (#3537)
Adding the ability to resume conversations.
we have one verb `resume`. 

Behavior:

`tui`:
`codex resume`: opens session picker
`codex resume --last`: continue last message
`codex resume <session id>`: continue conversation with `session id`

`exec`:
`codex resume --last`: continue last conversation
`codex resume <session id>`: continue conversation with `session id`

Implementation:
- I added a function to find the path in `~/.codex/sessions/` with a
`UUID`. This is helpful in resuming with session id.
- Added the above mentioned flags
- Added lots of testing
2025-09-14 19:33:19 -04:00
Michael Bolin
bec51f6c05
chore: enable clippy::redundant_clone (#3489)
Created this PR by:

- adding `redundant_clone` to `[workspace.lints.clippy]` in
`cargo-rs/Cargol.toml`
- running `cargo clippy --tests --fix`
- running `just fmt`

Though I had to clean up one instance of the following that resulted:

```rust
let codex = codex;
```
2025-09-11 11:59:37 -07:00
Michael Bolin
74d2741729
chore: require uninlined_format_args from clippy (#2845)
- added `uninlined_format_args` to `[workspace.lints.clippy]` in the
`Cargo.toml` for the workspace
- ran `cargo clippy --tests --fix`
- ran `just fmt`
2025-08-28 11:25:23 -07:00
Dylan
7f7d1e30f3
[exec] Clean up apply-patch tests (#2648)
## Summary
These tests were getting a bit unwieldy, and they're starting to become
load-bearing. Let's clean them up, and get them working solidly so we
can easily expand this harness with new tests.

## Test Plan
- [x] Tests continue to pass
2025-08-25 15:08:01 -07:00
Jeremy Rose
32bbbbad61
test: faster test execution in codex-core (#2633)
this dramatically improves time to run `cargo test -p codex-core` (~25x
speedup).

before:
```
cargo test -p codex-core  35.96s user 68.63s system 19% cpu 8:49.80 total
```

after:
```
cargo test -p codex-core  5.51s user 8.16s system 63% cpu 21.407 total
```

both tests measured "hot", i.e. on a 2nd run with no filesystem changes,
to exclude compile times.

approach inspired by [Delete Cargo Integration
Tests](https://matklad.github.io/2021/02/27/delete-cargo-integration-tests.html),
we move all test cases in tests/ into a single suite in order to have a
single binary, as there is significant overhead for each test binary
executed, and because test execution is only parallelized with a single
binary.
2025-08-24 11:10:53 -07:00
Dylan
236c4f76a6
[apply_patch] freeform apply_patch tool (#2576)
## Summary
GPT-5 introduced the concept of [custom
tools](https://platform.openai.com/docs/guides/function-calling#custom-tools),
which allow the model to send a raw string result back, simplifying
json-escape issues. We are migrating gpt-5 to use this by default.

However, gpt-oss models do not support custom tools, only normal
functions. So we keep both tool definitions, and provide whichever one
the model family supports.

## Testing
- [x] Tested locally with various models
- [x] Unit tests pass
2025-08-22 13:42:34 -07:00
Dylan
6df8e35314
[tools] Add apply_patch tool (#2303)
## Summary
We've been seeing a number of issues and reports with our synthetic
`apply_patch` tool, e.g. #802. Let's make this a real tool - in my
anecdotal testing, it's critical for GPT-OSS models, but I'd like to
make it the standard across GPT-5 and codex models as well.

## Testing
- [x] Tested locally
- [x] Integration test
2025-08-15 11:55:53 -04:00
Parker Thompson
a075424437
Added allow-expect-in-tests / allow-unwrap-in-tests (#2328)
This PR:
* Added the clippy.toml to configure allowable expect / unwrap usage in
tests
* Removed as many expect/allow lines as possible from tests
* moved a bunch of allows to expects where possible

Note: in integration tests, non `#[test]` helper functions are not
covered by this so we had to leave a few lingering `expect(expect_used`
checks around
2025-08-14 17:59:01 -07:00