Commit graph

3718 commits

Author SHA1 Message Date
Dylan Hurd
cc8c293378
fix(exec-policy) No empty command lists (#11397)
## Summary
This should rarely, if ever, happen in practice. But regardless, we
should never provide an empty list of `commands` to ExecPolicy. This PR
is almost entirely adding test around these cases.

## Testing
- [x] Adds a bunch of unit tests for this
2026-02-10 19:22:23 -08:00
Michael Bolin
b68a84ee8e
Remove deterministic_process_ids feature to avoid duplicate codex-core builds (#11393)
## Why

`codex-core` enabled `deterministic_process_ids` through a self
dev-dependency.
That forced a second feature-resolved build of the same crate, which
increased
compile time and test latency.

## What Changed

- Removed the `deterministic_process_ids` feature from
`codex-rs/core/Cargo.toml`.
- Removed the self dev-dependency on `codex-core` that enabled that
feature.
- Removed the Bazel `deterministic_process_ids` crate feature for
`codex-core`.
- Added a test-only `AtomicBool` override in unified exec process-id
allocation.
- Added a test-support setter for that override and re-exported it from
`codex-core`.
- Enabled deterministic process IDs in integration tests via
`core_test_support` ctor.

## Behavior

- Production behavior remains random process IDs.
- Unit tests remain deterministic via `cfg(test)`.
- Integration tests remain deterministic via explicit test-support
initialization.

## Validation

- `just fmt`
- `cargo test -p codex-core unified_exec::`
- `cargo test -p codex-core --test all unified_exec -- --test-threads=1`
- `cargo tree -p codex-core -e features` (verified the removed feature
path)
2026-02-10 19:07:01 -08:00
Charley Cunningham
8b46c0ce00
tui: queue non-pending rollback trims in app-event order (#11373)
## Summary

This PR fixes TUI transcript-sync behavior for
`EventMsg::ThreadRolledBack` and makes rollback application order
deterministic.

Previously, rollback handling depended on `pending_rollback`:

- if `pending_rollback` was set (local backtrack), TUI trimmed correctly
- otherwise, replayed/external rollbacks were either ignored or could be
applied at the wrong time relative to queued transcript inserts

This change keeps the local backtrack path intact and routes non-pending
rollbacks through the app event queue so rollback trims are applied in
FIFO order with transcript cell inserts.

## What changed

- Added/used `trim_transcript_cells_drop_last_n_user_turns(...)` for
rollback-by-`num_turns` semantics.
- Renamed rollback app event:
- `AppEvent::ApplyReplayedThreadRollback` ->
`AppEvent::ApplyThreadRollback`
- Replay path (`ChatWidget`) now emits `ApplyThreadRollback`.
- Live non-pending rollback path (`App::handle_backtrack_event`) now
emits `ApplyThreadRollback` instead of trimming immediately.
- App-level event handler applies `ApplyThreadRollback` after queued
`InsertHistoryCell` events and schedules redraw only when a trim
occurred.
- When a trim occurs with an overlay open, TUI now syncs transcript
overlay committed cells, clamps backtrack preview selection, and clears
stale `deferred_history_lines` so closed overlays do not re-append
rolled-back lines.
- Clarified inline comments around the `pending_rollback` branch so
future readers can reason about why there are two paths.

## Why queueing matters

During resume/replay, transcript cells are populated via queued
`InsertHistoryCell` app events. If a rollback is applied immediately
outside that queue, it can run against an incomplete transcript and
under-trim. Queueing non-pending rollbacks ensures consistent ordering
and correct final transcript state.

## Behavior by rollback source

- `pending_rollback = Some(...)` (local backtrack requested by this
TUI):
  - use `finish_pending_backtrack()` and the stored selection boundary
- `pending_rollback = None` (replay/external/non-local rollback):
- enqueue `AppEvent::ApplyThreadRollback { num_turns }` and trim in
app-event order

## Tests

Added/updated tests covering ordering and semantics:

-
`app_backtrack::tests::trim_drop_last_n_user_turns_applies_rollback_semantics`
- `app_backtrack::tests::trim_drop_last_n_user_turns_allows_overflow`
- `app::tests::replayed_initial_messages_apply_rollback_in_queue_order`
-
`app::tests::live_rollback_during_replay_is_applied_in_app_event_order`
-
`app::tests::queued_rollback_syncs_overlay_and_clears_deferred_history`
- `chatwidget::tests::replayed_thread_rollback_emits_ordered_app_event`

Validation run:

- `just fmt`
- `cargo test -p codex-tui`
2026-02-10 18:53:43 -08:00
pakrym-oai
c68999ee6d
Prefer websocket transport when model opts in (#11386)
Summary
- add a `prefer_websockets` field to `ModelInfo`, defaulting to `false`
in all fixtures and constructors
- wire the new flag into websocket selection so models that opt in
always use websocket transport even when the feature gate is off

Testing
- Not run (not requested)
2026-02-10 18:50:48 -08:00
pakrym-oai
bfd4e2112c
Disable very flaky tests (#11394)
Collected from last 20 builds of main in
https://github.com/openai/codex/commits/main/.
2026-02-10 18:50:11 -08:00
github-actions[bot]
f101300dba
Update models.json (#11376)
Automated update of models.json.

---------

Co-authored-by: aibrahim-oai <219906144+aibrahim-oai@users.noreply.github.com>
Co-authored-by: sayan-oai <sayan@openai.com>
2026-02-10 17:25:35 -08:00
Michael Bolin
d44f4205fb
chore: rename codex-command to codex-shell-command (#11378)
This addresses some post-merge feedback on
https://github.com/openai/codex/pull/11361:

- crate rename
- reuse `detect_shell_type()` utility
2026-02-10 17:03:46 -08:00
jif-oai
87bbfc50a1
feat: prevent double backfill (#11377)
## Summary

Add a DB-backed lease to prevent duplicate `.sqlite` backfill workers
from running concurrently.

### What changed
- Added StateRuntime::try_claim_backfill(lease_seconds) that atomically
claims backfill only when:
  - backfill is not complete, and
  - no fresh running worker currently owns it.
- Updated backfill_sessions to use the claim API and exit early when
another worker already holds the lease.
- Added runtime tests covering:
  - singleton claim behavior,
  - stale lease takeover,
  - claim blocked after complete.
- Set backfill lease to 900s in production and 1s in tests.

### Why

This avoids duplicate backfill work and reduces backfill status churn
under concurrent startup, while preserving
current best-effort fallback behavior.
2026-02-11 00:24:20 +00:00
jif-oai
674799d356
feat: mem v2 - PR6 (consolidation) (#11374) 2026-02-11 00:02:57 +00:00
jif-oai
2c9be54c9a
feat: mem v2 - PR5 (#11372) 2026-02-10 23:22:55 +00:00
Josh McKinney
34fb4b6e63
ci: fall back to local Bazel on forks without BuildBuddy key (#11359)
## Summary
- detect whether BUILDBUDDY_API_KEY is present in Bazel CI
- keep existing remote BuildBuddy path when key is available
- add a local fallback path for fork PRs without secrets by clearing
remote cache/executor/BES endpoints
- document each fallback flag inline with links to Bazel docs

## Testing
- ruby -e 'require "yaml";
YAML.load_file(".github/workflows/bazel.yml"); puts "ok"'
- verified Bazel docs/flag references used in workflow comments
2026-02-10 23:19:55 +00:00
viyatb-oai
1d47927aa0
Enable SOCKS defaults for common local network proxy use cases (#11362)
## Summary
- enable local-use defaults in network proxy settings: SOCKS5 on, SOCKS5
UDP on, upstream proxying on, and local binding on
- add a regression test that asserts the full
`NetworkProxySettings::default()` baseline
- Fixed managed listener reservation behavior.
Before: we always reserved a loopback SOCKS listener, even when
enable_socks5 = false.
Now: SOCKS listener is only reserved when SOCKS is enabled.
- Fixed /debug-config env output for SOCKS-disabled sessions.
ALL_PROXY now shows the HTTP proxy URL when SOCKS is disabled (instead
of incorrectly showing socks5h://...).


## Validation
- just fmt
- cargo test -p codex-network-proxy
- cargo clippy -p codex-network-proxy --all-targets
2026-02-10 15:13:52 -08:00
jif-oai
623d3f4071
feat: mem v2 - PR4 (#11369)
# Memories migration plan (simplified global workflow)

## Target behavior

- One shared memory root only: `~/.codex/memories/`.
- No per-cwd memory buckets, no cwd hash handling.
- Phase 1 candidate rules:
- Not currently being processed unless the job lease is stale.
- Rollout updated within the max-age window (currently 30 days).
- Rollout idle for at least 12 hours (new constant).
- Global cap: at most 64 stage-1 jobs in `running` state at any time
(new invariant).
- Stage-1 model output shape (new):
- `rollout_slug` (accepted but ignored for now).
- `rollout_summary`.
- `raw_memory`.
- Phase-1 artifacts written under the shared root:
- `rollout_summaries/<thread_id>.md` for each rollout summary.
- `raw_memories.md` containing appended/merged raw memory paragraphs.
- Phase 2 runs one consolidation agent for the shared `memories/`
directory.
- Phase-2 lock is DB-backed with 1 hour lease and heartbeat/expiry.

## Current code map

- Core startup pipeline: `core/src/memories/startup/mod.rs`.
- Stage-1 request+parse: `core/src/memories/startup/extract.rs`,
`core/src/memories/stage_one.rs`, templates in
`core/templates/memories/`.
- File materialization: `core/src/memories/storage.rs`,
`core/src/memories/layout.rs`.
- Scope routing (cwd/user): `core/src/memories/scope.rs`,
`core/src/memories/startup/mod.rs`.
- DB job lifecycle and scope queueing: `state/src/runtime/memory.rs`.

## PR plan

## PR 1: Correct phase-1 selection invariants (no behavior-breaking
layout changes yet)

- Add `PHASE_ONE_MIN_ROLLOUT_IDLE_HOURS: i64 = 12` in
`core/src/memories/mod.rs`.
- Thread this into `state::claim_stage1_jobs_for_startup(...)`.
- Enforce idle-time filter in DB selection logic (not only in-memory
filtering after `scan_limit`) so eligible threads are not starved by
very recent threads.
- Enforce global running cap of 64 at claim time in DB logic:
- Count fresh `memory_stage1` running jobs.
- Only allow new claims while count < cap.
- Keep stale-lease takeover behavior intact.
- Add/adjust tests in `state/src/runtime.rs`:
- Idle filter inclusion/exclusion around 12h boundary.
- Global running-cap guarantee.
- Existing stale/fresh ownership behavior still passes.

Acceptance criteria:
- Startup never creates more than 64 fresh `memory_stage1` running jobs.
- Threads updated <12h ago are skipped.
- Threads older than 30d are skipped.

## PR 2: Stage-1 output contract + storage artifacts
(forward-compatible)

- Update parser/types to accept the new structured output while keeping
backward compatibility:
- Add `rollout_slug` (optional for now).
- Add `rollout_summary`.
- Keep alias support for legacy `summary` and `rawMemory` until prompt
swap completes.
- Update stage-1 schema generator in `core/src/memories/stage_one.rs` to
include the new keys.
- Update prompt templates:
- `core/templates/memories/stage_one_system.md`.
- `core/templates/memories/stage_one_input.md`.
- Replace storage model in `core/src/memories/storage.rs`:
- Introduce `rollout_summaries/` directory writer (`<thread_id>.md`
files).
- Introduce `raw_memories.md` aggregator writer from DB rows.
- Keep deterministic rebuild behavior from DB outputs so files can
always be regenerated.
- Update consolidation prompt template to reference `rollout_summaries/`
+ `raw_memories.md` inputs.

Acceptance criteria:
- Stage-1 accepts both old and new output keys during migration.
- Phase-1 artifacts are generated in new format from DB state.
- No dependence on per-thread files in `raw_memories/`.

## PR 3: Remove per-cwd memories and move to one global memory root

- Simplify layout in `core/src/memories/layout.rs`:
- Single root: `codex_home/memories`.
- Remove cwd-hash bucket helpers and normalization logic used only for
memory pathing.
- Remove scope branching from startup phase-2 dispatch path:
- No cwd/user mapping in `core/src/memories/startup/mod.rs`.
- One target root for consolidation.
- In `state/src/runtime/memory.rs`, stop enqueueing/handling cwd
consolidation scope.
- Keep one logical consolidation scope/job key (global/user) to avoid a
risky schema rewrite in same PR.
- Add one-time migration helper (core side) to preserve current shared
memory output:
- If `~/.codex/memories/user/memory` exists and new root is empty,
move/copy contents into `~/.codex/memories`.
- Leave old hashed cwd buckets untouched for now (safe/no-destructive
migration).

Acceptance criteria:
- New runs only read/write `~/.codex/memories`.
- No new cwd-scoped consolidation jobs are enqueued.
- Existing user-shared memory content is preserved.

## PR 4: Phase-2 global lock simplification and cleanup

- Replace multi-scope dispatch with a single global consolidation claim
path:
- Either reuse jobs table with one fixed key, or add a tiny dedicated
lock helper; keep 1h lease.
- Ensure at most one consolidation agent can run at once.
- Keep heartbeat + stale lock recovery semantics in
`core/src/memories/startup/watch.rs`.
- Remove dead scope code and legacy constants no longer used.
- Update tests:
- One-agent-at-a-time behavior.
- Lock expiry allows takeover after stale lease.

Acceptance criteria:
- Exactly one phase-2 consolidation agent can be active cluster-wide
(per local DB).
- Stale lock recovers automatically.

## PR 5: Final cleanup and docs

- Remove legacy artifacts and references:
- `raw_memories/` and `memory_summary.md` assumptions from
prompts/comments/tests.
- Scope constants for cwd memory pathing in core/state if fully unused.
- Update docs under `docs/` for memory workflow and directory layout.
- Add a brief operator note for rollout: compatibility window for old
stage-1 JSON keys and when to remove aliases.

Acceptance criteria:
- Code and docs reflect only the simplified global workflow.
- No stale references to per-cwd memory buckets.

## Notes on sequencing

- PR 1 is safest first because it improves correctness without changing
external artifact layout.
- PR 2 keeps parser compatibility so prompt deployment can happen
independently.
- PR 3 and PR 4 split filesystem/scope simplification from locking
simplification to reduce blast radius.
- PR 5 is intentionally cleanup-only.
2026-02-10 23:10:35 +00:00
Michael Bolin
d8f9bb65e2
# Split command parsing/safety out of codex-core into new codex-command (#11361)
`codex-core` had accumulated command parsing and command safety logic
(`bash`, `powershell`, `parse_command`, and `command_safety`) that is
logically cohesive but orthogonal to most core session/runtime logic.
Keeping this code in `codex-core` made the crate increasingly monolithic
and raised iteration cost for unrelated core changes.

This change extracts that surface into a dedicated crate,
`codex-command`, while preserving existing `codex_core::...` call sites
via re-exports.

## Why this refactor

During analysis, command parsing/safety stood out as a good first split
because it has:

- a clear domain boundary (shell parsing + safety classification)
- relatively self-contained dependencies (notably `tree-sitter` /
`tree-sitter-bash`)
- a meaningful standalone test surface (`134` tests moved with the
crate)
- many downstream uses that benefit from independent compilation and
caching

The practical problem was build latency from a large `codex-core`
compile/test graph. Clean-build timings before and after this split
showed measurable wins:

- `cargo check -p codex-core`: `57.08s` -> `53.54s` (~`6.2%` faster)
- `cargo test -p codex-core --no-run`: `2m39.9s` -> `2m20s` (~`12.4%`
faster)
- `codex-core lib` compile unit: `57.18s` -> `49.67s` (~`13.1%` faster)
- `codex-core lib(test)` compile unit: `60.87s` -> `53.21s` (~`12.6%`
faster)

This gives a concrete reduction in core build overhead without changing
behavior.

## What changed

### New crate

- Added `codex-rs/command` as workspace crate `codex-command`.
- Added:
  - `command/src/lib.rs`
  - `command/src/bash.rs`
  - `command/src/powershell.rs`
  - `command/src/parse_command.rs`
  - `command/src/command_safety/*`
  - `command/src/shell_detect.rs`
  - `command/BUILD.bazel`

### Code moved out of `codex-core`

- Moved modules from `core/src` into `command/src`:
  - `bash.rs`
  - `powershell.rs`
  - `parse_command.rs`
  - `command_safety/*`

### Dependency graph updates

- Added workspace member/dependency entries for `codex-command` in
`codex-rs/Cargo.toml`.
- Added `codex-command` dependency to `codex-rs/core/Cargo.toml`.
- Removed `tree-sitter` and `tree-sitter-bash` from `codex-core` direct
deps (now owned by `codex-command`).

### API compatibility for callers

To avoid immediate downstream churn, `codex-core` now re-exports the
moved modules/functions:

- `codex_command::bash`
- `codex_command::powershell`
- `codex_command::parse_command`
- `codex_command::is_safe_command`
- `codex_command::is_dangerous_command`

This keeps existing `codex_core::...` paths working while enabling
gradual migration to direct `codex-command` usage.

### Internal decoupling detail

- Added `command::shell_detect` so moved `bash`/`powershell` logic no
longer depends on core shell internals.
- Adjusted PowerShell helper visibility in `codex-command` for existing
core test usage (`UTF8` prefix helper + executable discovery functions).

## Validation

- `just fmt`
- `just fix -p codex-command -p codex-core`
- `cargo test -p codex-command` (`134` passed)
- `cargo test -p codex-core --no-run`
- `cargo test -p codex-core shell_command_handler`

## Notes / follow-up

This commit intentionally prioritizes boundary extraction and
compatibility. A follow-up can migrate downstream crates to depend
directly on `codex-command` (instead of through `codex-core` re-exports)
to realize additional incremental build wins.
2026-02-10 14:43:16 -08:00
github-actions[bot]
3626399811
Update models.json (#11274)
Automated update of models.json.

---------

Co-authored-by: aibrahim-oai <219906144+aibrahim-oai@users.noreply.github.com>
Co-authored-by: Ahmed Ibrahim <aibrahim@openai.com>
Co-authored-by: Sayan Sisodiya <sayan@openai.com>
2026-02-10 14:28:18 -08:00
jif-oai
3419660767
feat: mem v2 - PR3 (#11366)
# Memories migration plan (simplified global workflow)

## Target behavior

- One shared memory root only: `~/.codex/memories/`.
- No per-cwd memory buckets, no cwd hash handling.
- Phase 1 candidate rules:
- Not currently being processed unless the job lease is stale.
- Rollout updated within the max-age window (currently 30 days).
- Rollout idle for at least 12 hours (new constant).
- Global cap: at most 64 stage-1 jobs in `running` state at any time
(new invariant).
- Stage-1 model output shape (new):
- `rollout_slug` (accepted but ignored for now).
- `rollout_summary`.
- `raw_memory`.
- Phase-1 artifacts written under the shared root:
- `rollout_summaries/<thread_id>.md` for each rollout summary.
- `raw_memories.md` containing appended/merged raw memory paragraphs.
- Phase 2 runs one consolidation agent for the shared `memories/`
directory.
- Phase-2 lock is DB-backed with 1 hour lease and heartbeat/expiry.

## Current code map

- Core startup pipeline: `core/src/memories/startup/mod.rs`.
- Stage-1 request+parse: `core/src/memories/startup/extract.rs`,
`core/src/memories/stage_one.rs`, templates in
`core/templates/memories/`.
- File materialization: `core/src/memories/storage.rs`,
`core/src/memories/layout.rs`.
- Scope routing (cwd/user): `core/src/memories/scope.rs`,
`core/src/memories/startup/mod.rs`.
- DB job lifecycle and scope queueing: `state/src/runtime/memory.rs`.

## PR plan

## PR 1: Correct phase-1 selection invariants (no behavior-breaking
layout changes yet)

- Add `PHASE_ONE_MIN_ROLLOUT_IDLE_HOURS: i64 = 12` in
`core/src/memories/mod.rs`.
- Thread this into `state::claim_stage1_jobs_for_startup(...)`.
- Enforce idle-time filter in DB selection logic (not only in-memory
filtering after `scan_limit`) so eligible threads are not starved by
very recent threads.
- Enforce global running cap of 64 at claim time in DB logic:
- Count fresh `memory_stage1` running jobs.
- Only allow new claims while count < cap.
- Keep stale-lease takeover behavior intact.
- Add/adjust tests in `state/src/runtime.rs`:
- Idle filter inclusion/exclusion around 12h boundary.
- Global running-cap guarantee.
- Existing stale/fresh ownership behavior still passes.

Acceptance criteria:
- Startup never creates more than 64 fresh `memory_stage1` running jobs.
- Threads updated <12h ago are skipped.
- Threads older than 30d are skipped.

## PR 2: Stage-1 output contract + storage artifacts
(forward-compatible)

- Update parser/types to accept the new structured output while keeping
backward compatibility:
- Add `rollout_slug` (optional for now).
- Add `rollout_summary`.
- Keep alias support for legacy `summary` and `rawMemory` until prompt
swap completes.
- Update stage-1 schema generator in `core/src/memories/stage_one.rs` to
include the new keys.
- Update prompt templates:
- `core/templates/memories/stage_one_system.md`.
- `core/templates/memories/stage_one_input.md`.
- Replace storage model in `core/src/memories/storage.rs`:
- Introduce `rollout_summaries/` directory writer (`<thread_id>.md`
files).
- Introduce `raw_memories.md` aggregator writer from DB rows.
- Keep deterministic rebuild behavior from DB outputs so files can
always be regenerated.
- Update consolidation prompt template to reference `rollout_summaries/`
+ `raw_memories.md` inputs.

Acceptance criteria:
- Stage-1 accepts both old and new output keys during migration.
- Phase-1 artifacts are generated in new format from DB state.
- No dependence on per-thread files in `raw_memories/`.

## PR 3: Remove per-cwd memories and move to one global memory root

- Simplify layout in `core/src/memories/layout.rs`:
- Single root: `codex_home/memories`.
- Remove cwd-hash bucket helpers and normalization logic used only for
memory pathing.
- Remove scope branching from startup phase-2 dispatch path:
- No cwd/user mapping in `core/src/memories/startup/mod.rs`.
- One target root for consolidation.
- In `state/src/runtime/memory.rs`, stop enqueueing/handling cwd
consolidation scope.
- Keep one logical consolidation scope/job key (global/user) to avoid a
risky schema rewrite in same PR.
- Add one-time migration helper (core side) to preserve current shared
memory output:
- If `~/.codex/memories/user/memory` exists and new root is empty,
move/copy contents into `~/.codex/memories`.
- Leave old hashed cwd buckets untouched for now (safe/no-destructive
migration).

Acceptance criteria:
- New runs only read/write `~/.codex/memories`.
- No new cwd-scoped consolidation jobs are enqueued.
- Existing user-shared memory content is preserved.

## PR 4: Phase-2 global lock simplification and cleanup

- Replace multi-scope dispatch with a single global consolidation claim
path:
- Either reuse jobs table with one fixed key, or add a tiny dedicated
lock helper; keep 1h lease.
- Ensure at most one consolidation agent can run at once.
- Keep heartbeat + stale lock recovery semantics in
`core/src/memories/startup/watch.rs`.
- Remove dead scope code and legacy constants no longer used.
- Update tests:
- One-agent-at-a-time behavior.
- Lock expiry allows takeover after stale lease.

Acceptance criteria:
- Exactly one phase-2 consolidation agent can be active cluster-wide
(per local DB).
- Stale lock recovers automatically.

## PR 5: Final cleanup and docs

- Remove legacy artifacts and references:
- `raw_memories/` and `memory_summary.md` assumptions from
prompts/comments/tests.
- Scope constants for cwd memory pathing in core/state if fully unused.
- Update docs under `docs/` for memory workflow and directory layout.
- Add a brief operator note for rollout: compatibility window for old
stage-1 JSON keys and when to remove aliases.

Acceptance criteria:
- Code and docs reflect only the simplified global workflow.
- No stale references to per-cwd memory buckets.

## Notes on sequencing

- PR 1 is safest first because it improves correctness without changing
external artifact layout.
- PR 2 keeps parser compatibility so prompt deployment can happen
independently.
- PR 3 and PR 4 split filesystem/scope simplification from locking
simplification to reduce blast radius.
- PR 5 is intentionally cleanup-only.
2026-02-10 22:12:50 +00:00
jif-oai
0229dc5ccf
feat: mem v2 - PR2 (#11365)
# Memories migration plan (simplified global workflow)

## Target behavior

- One shared memory root only: `~/.codex/memories/`.
- No per-cwd memory buckets, no cwd hash handling.
- Phase 1 candidate rules:
- Not currently being processed unless the job lease is stale.
- Rollout updated within the max-age window (currently 30 days).
- Rollout idle for at least 12 hours (new constant).
- Global cap: at most 64 stage-1 jobs in `running` state at any time
(new invariant).
- Stage-1 model output shape (new):
- `rollout_slug` (accepted but ignored for now).
- `rollout_summary`.
- `raw_memory`.
- Phase-1 artifacts written under the shared root:
- `rollout_summaries/<thread_id>.md` for each rollout summary.
- `raw_memories.md` containing appended/merged raw memory paragraphs.
- Phase 2 runs one consolidation agent for the shared `memories/`
directory.
- Phase-2 lock is DB-backed with 1 hour lease and heartbeat/expiry.

## Current code map

- Core startup pipeline: `core/src/memories/startup/mod.rs`.
- Stage-1 request+parse: `core/src/memories/startup/extract.rs`,
`core/src/memories/stage_one.rs`, templates in
`core/templates/memories/`.
- File materialization: `core/src/memories/storage.rs`,
`core/src/memories/layout.rs`.
- Scope routing (cwd/user): `core/src/memories/scope.rs`,
`core/src/memories/startup/mod.rs`.
- DB job lifecycle and scope queueing: `state/src/runtime/memory.rs`.

## PR plan

## PR 1: Correct phase-1 selection invariants (no behavior-breaking
layout changes yet)

- Add `PHASE_ONE_MIN_ROLLOUT_IDLE_HOURS: i64 = 12` in
`core/src/memories/mod.rs`.
- Thread this into `state::claim_stage1_jobs_for_startup(...)`.
- Enforce idle-time filter in DB selection logic (not only in-memory
filtering after `scan_limit`) so eligible threads are not starved by
very recent threads.
- Enforce global running cap of 64 at claim time in DB logic:
- Count fresh `memory_stage1` running jobs.
- Only allow new claims while count < cap.
- Keep stale-lease takeover behavior intact.
- Add/adjust tests in `state/src/runtime.rs`:
- Idle filter inclusion/exclusion around 12h boundary.
- Global running-cap guarantee.
- Existing stale/fresh ownership behavior still passes.

Acceptance criteria:
- Startup never creates more than 64 fresh `memory_stage1` running jobs.
- Threads updated <12h ago are skipped.
- Threads older than 30d are skipped.

## PR 2: Stage-1 output contract + storage artifacts
(forward-compatible)

- Update parser/types to accept the new structured output while keeping
backward compatibility:
- Add `rollout_slug` (optional for now).
- Add `rollout_summary`.
- Keep alias support for legacy `summary` and `rawMemory` until prompt
swap completes.
- Update stage-1 schema generator in `core/src/memories/stage_one.rs` to
include the new keys.
- Update prompt templates:
- `core/templates/memories/stage_one_system.md`.
- `core/templates/memories/stage_one_input.md`.
- Replace storage model in `core/src/memories/storage.rs`:
- Introduce `rollout_summaries/` directory writer (`<thread_id>.md`
files).
- Introduce `raw_memories.md` aggregator writer from DB rows.
- Keep deterministic rebuild behavior from DB outputs so files can
always be regenerated.
- Update consolidation prompt template to reference `rollout_summaries/`
+ `raw_memories.md` inputs.

Acceptance criteria:
- Stage-1 accepts both old and new output keys during migration.
- Phase-1 artifacts are generated in new format from DB state.
- No dependence on per-thread files in `raw_memories/`.

## PR 3: Remove per-cwd memories and move to one global memory root

- Simplify layout in `core/src/memories/layout.rs`:
- Single root: `codex_home/memories`.
- Remove cwd-hash bucket helpers and normalization logic used only for
memory pathing.
- Remove scope branching from startup phase-2 dispatch path:
- No cwd/user mapping in `core/src/memories/startup/mod.rs`.
- One target root for consolidation.
- In `state/src/runtime/memory.rs`, stop enqueueing/handling cwd
consolidation scope.
- Keep one logical consolidation scope/job key (global/user) to avoid a
risky schema rewrite in same PR.
- Add one-time migration helper (core side) to preserve current shared
memory output:
- If `~/.codex/memories/user/memory` exists and new root is empty,
move/copy contents into `~/.codex/memories`.
- Leave old hashed cwd buckets untouched for now (safe/no-destructive
migration).

Acceptance criteria:
- New runs only read/write `~/.codex/memories`.
- No new cwd-scoped consolidation jobs are enqueued.
- Existing user-shared memory content is preserved.

## PR 4: Phase-2 global lock simplification and cleanup

- Replace multi-scope dispatch with a single global consolidation claim
path:
- Either reuse jobs table with one fixed key, or add a tiny dedicated
lock helper; keep 1h lease.
- Ensure at most one consolidation agent can run at once.
- Keep heartbeat + stale lock recovery semantics in
`core/src/memories/startup/watch.rs`.
- Remove dead scope code and legacy constants no longer used.
- Update tests:
- One-agent-at-a-time behavior.
- Lock expiry allows takeover after stale lease.

Acceptance criteria:
- Exactly one phase-2 consolidation agent can be active cluster-wide
(per local DB).
- Stale lock recovers automatically.

## PR 5: Final cleanup and docs

- Remove legacy artifacts and references:
- `raw_memories/` and `memory_summary.md` assumptions from
prompts/comments/tests.
- Scope constants for cwd memory pathing in core/state if fully unused.
- Update docs under `docs/` for memory workflow and directory layout.
- Add a brief operator note for rollout: compatibility window for old
stage-1 JSON keys and when to remove aliases.

Acceptance criteria:
- Code and docs reflect only the simplified global workflow.
- No stale references to per-cwd memory buckets.

## Notes on sequencing

- PR 1 is safest first because it improves correctness without changing
external artifact layout.
- PR 2 keeps parser compatibility so prompt deployment can happen
independently.
- PR 3 and PR 4 split filesystem/scope simplification from locking
simplification to reduce blast radius.
- PR 5 is intentionally cleanup-only.
2026-02-10 21:50:53 +00:00
jif-oai
07da740c8a
feat: mem v2 - PR1 (#11364)
# Memories migration plan (simplified global workflow)

## Target behavior

- One shared memory root only: `~/.codex/memories/`.
- No per-cwd memory buckets, no cwd hash handling.
- Phase 1 candidate rules:
- Not currently being processed unless the job lease is stale.
- Rollout updated within the max-age window (currently 30 days).
- Rollout idle for at least 12 hours (new constant).
- Global cap: at most 64 stage-1 jobs in `running` state at any time
(new invariant).
- Stage-1 model output shape (new):
- `rollout_slug` (accepted but ignored for now).
- `rollout_summary`.
- `raw_memory`.
- Phase-1 artifacts written under the shared root:
- `rollout_summaries/<thread_id>.md` for each rollout summary.
- `raw_memories.md` containing appended/merged raw memory paragraphs.
- Phase 2 runs one consolidation agent for the shared `memories/`
directory.
- Phase-2 lock is DB-backed with 1 hour lease and heartbeat/expiry.

## Current code map

- Core startup pipeline: `core/src/memories/startup/mod.rs`.
- Stage-1 request+parse: `core/src/memories/startup/extract.rs`,
`core/src/memories/stage_one.rs`, templates in
`core/templates/memories/`.
- File materialization: `core/src/memories/storage.rs`,
`core/src/memories/layout.rs`.
- Scope routing (cwd/user): `core/src/memories/scope.rs`,
`core/src/memories/startup/mod.rs`.
- DB job lifecycle and scope queueing: `state/src/runtime/memory.rs`.

## PR plan

## PR 1: Correct phase-1 selection invariants (no behavior-breaking
layout changes yet)

- Add `PHASE_ONE_MIN_ROLLOUT_IDLE_HOURS: i64 = 12` in
`core/src/memories/mod.rs`.
- Thread this into `state::claim_stage1_jobs_for_startup(...)`.
- Enforce idle-time filter in DB selection logic (not only in-memory
filtering after `scan_limit`) so eligible threads are not starved by
very recent threads.
- Enforce global running cap of 64 at claim time in DB logic:
- Count fresh `memory_stage1` running jobs.
- Only allow new claims while count < cap.
- Keep stale-lease takeover behavior intact.
- Add/adjust tests in `state/src/runtime.rs`:
- Idle filter inclusion/exclusion around 12h boundary.
- Global running-cap guarantee.
- Existing stale/fresh ownership behavior still passes.

Acceptance criteria:
- Startup never creates more than 64 fresh `memory_stage1` running jobs.
- Threads updated <12h ago are skipped.
- Threads older than 30d are skipped.

## PR 2: Stage-1 output contract + storage artifacts
(forward-compatible)

- Update parser/types to accept the new structured output while keeping
backward compatibility:
- Add `rollout_slug` (optional for now).
- Add `rollout_summary`.
- Keep alias support for legacy `summary` and `rawMemory` until prompt
swap completes.
- Update stage-1 schema generator in `core/src/memories/stage_one.rs` to
include the new keys.
- Update prompt templates:
- `core/templates/memories/stage_one_system.md`.
- `core/templates/memories/stage_one_input.md`.
- Replace storage model in `core/src/memories/storage.rs`:
- Introduce `rollout_summaries/` directory writer (`<thread_id>.md`
files).
- Introduce `raw_memories.md` aggregator writer from DB rows.
- Keep deterministic rebuild behavior from DB outputs so files can
always be regenerated.
- Update consolidation prompt template to reference `rollout_summaries/`
+ `raw_memories.md` inputs.

Acceptance criteria:
- Stage-1 accepts both old and new output keys during migration.
- Phase-1 artifacts are generated in new format from DB state.
- No dependence on per-thread files in `raw_memories/`.

## PR 3: Remove per-cwd memories and move to one global memory root

- Simplify layout in `core/src/memories/layout.rs`:
- Single root: `codex_home/memories`.
- Remove cwd-hash bucket helpers and normalization logic used only for
memory pathing.
- Remove scope branching from startup phase-2 dispatch path:
- No cwd/user mapping in `core/src/memories/startup/mod.rs`.
- One target root for consolidation.
- In `state/src/runtime/memory.rs`, stop enqueueing/handling cwd
consolidation scope.
- Keep one logical consolidation scope/job key (global/user) to avoid a
risky schema rewrite in same PR.
- Add one-time migration helper (core side) to preserve current shared
memory output:
- If `~/.codex/memories/user/memory` exists and new root is empty,
move/copy contents into `~/.codex/memories`.
- Leave old hashed cwd buckets untouched for now (safe/no-destructive
migration).

Acceptance criteria:
- New runs only read/write `~/.codex/memories`.
- No new cwd-scoped consolidation jobs are enqueued.
- Existing user-shared memory content is preserved.

## PR 4: Phase-2 global lock simplification and cleanup

- Replace multi-scope dispatch with a single global consolidation claim
path:
- Either reuse jobs table with one fixed key, or add a tiny dedicated
lock helper; keep 1h lease.
- Ensure at most one consolidation agent can run at once.
- Keep heartbeat + stale lock recovery semantics in
`core/src/memories/startup/watch.rs`.
- Remove dead scope code and legacy constants no longer used.
- Update tests:
- One-agent-at-a-time behavior.
- Lock expiry allows takeover after stale lease.

Acceptance criteria:
- Exactly one phase-2 consolidation agent can be active cluster-wide
(per local DB).
- Stale lock recovers automatically.

## PR 5: Final cleanup and docs

- Remove legacy artifacts and references:
- `raw_memories/` and `memory_summary.md` assumptions from
prompts/comments/tests.
- Scope constants for cwd memory pathing in core/state if fully unused.
- Update docs under `docs/` for memory workflow and directory layout.
- Add a brief operator note for rollout: compatibility window for old
stage-1 JSON keys and when to remove aliases.

Acceptance criteria:
- Code and docs reflect only the simplified global workflow.
- No stale references to per-cwd memory buckets.

## Notes on sequencing

- PR 1 is safest first because it improves correctness without changing
external artifact layout.
- PR 2 keeps parser compatibility so prompt deployment can happen
independently.
- PR 3 and PR 4 split filesystem/scope simplification from locking
simplification to reduce blast radius.
- PR 5 is intentionally cleanup-only.
2026-02-10 21:29:06 +00:00
jif-oai
a6e9469fa4
chore: unify memory job flow (#11334) 2026-02-10 20:26:39 +00:00
Michael Bolin
58a59a2dae
Use thin LTO for alpha Rust release builds (#11348)
We are looking to speed up build times for alpha releases, but we do not
want to completely compromise on runtime performance by shipping debug
builds. This PR changes our CI so that alpha releases build with
`lto="thin"` instead of `lto="fat"`.

Specifically, this change keeps `[profile.release] lto = "fat"` as the
default in `Cargo.toml`, but overrides LTO in CI using
`CARGO_PROFILE_RELEASE_LTO`:
- `rust-release.yml`: use `thin` for `-alpha` tags, otherwise `fat`
- `shell-tool-mcp.yml`: use `thin` for `-alpha` versions, otherwise
`fat`

Tradeoffs:
- Alpha binaries may be somewhat larger and/or slightly slower than
fat-LTO builds
- LTO policy now lives in workflow logic for two pipelines, so
consistency must be maintained across both files

Note `CARGO_PROFILE_<name>_LTO` is documented on
https://doc.rust-lang.org/cargo/reference/environment-variables.html#configuration-environment-variables.
2026-02-10 11:59:03 -08:00
Ahmed Ibrahim
5e01450963
Strip unsupported images from prompt history to guard against model switch (#11349)
- Make `ContextManager::for_prompt` modality-aware and strip input_image
content when the active model is text-only.
- Added a test for multi-model -> text-only model switch
2026-02-10 11:58:00 -08:00
iceweasel-oai
82f93a13b2
include sandbox (seatbelt, elevated, etc.) as in turn metadata header (#10946)
This will help us understand retention/usage for folks who use the
Windows (or any other) sandboxes
2026-02-10 19:50:07 +00:00
viyatb-oai
62d0f302fd
fix(core): canonicalize wrapper approvals and support heredoc prefix … (#10941)
## Summary
- Reduced repeated approvals for equivalent wrapper commands and fixed
execpolicy matching for heredoc-style shell invocations, with minimal
behavior change and fail-closed defaults.

## Fixes
1. Canonicalized approval matching for wrappers so equivalent commands
map to the same approval intent.
2. Added heredoc-aware prefix extraction for execpolicy so commands like
`python3 <<'PY' ... PY` match rules such as `prefix_rule(["python3"],
...)`.
3. Kept fallback behavior conservative: if parsing is ambiguous,
existing prompt behavior is preserved.

## Edge Cases Covered
- Wrapper path/name differences: `/bin/bash` vs `bash`, `/bin/zsh` vs
`zsh`.
- Shell modes: `-c` and `-lc`.
- Heredoc forms: quoted delimiter (`<<'PY'`) and unquoted delimiter (`<<
PY`).
- Multi-command heredoc scripts are rejected by the fallback
- Non-heredoc redirections (`>`, etc.) are not treated as heredoc prefix
matches.
- Complex scripts still fall back to prior behavior rather than
expanding permissions.

---------

Co-authored-by: Dylan Hurd <dylan.hurd@openai.com>
2026-02-10 11:46:40 -08:00
pakrym-oai
e4b5384539
Extract tool building (#11337)
Make it clear what input go into building tools and allow for easy reuse
for pre-warm request
2026-02-10 11:45:23 -08:00
Ahmed Ibrahim
9c4656000f
Sanitize MCP image output for text-only models (#11346)
- Replace image blocks in MCP tool results with a text placeholder when
the active model does not accept image input.
- Add an e2e rmcp test to verify sanitized tool output is what gets sent
back to the model.
2026-02-10 11:25:32 -08:00
Ahmed Ibrahim
6e96e4837e
Always expose view_image and return unsupported image-input error (#11336)
- Keep `view_image` in the advertised tool list for all models.
- Return a clear error when the current model does not support image
inputs, and cover it with a unit test.
2026-02-10 11:25:12 -08:00
jif-oai
847a6092e6
fix: reduce usage of open_if_present (#11344) 2026-02-10 19:25:07 +00:00
pakrym-oai
0639c33892
Compare full request for websockets incrementality (#11343)
Tools can dynamically change mid-turn now. We need to be more thorough
about reusing incremental connections.
2026-02-10 19:14:36 +00:00
Michael Bolin
548afa5749
core: remove stale apply_patch SandboxPolicy TODO in seatbelt (#11345)
The `TODO` in `core/src/seatbelt.rs` claimed that `apply_patch` still needed to honor `SandboxPolicy`. That was true when the comment was added, but it is no longer true.

Analysis:
- The TODO was introduced in #1762, when seatbelt code was split out of `exec.rs`.
- `apply_patch` sandboxing was later implemented in #1705.
- Today, `apply_patch` calls are routed through the tool orchestrator and delegated to `ApplyPatchRuntime`, which executes via `execute_env()` using the active sandbox attempt policy.
- On macOS, the sandbox transform path for that execution still builds seatbelt args with `create_seatbelt_command_args(command, policy, sandbox_policy_cwd)`, so the same `SandboxPolicy` gates `apply_patch` writes and network behavior.

Because this behavior is already enforced, the TODO is stale and removing it avoids implying missing sandbox coverage where none exists.

No functional behavior change; comment-only cleanup.
2026-02-10 19:10:02 +00:00
Dylan Hurd
f3bbcc987d
test(core): stabilize ARM bazel remote-model and parallelism tests (#11330)
## Summary
- keep wiremock MockServer handles alive through async assertions in
remote model suite tests
- assert /models request count in remote_models_hide_picker_only_models
- use a slightly higher parallel timing threshold on aarch64 while
keeping existing x86 threshold

## Validation
- just fmt
- targeted tests:
- cargo test -p codex-core --test all
suite::remote_models::remote_models_merge_replaces_overlapping_model --
--exact
- cargo test -p codex-core --test all
suite::remote_models::remote_models_hide_picker_only_models -- --exact
- cargo test -p codex-core --test all
suite::tool_parallelism::shell_tools_run_in_parallel -- --exact
- soak loop: 40 iterations of all three targeted tests

## Notes
- cargo test -p codex-core has one unrelated local-env failure in
shell_snapshot::tests::try_new_creates_and_deletes_snapshot_file from
exported certificate env content in this workspace.
- local bazel test //codex-rs/core:core-all-test failed to build due
missing rust-objcopy in this host toolchain.
2026-02-10 10:57:50 -08:00
Michael Bolin
d9c014efce
# Use @openai/codex dist-tags for platform binaries instead of separate package names (#11339)
https://github.com/openai/codex/pull/11318 introduced logic to publish
platform artifacts as separate npm packages (for example,
`@openai/codex-darwin-arm64`, `@openai/codex-linux-x64`, etc.). That
requires provisioning and maintaining multiple package entries in npm,
which we want to avoid.

We still need to keep the package-size mitigation (platform-specific
payloads), but we want that layout to live under a single npm package
namespace (`@openai/codex`) using dist-tags.

We also need to preserve pre-release workflows where users install
`@openai/codex@alpha` and get platform-appropriate binaries.

Additionally, we want GitHub Release assets to group Codex npm tarballs
together, so platform tarballs should follow the same `codex-npm-*`
filename prefix as the main Codex tarball.

## Release Strategy (New Scheme)

We publish **one npm package name for Codex binaries** (`@openai/codex`)
and use **dist-tags** to select platform-specific payloads. This avoids
creating separate platform package names while keeping the package size
split by platform.

### What gets published

#### Mainline release (`x.y.z`)

- `@openai/codex@latest` (meta package)
- `@openai/codex@darwin-arm64`
- `@openai/codex@darwin-x64`
- `@openai/codex@linux-arm64`
- `@openai/codex@linux-x64`
- `@openai/codex@win32-arm64`
- `@openai/codex@win32-x64`
- `@openai/codex-responses-api-proxy@latest`
- `@openai/codex-sdk@latest`

#### Alpha release (`x.y.z-alpha.N`)

- `@openai/codex@alpha` (meta package)
- `@openai/codex@alpha-darwin-arm64`
- `@openai/codex@alpha-darwin-x64`
- `@openai/codex@alpha-linux-arm64`
- `@openai/codex@alpha-linux-x64`
- `@openai/codex@alpha-win32-arm64`
- `@openai/codex@alpha-win32-x64`
- `@openai/codex-responses-api-proxy@alpha`
- `@openai/codex-sdk@alpha`

As an example, the `package.json` for `@openai/codex@alpha` (using
`0.99.0-alpha.17` as the `version`) would be:

```
{
  "name": "@openai/codex",
  "version": "0.99.0-alpha.17",
  "license": "Apache-2.0",
  "bin": {
    "codex": "bin/codex.js"
  },
  "type": "module",
  "engines": {
    "node": ">=16"
  },
  "files": [
    "bin"
  ],
  "repository": {
    "type": "git",
    "url": "git+https://github.com/openai/codex.git",
    "directory": "codex-cli"
  },
  "packageManager": "pnpm@10.28.2+sha512.41872f037ad22f7348e3b1debbaf7e867cfd448f2726d9cf74c08f19507c31d2c8e7a11525b983febc2df640b5438dee6023ebb1f84ed43cc2d654d2bc326264",
  "optionalDependencies": {
    "@openai/codex-linux-x64": "npm:@openai/codex@0.99.0-alpha.17-linux-x64",
    "@openai/codex-linux-arm64": "npm:@openai/codex@0.99.0-alpha.17-linux-arm64",
    "@openai/codex-darwin-x64": "npm:@openai/codex@0.99.0-alpha.17-darwin-x64",
    "@openai/codex-darwin-arm64": "npm:@openai/codex@0.99.0-alpha.17-darwin-arm64",
    "@openai/codex-win32-x64": "npm:@openai/codex@0.99.0-alpha.17-win32-x64",
    "@openai/codex-win32-arm64": "npm:@openai/codex@0.99.0-alpha.17-win32-arm64"
  }
}
```

Note that the keys in `optionalDependencies` have "clean" names, but the
values have the tag embedded.

### Important note

**Note:** Because we never created the new platform package names on npm
(for example,
`@openai/codex-darwin-arm64`) since #11318 landed, there are no extra
npm packages to clean up.

## What changed

### 1. Stage platform tarballs as `@openai/codex` with platform-specific
versions

File: `codex-cli/scripts/build_npm_package.py`

- Added `CODEX_NPM_NAME = "@openai/codex"` and platform metadata
`npm_tag` values:
- `darwin-arm64`, `darwin-x64`, `linux-arm64`, `linux-x64`,
`win32-arm64`, `win32-x64`
- For platform package staging (`codex-<platform>` inputs), switched
generated `package.json` from:
  - `name = @openai/codex-<platform>`
  to:
  - `name = @openai/codex`
- Added `compute_platform_package_version(version, platform_tag)` so
platform tarballs have unique
versions (`<release-version>-<platform-tag>`), which is required because
npm forbids re-publishing
  the same `name@version`.

### 2. Point meta package optional dependencies at dist-tags on
`@openai/codex`

File: `codex-cli/scripts/build_npm_package.py`

- Updated `optionalDependencies` generation for the main `codex` package
to use npm alias syntax:
- key remains alias package name (for example,
`@openai/codex-darwin-arm64`) so runtime lookup behavior is unchanged
  - value now resolves to `@openai/codex` by dist-tag
- Stable releases emit tags like `npm:@openai/codex@darwin-arm64`.
- Alpha releases (`x.y.z-alpha.N`) emit tags like
`npm:@openai/codex@alpha-darwin-arm64`.

### 3. Publish with per-tarball dist-tags in release CI

File: `.github/workflows/rust-release.yml`

- Reworked npm publish logic to derive the publish tag per tarball
filename:
  - platform tarballs publish with `<platform>` tags for stable releases
- platform tarballs publish with `alpha-<platform>` tags for alpha
releases
- top-level tarballs (`codex`, `codex-responses-api-proxy`, `codex-sdk`)
continue using
the existing channel tag policy (`latest` implicit for stable, `alpha`
for alpha)
- Added fail-fast behavior for unexpected tarball names to avoid silent
mispublishes.

### 4. Normalize Codex platform tarball filenames for GitHub Release
grouping

Files: `scripts/stage_npm_packages.py`,
`.github/workflows/rust-release.yml`

- Renamed staged platform tarball filenames from:
  - `codex-linux-<arch>-npm-<version>.tgz`
  - `codex-darwin-<arch>-npm-<version>.tgz`
  - `codex-win32-<arch>-npm-<version>.tgz`
- To:
  - `codex-npm-linux-<arch>-<version>.tgz`
  - `codex-npm-darwin-<arch>-<version>.tgz`
  - `codex-npm-win32-<arch>-<version>.tgz`

This keeps all Codex npm artifacts grouped under a common `codex-npm-`
prefix in GitHub Releases.

### 5. Documentation update

File: `codex-cli/scripts/README.md`

- Updated staging docs to clarify that platform-native variants are
published as dist-tagged
  `@openai/codex` artifacts rather than separate npm package names.

## Resulting behavior

- Mainline release:
  - `@openai/codex@latest` resolves the meta package
- meta package optional dependencies resolve
`@openai/codex@<platform-tag>`
- Alpha release:
  - users can continue installing `@openai/codex@alpha`
- alpha meta package optional dependencies resolve
`@openai/codex@alpha-<platform-tag>`
- Release assets:
- Codex npm tarballs share `codex-npm-` prefix for cleaner grouping in
GitHub Releases

This preserves platform-specific payload distribution while avoiding
separate npm package names and
improves release-asset discoverability.

## Validation notes

- Verified staged `package.json` output for stable and alpha meta
packages includes expected alias targets.
- Verified staged platform package manifests are `name=@openai/codex`
with unique platform-suffixed versions.
- Verified publish tag derivation maps renamed platform tarballs to
expected stable and alpha dist-tags.
2026-02-10 10:33:47 -08:00
guinness-oai
099ed802b2
Treat first rollout session_meta as canonical thread identity (#11241)
During thread/fork, the new rollout includes the fork’s own session_meta
plus copied history that can contain older session_meta entries from the
source thread. thread/list was overwriting metadata on later
session_meta lines, so a fork could be reported with the source thread’s
thread_id. This fix only uses the first session_meta, so the fork keeps
its own ID.
2026-02-10 10:32:11 -08:00
jif-oai
a364dd8b56
feat: opt-out of events in the app-server (#11319)
Add `optOutNotificationMethods` in the app-server to opt-out events
based on exact method matching
2026-02-10 18:04:52 +00:00
Matthew Zeng
48e415bdef
[apps] Improve app installation flow. (#11249)
- [x] Add buttons to start the installation flow and verify installation
completes.
- [x] Hard refresh apps list when the /apps view opens.
2026-02-10 17:59:43 +00:00
Shijie Rao
c4b771a16f
Fix: update parallel tool call exec approval to approve on request id (#11162)
### Summary

In parallel tool call, exec command approvals were not approved at
request level but at a turn level. i.e. when a single request is
approved, the system currently treats all requests in turn as approved.

### Before

https://github.com/user-attachments/assets/d50ed129-b3d2-4b2f-97fa-8601eb11f6a8

### After

https://github.com/user-attachments/assets/36528a43-a4aa-4775-9e12-f13287ef19fc
2026-02-10 09:38:00 -08:00
Max Johnson
47356ff83c
Revert "Add app-server transport layer with websocket support (#10693)" (#11323)
Suspected cause of deadlocking bug
2026-02-10 17:37:49 +00:00
Fouad Matin
693bac1851
fix(protocol): approval policy never prompt (#11288)
This removes overly directed language about how the model should behave
when it's in `approval_policy=never` mode.

---------

Co-authored-by: Dylan Hurd <dylan.hurd@openai.com>
2026-02-10 09:27:46 -08:00
Josh McKinney
e704f488bd
tui: keep history recall cursor at line end (#11295)
## Summary
- keep cursor at end-of-line after Up/Down history recall
- allow continued history navigation when recalled text cursor is at
start or end boundary
- add regression tests and document the history cursor contract in
composer docs

## Testing
- just fmt
- cargo test -p codex-tui --lib
history_navigation_leaves_cursor_at_end_of_line
- cargo test -p codex-tui --lib
should_handle_navigation_when_cursor_is_at_line_boundaries
- cargo test -p codex-tui *(fails in existing integration test
`suite::no_panic_on_startup::malformed_rules_should_not_panic` because
`target/debug/codex` is not present in this environment)*
2026-02-10 17:21:46 +00:00
pakrym-oai
3322b99900
Remove ApiPrompt (#11265)
Keep things simple and build a full Responses API request request right
in the model client
2026-02-10 16:12:31 +00:00
jif-oai
59c625458b
Fix pending input test waiting logic (#11322)
## Summary
- remove redundant user message wait that could time out and cause
flakiness
- rely on the existing turn-complete wait to ensure the follow-up
request is observed

## Testing
- Not run (not requested)
2026-02-10 15:40:53 +00:00
jif-oai
c19969c676
chore: split NPM packages (#11318) 2026-02-10 14:49:53 +00:00
jif-oai
e57892b211
feat: phase 2 consolidation (#11306)
Consolidation phase of memories

Cleaning and better handling of concurrency
2026-02-10 14:31:16 +00:00
jif-oai
d735df1f50
Extract hooks into dedicated crate (#11311)
Summary
- move `core/src/hooks` implementation into a new `codex-hooks` crate
with its own manifest
- update `codex-rs` workspace and `codex-core` crate to depend on the
extracted `hooks` crate and wire up the shared APIs
- ensure references, modules, and lockfile reflect the new crate layout

Testing
- Not run (not requested)
2026-02-10 13:42:17 +00:00
jif-oai
1d5eba0090
feat: align memory phase 1 and make it stronger (#11300)
## Align with the new phase-1 design

Basically we know run phase 1 in parallel by considering:
* Max 64 rollouts
* Max 1 month old
* Consider the most recent first

This PR also adds stronger parallelization capabilities by detecting
stale jobs, retry policies, ownership of computation to prevent double
computations etc etc
2026-02-10 13:42:09 +00:00
jif-oai
223fadc760
Fix spawn_agent input type (#11304) 2026-02-10 12:16:39 +00:00
jif-oai
87ccc5bbae
feat: add connector capabilities to sub-agents (#11191) 2026-02-10 11:53:01 +00:00
jif-oai
6049ff02a0
memories: add extraction and prompt module foundation (#11200)
## Summary
- add the new `core/src/memories` module (phase-one parsing, rollout
filtering, storage, selection, prompts)
- add Askama-backed memory templates for stage-one input/system and
consolidation prompts
- add module tests for parsing, filtering, path bucketing, and summary
maintenance

## Testing
- just fmt
- cargo test -p codex-core --lib memories::
2026-02-10 10:10:24 +00:00
Michael Bolin
44ebf4588f
feat: retain NetworkProxy, when appropriate (#11207)
As of this PR, `SessionServices` retains a
`Option<StartedNetworkProxy>`, if appropriate.

Now the `network` field on `Config` is `Option<NetworkProxySpec>`
instead of `Option<NetworkProxy>`.

Over in `Session::new()`, we invoke `NetworkProxySpec::start_proxy()` to
create the `StartedNetworkProxy`, which is a new struct that retains the
`NetworkProxy` as well as the `NetworkProxyHandle`. (Note that `Drop` is
implemented for `NetworkProxyHandle` to ensure the proxies are shutdown
when it is dropped.)

The `NetworkProxy` from the `StartedNetworkProxy` is threaded through to
the appropriate places.


---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/11207).
* #11285
* __->__ #11207
2026-02-10 02:09:23 -08:00
Michael Bolin
8e240a13be
chore: put crypto provider logic in a shared crate (#11294)
Ensures a process-wide rustls crypto provider is installed.

Both the `codex-network-proxy` and `codex-api` crates need this.
2026-02-10 01:04:31 -08:00
alexsong-oai
9fded117ac
feat: support configurable metric_exporter (#10940) 2026-02-10 08:14:28 +00:00