core-agent-ide

Author	SHA1	Message	Date
Eric Traut	81c534102e	Increased file watcher debounce duration from 1s to 10s (#11494 ) Users were reporting that when they were actively editing a skill file, they would see frequent errors (one per second) across all of their active sessions until they fixed all frontmatter parse errors. This change will reduce the chatter at the expense of a slightly longer delay before skills are updated in the UI. This addresses #11385	2026-02-11 14:08:03 -08:00
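The debounce described in this commit can be sketched as a trailing-edge debouncer: a burst of file-change events collapses into a single notification once the watcher has been quiet for the whole window. This is a minimal sketch, not the Codex implementation; `Debouncer`, `on_event`, and `should_fire` are hypothetical names, and only the 10s window comes from the commit message.

```rust
use std::time::{Duration, Instant};

/// Hypothetical debouncer, not taken from the Codex source.
pub struct Debouncer {
    window: Duration,
    last_event: Option<Instant>,
}

impl Debouncer {
    pub fn new(window: Duration) -> Self {
        Self { window, last_event: None }
    }

    /// Record a file-change event observed at `now`; this restarts the quiet period.
    pub fn on_event(&mut self, now: Instant) {
        self.last_event = Some(now);
    }

    /// Returns true once per burst, after no event has arrived for `window`.
    pub fn should_fire(&mut self, now: Instant) -> bool {
        match self.last_event {
            Some(t) if now.duration_since(t) >= self.window => {
                self.last_event = None;
                true
            }
            _ => false,
        }
    }
}
```

With a 10s window, edits at t=0s and t=5s produce one reload at t=15s rather than one per edit, which is the trade-off the message describes.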
jif-oai	de6f2ef746	nit: memory truncation (#11479 ) Use existing truncation for memories	2026-02-11 21:11:57 +00:00
pakrym-oai	d73de9c8ba	Pump pings (#11413 ) Keep processing ping even when the agent isn't actively running. Otherwise the connection will drop.	2026-02-11 12:43:57 -08:00
Max Johnson	b5339a591d	refactor: codex app-server ThreadState (#11419 ) This is a no-op functionality-wise. It consolidates thread-specific message processor / event handling state in ThreadState.	2026-02-11 12:20:54 -08:00
Curtis 'Fjord' Hawthorne	42e22f3bde	Add feature-gated freeform js_repl core runtime (#10674 ) ## Summary This PR adds an experimental, feature-gated `js_repl` core runtime so models can execute JavaScript in a persistent REPL context across tool calls. The implementation integrates with existing feature gating, tool registration, prompt composition, config/schema docs, and tests. ## What changed - Added new experimental feature flag: `features.js_repl`. - Added freeform `js_repl` tool and companion `js_repl_reset` tool. - Gated tool availability behind `Feature::JsRepl`. - Added conditional prompt-section injection for JS REPL instructions via marker-based prompt processing. - Implemented JS REPL handlers, including freeform parsing and pragma support (timeout/reset controls). - Added runtime resolution order for Node: 1. `CODEX_JS_REPL_NODE_PATH` 2. `js_repl_node_path` in config 3. `PATH` - Added JS runtime assets/version files and updated docs/schema. ## Why This enables richer agent workflows that require incremental JavaScript execution with preserved state, while keeping rollout safe behind an explicit feature flag. ## Testing Coverage includes: - Feature-flag gating behavior for tool exposure. - Freeform parser/pragma handling edge cases. - Runtime behavior (state persistence across calls and top-level `await` support). ## Usage ```toml [features] js_repl = true ``` Optional runtime override: - `CODEX_JS_REPL_NODE_PATH`, or - `js_repl_node_path` in config. #### [git stack](https://github.com/magus/git-stack-cli) - 👉 `1` https://github.com/openai/codex/pull/10674 - ⏳ `2` https://github.com/openai/codex/pull/10672 - ⏳ `3` https://github.com/openai/codex/pull/10671 - ⏳ `4` https://github.com/openai/codex/pull/10673 - ⏳ `5` https://github.com/openai/codex/pull/10670	2026-02-11 12:05:02 -08:00
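The Node resolution order listed in this commit (`CODEX_JS_REPL_NODE_PATH`, then `js_repl_node_path` in config, then `PATH`) could look roughly like the following. This is a sketch under stated assumptions: `NodeSource` and `resolve_node` are hypothetical names, and treating an empty string as unset is a choice made here, not something the commit states.

```rust
/// Where the Node binary was resolved from (hypothetical type).
#[derive(Debug, PartialEq, Eq)]
pub enum NodeSource {
    /// Value of the `CODEX_JS_REPL_NODE_PATH` environment variable.
    EnvVar(String),
    /// Value of `js_repl_node_path` in the config file.
    Config(String),
    /// Neither override is set; look up `node` on `PATH`.
    Path,
}

/// Apply the precedence from the commit message: env var, then config, then PATH.
pub fn resolve_node(env_override: Option<&str>, config_path: Option<&str>) -> NodeSource {
    // Assumption of this sketch: an empty string counts as "not set".
    let non_empty = |v: Option<&str>| v.filter(|s| !s.is_empty()).map(str::to_owned);
    if let Some(p) = non_empty(env_override) {
        NodeSource::EnvVar(p)
    } else if let Some(p) = non_empty(config_path) {
        NodeSource::Config(p)
    } else {
        NodeSource::Path
    }
}
```

A caller would typically pass `std::env::var("CODEX_JS_REPL_NODE_PATH").ok().as_deref()` as the first argument.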
iceweasel-oai	87279de434	Promote Windows Sandbox (#11341 ) 1. Move Windows Sandbox NUX to right after trust directory screen 2. Don't offer read-only as an option in Sandbox NUX. Elevated/Legacy/Quit 3. Don't allow new untrusted directories. It's trust or quit 4. Move experimental sandbox features to `[windows] sandbox="elevated|unelevated"` 5. Copy tweaks = elevated -> default, non-elevated -> non-admin	2026-02-11 11:48:33 -08:00
Owen Lin	24e6adbda5	fix: Constrained import (#11485 ) main seems broken	2026-02-11 11:44:20 -08:00
jif-oai	53c1818d29	chore: update mem prompt (#11480 )	2026-02-11 19:29:39 +00:00
pakrym-oai	2c3ce2048d	Linkify feedback link (#11414 ) Make it clickable	2026-02-11 11:21:03 -08:00
jif-oai	2fac9cc8cd	chore: sub-agent never ask for approval (#11464 )	2026-02-11 19:19:37 +00:00
Yuvraj Angad Singh	b4ffb2eb58	fix(tui): increase paste burst char interval on Windows to 30ms (#9348 ) ## Summary - Increases `PASTE_BURST_CHAR_INTERVAL` from 8ms to 30ms on Windows to fix multi-line paste issues in VS Code integrated terminal - Follows existing pattern of platform-specific timing (like `PASTE_BURST_ACTIVE_IDLE_TIMEOUT`) ## Problem When pasting multi-line text in Codex CLI on Windows (especially VS Code integrated terminal), only the first portion is captured before auto-submit. The rest arrives as a separate message. Root cause: VS Code's terminal emulation adds latency (~10-15ms per character) between key events. The 8ms `PASTE_BURST_CHAR_INTERVAL` threshold is too tight - characters arrive slower than expected, so burst detection fails and Enter submits instead of inserting a newline. ## Solution Use Windows-specific timing (30ms) for `PASTE_BURST_CHAR_INTERVAL`, following the same pattern already used for `PASTE_BURST_ACTIVE_IDLE_TIMEOUT` (60ms on Windows vs 8ms on Unix). 30ms is still fast enough to distinguish paste from typing (humans type ~200ms between keystrokes). ## Test plan - [x] All existing paste_burst tests pass - [ ] Test multi-line paste in VS Code integrated PowerShell on Windows - [ ] Test multi-line paste in standalone Windows PowerShell - [ ] Verify no regression on macOS/Linux Fixes #2137 Co-authored-by: Josh McKinney <joshka@openai.com>	2026-02-11 10:31:30 -08:00
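The burst check this commit tunes can be illustrated with a small platform-parameterised function. This is a sketch only: the 30ms and 8ms values come from the commit message, while `paste_burst_char_interval` and `is_paste_burst` are hypothetical helpers. The real constant is presumably selected at compile time; the platform is passed as a flag here so both branches can be exercised anywhere.

```rust
use std::time::Duration;

/// Maximum gap between key events that still counts as one paste burst.
/// 30ms on Windows and 8ms elsewhere, per the commit message.
pub fn paste_burst_char_interval(is_windows: bool) -> Duration {
    if is_windows {
        Duration::from_millis(30)
    } else {
        Duration::from_millis(8)
    }
}

/// True when a key event arriving `gap` after the previous one should be
/// treated as pasted input rather than typing.
pub fn is_paste_burst(gap: Duration, is_windows: bool) -> bool {
    gap <= paste_burst_char_interval(is_windows)
}
```

A 12ms gap, in the latency range the message attributes to the VS Code terminal, is classified as a paste under the Windows threshold but as typing under the 8ms one, which is why Enter was being treated as a submit.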
jif-oai	1170ffeeae	chore: clean rollout extraction in memories (#11471 )	2026-02-11 18:25:45 +00:00
jif-oai	d4b2c230f1	feat: memory read path (#11459 )	2026-02-11 18:22:45 +00:00
Michael Bolin	3a9324707d	feat: panic if Constrained<WebSearchMode> does not support Disabled (#11470 ) If this happens, this is a logical error on our part and we should fix it.	2026-02-11 10:18:58 -08:00
Max Johnson	7053aa5457	Reapply "Add app-server transport layer with websocket support" (#11370 ) Reapply "Add app-server transport layer with websocket support" with additional fixes from https://github.com/openai/codex/pull/11313/changes to avoid deadlocking. This reverts commit `47356ff83c`. ## Summary To avoid deadlocking when queues are full, we maintain separate tokio tasks dedicated to incoming vs outgoing event handling - split the app-server main loop into two tasks in `run_main_with_transport` - inbound handling (`transport_event_rx`) - outbound handling (`outgoing_rx` + `thread_created_rx`) - separate incoming and outgoing websocket tasks ## Validation Integration tests, testing thoroughly e2e in codex app w/ >10 concurrent requests <img width="1365" height="979" alt="Screenshot 2026-02-10 at 2 54 22 PM" src="https://github.com/user-attachments/assets/47ca2c13-f322-4e5c-bedd-25859cbdc45f" /> --------- Co-authored-by: jif-oai <jif@openai.com>	2026-02-11 18:13:39 +00:00
Michael Bolin	577a416f9a	Extract `codex-config` from `codex-core` (#11389 ) `codex-core` had accumulated config loading, requirements parsing, constraint logic, and config-layer state handling in a single crate. This change extracts that subsystem into `codex-config` to reduce `codex-core` rebuild/test surface area and isolate future config work. ## What Changed ### Added `codex-config` - Added new workspace crate `codex-rs/config` (`codex-config`). - Added workspace/build wiring in: - `codex-rs/Cargo.toml` - `codex-rs/config/Cargo.toml` - `codex-rs/config/BUILD.bazel` - Updated lockfiles (`codex-rs/Cargo.lock`, `MODULE.bazel.lock`). - Added `codex-core` -> `codex-config` dependency in `codex-rs/core/Cargo.toml`. ### Moved config internals from `core` into `config` Moved modules to `codex-rs/config/src/`: - `core/src/config/constraint.rs` -> `config/src/constraint.rs` - `core/src/config_loader/cloud_requirements.rs` -> `config/src/cloud_requirements.rs` - `core/src/config_loader/config_requirements.rs` -> `config/src/config_requirements.rs` - `core/src/config_loader/fingerprint.rs` -> `config/src/fingerprint.rs` - `core/src/config_loader/merge.rs` -> `config/src/merge.rs` - `core/src/config_loader/overrides.rs` -> `config/src/overrides.rs` - `core/src/config_loader/requirements_exec_policy.rs` -> `config/src/requirements_exec_policy.rs` - `core/src/config_loader/state.rs` -> `config/src/state.rs` `codex-config` now re-exports this surface from `config/src/lib.rs` at the crate top level. ### Updated `core` to consume/re-export `codex-config` - `core/src/config_loader/mod.rs` now imports/re-exports config-loader types/functions from top-level `codex_config::*`. - Local moved modules were removed from `core/src/config_loader/`. - `core/src/config/mod.rs` now re-exports constraint types from `codex_config`.	2026-02-11 10:02:49 -08:00
viyatb-oai	7e0178597e	feat(core): promote Linux bubblewrap sandbox to Experimental (#11381 ) ## Summary - Promote `use_linux_sandbox_bwrap` to `Stage::Experimental` on Linux so users see it in `/experimental` and get a startup nudge.	2026-02-11 09:49:24 -08:00
jif-oai	9efb7f4a15	clean: memory rollout recorder (#11462 )	2026-02-11 15:46:10 +00:00
pakrym-oai	eac5473114	Do not attempt to append after response.completed (#11402 ) Completed responses are fully done, and a new response must be created.	2026-02-11 07:45:17 -08:00
sayan-oai	83a54766b7	chore: rename disable_websockets -> websockets_disabled (#11420 ) `disable_websockets()` is confusing because it's a getter. Rename for clarity.	2026-02-11 07:44:05 -08:00
jif-oai	b58afbfd0a	feat: set policy for phase 2 memory (#11449 ) Set the policy of the memory phase 2 worker such that it never asks for approval	2026-02-11 15:39:22 +00:00
jif-oai	bd3bf6eda1	fix: optional schema of memories (#11454 )	2026-02-11 15:05:36 +00:00
jif-oai	156f47edd0	feat: close mem agent after consolidation (#11455 ) Close the phase-2 agent of memory when it's done. Fire and forget (i.e. best effort).	2026-02-11 14:34:11 +00:00
jif-oai	f19452e475	nit: increase max raw memories (#11452 )	2026-02-11 14:17:34 +00:00
gt-oai	886d9377d3	Cache cloud requirements (#11305 ) We're loading these from the web on every startup. This puts them in a local file with a 1hr TTL. We sign the downloaded requirements with a key compiled into the Codex CLI to prevent unsophisticated tampering (determined circumvention is outside of our threat model: after all, one could just compile Codex without any of these checks). If any of the following are true, we ignore the local cache and re-fetch from Cloud: * The signature is invalid for the payload (== requirements, sign time, ttl, user identity) * The identity does not match the auth'd user's identity * The TTL has expired * We cannot parse requirements.toml from the payload	2026-02-11 14:06:41 +00:00
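The four re-fetch conditions in this commit can be read as a single decision function over the cached entry. The sketch below assumes signature verification and TOML parsing happen elsewhere and only records their outcomes; `CachedRequirements`, `CacheDecision`, and `decide` are hypothetical names, not the Codex API.

```rust
use std::time::{Duration, SystemTime};

/// Hypothetical view of a cache entry after it has been read from disk.
pub struct CachedRequirements {
    /// Outcome of verifying the signature over the payload.
    pub signature_valid: bool,
    /// Identity the payload was signed for.
    pub identity: String,
    pub signed_at: SystemTime,
    pub ttl: Duration,
    /// Whether requirements.toml could be parsed from the payload.
    pub requirements_parse_ok: bool,
}

#[derive(Debug, PartialEq, Eq)]
pub enum CacheDecision {
    UseCache,
    Refetch(&'static str),
}

/// Use the cache only if none of the four conditions from the commit message holds.
pub fn decide(cache: &CachedRequirements, current_identity: &str, now: SystemTime) -> CacheDecision {
    if !cache.signature_valid {
        return CacheDecision::Refetch("invalid signature");
    }
    if cache.identity != current_identity {
        return CacheDecision::Refetch("identity mismatch");
    }
    // Assumption: a clock that moved backwards is treated as an expired entry.
    let expired = now
        .duration_since(cache.signed_at)
        .map(|age| age >= cache.ttl)
        .unwrap_or(true);
    if expired {
        return CacheDecision::Refetch("ttl expired");
    }
    if !cache.requirements_parse_ok {
        return CacheDecision::Refetch("unparseable requirements");
    }
    CacheDecision::UseCache
}
```

Every failure path degrades to a fresh fetch, so a tampered or stale file costs one network round trip and nothing more.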
jif-oai	f5d4a21098	feat: new memory prompts (#11439 ) * Update prompt * Wire CWD in the prompt * Handle the no-output case	2026-02-11 13:57:52 +00:00
Michael Bolin	8b7f8af343	feat: split codex-common into smaller utils crates (#11422 ) We are removing feature-gated shared crates from the `codex-rs` workspace. `codex-common` grouped several unrelated utilities behind `[features]`, which made dependency boundaries harder to reason about and worked against the ongoing effort to eliminate feature flags from workspace crates. Splitting these utilities into dedicated crates under `utils/` aligns this area with existing workspace structure and keeps each dependency explicit at the crate boundary. ## What changed - Removed `codex-rs/common` (`codex-common`) from workspace members and workspace dependencies. - Added six new utility crates under `codex-rs/utils/`: - `codex-utils-cli` - `codex-utils-elapsed` - `codex-utils-sandbox-summary` - `codex-utils-approval-presets` - `codex-utils-oss` - `codex-utils-fuzzy-match` - Migrated the corresponding modules out of `codex-common` into these crates (with tests), and added matching `BUILD.bazel` targets. - Updated direct consumers to use the new crates instead of `codex-common`: - `codex-rs/cli` - `codex-rs/tui` - `codex-rs/exec` - `codex-rs/app-server` - `codex-rs/mcp-server` - `codex-rs/chatgpt` - `codex-rs/cloud-tasks` - Updated workspace lockfile entries to reflect the new dependency graph and removal of `codex-common`.	2026-02-11 12:59:24 +00:00
jif-oai	3d0ead8db8	feat: improve thread listing (#11429 ) Improve listing by doing: 1. List using the rollout file system 2. Upsert the result in the DB (if present) 3. Return the result of a DB listing 4. Fall back on the result of 1 + some metrics on top of this	2026-02-11 11:22:05 +00:00
jif-oai	2c5eeb6b1f	fix: flaky test (#11428 ) stage1_concurrent_claims_respect_running_cap was flaky due to SQLite lock contention, not cap logic correctness. The claim flow used deferred transactions (BEGIN) with read-then-write behavior, which can fail under concurrency with SQLITE_BUSY_SNAPSHOT/database is locked when upgrading a read transaction to a write transaction. We fixed this by using BEGIN IMMEDIATE for stage1 and phase2 claim paths, so lock acquisition happens up front and contenders serialize cleanly instead of failing during upgrade. After the change, codex-state tests pass and stress reruns of the flaky path no longer reproduced the failure.	2026-02-11 10:23:18 +00:00
Michael Bolin	476c1a7160	Remove `test-support` feature from `codex-core` and replace it with explicit test toggles (#11405 ) ## Why `codex-core` was being built in multiple feature-resolved permutations because test-only behavior was modeled as crate features. For a large crate, those permutations increase compile cost and reduce cache reuse. ## Net Change - Removed the `test-support` crate feature and related feature wiring so `codex-core` no longer needs separate feature shapes for test consumers. - Standardized cross-crate test-only access behind `codex_core::test_support`. - External test code now imports helpers from `codex_core::test_support`. - Underlying implementation hooks are kept internal (`pub(crate)`) instead of broadly public. ## Outcome - Fewer `codex-core` build permutations. - Better incremental cache reuse across test targets. - No intended production behavior change.	2026-02-10 22:44:02 -08:00
Michael Bolin	f6dd9e37e7	tui: show non-file layer content in /debug-config (#11412 ) The debug output listed non-file-backed layers such as session flags and MDM managed config, but it did not show their values. That made it difficult to explain unexpected effective settings because users could not inspect those layers on disk. Now `/debug-config` might include output like this: ``` Config layer stack (lowest precedence first): 1. system (/etc/codex/config.toml) (enabled) 2. user (/Users/mbolin/.codex/config.toml) (enabled) 3. legacy managed_config.toml (mdm) (enabled) MDM value: # Production Codex configuration file. [otel] log_user_prompt = true environment = "prod" exporter = { otlp-http = { endpoint = "https://example.com/otel", protocol = "binary" }} ```	2026-02-11 06:23:08 +00:00
xl-openai	fdd0cd1de9	feat: support multiple rate limits (#11260 ) Added multi-limit support end-to-end by carrying limit_name in rate-limit snapshots and handling multiple buckets instead of only codex. Extended /usage client parsing to consume additional_rate_limits. Updated TUI /status and in-memory state to store/render per-limit snapshots. Extended app-server rate-limit read response: kept rate_limits and added rate_limits_by_name. Adjusted usage-limit error messaging for non-default codex limit buckets.	2026-02-10 20:09:31 -08:00
Celia Chen	641d5268fa	chore: persist turn_id in rollout session and make turn_id uuid based (#11246 ) Problem: 1. turn id is constructed in-memory; 2. on resuming threads, turn_id might not be unique; 3. client cannot know the boundary of a turn from rollout files easily. This PR does three things: 1. persist `task_started` and `task_complete` events; 2. persist `turn_id` in rollout turn events; 3. generate turn_id as unique uuids instead of incrementing it in memory. This helps us resolve the issue of clients wanting to have unique turn ids for resuming a thread, and knowing the boundary of each turn in rollout files. example debug logs ``` 2026-02-11T00:32:10.746876Z DEBUG codex_app_server_protocol::protocol::thread_history: built turn from rollout items turn_index=8 turn=Turn { id: "019c4a07-d809-74c3-bc4b-fd9618487b4b", items: [UserMessage { id: "item-24", content: [Text { text: "hi", text_elements: [] }] }, AgentMessage { id: "item-25", text: "Hi. I’m in the workspace with your current changes loaded and ready. Send the next task and I’ll execute it end-to-end." }], status: Completed, error: None } 2026-02-11T00:32:10.746888Z DEBUG codex_app_server_protocol::protocol::thread_history: built turn from rollout items turn_index=9 turn=Turn { id: "019c4a18-1004-76c0-a0fb-a77610f6a9b8", items: [UserMessage { id: "item-26", content: [Text { text: "hello", text_elements: [] }] }, AgentMessage { id: "item-27", text: "Hello. Ready for the next change in `codex-rs`; I can continue from the current in-progress diff or start a new task." }], status: Completed, error: None } 2026-02-11T00:32:10.746899Z DEBUG codex_app_server_protocol::protocol::thread_history: built turn from rollout items turn_index=10 turn=Turn { id: "019c4a19-41f0-7db0-ad78-74f1503baeb8", items: [UserMessage { id: "item-28", content: [Text { text: "hello", text_elements: [] }] }, AgentMessage { id: "item-29", text: "Hello. 
Send the specific change you want in `codex-rs`, and I’ll implement it and run the required checks." }], status: Completed, error: None } ``` backward compatibility: if you try to resume an old session without task_started and task_complete event populated, the following happens: - If you resume and do nothing: those reconstructed historical IDs can differ next time you resume. - If you resume and send a new turn: the new turn gets a fresh UUID from live submission flow and is persisted, so that new turn’s ID is stable on later resumes. I think this behavior is fine, because we only care about deterministic turn id once a turn is triggered.	2026-02-11 03:56:01 +00:00
pakrym-oai	4473147985	Do not resend output items in incremental websockets connections (#11383 ) On an incremental websocket connection, output items are already part of the context, so there is no need to send them again and duplicate them.	2026-02-10 19:38:08 -08:00
Dylan Hurd	cc8c293378	fix(exec-policy) No empty command lists (#11397 ) ## Summary This should rarely, if ever, happen in practice. But regardless, we should never provide an empty list of `commands` to ExecPolicy. This PR is almost entirely adding tests around these cases. ## Testing - [x] Adds a bunch of unit tests for this	2026-02-10 19:22:23 -08:00
Michael Bolin	b68a84ee8e	Remove `deterministic_process_ids` feature to avoid duplicate `codex-core` builds (#11393 ) ## Why `codex-core` enabled `deterministic_process_ids` through a self dev-dependency. That forced a second feature-resolved build of the same crate, which increased compile time and test latency. ## What Changed - Removed the `deterministic_process_ids` feature from `codex-rs/core/Cargo.toml`. - Removed the self dev-dependency on `codex-core` that enabled that feature. - Removed the Bazel `deterministic_process_ids` crate feature for `codex-core`. - Added a test-only `AtomicBool` override in unified exec process-id allocation. - Added a test-support setter for that override and re-exported it from `codex-core`. - Enabled deterministic process IDs in integration tests via `core_test_support` ctor. ## Behavior - Production behavior remains random process IDs. - Unit tests remain deterministic via `cfg(test)`. - Integration tests remain deterministic via explicit test-support initialization. ## Validation - `just fmt` - `cargo test -p codex-core unified_exec::` - `cargo test -p codex-core --test all unified_exec -- --test-threads=1` - `cargo tree -p codex-core -e features` (verified the removed feature path)	2026-02-10 19:07:01 -08:00
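The runtime override that replaces the Cargo feature in this commit can be sketched with a process-wide `AtomicBool`. This is a sketch of the general pattern, not the Codex code: `set_deterministic_process_ids`, `allocate_process_id`, and the starting counter value are hypothetical.

```rust
use std::sync::atomic::{AtomicBool, AtomicU32, Ordering};

// Off by default, so production keeps random ids (as the commit message states).
static DETERMINISTIC_IDS: AtomicBool = AtomicBool::new(false);
// Arbitrary starting value chosen for this sketch.
static NEXT_ID: AtomicU32 = AtomicU32::new(1000);

/// Test-support setter; integration tests would call this once at startup.
pub fn set_deterministic_process_ids(enabled: bool) {
    DETERMINISTIC_IDS.store(enabled, Ordering::SeqCst);
}

/// Allocate a process id: sequential when the override is on, otherwise
/// whatever the caller-supplied random source returns.
pub fn allocate_process_id(random_id: impl FnOnce() -> u32) -> u32 {
    if DETERMINISTIC_IDS.load(Ordering::SeqCst) {
        NEXT_ID.fetch_add(1, Ordering::SeqCst)
    } else {
        random_id()
    }
}
```

Because the switch is a runtime value and not a `[features]` entry, every consumer links the same build of the crate, which is the compile-time saving the commit is after.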
Charley Cunningham	8b46c0ce00	tui: queue non-pending rollback trims in app-event order (#11373 ) ## Summary This PR fixes TUI transcript-sync behavior for `EventMsg::ThreadRolledBack` and makes rollback application order deterministic. Previously, rollback handling depended on `pending_rollback`: - if `pending_rollback` was set (local backtrack), TUI trimmed correctly - otherwise, replayed/external rollbacks were either ignored or could be applied at the wrong time relative to queued transcript inserts This change keeps the local backtrack path intact and routes non-pending rollbacks through the app event queue so rollback trims are applied in FIFO order with transcript cell inserts. ## What changed - Added/used `trim_transcript_cells_drop_last_n_user_turns(...)` for rollback-by-`num_turns` semantics. - Renamed rollback app event: - `AppEvent::ApplyReplayedThreadRollback` -> `AppEvent::ApplyThreadRollback` - Replay path (`ChatWidget`) now emits `ApplyThreadRollback`. - Live non-pending rollback path (`App::handle_backtrack_event`) now emits `ApplyThreadRollback` instead of trimming immediately. - App-level event handler applies `ApplyThreadRollback` after queued `InsertHistoryCell` events and schedules redraw only when a trim occurred. - When a trim occurs with an overlay open, TUI now syncs transcript overlay committed cells, clamps backtrack preview selection, and clears stale `deferred_history_lines` so closed overlays do not re-append rolled-back lines. - Clarified inline comments around the `pending_rollback` branch so future readers can reason about why there are two paths. ## Why queueing matters During resume/replay, transcript cells are populated via queued `InsertHistoryCell` app events. If a rollback is applied immediately outside that queue, it can run against an incomplete transcript and under-trim. Queueing non-pending rollbacks ensures consistent ordering and correct final transcript state. 
## Behavior by rollback source - `pending_rollback = Some(...)` (local backtrack requested by this TUI): - use `finish_pending_backtrack()` and the stored selection boundary - `pending_rollback = None` (replay/external/non-local rollback): - enqueue `AppEvent::ApplyThreadRollback { num_turns }` and trim in app-event order ## Tests Added/updated tests covering ordering and semantics: - `app_backtrack::tests::trim_drop_last_n_user_turns_applies_rollback_semantics` - `app_backtrack::tests::trim_drop_last_n_user_turns_allows_overflow` - `app::tests::replayed_initial_messages_apply_rollback_in_queue_order` - `app::tests::live_rollback_during_replay_is_applied_in_app_event_order` - `app::tests::queued_rollback_syncs_overlay_and_clears_deferred_history` - `chatwidget::tests::replayed_thread_rollback_emits_ordered_app_event` Validation run: - `just fmt` - `cargo test -p codex-tui`	2026-02-10 18:53:43 -08:00
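The ordering argument in this commit can be shown with a toy event queue: when the rollback travels through the same FIFO queue as the transcript inserts, it trims exactly the cells that were inserted before it. The types below are simplified stand-ins. `Cell`, `drain_events`, and the trimming rule (drop each of the last `n` user cells plus everything after it, clamping on overflow) are assumptions of this sketch, not the TUI's real definitions.

```rust
use std::collections::VecDeque;

#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Cell {
    User(String),
    Agent(String),
}

pub enum AppEvent {
    InsertHistoryCell(Cell),
    ApplyThreadRollback { num_turns: usize },
}

/// Drop the last `n` user turns. If `n` exceeds the number of user turns,
/// everything from the first user cell onward is removed.
/// Returns true when something was trimmed.
pub fn trim_drop_last_n_user_turns(cells: &mut Vec<Cell>, n: usize) -> bool {
    let user_positions: Vec<usize> = cells
        .iter()
        .enumerate()
        .filter(|(_, c)| matches!(c, Cell::User(_)))
        .map(|(i, _)| i)
        .collect();
    // For n == 0 the index is out of range, so nothing is trimmed.
    let idx = user_positions.len().saturating_sub(n);
    match user_positions.get(idx) {
        Some(&cut) => {
            cells.truncate(cut);
            true
        }
        None => false,
    }
}

/// Apply queued events strictly in FIFO order.
pub fn drain_events(queue: &mut VecDeque<AppEvent>, transcript: &mut Vec<Cell>) {
    while let Some(event) = queue.pop_front() {
        match event {
            AppEvent::InsertHistoryCell(cell) => transcript.push(cell),
            AppEvent::ApplyThreadRollback { num_turns } => {
                trim_drop_last_n_user_turns(transcript, num_turns);
            }
        }
    }
}
```

Applying the rollback immediately, before the queued inserts have run, would trim an incomplete transcript and leave the rolled-back turn visible, which is the under-trim the commit message describes.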
pakrym-oai	c68999ee6d	Prefer websocket transport when model opts in (#11386 ) Summary - add a `prefer_websockets` field to `ModelInfo`, defaulting to `false` in all fixtures and constructors - wire the new flag into websocket selection so models that opt in always use websocket transport even when the feature gate is off Testing - Not run (not requested)	2026-02-10 18:50:48 -08:00
pakrym-oai	bfd4e2112c	Disable very flaky tests (#11394 ) Collected from last 20 builds of main in https://github.com/openai/codex/commits/main/.	2026-02-10 18:50:11 -08:00
github-actions[bot]	f101300dba	Update models.json (#11376 ) Automated update of models.json. --------- Co-authored-by: aibrahim-oai <219906144+aibrahim-oai@users.noreply.github.com> Co-authored-by: sayan-oai <sayan@openai.com>	2026-02-10 17:25:35 -08:00
Michael Bolin	d44f4205fb	chore: rename codex-command to codex-shell-command (#11378 ) This addresses some post-merge feedback on https://github.com/openai/codex/pull/11361: - crate rename - reuse `detect_shell_type()` utility	2026-02-10 17:03:46 -08:00
jif-oai	87bbfc50a1	feat: prevent double backfill (#11377 ) ## Summary Add a DB-backed lease to prevent duplicate `.sqlite` backfill workers from running concurrently. ### What changed - Added StateRuntime::try_claim_backfill(lease_seconds) that atomically claims backfill only when: - backfill is not complete, and - no fresh running worker currently owns it. - Updated backfill_sessions to use the claim API and exit early when another worker already holds the lease. - Added runtime tests covering: - singleton claim behavior, - stale lease takeover, - claim blocked after complete. - Set backfill lease to 900s in production and 1s in tests. ### Why This avoids duplicate backfill work and reduces backfill status churn under concurrent startup, while preserving current best-effort fallback behavior.	2026-02-11 00:24:20 +00:00
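The claim rule in this commit (claim only when backfill is not complete and no fresh running worker owns it) can be modelled in memory as follows. The real `StateRuntime::try_claim_backfill(lease_seconds)` performs this atomically in the database; the struct, field names, and explicit `now_secs` parameter here are hypothetical and exist only to illustrate the rule.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum BackfillStatus {
    Pending,
    Running,
    Complete,
}

/// In-memory stand-in for the backfill state row (hypothetical shape).
pub struct BackfillState {
    pub status: BackfillStatus,
    /// When the current lease was taken, in seconds.
    pub lease_started_secs: Option<u64>,
}

impl BackfillState {
    /// Returns true if the caller now owns the backfill lease.
    pub fn try_claim(&mut self, now_secs: u64, lease_seconds: u64) -> bool {
        if self.status == BackfillStatus::Complete {
            return false;
        }
        let fresh_owner = self.status == BackfillStatus::Running
            && matches!(
                self.lease_started_secs,
                Some(t) if now_secs.saturating_sub(t) < lease_seconds
            );
        if fresh_owner {
            return false;
        }
        // Either nobody is running, or the previous lease went stale: take over.
        self.status = BackfillStatus::Running;
        self.lease_started_secs = Some(now_secs);
        true
    }
}
```

With the production lease of 900s, a second worker starting 10s after the first exits early, while one arriving after the lease has lapsed takes over.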
jif-oai	674799d356	feat: mem v2 - PR6 (consolidation) (#11374 )	2026-02-11 00:02:57 +00:00
jif-oai	2c9be54c9a	feat: mem v2 - PR5 (#11372 )	2026-02-10 23:22:55 +00:00
viyatb-oai	1d47927aa0	Enable SOCKS defaults for common local network proxy use cases (#11362 ) ## Summary - enable local-use defaults in network proxy settings: SOCKS5 on, SOCKS5 UDP on, upstream proxying on, and local binding on - add a regression test that asserts the full `NetworkProxySettings::default()` baseline - Fixed managed listener reservation behavior. Before: we always reserved a loopback SOCKS listener, even when enable_socks5 = false. Now: SOCKS listener is only reserved when SOCKS is enabled. - Fixed /debug-config env output for SOCKS-disabled sessions. ALL_PROXY now shows the HTTP proxy URL when SOCKS is disabled (instead of incorrectly showing socks5h://...). ## Validation - just fmt - cargo test -p codex-network-proxy - cargo clippy -p codex-network-proxy --all-targets	2026-02-10 15:13:52 -08:00
jif-oai	623d3f4071	feat: mem v2 - PR4 (#11369 ) # Memories migration plan (simplified global workflow) ## Target behavior - One shared memory root only: `~/.codex/memories/`. - No per-cwd memory buckets, no cwd hash handling. - Phase 1 candidate rules: - Not currently being processed unless the job lease is stale. - Rollout updated within the max-age window (currently 30 days). - Rollout idle for at least 12 hours (new constant). - Global cap: at most 64 stage-1 jobs in `running` state at any time (new invariant). - Stage-1 model output shape (new): - `rollout_slug` (accepted but ignored for now). - `rollout_summary`. - `raw_memory`. - Phase-1 artifacts written under the shared root: - `rollout_summaries/<thread_id>.md` for each rollout summary. - `raw_memories.md` containing appended/merged raw memory paragraphs. - Phase 2 runs one consolidation agent for the shared `memories/` directory. - Phase-2 lock is DB-backed with 1 hour lease and heartbeat/expiry. ## Current code map - Core startup pipeline: `core/src/memories/startup/mod.rs`. - Stage-1 request+parse: `core/src/memories/startup/extract.rs`, `core/src/memories/stage_one.rs`, templates in `core/templates/memories/`. - File materialization: `core/src/memories/storage.rs`, `core/src/memories/layout.rs`. - Scope routing (cwd/user): `core/src/memories/scope.rs`, `core/src/memories/startup/mod.rs`. - DB job lifecycle and scope queueing: `state/src/runtime/memory.rs`. ## PR plan ## PR 1: Correct phase-1 selection invariants (no behavior-breaking layout changes yet) - Add `PHASE_ONE_MIN_ROLLOUT_IDLE_HOURS: i64 = 12` in `core/src/memories/mod.rs`. - Thread this into `state::claim_stage1_jobs_for_startup(...)`. - Enforce idle-time filter in DB selection logic (not only in-memory filtering after `scan_limit`) so eligible threads are not starved by very recent threads. - Enforce global running cap of 64 at claim time in DB logic: - Count fresh `memory_stage1` running jobs. - Only allow new claims while count < cap. 
- Keep stale-lease takeover behavior intact. - Add/adjust tests in `state/src/runtime.rs`: - Idle filter inclusion/exclusion around 12h boundary. - Global running-cap guarantee. - Existing stale/fresh ownership behavior still passes. Acceptance criteria: - Startup never creates more than 64 fresh `memory_stage1` running jobs. - Threads updated <12h ago are skipped. - Threads older than 30d are skipped. ## PR 2: Stage-1 output contract + storage artifacts (forward-compatible) - Update parser/types to accept the new structured output while keeping backward compatibility: - Add `rollout_slug` (optional for now). - Add `rollout_summary`. - Keep alias support for legacy `summary` and `rawMemory` until prompt swap completes. - Update stage-1 schema generator in `core/src/memories/stage_one.rs` to include the new keys. - Update prompt templates: - `core/templates/memories/stage_one_system.md`. - `core/templates/memories/stage_one_input.md`. - Replace storage model in `core/src/memories/storage.rs`: - Introduce `rollout_summaries/` directory writer (`<thread_id>.md` files). - Introduce `raw_memories.md` aggregator writer from DB rows. - Keep deterministic rebuild behavior from DB outputs so files can always be regenerated. - Update consolidation prompt template to reference `rollout_summaries/` + `raw_memories.md` inputs. Acceptance criteria: - Stage-1 accepts both old and new output keys during migration. - Phase-1 artifacts are generated in new format from DB state. - No dependence on per-thread files in `raw_memories/`. ## PR 3: Remove per-cwd memories and move to one global memory root - Simplify layout in `core/src/memories/layout.rs`: - Single root: `codex_home/memories`. - Remove cwd-hash bucket helpers and normalization logic used only for memory pathing. - Remove scope branching from startup phase-2 dispatch path: - No cwd/user mapping in `core/src/memories/startup/mod.rs`. - One target root for consolidation. 
- In `state/src/runtime/memory.rs`, stop enqueueing/handling cwd consolidation scope. - Keep one logical consolidation scope/job key (global/user) to avoid a risky schema rewrite in same PR. - Add one-time migration helper (core side) to preserve current shared memory output: - If `~/.codex/memories/user/memory` exists and new root is empty, move/copy contents into `~/.codex/memories`. - Leave old hashed cwd buckets untouched for now (safe/no-destructive migration). Acceptance criteria: - New runs only read/write `~/.codex/memories`. - No new cwd-scoped consolidation jobs are enqueued. - Existing user-shared memory content is preserved. ## PR 4: Phase-2 global lock simplification and cleanup - Replace multi-scope dispatch with a single global consolidation claim path: - Either reuse jobs table with one fixed key, or add a tiny dedicated lock helper; keep 1h lease. - Ensure at most one consolidation agent can run at once. - Keep heartbeat + stale lock recovery semantics in `core/src/memories/startup/watch.rs`. - Remove dead scope code and legacy constants no longer used. - Update tests: - One-agent-at-a-time behavior. - Lock expiry allows takeover after stale lease. Acceptance criteria: - Exactly one phase-2 consolidation agent can be active cluster-wide (per local DB). - Stale lock recovers automatically. ## PR 5: Final cleanup and docs - Remove legacy artifacts and references: - `raw_memories/` and `memory_summary.md` assumptions from prompts/comments/tests. - Scope constants for cwd memory pathing in core/state if fully unused. - Update docs under `docs/` for memory workflow and directory layout. - Add a brief operator note for rollout: compatibility window for old stage-1 JSON keys and when to remove aliases. Acceptance criteria: - Code and docs reflect only the simplified global workflow. - No stale references to per-cwd memory buckets. ## Notes on sequencing - PR 1 is safest first because it improves correctness without changing external artifact layout. 
- PR 2 keeps parser compatibility so prompt deployment can happen independently. - PR 3 and PR 4 split filesystem/scope simplification from locking simplification to reduce blast radius. - PR 5 is intentionally cleanup-only.	2026-02-10 23:10:35 +00:00
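The phase-1 candidate rules and the global running cap from this plan can be sketched as a pure selection function. The plan places this logic in the DB claim query; the in-memory version below only illustrates the invariants. `PHASE_ONE_MIN_ROLLOUT_IDLE_HOURS` is named in the plan, while the other constant names, `Rollout`, `is_candidate`, and `claim` are hypothetical. Treating the 12h and 30d boundaries as inclusive is also an assumption.

```rust
/// Named in the plan above.
pub const PHASE_ONE_MIN_ROLLOUT_IDLE_HOURS: i64 = 12;
/// Hypothetical names for the 30-day max-age window and the cap of 64.
pub const MAX_ROLLOUT_AGE_HOURS: i64 = 30 * 24;
pub const MAX_RUNNING_STAGE1_JOBS: usize = 64;

pub struct Rollout {
    pub hours_since_update: i64,
    /// True when another worker holds a lease that has not gone stale.
    pub has_fresh_lease: bool,
}

/// Candidate rules: not being processed (unless the lease is stale),
/// idle for at least 12 hours, and updated within the last 30 days.
pub fn is_candidate(r: &Rollout) -> bool {
    !r.has_fresh_lease
        && r.hours_since_update >= PHASE_ONE_MIN_ROLLOUT_IDLE_HOURS
        && r.hours_since_update <= MAX_ROLLOUT_AGE_HOURS
}

/// Indices of rollouts to claim, given `running` fresh stage-1 jobs.
/// The total number of running jobs never exceeds the cap.
pub fn claim(rollouts: &[Rollout], running: usize) -> Vec<usize> {
    let budget = MAX_RUNNING_STAGE1_JOBS.saturating_sub(running);
    rollouts
        .iter()
        .enumerate()
        .filter(|(_, r)| is_candidate(r))
        .map(|(i, _)| i)
        .take(budget)
        .collect()
}
```

Filtering before applying the budget matters for the starvation concern in PR 1: very recent threads are skipped without consuming any of the remaining claim slots.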
Michael Bolin	d8f9bb65e2	# Split command parsing/safety out of `codex-core` into new `codex-command` (#11361 ) `codex-core` had accumulated command parsing and command safety logic (`bash`, `powershell`, `parse_command`, and `command_safety`) that is logically cohesive but orthogonal to most core session/runtime logic. Keeping this code in `codex-core` made the crate increasingly monolithic and raised iteration cost for unrelated core changes. This change extracts that surface into a dedicated crate, `codex-command`, while preserving existing `codex_core::...` call sites via re-exports. ## Why this refactor During analysis, command parsing/safety stood out as a good first split because it has: - a clear domain boundary (shell parsing + safety classification) - relatively self-contained dependencies (notably `tree-sitter` / `tree-sitter-bash`) - a meaningful standalone test surface (`134` tests moved with the crate) - many downstream uses that benefit from independent compilation and caching The practical problem was build latency from a large `codex-core` compile/test graph. Clean-build timings before and after this split showed measurable wins: - `cargo check -p codex-core`: `57.08s` -> `53.54s` (~`6.2%` faster) - `cargo test -p codex-core --no-run`: `2m39.9s` -> `2m20s` (~`12.4%` faster) - `codex-core lib` compile unit: `57.18s` -> `49.67s` (~`13.1%` faster) - `codex-core lib(test)` compile unit: `60.87s` -> `53.21s` (~`12.6%` faster) This gives a concrete reduction in core build overhead without changing behavior. ## What changed ### New crate - Added `codex-rs/command` as workspace crate `codex-command`. 
- Added: - `command/src/lib.rs` - `command/src/bash.rs` - `command/src/powershell.rs` - `command/src/parse_command.rs` - `command/src/command_safety/` - `command/src/shell_detect.rs` - `command/BUILD.bazel` ### Code moved out of `codex-core` - Moved modules from `core/src` into `command/src`: - `bash.rs` - `powershell.rs` - `parse_command.rs` - `command_safety/` ### Dependency graph updates - Added workspace member/dependency entries for `codex-command` in `codex-rs/Cargo.toml`. - Added `codex-command` dependency to `codex-rs/core/Cargo.toml`. - Removed `tree-sitter` and `tree-sitter-bash` from `codex-core` direct deps (now owned by `codex-command`). ### API compatibility for callers To avoid immediate downstream churn, `codex-core` now re-exports the moved modules/functions: - `codex_command::bash` - `codex_command::powershell` - `codex_command::parse_command` - `codex_command::is_safe_command` - `codex_command::is_dangerous_command` This keeps existing `codex_core::...` paths working while enabling gradual migration to direct `codex-command` usage. ### Internal decoupling detail - Added `command::shell_detect` so moved `bash`/`powershell` logic no longer depends on core shell internals. - Adjusted PowerShell helper visibility in `codex-command` for existing core test usage (`UTF8` prefix helper + executable discovery functions). ## Validation - `just fmt` - `just fix -p codex-command -p codex-core` - `cargo test -p codex-command` (`134` passed) - `cargo test -p codex-core --no-run` - `cargo test -p codex-core shell_command_handler` ## Notes / follow-up This commit intentionally prioritizes boundary extraction and compatibility. A follow-up can migrate downstream crates to depend directly on `codex-command` (instead of through `codex-core` re-exports) to realize additional incremental build wins.	2026-02-10 14:43:16 -08:00
github-actions[bot]	3626399811	Update models.json (#11274 ) Automated update of models.json. --------- Co-authored-by: aibrahim-oai <219906144+aibrahim-oai@users.noreply.github.com> Co-authored-by: Ahmed Ibrahim <aibrahim@openai.com> Co-authored-by: Sayan Sisodiya <sayan@openai.com>	2026-02-10 14:28:18 -08:00
jif-oai	3419660767	feat: mem v2 - PR3 (#11366 ) # Memories migration plan (simplified global workflow) ## Target behavior - One shared memory root only: `~/.codex/memories/`. - No per-cwd memory buckets, no cwd hash handling. - Phase 1 candidate rules: - Not currently being processed unless the job lease is stale. - Rollout updated within the max-age window (currently 30 days). - Rollout idle for at least 12 hours (new constant). - Global cap: at most 64 stage-1 jobs in `running` state at any time (new invariant). - Stage-1 model output shape (new): - `rollout_slug` (accepted but ignored for now). - `rollout_summary`. - `raw_memory`. - Phase-1 artifacts written under the shared root: - `rollout_summaries/<thread_id>.md` for each rollout summary. - `raw_memories.md` containing appended/merged raw memory paragraphs. - Phase 2 runs one consolidation agent for the shared `memories/` directory. - Phase-2 lock is DB-backed with 1 hour lease and heartbeat/expiry. ## Current code map - Core startup pipeline: `core/src/memories/startup/mod.rs`. - Stage-1 request+parse: `core/src/memories/startup/extract.rs`, `core/src/memories/stage_one.rs`, templates in `core/templates/memories/`. - File materialization: `core/src/memories/storage.rs`, `core/src/memories/layout.rs`. - Scope routing (cwd/user): `core/src/memories/scope.rs`, `core/src/memories/startup/mod.rs`. - DB job lifecycle and scope queueing: `state/src/runtime/memory.rs`. ## PR plan ## PR 1: Correct phase-1 selection invariants (no behavior-breaking layout changes yet) - Add `PHASE_ONE_MIN_ROLLOUT_IDLE_HOURS: i64 = 12` in `core/src/memories/mod.rs`. - Thread this into `state::claim_stage1_jobs_for_startup(...)`. - Enforce idle-time filter in DB selection logic (not only in-memory filtering after `scan_limit`) so eligible threads are not starved by very recent threads. - Enforce global running cap of 64 at claim time in DB logic: - Count fresh `memory_stage1` running jobs. - Only allow new claims while count < cap. 
- Keep stale-lease takeover behavior intact. - Add/adjust tests in `state/src/runtime.rs`: - Idle filter inclusion/exclusion around 12h boundary. - Global running-cap guarantee. - Existing stale/fresh ownership behavior still passes. Acceptance criteria: - Startup never creates more than 64 fresh `memory_stage1` running jobs. - Threads updated <12h ago are skipped. - Threads older than 30d are skipped. ## PR 2: Stage-1 output contract + storage artifacts (forward-compatible) - Update parser/types to accept the new structured output while keeping backward compatibility: - Add `rollout_slug` (optional for now). - Add `rollout_summary`. - Keep alias support for legacy `summary` and `rawMemory` until prompt swap completes. - Update stage-1 schema generator in `core/src/memories/stage_one.rs` to include the new keys. - Update prompt templates: - `core/templates/memories/stage_one_system.md`. - `core/templates/memories/stage_one_input.md`. - Replace storage model in `core/src/memories/storage.rs`: - Introduce `rollout_summaries/` directory writer (`<thread_id>.md` files). - Introduce `raw_memories.md` aggregator writer from DB rows. - Keep deterministic rebuild behavior from DB outputs so files can always be regenerated. - Update consolidation prompt template to reference `rollout_summaries/` + `raw_memories.md` inputs. Acceptance criteria: - Stage-1 accepts both old and new output keys during migration. - Phase-1 artifacts are generated in new format from DB state. - No dependence on per-thread files in `raw_memories/`. ## PR 3: Remove per-cwd memories and move to one global memory root - Simplify layout in `core/src/memories/layout.rs`: - Single root: `codex_home/memories`. - Remove cwd-hash bucket helpers and normalization logic used only for memory pathing. - Remove scope branching from startup phase-2 dispatch path: - No cwd/user mapping in `core/src/memories/startup/mod.rs`. - One target root for consolidation. 
- In `state/src/runtime/memory.rs`, stop enqueueing/handling cwd consolidation scope. - Keep one logical consolidation scope/job key (global/user) to avoid a risky schema rewrite in same PR. - Add one-time migration helper (core side) to preserve current shared memory output: - If `~/.codex/memories/user/memory` exists and new root is empty, move/copy contents into `~/.codex/memories`. - Leave old hashed cwd buckets untouched for now (safe/no-destructive migration). Acceptance criteria: - New runs only read/write `~/.codex/memories`. - No new cwd-scoped consolidation jobs are enqueued. - Existing user-shared memory content is preserved. ## PR 4: Phase-2 global lock simplification and cleanup - Replace multi-scope dispatch with a single global consolidation claim path: - Either reuse jobs table with one fixed key, or add a tiny dedicated lock helper; keep 1h lease. - Ensure at most one consolidation agent can run at once. - Keep heartbeat + stale lock recovery semantics in `core/src/memories/startup/watch.rs`. - Remove dead scope code and legacy constants no longer used. - Update tests: - One-agent-at-a-time behavior. - Lock expiry allows takeover after stale lease. Acceptance criteria: - Exactly one phase-2 consolidation agent can be active cluster-wide (per local DB). - Stale lock recovers automatically. ## PR 5: Final cleanup and docs - Remove legacy artifacts and references: - `raw_memories/` and `memory_summary.md` assumptions from prompts/comments/tests. - Scope constants for cwd memory pathing in core/state if fully unused. - Update docs under `docs/` for memory workflow and directory layout. - Add a brief operator note for rollout: compatibility window for old stage-1 JSON keys and when to remove aliases. Acceptance criteria: - Code and docs reflect only the simplified global workflow. - No stale references to per-cwd memory buckets. ## Notes on sequencing - PR 1 is safest first because it improves correctness without changing external artifact layout. 
- PR 2 keeps parser compatibility so prompt deployment can happen independently. - PR 3 and PR 4 split filesystem/scope simplification from locking simplification to reduce blast radius. - PR 5 is intentionally cleanup-only.	2026-02-10 22:12:50 +00:00
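The phase-1 candidate rules repeated in this plan (skip jobs with a fresh lease, require at least 12 hours idle, skip rollouts older than 30 days, and never exceed 64 running stage-1 jobs) can be sketched as a pure filter. The plan asks for this to be enforced in the DB selection logic; the version below only illustrates the rules in memory. `ThreadInfo`, `claimable`, and the field names are hypothetical, and treating the 12 hour and 30 day boundaries as inclusive is an assumption.

```rust
use std::time::Duration;

/// Constants from the plan: 12h minimum idle, 30d max age, cap of 64.
const MIN_IDLE: Duration = Duration::from_secs(12 * 60 * 60);
const MAX_AGE: Duration = Duration::from_secs(30 * 24 * 60 * 60);
const RUNNING_CAP: usize = 64;

/// Hypothetical view of one thread as seen by the selection step.
struct ThreadInfo {
    id: u32,
    /// Time elapsed since the rollout was last updated.
    since_update: Duration,
    /// True if a stage-1 job for this thread is running with a fresh lease.
    lease_fresh: bool,
}

/// Return the ids that may be claimed now, given how many stage-1 jobs
/// are already running with fresh leases.
fn claimable(threads: &[ThreadInfo], running_fresh: usize) -> Vec<u32> {
    // Remaining room under the global cap; zero if already at or over it.
    let budget = RUNNING_CAP.saturating_sub(running_fresh);
    threads
        .iter()
        // Stale leases may be taken over; fresh ones are left alone.
        .filter(|t| !t.lease_fresh)
        // Idle long enough, but still inside the max-age window.
        .filter(|t| t.since_update >= MIN_IDLE && t.since_update <= MAX_AGE)
        .take(budget)
        .map(|t| t.id)
        .collect()
}
```

Applying the idle filter before `take(budget)` mirrors the plan's point that very recent threads should not use up the scan budget and starve eligible ones.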
jif-oai	0229dc5ccf	feat: mem v2 - PR2 (#11365 ) # Memories migration plan (simplified global workflow) ## Target behavior - One shared memory root only: `~/.codex/memories/`. - No per-cwd memory buckets, no cwd hash handling. - Phase 1 candidate rules: - Not currently being processed unless the job lease is stale. - Rollout updated within the max-age window (currently 30 days). - Rollout idle for at least 12 hours (new constant). - Global cap: at most 64 stage-1 jobs in `running` state at any time (new invariant). - Stage-1 model output shape (new): - `rollout_slug` (accepted but ignored for now). - `rollout_summary`. - `raw_memory`. - Phase-1 artifacts written under the shared root: - `rollout_summaries/<thread_id>.md` for each rollout summary. - `raw_memories.md` containing appended/merged raw memory paragraphs. - Phase 2 runs one consolidation agent for the shared `memories/` directory. - Phase-2 lock is DB-backed with 1 hour lease and heartbeat/expiry. ## Current code map - Core startup pipeline: `core/src/memories/startup/mod.rs`. - Stage-1 request+parse: `core/src/memories/startup/extract.rs`, `core/src/memories/stage_one.rs`, templates in `core/templates/memories/`. - File materialization: `core/src/memories/storage.rs`, `core/src/memories/layout.rs`. - Scope routing (cwd/user): `core/src/memories/scope.rs`, `core/src/memories/startup/mod.rs`. - DB job lifecycle and scope queueing: `state/src/runtime/memory.rs`. ## PR plan ## PR 1: Correct phase-1 selection invariants (no behavior-breaking layout changes yet) - Add `PHASE_ONE_MIN_ROLLOUT_IDLE_HOURS: i64 = 12` in `core/src/memories/mod.rs`. - Thread this into `state::claim_stage1_jobs_for_startup(...)`. - Enforce idle-time filter in DB selection logic (not only in-memory filtering after `scan_limit`) so eligible threads are not starved by very recent threads. - Enforce global running cap of 64 at claim time in DB logic: - Count fresh `memory_stage1` running jobs. - Only allow new claims while count < cap. 
- Keep stale-lease takeover behavior intact. - Add/adjust tests in `state/src/runtime.rs`: - Idle filter inclusion/exclusion around 12h boundary. - Global running-cap guarantee. - Existing stale/fresh ownership behavior still passes. Acceptance criteria: - Startup never creates more than 64 fresh `memory_stage1` running jobs. - Threads updated <12h ago are skipped. - Threads older than 30d are skipped. ## PR 2: Stage-1 output contract + storage artifacts (forward-compatible) - Update parser/types to accept the new structured output while keeping backward compatibility: - Add `rollout_slug` (optional for now). - Add `rollout_summary`. - Keep alias support for legacy `summary` and `rawMemory` until prompt swap completes. - Update stage-1 schema generator in `core/src/memories/stage_one.rs` to include the new keys. - Update prompt templates: - `core/templates/memories/stage_one_system.md`. - `core/templates/memories/stage_one_input.md`. - Replace storage model in `core/src/memories/storage.rs`: - Introduce `rollout_summaries/` directory writer (`<thread_id>.md` files). - Introduce `raw_memories.md` aggregator writer from DB rows. - Keep deterministic rebuild behavior from DB outputs so files can always be regenerated. - Update consolidation prompt template to reference `rollout_summaries/` + `raw_memories.md` inputs. Acceptance criteria: - Stage-1 accepts both old and new output keys during migration. - Phase-1 artifacts are generated in new format from DB state. - No dependence on per-thread files in `raw_memories/`. ## PR 3: Remove per-cwd memories and move to one global memory root - Simplify layout in `core/src/memories/layout.rs`: - Single root: `codex_home/memories`. - Remove cwd-hash bucket helpers and normalization logic used only for memory pathing. - Remove scope branching from startup phase-2 dispatch path: - No cwd/user mapping in `core/src/memories/startup/mod.rs`. - One target root for consolidation. 
- In `state/src/runtime/memory.rs`, stop enqueueing/handling cwd consolidation scope. - Keep one logical consolidation scope/job key (global/user) to avoid a risky schema rewrite in same PR. - Add one-time migration helper (core side) to preserve current shared memory output: - If `~/.codex/memories/user/memory` exists and new root is empty, move/copy contents into `~/.codex/memories`. - Leave old hashed cwd buckets untouched for now (safe/no-destructive migration). Acceptance criteria: - New runs only read/write `~/.codex/memories`. - No new cwd-scoped consolidation jobs are enqueued. - Existing user-shared memory content is preserved. ## PR 4: Phase-2 global lock simplification and cleanup - Replace multi-scope dispatch with a single global consolidation claim path: - Either reuse jobs table with one fixed key, or add a tiny dedicated lock helper; keep 1h lease. - Ensure at most one consolidation agent can run at once. - Keep heartbeat + stale lock recovery semantics in `core/src/memories/startup/watch.rs`. - Remove dead scope code and legacy constants no longer used. - Update tests: - One-agent-at-a-time behavior. - Lock expiry allows takeover after stale lease. Acceptance criteria: - Exactly one phase-2 consolidation agent can be active cluster-wide (per local DB). - Stale lock recovers automatically. ## PR 5: Final cleanup and docs - Remove legacy artifacts and references: - `raw_memories/` and `memory_summary.md` assumptions from prompts/comments/tests. - Scope constants for cwd memory pathing in core/state if fully unused. - Update docs under `docs/` for memory workflow and directory layout. - Add a brief operator note for rollout: compatibility window for old stage-1 JSON keys and when to remove aliases. Acceptance criteria: - Code and docs reflect only the simplified global workflow. - No stale references to per-cwd memory buckets. ## Notes on sequencing - PR 1 is safest first because it improves correctness without changing external artifact layout. 
- PR 2 keeps parser compatibility so prompt deployment can happen independently. - PR 3 and PR 4 split filesystem/scope simplification from locking simplification to reduce blast radius. - PR 5 is intentionally cleanup-only.	2026-02-10 21:50:53 +00:00
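The PR 2 compatibility rule (accept the new `rollout_summary` / `raw_memory` keys while still honoring the legacy `summary` / `rawMemory` keys) can be sketched with key normalization over already-extracted fields. This is only an illustration using the standard library: the real parser works on model JSON output, `StageOneOutput`, `canonical_key`, and `parse_fields` are hypothetical names, and the exact alias pairing (`summary` to `rollout_summary`, `rawMemory` to `raw_memory`) is an assumption inferred from the key names.

```rust
use std::collections::HashMap;

/// Hypothetical parsed stage-1 output using the new key names.
#[derive(Debug, PartialEq)]
struct StageOneOutput {
    /// Accepted but optional for now, per the plan.
    rollout_slug: Option<String>,
    rollout_summary: String,
    raw_memory: String,
}

/// Map legacy keys onto their new names (assumed pairing); pass others through.
fn canonical_key(key: &str) -> &str {
    match key {
        "summary" => "rollout_summary",
        "rawMemory" => "raw_memory",
        other => other,
    }
}

/// Build the output from (key, value) pairs, accepting old and new keys.
/// Returns None if a required field is missing under both names.
fn parse_fields(fields: &[(&str, &str)]) -> Option<StageOneOutput> {
    let mut by_key: HashMap<&str, &str> = HashMap::new();
    for &(key, value) in fields {
        // If both spellings are present, the later pair wins.
        by_key.insert(canonical_key(key), value);
    }
    Some(StageOneOutput {
        rollout_slug: by_key.get("rollout_slug").map(|s| s.to_string()),
        rollout_summary: by_key.get("rollout_summary")?.to_string(),
        raw_memory: by_key.get("raw_memory")?.to_string(),
    })
}
```

Normalizing at the parse boundary keeps the rest of the pipeline on the new names, so the aliases can later be removed by deleting two match arms.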

3066 commits