Commit graph

3402 commits

Author SHA1 Message Date
sayan-oai
5602edc1d0
chore: limit update to 0.98.0 NUX to < 0.98.0 ver (#10787)
seems like footgun if we forget to remove before releasing 0.99.0,
limited announcement to versions < 0.98.0
2026-02-05 12:11:32 -08:00
Matthew Zeng
7e81f63698
[app-server] Add a method to list experimental features. (#10721)
- [x] Add a method to list experimental features.
2026-02-05 20:04:01 +00:00
jif-oai
ddd09a9368
fix: announcement in prio (#10783) 2026-02-05 19:57:57 +00:00
sayan-oai
5fdf6f5efa
chore: rm web-search-eligible header (#10660)
default-enablement of web_search is now client-side, no need to send
eligibility headers to backend.

Tested locally, headers no longer sent.

will wait for corresponding backend change to deploy before merging
2026-02-05 11:48:34 -08:00
iceweasel-oai
901d5b8fd6
add sandbox policy and sandbox name to codex.tool.call metrics (#10711)
This will give visibility into the comparative success rate of the
Windows sandbox implementations compared to other platforms.
2026-02-05 11:42:12 -08:00
jif-oai
4df9f2020b
nit: gpt-5.3-codex announcement 2 (#10782) 2026-02-05 19:22:24 +00:00
jif-oai
ddfb8bfd77
nit: gpt-5.3-codex announcement (#10775) 2026-02-05 19:17:04 +00:00
Owen Lin
3582b74d01
fix(auth): isolate chatgptAuthTokens concept to auth manager and app-server (#10423)
So that the rest of the codebase (like TUI) don't need to be concerned
whether ChatGPT auth was handled by Codex itself or passed in via
app-server's external auth mode.
2026-02-05 10:46:06 -08:00
Owen Lin
5c0fd62ff1
fix(tui): fix resume_picker_orders_by_updated_at test (#10769)
I think this was due to https://github.com/openai/codex/issues/10752
landing and not rebased on top of
9ee746afd6
2026-02-05 18:03:10 +00:00
Felipe Coury
22545bf206
feat(tui): add sortable resume picker with created/updated timestamp toggle (#10752)
## Summary

- Add sorting support to the resume session picker with Tab key toggle
- Sessions can now be sorted by either creation time or last updated
time
- Display the current sort mode in the picker header
- Default to sorting by creation time (most recent first)

## Changes

- Add `sort_key` field to `PickerState` to track current sort order
- Pass sort key to `RolloutRecorder::list_threads()` for proper backend
sorting
- Add Tab key handler to toggle between `CreatedAt` and `UpdatedAt`
sorting
- Show current sort mode ("Created at" / "Updated at") in header
- Add "Tab to toggle sort" keyboard hint
- Intelligently hide secondary date column when terminal is narrow
- Reload session list when sort order changes

## Test plan

- [x] Unit tests for sort key toggle functionality
- [x] Snapshot tests updated for new header format
- [x] Test that Tab key triggers reload with new sort key
- [x] Test column visibility adapts to narrow terminals
2026-02-05 09:08:31 -08:00
Felipe Coury
b0e5a6305b
feat(tui): add /statusline command for interactive status line configuration (#10546)
## Summary
- Adds a new `/statusline` command to configure TUI footer status line
- Introduces reusable `MultiSelectPicker` component with keyboard
navigation, optional ordering and toggle support
- Implement status line setup modal that persist configuration to
config.toml

  ## Status Line Items
  The following items can be displayed in the status line:
  - **Model**: Current model name (with optional reasoning level)
  - **Context**: Remaining/used context window percentage
  - **Rate Limits**: 5-day and weekly usage limits
  - **Git**: Current branch (with optimized lookups)
  - **Tokens**: Used tokens, input/output token counts
  - **Session**: Session ID (full or shortened prefix)
  - **Paths**: Current directory, project root
  - **Version**: Codex version

  ## Features
  - Live preview while configuring status line items
  - Fuzzy search filtering in the picker
  - Intelligent truncation when items don't fit
  - Items gracefully omit when data is unavailable
  - Configuration persists to `config.toml`
  - Validates and warns about invalid status line items

  ## Test plan
  - [x] Run `/statusline` and verify picker UI appears
  - [x] Toggle items on/off and verify live preview updates
  - [x] Confirm selection persists after restart
  - [x] Verify truncation behavior with many items selected
  - [x] Test git branch detection in and out of git repos

---------

Co-authored-by: Josh McKinney <joshka@openai.com>
2026-02-05 08:50:21 -08:00
gt-oai
3b54fd7336
Add hooks implementation and wire up to notify (#9691)
This introduces a `Hooks` service. It registers hooks from config and
dispatches hook events at runtime.

N.B. The hook config is not wired up to this yet. But for legacy
reasons, we wire up `notify` from config and power it using hooks now.
Nothing about the `notify` interface has changed.

I'd start by reviewing `hooks/types.rs`

Some things to note:
  - hook names subject to change
  - no hook result yet
  - stopping semantics yet to be introduced
  - additional hooks yet to be introduced
2026-02-05 16:49:35 +00:00
jif-oai
9ee746afd6
Leverage state DB metadata for thread summaries (#10621)
Summary:
- read conversation summaries and cwd info from the state DB when
possible so we no longer rely on rollout files for metadata and avoid
extra I/O
- persist CLI version in thread metadata, surface it through summary
builders, and add the necessary DB migration hooks
- simplify thread listing by using enriched state DB data directly
rather than reading rollout heads

Testing:
- Not run (not requested)
2026-02-05 16:39:11 +00:00
jif-oai
68e82e5dc9
nit: add DB version is discrepancy recording (#10762) 2026-02-05 16:24:18 +00:00
jif-oai
901215e310
feat: repair DB in case of missing lines (#10751) 2026-02-05 16:21:49 +00:00
jif-oai
41f3b1ba0b
feat: add memory tool (#10637)
Add a tool for memory to retrieve a full memory based on the memory ID
2026-02-05 16:16:31 +00:00
jif-oai
fe1cbd0f38
chore: handle shutdown correctly in tui (#10756) 2026-02-05 16:07:50 +00:00
jif-oai
d337b51741
feat: wire ephemeral in codex exec (#10758) 2026-02-05 15:49:57 +00:00
jif-oai
4033f905c6
feat: resumable backfill (#10745)
## Summary

This PR makes SQLite rollout backfill resumable and repeatable instead
of one-shot-on-db-create.

## What changed

- Added a persisted backfill state table:
  - state/migrations/0008_backfill_state.sql
- Tracks status (pending|running|complete), last_watermark, and
last_success_at.
- Added backfill state model/types in codex-state:
  - BackfillState, BackfillStatus (state/src/model/backfill_state.rs)
- Added runtime APIs to manage backfill lifecycle/progress:
  - get_backfill_state
  - mark_backfill_running
  - checkpoint_backfill
  - mark_backfill_complete
- Updated core startup behavior:
- Backfill now runs whenever state is not Complete (not only when DB
file is newly created).
- Reworked backfill execution:
- Collect rollout files, derive deterministic watermark per path, sort,
resume from last_watermark.
- Process in batches (BACKFILL_BATCH_SIZE = 200), checkpoint after each
batch.
  - Mark complete with last_success_at at the end.

## Why

Previous behavior could leave users permanently partially backfilled if
the process exited during initial async backfill. This change allows
safe continuation across restarts and avoids restarting from scratch.
2026-02-05 14:34:34 +00:00
iceweasel-oai
f2ffc4e5d0
Include real OS info in metrics. (#10425)
calculated a hashed user ID from either auth user id or API key
Also correctly populates OS.

These will make our metrics more useful and powerful for analysis.
2026-02-05 06:30:31 -08:00
jif-oai
040ecee715
Update explorer role default model (#10748)
Summary
- switch the explorer role in core agent configuration to use
`gpt-5.1-codex-mini` as the default model override
- leave other role defaults untouched

Testing
- Not run (not requested)
2026-02-05 13:51:53 +00:00
pap-openai
b2424cb635
adding fork information (UI) when forking (#10246)
- shows `/fork` command that ran in prev session
- shows `session forked from name (uuid) || uuid (if name is not set)` as an event in new session
2026-02-05 13:24:55 +00:00
jif-oai
aa46b5cf99
nit: backfill stronger (#10738) 2026-02-05 12:30:16 +00:00
jif-oai
97582ac52d
Allow user shell commands to run alongside active turns (#10513)
Summary
- refactor user shell command execution into a shared helper and add
modes for standalone vs active-turn execution
- run user shell commands asynchronously when a turn is already active
so they don’t replace or abort the current turn
- extend the tests to cover the new behavior and add the generated Codex
environment manifest

Testing
- Not run (not requested)
2026-02-05 11:11:00 +00:00
jif-oai
c67120f4a0
fix: flaky landlock (#10689)
https://openai.slack.com/archives/C095U48JNL9/p1770243347893959
2026-02-05 10:30:18 +00:00
Ashutosh Kumar Singh
7b28b350e1
fix(tui): flush input buffer on init to prevent early exit on Windows (#10729)
Fixes #10661.

### Problem
On Windows, the sign-in menu can exit immediately if the OS-level input
buffer contains trailing characters (like the Enter key from running the
command).

### Solution
**Flush Input Buffer on Init**: Use FlushConsoleInputBuffer on Windows
(and cflush on Unix) in ui::init() to discard any input captured before
the TUI was ready.

Verified by @CodebyAmbrose in #10661.
2026-02-05 00:59:32 -08:00
Dylan Hurd
fe8b474acd
fix(core,app-server) resume with different model (#10719)
## Summary
When resuming with a different model, we should also append a developer
message with the model instructions

## Testing
- [x] Added unit tests
2026-02-05 00:40:05 -08:00
xl-openai
1e1146cd29
Reload cloud requirements after user login (#10725)
Reload cloud requirements after user login so it could take effect
immediately.
2026-02-05 00:27:16 -08:00
Charley Cunningham
dc7007beaa
Fix remote compaction estimator/payload instruction small mismatch (#10692)
## Summary
This PR fixes a deterministic mismatch in remote compaction where
pre-trim estimation and the `/v1/responses/compact` payload could use
different base instructions.

Before this change:
- pre-trim estimation used model-derived instructions
(`model_info.get_model_instructions(...)`)
- compact payload used session base instructions
(`sess.get_base_instructions()`)

After this change:
- remote pre-trim estimation and compact payload both use the same
`BaseInstructions` instance from session state.

## Changes
- Added a shared estimator entry point in `ContextManager`:
- `estimate_token_count_with_base_instructions(&self, base_instructions:
&BaseInstructions) -> Option<i64>`
- Kept `estimate_token_count(&TurnContext)` as a thin wrapper that
resolves model/personality instructions and delegates to the new helper.
- Updated remote compaction flow to fetch base instructions once and
reuse it for both:
  - trim preflight estimation
  - compact request payload construction
- Added regression coverage for parity and behavior:
  - unit test verifying explicit-base estimator behavior
- integration test proving remote compaction uses session override
instructions and trims accordingly

## Why this matters
This removes a deterministic divergence source where pre-trim could
think the request fits while the actual compact request exceeded context
because its instructions were longer/different.

## Scope
In scope:
- estimator/payload base-instructions parity in remote compaction

Out of scope:
- retry-on-`context_length_exceeded`
- compaction threshold/headroom policy changes
- broader trimming policy changes

## Codex author:
`codex fork 019c2b24-c2df-7b31-a482-fb8cf7a28559`
2026-02-04 23:24:06 -08:00
Ahmed Ibrahim
cd5f49a619
Make steer stable by default (#10690)
Promotes the Steer feature from Experimental to Stable and enables it by
default.

## What is Steer mode?

Steer mode changes how message submission works in the TUI:

- **With Steer enabled (new default)**: 
  - `Enter` submits messages immediately, even when a task is running
- `Tab` queues messages when a task is running (allows building up a
queue)
  
- **With Steer disabled (old behavior)**:
  - `Enter` queues messages when a task is running
  - This preserves the previous "queue while a task is running" behavior

## How Steer vs Queue work

The key difference is in the submission behavior:

1. **Steer mode** (`steer_enabled = true`):
- Enter → `InputResult::Submitted` → sends immediately via
`submit_user_message()`
- Tab → `InputResult::Queued` → queues via `queue_user_message()` if a
task is running
- This gives users direct control: Enter for immediate submission, Tab
for queuing

2. **Queue mode** (`steer_enabled = false`, previous default):
- Enter → `InputResult::Queued` → always queues when a task is running
   - Tab → `InputResult::Queued` → queues when a task is running
- This preserves the original behavior where Enter respects the running
task queue

## Implementation details

The behavior is controlled in
`ChatComposer::handle_key_event_without_popup()`:
- When `steer_enabled` is true, Enter calls `handle_submission(false)`
(submit immediately)
- When `steer_enabled` is false, Enter calls `handle_submission(true)`
(queue)

See `codex-rs/tui/src/bottom_pane/chat_composer.rs` for the
implementation.

## Documentation

For more details on the chat composer behavior, see:
- [TUI Chat Composer documentation](docs/tui-chat-composer.md)
- Feature flag definition: `codex-rs/core/src/features.rs`
2026-02-04 23:12:59 -08:00
Charley Cunningham
41b4962b0a
Sync collaboration mode naming across Default prompt, tools, and TUI (#10666)
## Summary
- add shared `ModeKind` helpers for display names, TUI visibility, and
`request_user_input` availability
- derive TUI mode filtering/labels from shared `ModeKind` metadata
instead of local hardcoded matches
- derive `request_user_input` availability text and unavailable error
mode names from shared mode metadata
- replace hardcoded known mode names in the Default collaboration-mode
template with `{{KNOWN_MODE_NAMES}}` and fill it from
`TUI_VISIBLE_COLLABORATION_MODES`
- add regression tests for mode metadata sync and placeholder
replacement

## Notes
- `cargo test -p codex-core` integration target (`tests/all`) still
shows pre-existing env-specific failures in this environment due missing
`test_stdio_server` binary resolution; core unit tests are green.

## Codex author
`codex resume 019c26ff-dfe7-7173-bc04-c9e1fff1e447`
2026-02-04 23:03:28 -08:00
Dylan Hurd
e482978261
fix(core) switching model appends model instructions (#10651)
## Summary
When switching models, we should append the instructions of the new
model to the conversation as a developer message.

## Test
- [x] Adds a unit test
2026-02-05 05:50:38 +00:00
Dylan Hurd
a05aadfa1b
chore(config) Default Personality Pragmatic (#10705)
## Summary
Switch back to Pragmatic personality

## Testing
- [x] Updated unit tests
2026-02-04 21:22:47 -08:00
cryptonerdcn
1dc06b6ffc
fix: ensure resume args precede image args (#10709)
## Summary
Fixes argument ordering when `resumeThread()` is used with
`local_image`. The SDK previously emitted CLI args with `--image` before
`resume <threadId>`, which caused the Codex CLI to treat `resume`/UUID
as image paths and start a new session. This PR moves `resume
<threadId>` before any `--image` flags and adds a regression test.

## Bug Report / Links
- OpenAI issue: https://github.com/openai/codex/issues/10708
- Repro repo:
https://github.com/cryptonerdcn/codex-resume-local-image-repro
- Repro issue (repo):
https://github.com/cryptonerdcn/codex-resume-local-image-repro/issues/1

## Repro (pre-fix)
1. Build SDK from source
2. Run resume + local_image
3. Args order: `--image <path> resume <id>`
4. Result: new session created (thread id changes)

## Fix
Move `resume <threadId>` before `--image` in `CodexExec.run` and add a
regression test to assert ordering.

## Tests
- `cd sdk/typescript && npm test`
  - **Failed**: `codex-rs/target/debug/codex` missing (ENOENT)

## Notes
- I can rerun tests in an environment with `codex-rs` built and report
results.
2026-02-04 21:19:56 -08:00
sayan-oai
4ed8d74aab
fix: ensure status indicator present earlier in exec path (#10700)
ensure status indicator present in all classifications of exec tool.
fixes indicator disappearing after preambles, will look into using
`phase` to avoid this class of error in a few hours.

commands parsed as unknown faced this issue

tested locally, added test for specific failure flow
2026-02-05 03:56:50 +00:00
Josh McKinney
d876f3b94f
fix(tui): restore working shimmer after preamble output (#10701)
## Problem
When a turn streamed a preamble line before any tool activity,
`ChatWidget` hid the status row while committing streamed lines and did
not restore it until a later event (commonly `ExecCommandBegin`). During
that idle gap, the UI looked finished even though the turn was still
active.

## Mental model
The bottom status row and transcript stream are separate progress
affordances:
- transcript stream shows committed output
- status row (spinner/shimmer + header) shows liveness of an active turn

While stream output is actively committing, hiding the status row is
acceptable to avoid redundant visual noise. Once stream controllers go
idle, an active turn must restore the status row immediately so liveness
remains visible across preamble-to-tool gaps.

## Non-goals
- No changes to streaming chunking policy or pacing.
- No changes to final completion behavior (status still hides when task
actually ends).
- No refactor of status lifecycle ownership between `ChatWidget` and
`BottomPane`.

## Tradeoffs
- We keep the existing behavior of hiding the status row during active
stream commits.
- We add explicit restoration on the idle boundary when the task is
still running.
- This introduces one extra status update on idle transitions, which is
small overhead but makes liveness semantics consistent.

## Architecture
`run_commit_tick_with_scope` in `chatwidget.rs` now documents and
enforces a two-phase contract:
1. For each committed streamed cell, hide status and append transcript
output.
2. If controllers are present and all idle, restore status iff task is
still running, preserving the current header.

This keeps status ownership in `ChatWidget` while relying on
`BottomPane` helpers:
- `hide_status_indicator()` during active stream commits
- `ensure_status_indicator()` +
`set_status_header(current_status_header)` at stream-idle boundary

Documentation pass additions:
- Clarified the function-level contract and lifecycle intent in
`run_commit_tick_with_scope`.
- Added an explicit regression snapshot test comment describing the
failing sequence.

## Observability
Signal that the fix is present:
- In the preamble-idle state, rendered output still includes `• Working
(… esc to interrupt)`.
- New snapshot:
`codex_tui__chatwidget__tests__preamble_keeps_working_status.snap`.

Debug path for future regressions:
- Start at `run_commit_tick_with_scope` for hide/restore transitions.
- Verify `bottom_pane.is_task_running()` at idle transition.
- Confirm `current_status_header` continuity when status is recreated.
- Use the new snapshot and targeted test sequence to reproduce
deterministic preamble-idle behavior.

## Tests
- Updated regression assertion:
- `streaming_final_answer_keeps_task_running_state` now expects status
widget to remain present while turn is running.
- Renamed/updated behavioral regression:
  - `preamble_keeps_status_indicator_visible_until_exec_begin`.
- Added snapshot regression coverage:
  - `preamble_keeps_working_status_snapshot`.
- Snapshot file:
`tui/src/chatwidget/snapshots/codex_tui__chatwidget__tests__preamble_keeps_working_status.snap`.

Commands run:
- `just fmt`
- `cargo test -p codex-tui
preamble_keeps_status_indicator_visible_until_exec_begin`
- `cargo test -p codex-tui preamble_keeps_working_status_snapshot`

## Risks / Inconsistencies
- Status visibility policy is still split across multiple event paths
(`commit tick`, `turn complete`, `exec begin`), so future regressions
can reintroduce ordering gaps.
- Restoration depends on `is_task_running()` correctness; if task
lifecycle flags drift, status behavior will drift too.
- Snapshot proves rendered state, not animation cadence; cadence still
relies on frame scheduling behavior elsewhere.
2026-02-04 19:28:13 -08:00
Dylan Hurd
73f32840c6
chore(core) personality migration tests (#10650)
## Summary
Adds additional tests for personality edge cases

## Testing
- [x] These are tests
2026-02-04 19:03:14 -08:00
gt-oai
1f47e08d66
Cloud Requirements: increase timeout and retries (#10631)
Add retries and an increased-length timeout for loading Cloud
Requirements.

Co-authored-by: alexsong-oai <alexsong@openai.com>
2026-02-05 01:52:12 +00:00
Josh McKinney
cddfd1e675
feat(core): add configurable log_dir (#10678)
Adds a top-level `log_dir` config key (defaults to `$CODEX_HOME/log`) so
one-off runs can redirect `codex-tui.log` via `-c`, e.g.:

  codex -c log_dir=./.codex-log

Also resolves relative paths in CLI `-c/--config` overrides for
`AbsolutePathBuf` values against the effective cwd (when available).

Tests:
- cargo test -p codex-core
2026-02-05 01:23:30 +00:00
pakrym-oai
0e8d359da9
Session-level model client (#10664)
Make ModelClient a session-scoped object.
Move state that is session level onto the client, and make state that is
per-turn explicit on corresponding methods.
Stop taking a huge Config object, instead only pass in values that are
actually needed.

---------

Co-authored-by: Josh McKinney <joshka@openai.com>
2026-02-04 16:58:48 -08:00
Owen Lin
224c9f768d
chore(app-server): document experimental API opt-in (#10667)
Add a section on how to opt in to the experimental API.
2026-02-04 16:19:13 -08:00
Owen Lin
5ea107a088
feat(app-server, core): allow text + image content items for dynamic tool outputs (#10567)
Took over the work that @aaronl-openai started here:
https://github.com/openai/codex/pull/10397

Now that app-server clients are able to set up custom tools (called
`dynamic_tools` in app-server), we should expose a way for clients to
pass in not just text, but also image outputs. This is something the
Responses API already supports for function call outputs, where you can
pass in either a string or an array of content outputs (text, image,
file):
https://platform.openai.com/docs/api-reference/responses/create#responses_create-input-input_item_list-item-function_tool_call_output-output-array-input_image

So let's just plumb it through in Codex (with the caveat that we only
support text and image for now). This is implemented end-to-end across
app-server v2 protocol types and core tool handling.

## Breaking API change
NOTE: This introduces a breaking change with dynamic tools, but I think
it's ok since this concept was only recently introduced
(https://github.com/openai/codex/pull/9539) and it's better to get the
API contract correct. I don't think there are any real consumers of this
yet (not even the Codex App).

Old shape:
`{ "output": "dynamic-ok", "success": true }`

New shape:
```
{
    "contentItems": [
      { "type": "inputText", "text": "dynamic-ok" },
      { "type": "inputImage", "imageUrl": "data:image/png;base64,AAA" }
    ]
  "success": true
}
```
2026-02-04 16:12:47 -08:00
Ahmed Ibrahim
f9c38f531c
add none personality option (#10688)
- add none personality enum value and empty placeholder behavior\n- add
docs/schema updates and e2e coverage
2026-02-04 15:40:33 -08:00
Eric Traut
7bcc552325
Added support for live updates to skills (#10478)
Add a centralized FileWatcher in codex-core (using notify) that watches
skill roots from the config layer stack (recursive)

Send `SkillsChanged` events when relevant file system changes are
detected

On `SkillsChanged`:
* Invalidate the skills cache immediately in ThreadManager
* Emit EventMsg::SkillsUpdateAvailable to active sessions
~~* Broadcast a new app-server notification:
SkillsListUpdatedNotification~~

This change does not inject new items into the event stream. That means
the agent will not know about new skills, so it won't be able to
implicitly invoke new skills. It also won't know about changes to
existing skills, so if it has already read the contents of a modified
skill, it will not honor the new behavior.

This change also does not detect modifications to AGENTS.md.

I plan to address these limitations in a follow-on PR modeled after
#9985. Injection of new skills and AGENTS was deemed to risky, hence the
need to split the feature into two stages. The changes in this PR were
designed to easily accommodate the second stage once we have some other
foundational changes in place.

Testing: In addition to automated tests, I did manual testing to confirm
that newly-created skills, deleted skills, and renamed skills are
reflected in the TUI skill picker menu. Also confirmed that
modifications to behaviors for explicitly-invoked skills are honored.

---------

Co-authored-by: Xin Lin <xl@openai.com>
2026-02-04 15:25:03 -08:00
gt-oai
d452bb3ae5
Add /debug-config slash command (#10642)
<img width="409" height="175" alt="image"
src="https://github.com/user-attachments/assets/76efe9c5-8375-4af3-b6af-bd9e162c1bc3"
/>
2026-02-04 22:26:17 +00:00
gt-oai
7c6d21a414
Fix test_shell_command_interruption flake (#10649)
## Human summary
Sandboxing (specifically `LandlockRestrict`) is means that e.g. `sleep
10` fails immediately. Therefore it cannot be interrupted.

In suite::interrupt::test_shell_command_interruption, sleep 10 is issued
at 17:28:16.554 (ToolCall: shell_command {"command":"sleep 10"...}),
then fails at 17:28:16.589 with duration_ms=34, success=false,
exit_code=101, and
    Sandbox(LandlockRestrict).

## Codex summary
- set `sandbox_mode = "danger-full-access"` in `interrupt` and
`v2/turn_interrupt` integration tests
- set `sandbox: Some(SandboxMode::DangerFullAccess)` in
`test_codex_jsonrpc_conversation_flow`
- set `sandbox_policy: Some(SandboxPolicy::DangerFullAccess)` in
`command_execution_notifications_include_process_id`

## Why
On some Linux CI environments, command execution fails immediately with
`LandlockRestrict` when sandboxed. These tests are intended to validate
JSON-RPC/task lifecycle behavior (interrupt semantics, command
notification shape/process id, request flow), but early sandbox startup
failure changes turn flow and can trigger extra follow-up requests,
causing flakes.

This change removes environment-specific sandbox startup dependency from
these tests while preserving their primary intent.

## Testing
- not run in this environment (per request)
2026-02-04 22:19:06 +00:00
Matthew Zeng
acdbd8edc5
[apps] Cache MCP actions from apps. (#10662)
MCP actions take a long time to load for users with lots of apps
installed. Adding a cache for these actions with 1hr expiration, given
that they are almost always aren't going to change unless people install
another app, which means they also need to restart codex to pick it up.
2026-02-04 13:51:57 -08:00
canvrno-oai
d589ee05b1
Fix jitter in TUI apps/connectors picker (#10593)
This PR fixes jitter in the TUI apps menu by making the description
column stable during rendering and height measurement.
Added a `stable_desc_col` option to
`SelectionViewParams`/`ListSelectionView`, introduced stable variants of
the shared row render/measure helpers in `selection_popup_common`, and
enabled the stable mode for the apps/connectors picker in `chatwidget`.
With these changes, only the apps/connectors picker uses this new
option, though it could be used elsewhere in the future.

Why: previously, the description column was computed from only currently
visible rows, so as you scrolled or filtered, the column could shift and
cause wrapping/height changes that looked jumpy. Computing it from all
rows in this popup keeps alignment and layout consistent as users scroll
through avaialble apps.



**Before:**

https://github.com/user-attachments/assets/3856cb72-5465-4b90-a993-65a2ffb09113





**After:**

https://github.com/user-attachments/assets/37b9d626-0b21-4c0f-8bb8-244c9ef971ff
2026-02-04 13:51:31 -08:00
jif-oai
4922b3e571
feat: add phase 1 mem db (#10634)
- Schema: thread_id (PK, FK to threads.id with cascade delete),
trace_summary, memory_summary, updated_at.
- Migration: creates the table and an index on (updated_at DESC,
thread_id DESC) for efficient recent-first reads.
  - Runtime API (DB-only):
      - `get_thread_memory(thread_id)`: fetch one memory row.
- `upsert_thread_memory(thread_id, trace_summary, memory_summary)`:
insert/update by thread id and always advance updated_at.
- `get_last_n_thread_memories_for_cwd(cwd, n)`: join thread_memory with
threads and return newest n rows for an exact cwd match.
- Model layer: introduced ThreadMemory and row conversion types to keep
query decoding typed and consistent with existing state models.
2026-02-04 21:38:39 +00:00
Ahmed Ibrahim
7a253076fe
Persist pending input user events (#10656)
- Persist user-message events for mid-turn injected input by emitting
user message turn items when pending input is recorded.
2026-02-04 11:47:10 -08:00