Commit graph

3718 commits

Author SHA1 Message Date
Owen Lin
e8f6d65899
fix(app-server): add will_retry to ErrorNotification (#7611)
VSCE renders `codex/event/stream_error` (automatically retried, e.g.
`"Reconnecting... 1/n"`) and `codex/event/error` (terminal errors)
differently, so add `will_retry` on ErrorNotification to indicate this.
2025-12-04 21:48:37 +00:00
Owen Lin
342c084cc3
fix(app-server): add duration_ms to McpToolCallItem (#7605)
Seems like a nice field to have, and also VSCE does render this one.
2025-12-04 13:45:07 -08:00
Ahmed Ibrahim
903b7774bc
Add models endpoint (#7603)
- Use the codex-api crate to introduce models endpoint. 
- Add `models` to codex core tests helpers
- Add `ModelsInfo` for the endpoint return type
2025-12-04 12:57:54 -08:00
Ahmed Ibrahim
6e6338aa87
Inline response recording and remove process_items indirection (#7310)
- Inline response recording during streaming: `run_turn` now records
items as they arrive instead of building a `ProcessedResponseItem` list
and post‑processing via `process_items`.
- Simplify turn handling: `handle_output_item_done` returns the
follow‑up signal + optional tool future; `needs_follow_up` is set only
there, and in‑flight tool futures are drained once at the end (errors
logged, no extra state writes).
- Flattened stream loop: removed `process_items` indirection and the
extra output queue
- - Tests: relaxed `tool_parallelism::tool_results_grouped` to allow any
completion order while still requiring matching call/output IDs.
2025-12-04 12:17:54 -08:00
Jeremy Rose
7dfc3a4dc7
add --branch to codex cloud exec (#7602)
Adds `--branch` to `codex cloud exec` to set base branch.
2025-12-04 12:00:18 -08:00
Ahmed Ibrahim
9b2055586d
remove model_family from `config (#7571)
- Remove `model_family` from `config`
- Make sure to still override config elements related to `model_family`
like supporting reasoning
2025-12-04 11:57:58 -08:00
Maxime Savard
ce0b38c056
FIX: WSL Paste image does not work (#6793)
## Related issues:  
- https://github.com/openai/codex/issues/3939  
- https://github.com/openai/codex/issues/2292  
- https://github.com/openai/codex/issues/7528 (After correction
https://github.com/openai/codex/pull/3990)

**Area:** `codex-cli` (image handling / clipboard & file uploads)  
**Platforms affected:** WSL (Ubuntu on Windows 10/11). No behavior
change on native Linux/macOS/Windows.

## Summary

This PR fixes image pasting and file uploads when running `codex-cli`
inside WSL. Previously, image operations failed silently or with
permission errors because paths weren't properly mapped between Windows
and WSL filesystems.

## Visual Result

<img width="1118" height="798" alt="image"
src="https://github.com/user-attachments/assets/14e10bc4-6b71-4d1f-b2a6-52c0a67dd069"
/>

## Last Rust-Cli

<img width="1175" height="859" alt="image"
src="https://github.com/user-attachments/assets/7ef41e29-9118-42c9-903c-7116d21e1751"
/>

## Root cause

The CLI assumed native Linux/Windows environments and didn't handle the
WSL↔Windows boundary:

- Used Linux paths for files that lived on the Windows host
- Missing path normalization between Windows (`C:\...`) and WSL
(`/mnt/c/...`)
- Clipboard access failed under WSL

### Why `Ctrl+V` doesn't work in WSL terminals

Most WSL terminal emulators (Windows Terminal, ConEmu, etc.) intercept
`Ctrl+V` at the terminal level to paste text from the Windows clipboard.
This keypress never reaches the CLI application itself, so our clipboard
image handler never gets triggered. Users need `Ctrl+Alt+V`.

## Changes

### WSL detection & path mapping

- Detects WSL by checking `/proc/sys/kernel/osrelease` and the
`WSL_INTEROP` env var
- Maps Windows drive paths to WSL mount paths (`C:\...` → `/mnt/c/...`)

### Clipboard fallback for WSL

- When clipboard access fails under WSL, falls back to PowerShell to
extract images from the Windows clipboard
- Saves to a temp file and maps the path back to WSL

### UI improvements

- Shows `Ctrl+Alt+V` hint on WSL (many terminals intercept plain
`Ctrl+V`)
- Better error messages for unreadable images

## Performance

- Negligible overhead. The fallback adds a single FS copy to a temp file
only when needed.
- Direct streaming remains the default.

## Files changed

- `protocol/src/lib.rs` – Added platform detection module  
- `protocol/src/models.rs` – Added WSL path mapping for local images  
- `protocol/src/platform.rs` – New module with WSL detection utilities  
- `tui/src/bottom_pane/chat_composer.rs` – Added base64 data URL support
and WSL path mapping
- `tui/src/bottom_pane/footer.rs` – WSL-aware keyboard shortcuts  
- `tui/src/clipboard_paste.rs` – PowerShell clipboard fallback

## How to reproduce the original bug (pre-fix)

1. Run `codex-cli` inside WSL2 on Windows.  
2. Paste an image from the Windows clipboard or drag an image from
`C:\...` into the terminal.
3. Observe that the image is not attached (silent failure) or an error
is logged; no artifact reaches the tool.

## How to verify the fix

1. Build this branch and run `codex-cli` inside WSL2.  
2. Paste from clipboard and drag from both Windows and WSL paths.  
3. Confirm that the image appears in the tool and the CLI shows a single
concise info line (no warning unless fallback was used).

I’m happy to adjust paths, naming, or split helpers into a separate
module if you prefer.

## How to try this branch

If you want to try this before it’s merged, you can use my Git branch:

Repository: https://github.com/Waxime64/codex.git  
Branch: `wsl-image-2`

1. Start WSL on your Windows machine.
2. Clone the repository and switch to the branch:
   ```bash
   git clone https://github.com/Waxime64/codex.git
   cd codex
   git checkout wsl-image-2
   # then go into the Rust workspace root, e.g.:
   cd codex-rs
3. Build the TUI binary:
  cargo build -p codex-tui --bin codex-tui --release
4. Install the binary:
   sudo install -m 0755 target/release/codex-tui /usr/local/bin/codex
5. From the project directory where you want to use Codex, start it
with:
   cd /path/to/your/project
   /usr/local/bin/codex

On WSL, use CTRL+ALT+V to paste an image from the Windows clipboard into
the chat.
2025-12-04 10:50:20 -08:00
Dylan Hurd
37c36024c7
chore(core): test apply_patch_cli on Windows (#7554)
## Summary
These tests pass on windows, let's enable them.

## Testing
- [x] These are more tests
2025-12-04 10:39:45 -08:00
jif-oai
291b54a762
chore: review in read-only (#7593) 2025-12-04 10:01:12 -08:00
jif-oai
2b5d0b2935
feat: update sandbox policy to allow TTY (#7580)
**Change**: Seatbelt now allows file-ioctl on /dev/ttys[0-9]+ even
without the sandbox extension so pre-created PTYs remain interactive
(Python REPL, shells).

**Risk**: A seatbelted process that already holds a PTY fd (including
one it shouldn’t) could issue tty ioctls like TIOCSTI or termios changes
on that fd. This doesn’t allow opening new PTYs or reading/writing them;
it only broadens ioctl capability on existing fds.

**Why acceptable**: We already hand the child its PTY for interactive
use; restoring ioctls is required for isatty() and prompts to work. The
attack requires being given or inheriting a sensitive PTY fd; by design
we don’t hand untrusted processes other users’ PTYs (we don't hand them
any PTYs actually), so the practical exposure is limited to the PTY
intentionally allocated for the session.

**Validation**:
Running
```
start a python interpreter and keep it running
```
Followed by:
* `calculate 1+1 using it` -> works as expected
* `Use this Python session to run the command just fix in
/Users/jif/code/codex/codex-rs` -> does not work as expected
2025-12-04 17:58:58 +00:00
zhao-oai
404a1ea34b
Update execpolicy.md (#7595) 2025-12-04 17:55:42 +00:00
jif-oai
36edb412b1
fix: release session ID when not used (#7592) 2025-12-04 17:42:16 +00:00
jif-oai
1b2509f05a
chore: default warning messages to true (#7588) 2025-12-04 17:29:23 +00:00
pakrym-oai
f1b7cdc3bd
Use shared check sandboxing (#7547) 2025-12-04 08:34:09 -08:00
pakrym-oai
c4e18f1b63
Slightly better status display for unified exec (#7563)
Trim bash -lc
2025-12-04 08:32:54 -08:00
jif-oai
8f4e00e1f1
chore: tool tip for /prompt (#7591) 2025-12-04 15:13:49 +00:00
zhao-oai
87666695ba
execpolicy tui flow (#7543)
## Updating the `execpolicy` TUI flow

In the TUI, when going through the command approval flow, codex will now
ask the user if they would like to whitelist the FIRST unmatched command
among a chain of commands.

For example, let's say the agent wants to run `apple | pear` with an
empty `execpolicy`

Neither apple nor pear will match to an `execpolicy` rule. Thus, when
prompting the user, codex tui will ask the user if they would like to
whitelist `apple`.

If the agent wants to run `apple | pear` again, they would be prompted
again because pear is still unknown. when prompted, the user will now be
asked if they'd like to whitelist `pear`.

Here's a demo video of this flow:


https://github.com/user-attachments/assets/fd160717-f6cb-46b0-9f4a-f0a974d4e710

This PR also removed the `allow for this session` option from the TUI.
2025-12-04 07:58:13 +00:00
ae
871f44f385
Add Enterprise plan to ChatGPT login description (#6918)
## Summary
- update ChatGPT onboarding login description to mention Enterprise
plans alongside Plus, Pro, and Team

## Testing
- just fmt


------
[Codex
Task](https://chatgpt.com/codex/tasks/task_i_691e088daf20832c88d8b667adf45128)
2025-12-03 23:47:46 -08:00
zhao-oai
3d35cb4619
Refactor execpolicy fallback evaluation (#7544)
## Refactor of the `execpolicy` crate

To illustrate why we need this refactor, consider an agent attempting to
run `apple | rm -rf ./`. Suppose `apple` is allowed by `execpolicy`.
Before this PR, `execpolicy` would consider `apple` and `pear` and only
render one rule match: `Allow`. We would skip any heuristics checks on
`rm -rf ./` and immediately approve `apple | rm -rf ./` to run.

To fix this, we now thread a `fallback` evaluation function into
`execpolicy` that runs when no `execpolicy` rules match a given command.
In our example, we would run `fallback` on `rm -rf ./` and prevent
`apple | rm -rf ./` from being run without approval.
2025-12-03 23:39:48 -08:00
zhao-oai
e925a380dc
whitelist command prefix integration in core and tui (#7033)
this PR enables TUI to approve commands and add their prefixes to an
allowlist:
<img width="708" height="605" alt="Screenshot 2025-11-21 at 4 18 07 PM"
src="https://github.com/user-attachments/assets/56a19893-4553-4770-a881-becf79eeda32"
/>

note: we only show the option to whitelist the command when 
1) command is not multi-part (e.g `git add -A && git commit -m 'hello
world'`)
2) command is not already matched by an existing rule
2025-12-03 23:17:02 -08:00
Jeremy Rose
ccdeb9d9c4
use markdown for rendering tips (#7557)
## Summary
- render tooltip content through the markdown renderer and prepend a
bold Tip label
- wrap tooltips at the available width using the indent’s measured width
before adding the indent

## Testing
- `/root/.cargo/bin/just fmt`
- `RUSTFLAGS="--cfg tokio_unstable" TOKIO_UNSTABLE=1
/root/.cargo/bin/just fix -p codex-tui` *(fails: codex-tui tests
reference tokio::time::advance/start_paused gated behind the tokio
test-util feature)*
- `RUSTFLAGS="--cfg tokio_unstable" TOKIO_UNSTABLE=1 cargo test -p
codex-tui` *(fails: codex-tui tests reference
tokio::time::advance/start_paused gated behind the tokio test-util
feature)*

------
[Codex
Task](https://chatgpt.com/codex/tasks/task_i_693081406050832c9772ae9fa5dd77ca)
2025-12-04 04:58:35 +00:00
Ahmed Ibrahim
67e67e054f
Migrate codex max (#7566)
- make codex max the default
- fix: we were doing some async work in sync function which caused tui
to panic
2025-12-03 20:54:48 -08:00
Eric Traut
edd98dd3b7
Remove test from #7481 that doesn't add much value (#7558)
Follow-up from PR #7481
2025-12-03 19:10:54 -08:00
Celia Chen
3e6cd5660c
[app-server] make file_path for config optional (#7560)
When we are writing to config using `config/value/write` or
`config/batchWrite`, it always require a `config/read` before it right
now in order to get the correct file path to write to. make this
optional so we read from the default user config file if this is not
passed in.
2025-12-04 03:08:18 +00:00
Ahmed Ibrahim
cee37a32b2
Migrate model family to models manager (#7565)
This PR moves `ModelsFamily` to `openai_models`. It also propagates
`ModelsManager` to session services and use it to drive model family. We
also make `derive_default_model_family` private because it's a step
towards what we want: one place that gives model configuration.

This is a second step at having one source of truth for models
information and config: `ModelsManager`.

Next steps would be to remove `ModelsFamily` from config. That's massive
because it's being used in 41 occasions mostly pre launching `codex`.
Also, we need to make `find_family_for_model` private. It's also big
because it's being used in 21 occasions ~ all tests.
2025-12-03 18:49:47 -08:00
Ahmed Ibrahim
8da91d1c89
Migrate tui to use models manager (#7555)
- This PR treats the `ModelsManager` like `AuthManager` and propagate it
into the tui, replacing the `builtin_model_presets`
- We are also decreasing the visibility of `builtin_model_presets`

based on https://github.com/openai/codex/pull/7552
2025-12-03 18:00:47 -08:00
Ahmed Ibrahim
00cc00ead8
Introduce ModelsManager and migrate app-server to use it. (#7552) 2025-12-03 17:17:56 -08:00
muyuanjin
70b97790be
fix: wrap long exec lines in transcript overlay (#7481)
What
-----
- Fix the Ctrl+T transcript overlay so that very long exec output lines
are soft‑wrapped to the viewport width instead of being rendered as a
single truncated row.
- Add a regression test to `TranscriptOverlay` to ensure long exec
outputs are rendered on multiple lines in the overlay.

Why
----
- Previously, the transcript overlay rendered extremely long single exec
lines as one on‑screen row and simply cut them off at the right edge,
with no horizontal scrolling.
- This made it impossible to inspect the full content of long tool/exec
outputs in the transcript view, even though the main TUI view already
wrapped those lines.
- Fixes #7454.

How
----
- Update `ExecCell::transcript_lines` to wrap exec output lines using
the existing `RtOptions`/`word_wrap_line` helpers so that transcript
rendering is width‑aware.
- Reuse the existing line utilities to expand the wrapped `Line` values
into the transcript overlay, preserving styling while respecting the
current viewport width.
- Add `transcript_overlay_wraps_long_exec_output_lines` test in
`pager_overlay.rs` that constructs a long single‑line exec output,
renders the transcript overlay into a small buffer, and asserts that the
long marker string spans multiple rendered lines.
2025-12-03 16:45:08 -08:00
Michael Bolin
1cfc967eb8
fix: Features should be immutable over the lifetime of a session/thread (#7540)
I noticed that `features: Features` was defined on `struct
SessionConfiguration`, which is commonly owned by `SessionState`, which
is in turn owned by `Session`.

Though I do not believe that `Features` should be allowed to be modified
over the course of a session (if the feature state is not invariant, it
makes it harder to reason about), which argues that it should live on
`Session` rather than `SessionState` or `SessionConfiguration`.

This PR moves `Features` to `Session` and updates all call sites. It
appears the only place we were mutating `Features` was:

- in tests
- the sub-agent config for a review task:


3ef76ff29d/codex-rs/core/src/tasks/review.rs (L86-L89)

Note this change also means it is no longer an `async` call to check the
state of a feature, eliminating the possibility of a
[TOCTTOU](https://en.wikipedia.org/wiki/Time-of-check_to_time-of-use)
error between checking the state of a feature and acting on it:


3ef76ff29d/codex-rs/core/src/codex.rs (L1069-L1076)
2025-12-03 16:12:31 -08:00
xl-openai
9a50a04400
feat: Support listing and selecting skills via $ or /skills (#7506)
List/Select skills with $-mention or /skills
2025-12-03 15:12:46 -08:00
Owen Lin
231ff19ca2
[app-server] fix: add thread_id to turn/plan/updated (#7553)
Realized we're missing this while migrating VSCE.
2025-12-03 15:00:07 -08:00
Aofei Sheng
de08c735a6
feat(tui): map Ctrl-P/N to arrow navigation in textarea (#7530)
- Treat Ctrl-P/N (and their C0 fallbacks) the same as Up/Down so cursor
movement matches popup/history behavior and control bytes never land in
the buffer

Fixes #7529

Signed-off-by: Aofei Sheng <aofei@aofeisheng.com>
2025-12-03 14:43:31 -08:00
muyuanjin
3395ebd96e
fix(tui): limit user shell output by screen lines (#7448)
What
- Limit the TUI "user shell" output panel by the number of visible
screen lines rather than by the number of logical lines.
- Apply middle truncation after wrapping, so a few extremely long lines
cannot expand into hundreds of visible lines.
- Add a regression test to guard this behavior.

Why
When the `ExecCommandSource::UserShell` tool returns a small number of
very long logical lines, the TUI wraps those lines into many visual
lines. The existing truncation logic applied
`USER_SHELL_TOOL_CALL_MAX_LINES` to the number of logical lines *before*
wrapping.

As a result, a command like:

- `Ran bash -lc "grep -R --line-number 'maskAssetId' ."`

or a synthetic command that prints a single ~50,000‑character line, can
produce hundreds of screen lines and effectively flood the viewport. The
intended middle truncation for user shell output does not take effect in
this scenario.

How
- In `codex-rs/tui/src/exec_cell/render.rs`, change the `ExecCell`
rendering path for `ExecCommandSource::UserShell` so that:
- Each logical line from `CommandOutput::aggregated_output` is first
wrapped via `word_wrap_line` into multiple screen lines using the
appropriate `RtOptions` and width from the `EXEC_DISPLAY_LAYOUT`
configuration.
- `truncate_lines_middle` is then applied to the wrapped screen lines,
with `USER_SHELL_TOOL_CALL_MAX_LINES` as the limit. This means the limit
is enforced on visible screen lines, not logical lines.
- The existing layout struct (`ExecDisplayLayout`) continues to provide
`output_max_lines`, so user shell output is subject to both
`USER_SHELL_TOOL_CALL_MAX_LINES` and the layout-specific
`output_max_lines` constraint.
- Keep using `USER_SHELL_TOOL_CALL_MAX_LINES` as the cap, but interpret
it as a per‑tool‑call limit on screen lines.
- Add a regression test `user_shell_output_is_limited_by_screen_lines`
in `codex-rs/tui/src/exec_cell/render.rs` that:
- Constructs two extremely long logical lines containing a short marker
(`"Z"`), so each wrapped screen line still contains the marker.
  - Wraps them at a narrow width to generate many screen lines.
- Asserts that the unbounded wrapped output would exceed
`USER_SHELL_TOOL_CALL_MAX_LINES` screen lines.
- Renders an `ExecCell` for `ExecCommandSource::UserShell` at the same
width and counts rendered lines containing the marker.
- Asserts `output_screen_lines <= USER_SHELL_TOOL_CALL_MAX_LINES`,
guarding against regressions where truncation happens before wrapping.

This change keeps user shell output readable while ensuring it cannot
flood the TUI, even when the tool emits a few extremely long lines.

Tests
- `cargo test -p codex-tui`

Issue
- Fixes #7447
2025-12-03 13:43:17 -08:00
Ahmed Ibrahim
71504325d3
Migrate model preset (#7542)
- Introduce `openai_models` in `/core`
- Move `PRESETS` under it
- Move `ModelPreset`, `ModelUpgrade`, `ReasoningEffortPreset`,
`ReasoningEffortPreset`, and `ReasoningEffortPreset` to `protocol`
- Introduce `Op::ListModels` and `EventMsg::AvailableModels`

Next steps:
- migrate `app-server` and `tui` to use the introduced Operation
2025-12-03 20:30:43 +00:00
jif-oai
7f068cfbcc
fix: main (#7546) 2025-12-03 20:15:12 +00:00
jif-oai
9e6c2c1e64
feat: add pycache to excluded directories (#7545) 2025-12-03 20:06:55 +00:00
jif-oai
8d0f023fa9
chore: update unified exec sandboxing detection (#7541)
No integration test for now because it would make them flaky. Tracking
it in my todos to add some once we have a clock based system for
integration tests
2025-12-03 20:06:47 +00:00
Ahmed Ibrahim
2ad980abf4
add slash resume (#7302)
`codex resume` isn't that discoverable. Adding it to the slash commands
can help
2025-12-03 11:25:44 -08:00
Owen Lin
3ef76ff29d
chore: conversation_id -> thread_id in app-server feedback/upload (#7538)
Use `thread_id: Option<String>` instead of `conversation_id:
Option<ConversationId>` to be consistent with the rest of app-server v2
APIs.
2025-12-03 18:47:35 +00:00
Owen Lin
844de19561
chore: delete unused TodoList item from app-server (#7537)
This item is sent as a turn notification instead: `turn/plan/updated`,
similar to Turn diffs (which is `turn/diff/updated`).

We treat these concepts as ephemeral compared to Items which are usually
persisted.
2025-12-03 18:47:12 +00:00
Owen Lin
343aa35db1
chore: update app-server README (#7510)
Just keeping the README up to date.

- Reorganize structure a bit to read more naturally
- Update RPC methods
- Update events
2025-12-03 10:41:38 -08:00
Shijie Rao
c3e4f920b4
chore: remove bun env var detect (#7534)
### Summary


[Thread](https://openai.slack.com/archives/C08JZTV654K/p1764780129457519)

We were a bit aggressive on assuming package installer based on env
variables for BUN. Here we are removing those checks.
2025-12-03 10:23:45 -08:00
Shijie Rao
4785344c9c
feat: support list mcp servers in app server (#7505)
### Summary
Added `mcp/servers/list` which is equivalent to `/mcp` slash command in
CLI for response. This will be used in VSCE MCP settings to show log in
status, available tools etc.
2025-12-03 09:51:46 -08:00
Jeremy Rose
9b3251f28f
seatbelt: allow openpty() (#7507)
This allows `openpty(3)` to run in the default sandbox. Also permit
reading `kern.argmax`, which is the maximum number of arguments to
exec().
2025-12-03 09:15:38 -08:00
jif-oai
45f3250eec
feat: codex tool tips (#7440)
<img width="551" height="316" alt="Screenshot 2025-12-01 at 12 22 26"
src="https://github.com/user-attachments/assets/6ca3deff-8ef8-4f74-a8e1-e5ea13fd6740"
/>
2025-12-03 16:29:13 +00:00
jif-oai
51307eaf07
feat: retroactive image placeholder to prevent poisoning (#6774)
If an image can't be read by the API, it will poison the entire history,
preventing any new turn on the conversation.
This detect such cases and replace the image by a placeholder
2025-12-03 11:35:56 +00:00
jif-oai
42ae738f67
feat: model warning in case of apply patch (#7494) 2025-12-03 09:07:31 +00:00
Dylan Hurd
00ef9d3784
fix(tui) Support image paste from clipboard on native Windows (#7514)
Closes #3404 

## Summary
On windows, ctrl+v does not work for the same reason that cmd+v does not
work on macos. This PR adds alt/option+v detection, which allows windows
users to paste images from the clipboard using.

We could swap between just ctrl on mac and just alt on windows, but this
felt simpler - I don't feel strongly about it.

Note that this will NOT address image pasting in WSL environments, due
to issues with WSL <> Windows clipboards. I'm planning to address that
in a separate PR since it will likely warrant some discussion.

## Testing
- [x] Tested locally on a Mac and Windows laptop
2025-12-02 22:12:49 -08:00
Robby He
f3989f6092
fix(unified_exec): use platform default shell when unified_exec shell… (#7486)
# Unified Exec Shell Selection on Windows

## Problem

reference issue #7466

The `unified_exec` handler currently deserializes model-provided tool
calls into the `ExecCommandArgs` struct:

```rust
#[derive(Debug, Deserialize)]
struct ExecCommandArgs {
    cmd: String,
    #[serde(default)]
    workdir: Option<String>,
    #[serde(default = "default_shell")]
    shell: String,
    #[serde(default = "default_login")]
    login: bool,
    #[serde(default = "default_exec_yield_time_ms")]
    yield_time_ms: u64,
    #[serde(default)]
    max_output_tokens: Option<usize>,
    #[serde(default)]
    with_escalated_permissions: Option<bool>,
    #[serde(default)]
    justification: Option<String>,
}
```

The `shell` field uses a hard-coded default:

```rust
fn default_shell() -> String {
    "/bin/bash".to_string()
}
```

When the model returns a tool call JSON that only contains `cmd` (which
is the common case), Serde fills in `shell` with this default value.
Later, `get_command` uses that value as if it were a model-provided
shell path:

```rust
fn get_command(args: &ExecCommandArgs) -> Vec<String> {
    let shell = get_shell_by_model_provided_path(&PathBuf::from(args.shell.clone()));
    shell.derive_exec_args(&args.cmd, args.login)
}
```

On Unix, this usually resolves to `/bin/bash` and works as expected.
However, on Windows this behavior is problematic:

- The hard-coded `"/bin/bash"` is not a valid Windows path.
- `get_shell_by_model_provided_path` treats this as a model-specified
shell, and tries to resolve it (e.g. via `which::which("bash")`), which
may or may not exist and may not behave as intended.
- In practice, this leads to commands being executed under a non-default
or non-existent shell on Windows (for example, WSL bash), instead of the
expected Windows PowerShell or `cmd.exe`.

The core of the issue is that **"model did not specify `shell`" is
currently interpreted as "the model explicitly requested `/bin/bash`"**,
which is both Unix-specific and wrong on Windows.

## Proposed Solution

Instead of hard-coding `"/bin/bash"` into `ExecCommandArgs`, we should
distinguish between:

1. **The model explicitly specifying a shell**, e.g.:

   ```json
   {
     "cmd": "echo hello",
     "shell": "pwsh"
   }
   ```

In this case, we *do* want to respect the model’s choice and use
`get_shell_by_model_provided_path`.

2. **The model omitting the `shell` field entirely**, e.g.:

   ```json
   {
     "cmd": "echo hello"
   }
   ```

In this case, we should *not* assume `/bin/bash`. Instead, we should use
`default_user_shell()` and let the platform decide.

To express this distinction, we can:

1. Change `shell` to be optional in `ExecCommandArgs`:

   ```rust
   #[derive(Debug, Deserialize)]
   struct ExecCommandArgs {
       cmd: String,
       #[serde(default)]
       workdir: Option<String>,
       #[serde(default)]
       shell: Option<String>,
       #[serde(default = "default_login")]
       login: bool,
       #[serde(default = "default_exec_yield_time_ms")]
       yield_time_ms: u64,
       #[serde(default)]
       max_output_tokens: Option<usize>,
       #[serde(default)]
       with_escalated_permissions: Option<bool>,
       #[serde(default)]
       justification: Option<String>,
   }
   ```

Here, the absence of `shell` in the JSON is represented as `shell:
None`, rather than a hard-coded string value.
2025-12-02 21:49:25 -08:00
Matthew Zeng
dbec741ef0
Update device code auth strings. (#7498)
- [x] Update device code auth strings.
2025-12-02 17:36:38 -08:00