## 🐛 Problem Users running commands with non-ASCII characters (like Russian text "пример") in Windows/WSL environments experience garbled text in VSCode's shell preview window, with Unicode replacement characters (�) appearing instead of the actual text. **Issue**: https://github.com/openai/codex/issues/6178 ## 🔧 Root Cause The issue was in `StreamOutput<Vec<u8>>::from_utf8_lossy()` method in `codex-rs/core/src/exec.rs`, which used `String::from_utf8_lossy()` to convert shell output bytes to strings. This function immediately replaces any invalid UTF-8 byte sequences with replacement characters, without attempting to decode using other common encodings. In Windows/WSL environments, shell output often uses encodings like: - Windows-1252 (common Windows encoding) - Latin-1/ISO-8859-1 (extended ASCII) ## 🛠️ Solution Replaced the simple `String::from_utf8_lossy()` call with intelligent encoding detection via a new `bytes_to_string_smart()` function that tries multiple encoding strategies: 1. **UTF-8** (fast path for valid UTF-8) 2. **Windows-1252** (handles Windows-specific characters in 0x80-0x9F range) 3. **Latin-1** (fallback for extended ASCII) 4. **Lossy UTF-8** (final fallback, same as before) ## 📁 Changes ### New Files - `codex-rs/core/src/text_encoding.rs` - Smart encoding detection module - `codex-rs/core/tests/suite/text_encoding_fix.rs` - Integration tests ### Modified Files - `codex-rs/core/src/lib.rs` - Added text_encoding module - `codex-rs/core/src/exec.rs` - Updated StreamOutput::from_utf8_lossy() - `codex-rs/core/tests/suite/mod.rs` - Registered new test module ## ✅ Testing - **5 unit tests** covering UTF-8, Windows-1252, Latin-1, and fallback scenarios - **2 integration tests** simulating the exact Issue #6178 scenario - **Demonstrates improvement** over the previous `String::from_utf8_lossy()` approach All tests pass: ```bash cargo test -p codex-core text_encoding cargo test -p codex-core test_shell_output_encoding_issue_6178 ``` ## 🎯 Impact - ✅ **Eliminates garbled text** in VSCode shell preview for non-ASCII content - ✅ **Supports Windows/WSL environments** with proper encoding detection - ✅ **Zero performance impact** for UTF-8 text (fast path) - ✅ **Backward compatible** - UTF-8 content works exactly as before - ✅ **Handles edge cases** with robust fallback mechanism ## 🧪 Test Scenarios The fix has been tested with: - Russian text ("пример") - Windows-1252 quotation marks (""test") - Latin-1 accented characters ("café") - Mixed encoding content - Invalid byte sequences (graceful fallback) ## 📋 Checklist - [X] Addresses the reported issue - [X] Includes comprehensive tests - [X] Maintains backward compatibility - [X] Follows project coding conventions - [X] No breaking changes --------- Co-authored-by: Josh McKinney <joshka@openai.com> |
||
|---|---|---|
| .devcontainer | ||
| .github | ||
| .vscode | ||
| codex-cli | ||
| codex-rs | ||
| docs | ||
| scripts | ||
| sdk/typescript | ||
| .codespellignore | ||
| .codespellrc | ||
| .gitignore | ||
| .npmrc | ||
| .prettierignore | ||
| .prettierrc.toml | ||
| AGENTS.md | ||
| CHANGELOG.md | ||
| cliff.toml | ||
| flake.lock | ||
| flake.nix | ||
| LICENSE | ||
| NOTICE | ||
| package.json | ||
| pnpm-lock.yaml | ||
| pnpm-workspace.yaml | ||
| PNPM.md | ||
| README.md | ||
npm i -g @openai/codex
or brew install --cask codex
Codex CLI is a coding agent from OpenAI that runs locally on your computer.
If you want Codex in your code editor (VS Code, Cursor, Windsurf), install in your IDE
If you are looking for the cloud-based agent from OpenAI, Codex Web, go to chatgpt.com/codex
Quickstart
Installing and running Codex CLI
Install globally with your preferred package manager. If you use npm:
npm install -g @openai/codex
Alternatively, if you use Homebrew:
brew install --cask codex
Then simply run codex to get started:
codex
If you're running into upgrade issues with Homebrew, see the FAQ entry on brew upgrade codex.
You can also go to the latest GitHub Release and download the appropriate binary for your platform.
Each GitHub Release contains many executables, but in practice, you likely want one of these:
- macOS
- Apple Silicon/arm64:
codex-aarch64-apple-darwin.tar.gz - x86_64 (older Mac hardware):
codex-x86_64-apple-darwin.tar.gz
- Apple Silicon/arm64:
- Linux
- x86_64:
codex-x86_64-unknown-linux-musl.tar.gz - arm64:
codex-aarch64-unknown-linux-musl.tar.gz
- x86_64:
Each archive contains a single entry with the platform baked into the name (e.g., codex-x86_64-unknown-linux-musl), so you likely want to rename it to codex after extracting it.
Using Codex with your ChatGPT plan
Run codex and select Sign in with ChatGPT. We recommend signing into your ChatGPT account to use Codex as part of your Plus, Pro, Team, Edu, or Enterprise plan. Learn more about what's included in your ChatGPT plan.
You can also use Codex with an API key, but this requires additional setup. If you previously used an API key for usage-based billing, see the migration steps. If you're having trouble with login, please comment on this issue.
Model Context Protocol (MCP)
Codex can access MCP servers. To configure them, refer to the config docs.
Configuration
Codex CLI supports a rich set of configuration options, with preferences stored in ~/.codex/config.toml. For full configuration options, see Configuration.
Execpolicy Quickstart
Codex can enforce your own rules-based execution policy before it runs shell commands.
- Create a policy directory:
mkdir -p ~/.codex/policy. - Create one or more
.codexpolicyfiles in that folder. Codex automatically loads every.codexpolicyfile in there on startup. - Write
prefix_ruleentries to describe the commands you want to allow, prompt, or block:
prefix_rule(
pattern = ["git", ["push", "fetch"]],
decision = "prompt", # allow | prompt | forbidden
match = [["git", "push", "origin", "main"]], # examples that must match
not_match = [["git", "status"]], # examples that must not match
)
patternis a list of shell tokens, evaluated from left to right; wrap tokens in a nested list to express alternatives (e.g., match bothpushandfetch).decisionsets the severity; Codex picks the strictest decision when multiple rules match (forbidden > prompt > allow).matchandnot_matchact as (optional) unit tests. Codex validates them when it loads your policy, so you get feedback if an example has unexpected behavior.
In this example rule, if Codex wants to run commands with the prefix git push or git fetch, it will first ask for user approval.
Use execpolicy2 CLI to preview decisions for policy files:
cargo run -p codex-execpolicy2 -- check --policy ~/.codex/policy/default.codexpolicy git push origin main
Pass multiple --policy flags to test how several files combine. See the codex-rs/execpolicy2 README for a more detailed walkthrough of the available syntax.
Docs & FAQ
- Getting started
- Configuration
- Sandbox & approvals
- Authentication
- Automating Codex
- Advanced
- Zero data retention (ZDR)
- Contributing
- Install & build
- FAQ
- Open source fund
License
This repository is licensed under the Apache-2.0 License.