Persist initialized js_repl bindings after failed cells (#13482)

## Summary

- Change `js_repl` failed-cell persistence so later cells keep prior
bindings plus only the current-cell bindings whose initialization
definitely completed before the throw.
- Preserve initialized lexical bindings across failed cells via
module-namespace readability, including top-level destructuring that
partially succeeds before a later throw.
- Preserve hoisted `var` and `function` bindings only when execution
clearly reached their declaration site, and preserve direct top-level
pre-declaration `var` writes and updates through explicit write-site
markers.
- Preserve top-level `for...in` / `for...of` `var` bindings when the
loop body executes at least once, using a first-iteration guard to avoid
per-iteration bookkeeping overhead.
- Keep prior module state intact across link-time failures and
evaluation failures before the prelude runs, while still allowing failed
cells that already recreated prior bindings to persist updates to those
existing bindings.
- Hide internal commit hooks from user `js_repl` code after the prelude
aliases them, so snippets cannot spoof committed bindings by calling the
raw `import.meta` hooks directly.
- Add focused regression coverage for the supported failed-cell
behaviors and the intentionally unsupported boundaries.
- Update `js_repl` docs and generated instructions to describe the new,
narrower failed-cell persistence model.
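
To make the first-iteration guard for top-level `for...in` / `for...of` loops concrete, here is a minimal sketch in plain JavaScript. It is not the kernel's generated code: `markCommitted`, `committed`, and `runLoop` are hypothetical names, and the `try`/`catch` only stands in for a failed cell.

```javascript
// Hypothetical stand-in for the kernel's commit hook: records the names whose
// bindings should be carried into later cells.
const committed = new Set();
function markCommitted(...names) {
  for (const name of names) committed.add(name);
}

// Simulates an instrumented top-level `for (var item of items) { ... }`.
function runLoop(items) {
  committed.clear();
  let guard = true; // flipped on the first iteration only
  let markCalls = 0; // counts how often the marker actually ran
  try {
    for (var item of items) {
      if (guard) {
        guard = false;
        markCalls += 1;
        markCommitted("item");
      }
      if (item === "boom") throw new Error("later failure");
    }
  } catch {
    // The simulated cell failed; whatever was marked before the throw persists.
  }
  return { persisted: committed.has("item"), markCalls };
}

console.log(runLoop(["a", "b", "boom"])); // { persisted: true, markCalls: 1 }
console.log(runLoop([])); // { persisted: false, markCalls: 0 }
```

The marker runs once no matter how many iterations execute, and an empty loop never reaches it, which matches the "loop body executes at least once" condition.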

## Motivation

We saw `js_repl` drop bindings that had already been initialized
successfully when a later statement in the same cell threw, for example:

    const { context: liveContext, session } =
      await initializeGoogleSheetsLiveForTab(tab);
    // later statement throws

That was surprising in practice because successful earlier work
disappeared from the next cell.

This change makes failed-cell persistence more useful without trying to
model every possible partially executed JavaScript edge case. The
resulting behavior is narrower and easier to reason about:

- prior bindings are always preserved
- lexical bindings persist when their initialization completed before
the throw
- hoisted `var` / `function` bindings persist only when execution
clearly reached their declaration or a supported top-level `var` write
site
- failed cells that already recreated prior bindings can persist writes
to those existing bindings even if they introduce no new bindings
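
The "initialization completed before the throw" rule for lexical bindings can be illustrated without the REPL. The sketch below uses closures as a stand-in for the guarded module-namespace read the kernel performs; `tryRead` and `partialDestructuring` are hypothetical names chosen for this example.

```javascript
// Guarded read: a binding that is still in its temporal dead zone throws a
// ReferenceError when read, which is reported here as `ok: false`.
function tryRead(read) {
  try {
    return { ok: true, value: read() };
  } catch {
    return { ok: false, value: undefined };
  }
}

function partialDestructuring() {
  let readA;
  let readB;
  const source = {
    a: 1,
    get b() {
      throw new Error("b failed");
    },
  };
  try {
    // The closures capture the block scope before the declaration runs.
    readA = () => a;
    readB = () => b;
    // `a` is initialized first; the getter for `b` then throws, so `b`
    // never leaves its temporal dead zone.
    const { a, b } = source;
  } catch {
    // Simulated failed cell.
  }
  return { a: tryRead(readA), b: tryRead(readB) };
}

console.log(partialDestructuring());
```

Here `a` reads back as `1` while `b` is reported as unreadable, which is the distinction that lets a partially completed destructuring keep only the names that finished initializing.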

The detailed edge-case matrix stays in `docs/js_repl.md`. The
model-facing `project_doc` guidance is intentionally shorter and focused
on generation-relevant behavior.

## Supported Failed-Cell Behavior

- Prior bindings remain available after a failed cell.
- Initialized lexical bindings remain available after a failed cell.
- Top-level destructuring like `const { a, b } = ...` preserves names
whose initialization completed before a later throw.
- Hoisted `function` bindings persist when execution reached the
declaration statement before the throw.
- Direct top-level pre-declaration `var` writes and updates persist, for
example:
  - `x = 1`
  - `x += 1`
  - `x++`
  - short-circuiting logical assignments (`&&=`, `||=`, `??=`), which
    persist only when the write branch actually runs
- Non-empty top-level `for...in` / `for...of` `var` loops persist their
loop bindings.
- Failed cells can persist updates to existing carried bindings after
the prelude has run, even when the cell commits no new bindings.
- Link failures and eval failures before the prelude do not poison
`@prev`.
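
The logical-assignment rule above can be sketched as a hand-expanded form of `x ||= rhs()`. This is an illustration under assumptions, not the exact text the kernel generates: `markCommitted`, `committed`, and `orAssign` are hypothetical, and `x` is declared up front for simplicity rather than hoisted from a later `var`.

```javascript
// Hypothetical stand-in for the kernel's commit hook.
const committed = new Set();
const markCommitted = (name) => {
  committed.add(name);
};

// Hand-expanded equivalent of `x ||= rhs()`: the marker sits inside the
// right-hand branch, so it only runs when the write actually happens.
function orAssign(initial, rhs) {
  committed.clear();
  var x = initial;
  let helper;
  const result =
    ((helper = x), helper || ((x = rhs()), (markCommitted("x"), undefined), x));
  return { x, result, marked: committed.has("x") };
}

console.log(orAssign(0, () => 5)); // { x: 5, result: 5, marked: true }
console.log(orAssign(7, () => 5)); // { x: 7, result: 7, marked: false }
```

When the left side is already truthy the expression short-circuits, no write occurs, and the binding is not marked.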

## Intentionally Unsupported Failed-Cell Cases

- Hoisted function reads before the declaration, such as `foo(); ...;
function foo() {}`
- Aliasing or inference-based recovery from reads before declaration
- Nested writes inside already-instrumented assignment RHS expressions
- Destructuring-assignment recovery for hoisted `var`
- Partial `var` destructuring recovery
- Pre-declaration `undefined` reads for hoisted `var`
- Empty top-level `for...in` / `for...of` loop vars
- Nested or scope-sensitive pre-declaration `var` writes outside direct
top-level expression statements
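
The first unsupported case follows from where the marker sits. The sketch below assumes a declaration-site marker placed right after the function declaration and uses `new Function` as a stand-in for evaluating a cell; `runCell` and `markCommitted` are hypothetical names, and real cells are ESM modules rather than function bodies.

```javascript
// Runs a simulated cell body and returns the names that were marked as
// committed before the body finished or threw.
function runCell(source) {
  const committed = [];
  const markCommitted = (name) => committed.push(name);
  try {
    new Function("markCommitted", source)(markCommitted);
  } catch {
    // Simulated failed cell: keep whatever was marked before the throw.
  }
  return committed;
}

// Execution reaches the declaration statement, then a later statement throws.
const reachedDeclaration = `
function foo() { return 1; }
;markCommitted("foo");
throw new Error("later failure");
`;

// The hoisted function is callable early, but the throw happens before
// execution reaches the declaration statement and its marker.
const calledBeforeDeclaration = `
foo();
throw new Error("fails before the declaration statement");
function foo() { return 1; }
;markCommitted("foo");
`;

console.log(runCell(reachedDeclaration)); // [ 'foo' ]
console.log(runCell(calledBeforeDeclaration)); // []
```

In the second cell `foo()` runs thanks to hoisting, yet nothing records that `foo` should persist, which is why reads before the declaration are listed as unsupported.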
Curtis 'Fjord' Hawthorne 2026-03-05 11:01:46 -08:00 committed by GitHub
parent ee2e3c415b
commit 657841e7f5
5 changed files with 965 additions and 73 deletions

@@ -64,7 +64,7 @@ fn render_js_repl_instructions(config: &Config) -> Option<String> {
    if config.features.enabled(Feature::ImageDetailOriginal) {
        section.push_str("- When generating or converting images for `view_image` in `js_repl`, prefer JPEG at 85% quality unless lossless quality is strictly required; other formats can be used if the user requests them. This keeps uploads smaller and reduces the chance of hitting image size caps.\n");
    }
-    section.push_str("- Top-level bindings persist across cells. If you hit `SyntaxError: Identifier 'x' has already been declared`, reuse the binding, pick a new name, wrap in `{ ... }` for block scope, or reset the kernel with `js_repl_reset`.\n");
+    section.push_str("- Top-level bindings persist across cells. If a cell throws, prior bindings remain available and bindings that finished initializing before the throw often remain usable in later cells. For code you plan to reuse across cells, prefer declaring or assigning it in direct top-level statements before operations that might throw. If you hit `SyntaxError: Identifier 'x' has already been declared`, reuse the binding, pick a new name, wrap in `{ ... }` for block scope, or reset the kernel with `js_repl_reset`.\n");
    section.push_str("- Top-level static import declarations (for example `import x from \"./file.js\"`) are currently unsupported in `js_repl`; use dynamic imports with `await import(\"pkg\")`, `await import(\"./file.js\")`, or `await import(\"/abs/path/file.mjs\")` instead. Imported local files must be ESM `.js`/`.mjs` files and run in the same REPL VM context. Bare package imports always resolve from REPL-global search roots (`CODEX_JS_REPL_NODE_MODULE_DIRS`, then cwd), not relative to the imported file location. Local files may statically import only other local relative/absolute/`file://` `.js`/`.mjs` files; package and builtin imports from local files must stay dynamic. `import.meta.resolve()` returns importable strings such as `file://...`, bare package names, and `node:...` specifiers. Local file modules reload between execs, while top-level bindings persist until `js_repl_reset`.\n");
    if config.features.enabled(Feature::JsReplToolsOnly) {
@@ -492,7 +492,7 @@ mod tests {
        let res = get_user_instructions(&cfg, None, None)
            .await
            .expect("js_repl instructions expected");
-        let expected = "## JavaScript REPL (Node)\n- Use `js_repl` for Node-backed JavaScript with top-level await in a persistent kernel.\n- `js_repl` is a freeform/custom tool. Direct `js_repl` calls must send raw JavaScript tool input (optionally with first-line `// codex-js-repl: timeout_ms=15000`). Do not wrap code in JSON (for example `{\"code\":\"...\"}`), quotes, or markdown code fences.\n- Helpers: `codex.tmpDir`, `codex.tool(name, args?)`, and `codex.emitImage(imageLike)`.\n- `codex.tool` executes a normal tool call and resolves to the raw tool output object. Use it for shell and non-shell tools alike. Nested tool outputs stay inside JavaScript unless you emit them explicitly.\n- `codex.emitImage(...)` adds exactly one image to the outer `js_repl` function output. It accepts a direct image URL, a single `input_image` item, an object like `{ bytes, mimeType }`, or a raw tool response object with exactly one image and no text. It rejects mixed text-and-image content.\n- Example of sharing an in-memory Playwright screenshot: `await codex.emitImage({ bytes: await page.screenshot({ type: \"jpeg\", quality: 85 }), mimeType: \"image/jpeg\" })`.\n- Example of sharing a local image tool result: `await codex.emitImage(codex.tool(\"view_image\", { path: \"/absolute/path\" }))`.\n- Top-level bindings persist across cells. If you hit `SyntaxError: Identifier 'x' has already been declared`, reuse the binding, pick a new name, wrap in `{ ... }` for block scope, or reset the kernel with `js_repl_reset`.\n- Top-level static import declarations (for example `import x from \"./file.js\"`) are currently unsupported in `js_repl`; use dynamic imports with `await import(\"pkg\")`, `await import(\"./file.js\")`, or `await import(\"/abs/path/file.mjs\")` instead. Imported local files must be ESM `.js`/`.mjs` files and run in the same REPL VM context. Bare package imports always resolve from REPL-global search roots (`CODEX_JS_REPL_NODE_MODULE_DIRS`, then cwd), not relative to the imported file location. Local files may statically import only other local relative/absolute/`file://` `.js`/`.mjs` files; package and builtin imports from local files must stay dynamic. `import.meta.resolve()` returns importable strings such as `file://...`, bare package names, and `node:...` specifiers. Local file modules reload between execs, while top-level bindings persist until `js_repl_reset`.\n- Avoid direct access to `process.stdout` / `process.stderr` / `process.stdin`; it can corrupt the JSON line protocol. Use `console.log`, `codex.tool(...)`, and `codex.emitImage(...)`.";
+        let expected = "## JavaScript REPL (Node)\n- Use `js_repl` for Node-backed JavaScript with top-level await in a persistent kernel.\n- `js_repl` is a freeform/custom tool. Direct `js_repl` calls must send raw JavaScript tool input (optionally with first-line `// codex-js-repl: timeout_ms=15000`). Do not wrap code in JSON (for example `{\"code\":\"...\"}`), quotes, or markdown code fences.\n- Helpers: `codex.tmpDir`, `codex.tool(name, args?)`, and `codex.emitImage(imageLike)`.\n- `codex.tool` executes a normal tool call and resolves to the raw tool output object. Use it for shell and non-shell tools alike. Nested tool outputs stay inside JavaScript unless you emit them explicitly.\n- `codex.emitImage(...)` adds exactly one image to the outer `js_repl` function output. It accepts a direct image URL, a single `input_image` item, an object like `{ bytes, mimeType }`, or a raw tool response object with exactly one image and no text. It rejects mixed text-and-image content.\n- Example of sharing an in-memory Playwright screenshot: `await codex.emitImage({ bytes: await page.screenshot({ type: \"jpeg\", quality: 85 }), mimeType: \"image/jpeg\" })`.\n- Example of sharing a local image tool result: `await codex.emitImage(codex.tool(\"view_image\", { path: \"/absolute/path\" }))`.\n- Top-level bindings persist across cells. If a cell throws, prior bindings remain available and bindings that finished initializing before the throw often remain usable in later cells. For code you plan to reuse across cells, prefer declaring or assigning it in direct top-level statements before operations that might throw. If you hit `SyntaxError: Identifier 'x' has already been declared`, reuse the binding, pick a new name, wrap in `{ ... }` for block scope, or reset the kernel with `js_repl_reset`.\n- Top-level static import declarations (for example `import x from \"./file.js\"`) are currently unsupported in `js_repl`; use dynamic imports with `await import(\"pkg\")`, `await import(\"./file.js\")`, or `await import(\"/abs/path/file.mjs\")` instead. Imported local files must be ESM `.js`/`.mjs` files and run in the same REPL VM context. Bare package imports always resolve from REPL-global search roots (`CODEX_JS_REPL_NODE_MODULE_DIRS`, then cwd), not relative to the imported file location. Local files may statically import only other local relative/absolute/`file://` `.js`/`.mjs` files; package and builtin imports from local files must stay dynamic. `import.meta.resolve()` returns importable strings such as `file://...`, bare package names, and `node:...` specifiers. Local file modules reload between execs, while top-level bindings persist until `js_repl_reset`.\n- Avoid direct access to `process.stdout` / `process.stderr` / `process.stdin`; it can corrupt the JSON line protocol. Use `console.log`, `codex.tool(...)`, and `codex.emitImage(...)`.";
        assert_eq!(res, expected);
    }
@@ -511,7 +511,7 @@ mod tests {
        let res = get_user_instructions(&cfg, None, None)
            .await
            .expect("js_repl instructions expected");
-        let expected = "## JavaScript REPL (Node)\n- Use `js_repl` for Node-backed JavaScript with top-level await in a persistent kernel.\n- `js_repl` is a freeform/custom tool. Direct `js_repl` calls must send raw JavaScript tool input (optionally with first-line `// codex-js-repl: timeout_ms=15000`). Do not wrap code in JSON (for example `{\"code\":\"...\"}`), quotes, or markdown code fences.\n- Helpers: `codex.tmpDir`, `codex.tool(name, args?)`, and `codex.emitImage(imageLike)`.\n- `codex.tool` executes a normal tool call and resolves to the raw tool output object. Use it for shell and non-shell tools alike. Nested tool outputs stay inside JavaScript unless you emit them explicitly.\n- `codex.emitImage(...)` adds exactly one image to the outer `js_repl` function output. It accepts a direct image URL, a single `input_image` item, an object like `{ bytes, mimeType }`, or a raw tool response object with exactly one image and no text. It rejects mixed text-and-image content.\n- Example of sharing an in-memory Playwright screenshot: `await codex.emitImage({ bytes: await page.screenshot({ type: \"jpeg\", quality: 85 }), mimeType: \"image/jpeg\" })`.\n- Example of sharing a local image tool result: `await codex.emitImage(codex.tool(\"view_image\", { path: \"/absolute/path\" }))`.\n- Top-level bindings persist across cells. If you hit `SyntaxError: Identifier 'x' has already been declared`, reuse the binding, pick a new name, wrap in `{ ... }` for block scope, or reset the kernel with `js_repl_reset`.\n- Top-level static import declarations (for example `import x from \"./file.js\"`) are currently unsupported in `js_repl`; use dynamic imports with `await import(\"pkg\")`, `await import(\"./file.js\")`, or `await import(\"/abs/path/file.mjs\")` instead. Imported local files must be ESM `.js`/`.mjs` files and run in the same REPL VM context. Bare package imports always resolve from REPL-global search roots (`CODEX_JS_REPL_NODE_MODULE_DIRS`, then cwd), not relative to the imported file location. Local files may statically import only other local relative/absolute/`file://` `.js`/`.mjs` files; package and builtin imports from local files must stay dynamic. `import.meta.resolve()` returns importable strings such as `file://...`, bare package names, and `node:...` specifiers. Local file modules reload between execs, while top-level bindings persist until `js_repl_reset`.\n- Do not call tools directly; use `js_repl` + `codex.tool(...)` for all tool calls, including shell commands.\n- MCP tools (if any) can also be called by name via `codex.tool(...)`.\n- Avoid direct access to `process.stdout` / `process.stderr` / `process.stdin`; it can corrupt the JSON line protocol. Use `console.log`, `codex.tool(...)`, and `codex.emitImage(...)`.";
+        let expected = "## JavaScript REPL (Node)\n- Use `js_repl` for Node-backed JavaScript with top-level await in a persistent kernel.\n- `js_repl` is a freeform/custom tool. Direct `js_repl` calls must send raw JavaScript tool input (optionally with first-line `// codex-js-repl: timeout_ms=15000`). Do not wrap code in JSON (for example `{\"code\":\"...\"}`), quotes, or markdown code fences.\n- Helpers: `codex.tmpDir`, `codex.tool(name, args?)`, and `codex.emitImage(imageLike)`.\n- `codex.tool` executes a normal tool call and resolves to the raw tool output object. Use it for shell and non-shell tools alike. Nested tool outputs stay inside JavaScript unless you emit them explicitly.\n- `codex.emitImage(...)` adds exactly one image to the outer `js_repl` function output. It accepts a direct image URL, a single `input_image` item, an object like `{ bytes, mimeType }`, or a raw tool response object with exactly one image and no text. It rejects mixed text-and-image content.\n- Example of sharing an in-memory Playwright screenshot: `await codex.emitImage({ bytes: await page.screenshot({ type: \"jpeg\", quality: 85 }), mimeType: \"image/jpeg\" })`.\n- Example of sharing a local image tool result: `await codex.emitImage(codex.tool(\"view_image\", { path: \"/absolute/path\" }))`.\n- Top-level bindings persist across cells. If a cell throws, prior bindings remain available and bindings that finished initializing before the throw often remain usable in later cells. For code you plan to reuse across cells, prefer declaring or assigning it in direct top-level statements before operations that might throw. If you hit `SyntaxError: Identifier 'x' has already been declared`, reuse the binding, pick a new name, wrap in `{ ... }` for block scope, or reset the kernel with `js_repl_reset`.\n- Top-level static import declarations (for example `import x from \"./file.js\"`) are currently unsupported in `js_repl`; use dynamic imports with `await import(\"pkg\")`, `await import(\"./file.js\")`, or `await import(\"/abs/path/file.mjs\")` instead. Imported local files must be ESM `.js`/`.mjs` files and run in the same REPL VM context. Bare package imports always resolve from REPL-global search roots (`CODEX_JS_REPL_NODE_MODULE_DIRS`, then cwd), not relative to the imported file location. Local files may statically import only other local relative/absolute/`file://` `.js`/`.mjs` files; package and builtin imports from local files must stay dynamic. `import.meta.resolve()` returns importable strings such as `file://...`, bare package names, and `node:...` specifiers. Local file modules reload between execs, while top-level bindings persist until `js_repl_reset`.\n- Do not call tools directly; use `js_repl` + `codex.tool(...)` for all tool calls, including shell commands.\n- MCP tools (if any) can also be called by name via `codex.tool(...)`.\n- Avoid direct access to `process.stdout` / `process.stderr` / `process.stdin`; it can corrupt the JSON line protocol. Use `console.log`, `codex.tool(...)`, and `codex.emitImage(...)`.";
        assert_eq!(res, expected);
    }
@@ -530,7 +530,7 @@ mod tests {
        let res = get_user_instructions(&cfg, None, None)
            .await
            .expect("js_repl instructions expected");
-        let expected = "## JavaScript REPL (Node)\n- Use `js_repl` for Node-backed JavaScript with top-level await in a persistent kernel.\n- `js_repl` is a freeform/custom tool. Direct `js_repl` calls must send raw JavaScript tool input (optionally with first-line `// codex-js-repl: timeout_ms=15000`). Do not wrap code in JSON (for example `{\"code\":\"...\"}`), quotes, or markdown code fences.\n- Helpers: `codex.tmpDir`, `codex.tool(name, args?)`, and `codex.emitImage(imageLike)`.\n- `codex.tool` executes a normal tool call and resolves to the raw tool output object. Use it for shell and non-shell tools alike. Nested tool outputs stay inside JavaScript unless you emit them explicitly.\n- `codex.emitImage(...)` adds exactly one image to the outer `js_repl` function output. It accepts a direct image URL, a single `input_image` item, an object like `{ bytes, mimeType }`, or a raw tool response object with exactly one image and no text. It rejects mixed text-and-image content.\n- Example of sharing an in-memory Playwright screenshot: `await codex.emitImage({ bytes: await page.screenshot({ type: \"jpeg\", quality: 85 }), mimeType: \"image/jpeg\" })`.\n- Example of sharing a local image tool result: `await codex.emitImage(codex.tool(\"view_image\", { path: \"/absolute/path\" }))`.\n- When generating or converting images for `view_image` in `js_repl`, prefer JPEG at 85% quality unless lossless quality is strictly required; other formats can be used if the user requests them. This keeps uploads smaller and reduces the chance of hitting image size caps.\n- Top-level bindings persist across cells. If you hit `SyntaxError: Identifier 'x' has already been declared`, reuse the binding, pick a new name, wrap in `{ ... }` for block scope, or reset the kernel with `js_repl_reset`.\n- Top-level static import declarations (for example `import x from \"./file.js\"`) are currently unsupported in `js_repl`; use dynamic imports with `await import(\"pkg\")`, `await import(\"./file.js\")`, or `await import(\"/abs/path/file.mjs\")` instead. Imported local files must be ESM `.js`/`.mjs` files and run in the same REPL VM context. Bare package imports always resolve from REPL-global search roots (`CODEX_JS_REPL_NODE_MODULE_DIRS`, then cwd), not relative to the imported file location. Local files may statically import only other local relative/absolute/`file://` `.js`/`.mjs` files; package and builtin imports from local files must stay dynamic. `import.meta.resolve()` returns importable strings such as `file://...`, bare package names, and `node:...` specifiers. Local file modules reload between execs, while top-level bindings persist until `js_repl_reset`.\n- Avoid direct access to `process.stdout` / `process.stderr` / `process.stdin`; it can corrupt the JSON line protocol. Use `console.log`, `codex.tool(...)`, and `codex.emitImage(...)`.";
+        let expected = "## JavaScript REPL (Node)\n- Use `js_repl` for Node-backed JavaScript with top-level await in a persistent kernel.\n- `js_repl` is a freeform/custom tool. Direct `js_repl` calls must send raw JavaScript tool input (optionally with first-line `// codex-js-repl: timeout_ms=15000`). Do not wrap code in JSON (for example `{\"code\":\"...\"}`), quotes, or markdown code fences.\n- Helpers: `codex.tmpDir`, `codex.tool(name, args?)`, and `codex.emitImage(imageLike)`.\n- `codex.tool` executes a normal tool call and resolves to the raw tool output object. Use it for shell and non-shell tools alike. Nested tool outputs stay inside JavaScript unless you emit them explicitly.\n- `codex.emitImage(...)` adds exactly one image to the outer `js_repl` function output. It accepts a direct image URL, a single `input_image` item, an object like `{ bytes, mimeType }`, or a raw tool response object with exactly one image and no text. It rejects mixed text-and-image content.\n- Example of sharing an in-memory Playwright screenshot: `await codex.emitImage({ bytes: await page.screenshot({ type: \"jpeg\", quality: 85 }), mimeType: \"image/jpeg\" })`.\n- Example of sharing a local image tool result: `await codex.emitImage(codex.tool(\"view_image\", { path: \"/absolute/path\" }))`.\n- When generating or converting images for `view_image` in `js_repl`, prefer JPEG at 85% quality unless lossless quality is strictly required; other formats can be used if the user requests them. This keeps uploads smaller and reduces the chance of hitting image size caps.\n- Top-level bindings persist across cells. If a cell throws, prior bindings remain available and bindings that finished initializing before the throw often remain usable in later cells. For code you plan to reuse across cells, prefer declaring or assigning it in direct top-level statements before operations that might throw. If you hit `SyntaxError: Identifier 'x' has already been declared`, reuse the binding, pick a new name, wrap in `{ ... }` for block scope, or reset the kernel with `js_repl_reset`.\n- Top-level static import declarations (for example `import x from \"./file.js\"`) are currently unsupported in `js_repl`; use dynamic imports with `await import(\"pkg\")`, `await import(\"./file.js\")`, or `await import(\"/abs/path/file.mjs\")` instead. Imported local files must be ESM `.js`/`.mjs` files and run in the same REPL VM context. Bare package imports always resolve from REPL-global search roots (`CODEX_JS_REPL_NODE_MODULE_DIRS`, then cwd), not relative to the imported file location. Local files may statically import only other local relative/absolute/`file://` `.js`/`.mjs` files; package and builtin imports from local files must stay dynamic. `import.meta.resolve()` returns importable strings such as `file://...`, bare package names, and `node:...` specifiers. Local file modules reload between execs, while top-level bindings persist until `js_repl_reset`.\n- Avoid direct access to `process.stdout` / `process.stderr` / `process.stdin`; it can corrupt the JSON line protocol. Use `console.log`, `codex.tool(...)`, and `codex.emitImage(...)`.";
        assert_eq!(res, expected);
    }

@@ -78,7 +78,7 @@ context.btoa = (data) => Buffer.from(data, "binary").toString("base64");
// REPL state model:
// - Every exec is compiled as a fresh ESM "cell".
-// - `previousModule` is the most recently evaluated module namespace.
+// - `previousModule` is the most recently committed module namespace.
// - `previousBindings` tracks which top-level names should be carried forward.
// Each new cell imports a synthetic view of the previous namespace and
// redeclares those names so user variables behave like a persistent REPL.
@@ -86,6 +86,12 @@ let previousModule = null;
/** @type {Binding[]} */
let previousBindings = [];
let cellCounter = 0;
let internalBindingCounter = 0;
const internalBindingSalt = (() => {
  const raw = process.env.CODEX_THREAD_ID ?? "";
  const sanitized = raw.replace(/[^A-Za-z0-9_$]/g, "_");
  return sanitized || "session";
})();
let activeExecId = null;
let fatalExitScheduled = false;
@@ -553,17 +559,447 @@ function collectBindings(ast) {
  return Array.from(map.entries()).map(([name, kind]) => ({ name, kind }));
}

function collectPatternBindingNames(pattern) {
  const map = new Map();
  collectPatternNames(pattern, "binding", map);
  return Array.from(map.keys());
}

function nextInternalBindingName() {
  // We intentionally do not scan user-declared names here. Internal helpers use
  // a per-thread salt plus a counter instead. A user could still collide by
  // deliberately spelling the exact generated name, but the thread-id salt
  // keeps accidental collisions negligible while avoiding more AST bookkeeping.
  return `__codex_internal_commit_${internalBindingSalt}_${internalBindingCounter++}`;
}

function buildMarkCommittedExpression(names, markCommittedFnName) {
  const serializedNames = names.map((name) => JSON.stringify(name)).join(", ");
  return `(${markCommittedFnName}(${serializedNames}), undefined)`;
}

function tryReadBindingValue(module, bindingName) {
  if (!module) {
    return { ok: false, value: undefined };
  }
  try {
    return { ok: true, value: module.namespace[bindingName] };
  } catch {
    return { ok: false, value: undefined };
  }
}

function instrumentVariableDeclarationSource(
  code,
  declaration,
  markCommittedFnName,
) {
  if (!declaration.declarations?.length) {
    return code.slice(declaration.start, declaration.end);
  }
  const prefix = code.slice(declaration.start, declaration.declarations[0].start);
  const suffix = code.slice(
    declaration.declarations[declaration.declarations.length - 1].end,
    declaration.end,
  );
  const parts = [];
  for (const decl of declaration.declarations) {
    parts.push(code.slice(decl.start, decl.end));
    const names = collectPatternBindingNames(decl.id);
    if (names.length > 0) {
      const helperName = nextInternalBindingName();
      parts.push(
        `${helperName} = ${buildMarkCommittedExpression(names, markCommittedFnName)}`,
      );
    }
  }
  return `${prefix}${parts.join(", ")}${suffix}`;
}

function instrumentLoopBody(code, body, names, guardName, markCommittedFnName) {
  const marker = `if (${guardName}) { ${guardName} = false; ${markCommittedFnName}(${names
    .map((name) => JSON.stringify(name))
    .join(", ")}); }`;
  const bodyCode = code.slice(body.start, body.end);
  if (body.type === "BlockStatement") {
    return `{ ${marker}${bodyCode.slice(1)}`;
  }
  return `{ ${marker} ${bodyCode} }`;
}

function applyReplacements(code, replacements) {
  let instrumentedCode = code;
  for (const replacement of replacements.sort((a, b) => b.start - a.start)) {
    instrumentedCode =
      instrumentedCode.slice(0, replacement.start) +
      replacement.text +
      instrumentedCode.slice(replacement.end);
  }
  return instrumentedCode;
}

function collectHoistedVarDeclarationStarts(ast) {
  const varDeclarationStarts = new Map();
  const recordDeclarationStart = (map, name, start) => {
    const existingStart = map.get(name);
    if (existingStart === undefined || start < existingStart) {
      map.set(name, start);
    }
  };
  const recordVarDeclarationStarts = (declaration) => {
    for (const name of collectPatternBindingNames(declaration.id)) {
      recordDeclarationStart(varDeclarationStarts, name, declaration.start);
    }
  };
  for (const stmt of ast.body ?? []) {
    if (stmt.type === "VariableDeclaration" && stmt.kind === "var") {
      for (const declaration of stmt.declarations ?? []) {
        recordVarDeclarationStarts(declaration);
      }
      continue;
    }
    if (
      stmt.type === "ForStatement" &&
      stmt.init?.type === "VariableDeclaration" &&
      stmt.init.kind === "var"
    ) {
      for (const declaration of stmt.init.declarations ?? []) {
        recordVarDeclarationStarts(declaration);
      }
      continue;
    }
    if (
      (stmt.type === "ForInStatement" || stmt.type === "ForOfStatement") &&
      stmt.left?.type === "VariableDeclaration" &&
      stmt.left.kind === "var"
    ) {
      for (const declaration of stmt.left.declarations ?? []) {
        recordVarDeclarationStarts(declaration);
      }
    }
  }
  return varDeclarationStarts;
}

function collectFutureVarWriteReplacements(
  code,
  ast,
  {
    helperDeclarations = null,
    markCommittedFnName = null,
  } = {},
) {
  // Failed-cell hoisted tracking intentionally stays small here. We only mark
  // direct top-level writes to future `var` bindings, plus top-level
  // declaration-site markers handled later in `instrumentCurrentBindings`.
  // We do not recurse through nested statement structure because that quickly
  // requires real lexical-scope tracking for blocks, loop scopes, catch
  // bindings, and similar shadowing cases. Supported write recovery is limited
  // to direct top-level expression statements such as `x = 1`, `x += 1`,
  // `x++`, and logical assignments.
  const varDeclarationStarts = collectHoistedVarDeclarationStarts(ast);
  if (varDeclarationStarts.size === 0) {
    return [];
  }
  const replacements = [];
  const replacementKeys = new Set();
  if (!markCommittedFnName) {
    throw new Error(
      "collectFutureVarWriteReplacements expected a commit marker binding name",
    );
  }
  const addReplacement = (start, end, text) => {
    const key = `${start}:${end}`;
    if (!replacementKeys.has(key)) {
      replacementKeys.add(key);
      replacements.push({ start, end, text });
    }
  };
  const getFutureVarName = (identifier) => {
    if (!identifier || identifier.type !== "Identifier") {
      return null;
    }
    const declarationStart = varDeclarationStarts.get(identifier.name);
    if (
      declarationStart === undefined ||
      identifier.start >= declarationStart
    ) {
      return null;
    }
    return identifier.name;
  };
  const instrumentUpdateExpression = (node, identifier) => {
    const bindingName = getFutureVarName(identifier);
    if (!bindingName) {
      return false;
    }
    addReplacement(
      node.start,
      node.end,
      `(${markCommittedFnName}(${JSON.stringify(bindingName)}), ${code.slice(
        node.start,
        node.end,
      )})`,
    );
    return true;
  };
  const instrumentAssignmentExpression = (node) => {
    if (node.left.type !== "Identifier") {
      return false;
    }
    const bindingName = getFutureVarName(node.left);
    if (!bindingName) {
      return false;
    }
    if (
      node.operator === "&&=" ||
      node.operator === "||=" ||
      node.operator === "??="
    ) {
      if (!helperDeclarations) {
        throw new Error(
          "collectFutureVarWriteReplacements expected helperDeclarations for logical assignment rewriting",
        );
      }
      const helperName = nextInternalBindingName();
      helperDeclarations.push(`let ${helperName};`);
      const shortCircuitOperator =
        node.operator === "&&="
          ? "&&"
          : node.operator === "||="
            ? "||"
            : "??";
      addReplacement(
        node.start,
        node.end,
        `((${helperName} = ${node.left.name}), ${helperName} ${shortCircuitOperator} ((${node.left.name} = ${code.slice(node.right.start, node.right.end)}), ${buildMarkCommittedExpression([bindingName], markCommittedFnName)}, ${node.left.name}))`,
      );
      return true;
    }
    addReplacement(
      node.start,
      node.end,
      `((${code.slice(node.start, node.end)}), ${buildMarkCommittedExpression([bindingName], markCommittedFnName)}, ${node.left.name})`,
    );
    return true;
  };
  const unwrapParenthesizedExpression = (node) => {
    let current = node;
    while (current?.type === "ParenthesizedExpression") {
      current = current.expression;
    }
    return current;
  };
  for (const statement of ast.body ?? []) {
    if (statement.type !== "ExpressionStatement") {
      continue;
    }
    const expression = unwrapParenthesizedExpression(statement.expression);
    if (!expression) {
      continue;
    }
    if (
      expression.type === "UpdateExpression" &&
      expression.argument.type === "Identifier"
    ) {
      instrumentUpdateExpression(expression, expression.argument);
      continue;
    }
    if (expression.type === "AssignmentExpression") {
      instrumentAssignmentExpression(expression);
    }
  }
  return replacements;
}
function instrumentCurrentBindings(
code,
ast,
currentBindings,
priorBindings,
markCommittedFnName,
) {
if (currentBindings.length === 0) {
return code;
}
const replacements = [];
for (const stmt of ast.body ?? []) {
if (stmt.type === "VariableDeclaration") {
replacements.push({
start: stmt.start,
end: stmt.end,
text: instrumentVariableDeclarationSource(
code,
stmt,
markCommittedFnName,
),
});
continue;
}
if (stmt.type === "FunctionDeclaration" && stmt.id) {
replacements.push({
start: stmt.start,
end: stmt.end,
// Keep function source text stable for things like `foo.toString()`.
// Pre-declaration uses are tracked separately by instrumenting the
// top-level expressions that actually read the hoisted function value.
text: `${code.slice(stmt.start, stmt.end)}\n;${markCommittedFnName}(${JSON.stringify(stmt.id.name)});`,
});
continue;
}
if (stmt.type === "ClassDeclaration" && stmt.id) {
replacements.push({
start: stmt.start,
end: stmt.end,
text: `${code.slice(stmt.start, stmt.end)}\n;${markCommittedFnName}(${JSON.stringify(stmt.id.name)});`,
});
continue;
}
if (
stmt.type === "ForStatement" &&
stmt.init &&
stmt.init.type === "VariableDeclaration" &&
stmt.init.kind === "var"
) {
replacements.push({
start: stmt.start,
end: stmt.end,
text: `${code.slice(stmt.start, stmt.init.start)}${instrumentVariableDeclarationSource(
code,
stmt.init,
markCommittedFnName,
)}${code.slice(stmt.init.end, stmt.end)}`,
});
continue;
}
if (
(stmt.type === "ForInStatement" || stmt.type === "ForOfStatement") &&
stmt.left &&
stmt.left.type === "VariableDeclaration" &&
stmt.left.kind === "var"
) {
const names = stmt.left.declarations.flatMap((decl) =>
collectPatternBindingNames(decl.id),
);
if (names.length > 0) {
const guardName = nextInternalBindingName();
replacements.push({
start: stmt.start,
end: stmt.end,
// Mark top-level `for...in` / `for...of` vars on the first body
// execution instead of every iteration. This keeps hot loops cheap
// after the first pass while still preserving vars for the common
// case where the loop actually ran before a later throw.
//
// The tradeoff is that `for (var x of []) {}` in a failed cell will
// not carry `x` forward as `undefined`, because the body never runs
// and the one-time marker never fires. We accept that edge case:
// `var` is redeclarable, and the only lost state is an unassigned
// `undefined` from an empty top-level loop in a cell that later
// fails.
text: `let ${guardName} = true;\n${code.slice(
stmt.start,
stmt.body.start,
)}${instrumentLoopBody(
code,
stmt.body,
names,
guardName,
markCommittedFnName,
)}`,
});
}
}
}
return applyReplacements(code, replacements);
}
async function buildModuleSource(code) {
const meriyah = await meriyahPromise;
const ast = meriyah.parseModule(code, {
next: true,
module: true,
ranges: false,
ranges: true,
loc: false,
disableWebCompat: true,
});
const currentBindings = collectBindings(ast);
const priorBindings = previousModule ? previousBindings : [];
const helperDeclarations = [];
const markCommittedFnName = nextInternalBindingName();
const markPreludeCompletedFnName = nextInternalBindingName();
helperDeclarations.push(
// `import.meta` is syntax-level and cannot be shadowed by user bindings
// like `const globalThis = ...`, so alias the marker helper through it
// once in the prelude and use that stable local binding everywhere.
// Then delete the raw import.meta hooks so user code cannot spoof
// committed bindings by calling them directly.
`const ${markCommittedFnName} = import.meta.__codexInternalMarkCommittedBindings;`,
`const ${markPreludeCompletedFnName} = import.meta.__codexInternalMarkPreludeCompleted;`,
"delete import.meta.__codexInternalMarkCommittedBindings;",
"delete import.meta.__codexInternalMarkPreludeCompleted;",
);
const writeInstrumentedCode = applyReplacements(
code,
collectFutureVarWriteReplacements(code, ast, {
helperDeclarations,
markCommittedFnName,
}),
);
const instrumentedAst = meriyah.parseModule(writeInstrumentedCode, {
next: true,
module: true,
ranges: true,
loc: false,
disableWebCompat: true,
});
const instrumentedCode = instrumentCurrentBindings(
writeInstrumentedCode,
instrumentedAst,
currentBindings,
priorBindings,
markCommittedFnName,
);
let prelude = "";
if (previousModule && priorBindings.length) {
@ -578,6 +1014,10 @@ async function buildModuleSource(code) {
.join("\n");
prelude += "\n";
}
if (helperDeclarations.length > 0) {
prelude += `${helperDeclarations.join("\n")}\n`;
}
prelude += `${markPreludeCompletedFnName}();\n`;
const mergedBindings = new Map();
for (const binding of priorBindings) {
@ -596,7 +1036,60 @@ async function buildModuleSource(code) {
name,
kind,
}));
return { source: `${prelude}${code}${exportStmt}`, nextBindings };
return {
source: `${prelude}${instrumentedCode}${exportStmt}`,
currentBindings,
nextBindings,
priorBindings,
};
}
function canReadCommittedBinding(module, binding) {
if (
!module ||
binding.kind === "var" ||
binding.kind === "function"
) {
return false;
}
return tryReadBindingValue(module, binding.name).ok;
}
// Failed cells keep prior bindings plus the current-cell bindings whose
// initialization definitely ran before the throw. That means:
// - lexical bindings (`const` / `let` / `class`) can fall back to namespace
// readability, which preserves names whose initialization already completed
// even when a later step in the same declarator throws
// - `var` / `function` bindings only persist when an explicit declaration-site
// or write-site marker fired, so unreached hoisted bindings do not become
// ghost bindings in later cells
function collectCommittedBindings(
module,
priorBindings,
currentBindings,
committedCurrentBindingNames,
) {
const mergedBindings = new Map();
let committedCurrentBindingCount = 0;
for (const binding of priorBindings) {
mergedBindings.set(binding.name, binding.kind);
}
for (const binding of currentBindings) {
if (
committedCurrentBindingNames.has(binding.name) ||
canReadCommittedBinding(module, binding)
) {
mergedBindings.set(binding.name, binding.kind);
committedCurrentBindingCount += 1;
}
}
return {
bindings: Array.from(mergedBindings, ([name, kind]) => ({ name, kind })),
committedCurrentBindingCount,
};
}
function send(message) {
@ -1012,9 +1505,32 @@ async function handleExec(message) {
};
};
let module = null;
/** @type {Binding[]} */
let currentBindings = [];
/** @type {Binding[]} */
let nextBindings = [];
/** @type {Binding[]} */
let priorBindings = previousBindings;
let moduleLinked = false;
let preludeCompleted = false;
const committedCurrentBindingNames = new Set();
const markCommittedBindings = (...names) => {
for (const name of names) {
committedCurrentBindingNames.add(name);
}
};
const markPreludeCompleted = () => {
preludeCompleted = true;
};
try {
const code = typeof message.code === "string" ? message.code : "";
const { source, nextBindings } = await buildModuleSource(code);
const builtSource = await buildModuleSource(code);
const source = builtSource.source;
currentBindings = builtSource.currentBindings;
nextBindings = builtSource.nextBindings;
priorBindings = builtSource.priorBindings;
let output = "";
context.codex = { tmpDir, tool, emitImage };
@ -1025,11 +1541,13 @@ async function handleExec(message) {
process.cwd(),
`.codex_js_repl_cell_${cellCounter++}.mjs`,
);
const module = new SourceTextModule(source, {
module = new SourceTextModule(source, {
context,
identifier: cellIdentifier,
initializeImportMeta(meta, mod) {
setImportMeta(meta, mod, true);
meta.__codexInternalMarkCommittedBindings = markCommittedBindings;
meta.__codexInternalMarkPreludeCompleted = markPreludeCompleted;
},
importModuleDynamically(specifier, referrer) {
return importResolved(resolveSpecifier(specifier, referrer?.identifier));
@ -1059,6 +1577,7 @@ async function handleExec(message) {
`Top-level static import "${specifier}" is not supported in js_repl. Use await import("${specifier}") instead.`,
);
});
moduleLinked = true;
await module.evaluate();
if (pendingBackgroundTasks.size > 0) {
@ -1070,11 +1589,12 @@ async function handleExec(message) {
throw firstUnhandledBackgroundError.error;
}
}
previousModule = module;
previousBindings = nextBindings;
output = logs.join("\n");
});
previousModule = module;
previousBindings = nextBindings;
send({
type: "exec_result",
id: message.id,
@ -1083,6 +1603,29 @@ async function handleExec(message) {
error: null,
});
} catch (error) {
const { bindings: committedBindings, committedCurrentBindingCount } =
collectCommittedBindings(
moduleLinked ? module : null,
priorBindings,
currentBindings,
committedCurrentBindingNames,
);
// Preserve the last successfully linked module across link-time failures.
// A module whose link step failed cannot safely back @prev because reading
// its namespace throws before evaluation ever begins. Likewise, if a
// linked module failed before its prelude recreated carried bindings, keep
// the old module so @prev still points at the last cell whose prelude and
// body actually established the carried values. Once the prelude has run,
// promote the failed module even if it only updated existing bindings.
if (
module &&
moduleLinked &&
(committedCurrentBindingCount > 0 ||
(preludeCompleted && priorBindings.length > 0))
) {
previousModule = module;
previousBindings = committedBindings;
}
send({
type: "exec_result",
id: message.id,

@ -2124,7 +2124,11 @@ mod tests {
// integration tests instead.
cfg!(target_os = "macos")
}
fn write_js_repl_test_package(base: &Path, name: &str, value: &str) -> anyhow::Result<()> {
fn write_js_repl_test_package_source(
base: &Path,
name: &str,
source: &str,
) -> anyhow::Result<()> {
let pkg_dir = base.join("node_modules").join(name);
fs::create_dir_all(&pkg_dir)?;
fs::write(
@ -2133,9 +2137,15 @@ mod tests {
"{{\n \"name\": \"{name}\",\n \"version\": \"1.0.0\",\n \"type\": \"module\",\n \"exports\": {{\n \"import\": \"./index.js\"\n }}\n}}\n"
),
)?;
fs::write(
pkg_dir.join("index.js"),
format!("export const value = \"{value}\";\n"),
fs::write(pkg_dir.join("index.js"), source)?;
Ok(())
}
fn write_js_repl_test_package(base: &Path, name: &str, value: &str) -> anyhow::Result<()> {
write_js_repl_test_package_source(
base,
name,
&format!("export const value = \"{value}\";\n"),
)?;
Ok(())
}

@ -31,6 +31,22 @@ fn custom_tool_output_text_and_success(
(output.unwrap_or_default(), success)
}
fn assert_js_repl_ok(req: &ResponsesRequest, call_id: &str, expected_output: &str) {
let (output, success) = custom_tool_output_text_and_success(req, call_id);
assert_ne!(
success,
Some(false),
"js_repl call failed unexpectedly: {output}"
);
assert!(output.contains(expected_output), "output was: {output}");
}
fn assert_js_repl_err(req: &ResponsesRequest, call_id: &str, expected_output: &str) {
let (output, success) = custom_tool_output_text_and_success(req, call_id);
assert_ne!(success, Some(true), "js_repl call should fail: {output}");
assert!(output.contains(expected_output), "output was: {output}");
}
fn tool_names(body: &serde_json::Value) -> Vec<String> {
body["tools"]
.as_array()
@ -75,6 +91,22 @@ async fn run_js_repl_turn(
prompt: &str,
calls: &[(&str, &str)],
) -> Result<ResponseMock> {
let mut mocks = run_js_repl_sequence(server, prompt, calls).await?;
Ok(mocks
.pop()
.expect("js_repl test should return a request mock"))
}
async fn run_js_repl_sequence(
server: &MockServer,
prompt: &str,
calls: &[(&str, &str)],
) -> Result<Vec<ResponseMock>> {
anyhow::ensure!(
!calls.is_empty(),
"js_repl test must include at least one call"
);
let mut builder = test_codex().with_config(|config| {
config
.features
@ -83,24 +115,68 @@ async fn run_js_repl_turn(
});
let test = builder.build(server).await?;
let mut first_events = vec![ev_response_created("resp-1")];
for (call_id, js_input) in calls {
first_events.push(ev_custom_tool_call(call_id, "js_repl", js_input));
}
first_events.push(ev_completed("resp-1"));
responses::mount_sse_once(server, sse(first_events)).await;
let second_mock = responses::mount_sse_once(
responses::mount_sse_once(
server,
sse(vec![
ev_assistant_message("msg-1", "done"),
ev_completed("resp-2"),
ev_response_created("resp-1"),
ev_custom_tool_call(calls[0].0, "js_repl", calls[0].1),
ev_completed("resp-1"),
]),
)
.await;
let mut mocks = Vec::with_capacity(calls.len());
for (response_index, (call_id, js_input)) in calls.iter().enumerate().skip(1) {
let response_id = format!("resp-{}", response_index + 1);
let mock = responses::mount_sse_once(
server,
sse(vec![
ev_response_created(&response_id),
ev_custom_tool_call(call_id, "js_repl", js_input),
ev_completed(&response_id),
]),
)
.await;
mocks.push(mock);
}
let final_response_id = format!("resp-{}", calls.len() + 1);
let final_mock = responses::mount_sse_once(
server,
sse(vec![
ev_assistant_message("msg-1", "done"),
ev_completed(&final_response_id),
]),
)
.await;
mocks.push(final_mock);
test.submit_turn(prompt).await?;
Ok(second_mock)
Ok(mocks)
}
async fn assert_failed_cell_followup(
server: &MockServer,
prompt: &str,
failing_cell: &str,
followup_cell: &str,
expected_followup_output: &str,
) -> Result<()> {
let mocks = run_js_repl_sequence(
server,
prompt,
&[("call-1", failing_cell), ("call-2", followup_cell)],
)
.await?;
assert_js_repl_err(&mocks[0].single_request(), "call-1", "boom");
assert_js_repl_ok(
&mocks[1].single_request(),
"call-2",
expected_followup_output,
);
Ok(())
}
#[tokio::test(flavor = "multi_thread", worker_threads = 2)]
@ -165,68 +241,328 @@ async fn js_repl_is_not_advertised_when_startup_node_is_incompatible() -> Result
}
#[tokio::test(flavor = "multi_thread", worker_threads = 2)]
async fn js_repl_persists_top_level_bindings_and_supports_tla() -> Result<()> {
async fn js_repl_persists_top_level_destructured_bindings_and_supports_tla() -> Result<()> {
skip_if_no_network!(Ok(()));
let server = responses::start_mock_server().await;
let mut builder = test_codex().with_config(|config| {
config
.features
.enable(Feature::JsRepl)
.expect("test config should allow feature update");
});
let test = builder.build(&server).await?;
responses::mount_sse_once(
let mocks = run_js_repl_sequence(
&server,
sse(vec![
ev_response_created("resp-1"),
ev_custom_tool_call(
"run js_repl twice",
&[
(
"call-1",
"js_repl",
"let x = await Promise.resolve(41); console.log(x);",
"const { context: liveContext, session } = await Promise.resolve({ context: 41, session: 1 }); console.log(liveContext + session);",
),
ev_completed("resp-1"),
]),
("call-2", "console.log(liveContext + session);"),
],
)
.await;
let second_mock = responses::mount_sse_once(
.await?;
assert_js_repl_ok(&mocks[0].single_request(), "call-1", "42");
assert_js_repl_ok(&mocks[1].single_request(), "call-2", "42");
Ok(())
}
#[tokio::test(flavor = "multi_thread", worker_threads = 2)]
async fn js_repl_failed_cells_commit_initialized_bindings_only() -> Result<()> {
skip_if_no_network!(Ok(()));
let server = responses::start_mock_server().await;
let mocks = run_js_repl_sequence(
&server,
sse(vec![
ev_response_created("resp-2"),
ev_custom_tool_call("call-2", "js_repl", "console.log(x + 1);"),
ev_completed("resp-2"),
]),
"run js_repl across a failed cell",
&[
("call-1", "const base = 40; console.log(base);"),
(
"call-2",
"const { session } = await Promise.resolve({ session: 2 }); throw new Error(\"boom\"); const late = 99;",
),
("call-3", "console.log(base + session, typeof late);"),
],
)
.await;
let third_mock = responses::mount_sse_once(
.await?;
assert_js_repl_ok(&mocks[0].single_request(), "call-1", "40");
assert_js_repl_err(&mocks[1].single_request(), "call-2", "boom");
assert_js_repl_ok(&mocks[2].single_request(), "call-3", "42 undefined");
Ok(())
}
#[tokio::test(flavor = "multi_thread", worker_threads = 2)]
async fn js_repl_failed_cells_preserve_initialized_lexical_destructuring_bindings() -> Result<()> {
skip_if_no_network!(Ok(()));
let server = responses::start_mock_server().await;
let mocks = run_js_repl_sequence(
&server,
sse(vec![
ev_assistant_message("msg-1", "done"),
ev_completed("resp-3"),
]),
"run js_repl through partial destructuring failure",
&[
(
"call-1",
"const { a, b } = { a: 1, get b() { throw new Error(\"boom\"); } };",
),
(
"call-2",
"let aValue; try { aValue = a; } catch (error) { aValue = error.name; } let bValue; try { bValue = b; } catch (error) { bValue = error.name; } console.log(aValue, bValue);",
),
],
)
.await;
.await?;
test.submit_turn("run js_repl twice").await?;
assert_js_repl_err(&mocks[0].single_request(), "call-1", "boom");
assert_js_repl_ok(&mocks[1].single_request(), "call-2", "1 ReferenceError");
let req2 = second_mock.single_request();
let (first_output, first_success) = custom_tool_output_text_and_success(&req2, "call-1");
assert_ne!(
first_success,
Some(false),
"first js_repl call failed unexpectedly: {first_output}"
Ok(())
}
#[tokio::test(flavor = "multi_thread", worker_threads = 2)]
async fn js_repl_link_failures_keep_prior_module_state() -> Result<()> {
skip_if_no_network!(Ok(()));
let server = responses::start_mock_server().await;
let mocks = run_js_repl_sequence(
&server,
"run js_repl across a link failure",
&[
("call-1", "const answer = 41; console.log(answer);"),
("call-2", "import value from \"./foo\";"),
("call-3", "console.log(answer + 1);"),
],
)
.await?;
assert_js_repl_ok(&mocks[0].single_request(), "call-1", "41");
assert_js_repl_err(
&mocks[1].single_request(),
"call-2",
"Top-level static import \"./foo\" is not supported in js_repl",
);
assert!(first_output.contains("41"));
assert_js_repl_ok(&mocks[2].single_request(), "call-3", "42");
let req3 = third_mock.single_request();
let (second_output, second_success) = custom_tool_output_text_and_success(&req3, "call-2");
assert_ne!(
second_success,
Some(false),
"second js_repl call failed unexpectedly: {second_output}"
);
assert!(second_output.contains("42"));
Ok(())
}
#[tokio::test(flavor = "multi_thread", worker_threads = 2)]
async fn js_repl_failed_cells_do_not_commit_unreached_hoisted_bindings() -> Result<()> {
skip_if_no_network!(Ok(()));
let server = responses::start_mock_server().await;
let mocks = run_js_repl_sequence(
&server,
"run js_repl through hoisted binding failure",
&[
(
"call-1",
"var early = 1; throw new Error(\"boom\"); var late = 2; function fn() { return 1; }",
),
(
"call-2",
"const late = 40; const fn = 1; console.log(early + late + fn);",
),
],
)
.await?;
assert_js_repl_err(&mocks[0].single_request(), "call-1", "boom");
assert_js_repl_ok(&mocks[1].single_request(), "call-2", "42");
Ok(())
}
#[tokio::test(flavor = "multi_thread", worker_threads = 2)]
async fn js_repl_failed_cells_do_not_preserve_hoisted_function_reads_before_declaration()
-> Result<()> {
skip_if_no_network!(Ok(()));
let server = responses::start_mock_server().await;
let mocks = run_js_repl_sequence(
&server,
"run js_repl through unsupported hoisted function reads",
&[
(
"call-1",
"foo(); throw new Error(\"boom\"); function foo() {}",
),
(
"call-2",
"let value; try { foo; value = \"present\"; } catch (error) { value = error.name; } console.log(value);",
),
],
)
.await?;
assert_js_repl_err(&mocks[0].single_request(), "call-1", "boom");
assert_js_repl_ok(&mocks[1].single_request(), "call-2", "ReferenceError");
Ok(())
}
#[tokio::test(flavor = "multi_thread", worker_threads = 2)]
async fn js_repl_failed_cells_preserve_functions_when_declaration_sites_are_reached() -> Result<()>
{
skip_if_no_network!(Ok(()));
let server = responses::start_mock_server().await;
let mocks = run_js_repl_sequence(
&server,
"run js_repl through supported function declaration persistence",
&[
("call-1", "function foo() {} throw new Error(\"boom\");"),
("call-2", "console.log(typeof foo);"),
],
)
.await?;
assert_js_repl_err(&mocks[0].single_request(), "call-1", "boom");
assert_js_repl_ok(&mocks[1].single_request(), "call-2", "function");
Ok(())
}
#[tokio::test(flavor = "multi_thread", worker_threads = 2)]
async fn js_repl_failed_cells_preserve_prior_binding_writes_without_new_bindings() -> Result<()> {
skip_if_no_network!(Ok(()));
let server = responses::start_mock_server().await;
let mocks = run_js_repl_sequence(
&server,
"run js_repl through failed prior-binding writes",
&[
("call-1", "let x = 1; console.log(x);"),
("call-2", "x = 2; throw new Error(\"boom\");"),
("call-3", "console.log(x);"),
],
)
.await?;
assert_js_repl_ok(&mocks[0].single_request(), "call-1", "1");
assert_js_repl_err(&mocks[1].single_request(), "call-2", "boom");
assert_js_repl_ok(&mocks[2].single_request(), "call-3", "2");
Ok(())
}
#[tokio::test(flavor = "multi_thread", worker_threads = 2)]
async fn js_repl_failed_cells_var_persistence_boundaries() -> Result<()> {
skip_if_no_network!(Ok(()));
let server = responses::start_mock_server().await;
let cases = [
(
"run js_repl through supported pre-declaration var writes",
"x = 5; y = 1; y += 2; z = 1; z++; throw new Error(\"boom\"); var x, y, z;",
"console.log(x, y, z);",
"5 3 2",
),
(
"run js_repl through short-circuited logical var assignments",
"x &&= 1; y ||= 2; z ??= 3; throw new Error(\"boom\"); var x, y, z;",
"let xValue; try { xValue = x; } catch (error) { xValue = error.name; } console.log(xValue, y, z);",
"ReferenceError 2 3",
),
(
"run js_repl through unsupported shadowed nested var writes",
"{ let x = 1; x = 2; } throw new Error(\"boom\"); var x;",
"let value; try { value = x; } catch (error) { value = error.name; } console.log(value);",
"ReferenceError",
),
(
"run js_repl through unsupported nested assignment writes",
"x = (y = 1); throw new Error(\"boom\"); var x, y;",
"let yValue; try { yValue = y; } catch (error) { yValue = error.name; } console.log(x, yValue);",
"1 ReferenceError",
),
(
"run js_repl through unsupported var destructuring recovery",
"var { a, b } = { a: 1, get b() { throw new Error(\"boom\"); } };",
"let aValue; try { aValue = a; } catch (error) { aValue = error.name; } let bValue; try { bValue = b; } catch (error) { bValue = error.name; } console.log(aValue, bValue);",
"ReferenceError ReferenceError",
),
];
for (prompt, failing_cell, followup_cell, expected_followup_output) in cases {
assert_failed_cell_followup(
&server,
prompt,
failing_cell,
followup_cell,
expected_followup_output,
)
.await?;
}
Ok(())
}
#[tokio::test(flavor = "multi_thread", worker_threads = 2)]
async fn js_repl_failed_cells_commit_non_empty_loop_vars_but_skip_empty_loops() -> Result<()> {
skip_if_no_network!(Ok(()));
let server = responses::start_mock_server().await;
let mocks = run_js_repl_sequence(
&server,
"run js_repl through failed loop bindings",
&[
(
"call-1",
"for (var item of [2]) {} for (var emptyItem of []) {} throw new Error(\"boom\");",
),
(
"call-2",
"let itemValue; try { itemValue = item; } catch (error) { itemValue = error.name; } let emptyValue; try { emptyValue = emptyItem; } catch (error) { emptyValue = error.name; } console.log(itemValue, emptyValue);",
),
],
)
.await?;
assert_js_repl_err(&mocks[0].single_request(), "call-1", "boom");
assert_js_repl_ok(&mocks[1].single_request(), "call-2", "2 ReferenceError");
Ok(())
}
#[tokio::test(flavor = "multi_thread", worker_threads = 2)]
async fn js_repl_keeps_function_to_string_stable() -> Result<()> {
skip_if_no_network!(Ok(()));
let server = responses::start_mock_server().await;
let mock = run_js_repl_turn(
&server,
"run js_repl through function toString",
&[(
"call-1",
"function foo() { return 1; } console.log(foo.toString());",
)],
)
.await?;
let req = mock.single_request();
assert_js_repl_ok(&req, "call-1", "function foo() { return 1; }");
let (output, _) = custom_tool_output_text_and_success(&req, "call-1");
assert!(!output.contains("__codexInternalMarkCommittedBindings"));
Ok(())
}
#[tokio::test(flavor = "multi_thread", worker_threads = 2)]
async fn js_repl_allows_globalthis_shadowing_with_instrumented_bindings() -> Result<()> {
skip_if_no_network!(Ok(()));
let server = responses::start_mock_server().await;
let mock = run_js_repl_turn(
&server,
"run js_repl with shadowed globalThis",
&[(
"call-1",
"const globalThis = {}; const value = 1; console.log(typeof globalThis, value);",
)],
)
.await?;
let req = mock.single_request();
assert_js_repl_ok(&req, "call-1", "object 1");
Ok(())
}

@ -60,6 +60,9 @@ imported local file. They are not resolved relative to the imported file's locat
- Optional first-line pragma:
- `// codex-js-repl: timeout_ms=15000`
- Top-level bindings persist across calls.
- If a cell throws, prior bindings remain available, lexical bindings whose initialization completed before the throw stay available in later calls, and hoisted `var` / `function` bindings persist only when execution clearly reached their declaration or a supported write site.
- Supported hoisted-`var` failed-cell cases are direct top-level identifier writes and updates before the declaration (for example `x = 1`, `x += 1`, `x++`, `x &&= 1`) and non-empty top-level `for...in` / `for...of` loops.
- Intentionally unsupported failed-cell cases include hoisted function reads before the declaration, aliasing or direct-IIFE-based inference, writes in nested blocks or other nested statement structure, nested writes inside already-instrumented assignment RHS expressions, destructuring-assignment recovery for hoisted `var`, partial `var` destructuring recovery, pre-declaration `undefined` reads, and empty top-level `for...in` / `for...of` loop vars.
- Top-level static import declarations (for example `import x from "pkg"`) are currently unsupported; use dynamic imports with `await import("pkg")`.
- Imported local files must be ESM `.js` / `.mjs` files and run in the same REPL VM context as the calling cell.
- Static imports inside imported local files may only target other local `.js` / `.mjs` files via relative paths, absolute paths, or `file://` URLs. Bare package and builtin imports from local files must stay dynamic via `await import(...)`.
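
The failed-cell rule described in the bullets above can be modeled in isolation. The following is a minimal sketch, not the kernel's actual code: `commitAfterFailure` and `readableNames` are hypothetical names, and `readableNames` stands in for the check of whether the failed module's namespace can read a binding. It assumes, as the bullets state, that `var` / `function` bindings persist only when a marker fired, while lexical bindings may also fall back to readability.

```javascript
// Hypothetical, simplified model of the failed-cell commit rule.
// `markedNames`: names whose declaration-site or write-site marker fired.
// `readableNames`: assumed stand-in for "the failed module's namespace can
// read this name" (the real check happens against the module itself).
function commitAfterFailure(priorBindings, currentBindings, markedNames, readableNames) {
  const merged = new Map();
  // Prior bindings always survive a failed cell.
  for (const { name, kind } of priorBindings) {
    merged.set(name, kind);
  }
  for (const { name, kind } of currentBindings) {
    const hoisted = kind === "var" || kind === "function";
    // Hoisted bindings need an explicit marker; lexical bindings may also
    // fall back to namespace readability.
    if (markedNames.has(name) || (!hoisted && readableNames.has(name))) {
      merged.set(name, kind);
    }
  }
  return Array.from(merged, ([name, kind]) => ({ name, kind }));
}

// Models a cell shaped like:
//   const { session } = ...; throw new Error("boom");
//   const late = 99; var v; function fn() {}
const result = commitAfterFailure(
  [{ name: "base", kind: "const" }],
  [
    { name: "session", kind: "const" },
    { name: "late", kind: "const" },
    { name: "v", kind: "var" },
    { name: "fn", kind: "function" },
  ],
  new Set(), // no marker fired for `v` or `fn` before the throw
  new Set(["session", "v", "fn"]), // hoisted names are readable but unreached
);
console.log(result.map((binding) => binding.name).join(","));
```

In this model, `v` and `fn` are dropped even though they are readable, which is the "ghost binding" case the narrower persistence model is meant to avoid; `late` is dropped because its initialization never ran.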