nit: drop file (#7314)
This commit is contained in:
parent
e953092949
commit
963009737f
1 changed files with 0 additions and 206 deletions
|
|
@ -1,206 +0,0 @@
|
|||
# Client Extraction Plan
|
||||
|
||||
## Goals
|
||||
- Split the HTTP transport/client code out of `codex-core` into a reusable crate that is agnostic of Codex/OpenAI business logic and API schemas.
|
||||
- Create a separate API library crate that houses typed requests/responses for well-known APIs (Responses, Chat Completions, Compact) and plugs into the transport crate via minimal traits.
|
||||
- Preserve current behaviour (auth headers, retries, SSE handling, rate-limit parsing, compaction, fixtures) while making the APIs symmetric and avoiding code duplication.
|
||||
- Keep existing consumers (`codex-core`, tests, and tools) stable by providing a small compatibility layer during the transition.
|
||||
|
||||
## Snapshot of Today
|
||||
- `core/src/client.rs (ModelClient)` owns config/auth/session state, chooses wire API, builds payloads, drives retries, parses SSE, compaction, and rate-limit headers.
|
||||
- `core/src/chat_completions.rs` implements the Chat Completions call + SSE parser + aggregation helper.
|
||||
- `core/src/client_common.rs` holds `Prompt`, tool specs, shared request structs (`ResponsesApiRequest`, `TextControls`), and `ResponseEvent`/`ResponseStream`.
|
||||
- `core/src/default_client.rs` wraps `reqwest` with Codex UA/originator defaults.
|
||||
- `core/src/model_provider_info.rs` models providers (base URL, headers, env keys, retry/timeout tuning) and builds `CodexRequestBuilder`s.
|
||||
- Current retry logic is co-located with API handling; streaming SSE parsing is duplicated across Responses/Chat.
|
||||
|
||||
## Target Crates (with interfaces)
|
||||
|
||||
- `codex-client` (generic transport)
|
||||
- Owns the generic HTTP machinery: a `CodexHttpClient`/`CodexRequestBuilder`-style wrapper, retry/backoff hooks, streaming connector (SSE framing + idle timeout), header injection, and optional telemetry callbacks.
|
||||
- Does **not** know about OpenAI/Codex-specific paths, headers, or error codes; it only exposes HTTP-level concepts (status, headers, bodies, connection errors).
|
||||
- Minimal surface:
|
||||
```rust
|
||||
pub trait HttpTransport {
|
||||
fn execute(&self, req: Request) -> Result<Response, TransportError>;
|
||||
fn stream(&self, req: Request) -> Result<ByteStream, TransportError>;
|
||||
}
|
||||
|
||||
pub struct Request {
|
||||
pub method: Method,
|
||||
pub url: String,
|
||||
pub headers: HeaderMap,
|
||||
pub body: Option<serde_json::Value>,
|
||||
pub timeout: Option<Duration>,
|
||||
}
|
||||
```
|
||||
- Generic client traits (request/response/chunk are abstract over the transport):
|
||||
```rust
|
||||
#[async_trait::async_trait]
|
||||
pub trait UnaryClient<Req, Resp> {
|
||||
async fn run(&self, req: Req) -> Result<Resp, TransportError>;
|
||||
}
|
||||
|
||||
#[async_trait::async_trait]
|
||||
pub trait StreamClient<Req, Chunk> {
|
||||
async fn run(&self, req: Req) -> Result<ResponseStream<Chunk>, TransportError>;
|
||||
}
|
||||
|
||||
pub struct RetryPolicy {
|
||||
pub max_attempts: u64,
|
||||
pub base_delay: Duration,
|
||||
pub retry_on: RetryOn, // e.g., transport errors + 429/5xx
|
||||
}
|
||||
```
|
||||
- `RetryOn` lives in `codex-client` and captures HTTP status classes and transport failures that qualify for retry.
|
||||
- Implementations in `codex-api` plug in their own request types, parsers, and retry policies while reusing the transport’s backoff and error types.
|
||||
- Planned runtime helper:
|
||||
```rust
|
||||
pub async fn run_with_retry<T, F, Fut>(
|
||||
policy: RetryPolicy,
|
||||
make_req: impl Fn() -> Request,
|
||||
op: F,
|
||||
) -> Result<T, TransportError>
|
||||
where
|
||||
F: Fn(Request) -> Fut,
|
||||
Fut: Future<Output = Result<T, TransportError>>,
|
||||
{
|
||||
for attempt in 0..=policy.max_attempts {
|
||||
let req = make_req();
|
||||
match op(req).await {
|
||||
Ok(resp) => return Ok(resp),
|
||||
Err(err) if policy.retry_on.should_retry(&err, attempt) => {
|
||||
tokio::time::sleep(backoff(policy.base_delay, attempt + 1)).await;
|
||||
}
|
||||
Err(err) => return Err(err),
|
||||
}
|
||||
}
|
||||
Err(TransportError::RetryLimit)
|
||||
}
|
||||
```
|
||||
- Unary clients wrap `transport.execute` with this helper and then deserialize.
|
||||
- Stream clients wrap the **initial** `transport.stream` call with this helper. Mid-stream disconnects are surfaced as `StreamError`s; automatic resume/reconnect can be added later on top of this primitive if we introduce cursor support.
|
||||
- Common helpers: `retry::backoff(attempt)`, `errors::{TransportError, StreamError}`.
|
||||
- Streaming utility (SSE framing only):
|
||||
```rust
|
||||
pub fn sse_stream<S>(
|
||||
bytes: S,
|
||||
idle_timeout: Duration,
|
||||
tx: mpsc::Sender<Result<String, StreamError>>,
|
||||
telemetry: Option<Box<dyn Telemetry>>,
|
||||
)
|
||||
where
|
||||
S: Stream<Item = Result<Bytes, TransportError>> + Unpin + Send + 'static;
|
||||
```
|
||||
- `sse_stream` is responsible for timeouts, connection-level errors, and emitting raw `data:` chunks as UTF-8 strings; parsing those strings into structured events is done in `codex-api`.
|
||||
|
||||
- `codex-api` (OpenAI/Codex API library)
|
||||
- Owns typed models for Responses/Chat/Compact plus shared helpers (`Prompt`, tool specs, text controls, `ResponsesApiRequest`, etc.).
|
||||
- Knows about OpenAI/Codex semantics:
|
||||
- URL shapes (`/v1/responses`, `/v1/chat/completions`, `/responses/compact`).
|
||||
- Provider configuration (`WireApi`, base URLs, query params, per-provider retry knobs).
|
||||
- Rate-limit headers (`x-codex-*`) and their mapping into `RateLimitSnapshot` / `CreditsSnapshot`.
|
||||
- Error body formats (`{ error: { type, code, message, plan_type, resets_at } }`) and how they become API errors (context window exceeded, quota/usage limit, etc.).
|
||||
- SSE event names (`response.output_item.done`, `response.completed`, `response.failed`, etc.) and their mapping into high-level events.
|
||||
- Provides a provider abstraction (conceptually similar to `ModelProviderInfo`):
|
||||
```rust
|
||||
pub struct Provider {
|
||||
pub name: String,
|
||||
pub base_url: String,
|
||||
pub wire: WireApi, // Responses | Chat
|
||||
pub headers: HeaderMap,
|
||||
pub retry: RetryConfig,
|
||||
pub stream_idle_timeout: Duration,
|
||||
}
|
||||
|
||||
pub trait AuthProvider {
|
||||
/// Returns a bearer token to use for this request (if any).
|
||||
/// Implementations are expected to be cheap and to surface already-refreshed tokens;
|
||||
/// higher layers (`codex-core`) remain responsible for token refresh flows.
|
||||
fn bearer_token(&self) -> Option<String>;
|
||||
|
||||
/// Optional ChatGPT account id header for Chat mode.
|
||||
fn account_id(&self) -> Option<String>;
|
||||
}
|
||||
```
|
||||
- Ready-made clients built on `HttpTransport`:
|
||||
```rust
|
||||
pub struct ResponsesClient<T: HttpTransport, A: AuthProvider> { /* ... */ }
|
||||
impl<T, A> ResponsesClient<T, A> {
|
||||
pub async fn stream(&self, prompt: &Prompt) -> ApiResult<ResponseStream<ApiEvent>>;
|
||||
pub async fn compact(&self, prompt: &Prompt) -> ApiResult<Vec<ResponseItem>>;
|
||||
}
|
||||
|
||||
pub struct ChatClient<T: HttpTransport, A: AuthProvider> { /* ... */ }
|
||||
impl<T, A> ChatClient<T, A> {
|
||||
pub async fn stream(&self, prompt: &Prompt) -> ApiResult<ResponseStream<ApiEvent>>;
|
||||
}
|
||||
|
||||
pub struct CompactClient<T: HttpTransport, A: AuthProvider> { /* ... */ }
|
||||
impl<T, A> CompactClient<T, A> {
|
||||
pub async fn compact(&self, prompt: &Prompt) -> ApiResult<Vec<ResponseItem>>;
|
||||
}
|
||||
```
|
||||
- Streaming events unified across wire APIs (this can closely mirror `ResponseEvent` today, and we may type-alias one to the other during migration):
|
||||
```rust
|
||||
pub enum ApiEvent {
|
||||
Created,
|
||||
OutputItemAdded(ResponseItem),
|
||||
OutputItemDone(ResponseItem),
|
||||
OutputTextDelta(String),
|
||||
ReasoningContentDelta { delta: String, content_index: i64 },
|
||||
ReasoningSummaryDelta { delta: String, summary_index: i64 },
|
||||
RateLimits(RateLimitSnapshot),
|
||||
Completed { response_id: String, token_usage: Option<TokenUsage> },
|
||||
}
|
||||
```
|
||||
- Error layering:
|
||||
- `codex-client`: defines `TransportError` / `StreamError` (status codes, IO, timeouts).
|
||||
- `codex-api`: defines `ApiError` that wraps `TransportError` plus API-specific errors parsed from bodies and headers.
|
||||
- `codex-core`: maps `ApiError` into existing `CodexErr` variants so downstream callers remain unchanged.
|
||||
- Aggregation strategies (today’s `AggregateStreamExt`) live here as adapters (`Aggregated`, `Streaming`) that transform `ResponseStream<ApiEvent>` into the higher-level views used by `codex-core`.
|
||||
|
||||
## Implementation Steps
|
||||
|
||||
1. **Create crates**: add `codex-client` and `codex-api` (names keep the `codex-` prefix). Stub lib files with feature flags/tests wired into the workspace; wire them into `Cargo.toml`.
|
||||
2. **Extract API-level SSE + rate limits into `codex-api`**:
|
||||
- Move the Responses SSE parser (`process_sse`), rate-limit parsing, and related tests from `core/src/client.rs` into `codex-api`, keeping the behavior identical.
|
||||
- Introduce `ApiEvent` (initially equivalent to `ResponseEvent`) and `ApiError`, and adjust the parser to emit those.
|
||||
- Provide test-only helpers for fixture streams (replacement for `CODEX_RS_SSE_FIXTURE`) in `codex-api`.
|
||||
3. **Lift transport layer into `codex-client`**:
|
||||
- Move `CodexHttpClient`/`CodexRequestBuilder`, UA/originator plumbing, and backoff helpers from `core/src/default_client.rs` into `codex-client` (or a thin wrapper on top of it).
|
||||
- Introduce `HttpTransport`, `Request`, `RetryPolicy`, `RetryOn`, and `run_with_retry` as described above.
|
||||
- Keep sandbox/no-proxy toggles behind injected configuration so `codex-client` stays generic and does not depend on Codex-specific env vars.
|
||||
4. **Model provider abstraction in `codex-api`**:
|
||||
- Relocate `ModelProviderInfo` (base URL, env/header resolution, retry knobs, wire API enum) into `codex-api`, expressed in terms of `Provider` and `AuthProvider`.
|
||||
- Ensure provider logic handles:
|
||||
- URL building for Responses/Chat/Compact (including Azure special cases).
|
||||
- Static and env-based headers.
|
||||
- Per-provider retry and idle-timeout settings that map cleanly into `RetryPolicy`/`RetryOn`.
|
||||
5. **API crate wiring**:
|
||||
- Move `Prompt`, tool specs, `ResponsesApiRequest`, `TextControls`, and `ResponseEvent/ResponseStream` into `codex-api` under modules (`common`, `responses`, `chat`, `compact`), keeping public types stable or re-exported through `codex-core` as needed.
|
||||
- Rebuild Responses and Chat clients on top of `HttpTransport` + `StreamClient`, reusing shared retry + SSE helpers; keep aggregation adapters as reusable strategies instead of `ModelClient`-local logic.
|
||||
- Implement Compact on top of `UnaryClient` and the unary `execute` path with JSON deserialization, sharing the same retry policy.
|
||||
- Keep request builders symmetric: each client prepares a `Request<serde_json::Value>`, attaches headers/auth via `AuthProvider`, and plugs in its parser (streaming clients) or deserializer (unary) while sharing retry/backoff configuration derived from `Provider`.
|
||||
6. **Core integration layer**:
|
||||
- Replace `core::ModelClient` internals with thin adapters that construct `codex-api` clients using `Config`, `AuthManager`, and `OtelEventManager`.
|
||||
- Keep the public `ModelClient` API and `ResponseEvent`/`ResponseStream` types stable by re-exporting `codex-api` types or providing type aliases.
|
||||
- Preserve existing auth flows (including ChatGPT token refresh) inside `codex-core` or a thin adapter, using `AuthProvider` to surface bearer tokens to `codex-api` and handling 401/refresh semantics at this layer.
|
||||
7. **Tests/migration**:
|
||||
- Move unit tests for SSE parsing, retry/backoff decisions, and provider/header behavior into the new crates; keep integration tests in `core` using the compatibility layer.
|
||||
- Update fixtures to be consumed via test-only adapters in `codex-api`.
|
||||
- Run targeted `just fmt`, `just fix -p` for the touched crates, and scoped `cargo test -p codex-client`, `-p codex-api`, and existing `codex-core` suites.
|
||||
|
||||
## Design Decisions
|
||||
|
||||
- **UA construction**
|
||||
- `codex-client` exposes an optional UA suffix/provider hook (tiny feature) and remains unaware of the CLI; `codex-core` / the CLI compute the full UA (including `terminal::user_agent()`) and pass the suffix or builder down.
|
||||
- **Config vs provider**
|
||||
- Most configuration stays in `codex-core`. `codex-api::Provider` only contains what is strictly required for HTTP (base URLs, query params, retry/timeout knobs, wire API), while higher-level knobs (reasoning defaults, verbosity flags, etc.) remain core concerns.
|
||||
- **Auth flow ownership**
|
||||
- Auth flows (including ChatGPT token refresh) remain in `codex-core`. `AuthProvider` simply exposes already-fresh tokens/account IDs; 401 handling and refresh retries stay in the existing auth layer.
|
||||
- **Error enums**
|
||||
- `codex-client` continues to define `TransportError` / `StreamError`. `codex-api` defines an `ApiError` (deriving `thiserror::Error`) that wraps `TransportError` and API-specific failures, and `codex-core` maps `ApiError` into existing `CodexErr` variants for callers.
|
||||
- **Streaming reconnection semantics**
|
||||
- For now, mid-stream SSE failures are surfaced as errors and only the initial connection is retried via `run_with_retry`. We will revisit mid-stream reconnect/resume once the underlying APIs support cursor/idempotent event semantics.
|
||||
|
||||
Loading…
Add table
Reference in a new issue