# Client Extraction Plan

## Goals
- Split the HTTP transport/client code out of `codex-core` into a reusable crate that is agnostic of Codex/OpenAI business logic and API schemas.
- Create a separate API library crate that houses typed requests/responses for well-known APIs (Responses, Chat Completions, Compact) and plugs into the transport crate via minimal traits.
- Preserve current behaviour (auth headers, retries, SSE handling, rate-limit parsing, compaction, fixtures) while making the APIs symmetric and avoiding code duplication.
- Keep existing consumers (`codex-core`, tests, and tools) stable by providing a small compatibility layer during the transition.
## Snapshot of Today

- `core/src/client.rs` (`ModelClient`) owns config/auth/session state, chooses the wire API, builds payloads, drives retries, parses SSE, handles compaction, and parses rate-limit headers.
- `core/src/chat_completions.rs` implements the Chat Completions call plus its SSE parser and aggregation helper.
- `core/src/client_common.rs` holds `Prompt`, tool specs, shared request structs (`ResponsesApiRequest`, `TextControls`), and `ResponseEvent`/`ResponseStream`.
- `core/src/default_client.rs` wraps `reqwest` with Codex UA/originator defaults.
- `core/src/model_provider_info.rs` models providers (base URL, headers, env keys, retry/timeout tuning) and builds `CodexRequestBuilder`s.
- Retry logic is currently co-located with API handling, and streaming SSE parsing is duplicated across Responses/Chat.
## Target Crates (with interfaces)

- `codex-client` (generic transport)
  - Owns the generic HTTP machinery: a `CodexHttpClient`/`CodexRequestBuilder`-style wrapper, retry/backoff hooks, a streaming connector (SSE framing + idle timeout), header injection, and optional telemetry callbacks.
  - Does not know about OpenAI/Codex-specific paths, headers, or error codes; it only exposes HTTP-level concepts (status, headers, bodies, connection errors).
  - Minimal surface:

    ```rust
    pub trait HttpTransport {
        fn execute(&self, req: Request) -> Result<Response, TransportError>;
        fn stream(&self, req: Request) -> Result<ByteStream, TransportError>;
    }

    pub struct Request {
        pub method: Method,
        pub url: String,
        pub headers: HeaderMap,
        pub body: Option<serde_json::Value>,
        pub timeout: Option<Duration>,
    }
    ```

  - Generic client traits (request/response/chunk are abstract over the transport):

    ```rust
    #[async_trait::async_trait]
    pub trait UnaryClient<Req, Resp> {
        async fn run(&self, req: Req) -> Result<Resp, TransportError>;
    }

    #[async_trait::async_trait]
    pub trait StreamClient<Req, Chunk> {
        async fn run(&self, req: Req) -> Result<ResponseStream<Chunk>, TransportError>;
    }

    pub struct RetryPolicy {
        pub max_attempts: u64,
        pub base_delay: Duration,
        pub retry_on: RetryOn, // e.g., transport errors + 429/5xx
    }
    ```

    `RetryOn` lives in `codex-client` and captures the HTTP status classes and transport failures that qualify for retry.
  - Implementations in `codex-api` plug in their own request types, parsers, and retry policies while reusing the transport's backoff and error types.
  - Planned runtime helper:

    ```rust
    pub async fn run_with_retry<T, F, Fut>(
        policy: RetryPolicy,
        make_req: impl Fn() -> Request,
        op: F,
    ) -> Result<T, TransportError>
    where
        F: Fn(Request) -> Fut,
        Fut: Future<Output = Result<T, TransportError>>,
    {
        for attempt in 0..=policy.max_attempts {
            let req = make_req();
            match op(req).await {
                Ok(resp) => return Ok(resp),
                Err(err) if policy.retry_on.should_retry(&err, attempt) => {
                    tokio::time::sleep(backoff(policy.base_delay, attempt + 1)).await;
                }
                Err(err) => return Err(err),
            }
        }
        Err(TransportError::RetryLimit)
    }
    ```

  - Unary clients wrap `transport.execute` with this helper and then deserialize.
  - Stream clients wrap the initial `transport.stream` call with this helper. Mid-stream disconnects are surfaced as `StreamError`s; automatic resume/reconnect can be added later on top of this primitive if we introduce cursor support.
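  - Illustrative sketch (not part of the planned surface): the retry pieces above, mirrored synchronously so they run without `tokio`. The jitterless `backoff`, the `run_with_retry_sync` name, and the concrete `RetryOn` fields are assumptions for illustration only.

    ```rust
    use std::time::Duration;

    /// Failures that qualify for another attempt: listed HTTP statuses
    /// plus (optionally) transport-level failures such as connection resets.
    #[derive(Debug)]
    pub struct RetryOn {
        pub statuses: Vec<u16>,
        pub transport_errors: bool,
    }

    #[derive(Debug, PartialEq)]
    pub enum TransportError {
        Status(u16),
        Io(String),
        RetryLimit,
    }

    impl RetryOn {
        pub fn should_retry(&self, err: &TransportError, _attempt: u64) -> bool {
            match err {
                TransportError::Status(code) => self.statuses.contains(code),
                TransportError::Io(_) => self.transport_errors,
                TransportError::RetryLimit => false,
            }
        }
    }

    pub struct RetryPolicy {
        pub max_attempts: u64,
        pub base_delay: Duration,
        pub retry_on: RetryOn,
    }

    /// Exponential backoff without jitter: base * 2^(attempt - 1).
    pub fn backoff(base: Duration, attempt: u64) -> Duration {
        base * 2u32.saturating_pow(attempt.saturating_sub(1) as u32)
    }

    /// Synchronous mirror of the planned async `run_with_retry`.
    pub fn run_with_retry_sync<T>(
        policy: &RetryPolicy,
        mut op: impl FnMut() -> Result<T, TransportError>,
    ) -> Result<T, TransportError> {
        for attempt in 0..=policy.max_attempts {
            match op() {
                Ok(v) => return Ok(v),
                Err(err) if policy.retry_on.should_retry(&err, attempt) => {
                    std::thread::sleep(backoff(policy.base_delay, attempt + 1));
                }
                Err(err) => return Err(err),
            }
        }
        Err(TransportError::RetryLimit)
    }
    ```

    As in the async plan, a final failing attempt falls out of the loop as `RetryLimit`, and non-retryable errors short-circuit immediately.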
  - Common helpers: `retry::backoff(attempt)`, `errors::{TransportError, StreamError}`.
  - Streaming utility (SSE framing only):

    ```rust
    pub fn sse_stream<S>(
        bytes: S,
        idle_timeout: Duration,
        tx: mpsc::Sender<Result<String, StreamError>>,
        telemetry: Option<Box<dyn Telemetry>>,
    ) where
        S: Stream<Item = Result<Bytes, TransportError>> + Unpin + Send + 'static;
    ```

    `sse_stream` is responsible for timeouts, connection-level errors, and emitting raw `data:` chunks as UTF-8 strings; parsing those strings into structured events is done in `codex-api`.
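  - Illustrative sketch: the framing half of `sse_stream`, stripped of the async plumbing, timeouts, and telemetry. The `SseFramer` name and its chunk-buffering API are assumptions; the `data:`-line extraction matches the contract described above.

    ```rust
    /// Incremental SSE framer: feed raw UTF-8 chunks, get back the complete
    /// `data:` payloads they finish. Partial lines stay buffered until the
    /// next chunk completes them.
    pub struct SseFramer {
        buf: String,
    }

    impl SseFramer {
        pub fn new() -> Self {
            Self { buf: String::new() }
        }

        /// Push one chunk; returns the `data:` payloads completed by it.
        pub fn push(&mut self, chunk: &str) -> Vec<String> {
            self.buf.push_str(chunk);
            let mut out = Vec::new();
            // Only consume fully terminated lines; keep any trailing
            // partial line buffered for the next chunk.
            while let Some(pos) = self.buf.find('\n') {
                let line: String = self.buf.drain(..=pos).collect();
                let line = line.trim_end_matches(|c| c == '\n' || c == '\r');
                if let Some(data) = line.strip_prefix("data:") {
                    out.push(data.trim_start().to_string());
                }
            }
            out
        }
    }
    ```

    The real utility additionally surfaces idle-timeout and connection errors through the `StreamError` channel rather than returning payloads directly.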
- `codex-api` (OpenAI/Codex API library)
  - Owns typed models for Responses/Chat/Compact plus shared helpers (`Prompt`, tool specs, text controls, `ResponsesApiRequest`, etc.).
  - Knows about OpenAI/Codex semantics:
    - URL shapes (`/v1/responses`, `/v1/chat/completions`, `/responses/compact`).
    - Provider configuration (`WireApi`, base URLs, query params, per-provider retry knobs).
    - Rate-limit headers (`x-codex-*`) and their mapping into `RateLimitSnapshot`/`CreditsSnapshot`.
    - Error body formats (`{ error: { type, code, message, plan_type, resets_at } }`) and how they become API errors (context window exceeded, quota/usage limit, etc.).
    - SSE event names (`response.output_item.done`, `response.completed`, `response.failed`, etc.) and their mapping into high-level events.
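  - Illustrative sketch of the header mapping: the concrete `x-codex-*` header names and snapshot fields below are hypothetical stand-ins (the real set lives in today's parser in `core/src/client.rs`).

    ```rust
    use std::collections::HashMap;

    #[derive(Debug, PartialEq)]
    pub struct RateLimitSnapshot {
        pub used_percent: f64,
        pub resets_in_seconds: u64,
    }

    /// Sketch: pull a rate-limit snapshot out of response headers.
    /// Both header names here are invented for illustration, not the
    /// actual `x-codex-*` set.
    pub fn parse_rate_limits(headers: &HashMap<String, String>) -> Option<RateLimitSnapshot> {
        let used_percent = headers.get("x-codex-used-percent")?.parse().ok()?;
        let resets_in_seconds = headers.get("x-codex-resets-in-seconds")?.parse().ok()?;
        Some(RateLimitSnapshot { used_percent, resets_in_seconds })
    }
    ```

    Returning `Option` keeps missing or malformed headers non-fatal, which matches the snapshot's advisory role.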
  - Provides a provider abstraction (conceptually similar to `ModelProviderInfo`):

    ```rust
    pub struct Provider {
        pub name: String,
        pub base_url: String,
        pub wire: WireApi, // Responses | Chat
        pub headers: HeaderMap,
        pub retry: RetryConfig,
        pub stream_idle_timeout: Duration,
    }

    pub trait AuthProvider {
        /// Returns a bearer token to use for this request (if any).
        /// Implementations are expected to be cheap and to surface already-refreshed tokens;
        /// higher layers (`codex-core`) remain responsible for token refresh flows.
        fn bearer_token(&self) -> Option<String>;

        /// Optional ChatGPT account id header for Chat mode.
        fn account_id(&self) -> Option<String>;
    }
    ```

  - Ready-made clients built on `HttpTransport`:

    ```rust
    pub struct ResponsesClient<T: HttpTransport, A: AuthProvider> { /* ... */ }
    impl<T, A> ResponsesClient<T, A> {
        pub async fn stream(&self, prompt: &Prompt) -> ApiResult<ResponseStream<ApiEvent>>;
        pub async fn compact(&self, prompt: &Prompt) -> ApiResult<Vec<ResponseItem>>;
    }

    pub struct ChatClient<T: HttpTransport, A: AuthProvider> { /* ... */ }
    impl<T, A> ChatClient<T, A> {
        pub async fn stream(&self, prompt: &Prompt) -> ApiResult<ResponseStream<ApiEvent>>;
    }

    pub struct CompactClient<T: HttpTransport, A: AuthProvider> { /* ... */ }
    impl<T, A> CompactClient<T, A> {
        pub async fn compact(&self, prompt: &Prompt) -> ApiResult<Vec<ResponseItem>>;
    }
    ```

  - Streaming events unified across wire APIs (this can closely mirror `ResponseEvent` today, and we may type-alias one to the other during migration):

    ```rust
    pub enum ApiEvent {
        Created,
        OutputItemAdded(ResponseItem),
        OutputItemDone(ResponseItem),
        OutputTextDelta(String),
        ReasoningContentDelta { delta: String, content_index: i64 },
        ReasoningSummaryDelta { delta: String, summary_index: i64 },
        RateLimits(RateLimitSnapshot),
        Completed { response_id: String, token_usage: Option<TokenUsage> },
    }
    ```

  - Error layering:
    - `codex-client` defines `TransportError`/`StreamError` (status codes, IO, timeouts).
    - `codex-api` defines `ApiError`, which wraps `TransportError` plus API-specific errors parsed from bodies and headers.
    - `codex-core` maps `ApiError` into existing `CodexErr` variants so downstream callers remain unchanged.
  - Aggregation strategies (today's `AggregateStreamExt`) live here as adapters (`Aggregated`, `Streaming`) that transform `ResponseStream<ApiEvent>` into the higher-level views used by `codex-core`.
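As a sketch of the event-name mapping, here is a payload-free classifier over the SSE names listed above. Only the three `response.*` names come from this plan; the `ApiEventKind`/`classify` names and the `Ignored` bucket are illustrative, and deserializing JSON payloads into `ResponseItem` etc. is omitted.

```rust
/// Payload-free view of the unified event mapping.
#[derive(Debug, PartialEq)]
pub enum ApiEventKind {
    OutputItemDone,
    Completed,
    Failed,
    /// Event names we deliberately drop or handle elsewhere (pings, deltas, ...).
    Ignored,
}

/// Map a raw SSE event name onto the unified event kind.
pub fn classify(event_name: &str) -> ApiEventKind {
    match event_name {
        "response.output_item.done" => ApiEventKind::OutputItemDone,
        "response.completed" => ApiEventKind::Completed,
        "response.failed" => ApiEventKind::Failed,
        _ => ApiEventKind::Ignored,
    }
}
```

In the real parser each arm would also deserialize the event's `data:` payload before emitting the corresponding `ApiEvent` variant.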
## Implementation Steps

- Create crates: add `codex-client` and `codex-api` (names keep the `codex-` prefix). Stub the lib files with feature flags/tests, and wire both crates into the workspace `Cargo.toml`.
- Extract API-level SSE + rate limits into `codex-api`:
  - Move the Responses SSE parser (`process_sse`), rate-limit parsing, and related tests from `core/src/client.rs` into `codex-api`, keeping the behavior identical.
  - Introduce `ApiEvent` (initially equivalent to `ResponseEvent`) and `ApiError`, and adjust the parser to emit those.
  - Provide test-only helpers for fixture streams (a replacement for `CODEX_RS_SSE_FIXTURE`) in `codex-api`.
- Lift the transport layer into `codex-client`:
  - Move `CodexHttpClient`/`CodexRequestBuilder`, UA/originator plumbing, and backoff helpers from `core/src/default_client.rs` into `codex-client` (or a thin wrapper on top of it).
  - Introduce `HttpTransport`, `Request`, `RetryPolicy`, `RetryOn`, and `run_with_retry` as described above.
  - Keep sandbox/no-proxy toggles behind injected configuration so `codex-client` stays generic and does not depend on Codex-specific env vars.
- Model the provider abstraction in `codex-api`:
  - Relocate `ModelProviderInfo` (base URL, env/header resolution, retry knobs, wire API enum) into `codex-api`, expressed in terms of `Provider` and `AuthProvider`.
  - Ensure provider logic handles:
    - URL building for Responses/Chat/Compact (including Azure special cases).
    - Static and env-based headers.
    - Per-provider retry and idle-timeout settings that map cleanly into `RetryPolicy`/`RetryOn`.
- API crate wiring:
  - Move `Prompt`, tool specs, `ResponsesApiRequest`, `TextControls`, and `ResponseEvent`/`ResponseStream` into `codex-api` under modules (`common`, `responses`, `chat`, `compact`), keeping public types stable or re-exported through `codex-core` as needed.
  - Rebuild the Responses and Chat clients on top of `HttpTransport` + `StreamClient`, reusing the shared retry + SSE helpers; keep aggregation adapters as reusable strategies instead of `ModelClient`-local logic.
  - Implement Compact on top of `UnaryClient` and the unary `execute` path with JSON deserialization, sharing the same retry policy.
  - Keep request builders symmetric: each client prepares a `Request` with a `serde_json::Value` body, attaches headers/auth via `AuthProvider`, and plugs in its parser (streaming clients) or deserializer (unary clients) while sharing retry/backoff configuration derived from `Provider`.
- Core integration layer:
  - Replace `core::ModelClient` internals with thin adapters that construct `codex-api` clients using `Config`, `AuthManager`, and `OtelEventManager`.
  - Keep the public `ModelClient` API and the `ResponseEvent`/`ResponseStream` types stable by re-exporting `codex-api` types or providing type aliases.
  - Preserve existing auth flows (including ChatGPT token refresh) inside `codex-core` or a thin adapter, using `AuthProvider` to surface bearer tokens to `codex-api` and handling 401/refresh semantics at this layer.
- Tests/migration:
  - Move unit tests for SSE parsing, retry/backoff decisions, and provider/header behavior into the new crates; keep integration tests in `core` using the compatibility layer.
  - Update fixtures to be consumed via test-only adapters in `codex-api`.
  - Run targeted `just fmt` and `just fix -p` for the touched crates, plus scoped `cargo test -p codex-client`, `cargo test -p codex-api`, and the existing `codex-core` suites.
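As a concrete shape for the fixture helper mentioned in the test/migration steps, one possible test-only builder (the `fixture_sse_body` name and the `(event, data)` tuple input are assumptions, not a committed API):

```rust
/// Build a raw SSE body from (event name, JSON data) pairs, for test fixtures.
/// Sketch of a test-only replacement for the `CODEX_RS_SSE_FIXTURE` mechanism.
pub fn fixture_sse_body(events: &[(&str, &str)]) -> String {
    let mut body = String::new();
    for &(event, data) in events {
        // Each SSE event is an `event:` line, a `data:` line, and a blank separator.
        body.push_str(&format!("event: {event}\ndata: {data}\n\n"));
    }
    body
}
```

Feeding the result through the SSE parser lets fixture tests exercise the same code path as live streams.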
## Design Decisions

- UA construction
  - `codex-client` exposes an optional UA suffix/provider hook (tiny feature) and remains unaware of the CLI; `codex-core` / the CLI compute the full UA (including `terminal::user_agent()`) and pass the suffix or builder down.
- Config vs provider
  - Most configuration stays in `codex-core`. `codex-api::Provider` only contains what is strictly required for HTTP (base URLs, query params, retry/timeout knobs, wire API), while higher-level knobs (reasoning defaults, verbosity flags, etc.) remain core concerns.
- Auth flow ownership
  - Auth flows (including ChatGPT token refresh) remain in `codex-core`. `AuthProvider` simply exposes already-fresh tokens/account IDs; 401 handling and refresh retries stay in the existing auth layer.
- Error enums
  - `codex-client` continues to define `TransportError`/`StreamError`. `codex-api` defines an `ApiError` (deriving `thiserror::Error`) that wraps `TransportError` and API-specific failures, and `codex-core` maps `ApiError` into existing `CodexErr` variants for callers.
- Streaming reconnection semantics
  - For now, mid-stream SSE failures are surfaced as errors and only the initial connection is retried via `run_with_retry`. We will revisit mid-stream reconnect/resume once the underlying APIs support cursor/idempotent event semantics.
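The error-layering decision can be sketched end to end without `thiserror`. Only the `TransportError`/`ApiError`/`CodexErr` layer names and the wrapping/mapping direction come from this plan; every variant and field below is an illustrative stand-in for the real enums.

```rust
use std::time::Duration;

// Layer 1 (`codex-client`): HTTP-level failures only.
#[derive(Debug, PartialEq)]
pub enum TransportError {
    Status(u16),
    Timeout(Duration),
    RetryLimit,
}

// Layer 2 (`codex-api`): wraps transport failures and adds errors
// parsed from API bodies/headers.
#[derive(Debug, PartialEq)]
pub enum ApiError {
    Transport(TransportError),
    ContextWindowExceeded,
    UsageLimitReached { plan_type: Option<String> },
}

impl From<TransportError> for ApiError {
    fn from(err: TransportError) -> Self {
        ApiError::Transport(err)
    }
}

// Layer 3 (`codex-core`): the existing error enum stays the public surface.
#[derive(Debug, PartialEq)]
pub enum CodexErr {
    ContextWindowExceeded,
    UsageLimitReached,
    Stream(String),
}

/// The mapping `codex-core` applies so downstream callers never see `ApiError`.
pub fn map_api_err(err: ApiError) -> CodexErr {
    match err {
        ApiError::ContextWindowExceeded => CodexErr::ContextWindowExceeded,
        ApiError::UsageLimitReached { .. } => CodexErr::UsageLimitReached,
        ApiError::Transport(t) => CodexErr::Stream(format!("{t:?}")),
    }
}
```

The `From<TransportError>` impl is what lets `codex-api` code use `?` on transport results while still enriching errors it understands.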