### Summary Propagate trace context originating at app-server RPC method handlers -> codex core submission loop (so this includes spans such as `run_turn`!). This implements PR 2 of the app-server tracing rollout. This also removes the old lower-level env-based reparenting in core so explicit request/submission ancestry wins instead of being overridden by ambient `TRACEPARENT` state. ### What changed - Added `trace: Option<W3cTraceContext>` to codex_protocol::Submission - Taught `Codex::submit()` / `submit_with_id()` to automatically capture the current span context when constructing or forwarding a submission - Wrapped the core submission loop in a submission_dispatch span parented from Submission.trace - Warn on invalid submission trace carriers and ignore them cleanly - Removed the old env-based downstream reparenting path in core task execution - Stopped OTEL provider init from implicitly attaching env trace context process-wide - Updated mcp-server Submission call sites for the new field Added focused unit tests for: - capturing trace context into Submission - preferring `Submission.trace` when building the core dispatch span ### Why PR 1 gave us consistent inbound request spans in app-server, but that only covered the transport boundary. For long-running work like turns and reviews, the important missing piece was preserving ancestry after the request handler returns and core continues work on a different async path. This change makes that handoff explicit and keeps the parentage rules simple: - app-server request span sets the current context - `Submission.trace` snapshots that context - core restores it once, at the submission boundary - deeper core spans inherit naturally That also lets us stop relying on env-based reparenting for this path, which was too ambient and could override explicit ancestry. |
||
|---|---|---|
| .. | ||
| src | ||
| tests | ||
| BUILD.bazel | ||
| Cargo.toml | ||
| README.md | ||
codex-otel
codex-otel is the OpenTelemetry integration crate for Codex. It provides:
- Trace/log/metrics exporters and tracing subscriber layers (
codex_otel::otel_provider). - A structured event helper (
codex_otel::OtelManager). - OpenTelemetry metrics support via OTLP exporters (
codex_otel::metrics). - A metrics facade on
OtelManagerso tracing + metrics share metadata.
Tracing and logs
Create an OTEL provider from OtelSettings. The provider also configures
metrics (when enabled), then attach its layers to your tracing_subscriber
registry:
use codex_otel::config::OtelExporter;
use codex_otel::config::OtelHttpProtocol;
use codex_otel::config::OtelSettings;
use codex_otel::otel_provider::OtelProvider;
use tracing_subscriber::prelude::*;
let settings = OtelSettings {
environment: "dev".to_string(),
service_name: "codex-cli".to_string(),
service_version: env!("CARGO_PKG_VERSION").to_string(),
codex_home: std::path::PathBuf::from("/tmp"),
exporter: OtelExporter::OtlpHttp {
endpoint: "https://otlp.example.com".to_string(),
headers: std::collections::HashMap::new(),
protocol: OtelHttpProtocol::Binary,
tls: None,
},
trace_exporter: OtelExporter::OtlpHttp {
endpoint: "https://otlp.example.com".to_string(),
headers: std::collections::HashMap::new(),
protocol: OtelHttpProtocol::Binary,
tls: None,
},
metrics_exporter: OtelExporter::None,
};
if let Some(provider) = OtelProvider::from(&settings)? {
let registry = tracing_subscriber::registry()
.with(provider.logger_layer())
.with(provider.tracing_layer());
registry.init();
}
OtelManager (events)
OtelManager adds consistent metadata to tracing events and helps record
Codex-specific events.
use codex_otel::OtelManager;
let manager = OtelManager::new(
conversation_id,
model,
slug,
account_id,
account_email,
auth_mode,
log_user_prompts,
terminal_type,
session_source,
);
manager.user_prompt(&prompt_items);
Metrics (OTLP or in-memory)
Modes:
- OTLP: exports metrics via the OpenTelemetry OTLP exporter (HTTP or gRPC).
- In-memory: records via
opentelemetry_sdk::metrics::InMemoryMetricExporterfor tests/assertions; callshutdown()to flush.
codex-otel also provides OtelExporter::Statsig, a shorthand for exporting OTLP/HTTP JSON metrics
to Statsig using Codex-internal defaults.
Statsig ingestion (OTLP/HTTP JSON) example:
use codex_otel::config::{OtelExporter, OtelHttpProtocol};
let metrics = MetricsClient::new(MetricsConfig::otlp(
"dev",
"codex-cli",
env!("CARGO_PKG_VERSION"),
OtelExporter::OtlpHttp {
endpoint: "https://api.statsig.com/otlp".to_string(),
headers: std::collections::HashMap::from([(
"statsig-api-key".to_string(),
std::env::var("STATSIG_SERVER_SDK_SECRET")?,
)]),
protocol: OtelHttpProtocol::Json,
tls: None,
},
))?;
metrics.counter("codex.session_started", 1, &[("source", "tui")])?;
metrics.histogram("codex.request_latency", 83, &[("route", "chat")])?;
In-memory (tests):
let exporter = InMemoryMetricExporter::default();
let metrics = MetricsClient::new(MetricsConfig::in_memory(
"test",
"codex-cli",
env!("CARGO_PKG_VERSION"),
exporter.clone(),
))?;
metrics.counter("codex.turns", 1, &[("model", "gpt-5.1")])?;
metrics.shutdown()?; // flushes in-memory exporter
Shutdown
OtelProvider::shutdown()stops the OTEL exporter.OtelManager::shutdown_metrics()flushes and shuts down the metrics provider.
Both are optional because drop performs best-effort shutdown, but calling them explicitly gives deterministic flushing (or a shutdown error if flushing does not complete in time).