Skip to main content
Version: 0.8

Tracing Overview

CubePi emits OpenTelemetry spans that follow the GenAI Semantic Conventions v1.41 so any OTel-compatible backend (Jaeger, Tempo, Honeycomb, Datadog, AWS X-Ray, Azure Monitor, …) can ingest agent runs without custom instrumentation.

Attach a Tracer to an Agent and every prompt produces a tree of spans you can pivot, query, and join with the rest of your service traces:

trace
└── invoke_agent 14425.8ms [0x1cd97cdb] ← one per agent.prompt()
├── cubepi.turn 1283.1ms [0x5cfda93e] ← one per LLM round-trip
│ ├── chat deepseek-v4-flash 1208.7ms tok 6845/68 [0x0d130229]
│ └── execute_tool subagent 9610.2ms subagent [0x38bdd10a]
│ └── invoke_agent 9601.0ms [0x8094f99b] ← subagent run, nested
│ └── cubepi.turn 9598.4ms [0x57c5cfc7]
│ ├── chat deepseek-v4-flash 1190.3ms [0x8205ca6b]
│ └── execute_tool web_search 6500.2ms web_search [0xca4e59fc]
└── cubepi.turn 491.9ms ERROR [0xce25f242]
└── chat deepseek-v4-flash 427.2ms ERROR [0x0bff68ec]
└── error: Error code: 400 - ... `tool_use` ids were found without
`tool_result` blocks immediately after: call_01_...

Each layer carries standard gen_ai.* attributes — gen_ai.operation.name, gen_ai.request.model, gen_ai.provider.name, gen_ai.usage.input_tokens, gen_ai.usage.output_tokens, gen_ai.response.finish_reasons, …

What ships out of the box

  • Tracer — builds an SDK TracerProvider, attaches one BatchSpanProcessor per exporter, wires the CubePi event stream into spans.
  • Meter — sibling for OTel histograms: gen_ai.client.operation.duration, gen_ai.client.operation.time_to_first_chunk, gen_ai.client.token.usage.
  • JsonlSpanExporter — write one JSON line per span to ./cubepi-traces/<date>/<trace_id>.jsonl. Files are sharded by trace_id, so one file holds a whole trace — the run plus any nested subagent runs (which inherit the trace). Useful for local dev and offline debugging; works with any OTel viewer that reads JSONL, and with the cubepi trace CLI.
  • OTLP — bring your own exporter via opentelemetry-exporter-otlp-proto-http (HTTP) or …-grpc and hand it to Tracer(exporters=[…]).
  • W3C trace context propagation — outgoing MCP calls inject the active traceparent as an HTTP header so an instrumented MCP server can continue the trace.
  • tracer.attached(agent) / meter.attached(agent) — async context managers that RAII-wrap attach/detach, so cleanup is one async with block instead of an explicit try/finally.
  • atexit flush hookTracer(atexit_flush=True) (default) registers a process-exit handler that sync-flushes any buffered spans, so callers who forget await tracer.shutdown() still get their spans exported on normal exit / Ctrl-C / unhandled exception.
  • tracing_context() — set per-run tags and metadata (cubepi.tags = ("beta-arm",), cubepi.metadata.user_id = "u-42") via a contextvar-scoped block. Concurrent agents each see their own values.
  • Middleware-owned providers traced automatically — middleware that exposes extra BaseProvider instances via Middleware.providers() has its listener registry wired by Recorder.attach() so those providers' chat spans land in the same trace as the agent's main call. CompactionMiddleware uses this to surface its summarizer LLM call as a chat <summary-model> span nested under cubepi.compaction.summarize — see the compaction guide.

What it costs

  • One pure-Python recorder per agent run subscribing to the agent's event stream and the provider's listener registry — no monkey-patching, no extra threads.
  • One OTel SDK span per layer above. BatchSpanProcessor batches export off the hot path.
  • No payloads are recorded by default. gen_ai.input.messages, gen_ai.output.messages, raw request/response, and tool args/results all require explicit opt-in via record_content=True so you don't accidentally ship PII to your backend. See Content & Redaction.

When to use each piece

You wantUse
Trace one local agent run and inspect a JSONL fileTracer + JsonlSpanExporter
Ship to Jaeger / Tempo / Honeycomb / DatadogTracer + OTLP exporter
Latency + token histograms next to the spansMeter alongside Tracer
Record prompts / model outputs for evaluationTracer(record_content=True)
Redact PII before it leaves the processTracer(redact=…)
Tag runs with user_id / session_id / A-B armtracing_context(tags=…, metadata=…)
One-liner cleanup, no try/finallyasync with tracer.attached(agent): …
Forget to call shutdown() and not lose spansTracer(atexit_flush=True) (default)
Continue a trace from an upstream serviceTracer(resource=…) + W3C traceparent (auto for MCP, manual for HTTP)

Where to go next