Version: 0.4

Tracing Overview

CubePi emits OpenTelemetry spans that follow the GenAI Semantic Conventions v1.41 so any OTel-compatible backend (Jaeger, Tempo, Honeycomb, Datadog, AWS X-Ray, Azure Monitor, …) can ingest agent runs without custom instrumentation.

Attach a Tracer to an Agent and every prompt produces a tree of spans you can pivot, query, and join with the rest of your service traces:

invoke_agent <agent_name>              [INTERNAL]  one per agent.prompt()
└── cubepi.turn                        [INTERNAL]  one per LLM round-trip
    ├── chat <model>                   [CLIENT]    the LLM call itself
    └── execute_tool <tool_name>       [INTERNAL]  each tool invocation
        └── tools/call <tool_name>     [CLIENT]    (MCP tools only)

Each layer carries standard gen_ai.* attributes — gen_ai.operation.name, gen_ai.request.model, gen_ai.provider.name, gen_ai.usage.input_tokens, gen_ai.usage.output_tokens, gen_ai.response.finish_reasons, …

What ships out of the box

  • Tracer — builds an SDK TracerProvider, attaches one BatchSpanProcessor per exporter, wires the cubepi event stream into spans.
  • Meter — sibling for OTel histograms: gen_ai.client.operation.duration, gen_ai.client.operation.time_to_first_chunk, gen_ai.client.token.usage.
  • JsonlSpanExporter — write one JSON line per span to ./cubepi-traces/<date>/<run_id>.jsonl. Useful for local dev and offline debugging; works with any OTel viewer that reads JSONL.
  • OTLP — bring your own exporter via opentelemetry-exporter-otlp-proto-http (HTTP) or …-grpc and hand it to Tracer(exporters=[…]).
  • W3C trace context propagation — outgoing MCP calls inject the active traceparent as an HTTP header so an instrumented MCP server can continue the trace.
  • tracer.attached(agent) / meter.attached(agent) — async context managers that RAII-wrap attach/detach, so cleanup is one async with block instead of an explicit try/finally.
  • atexit flush hook: Tracer(atexit_flush=True) (the default) registers a process-exit handler that synchronously flushes any buffered spans, so callers who forget await tracer.shutdown() still get their spans exported on normal exit, Ctrl-C, or an unhandled exception.
  • tracing_context() — set per-run tags and metadata (cubepi.tags = ("beta-arm",), cubepi.metadata.user_id = "u-42") via a contextvar-scoped block. Concurrent agents each see their own values.

What it costs

  • One pure-Python recorder per agent run subscribing to the agent's event stream and the provider's listener registry — no monkey-patching, no extra threads.
  • One OTel SDK span per layer above. BatchSpanProcessor batches export off the hot path.
  • No payloads are recorded by default. gen_ai.input.messages, gen_ai.output.messages, raw request/response, and tool args/results all require explicit opt-in via record_content=True so you don't accidentally ship PII to your backend. See Content & Redaction.

When to use each piece

You want                                            | Use
----------------------------------------------------+------------------------------------------------------------------
Trace one local agent run and inspect a JSONL file  | Tracer + JsonlSpanExporter
Ship to Jaeger / Tempo / Honeycomb / Datadog        | Tracer + OTLP exporter
Latency + token histograms next to the spans        | Meter alongside Tracer
Record prompts / model outputs for evaluation       | Tracer(record_content=True)
Redact PII before it leaves the process             | Tracer(redact=…)
Tag runs with user_id / session_id / A-B arm        | tracing_context(tags=…, metadata=…)
One-liner cleanup, no try/finally                   | async with tracer.attached(agent): …
Forget to call shutdown() and not lose spans        | Tracer(atexit_flush=True) (default)
Continue a trace from an upstream service           | Tracer(resource=…) + W3C traceparent (auto for MCP, manual for HTTP)
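
For the manual HTTP case in the last row, it helps to know what the traceparent header actually contains. The following is a minimal standard-library sketch of the W3C Trace Context version-00 format (version, 16-byte trace id, 8-byte parent span id, flags, all lowercase hex). The helper names format_traceparent and parse_traceparent are hypothetical and are not part of CubePi; in a service that already uses the OpenTelemetry SDK, its propagator API would normally handle this instead of hand-rolled code.

```python
import re

# version "00" layout: 2 hex - 32 hex (trace id) - 16 hex (span id) - 2 hex (flags)
_TRACEPARENT_RE = re.compile(
    r"([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})"
)


def format_traceparent(trace_id: int, span_id: int, sampled: bool = True) -> str:
    """Render a version-00 traceparent header value."""
    flags = 0x01 if sampled else 0x00
    return f"00-{trace_id:032x}-{span_id:016x}-{flags:02x}"


def parse_traceparent(header: str):
    """Return (trace_id, span_id, sampled), or None if the header is invalid.

    Simplification: only version "00" is accepted; a full implementation
    would also tolerate higher versions as the spec describes.
    """
    match = _TRACEPARENT_RE.fullmatch(header.strip())
    if match is None:
        return None
    version, trace_hex, span_hex, flags_hex = match.groups()
    if version != "00":
        return None
    trace_id, span_id = int(trace_hex, 16), int(span_hex, 16)
    if trace_id == 0 or span_id == 0:  # all-zero ids are invalid per the spec
        return None
    return trace_id, span_id, bool(int(flags_hex, 16) & 0x01)


# Example value taken from the W3C Trace Context specification.
header = "00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01"
parsed = parse_traceparent(header)
print(parsed is not None and format_traceparent(*parsed) == header)
```

An upstream service would send this value in the traceparent request header; the receiving side parses it and starts its spans as children of the given trace id and span id.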

Where to go next