Skip to main content
Version: Next 🚧

Inspecting traces with cubepi trace

The JsonlSpanExporter writes one file per trace under ./cubepi-traces/<date>/<trace_id>.jsonl. The cubepi trace CLI (provided by the trace-cli extra) reads those files so you can see exactly what a run did — which LLM and tool calls fired, in what order, what each returned, where it errored, and the token counts — without re-running it.

pip install 'cubepi[trace-cli]' # or: uv sync --extra trace-cli
cubepi trace --help

--dir defaults to ./cubepi-traces; pass --dir <path> if your traces live elsewhere. Each file is one trace: the run plus any nested subagent runs, which inherit the parent's trace_id and so land in the same file.

ls — list recent traces

cubepi trace ls # newest first; -n N to limit
columnmeaning
startedtrace start time (UTC)
trace_idthe id you pass to view / follow / stats
spansspan count for the whole trace (incl. subagents)
statusok or error
durationwall-clock span of the trace
inputthe user's prompt, to identify the run

Filter by run metadata (--meta)

If the host stamped run-scoped metadata onto the trace (via tracing_context(metadata=…) — e.g. cubebox records conversation_id, user_id, org_id, workspace_id on the root invoke_agent span), filter to just those traces:

cubepi trace ls --meta conversation_id=conv_123
cubepi trace ls --meta user_id=usr_9 --meta org_id=org_1 # repeatable = AND, exact match

Each --meta KEY=VALUE is matched exactly against the trace's root metadata; repeating the flag ANDs the conditions.

To display metadata values as columns (rather than only filter by them), add --show-meta KEY[,KEY…]:

cubepi trace ls --show-meta conversation_id,user_id
cubepi trace ls --meta org_id=org_1 --show-meta conversation_id # filter + show

(Or see all of a single trace's metadata with cubepi trace view <id> -v.)

view — render a trace as a span tree

A trace-id prefix is enough (the table truncates ids); an ambiguous prefix lists candidates.

cubepi trace view 1cd97cdb
trace
└── invoke_agent 14425.8ms [0x1cd97cdb]
├── cubepi.turn 1283.1ms [0x5cfda93e]
│ ├── chat deepseek-v4-flash 1208.7ms tok 6845/68 [0x0d130229]
│ └── execute_tool subagent 9610.2ms subagent [0x38bdd10a]
│ └── invoke_agent 9601.0ms [0x8094f99b] ← subagent run, nested
│ └── cubepi.turn 9598.4ms [0x57c5cfc7]
│ ├── chat deepseek-v4-flash 1190.3ms [0x8205ca6b]
│ └── execute_tool web_search 6500.2ms web_search [0xca4e59fc]
└── cubepi.turn 491.9ms ERROR [0xce25f242]
└── chat deepseek-v4-flash 427.2ms ERROR [0x0bff68ec]
└── error: Error code: 400 - ... `tool_use` ids were found without
`tool_result` blocks immediately after: call_01_...

Read it top-down: invoke_agent (a run) → cubepi.turn (one agent-loop turn) → chat <model> (an LLM call, with tok <input>/<output>) and execute_tool <name> (a tool call). A subagent shows up as execute_tool subagent with its own invoke_agent → cubepi.turn → … nested directly beneath it. The [0x…] suffix on each node is the span's span_id — grep it in the raw JSONL to inspect that exact span. Errors print inline under the failing span.

Flags:

cubepi trace view <id> --content # expand gen_ai prompts / tool args / results
cubepi trace view <id> -v # expand ALL span attributes (verbose, large)

--content requires the run to have been recorded with record_content=True (see Content & Redaction).

follow — watch a trace live

cubepi trace follow <id> # polls as spans complete; good for a run in progress

stats — aggregate across traces

cubepi trace stats --by model # latency p50/p95, error rate, tokens
cubepi trace stats --by tool --since 2026-05-20

stats also accepts --meta KEY=VALUE (same semantics as ls) to aggregate only the traces that match — e.g. latency / error-rate / tokens for one user or conversation:

cubepi trace stats --by model --meta user_id=usr_9
cubepi trace stats --by tool --meta conversation_id=conv_123

convert — reconstruct an API request body

When you need to replay a specific LLM call — reproduce a failure, test a prompt change, or run a raw curl against the same context — convert reads a recorded chat span and outputs the complete request body.

Requires record_content=True.

# Default: last chat span in the trace, OpenAI JSON format
cubepi trace convert <trace_id>

# Select which LLM call to reconstruct
cubepi trace convert <trace_id> --turn 2 # 2nd chat span (1-indexed)
cubepi trace convert <trace_id> --span 0xbb7eb1 # by span_id prefix (from `view`)

# Output formats
cubepi trace convert <trace_id> --format openai # default — JSON request body
cubepi trace convert <trace_id> --format anthropic # Anthropic Messages API body
cubepi trace convert <trace_id> --format curl # shell-executable curl command

The [0x…] span id from view output goes directly into --span:

├── chat kimi-k2.6 31704.5ms [0xbb7eb192] ← paste as: --span 0xbb7eb1
├── chat kimi-k2.6 32420.2ms [0x7c76f48d] ← or: --span 0x7c76f4

The reconstructed body includes the full conversation history, the system prompt, all tool definitions, and request parameters (model, max_tokens, temperature). Pipe to python -m json.tool, jq, or directly to a replay script.

Beyond the CLI

The files are plain JSONL — one span per line — so you can parse them directly (jq, python -c) to pull a specific attribute (gen_ai.usage.*, gen_ai.tool.call.result, gen_ai.input.messages, …). Error detail lives in a span event named gen_ai.client.operation.exception.

For AI agents

A bundled cubepi-trace skill drives this CLI for debugging ("why did the run end with no reply?", "the tool result is wrong"). It encodes the fast path (lsview <prefix>) and the token/cache-rate conventions.

npx skills add cubeplexai/cubepi@cubepi-trace -a claude-code