Multi-Agent Tracing

ThinkHive supports tracing multi-agent systems where multiple AI agents collaborate, delegate, or orchestrate work. This guide covers how to instrument, visualize, and analyze multi-agent workflows.

Overview

In a multi-agent architecture, a single user request may flow through several agents, for example an orchestrator that delegates to a research agent and an analysis agent.

ThinkHive captures these interactions as a single trace with spans representing each agent’s work, making it easy to identify bottlenecks, failures, and quality issues across the entire workflow.

Instrumenting Multi-Agent Systems

Parent-Child Span Relationships

Use parent span IDs to establish the agent hierarchy:

import { ThinkHive } from 'thinkhive-js';
 
const th = new ThinkHive({
  apiKey: process.env.THINKHIVE_API_KEY,
  endpoint: 'https://app.thinkhive.ai',
  serviceName: 'multi-agent-system'
});
 
// Start the orchestrator trace
const trace = th.startTrace({ name: 'customer-inquiry' });
 
// Orchestrator span
const orchestratorSpan = trace.startSpan({
  name: 'orchestrator',
  type: 'chain',
  attributes: { 'agent.role': 'orchestrator' }
});
 
// Research agent (child of orchestrator)
const researchSpan = trace.startSpan({
  name: 'research-agent',
  type: 'chain',
  parentSpanId: orchestratorSpan.id,
  attributes: { 'agent.role': 'researcher' }
});
 
// LLM call within research agent
const llmSpan = trace.startSpan({
  name: 'gpt-4-research',
  type: 'llm',
  parentSpanId: researchSpan.id,
  attributes: {
    'llm.model': 'gpt-4o',
    'llm.provider': 'openai'
  }
});
 
llmSpan.end();
researchSpan.end();
 
// Analysis agent (child of orchestrator)
const analysisSpan = trace.startSpan({
  name: 'analysis-agent',
  type: 'chain',
  parentSpanId: orchestratorSpan.id,
  attributes: { 'agent.role': 'analyst' }
});
 
analysisSpan.end();
orchestratorSpan.end();
 
await trace.end();
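To see how `parentSpanId` values imply an agent hierarchy, here is a minimal sketch that rebuilds a tree from flat span records. The record shape (`id`, `name`, `parentSpanId`) and the helpers `buildSpanTree` and `renderTree` are assumptions made for illustration; they are not part of the ThinkHive SDK or an official export format.

```javascript
// Hypothetical flat span records; field names mirror the example above
// but this is not an official ThinkHive export format.
const spans = [
  { id: 's1', name: 'orchestrator', parentSpanId: null },
  { id: 's2', name: 'research-agent', parentSpanId: 's1' },
  { id: 's3', name: 'gpt-4-research', parentSpanId: 's2' },
  { id: 's4', name: 'analysis-agent', parentSpanId: 's1' }
];

// Attach each span to its parent and return the root nodes.
function buildSpanTree(flatSpans) {
  const nodes = new Map(flatSpans.map(s => [s.id, { ...s, children: [] }]));
  const roots = [];
  for (const node of nodes.values()) {
    const parent = node.parentSpanId ? nodes.get(node.parentSpanId) : undefined;
    if (parent) parent.children.push(node);
    else roots.push(node); // no parent, or parent not in this batch
  }
  return roots;
}

// Render the tree as indented lines, one line per span.
function renderTree(nodes, depth = 0) {
  return nodes.flatMap(n => [
    '  '.repeat(depth) + n.name,
    ...renderTree(n.children, depth + 1)
  ]);
}

const lines = renderTree(buildSpanTree(spans));
console.log(lines.join('\n'));
```

A span whose `parentSpanId` is missing or does not match any known span is treated as a root here, which makes a forgotten parent link easy to spot in the output.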

Framework Integrations

ThinkHive auto-instruments popular multi-agent frameworks:

Framework	Auto-Instrumentation	Manual Setup
LangGraph	Automatic span creation	Pass trace context
CrewAI	Automatic agent tracing	Configure crew callbacks
AutoGen	Automatic message tracing	Register ThinkHive observer
LangChain	Automatic chain tracing	Use ThinkHive callback handler

See the JavaScript SDK and Python SDK docs for framework-specific integration examples.

Visualizing Multi-Agent Traces

In the ThinkHive dashboard, multi-agent traces can be explored in three views:

Timeline View — Gantt-chart style view showing span durations and concurrency
Tree View — Nested hierarchy showing agent relationships
Flame Graph — Identify time-consuming operations across agents

Key Metrics for Multi-Agent Systems

Metric	Description
Total Latency	End-to-end time for the full workflow
Agent Contribution	Time each agent spent processing
Parallelism Ratio	How much concurrent work occurred
Handoff Latency	Time spent transitioning between agents
Token Usage	LLM tokens consumed per agent

Analyzing Multi-Agent Quality

Per-Agent Evaluation

Run ThinkEval suites scoped to specific agents:

curl -X POST "https://app.thinkhive.ai/api/v1/evaluation/run" \
  -H "Authorization: Bearer thk_your_api_key" \
  -H "Content-Type: application/json" \
  -d '{
    "suiteId": "suite_research_quality",
    "filter": {
      "spanAttributes": { "agent.role": "researcher" }
    }
  }'
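The same request can be built from JavaScript when evaluations need to be triggered for several agent roles. This sketch only constructs the request; the endpoint, headers, and body fields are taken from the curl example above, while the helper `buildEvaluationRequest` and the `suite_analysis_quality` suite ID are hypothetical. The response format is not covered here, so it is not parsed.

```javascript
// Build (but do not send) a per-agent evaluation request.
function buildEvaluationRequest(suiteId, agentRole, apiKey) {
  return {
    url: 'https://app.thinkhive.ai/api/v1/evaluation/run',
    options: {
      method: 'POST',
      headers: {
        Authorization: `Bearer ${apiKey}`,
        'Content-Type': 'application/json'
      },
      body: JSON.stringify({
        suiteId,
        filter: { spanAttributes: { 'agent.role': agentRole } }
      })
    }
  };
}

// One suite per agent role; the second suite ID is a made-up placeholder.
const suitesByRole = {
  researcher: 'suite_research_quality',
  analyst: 'suite_analysis_quality'
};

const apiKey = process.env.THINKHIVE_API_KEY ?? 'thk_your_api_key';
const requests = Object.entries(suitesByRole).map(([role, suiteId]) =>
  buildEvaluationRequest(suiteId, role, apiKey)
);

for (const req of requests) {
  console.log(req.options.method, req.url, req.options.body);
}
```

Each request can then be sent with `fetch(req.url, req.options)`, which is available globally in Node.js 18 and later.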

Cross-Agent Cases

A case can span multiple agents. When a failure in one agent cascades into others, ThinkHive records the affected agents and the agent identified as the root cause:

{
  "id": "case_multi_001",
  "title": "Research agent retrieval failure causes incorrect analysis",
  "affectedAgents": ["research-agent", "analysis-agent"],
  "rootCause": "research-agent",
  "pattern": "Empty retrieval results from research agent propagate to analyst"
}
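To make the relationship between `affectedAgents` and `rootCause` concrete, the sketch below derives both from per-agent outcomes using a simple "earliest failure" heuristic. The heuristic, the input record shape, and the helper `summarizeCascade` are assumptions for illustration only; this is not a description of how ThinkHive determines root causes.

```javascript
// Hypothetical per-agent outcomes for a single trace.
const agentResults = [
  { agent: 'orchestrator', startMs: 0, status: 'ok' },
  { agent: 'research-agent', startMs: 50, status: 'error' },
  { agent: 'analysis-agent', startMs: 400, status: 'error' }
];

// Assumed heuristic: the earliest failing agent is the likely root cause,
// and every failing agent counts as affected.
function summarizeCascade(results) {
  const failed = results
    .filter(r => r.status === 'error')
    .sort((a, b) => a.startMs - b.startMs);
  if (failed.length === 0) return null; // nothing failed, no case to report
  return {
    affectedAgents: failed.map(r => r.agent),
    rootCause: failed[0].agent
  };
}

const cascade = summarizeCascade(agentResults);
console.log(cascade);
```

An ordering-based heuristic like this can misattribute failures when agents run in parallel, so it is best read as a starting point for investigation rather than a verdict.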

Best Practices

Use descriptive span names — include the agent role (e.g., research-agent, orchestrator)
Set agent.role attributes — enables filtering and per-agent analytics
Trace agent-to-agent communication — capture messages passed between agents as span events
Monitor parallelism — verify that agents run concurrently when expected
Set per-agent evaluation criteria — different agents may have different quality requirements

Next Steps

Core Concepts — Understand runs, traces, and spans
ThinkEval — Evaluate agent quality
Cases & Fixes — Investigate multi-agent failures
JavaScript SDK — SDK integration details
