Shadow Testing Guide
Test proposed fixes against real traffic without affecting production.
What is Shadow Testing?
Shadow testing runs a proposed fix alongside your production agent:
- Real queries are sent to both the production version and the proposed fix
- The two responses are compared
- Users only receive the production response, so they are not affected
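
The three points above can be sketched in a few lines. This is an illustrative sketch only, not the platform's implementation: `handleWithShadow`, the `agents` object, and the exact-match `same` flag are hypothetical names and choices introduced here, and a real comparison would likely use a grader rather than string equality.

```javascript
// Mirror one query to both versions (hypothetical helper, not a platform API).
// Only the production response is returned to the caller; the candidate
// response is recorded so the two can be compared later.
async function handleWithShadow(query, agents, shadowLog) {
  const production = await agents.production(query);
  let candidate = null;
  let error = null;
  try {
    candidate = await agents.candidate(query);
  } catch (err) {
    // A failing candidate must never change what the user receives.
    error = String(err);
  }
  // Naive comparison for illustration: exact string match.
  shadowLog.push({ query, production, candidate, error, same: candidate === production });
  return production;
}

// Stand-in agents for the example; real agents would call a model.
const exampleAgents = {
  production: async (query) => `prod answer to: ${query}`,
  candidate: async (query) => `candidate answer to: ${query}`,
};

const exampleLog = [];
handleWithShadow('How do I enable 2FA?', exampleAgents, exampleLog).then((response) => {
  console.log(response);   // the production answer, which is all the user sees
  console.log(exampleLog); // one recorded comparison
});
```

The sketch awaits the candidate before returning to keep the code short. A real setup would more likely run the candidate off the request path so that it cannot add latency.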
Workflow
Identify a failure cluster

    const clusters = await api.get('/api/v1/cases', {
      params: { agentId: 'agent_123', status: 'open' }
    });
    const targetCluster = clusters.data[0];
    // { id: 'case_001', title: 'Auth guidance errors', traceCount: 47 }

Generate a fix

    const fix = await api.post('/api/v1/fixes', {
      caseId: targetCluster.id, // 'case_001' in this example
      type: 'prompt_update',
      description: 'Add 2FA verification step',
      changes: {
        prompt: 'Updated system prompt with 2FA instructions'
      }
    });

Run shadow test

    const test = await api.post('/api/v1/shadow-tests', {
      fixId: fix.data.id,
      config: {
        sampleSize: 50,
        comparisonMode: 'side_by_side'
      }
    });

Review results

    const results = await api.get(`/api/v1/shadow-tests/${test.data.id}/results`);
    console.log(results.data);
    // {
    //   improved: 42,
    //   unchanged: 5,
    //   regressed: 3,
    //   improvementRate: 0.84,
    //   recommendation: 'Safe to deploy'
    // }

Apply fix

    // Apply the fix only if more than 80% of the sampled traces improved.
    if (results.data.improvementRate > 0.8) {
      await api.post(`/api/v1/fixes/${fix.data.id}/apply`);
    }

Configuration Options

| Option | Description |
|---|---|
| sampleSize | Number of traces to test |
| comparisonMode | side_by_side or sequential |
| metrics | Metrics to compare |
| threshold | Minimum improvement required |
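
As a rough illustration of how the four options might be combined into one `config` object, the sketch below validates values before they are sent. The guide does not specify the value shapes for `metrics` and `threshold`, so treating `metrics` as an array of metric names and `threshold` as a fraction between 0 and 1 is an assumption. The `buildShadowTestConfig` helper and the `'accuracy'` metric name are hypothetical.

```javascript
// Allowed comparisonMode values, taken from the options table.
const COMPARISON_MODES = ['side_by_side', 'sequential'];

// Hypothetical helper: build a shadow test config and reject obviously bad values.
// Assumed shapes: metrics is an array of names, threshold is a fraction in [0, 1].
function buildShadowTestConfig({
  sampleSize = 50,
  comparisonMode = 'side_by_side',
  metrics = [],
  threshold = 0.8,
} = {}) {
  if (!Number.isInteger(sampleSize) || sampleSize <= 0) {
    throw new RangeError('sampleSize must be a positive integer');
  }
  if (!COMPARISON_MODES.includes(comparisonMode)) {
    throw new RangeError(`comparisonMode must be one of: ${COMPARISON_MODES.join(', ')}`);
  }
  if (!Array.isArray(metrics)) {
    throw new TypeError('metrics must be an array of metric names');
  }
  // Written as a positive range check so that NaN is rejected as well.
  if (!(typeof threshold === 'number' && threshold >= 0 && threshold <= 1)) {
    throw new RangeError('threshold must be a number between 0 and 1');
  }
  return { sampleSize, comparisonMode, metrics, threshold };
}

const exampleConfig = buildShadowTestConfig({
  sampleSize: 100,
  comparisonMode: 'sequential',
  metrics: ['accuracy'], // hypothetical metric name
  threshold: 0.85,
});
console.log(exampleConfig);
```

Validating on the client side surfaces a misspelled mode or an out-of-range threshold before a test run is started.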
Best Practices
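- Test with samples that are representative of production traffic
- Include edge cases, not only the most common queries
- Set the improvement threshold before running the test
- Review regressed traces closely before applying a fix, even when the improvement rate is high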
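
One way to act on the last two points is to gate deployment on both the improvement rate and the regression rate. The sketch below recomputes the rates from the counts in the results example (42 improved, 5 unchanged, 3 regressed). The `evaluateShadowResults` helper and the `maxRegressionRate` guard are hypothetical additions, not documented options.

```javascript
// Hypothetical helper: summarize shadow test counts and decide whether the
// fix clears the bar. maxRegressionRate is an assumed extra guard.
function evaluateShadowResults(
  { improved, unchanged, regressed },
  { threshold = 0.8, maxRegressionRate = 0.1 } = {}
) {
  const total = improved + unchanged + regressed;
  if (total === 0) {
    // No traces were compared, so there is no evidence to deploy on.
    return { total, improvementRate: 0, regressionRate: 0, deploy: false };
  }
  const improvementRate = improved / total;
  const regressionRate = regressed / total;
  const deploy = improvementRate > threshold && regressionRate <= maxRegressionRate;
  return { total, improvementRate, regressionRate, deploy };
}

// Counts from the results example: 42 / 50 = 0.84 improved, 3 / 50 = 0.06 regressed.
const verdict = evaluateShadowResults({ improved: 42, unchanged: 5, regressed: 3 });
console.log(verdict);
```

With the default limits the example counts pass both checks. Tightening `maxRegressionRate` to 0.05 would block the same fix because of its three regressed traces.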
Next Steps
- Deployment - Deploy to production
- API Reference - Issues and fixes API