Shadow Mode
Shadow mode routes a percentage of real traffic to a candidate model. Compare cost and quality before fully switching.
How it works
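As an illustration only, percentage-based routing can be sketched as a weighted coin flip per call. This is not Nolma's implementation: the function name `pick_model` and its parameters are hypothetical, and the model names are taken from the comparison table on this page.

```python
import random


def pick_model(primary: str, candidate: str, shadow_pct: float,
               rng: random.Random) -> str:
    """Return the candidate for roughly shadow_pct percent of calls.

    Hypothetical helper for illustration; not part of any Nolma SDK.
    """
    if not 0 <= shadow_pct <= 100:
        raise ValueError("shadow_pct must be between 0 and 100")
    # rng.random() is in [0, 1), so 0 never routes to the candidate
    # and 100 always does.
    return candidate if rng.random() * 100 < shadow_pct else primary


# Simulate 10,000 calls at a 10% shadow share.
rng = random.Random(42)
choices = [pick_model("gpt-4o", "gpt-4o-mini", 10, rng) for _ in range(10_000)]
share = choices.count("gpt-4o-mini") / len(choices)
print(f"Candidate share: {share:.1%}")
```

With a 10% setting, about one call in ten reaches the candidate, so the candidate's sample grows much more slowly than the primary's.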
Setting up shadow mode
Dashboard → Lens → Shadow Mode → Select agent → Select candidate model → Set traffic % (default 10%) → Enable
Reading results
After 50+ shadow calls, the comparison table appears:

| Metric | Primary (gpt-4o) | Candidate (gpt-4o-mini) |
|---|---|---|
| Cost/call | $0.0241 | $0.0028 |
| Avg latency | 1,240ms | 380ms |
| Sample size | 900 | 100 |
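To make the table easier to read at a glance, the relative differences can be worked out by hand. The sketch below uses only the figures from the table above; the `ModelStats` class and `pct_reduction` helper are hypothetical names for illustration, not part of Nolma.

```python
from dataclasses import dataclass


@dataclass
class ModelStats:
    cost_per_call: float    # USD per call
    avg_latency_ms: float   # average latency in milliseconds
    sample_size: int        # number of calls observed


def pct_reduction(primary: float, candidate: float) -> float:
    """Percentage by which the candidate value is lower than the primary."""
    return (primary - candidate) / primary * 100


# Figures copied from the comparison table.
primary = ModelStats(cost_per_call=0.0241, avg_latency_ms=1240, sample_size=900)
candidate = ModelStats(cost_per_call=0.0028, avg_latency_ms=380, sample_size=100)

cost_saving = pct_reduction(primary.cost_per_call, candidate.cost_per_call)
latency_saving = pct_reduction(primary.avg_latency_ms, candidate.avg_latency_ms)
total_calls = primary.sample_size + candidate.sample_size
shadow_share = candidate.sample_size / total_calls * 100

print(f"Cost reduction:    {cost_saving:.1f}%")     # 88.4%
print(f"Latency reduction: {latency_saving:.1f}%")  # 69.4%
print(f"Shadow share:      {shadow_share:.1f}%")    # 10.0%
```

In this example the candidate costs about 88% less per call and responds about 69% faster, and the 100 of 1,000 calls it handled match the default 10% traffic setting.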
Promoting a candidate
When you click “Promote to 100%”:

- Shadow mode disables
- You update your code to use the new model directly
- Nolma tracks the new model as the primary going forward