Anomaly Detection
Know immediately when something drifts. Anomaly detection is a platform-level layer — it works on whichever modules you activate and watches the metrics those modules emit. No extra module to buy; no dashboards to rebuild.
How it works
- You pick the metric. Things like on-time rate, audit findings per day, agent cost per run, invoices sent per day.
- Choose a monitor type. Rule-based for concrete thresholds you already know ("fire if on-time rate < 85%"). Statistical when you don't know the threshold yet: the platform fits z-score or IQR bands over a lookback window (30 days of history by default) and alerts on outliers automatically.
- Detectors run automatically. Rule monitors every 15 minutes, statistical monitors every hour. No scheduler to configure, no SQL to write.
- Anomalies land in one inbox. The same hourly digest email you already get for findings and overdue invoices gains an "Anomalies detected" section. Each event deep-links to a detail page where you can acknowledge, resolve, or silence it for N hours.
What it catches
- Carrier SLA regressions. One carrier's on-time rate drops 10 points even though your aggregate looks fine — a per-carrier z-score monitor surfaces it the same hour.
- Agent cost spikes. A prompt change pushes token spend 3x. The agent.cost_per_run_cents rule fires before next month's bill shows up.
- Review backlog warnings. Findings that sit unreviewed for 48+ hours signal a user-adoption problem the moment it starts, not a quarter later.
- Misconfigurations. Customer-invoice volume suddenly 5x normal? A runaway markup rule just sent test invoices to real customers. Catch it in the first hour, not the first complaint.
- Rate-card drift. New rate card uploaded that differs from the previous active one by >20%? Flagged for human review before it audits a single invoice.
Two detector types
Rule-based
Concrete thresholds: gt, lt, outside_range. Optionally require the breach to persist for N consecutive windows before firing.
Best for: KPIs with known contractual or business thresholds you'd want to hold the line on.
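To make the rule semantics concrete, here is a minimal sketch of how a gt / lt / outside_range check with an N-consecutive-window requirement could be evaluated. The Rule class, its field names, and the should_fire helper are hypothetical illustrations, not the platform's actual API or configuration format.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Rule:
    """Hypothetical rule shape; field names are illustrative only."""
    op: str                             # "gt", "lt", or "outside_range"
    threshold: Optional[float] = None   # used by gt / lt
    low: Optional[float] = None         # used by outside_range
    high: Optional[float] = None        # used by outside_range
    consecutive: int = 1                # windows the breach must persist


def breaches(rule: Rule, value: float) -> bool:
    """Return True if a single window's value violates the rule."""
    if rule.op == "gt":
        return value > rule.threshold
    if rule.op == "lt":
        return value < rule.threshold
    if rule.op == "outside_range":
        return value < rule.low or value > rule.high
    raise ValueError(f"unknown operator: {rule.op}")


def should_fire(rule: Rule, windows: Sequence[float]) -> bool:
    """Fire only if the most recent `consecutive` windows all breach."""
    n = max(1, rule.consecutive)
    if len(windows) < n:
        return False  # not enough windows observed yet
    return all(breaches(rule, v) for v in windows[-n:])


# "Fire if on-time rate < 85%", required to hold for two windows in a row.
on_time = Rule(op="lt", threshold=85.0, consecutive=2)
fires = should_fire(on_time, [91.0, 84.0, 83.5])        # last two windows breach
holds = should_fire(on_time, [91.0, 84.0, 88.0])        # latest window recovered
```

Requiring persistence trades a slightly later alert for fewer one-off false alarms, which is usually the right call for noisy KPIs.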
Statistical
Z-score band, IQR Tukey fence, or both combined. The band is fit against a configurable lookback (default 30 days) of prior history, so today's data doesn't skew today's band.
Best for: count metrics and ratios where the "normal" range changes with volume and you don't want to hand-tune thresholds forever.
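The two band types can be sketched with Python's standard library. This is an illustration under stated assumptions, not the platform's implementation: the function names are hypothetical, the multipliers (3 standard deviations, 1.5 x IQR) are conventional defaults rather than documented platform settings, and "both" is assumed here to mean that both detectors must agree before a value is flagged.

```python
import statistics
from typing import Sequence, Tuple


def zscore_band(history: Sequence[float], k: float = 3.0) -> Tuple[float, float]:
    """Band of mean +/- k sample standard deviations (k=3.0 is an assumed default)."""
    mean = statistics.fmean(history)
    sd = statistics.stdev(history)
    return mean - k * sd, mean + k * sd


def tukey_fence(history: Sequence[float], k: float = 1.5) -> Tuple[float, float]:
    """Tukey fence: [Q1 - k*IQR, Q3 + k*IQR], with the conventional k=1.5."""
    q1, _, q3 = statistics.quantiles(history, n=4)
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def is_outlier(value: float, history: Sequence[float], mode: str = "both") -> bool:
    """Check `value` against bands fit on prior history only.

    `history` must not include the window being tested, so the current
    value cannot widen its own band.
    """
    z_lo, z_hi = zscore_band(history)
    t_lo, t_hi = tukey_fence(history)
    z_out = not (z_lo <= value <= z_hi)
    t_out = not (t_lo <= value <= t_hi)
    if mode == "zscore":
        return z_out
    if mode == "iqr":
        return t_out
    if mode == "both":
        return z_out and t_out  # assumption: both detectors must agree
    raise ValueError(f"unknown mode: {mode}")


# Illustrative daily on-time rates for one carrier (made-up numbers).
prior_days = [90, 91, 89, 92, 90, 91, 88, 90, 91, 89]
drop_flagged = is_outlier(70, prior_days)
normal_flagged = is_outlier(90, prior_days)
```

The IQR fence is less sensitive to a few extreme days in the lookback than the z-score band, which is one reason to offer both.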
Event lifecycle
Every detected outlier moves through a simple state machine you control:
- Detected — just fired; in the next digest.
- Acknowledged — someone's looking at it.
- Resolved — fixed or no longer relevant.
- Silenced for N hours — known and being worked; don't email me.
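The lifecycle above can be modeled as a small state machine. The sketch below is illustrative only: the four states come from the list above, but the allowed transitions, the class and field names, and the metric name carrier.on_time_rate are assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from enum import Enum
from typing import Optional


class State(Enum):
    DETECTED = "detected"
    ACKNOWLEDGED = "acknowledged"
    RESOLVED = "resolved"
    SILENCED = "silenced"


# Assumed transition table; the states are documented, the allowed moves are not.
ALLOWED = {
    State.DETECTED: {State.ACKNOWLEDGED, State.RESOLVED, State.SILENCED},
    State.ACKNOWLEDGED: {State.RESOLVED, State.SILENCED},
    State.SILENCED: {State.ACKNOWLEDGED, State.RESOLVED},
    State.RESOLVED: set(),  # treated as terminal in this sketch
}


@dataclass
class AnomalyEvent:
    metric: str
    state: State = State.DETECTED
    silenced_until: Optional[datetime] = None

    def transition(self, target: State, now: datetime,
                   silence_hours: Optional[float] = None) -> None:
        """Move to `target`, rejecting moves the table does not allow."""
        if target not in ALLOWED[self.state]:
            raise ValueError(
                f"cannot move from {self.state.value} to {target.value}")
        if target is State.SILENCED:
            if silence_hours is None or silence_hours <= 0:
                raise ValueError("silencing requires a positive number of hours")
            self.silenced_until = now + timedelta(hours=silence_hours)
        else:
            self.silenced_until = None
        self.state = target

    def is_silenced(self, now: datetime) -> bool:
        """True while the event is silenced and the silence has not expired."""
        return (self.state is State.SILENCED
                and self.silenced_until is not None
                and now < self.silenced_until)


start = datetime(2024, 1, 1, 12, 0)
event = AnomalyEvent(metric="carrier.on_time_rate")  # hypothetical metric name
event.transition(State.SILENCED, start, silence_hours=4)
```

Keeping the silence expiry on the event itself means a digest builder only needs to call is_silenced with the current time to decide whether to skip it.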
No extra activation required. Anomaly detection ships with every customer's platform from day one. Rules can be edited at /settings/anomalies; events appear at /anomalies.