Meaningful API SLIs that teams will not argue about

Start from the journey

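SLIs should map to what a user is trying to finish: completing a checkout, rendering a report, submitting a form. Availability measured with synthetic pings is better than nothing, but it often misrepresents real success rates, because a ping typically exercises one shallow path while real requests carry varied payloads and reach deeper dependencies.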

Good SLI ingredients

Request success classified by business rules, not raw HTTP 200 (which can mask half-failures).
Latency measured near the client-facing edge at a meaningful percentile, commonly p95 or p99 for interactive flows.
Saturation signals where applicable: queue depth, thread pool stalls, DB connections.
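
As a rough illustration of the first two ingredients, here is a minimal Python sketch. The `Request` record, its `outcome_ok` flag, and the sample window are assumptions made up for this example, not part of any real library; saturation signals are left out because they depend heavily on the system being measured.

```python
import math
from dataclasses import dataclass


@dataclass
class Request:
    status: int        # HTTP status returned to the client
    outcome_ok: bool   # hypothetical flag: the business operation actually completed
    latency_ms: float  # assumed to be measured at the client-facing edge


def is_good(req: Request) -> bool:
    # A 2xx alone is not enough; the business outcome must also have succeeded.
    return 200 <= req.status < 300 and req.outcome_ok


def success_ratio(requests: list[Request]) -> float:
    # Share of requests that count as good under the business rule above.
    if not requests:
        raise ValueError("no requests in window")
    return sum(is_good(r) for r in requests) / len(requests)


def percentile(values: list[float], pct: int) -> float:
    # Nearest-rank percentile: smallest value with at least pct% of samples at or below it.
    if not values:
        raise ValueError("no samples in window")
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


# Illustrative window: three responses are HTTP 200, but only two really succeeded.
window = [
    Request(200, True, 80.0),
    Request(200, False, 120.0),  # HTTP 200 that hides a half-failure
    Request(500, False, 900.0),
    Request(200, True, 100.0),
]
print(success_ratio(window))
print(percentile([r.latency_ms for r in window], 95))
```

In this made-up window, counting raw HTTP 200s would report three of four requests as successful, while the business-rule classification counts only two. Nearest-rank is one of several percentile definitions; production systems often estimate percentiles from histograms instead.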

Error budgets as a product conversation

A budget is only useful if breaching it changes prioritization: a feature freeze, dedicated chaos-testing time, or shedding of low-value paths. Without consequences, dashboards are décor.
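
To make the budget arithmetic concrete, here is a small Python sketch under stated assumptions. The function names, the 25% threshold, and the mapping from budget level to action are hypothetical; in practice the thresholds and consequences come out of the product conversation.

```python
def budget_remaining(slo_target: float, total: int, bad: int) -> float:
    # Fraction of the error budget still unspent; a negative value means a breach.
    allowed_bad = (1.0 - slo_target) * total
    if allowed_bad <= 0:
        raise ValueError("SLO target leaves no error budget for this window")
    return 1.0 - bad / allowed_bad


def policy_action(remaining: float) -> str:
    # Hypothetical thresholds and actions, chosen only for illustration.
    if remaining <= 0:
        return "feature freeze"
    if remaining < 0.25:
        return "shed low-value paths"
    return "ship normally"


# A 99.9% target over 1,000,000 requests allows roughly 1,000 bad ones.
remaining = budget_remaining(0.999, 1_000_000, 250)
print(round(remaining, 3), policy_action(remaining))
```

The point of encoding the policy, even this crudely, is that the consequence is agreed before the breach rather than negotiated during it.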

Anti-pattern

“Inventing” SLIs to look comprehensive: with too many SLIs, none of them gets owned. Three crisp signals beat fifteen sleepy graphs.