Calculate downtime allowances for Service Level Objective (SLO) targets and check compliance against them. Part of the DevTools Surf developer suite.
Use Cases
Calculate the annual downtime budget for a given availability target (99.9%, 99.95%, 99.99%).
Determine the error budget remaining after a measured period of downtime.
Compare SLO targets to choose an appropriate tier for a new service based on criticality.
Model the impact of an outage on monthly SLO compliance to decide whether to trigger a policy response.
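The budget calculations in these use cases can be sketched in a few lines of Python. This is a minimal illustration, not the tool's own implementation: the function name `downtime_budget_minutes` and the choice of a 365-day year and a 30-day window are assumptions made for the example.

```python
MINUTES_PER_DAY = 24 * 60


def downtime_budget_minutes(target_pct: float, period_days: float) -> float:
    """Return the downtime (in minutes) allowed by an availability target.

    Hypothetical helper for illustration; period_days is the length of the
    measurement period (365 for a year, 30 for a 30-day window).
    """
    if not 0 <= target_pct <= 100:
        raise ValueError("target_pct must be between 0 and 100")
    return (100 - target_pct) / 100 * period_days * MINUTES_PER_DAY


# Compare candidate tiers over a 365-day year and a 30-day window.
for target in (99.9, 99.95, 99.99):
    yearly = downtime_budget_minutes(target, 365)
    monthly = downtime_budget_minutes(target, 30)
    print(f"{target}%: {yearly:.2f} min/year, {monthly:.2f} min/30 days")
```

Under these assumptions a 99.9% target allows about 525.6 minutes (8.76 hours) per year and 43.2 minutes per 30 days. Each additional nine divides the budget by ten, which is the main trade-off when choosing a tier.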
Tips
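Express SLOs as rolling-window metrics (for example, the last 30 days) rather than calendar-month metrics. Rolling windows are more operationally meaningful and avoid the reset problem, where the error budget refills on the first of each month regardless of how recent the last outage was.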
Define the measurement method in the SLO document, not just the target — two teams can have the same 99.9% SLO and measure it differently, producing incomparable results.
Track SLO compliance weekly, not only during incidents — chronic borderline compliance at 99.91% when the target is 99.9% needs investigation even if no individual alert fired.
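A rolling-window check like the one these tips describe might look as follows. This sketch assumes a request-based SLI (good requests divided by total requests) aggregated per day. That measurement method is one option among several, and the function name `rolling_availability` is hypothetical.

```python
from collections import deque


def rolling_availability(daily_counts, window_days=30):
    """Yield availability (%) for each full trailing window.

    daily_counts is an iterable of (good_requests, total_requests) per day.
    Assumes a request-based SLI; a time-based SLI would need different input.
    """
    window = deque()
    good_sum = 0
    total_sum = 0
    for good, total in daily_counts:
        window.append((good, total))
        good_sum += good
        total_sum += total
        # Drop the oldest day once the window is longer than window_days.
        if len(window) > window_days:
            old_good, old_total = window.popleft()
            good_sum -= old_good
            total_sum -= old_total
        # Only report once a full window of data is available.
        if len(window) == window_days and total_sum > 0:
            yield 100 * good_sum / total_sum


# Four days of data with a 3-day window: one bad day drags the window down.
daily = [(999, 1000), (999, 1000), (999, 1000), (990, 1000)]
for availability in rolling_availability(daily, window_days=3):
    print(f"{availability:.2f}%")
```

Running a check like this on a weekly schedule, rather than only after incidents, is what surfaces a service that sits just above its target for long stretches.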
Fun Facts
Google's Site Reliability Engineering book (2016), which popularized SLOs and error budgets, defines the relationship between SLI (indicator), SLO (objective), and SLA (agreement) — a framework now adopted by most large cloud organizations.
The 'five nines' (99.999% availability) translates to just 5.26 minutes of allowed downtime per year — achieving it requires redundant systems at every layer with no single point of failure.
The Amazon S3 SLA commits to 99.9% monthly uptime for the Standard storage class, while AWS describes S3 Standard as designed for 99.99% availability. The gap between the two figures shows that SLAs are contractual floors, not operational targets.
FAQ
What is the difference between SLI, SLO, and SLA?
An SLI is the metric being measured (e.g., request success rate). An SLO is the internal target for that metric (e.g., 99.9% success rate). An SLA is the external contractual commitment, usually with a financial penalty for breach.
What is an error budget?
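Error budget = 100% - SLO target. For a 99.9% target, the error budget is 0.1%, which is about 43.8 minutes in an average month (43.2 minutes over a 30-day window). When the error budget is exhausted, a common policy is to freeze new feature releases and focus on reliability work until the budget recovers.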
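As a rough sketch of how the remaining budget and the release-freeze decision could be computed: the function name `error_budget_status`, the 30-day default window, and the simple "freeze when nothing is left" rule are all assumptions for illustration, and real policies are often more graduated.

```python
def error_budget_status(slo_pct: float, downtime_minutes: float,
                        window_days: float = 30) -> dict:
    """Summarize error budget consumption for a measured window.

    Hypothetical helper: treats the budget as allowed downtime minutes and
    recommends a release freeze once the budget is fully spent.
    """
    if not 0 <= slo_pct < 100:
        raise ValueError("slo_pct must be in [0, 100); a 100% SLO has no budget")
    budget = (100 - slo_pct) / 100 * window_days * 24 * 60
    remaining = budget - downtime_minutes
    return {
        "budget_minutes": budget,
        "remaining_minutes": remaining,
        "consumed_fraction": downtime_minutes / budget,
        "freeze_releases": remaining <= 0,
    }


# A 50-minute outage against a 99.9% SLO over 30 days overspends the budget.
status = error_budget_status(99.9, downtime_minutes=50)
print(status["freeze_releases"], round(status["remaining_minutes"], 1))
```

A 20-minute outage against the same target would leave roughly 23.2 minutes in the window, so the freeze flag stays off.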