Concepts and Definitions
SLIs, SLOs, Error Budgets, Burn Rate and Dimensions.
Before delving deeper into service level objectives and how you can use Rely.io to extract the most value out of them, it's important to understand and clarify some basic concepts around SLOs.
- Service level indicators (SLIs) measure a service’s behaviour on a 0-100% scale and indicate the level of performance a client is receiving. SLIs are calculated from metrics collected by monitoring systems and are measured for compliance against an SLO. In Rely.io, SLI time series are calculated using rolling windows.
- Service level objectives (SLOs) define internal promises about the reliability of an application that IT and DevOps teams need to hit and measure themselves against in order to ensure customer expectations are being met. An SLO is usually derived from an agreement within an SLA and defines a performance target for a specific metric, such as uptime or response time.
- Error budgets are derived from an SLO’s target and define the maximum amount of unreliability allowed for a specific technical system within a certain time period. They can be interpreted as a time buffer in which failures are allowed. That buffer can be allocated to product development or feature releases, or simply kept as a "rainy day" fund.
- Remaining Error Budget is the amount of error budget an SLO has left within its compliance window. For request-based SLOs, the remaining error budget is calculated using an estimated error budget value. This estimate is based on historical request data from past compliance periods. For time-based SLOs, the remaining error budget is calculated using the actual error budget derived from the SLO's target for a specific compliance window.
- The Burn Rate is a measurement of the rate of consumption of an SLO’s error budget. It indicates how fast a given error budget is being depleted. The burn rate value is meant to be interpreted as a multiplier: at a burn rate of N, the error budget would be fully depleted in 1/N of the SLO’s compliance window. For example, a burn rate value of 2 would mean that, at the current error rate, the error budget would go from 100% to 0% within half (1/2) the duration of the SLO’s compliance window. A burn rate value of 5 would mean that the error budget would be fully depleted in one fifth (1/5) the duration of the SLO’s compliance window.
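The relationship between an SLO's target, its error budget and the burn rate can be sketched in a few lines of Python. This is an illustrative sketch for a time-based SLO, not Rely.io's implementation: the `TimeBasedSLO` class and the function names are hypothetical, and the burn rate is assumed to be the observed error rate divided by the error rate the target allows, which matches the interpretation given above (a burn rate of 2 depletes the budget in half the window).

```python
from dataclasses import dataclass


@dataclass
class TimeBasedSLO:
    """Hypothetical model of a time-based SLO (not a Rely.io type)."""
    target: float          # e.g. 0.99 for a 99% target
    window_minutes: float  # length of the compliance window


def error_budget_minutes(slo: TimeBasedSLO) -> float:
    # Maximum unreliability allowed in the window, expressed in minutes.
    return (1.0 - slo.target) * slo.window_minutes


def remaining_budget_fraction(slo: TimeBasedSLO, bad_minutes: float) -> float:
    # Share of the error budget still available; negative once overspent.
    return 1.0 - bad_minutes / error_budget_minutes(slo)


def burn_rate(slo: TimeBasedSLO, observed_error_rate: float) -> float:
    # Assumed definition: observed error rate over the allowed error rate.
    return observed_error_rate / (1.0 - slo.target)


def minutes_to_depletion(slo: TimeBasedSLO, rate: float) -> float:
    # A burn rate of N depletes a full budget in 1/N of the window.
    return slo.window_minutes / rate


# Example: a 99% target over a 30-day compliance window.
slo = TimeBasedSLO(target=0.99, window_minutes=30 * 24 * 60)
budget = error_budget_minutes(slo)
rate = burn_rate(slo, observed_error_rate=0.02)
```

With these example values the budget is roughly 432 minutes, and a sustained 2% error rate gives a burn rate of about 2, so the budget would last about 15 of the 30 days.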
Dimensions are an intrinsic concept of the Rely.io platform. A dimension is a key/value pair that provides context to an SLO and helps shape its identity. SLOs can measure different areas and segments of an application. Dimensions can be thought of as the categories that describe these areas and segments. An expressive SLO name combined with thoroughly assigned dimensions can help communicate to both technical and less technical stakeholders what an SLO is actually monitoring.
These dimensions can be leveraged in a number of different ways within the Rely.io platform. For example, you can use dimensions to track the SLO coverage level of different services and products. By tagging each SLO with specific instances of Service and Product dimensions, you can then group your SLOs according to these dimensions and understand exactly where you stand with regard to the maturity of your SLO monitoring.
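As a rough illustration of grouping by dimension, the sketch below models SLOs as plain dictionaries carrying key/value dimensions and groups them by one dimension key. The data structure, the SLO names and the `group_by_dimension` helper are all hypothetical; they do not reflect Rely.io's API or data model.

```python
from collections import defaultdict

# Hypothetical SLOs, each tagged with key/value dimensions.
slos = [
    {"name": "Checkout availability",
     "dimensions": {"Service": "payments-api", "Product": "Web Store"}},
    {"name": "Checkout latency",
     "dimensions": {"Service": "payments-api", "Product": "Web Store"}},
    {"name": "Search latency",
     "dimensions": {"Service": "search-api", "Product": "Web Store"}},
    {"name": "Export job success",
     "dimensions": {"Product": "Reporting"}},  # no Service assigned yet
]


def group_by_dimension(slos, key):
    """Group SLO names by the value of one dimension key."""
    groups = defaultdict(list)
    for slo in slos:
        # SLOs missing the dimension are collected under a placeholder,
        # which makes coverage gaps easy to spot.
        value = slo["dimensions"].get(key, "(unassigned)")
        groups[value].append(slo["name"])
    return dict(groups)


by_service = group_by_dimension(slos, "Service")
by_product = group_by_dimension(slos, "Product")
```

Counting the entries in each group gives a simple coverage view per service or product, and the `(unassigned)` group lists the SLOs that still need the dimension.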
Dimensions can also help you look at the performance of your application from a customer-facing perspective. You can, for example, assign a User Journey dimension to your SLOs so you know exactly which customer flows each SLO is monitoring. You can also use dimensions to categorise your SLOs for simple organisational purposes, for example by creating a Client Profile dimension that lets you know whether a given SLO is monitoring a service used by paid or free clients.
Incident response teams can also benefit from the proper use of dimensions. Dimensions can provide deeper context to SLO-generated alerts, allowing teams to prioritise their effort and quickly infer the severity and criticality of each alert from the customer's point of view.
If you don't know which dimensions to assign to your SLOs, we recommend you start by creating Service, User Journey and Product dimensions.
This set of dimensions helps you contextualize your application from different perspectives that are useful to most functions of your organisation, such as executives, customer success teams, product teams and engineers.