SLO/SLI/SLA

  • Defining what availability means and how available you want to be.

SLI - Service Level Indicator

  • Request latency
  • Batch throughput - requests processed per second in a batch system
  • Failures per request - failed requests as a fraction of total requests

ex:

  • 99th percentile latency of requests received in the past five minutes is less than 300 milliseconds
  • 95th percentile latency of homepage requests over the past 5 minutes < 300ms
  • the ratio of errors to total requests in the past five minutes is less than 1%
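
The example SLIs above can be sketched in a few lines of Python. This is only an illustration: the nearest-rank percentile method, the sample latencies, and the function names are assumptions, not part of any particular monitoring tool.

```python
import math

def percentile(values, pct):
    """Nearest-rank percentile: the smallest sample with at least pct% of samples at or below it."""
    if not values:
        raise ValueError("no samples in window")
    ordered = sorted(values)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]

def error_ratio(failed, total):
    """Failed requests as a fraction of total requests (0.0 for an empty window)."""
    return failed / total if total else 0.0

# Hypothetical five-minute window of homepage request latencies, in milliseconds.
latencies_ms = [120, 95, 210, 180, 250, 140, 90, 310, 160, 130]

p95 = percentile(latencies_ms, 95)
latency_sli_met = p95 < 300                                  # "95th percentile latency < 300ms"
error_sli_met = error_ratio(failed=3, total=1000) < 0.01     # "error ratio < 1%"
```

With only ten samples, a single slow request (310 ms) is enough to push the 95th percentile over the limit, so the latency SLI is not met for this window even though the error-ratio SLI is.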

SLIs are service level indicators: metrics measured over time that inform you about the health of a service.

SLO - Service Level Objectives

Binding target for a collection of SLIs

SLOs are service level objectives: agreed-upon bounds for how often those SLIs must be met.

ex:

  • 95th percentile homepage SLI will succeed 99.9% of the time over the trailing year.
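
One way to read this example is that the SLI is evaluated once per window and the SLO bounds the fraction of windows that pass. The sketch below assumes that interpretation; the window counts and function names are hypothetical.

```python
def slo_compliance(window_results):
    """Fraction of evaluation windows in which the SLI was met."""
    if not window_results:
        raise ValueError("no windows evaluated")
    return sum(window_results) / len(window_results)

def meets_slo(window_results, target=0.999):
    """True if the SLI was met in at least `target` of the windows."""
    return slo_compliance(window_results) >= target

# Hypothetical history: 10,000 windows, 8 of which missed the latency SLI.
results = [True] * 9992 + [False] * 8
compliance = slo_compliance(results)
slo_met = meets_slo(results)              # 99.92% of windows passed, target is 99.9%
```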

SLA - Service Level Agreement

Business agreement between a customer and a service provider, typically based on SLOs

Note:

  • An SLA defines: what am I going to do if I don't meet the promised level of reliability?
  • SLAs describe the set of services and availability promises that a provider is willing to make to a customer, and then the ramifications associated with failing to deliver on those promises.
  • Those ramifications might be things like money back or free credits for failing to deliver the service availability.

ex:

  • Service credits if the 95th percentile homepage SLI succeeds less than 99.5% of the time over the trailing year.
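
To get a feel for what a 99.5% threshold tolerates, the arithmetic can be worked out directly. The sketch assumes the SLI is evaluated in consecutive five-minute windows over a 365-day year; that window size and the helper name are assumptions made for illustration.

```python
from fractions import Fraction

WINDOWS_PER_YEAR = 365 * 24 * 12  # five-minute windows in a non-leap year (105,120)

def allowed_failed_windows(threshold, total_windows=WINDOWS_PER_YEAR):
    """Largest number of failed windows that still keeps compliance >= threshold.

    `threshold` is passed as a string such as "0.995" so Fraction parses it exactly.
    """
    return int(total_windows * (1 - Fraction(threshold)))

sla_budget = allowed_failed_windows("0.995")   # failed windows tolerated before credits are owed
sla_budget_hours = sla_budget * 5 / 60         # the same budget expressed in hours
```

Under these assumptions the provider can miss the SLI in up to 525 windows, about 43.75 hours per year, before service credits are owed.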

Relationship

SLIs drive SLOs which inform SLAs

You really want to make sure your SLA is more lenient than your SLO, so that you get an early warning before you have to field angry phone calls from customers or pay them lots of money for failing to deliver the promised service.

SLAs are business level agreements, which define the service availability for a customer and the penalties for failing to deliver that availability.
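
The gap between the SLO target and the SLA threshold is what provides the early warning. A minimal sketch, using the 99.9% and 99.5% figures from the examples above; the status names are made up for illustration.

```python
def reliability_status(compliance, slo_target=0.999, sla_threshold=0.995):
    """Classify measured compliance against the internal SLO and the customer-facing SLA."""
    if sla_threshold > slo_target:
        raise ValueError("SLA threshold should be more lenient than the SLO target")
    if compliance >= slo_target:
        return "ok"
    if compliance >= sla_threshold:
        return "slo_missed"     # early warning: internal target missed, no penalty yet
    return "sla_breached"       # customer promise broken: credits or refunds apply

statuses = [reliability_status(c) for c in (0.9995, 0.997, 0.99)]
```

A compliance of 99.7% lands in the middle band: the team knows it has missed its objective, but no customer penalty has been triggered yet.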