# System Monitoring
---
```
$ whoami
Hoang Manh Tien <tienhm@eway.vn> - 1990
Dad, Backend, SysAdmin, DevOps, SRE
Stack: python, postgres, mysql, redis,
gitlabci, phabricator, arcanist,
docker, k8s, gcp, nginx, git
```
---
### Monitoring is:
<span>- System status<!-- .element: class="fragment" data-fragment-index="0" --></span>
<span>- Detect incidents → avoid disasters<!-- .element: class="fragment" data-fragment-index="1" --></span>
<span>- Define thresholds and alerts<!-- .element: class="fragment" data-fragment-index="2" --></span>
---
### Monitoring is not:
<span>- Raw log/Event collection<!-- .element: class="fragment" data-fragment-index="0" --></span>
<span>- Request tracing<!-- .element: class="fragment" data-fragment-index="1" --></span>
<span>- "Black Magic" detection<!-- .element: class="fragment" data-fragment-index="2" --></span>
<span>- Durable long-term storage<!-- .element: class="fragment" data-fragment-index="3" --></span>
<span>- Automatic horizontal scaling<!-- .element: class="fragment" data-fragment-index="4" --></span>
---
### Expected
![](https://i.imgur.com/uOj5Z6h.png)
---
### Reality
![](https://i.imgur.com/D3ijhUV.png)
---
![](https://i.imgur.com/dYSOz5C.png)
---
![](https://i.imgur.com/4fHsJGS.png)
---
### SLI, SLO, SLA, oh my
- SLI - Service Level Indicator
- SLO - Service Level Objective
- SLA - Service Level Agreement
Note:
SLI: a measured signal of service health, e.g. whether the service is alive
SLO: the target we set for that signal
SLA: a contract built on top of the objective
---
### Risk Management
- Error budgets
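
A minimal sketch of how an error budget can be derived from an SLO, assuming an availability SLI measured as good requests over total requests. The function names and the numbers in the usage note are hypothetical, chosen only for illustration.

```python
def availability_sli(good_events: int, total_events: int) -> float:
    """SLI: fraction of events that were served successfully."""
    if total_events == 0:
        # Assumption: a window with no traffic counts as fully available.
        return 1.0
    return good_events / total_events


def error_budget_remaining(slo: float, good_events: int, total_events: int) -> float:
    """Fraction of the error budget still unspent (negative when overspent)."""
    allowed_bad = (1.0 - slo) * total_events  # failures the SLO tolerates
    actual_bad = total_events - good_events   # failures actually observed
    if allowed_bad == 0:
        # A 100% SLO (or an empty window) leaves no budget to spend.
        return 0.0 if actual_bad == 0 else float("-inf")
    return 1.0 - actual_bad / allowed_bad
```

With a 99.9% SLO over 1,000,000 requests, 1,000 failures are tolerated, so 500 observed failures leave about half of the budget.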
---
### Alert
Avoid spamming alerts:
- Oncall
- Ticket
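
One way to picture the on-call versus ticket split is a small routing function. This is only a sketch: the `actionable` and `urgent` flags and the `Route` names are assumptions made for illustration, not part of any alerting tool.

```python
from enum import Enum


class Route(Enum):
    ONCALL = "oncall"  # page a human right now
    TICKET = "ticket"  # handle during working hours
    DROP = "drop"      # not actionable, so do not notify anyone


def route_alert(actionable: bool, urgent: bool) -> Route:
    """Pick a destination for an alert (hypothetical policy)."""
    if not actionable:
        # Alerts nobody can act on are the spam this slide warns about.
        return Route.DROP
    return Route.ONCALL if urgent else Route.TICKET
```

Under this policy only alerts that are both actionable and urgent page the on-call engineer; everything else becomes a ticket or is dropped.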
---
### References
[Site Reliability Engineering](https://goo.gl/aCiiPV)
[How SRE relates to DevOps](https://goo.gl/NWKuj9)
[Liz & Seth playlist](https://goo.gl/CKv3tV)
[SLI, SLO, SLA](https://cloudplatform.googleblog.com/2018/07/sre-fundamentals-slis-slas-and-slos.html)
[LinkedIn experiment](https://www.slideshare.net/ToddPalino/im-no-hero-full-stack-reliability-at-linkedin)
[End to end monitoring with Prometheus](https://www.slideshare.net/Paris_Container_Day/end-toend-monitoring-with-the-prometheus-operator-max-inden)