Notes on Storage in Prometheus & Thanos

tags: IT

This document may jump between topics. It is more of a record of investigating how Prometheus and Thanos store data.

Thanos sidecar

By default, the Thanos sidecar uploads data to Thanos object storage every 2 hours.
https://thanos.io/tip/components/sidecar.md/#sidecar

The default of 2h is recommended.

See the following picture.

[Image not available]

The amount of data transferred depends on how much data Prometheus produces.
Prometheus is also configured to flush a block every two hours:

--storage.tsdb.min-block-duration=2h --storage.tsdb.max-block-duration=2h

We can estimate how much memory Prometheus needs from how much data it accumulates within 2 hours.
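
The estimate above can be sketched as a quick back-of-the-envelope calculation. The series count, scrape interval, and bytes-per-sample value below are assumptions for illustration, not measured figures. Real memory usage also depends on label cardinality and series churn, so treat the result as an order of magnitude only.

```python
def samples_in_window(active_series: int, scrape_interval_s: float,
                      window_s: float = 2 * 3600) -> int:
    """Number of samples ingested during one block window (default 2h)."""
    if scrape_interval_s <= 0:
        raise ValueError("scrape_interval_s must be positive")
    return int(active_series * (window_s / scrape_interval_s))


def estimated_window_bytes(active_series: int, scrape_interval_s: float,
                           bytes_per_sample: float,
                           window_s: float = 2 * 3600) -> float:
    """Rough size of the data accumulated in one window.

    bytes_per_sample is an assumed value supplied by the caller.
    """
    return samples_in_window(active_series, scrape_interval_s, window_s) * bytes_per_sample


if __name__ == "__main__":
    # Hypothetical workload: 100k active series scraped every 15s.
    print(samples_in_window(100_000, 15))
    print(estimated_window_bytes(100_000, 15, bytes_per_sample=2.0))
```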

Thanos downsampling

https://github.com/thanos-io/thanos/blob/main/docs/components/compact.md

Creating 5m downsampling for blocks older than 40 hours (2d)
Creating 1h downsampling for blocks older than 10 days (2w)

Thanos saves not only the original (raw) blocks but also downsampled copies at lower resolutions.
There are two downsampled resolutions: one sample every five minutes and one sample every hour.
Therefore, when calculating the storage space Thanos needs, add the space taken by both downsampled resolutions to the raw data. Otherwise the capacity will not be sufficient.
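
A minimal sketch of that sum follows. The two size ratios are hypothetical placeholders: how large the 5m and 1h blocks are relative to the raw data varies with the workload, so measure them on a real bucket before relying on the result.

```python
def thanos_storage_bytes(raw_bytes: float, ratio_5m: float, ratio_1h: float) -> float:
    """Total object storage = raw blocks + 5m blocks + 1h blocks.

    ratio_5m and ratio_1h are the sizes of the downsampled data expressed
    as a fraction of the raw data (assumed values, supplied by the caller).
    """
    if min(raw_bytes, ratio_5m, ratio_1h) < 0:
        raise ValueError("sizes and ratios must be non-negative")
    return raw_bytes * (1 + ratio_5m + ratio_1h)


if __name__ == "__main__":
    # Hypothetical: 100 GiB of raw blocks, 5m data at 50% and 1h data at 25% of raw.
    gib = 1024 ** 3
    total = thanos_storage_bytes(100 * gib, ratio_5m=0.5, ratio_1h=0.25)
    print(f"{total / gib:.1f} GiB")
```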

By default, Thanos creates 5m downsampled blocks for data older than 40 hours and 1h downsampled blocks for data older than 10 days.
While compacting and downsampling, Thanos writes the new block first and only later deletes any blocks that the new one replaces, so storage usage goes up and down over time.
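
The age thresholds quoted from the compactor documentation can be expressed as a small lookup. This is only an illustration of the thresholds: it assumes the compactor has already processed the block and that no retention setting has removed any resolution. The function name is made up for this note.

```python
from datetime import timedelta
from typing import List

# Thresholds quoted in the Thanos compactor documentation.
FIVE_MIN_AFTER = timedelta(hours=40)
ONE_HOUR_AFTER = timedelta(days=10)


def expected_resolutions(block_age: timedelta) -> List[str]:
    """Resolutions expected to exist for data of the given age."""
    resolutions = ["raw"]
    if block_age > FIVE_MIN_AFTER:
        resolutions.append("5m")
    if block_age > ONE_HOUR_AFTER:
        resolutions.append("1h")
    return resolutions


if __name__ == "__main__":
    for age in (timedelta(hours=2), timedelta(days=3), timedelta(days=14)):
        print(age, expected_resolutions(age))
```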


storage.tsdb.min-block-duration

This parameter sets the minimum time span a block covers before Prometheus persists it, which in practice determines how long data stays in memory (default is 2h).
If the value is too large, or Prometheus scrapes too many targets, it may run out of memory, because all of that data is held in memory. If the value is too small, Prometheus reads and writes the disk more often, which hurts performance, because disk I/O is slow (some systems use SSDs to speed it up).
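
The trade-off described above can be made concrete with a small calculation: a longer block duration means fewer flushes per day but more samples per series waiting in memory. This is a simplified sketch. The function name and the 15s scrape interval are assumptions, and it ignores how the head block is actually managed internally.

```python
from typing import Tuple


def block_duration_tradeoff(block_duration_h: float,
                            scrape_interval_s: float) -> Tuple[float, float]:
    """Return (flushes per day, samples per series held before a flush)."""
    if block_duration_h <= 0 or scrape_interval_s <= 0:
        raise ValueError("durations must be positive")
    flushes_per_day = 24 / block_duration_h
    samples_per_series = block_duration_h * 3600 / scrape_interval_s
    return flushes_per_day, samples_per_series


if __name__ == "__main__":
    # Compare a few hypothetical block durations at a 15s scrape interval.
    for hours in (0.5, 2, 6):
        flushes, samples = block_duration_tradeoff(hours, 15)
        print(f"{hours}h: {flushes:.0f} flushes/day, {samples:.0f} samples/series in memory")
```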

https://github.com/laszlocph/tsdbinfo

Calculate HDD usage

http_requests_total
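
For disk sizing, the Prometheus storage documentation gives the rule of thumb needed_disk_space = retention_time_seconds * ingested_samples_per_second * bytes_per_sample, with roughly 1 to 2 bytes per sample on average. A minimal sketch of that formula follows. The retention period and ingestion rate in the example are hypothetical values, not measurements from this setup.

```python
def needed_disk_bytes(retention_s: float, samples_per_second: float,
                      bytes_per_sample: float = 2.0) -> float:
    """Rule-of-thumb disk estimate: retention * ingestion rate * bytes per sample."""
    if retention_s < 0 or samples_per_second < 0 or bytes_per_sample < 0:
        raise ValueError("inputs must be non-negative")
    return retention_s * samples_per_second * bytes_per_sample


if __name__ == "__main__":
    # Hypothetical: 15 days of retention at 100k samples/s, 2 bytes per sample.
    retention = 15 * 24 * 3600
    estimate = needed_disk_bytes(retention, 100_000, 2.0)
    print(f"{estimate / 1e9:.1f} GB")
```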

Other Reference