---
tags: cosc430 2020
---
# COSC430 discussion: Gorilla
## Key take-away points about the time series lecture
- ...
# Discussion of the Gorilla paper
## Breakout room 1
### Key problem under investigation?
- How to balance efficiency and scalability in the design of TSDBs.
- Handling failures ranging from a single node to an entire region.
### Key idea of the proposed solution?
- It uses an in-memory time series database as a write-through cache.
### How does it solve the problem?
- Its fault tolerance was exercised through many large-scale simulated failures and disaster situations.
- By using finely tuned compression algorithms for timestamps.
- By compressing both timestamps and values.
- By relaxing data consistency restrictions, it becomes highly available.
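
The timestamp compression mentioned above relies on delta-of-delta encoding. Below is a minimal Python sketch of the idea, assuming integer timestamps. The function names are hypothetical, and the variable-length bit packing described in the paper is left out, so this only shows which numbers end up being stored.

```python
def delta_of_delta_encode(timestamps):
    """Return (first_timestamp, list of delta-of-deltas).

    Hypothetical helper: illustrates the arithmetic only,
    not the bit-level layout used by Gorilla.
    """
    if not timestamps:
        return None, []
    dods = []
    prev_ts = timestamps[0]
    prev_delta = 0
    for ts in timestamps[1:]:
        delta = ts - prev_ts
        # Store how much the interval changed, not the interval itself.
        dods.append(delta - prev_delta)
        prev_ts, prev_delta = ts, delta
    return timestamps[0], dods


def delta_of_delta_decode(first, dods):
    """Rebuild the original timestamps from the encoded form."""
    if first is None:
        return []
    out = [first]
    delta = 0
    for dod in dods:
        delta += dod
        out.append(out[-1] + delta)
    return out
```

When points arrive at a regular interval, almost every delta-of-delta is zero, which is why so few bits are needed per timestamp.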
### Evaluation?
- Reduced query latency compared to the previous on-disk time series database.
### Drawbacks?
- A Gorilla host becomes available for reads before older data has been read off disk.
- Prioritizes recent data over historical data.
- Resource efficiency takes a hit in order to withstand single-host failures and disaster events.
## Breakout room 2
### Key problem under investigation?
- Storing measurements for monitoring purposes.
- Required properties:
  - writes dominate,
  - state transitions,
  - high availability,
  - fault tolerance.
### Key idea of the proposed solution?
- In-memory time series database that is used as a write-through cache of the most recent 26 hours of data.
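
A write-through cache with a 26-hour window can be sketched as follows. This is an illustration under assumptions, not Gorilla's implementation: the class name, the dict-based `backing_store` standing in for the on-disk TSDB, and the injectable `now` clock are all hypothetical.

```python
import time

RETENTION_SECONDS = 26 * 60 * 60  # the 26-hour window from the notes


class WriteThroughCache:
    """Hypothetical sketch: recent points in memory, all points persisted."""

    def __init__(self, backing_store, now=time.time):
        # backing_store: dict of key -> list of (timestamp, value);
        # stands in for the long-term on-disk store (assumption).
        self.backing_store = backing_store
        self.memory = {}
        self.now = now

    def write(self, key, ts, value):
        # Write-through: the persistent store and the cache are
        # both updated as part of the same write.
        self.backing_store.setdefault(key, []).append((ts, value))
        self.memory.setdefault(key, []).append((ts, value))
        self._evict(key)

    def _evict(self, key):
        # Drop in-memory points older than the retention window.
        cutoff = self.now() - RETENTION_SECONDS
        self.memory[key] = [(t, v) for t, v in self.memory[key] if t >= cutoff]

    def read(self, key, start_ts):
        # Queries inside the window are served from memory;
        # older ranges fall back to the backing store.
        cutoff = self.now() - RETENTION_SECONDS
        source = self.memory if start_ts >= cutoff else self.backing_store
        return [(t, v) for t, v in source.get(key, []) if t >= start_ts]
```

The design choice to notice is that reads for recent data never touch the slower store, while nothing is lost from long-term storage when points age out of memory.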
### How does it solve the problem?
- Specialised and fine-tuned compression algorithms for timestamps and values are used so that the data fits in memory.
- Fault tolerance is achieved by saving data to disk and by replicating data across two datacenters in different geographical locations.
- Scalability is achieved by sharding data across multiple servers. Sharding is implemented using a time series map.
- High availability is achieved by loosening restrictions on data consistency, i.e., ACID properties are not guaranteed.
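
The sharding point above can be illustrated with a generic hash-based sketch. This is not Gorilla's actual data structure: `NUM_SHARDS`, the host names, and all function names are hypothetical, and the round-robin assignment is an assumption made to keep the example short.

```python
import hashlib

NUM_SHARDS = 8  # hypothetical shard count


def shard_for_key(key, num_shards=NUM_SHARDS):
    """Map a time series key to a shard id with a stable hash."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def assign_shards(hosts, num_shards=NUM_SHARDS):
    """Assign each shard to a host in round-robin order (assumption)."""
    return {shard: hosts[shard % len(hosts)] for shard in range(num_shards)}


def host_for_key(key, shard_to_host):
    """Find the host that owns the shard for this key."""
    return shard_to_host[shard_for_key(key, len(shard_to_host))]
```

Keeping the key-to-shard step separate from the shard-to-host step means a failed host's shards can be reassigned without rehashing any keys.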
### Evaluation?
- Query latency: compared to HBase, Gorilla provides a 73x to 350x improvement depending on query size.
- Queries per second: the previous system served 450 qps, Gorilla currently handles more than 5000 qps, peaking at one point at 40000 qps.
- High performance made it possible to build new tools such as correlation engines, visualisation and aggregation tools.
- Fault tolerance: Gorilla was successfully tested against network cuts, disasters, node failures, restarts, and release bugs.
### Drawbacks?
- Small amounts of data could be lost in case of failures.
- Poor data model (e.g., only real values, no units), no database query language.
## Breakout room 3
### Key problem under investigation?
- Key problem is how to strike the right balance between efficiency, scalability, and reliability in TSDBs.
- Writes dominate
- State transitions
- High availability
- Fault Tolerance
### Key idea of the proposed solution?
- Leverage compression techniques such as delta-of-delta timestamps and XOR’d floating point values
- On disk structures
- New time series compression algorithm
- High availability trumps resource efficiency
### How does it solve the problem?
- Compressing timestamps
- Compressing values
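
Value compression is based on XOR-ing each floating point value with its predecessor. The Python sketch below shows only that step, assuming 64-bit IEEE 754 doubles. The function names are hypothetical, and the paper's encoding of leading and trailing zero counts is omitted.

```python
import struct


def float_to_bits(x):
    """Reinterpret a double as a 64-bit unsigned integer."""
    return struct.unpack(">Q", struct.pack(">d", x))[0]


def bits_to_float(b):
    """Reinterpret a 64-bit unsigned integer as a double."""
    return struct.unpack(">d", struct.pack(">Q", b))[0]


def xor_encode(values):
    """First entry is the raw bits of the first value; every later
    entry is the XOR of a value's bits with the previous value's bits."""
    out = []
    prev = 0
    for v in values:
        bits = float_to_bits(v)
        out.append(bits ^ prev)
        prev = bits
    return out


def xor_decode(xors):
    """Undo xor_encode by accumulating the XORs."""
    values = []
    prev = 0
    for x in xors:
        prev ^= x
        values.append(bits_to_float(prev))
    return values
```

A repeated value XORs to zero, and nearby values tend to differ in only a few bits, so the XOR results are cheap to store.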
### Evaluation?
- Gorilla reduced production query latency by over 70x compared to the previous on-disk TSDB.
- It successfully doubled in size twice in this period without much operational effort, demonstrating the scalability of the TSDB.
### Drawbacks?
- Prioritize recent data over historical data.
- Read latency
- High availability trumps resource efficiency.