## Already great 1- Very good language and detailed description of the technology. 2- Good sequence and the split is very good between 2 members. It's challenging to both contribute to the same report while maintaining a cohesive structure but you managed to do it. ## Nice to have suggestions 1- Is it possible to add a subschema of the data and explain the problem from a business perspective? What are analytics looking for? What kind of queries the system needs to support? 2- Why was only relational database choice discussed. Even if you never planned to have a NOSql solution, it's worth discussing it. 3- Related works are solely from academia. ## Improvements suggestions: Can you please improve th spacing a bit? I found it really hard to read the text. Not sure about people with visual impairments 1.4 - `since it is considered the most valuable data for analytical purposes, it was decided to limit the scope to just` You should avoid stating such statements without explanation "the most valuable" like why? 2 - Very high level description of the tools. Google Cloud Platform, Dataflow. There is nothing wrong at all with the explanations. It's quite good. However you can still replace GCP with something like Databricks and would still be a fact. You should mention those technologies but probably with less words. The reader should read more about the problem you are solving, not about what can the tools do (instead, focus on what you did with them). 4.2.2 Query performance, it would be nice if numbers have a \ref to the query in appendix. 4.1 It was tricky for me to understand 4.3 figure. After I did, I realized it could be just easier if the x-axis represents the data size and the y-axis represents the time. So no additional mapping. ## Readability improvements: ### Unclear sentences 3.2.1 Each pipeline was selected to handle a large amount of time-series data. (large does not say anything, how large?) 3.2.1 PBS maybe could have been explained in the previous section along with the other technologies?