# Covid-19 Status report

A simple dashboard on [Google Data Studio](https://datastudio.google.com/s/n1sxlBJVXcU) that visualizes the current state of the Covid-19 outbreak around the world.

Here is the simplified flow of the project:

1. Data is taken in CSV format from the GitHub repo [2019 Novel Coronavirus COVID-19 (2019-nCoV) Data Repository by Johns Hopkins CSSE](https://github.com/CSSEGISandData/COVID-19). The data is a time series, and in this case the original data is in wide format, with one column per date.
2. Since Google suggests tall and narrow tables for time series (you can read about it [here](https://cloud.google.com/bigtable/docs/schema-design-time-series)), I need to collapse all the date columns into a single date column and the infected counts into a single value column. This can be done with the [melt](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.melt.html) or [pivot](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.pivot.html) method in pandas, but melt is easier to use in this situation.
3. After the transformation, the data is pushed to BigQuery using the BigQuery client libraries. Some credential setup is needed, but I won't go into details in this article. You can read more about it [here](https://cloud.google.com/bigquery/docs/reference/libraries).
4. Data is updated daily at 1:00 UTC (8:00 in the Asia/Ho_Chi_Minh timezone) using a cron job from [Prefect](https://www.prefect.io/). An instance on GCP is in charge of running this cron job every day.
5. Once the data source is ready on BigQuery, I use Google Data Studio (GDS) to connect to it. With GDS you can select an entire BigQuery table, or a portion of it using a SQL query. Some of the fields on the dashboard are created with the custom field function in GDS to avoid excessive queries against BigQuery, which can cost a lot of money.
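
The wide-to-tall transformation in step 2 can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the sample values are made up, the `to_tall` helper and the `confirmed` column name are hypothetical, and it assumes the identifier columns are named `Province/State`, `Country/Region`, `Lat`, and `Long`, with dates formatted like `1/22/20`.

```python
import pandas as pd

# Made-up sample in a wide layout: one row per location, one column per date.
wide = pd.DataFrame({
    "Province/State": [None, "Hubei"],
    "Country/Region": ["Vietnam", "China"],
    "Lat": [14.0, 31.0],
    "Long": [108.0, 112.0],
    "1/22/20": [0, 10],
    "1/23/20": [2, 15],
})

# Assumed identifier columns; every other column is treated as a date.
ID_COLS = ["Province/State", "Country/Region", "Lat", "Long"]


def to_tall(wide_df, value_name="confirmed"):
    """Collapse the per-date columns into a `date` column and a value column."""
    tall = wide_df.melt(id_vars=ID_COLS, var_name="date", value_name=value_name)
    # Parse the column labels (e.g. "1/22/20") into real timestamps.
    tall["date"] = pd.to_datetime(tall["date"], format="%m/%d/%y")
    return tall.sort_values(["Country/Region", "date"]).reset_index(drop=True)


tall = to_tall(wide)
print(tall)
```

Each location now contributes one row per date, which is the tall and narrow shape that is convenient to append to daily and to filter by date in the dashboard.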