---
title: HW6-7
tags: dataframe
---

<style>
.green { padding: 0 2px; white-space: pre-line; border-radius: 2px; background-color: #CCFF99;}
.red { padding: 0 2px; white-space: pre-line; border-radius: 2px; background-color: #FFA488;}
.blue { padding: 0 2px; white-space: pre-line; border-radius: 2px; background-color: #77DDFF;}
.purple { padding: 0 2px; white-space: pre-line; border-radius: 2px; background-color: #D1BBFF;}
.ph { margin-left : auto ; margin-right: auto ; display : block; }
.ph_7{ margin-left : auto ; margin-right: auto ; display : block; width : 70%;}
</style>

# `HW6-7 Key Points:`

+ [A short PLT tutorial](https://wizardforcel.gitbooks.io/matplotlib-intro-tut/content/matplotlib/3.html)

**input: data source: Johns Hopkins CSSE, USA
process: plot the trend of the number of deaths in Brazil
output: death trend chart (see the output below)**

```python
import pandas as pd
import matplotlib.pyplot as plt
from datetime import datetime  # date handling
import math                    # math helpers
import numpy as np             # numerical arrays

# load dataset (only the deaths table is needed for this chart)
dataset_D = pd.read_csv('https://raw.githubusercontent.com/CSSEGISandData/COVID-19/master/csse_covid_19_data/csse_covid_19_time_series/time_series_covid19_deaths_global.csv')

# data pre-processing & analysis
sCountry = 'Brazil'
df_D = dataset_D[dataset_D['Country/Region'] == sCountry]
total_D = df_D.iloc[0, -1]               # latest cumulative number of deaths
x = df_D.iloc[0, 4:].index               # dates (the first 4 columns are metadata)
y = df_D.iloc[0, 4:].astype(int).values  # cumulative deaths on each date

# data visualization
plt.figure(figsize=(16, 12))
plt.plot(x, y,  # x axis: date, y axis: number of persons
         color='red',
         label='Num. of Deaths',
         marker='.')
plt.title('COVID19 Number of Deaths in ' + sCountry, color='black', fontsize=20)
plt.ylabel('Number of Persons', fontsize=15)
plt.xlabel('Date', color='black', fontsize=15)

# optimize x axis
start_date = datetime.strptime('01/22/20', '%m/%d/%y')
end_date = datetime.strptime(df_D.columns[-1], '%m/%d/%y')
delta = end_date - start_date        # number of days from the first day to the latest report
nMonth = math.ceil(delta.days / 30)  # approximate number of months (30 days each)
x_tick = [df_D.columns[4 + i * 30] for i in range(nMonth)]  # one tick label per month
xvalues = np.arange(0, nMonth * 30, step=30)
plt.xticks(xvalues, x_tick, rotation=45, size=7, color='blue')

# optimize y axis: ten equal segments up to the total
y_seg_size = total_D / 10
y_tick = ['{0:,.0f}'.format(y_seg_size * (i + 1)) for i in range(10)]
plt.yticks(np.linspace(y_seg_size, total_D, num=10), y_tick, size=10, color='red')
plt.grid()
plt.show()
```

![](https://i.imgur.com/BwtMao2.png)

---

**input: data source: Johns Hopkins CSSE, USA
process: compute the number of new deaths per day in India
output: trend chart of daily new deaths in India (see the output figure)
(hint: use the Series `diff()` function)**

```python
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np  # numerical arrays

# load dataset
dataset_D = pd.read_csv('https://raw.githubusercontent.com/CSSEGISandData/COVID-19/master/csse_covid_19_data/csse_covid_19_time_series/time_series_covid19_deaths_global.csv')

# data pre-processing & analysis
sCountry = 'India'
df_D = dataset_D[dataset_D['Country/Region'] == sCountry]
cumulative = df_D.iloc[0, 4:].astype(int)  # cumulative deaths, indexed by date
# Equivalent without diff(): df_D.iloc[0, 5:].values - df_D.iloc[0, 4:-1].values
daily = cumulative.diff().iloc[1:]         # the first diff is NaN, so skip it
x = daily.index                            # dates, starting from the second day
dy = daily.values                          # new deaths on each date
#%%
ytick = np.linspace(max(dy) / 9, max(dy), num=9)  # nine equally spaced ticks
xtick = np.arange(0, dy.size, 30)                 # one tick every 30 days

# data visualization
fig = plt.figure(figsize=(24, 12))
ax = fig.add_subplot(1, 1, 1)
plt.plot(x, dy,  # x axis: date, y axis: number of persons
         color='red',
         label='Daily Num. of Deaths',
         marker='.')
plt.ylim([0, max(dy) * 11 / 10])
ax.grid(which='both')
ax.set_xticks(xtick)
ax.set_yticks(ytick)
ax.tick_params(axis='x', colors='blue')
ax.tick_params(axis='y', colors='red')
plt.title('COVID19 Number of Deaths Increased Daily in ' + sCountry, color='black', fontsize=20)
plt.ylabel('Number of Persons', fontsize=15)
plt.xlabel('Date', fontsize=15)
plt.show()
```

![](https://i.imgur.com/jQqwThI.png)

---

# PLT Settings

+ `PLOT`
    ```python
    plt.plot(x,  # x axis
             y,  # y axis
             color = 'red',
             label = 'Num. of Confirmed',
             marker = '.')
    ```
+ `X, Y axis labels`
    ```python
    labelname = 'Number of Persons'
    plt.xlabel(labelname,
               fontsize = 15,
               color = 'black')
    ```
    + Use `plt.xlabel` and `plt.ylabel` to set the labels of the X and Y axes.
+ `X, Y axis ticks`
    ```python
    xvalues = np.arange(0, nMonth*30, step=30)
    xlabels = x_tick
    plt.xticks(xvalues,  # X axis tick positions
               xlabels,  # label shown at each X axis tick
               rotation = 45,
               size = 7,
               color = 'blue')
    ```
    + Use `plt.xticks` and `plt.yticks` to set the ticks of the X and Y axes.
+ `Figure Title`
    ```python
    title = 'title'
    plt.title(title,
              fontsize = 20,
              color = 'black')
    ```
+ `Grid lines`
    ```python
    plt.grid()
    ```
+ `Line names (legend)`
    ```python
    plt.legend()
    ```
    + Displays the `label` of each plotted line.
    ```python
    import matplotlib.pyplot as plt

    x = [1,2,3]
    y = [5,7,4]
    x2 = [1,2,3]
    y2 = [10,14,12]

    plt.plot(x, y, label='First Line')
    plt.plot(x2, y2, label='Second Line')
    plt.xlabel('Plot Number')
    plt.ylabel('Important var')
    plt.title('Interesting Graph\nCheck it out')
    plt.legend()
    plt.show()
    ```

<img src = https://i.imgur.com/pYCBmrK.png class = "ph">
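
The India exercise relies on `Series.diff()` to turn cumulative totals into daily increases. Here is a minimal sketch on made-up numbers; the dates and counts are illustrative only and are not taken from the Johns Hopkins data.

```python
import pandas as pd

# Hypothetical cumulative counts, indexed by date strings in the CSSE column style
cumulative = pd.Series([0, 2, 5, 5, 9],
                       index=['1/22/20', '1/23/20', '1/24/20', '1/25/20', '1/26/20'])

# diff() subtracts the previous element from each element;
# the first result is NaN (nothing to subtract), so it is dropped
daily = cumulative.diff().iloc[1:].astype(int)

print(daily.tolist())  # [2, 3, 0, 4]
```

Because the first difference is dropped, `daily` is one element shorter than `cumulative`, which is why the exercise takes its x values from the second date onward.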
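
Both homework scripts place one x tick every 30 columns. That step could be wrapped in a small helper, sketched below. `monthly_ticks` is a hypothetical name (not a matplotlib function), the date labels are placeholders, and the 30-entry step is the same rough approximation of a month used in the scripts.

```python
import numpy as np

def monthly_ticks(dates, step=30):
    """Return (positions, labels) for one tick every `step` entries of `dates`."""
    positions = np.arange(0, len(dates), step)
    labels = [dates[i] for i in positions]
    return positions, labels

# Hypothetical placeholder labels standing in for 65 consecutive dates
dates = ['day{}'.format(i) for i in range(65)]
positions, labels = monthly_ticks(dates)

print(positions.tolist())  # [0, 30, 60]
print(labels)              # ['day0', 'day30', 'day60']
```

The returned pair can be passed directly to `plt.xticks(positions, labels, rotation=45)`.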