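# Individual Lesson 10
**Class Notes**

* Count the number of data values:
```python=
# dfIncome is the income DataFrame used in class; it must have an 'Income' column
dfIncome['Income'].count()
```
* Build your own data and count the values:
```python=
import pandas as pd

data = {
    "duration": [50, 40, None, None, 90, 201],
    "Pulse": [109, 140, 110, 125, 138, 170]
}
df = pd.DataFrame(data)
# count() skips missing values (None/NaN); 'Pulse' has none, so this prints 6
print(df['Pulse'].count())
```
* Compute the sum and the count of the data:
```python=
# named totalIncome rather than sum to avoid shadowing the built-in sum()
totalIncome = dfIncome['Income'].sum()
n = dfIncome['Income'].count()
print(f'Sum = {totalIncome}, sample size = {n}')
```
* Plot the data as a histogram:
```python
import matplotlib.pyplot as plt

# bins of width 1000 covering 0 up to 99000
plt.hist(dfIncome['Income'], bins=range(0, 100000, 1000))
plt.show()
```
* Plot a histogram with a set number of bins and a colour parameter:
```python=
x = [21, 22, 23, 4, 5, 6, 77, 8, 9, 10, 31, 32, 33, 34, 35,
     36, 37, 18, 49, 50, 100]
num_bins = 3
# n: count per bin, bins: bin edges, patches: the drawn bars
n, bins, patches = plt.hist(x, num_bins, facecolor='black', alpha=0.9)
plt.show()
```
* Insert variables into a string with an f-string:
```python=
price = 8
amount = 9
total = price * amount
print(f'You bought {amount} apples at {price} each, for a total of {total}')
```
* Build a DataFrame with custom columns and sort one column:
```python=
import numpy as np

df = pd.DataFrame({
    'col1': ['A', 'A', 'b', np.nan, 'D', 'C'],
    'col2': [2, 1, 9, 8, 7, 4],
    'col3': [0, 1, 9, 4, 2, 3],
    'col4': ['a', 'B', 'c', 'D', 'e', 'F']
})
# the first row of a column has index 0; NaN is placed last by default
df['col1'].sort_values()
```
* Sort the income data in preparation for the Lorenz curve:
```python=
dfLorenz = dfIncome[:]
se = dfIncome['Income'].sort_values()
cumulativeSum = 0
```
* Gini coefficient setup:
```python=
dfLorenz = dfIncome[:]
# load the data and initialise the variables
# (n must be the sample size again here: the histogram example above reassigned n)
se = dfIncome['Income'].sort_values()
cumulativeSum = 0
i = 0
xx = []  # cumulative share of the population
yy = []  # cumulative share of total income
# trace each step manually
for x in se:
    i = i + 1
    cumulativeSum = cumulativeSum + x
    xx.append(i / n)
    yy.append(cumulativeSum / totalIncome)
    print(f'i={i} x={x} cumulativeSum={cumulativeSum} i/n={i/n} ')
# the notes stop at the label; the Gini value itself is not computed here
print('Gini coefficient =')
```
* Scatter plot:
```python=
plt.scatter(x=xx, y=yy, s=0.1)
plt.axis('square')
plt.xlim(0, 1)
plt.ylim(0, 1)
print('Lorenz curve')
plt.show()
```
* Scatter plot example:
```python=
x = np.array([5, 7, 8, 7])
y = np.array([99, 86, 87, 88])
plt.scatter(x, y)
plt.show()
```
* Scatter plot example with a set marker size:
```python=
x = np.array([5, 7, 8, 7])
y = np.array([9, 6, 7, 3])
# s sets the marker area in points squared
plt.scatter(x, y, s=500)
plt.axis('square')
plt.xlim(0, 10)
plt.ylim(0, 10)
plt.show()
```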
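
The Gini coefficient step in these notes builds the Lorenz curve points (`xx`, `yy`) but never computes the coefficient itself. The block below is a minimal sketch of one common way to finish it, assuming the Gini coefficient is taken as 1 minus twice the area under the Lorenz curve, with the area estimated by the trapezoid rule. The helper names `lorenz_points` and `gini_from_lorenz` and the `sample` incomes are hypothetical and not from the class material; the class may have used a different formula.

```python
import numpy as np

def lorenz_points(incomes):
    """Return (xx, yy): cumulative population share and cumulative income share."""
    values = np.sort(np.asarray(incomes, dtype=float))
    n = len(values)
    xx = np.arange(1, n + 1) / n           # i/n for i = 1..n
    yy = np.cumsum(values) / values.sum()  # running income sum / total income
    return xx, yy

def gini_from_lorenz(xx, yy):
    """Assumed formula: Gini = 1 - 2 * (area under the Lorenz curve)."""
    # prepend the origin (0, 0) so the curve starts where the Lorenz curve starts
    x = np.concatenate(([0.0], xx))
    y = np.concatenate(([0.0], yy))
    # trapezoid rule: width * average height, summed over each segment
    area = np.sum((x[1:] - x[:-1]) * (y[1:] + y[:-1]) / 2)
    return 1 - 2 * area

sample = [10, 20, 30, 40]  # hypothetical incomes, for illustration only
xx, yy = lorenz_points(sample)
gini = gini_from_lorenz(xx, yy)
print(f'Gini coefficient = {gini:.3f}')  # prints: Gini coefficient = 0.250
```

With the class data, the same two calls could be applied to `dfIncome['Income']` in place of `sample`. A value of 0 means every income is equal, and values closer to 1 mean income is concentrated in fewer hands.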