# Personal Lesson 10
**Class Notes**
* Count the number of entries in a column:
```python=
dfIncome['Income'].count()  # counts non-null values; dfIncome is assumed to be loaded already
```
* Build a small data set by hand and count its entries:
```python=
import pandas as pd

data = {
    "duration": [50, 40, None, None, 90, 201],
    "Pulse": [109, 140, 110, 125, 138, 170],
}
df = pd.DataFrame(data)
print(df['Pulse'].count())
```
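To see why `count()` is used here rather than `len()`, the following is a minimal sketch on the same data, assuming pandas is installed. The `duration` column contains two `None` entries, and `count()` leaves them out while `len()` does not. The variable names are illustrative only.

```python
import pandas as pd

data = {
    "duration": [50, 40, None, None, 90, 201],
    "Pulse": [109, 140, 110, 125, 138, 170],
}
df = pd.DataFrame(data)

rows = len(df)                           # number of rows, missing or not
pulse_count = df["Pulse"].count()        # no missing values in this column
duration_count = df["duration"].count()  # the two None entries are skipped
missing = df["duration"].isna().sum()    # how many entries are missing

print(rows, pulse_count, duration_count, missing)
```

If a column may contain gaps, comparing `count()` with `len()` is a quick way to spot them.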
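* Compute the total and the number of entries:
```python=
# Note: the name `sum` shadows Python's built-in sum(); it is reused in the Gini block below
sum = dfIncome['Income'].sum()
n = dfIncome['Income'].count()
print(f'Total = {sum}, sample size = {n}')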
```
* Plot the data as a histogram
```python
import matplotlib.pyplot as plt
plt.hist(dfIncome['Income'], bins=range(0, 100000, 1000))  # bin edges every 1,000
plt.show()
```
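The `bins=range(0, 100000, 1000)` argument is a sequence of bin edges, not a bin count. The sketch below uses `np.histogram`, which `plt.hist` relies on for binning, so nothing is drawn. The four income values are made up for illustration and are not taken from `dfIncome`.

```python
import numpy as np

edges = list(range(0, 100000, 1000))  # 0, 1000, ..., 99000
incomes = [500, 1500, 99000, 120000]  # hypothetical values

counts, used_edges = np.histogram(incomes, bins=edges)

num_bins = len(counts)         # one fewer than the number of edges
kept = int(counts.sum())       # values that fall inside [0, 99000]
dropped = len(incomes) - kept  # 120000 lies past the last edge
print(num_bins, kept, dropped)
```

Values beyond the last edge are silently left out of the histogram, so it may be worth checking the maximum income before choosing the upper limit.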
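* Plot a histogram with a chosen number of bins and color parameters
```python=
import matplotlib.pyplot as plt
x = [21, 22, 23, 4, 5, 6, 77, 8, 9, 10, 31, 32, 33, 34, 35, 36, 37, 18, 49, 50, 100]
num_bins = 3
# Named counts rather than n so the sample size n from the block above is not overwritten
counts, bins, patches = plt.hist(x, num_bins, facecolor='black', alpha=0.9)
plt.show()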
```
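The first two values returned by `plt.hist` are the per-bin counts and the bin edges. As a rough check without drawing anything, the sketch below computes them with `np.histogram` on the same list. The names `counts` and `edges` are chosen here for illustration.

```python
import numpy as np

x = [21, 22, 23, 4, 5, 6, 77, 8, 9, 10, 31, 32, 33, 34, 35,
     36, 37, 18, 49, 50, 100]

# Three equal-width bins between min(x) = 4 and max(x) = 100
counts, edges = np.histogram(x, bins=3)

print(counts)  # number of values in each bin
print(edges)   # the four bin edges
```

With three bins the edges fall at 4, 36, 68 and 100, so most of the values land in the first bin.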
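* Define values and substitute them into a formatted string (f-string)
```python=
price = 8
amount = 9
total = price * amount
print(f'You bought {amount} apples at a unit price of {price}; the total is {total}')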
```
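* Define a DataFrame with custom columns and output one column in sorted order
```python=
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'col1': ['A', 'A', 'b', np.nan, 'D', 'C'],
    'col2': [2, 1, 9, 8, 7, 4],
    'col3': [0, 1, 9, 4, 2, 3],
    'col4': ['a', 'B', 'c', 'D', 'e', 'F']
})
df['col1'].sort_values()
# The first entry of a column is item 0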
```
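Sorting `col1` mixes upper- and lowercase letters with a missing value, so the result may look surprising. The sketch below, assuming pandas 1.1 or later for the `key` argument, shows the default order next to a case-insensitive one. The variable names are illustrative.

```python
import numpy as np
import pandas as pd

col1 = pd.Series(['A', 'A', 'b', np.nan, 'D', 'C'])

# Default: uppercase letters sort before lowercase, NaN goes last
default_sorted = col1.sort_values()
# Lowercase the sort keys so 'b' lands between 'A' and 'C'
caseless_sorted = col1.sort_values(key=lambda s: s.str.lower())

print(default_sorted.tolist())
print(caseless_sorted.tolist())
```

In both cases the original row labels travel with the values, which is why the index of the sorted output is no longer in order.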
* Copy the income data and sort the incomes in ascending order
```python=
dfLorenz = dfIncome[:]
se = dfIncome['Income'].sort_values()
cumulativeSum = 0
```
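* Setting up the Gini coefficient
```python=
dfLorenz = dfIncome[:]
# Load the data and initialize the variables
# (sum and n come from the total and sample-size block above)
se = dfIncome['Income'].sort_values()
cumulativeSum = 0
i = 0
xx = []
yy = []
# Trace the running values by hand
for x in se:
    i = i + 1
    cumulativeSum = cumulativeSum + x
    xx.append(i / n)                # cumulative share of the population
    yy.append(cumulativeSum / sum)  # cumulative share of total income
    print(f'i={i} x={x} cumulativeSum={cumulativeSum} i/n={i/n} ')
print('Gini coefficient=')  # the value itself is not computed in this block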
```
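The block above collects the Lorenz curve points but stops before producing a number. One common way to finish, shown here as a sketch and not necessarily the formula used in class, is to take the area under the Lorenz curve with the trapezoid rule and compute 1 minus twice that area. The helper `gini_from_incomes` and the two sample lists are hypothetical.

```python
import numpy as np

def gini_from_incomes(incomes):
    """Gini coefficient via the trapezoid rule on the Lorenz curve (hypothetical helper)."""
    values = np.sort(np.asarray(incomes, dtype=float))
    n = len(values)
    # Lorenz curve points, starting from the origin (0, 0)
    xs = np.arange(n + 1) / n
    ys = np.concatenate(([0.0], np.cumsum(values))) / values.sum()
    # Area under the Lorenz curve by the trapezoid rule
    area = np.sum(np.diff(xs) * (ys[1:] + ys[:-1]) / 2)
    return 1 - 2 * area

equal = gini_from_incomes([5, 5, 5, 5])     # everyone earns the same
unequal = gini_from_incomes([0, 0, 0, 10])  # one person earns everything
print(equal, unequal)
```

Equal incomes give a value of 0, and the more the income is concentrated in a few hands, the closer the value gets to 1.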
* Scatter plot
```python=
import matplotlib.pyplot as plt

plt.scatter(x=xx, y=yy, s=0.1)  # xx, yy are the Lorenz points built above
plt.axis('square')
plt.xlim(0, 1)
plt.ylim(0, 1)
print('Lorenz curve')
plt.show()
```
* Scatter plot example
```python=
import numpy as np
import matplotlib.pyplot as plt

x = np.array([5, 7, 8, 7])
y = np.array([99, 86, 87, 88])
plt.scatter(x, y)
plt.show()
```
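* Scatter plot example with a custom marker size
```python=
import numpy as np
import matplotlib.pyplot as plt

x = np.array([5, 7, 8, 7])
y = np.array([9, 6, 7, 3])
plt.scatter(x, y, s=500)  # s sets the marker size
plt.axis('square')
plt.xlim(0, 10)
plt.ylim(0, 10)
plt.show()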
```