## Linear Regression
----
Examples:
$f$(drug name) ⮕ dosage
$f$(5090) ⮕ compute power
---
### Pokémon CP Prediction
$f$() ⮕ CP after evolution
$\{x_{cp}, x_s, x_{hp}\}$ ⮕ $y$
---
### Step 1
----
Set up the __model (a set of functions)__
__Define the model__
Assume $y = b + w \cdot x_{cp}$ $\Rightarrow$ This is a __Linear Model__
__w__ and __b__ are parameters $\Rightarrow$ they can take any value
$f_1$: $y = 10.0 + 9.0 \cdot x_{cp}$
$f_2$: $y = -3.1 + 8.9 \cdot x_{cp}$
$f_3$: $y = -0.8 - 0.3 \cdot x_{cp}$
----
### Linear Model
Any linear model can be written in the general form $y = b + \sum\limits_i w_i x_i$
$x_i$: an attribute of the input $\Rightarrow$ __feature__
$w_i$: weight, $b$: bias
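
As a quick sketch (the feature values and weights here are made up for illustration), the general form is just a dot product plus a bias:
```python=
import numpy as np

# Illustrative only: three features (x_cp, x_s, x_hp) with made-up weights
x = np.array([176.0, 1.0, 35.0])  # feature vector
w = np.array([9.0, 0.5, 0.1])     # one weight per feature
b = 10.0                          # bias

y = b + np.dot(w, x)  # y = b + sum_i(w_i * x_i)
print(y)              # 1598.0
```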
---
### Step 2
----
### Review
A set of Functions (Model)
$\downarrow$
Goodness of $f()$
$\uparrow$
Training Data
Step 2 is to define a measure for the goodness of $f()$
__Define the objective function and train__
----
### Training Data
How do we denote the inputs and outputs?
function input: $x^n$
function output: $\hat{y}^n$
$\hat{y}$ denotes the correct output
----
### Loss function _L_
A function that measures the difference between the predicted output and the actual output.
Given $m$ data points, the loss function is:
$L(f) = L(w,b) = \frac{1}{m}\sum\limits_{n = 1}^{m}\left(\hat{y}^n - (b + w \cdot x^{n}_{cp})\right)^2$
$\hat{y}^n$ is the actual output
$b + w \cdot x^{n}_{cp}$ is the predicted output
For each data point, take the squared error between the two; summing these over all the data (and averaging) gives the value of the loss function.
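
A minimal sketch of this loss in code (assuming `x_data` holds the $x^{n}_{cp}$ values and `y_data` the corresponding $\hat{y}^n$):
```python=
def loss(w, b, x_data, y_data):
    """Mean squared error between the predictions b + w*x and the true outputs."""
    m = len(x_data)
    return sum((y_data[n] - (b + w * x_data[n]))**2 for n in range(m)) / m
```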
----
### Training Data Example (Row[2] & Row[14])
```csvpreview=
name,species,cp,hp,weight,height,power_up_stardust,power_up_candy,attack_weak,attack_weak_type,attack_weak_value,attack_strong,attack_strong_type,attack_strong_value,cp_new,hp_new,weight_new,height_new,power_up_stardust_new,power_up_candy_new,attack_weak_new,attack_weak_type_new,attack_weak_value_new,attack_strong_new,attack_strong_type_new,attack_strong_value_new,notes,
Pidgey20,Pidgey,176,35,1.81,0.29,1000,1,Tackle,Normal,12,Air Cutter,Flying,30,327,54,3.46,1.07,1000,1,Steel Wing,Steel,15,Twister,Dragon,25,
Pidgey21,Pidgey,97,30,1.95,0.31,600,1,Tackle,Normal,12,Twister,Dragon,25,181,44,1.79,1.15,600,1,Steel Wing,Steel,15,Aerial Ace,Flying,30,
Pidgey22,Pidgey,74,24,0.78,0.26,600,1,Tackle,Normal,12,Twister,Dragon,25,141,38,30,0.95,600,1,Steel Wing,Steel,15,Twister,Dragon,25,
Pidgey23,Pidgey,127,32,2.33,0.34,800,1,Tackle,Normal,12,Air Cutter,Flying,30,241,49,3.4,1.23,800,1,Steel Wing,Steel,15,Air Cutter,Flying,30,
Pidgey24,Pidgey,78,26,2.11,0.32,600,1,Tackle,Normal,12,Twister,Dragon,25,146,39,3.16,1.17,600,1,Steel Wing,Steel,15,Aerial Ace,Flying,30,
Pidgey25,Pidgey,240,41,2.53,0.33,1600,2,Tackle,Normal,12,Aerial Ace,Flying,30,448,64,8.21,1.21,1600,2,Wing Attack,Flying,9,Air Cutter,Flying,30,
Pidgey26,Pidgey,276,48,0.87,0.24,2200,2,Tackle,Normal,12,Aerial Ace,Flying,30,530,74,30,0.86,2200,2,Steel Wing,Steel,15,Aerial Ace,Flying,30,
Pidgey27,Pidgey,207,41,1.19,0.26,1600,2,Quick Attack,Normal,10,Air Cutter,Flying,30,393,64,30,0.96,1600,2,Steel Wing,Steel,15,Air Cutter,Flying,30,
Pidgey28,Pidgey,176,35,1.68,0.31,1300,2,Tackle,Normal,12,Air Cutter,Flying,30,335,55,30,1.13,1300,2,Steel Wing,Steel,15,Aerial Ace,Flying,30,
Pidgey29,Pidgey,316,47,1.71,0.27,2500,2,Quick Attack,Normal,10,Aerial Ace,Flying,30,594,74,5.71,0.99,2500,2,Wing Attack,Flying,9,Twister,Dragon,25,
Pidgey30,Pidgey,305,53,2.04,0.32,2200,2,Quick Attack,Normal,10,Twister,Dragon,25,567,79,1.95,1.17,2200,2,Steel Wing,Steel,15,Aerial Ace,Flying,30,
Pidgey31,Pidgey,304,46,1.99,0.31,2500,2,Tackle,Normal,12,Air Cutter,Flying,30,579,73,3.88,1.12,2500,2,Steel Wing,Steel,15,Twister,Dragon,25,
Pidgey32,Pidgey,242,43,2.09,0.34,1900,2,Quick Attack,Normal,10,Twister,Dragon,25,459,67,30,1.26,1900,2,Steel Wing,Steel,15,Air Cutter,Flying,30,
Pidgey33,Pidgey,250,42,0.95,0.26,1900,2,Quick Attack,Normal,10,Twister,Dragon,25,471,66,30,0.95,1900,2,Wing Attack,Flying,9,Twister,Dragon,25,
Pidgey34,Pidgey,226,40,2.04,0.31,1600,2,Tackle,Normal,12,Aerial Ace,Flying,30,428,63,4.76,1.12,1600,2,Steel Wing,Steel,15,Aerial Ace,Flying,30,
Pidgey35,Pidgey,220,43,2.13,0.3,1600,2,Tackle,Normal,12,Aerial Ace,Flying,30,418,66,7.41,1.1,1600,2,Steel Wing,Steel,15,Twister,Dragon,25,
Pidgey36,Pidgey,185,40,1.63,0.29,1300,2,Tackle,Normal,12,Twister,Dragon,25,354,62,0.53,1.07,1300,2,Steel Wing,Steel,15,Aerial Ace,Flying,30,
Pidgey37,Pidgey,42,20,2.11,0.33,400,1,Quick Attack,Normal,10,Aerial Ace,Flying,30,80,30,1.02,1.21,400,1,Wing Attack,Flying,9,Aerial Ace,Flying,30,
Pidgey38,Pidgey,344,56,2.57,0.35,3000,3,Tackle,Normal,12,Aerial Ace,Flying,30,647,84,4.16,1.29,3000,3,Steel Wing,Steel,15,Twister,Dragon,25,
Pidgey39,Pidgey,108,31,1.19,0.25,800,1,Tackle,Normal,12,Air Cutter,Flying,30,205,47,0.54,0.91,800,1,Wing Attack,Flying,9,Air Cutter,Flying,30,
```
----
### Generating a Scatter Plot
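A minimal plotting sketch, assuming the training CSV above is saved locally (the filename here is hypothetical), reading the cp and cp_new columns named in the slide title:
```python=
import csv
import matplotlib.pyplot as plt

x_data, y_data = [], []
with open('train.csv', 'r') as file:  # hypothetical path to the training CSV
    reader = csv.reader(file)
    next(reader)                      # skip the header row
    for row in reader:
        x_data.append(float(row[2]))   # cp before evolution
        y_data.append(float(row[14]))  # cp_new after evolution

plt.scatter(x_data, y_data)
plt.xlabel('cp')
plt.ylabel('cp_new')
plt.show()
```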

----
### Implementation (reusing the data above)
```python=
import numpy as np
import matplotlib.pyplot as plt

# x_data, y_data: the cp / cp_new columns loaded from the training CSV
x = np.arange(-200, -100, 1)  # bias candidates
y = np.arange(-5, 5, 0.1)     # weight candidates
Z = np.zeros((len(x), len(y)))
for i in range(len(x)):
    for j in range(len(y)):
        b = x[i]
        w = y[j]
        for n in range(len(x_data)):
            Z[i][j] = Z[i][j] + (y_data[n] - b - w * x_data[n])**2  # loss function
        Z[i][j] = Z[i][j] / len(x_data)  # average loss
plt.imshow(Z, extent=[-5, 5, -200, -100], aspect='auto', origin='lower')
plt.colorbar(label='Loss')
plt.xlabel('Weight')
plt.ylabel('Bias')
plt.title('Loss Function')
plt.show()
```
----
$L(w,b)$
The more purple the region, the smaller $L(w,b)$ $\Rightarrow$ better parameters

---
### Step 3
----
__Find the best function__
A set of Functions (Model)
$\downarrow$
Goodness of $f()$ $\rightarrow$ Best Function
$\uparrow$
Training Data
----
$f^* = \arg\min\limits_f L(f)$
$w^*, b^* = \arg\min\limits_{w,b} L(w,b)$
$L(w,b) = \sum\limits_{n = 1}^{10}\left(\hat{y}^n - (b + w \cdot x^{n}_{cp})\right)^2$
$w^*, b^* = \arg\min\limits_{w,b} \sum\limits_{n = 1}^{10}\left(\hat{y}^n - (b + w \cdot x^{n}_{cp})\right)^2$
----
$w^*, b^* = \arg\min\limits_{w,b} \sum\limits_{n = 1}^{10}\left(\hat{y}^n - (b + w \cdot x^{n}_{cp})\right)^2$
This yields the function with the lowest $L(w,b)$ on the training data
$\Downarrow$
__Best function__
---
### Gradient Descent
----
Use this optimization algorithm to find a local minimum of the function
$w^*, b^* = \arg\min\limits_{w,b} \sum\limits_{n = 1}^{10}\left(\hat{y}^n - (b + w \cdot x^{n}_{cp})\right)^2$
----
### Simplification
Consider a loss $L(w)$ with only __one__ parameter
$w^* = \arg\min\limits_w L(w)$

----
### The Inefficient Approach
Exhaustively enumerate $w$ and pick the value that minimizes $L(w)$
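
A sketch of this brute-force search (reusing the `loss` helper sketched earlier, with $b$ fixed at 0 for simplicity):
```python=
import numpy as np

# Evaluate the loss at every w on a fine grid and keep the best one
candidates = np.arange(-5, 5, 0.001)
best_w = min(candidates, key=lambda w: loss(w, 0.0, x_data, y_data))
```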
----
### The Efficient Approach
Pick a random starting point $w^0$
Compute $\left.\frac{dL}{dw}\right|_{w=w^0}$

----
Not comfortable with derivatives? Just think of it as the slope:
if slope < 0 $\rightarrow$ increase $w$
if slope > 0 $\rightarrow$ decrease $w$

----
### Updating $w$
$w^1 = w^0 - \eta\left.\frac{dL}{dw}\right|_{w=w^0}$
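
A one-parameter sketch of this update rule (the derivative is for the squared-error loss with $b$ omitted; the starting point and $\eta$ are hand-picked for illustration):
```python=
w = 0.0      # starting point w^0
eta = 1e-6   # learning rate, hand-picked for this sketch
for step in range(1000):
    # dL/dw for L(w) = (1/m) * sum_n (y^n - w*x^n)^2
    grad = sum(-2 * x * (y - w * x) for x, y in zip(x_data, y_data)) / len(x_data)
    w = w - eta * grad  # w^{t+1} = w^t - eta * dL/dw
```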
----
### What Drives the Update of $w$
$\eta$ & $\frac{dL}{dw}$
$\eta$: learning rate (a fixed constant)
$\frac{dL}{dw}$: the derivative
----
### Learning Rate
The larger $\eta$ is, the larger each update step
A larger learning rate means faster learning
----
### Local Minimum
After many iterations, we may land at a local minimum
At that point the derivative is 0 $\Rightarrow$ the parameters can no longer update
$w^t = w^{t-1} - \eta \cdot 0 = w^{t-1}$

Because of this, gradient descent is only guaranteed to find a local minimum, not the global minimum
----
### Why Is That Not a Problem Here?
The loss of a linear model is convex, so its only local minimum is also the global minimum
----
Visualize:
https://uclaacm.github.io/gradient-descent-visualiser/#playground
---
### Gradient Descent with Two Parameters
----
$w^*, b^* = \arg\min\limits_{w,b} L(w,b)$
----
Pick random initial values $w^0, b^0$
Compute:
$\left.\frac{\partial L}{\partial w}\right|_{w=w^0, b=b^0}$
$\left.\frac{\partial L}{\partial b}\right|_{w=w^0, b=b^0}$
$\Downarrow$
$w^1 = w^0 - \eta\left.\frac{\partial L}{\partial w}\right|_{w=w^0, b=b^0}$
$b^1 = b^0 - \eta\left.\frac{\partial L}{\partial b}\right|_{w=w^0, b=b^0}$
----
### Implementation (same data as before)
```python=
b = -120  # initial b
w = -4    # initial w
lr = 1    # learning rate
iteration = 100000  # number of iterations
b_lr = 0.0  # accumulated squared gradient for b
w_lr = 0.0  # accumulated squared gradient for w
b_history = [b]
w_history = [w]
for i in range(iteration):
    b_grad = 0.0
    w_grad = 0.0
    for n in range(len(x_data)):
        b_grad = b_grad - 2.0 * (y_data[n] - b - w * x_data[n]) * 1.0
        w_grad = w_grad - 2.0 * (y_data[n] - b - w * x_data[n]) * x_data[n]
    b_lr = b_lr + b_grad**2
    w_lr = w_lr + w_grad**2
    # scale each step by the root of the accumulated squared gradients
    b = b - lr / np.sqrt(b_lr) * b_grad
    w = w - lr / np.sqrt(w_lr) * w_grad
    b_history.append(b)
    w_history.append(w)
```
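Note: dividing each step by the square root of the accumulated squared gradients (`b_lr`, `w_lr`) is the Adagrad adaptive learning-rate scheme; the effective step size shrinks over time, which is why a base `lr` as large as 1 still converges here.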
----

min $b$: -0.15537159439738932
min $w$: 1.8891815145998943
----
### Learning Rate Impact
A larger learning rate $\rightarrow$ faster learning
__But__ slightly less precise
What about lowering lr? $\rightarrow$ learning becomes slower

Why? $\because$ not enough iterations
$\Rightarrow$ raise the iteration count $\rightarrow$ takes more time
----

We can see that the values of $b^*, w^*$ are now more accurate
min $b$: -7.031236207033889
min $w$: 1.9181764491678417
---
### Back to Pokémon CP Prediction
----
Substitute the $b^*, w^*$ found by gradient descent into the original model
$b^*$ = -7.031236207033889
$w^*$ = 1.9181764491678417
Model: $y = b + w \cdot x_{cp}$
----
```python=
import matplotlib.pyplot as plt
import numpy as np
import csv

x_data = []
y_data = []
with open(r'C:\Users\Ryan\Documents\ML\Regression\Dataset-LR\test.csv', 'r') as file:
    reader = csv.reader(file)
    next(reader)  # skip the header row
    for row in reader:
        x_data.append(float(row[0]))
        y_data.append(float(row[1]))
b = -7.031236207033889
w = 1.9181764491678417
x = np.linspace(0, 700)  # cp range for the fitted line
y = b + w * x
plt.scatter(x_data, y_data)
plt.plot(x, y)
plt.show()
```
----

----
### Average error on training data
4.503145265728941
```python=
differences = [abs(y - (b + w * x)) for x, y in zip(x_data, y_data)]
average_difference = np.mean(differences)
print("Average error on training data:", average_difference)
```
----

----
### Testing Data
Training data is used to train the model
Testing data is used to evaluate the model
School analogy: the training data is the teacher, the test data is the exam
----
### Test Data Example (Row[2] & Row[14])
```csvpreview=
name,species,cp,hp,weight,height,power_up_stardust,power_up_candy,attack_weak,attack_weak_type,attack_weak_value,attack_strong,attack_strong_type,attack_strong_value,cp_new,hp_new,weight_new,height_new,power_up_stardust_new,power_up_candy_new,attack_weak_new,attack_weak_type_new,attack_weak_value_new,attack_strong_new,attack_strong_type_new,attack_strong_value_new,notes
Pidgey1,Pidgey,384,56,2.31,0.34,2500,2,Tackle,Normal,12,Aerial Ace,Flying,30,694,84,2.6,1.24,2500,2,Steel Wing,Steel,15,Air Cutter,Flying,30,
Pidgey2,Pidgey,366,54,1.67,0.29,2500,2,Quick Attack,Normal,10,Twister,Dragon,25,669,81,1.93,1.05,2500,2,Wing Attack,Flying,9,Air Cutter,Flying,30,
Pidgey3,Pidgey,353,55,1.94,0.3,3000,3,Quick Attack,Normal,10,Aerial Ace,Flying,30,659,83,3.51,1.11,3000,3,Wing Attack,Flying,9,Air Cutter,Flying,30,
Pidgey4,Pidgey,338,51,1.73,0.31,3000,3,Tackle,Normal,12,Air Cutter,Flying,30,640,79,30,1.12,3000,3,Steel Wing,Steel,15,Air Cutter,Flying,30,
Pidgey5,Pidgey,242,45,1.44,0.27,1900,2,Quick Attack,Normal,10,Air Cutter,Flying,30,457,69,1.42,0.98,1900,2,Wing Attack,Flying,9,Twister,Dragon,25,
Pidgey6,Pidgey,129,35,2.07,0.35,800,1,Quick Attack,Normal,10,Air Cutter,Flying,30,243,52,30,1.27,800,1,Wing Attack,Flying,9,Aerial Ace,Flying,30,
Pidgey7,Pidgey,10,10,0.92,0.25,200,1,Tackle,Normal,12,Air Cutter,Flying,30,15,13,30,0.9,200,1,Wing Attack,Flying,9,Air Cutter,Flying,30,
Pidgey8,Pidgey,25,14,2.72,0.37,200,1,Tackle,Normal,12,Twister,Dragon,25,47,21,2.63,1.35,200,1,Steel Wing,Steel,15,Air Cutter,Flying,30,
Pidgey9,Pidgey,24,13,2.07,0.32,200,1,Quick Attack,Normal,10,Twister,Dragon,25,47,21,3.27,1.16,200,1,Wing Attack,Flying,9,Twister,Dragon,25,
Pidgey10,Pidgey,161,35,1.45,0.31,1000,1,Tackle,Normal,12,Twister,Dragon,25,305,54,30,1.14,1000,1,Steel Wing,Steel,15,Aerial Ace,Flying,30,
Pidgey11,Pidgey,114,31,1.58,0.26,800,1,Tackle,Normal,12,Aerial Ace,Flying,30,213,47,4.65,0.96,800,1,Wing Attack,Flying,9,Aerial Ace,Flying,30,
Pidgey12,Pidgey,333,52,1.85,0.3,3000,3,Tackle,Normal,12,Air Cutter,Flying,30,633,80,2.54,1.1,3000,3,Steel Wing,Steel,15,Aerial Ace,Flying,30,
Pidgey13,Pidgey,132,33,1.63,0.28,800,1,Quick Attack,Normal,10,Aerial Ace,Flying,30,247,50,2.41,1.03,800,1,Wing Attack,Flying,9,Air Cutter,Flying,30,
Pidgey14,Pidgey,60,21,1.67,0.3,400,1,Quick Attack,Normal,10,Twister,Dragon,25,113,33,0.28,1.09,400,1,Steel Wing,Steel,15,Twister,Dragon,25,
Pidgey15,Pidgey,42,19,2.01,0.3,400,1,Quick Attack,Normal,10,Air Cutter,Flying,30,79,29,4.87,1.11,400,1,Wing Attack,Flying,9,Aerial Ace,Flying,30,
Pidgey16,Pidgey,91,29,2.68,0.35,600,1,Quick Attack,Normal,10,Twister,Dragon,25,173,44,5.42,1.3,600,1,Steel Wing,Steel,15,Air Cutter,Flying,30,
Pidgey17,Pidgey,139,34,1.76,0.31,1000,1,Tackle,Normal,12,Twister,Dragon,25,265,53,30,1.13,1000,1,Steel Wing,Steel,15,Aerial Ace,Flying,30,
Pidgey18,Pidgey,330,48,1.62,0.29,2500,2,Quick Attack,Normal,10,Air Cutter,Flying,30,624,75,0.02,1.08,2500,2,Wing Attack,Flying,9,Aerial Ace,Flying,30,
Pidgey19,Pidgey,328,48,1.62,0.3,2500,2,Quick Attack,Normal,10,Twister,Dragon,25,619,76,30,1.08,2500,2,Steel Wing,Steel,15,Aerial Ace,Flying,30,
```
----
### Average error on testing data
6.668680136084539

---
### A More Accurate Model?
----
### Quadratic Model
Model: $y = b + w_1 \cdot x_{cp} + w_2 \cdot (x_{cp})^2$
----
```python=
# Initialization (assumed here, mirroring the linear case; w2 starts at 0)
b, w1, w2 = -120, -4, 0.0
lr = 1
iteration = 100000
b_lr, w1_lr, w2_lr = 0.0, 0.0, 0.0
b_history, w1_history, w2_history = [b], [w1], [w2]
for i in range(iteration):
    b_grad = 0.0
    w1_grad = 0.0
    w2_grad = 0.0
    for n in range(len(x_data)):
        b_grad -= 2.0 * (y_data[n] - b - w1 * x_data[n] - w2 * x_data[n]**2) * 1.0
        w1_grad -= 2.0 * (y_data[n] - b - w1 * x_data[n] - w2 * x_data[n]**2) * x_data[n]
        w2_grad -= 2.0 * (y_data[n] - b - w1 * x_data[n] - w2 * x_data[n]**2) * x_data[n]**2
    b_lr += b_grad**2
    w1_lr += w1_grad**2
    w2_lr += w2_grad**2
    b -= lr / np.sqrt(b_lr) * b_grad
    w1 -= lr / np.sqrt(w1_lr) * w1_grad
    w2 -= lr / np.sqrt(w2_lr) * w2_grad
    b_history.append(b)
    w1_history.append(w1)
    w2_history.append(w2)
print('min b:', b_history[-1], 'min w1:', w1_history[-1], 'min w2:', w2_history[-1])
```
min $b$: -3.44768035432058
min $w_1$: 1.9399339047755344
min $w_2$: -0.00014155455728048015
----
### Average error on training data
2.776670843299643
```python=
differences = [abs(y - (b + w1 * x + w2 * x ** 2)) for x, y in zip(x_data, y_data)]
average_difference = np.mean(differences)
print("Average error on training data:", average_difference)
```

----
### How about testing data?
4.411362971079184

----
### What About Cubic (Quartic, Quintic) Models?
----
### Overfitting
With cubic, quartic, and quintic models, the testing error becomes large
This is overfitting
$\downarrow$
Good results on the training data
do not guarantee equally good results on the testing data
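
One quick way to see this (a sketch, assuming the train and test cp/cp_new columns are loaded as `x_data`/`y_data` and `x_test`/`y_test`; `np.polyfit` stands in for our gradient-descent fits):
```python=
import numpy as np

# Fit polynomials of increasing degree on the training data,
# then compare average error on training vs. testing data
for degree in range(1, 6):
    coeffs = np.polyfit(x_data, y_data, degree)
    train_err = np.mean(np.abs(np.array(y_data) - np.polyval(coeffs, x_data)))
    test_err = np.mean(np.abs(np.array(y_test) - np.polyval(coeffs, x_test)))
    print(degree, train_err, test_err)  # training error falls; testing error eventually rises
```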

----
### Conclusion
A more complex model is not necessarily better!
---
References
----
https://youtu.be/fegAeph9UaA
https://youtu.be/1UqCjFQiiy0
https://pse.is/5sfwgs
https://www.kaggle.com/datasets/mirajdeepbhandari/polynomial-regression
https://www.openintro.org/book/statdata/index.php?data=pokemon
{"description":"type: slide","title":"Linear Regression","contributors":"[{\"id\":\"d4f42d78-aa45-44e2-ade2-4d4e233d6166\",\"add\":26690,\"del\":11002,\"latestUpdatedAt\":null}]"}