# 2020CCU_Machine_Learning_hw1
###### author:`405420052 資工四 陳奕瑋`

## Use the linear model y = 2x + ε with zero-mean Gaussian noise ε ∼ N(0, 1) to generate 20 data points with (equal spacing) x ∈ [−3, 3]. :question:

### ( a ) Perform linear regression. 20 data points are split into 15 training samples and 5 testing samples (75% for training and 25% for testing).<br/><br/>Show the **fitting plots** and the **training error**, **cross-validation errors** for both leave-one-out and five-fold, and **testing errors**.

---

#### Procedure

1. Generate `x` and `y` as specified in the problem.

```python=
import math
import numpy as np
import matplotlib.pyplot as plt

def generate():
    # zero-mean Gaussian noise and 20 equally spaced points in [-3, 3]
    epsilon = np.random.normal( 0 , 1 , 20 )
    x = np.linspace( -3 , 3 , 20 )
    y = 2 * x + epsilon
    return x , y
```
- `epsilon` is the zero-mean Gaussian noise ε ∼ N(0, 1) from the problem statement
- `x` consists of 20 equally spaced data points in [−3, 3]

2. Split the data into 25% testing data and 75% training data.

```python=
def splitdata( a ):
    percent = 0.25
    sp = math.floor( percent * a.size )
    test = a[ : sp ]          # first 25% of the samples
    train = a[ sp : a.size ]  # remaining 75%
    return train , test
```
- the testing data takes the first 25% of the samples
- the training data takes the remaining samples

3. Solve for W<sub>linear</sub> from the training data.

```python=
def linear_compute( x , y ):
    x = np.column_stack((x , np.ones(x.shape[0])))  # append a column of ones
    y = y[:,np.newaxis]
    pin = np.linalg.pinv(x)  # pseudo-inverse
    result = pin @ y
    return result
```
- append a column of ones to the right of `x`
- compute the pseudo-inverse of `x` with `np.linalg.pinv`
- multiply the pseudo-inverse by `y` to obtain W<sub>linear</sub>

4. Compute the training error and testing error.

```python=
def error_func( pred , test ):
    num = np.prod( pred.shape )
    error = np.linalg.norm( pred - test ) ** 2 / num  # mean squared error
    return error
```
- plug W<sub>linear</sub> and `x_train` (or `x_test`) into the model to get `y_pred`, then pass it to this function together with `y_train` (or `y_test`)
- subtract the two vectors, take the 2-norm, square it, and divide by the number of samples
- the result is the training error or the testing error

5. Compute the cross-validation errors for five-fold and leave-one-out.

```python=
def kfold_split( a , begin = 15 , percent = 0.25 ):
    # hold out the fold a[begin : begin + k] and use the rest for training
    arr = np.array([])
    k = math.floor( a.size * percent )
    test = a[ begin : begin + k ]
    if ( begin ):
        arr = a[ : begin ]
    arr2 = a[ begin + k : ]
    train = np.append( arr , arr2 )
    return train , test

def kfold( x , y , k ):
    # average the validation error over k folds
    begin = 0
    terr = 0
    size = x.size
    for _ in range( k ):
        x_train , x_test = kfold_split( x , begin , 1/k )
        y_train , y_test = kfold_split( y , begin , 1/k )
        wlin = linear_compute( x_train , y_train )
        pred = wlin[0] * x_test + wlin[1]
        err = error_func( pred , y_test )
        begin += math.floor( size * 1/k )
        terr += err
    return terr / k
```

#### ( a ) Program

```python=
x , y = generate()
x_train , x_test = splitdata( x )
y_train , y_test = splitdata( y )
wlin = linear_compute( x_train , y_train )
print("-----Part A-----")
print("-----Training Error-----")
pred = wlin[0] * x_train + wlin[1]
print("Training Error : " + str(error_func( pred , y_train )))
print("-----Leave-one-out-----")
print("Cross Validation Error : " + str(kfold( x , y , 20 )))
print("-----Five-Fold-----")
print("Cross Validation Error : " + str(kfold( x , y , 5 )))
print("-----Testing Error-----")
test_pred = wlin[0] * x_test + wlin[1]
print("Testing Error : " + str(error_func( test_pred , y_test )))
plt.xlabel("X")
plt.ylabel("Y")
plt.scatter( x_train , y_train , label = "TrainData")
plt.scatter( x_test , y_test , label = "TestData")
plt.title("Part A : Fitting Plot")
y_pred = wlin[0] * x + wlin[1]
y_pred = y_pred.reshape(20,1)
plt.plot( x , y_pred , label = "Fitting Function")
plt.legend()
plt.show()
```

#### Results

![](https://i.imgur.com/AXSHWWD.png)

| Training Error | Testing Error | Cross-Validation Error (leave-one-out) | Cross-Validation Error (five-fold) |
| ---- | ---- | ---- | ---- |
| 1.3197551637541514 | 3.3874342169640492 | 1.9077633587725926 | 1.8076079235549933 |

#### Discussion

Q: Does the way the data is split into training and testing sets affect the error?
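Note that `splitdata` always holds out the first 25% of the equally spaced `x` values, so the fitted line is only ever tested at one end of the interval. One way to probe the question above is to compare this contiguous split with a random split. The `shuffled_split` helper below is a minimal sketch and is **not** part of the original code; it only assumes `numpy` and `math` are imported as in the blocks above.

```python=
def shuffled_split( x , y , percent = 0.25 , seed = 0 ):
    # Hypothetical helper: randomly permute the data before splitting,
    # instead of always holding out the first 25% (the leftmost x values).
    rng = np.random.default_rng( seed )
    idx = rng.permutation( x.size )
    sp = math.floor( percent * x.size )
    test_idx , train_idx = idx[ : sp ] , idx[ sp : ]
    return x[train_idx] , x[test_idx] , y[train_idx] , y[test_idx]

# Usage sketch: compare against the contiguous split used in part ( a )
# x_train , x_test , y_train , y_test = shuffled_split( x , y )
```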
### ( b ) Perform polynomial regression with degree 5, 10 and 14, respectively.<br/>For each case, show the fitting plots and the training error, cross-validation errors (both leave-one-out and five-fold) and testing errors.

---

#### Procedure

This part differs from ( a ) only in the fitting function and in `X`; only the parts that differ from the previous part are listed here (a sketch of the helpers that are not re-listed is given after the discussion below).

```python=
def poly_cal( x , n ):
    # Build the design matrix one power at a time, then flip it so the columns
    # run x^n, x^(n-1), ..., x, 1 (the coefficient order that np.poly1d expects).
    x3 = np.column_stack((np.ones(x.shape[0]) , x))
    for i in range( 2 , n + 1 ):
        x3 = np.column_stack((x3 , x**i))
    x3 = np.fliplr(x3)
    return x3
```

#### ( b ) Program

```python=
def poly_regression( x , y , n ):
    x_train , x_test = splitdata( x )   # preserve x_train for the plot
    y_train , y_test = splitdata( y )
    x_train_2 = poly_cal( x_train , n )
    wlin = poly_compute( x_train_2 , y_train )
    print("-----Degree %d polynomial regression-----" % n)
    print("-----Training Error-----")
    p = np.poly1d(wlin.flatten())
    test_pred = p(x_test)
    y_pred = p(x)
    pred = p(x_train)
    print("Training Error : " + str(error_func( pred , y_train )))
    print("-----Leave-one-out-----")
    print("Cross Validation Error : " + str(kfold( x , y , 20 , n )))
    print("-----Five-Fold-----")
    print("Cross Validation Error : " + str(kfold( x , y , 5 , n )))
    print("-----Testing Error-----")
    print("Testing Error : " + str(error_func( test_pred , y_test )))
    plt.xlabel("X")
    plt.ylabel("Y")
    plt.scatter( x_train , y_train , label = "TrainData")
    plt.scatter( x_test , y_test , label = "TestData")
    plt.title("Part B : Degree %d Fitting Plot" % n)
    plt.plot( x , y_pred , label = "Fitting Function")
    plt.legend()
    plt.show()
    return

print("-----Part B-----")
x , y = generate()
poly_regression( x , y , 5 )
poly_regression( x , y , 10 )
poly_regression( x , y , 14 )
```

- `X` has shape N × (d+1) (d = 5, 10, 14; N is the number of data points)
- the first row, from left to right, is x<sub>1</sub><sup>d</sup> , x<sub>1</sub><sup>d−1</sup> , ... , x<sub>1</sub><sup>0</sup>
- the second row, from left to right, is x<sub>2</sub><sup>d</sup> , x<sub>2</sub><sup>d−1</sup> , ... , x<sub>2</sub><sup>0</sup>, and so on
- the final `x3` is the desired `X`

#### Results

![](https://i.imgur.com/RWvl8ps.png)
![](https://i.imgur.com/OG3fPrG.png)
![](https://i.imgur.com/RWHFwWb.png)

| Degree | Training Error | Testing Error | Cross-Validation Error (leave-one-out) | Cross-Validation Error (five-fold) |
|----| ---- | ---- | ---- | ---- |
| 5 | 0.6944558047821588 | 2464.8382865417984 | 1.9578348576088629 | 81.18661465708635 |
| 10 | 0.11479188699682578 | 264262952.736448 | 305.6685006156316 | 1851569.011690172 |
| 14 | 6.06127400457753e-19 | 217823218344795.53 | 8596.99446149074 | 16041214609.370647 |

#### Discussion

Compared with ( a ), the dimension of the generated `X` is much larger. The table above shows that the higher the degree, the smaller the training error becomes and the more tightly the fitted curve hugs the data points, while the testing and cross-validation errors grow rapidly.
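Since the report only lists the parts that changed, the `poly_compute` and the degree-aware `kfold` called above are not shown. The following is a minimal sketch of what they presumably look like, reusing `kfold_split`, `poly_cal` and `error_func` from the earlier blocks; these bodies are assumptions, not the author's original code.

```python=
def poly_compute( x , y ):
    # Assumed: unregularized least squares via the pseudo-inverse,
    # mirroring linear_compute but taking the polynomial design matrix as input.
    y = y[:,np.newaxis]
    return np.linalg.pinv(x) @ y

def kfold( x , y , k , n ):
    # Assumed degree-aware variant of the kfold from part ( a ): build the
    # degree-n design matrix on each training fold and evaluate the fitted
    # polynomial on the held-out fold.
    begin = 0
    terr = 0
    size = x.size
    for _ in range( k ):
        x_train , x_test = kfold_split( x , begin , 1/k )
        y_train , y_test = kfold_split( y , begin , 1/k )
        wlin = poly_compute( poly_cal( x_train , n ) , y_train )
        pred = np.poly1d(wlin.flatten())(x_test)
        terr += error_func( pred , y_test )
        begin += math.floor( size * 1/k )
    return terr / k
```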
### ( c ) Generate data using **y = sin(2πx) + ε** with the noise ε ∼ N(0, 0.04) and (equal spacing) x ∈ [0, 1].<br/><br/>Show the fitting plots and the training error, cross-validation errors for both leave-one-out and five-fold, and testing errors via polynomial regression with degree **5**, **10** and **14**.

---

This part differs from ( b ) only in the data-generating model: **y = sin(2πx) + ε** with noise ε ∼ N(0, 0.04) and equally spaced x ∈ [0, 1]. The only changed function is shown below:

```python=
def generate():
    # Note: np.random.normal takes the standard deviation as its second argument;
    # if N(0, 0.04) denotes the variance, the scale should be np.sqrt(0.04) = 0.2 instead.
    epsilon = np.random.normal( 0 , 0.04 , 20 )
    x = np.linspace( 0 , 1 , 20 )
    y = np.sin( 2 * x * np.pi ) + epsilon
    return x , y
```

#### Results

![](https://i.imgur.com/NVSKZvR.png)
![](https://i.imgur.com/uwCox7n.png)
![](https://i.imgur.com/ylfKwoD.png)

| Degree | Training Error | Testing Error | Cross-Validation Error (leave-one-out) | Cross-Validation Error (five-fold) |
|----| ---- | ---- | ---- | ---- |
| 5 | 0.0002208547075204422 | 0.1727247519463676 | 0.0006647943292551545 | 0.012829142575124467 |
| 10 | 0.00015291210060351483 | 7508.125578429443 | 0.04077242002862919 | 512.9872962501589 |
| 14 | 1.587772072270109e-07 | 44518516510.71311 | 9.387818005589647 | 261087699.30254954 |

#### Discussion

As in ( b ), the training error decreases as the degree increases, while the testing and cross-validation errors grow rapidly.

### ( d ) Consider the model in ( b ) with degree **14** via varying the number of training data points m, say, m = 60, 160, 320.<br/>Show the five-fold cross-validation errors, testing error and the fitting plots with 75% for training and 25% for testing.

---

This part differs from ( b ) only in the number of data points; the results are computed separately for `m = 60, 160, 320`.

#### ( d ) Program

```python=
def poly_regression( point ):
    n = 14
    epsilon = np.random.normal( 0 , 1 , point )
    x = np.linspace( -3 , 3 , point )
    y = 2 * x + epsilon
    x_train , x_test = splitdata( x )
    y_train , y_test = splitdata( y )
    x_train_2 = poly_cal( x_train , n )
    wlin = poly_compute( x_train_2 , y_train )
    p = np.poly1d(wlin.flatten())
    test_pred = p(x_test)
    y_pred = p(x)
    '''print("-----Training Error-----")
    pred = fit_func( wlin , x_train_2 )
    print("Training Error : " + str(error_func( pred, y_train )))
    print("-----Leave-one-out-----")
    print("Cross Validation Error : " + str(kfold(x,y,20,n)))'''
    print("----- %d Training Data-----" % point)
    print("-----Five-Fold-----")
    print("Cross Validation Error : " + str(kfold( x , y , 5 , n )))
    print("-----Testing Error-----")
    print("Testing Error : " + str(error_func( test_pred , y_test )))
    plt.xlabel("X")
    plt.ylabel("Y")
    plt.scatter( x_train , y_train , label = "TrainData")
    plt.scatter( x_test , y_test , label = "TestData")
    plt.title("Part D : Degree 14 Fitting Plot with %d data points" % point)
    plt.plot( x , y_pred , label = "Fitting Function")
    plt.legend()
    plt.show()
    return

print("-----Part D-----")
poly_regression(60)
poly_regression(160)
poly_regression(320)
```

#### Results

![](https://i.imgur.com/PLiCbHT.png)
![](https://i.imgur.com/jT0VXlc.png)
![](https://i.imgur.com/LhzUXMi.png)

| Data points | Testing Error | Cross-Validation Error (five-fold) |
|----| ---- | ---- |
| 60 | 278719956.54025924 | 197398876.5216198 |
| 160 | 7351466493.240173 | 11407955.479479197 |
| 320 | 4622907809.147924 | 235222027.51230615 |

#### Discussion

The testing error stays extremely large for every m and does not shrink as more data are added (it grows from m = 60 to m = 160, though it drops again at m = 320). Because the test set is always the leftmost 25% of the x values, the degree-14 fit still has to extrapolate no matter how much training data is available.

### ( e ) Consider again the model in (b) with degree 14 via **regularization**:<br/>![](https://i.imgur.com/yVaFhyk.png)<br/>Compare the results derived by setting λ = 0, 0.001/m, 1/m, 1000/m, where m = 20 is the number of data points (with x = 0, 1/(m−1), 2/(m−1), . . . , 1). Show the five-fold cross-validation errors, testing errors and the fitting plots with regularization using the following equation:<br/>![](https://i.imgur.com/vLZv6dL.png)

---
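The two equations linked as images above are presumably the standard regularized (ridge) least-squares objective and its closed-form solution, which is what the `poly_compute` below implements:

```latex
% Presumed content of the two equation images (an assumption inferred from the code below):
E(\mathbf{w}) = \lVert \mathbf{X}\mathbf{w} - \mathbf{y} \rVert^{2} + \lambda \lVert \mathbf{w} \rVert^{2}
\qquad\Longrightarrow\qquad
\mathbf{w}_{\mathrm{reg}} = \left( \mathbf{X}^{\top}\mathbf{X} + \lambda \mathbf{I} \right)^{-1} \mathbf{X}^{\top} \mathbf{y}
```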
The main difference from the previous parts is how `W` is computed; four different values of `λ` are compared in total. The changed part is shown below:

```python=
def poly_compute( x , y , lam ):
    # Regularized closed-form solution: W = (X^T X + λI)^(-1) X^T y
    wlin = np.linalg.inv( x.T @ x + lam * np.identity(x.shape[1]) ) @ x.T @ y
    return wlin
```

- this function computes `W`

#### ( e ) Program

```python=
def poly_regression( x , y , lam ):
    n = 14
    m = 20
    x_train , x_test = splitdata( x )
    y_train , y_test = splitdata( y )
    x_train_2 = poly_cal( x_train , n )
    wlin = poly_compute( x_train_2 , y_train , lam )
    print("-----λ = %f Degree %d polynomial regression-----" % ( lam , n ))
    p = np.poly1d(wlin.flatten())
    test_pred = p(x_test)
    y_pred = p(x)
    '''print("-----Training Error-----")
    pred = fit_func( wlin , x_train_2 )
    print("Training Error : " + str(error_func( pred, y_train )))
    print("-----Leave-one-out-----")
    print("Cross Validation Error : " + str(kfold(x,y,20,n,lam)))'''
    print("-----Five-Fold-----")
    print("Cross Validation Error : " + str(kfold( x , y , 5 , n , lam )))
    print("-----Testing Error-----")
    print("Testing Error : " + str(error_func( test_pred , y_test )))
    print("")
    plt.xlabel("X")
    plt.ylabel("Y")
    plt.scatter( x_train , y_train , label = "TrainData")
    plt.scatter( x_test , y_test , label = "TestData")
    plt.title("Part E : λ = %f Degree %d Fitting Plot" % ( lam , n ))
    y_pred = poly_cal( x , n ) @ wlin
    y_pred = y_pred.reshape( m , 1 )
    plt.plot( x , y_pred , label = "Fitting Function")
    plt.legend()
    plt.show()
    return

print("-----Part E-----")
m = 20
x , y = generate( m )
poly_regression( x , y , 0 )          # λ = 0
poly_regression( x , y , 0.001 / m )  # λ = 0.001/m
poly_regression( x , y , 1 / m )      # λ = 1/m
poly_regression( x , y , 1000 / m )   # λ = 1000/m
```

#### Results

![](https://i.imgur.com/uxKPT3r.png)
![](https://i.imgur.com/mMI5SUA.png)
![](https://i.imgur.com/xaC06Na.png)
![](https://i.imgur.com/sIYqVMS.png)

`m = 20`

| λ | Testing Error | Cross-Validation Error (five-fold) |
|----| ---- | ---- |
| 0 | 37523459.38431137 | 1372213865.0117557 |
| 0.001/m | 8.185531781011145 | 20.43477279690964 |
| 1/m | 0.9067399188977785 | 1.621398010236738 |
| 1000/m | 0.8900933833961424 | 1.4656922795408458 |

#### Discussion

As `λ` increases, the fitting function follows the training data less and less closely, and both the testing error and the cross-validation error drop sharply.
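As a follow-up, the same `kfold` from this part could be used to pick `λ` automatically instead of comparing four fixed values by eye. The snippet below is a minimal sketch, not part of the original report; it assumes the variables and functions defined above (`x`, `y`, `kfold`) are in scope.

```python=
# Minimal sketch (not in the original report): choose λ by five-fold cross-validation.
m , n = 20 , 14
lambdas = [ 0 , 0.001 / m , 1 / m , 1000 / m ]
cv_errors = [ kfold( x , y , 5 , n , lam ) for lam in lambdas ]
best_lam = lambdas[ int(np.argmin(cv_errors)) ]
print("Best λ by five-fold CV :" , best_lam)
```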