# Machine Learning Homework 3: Perceptron

Department: Forestry
Student ID: M10912004
Name: De-Kai Kao (高得愷)
E-mail: block58697@gmail.com

## Python Implementation

### Import

```
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
```

### Generate random data

```
# Set up the parameters for the two normal distributions
mean1 = 20
std1 = 5
mean2 = 50
std2 = 10
n = 200

# Generate n random (x, y) samples for each class
x1 = np.random.normal(mean1, std1, n)
y1 = np.random.normal(mean1, std1, n)
x2 = np.random.normal(mean2, std2, n)
y2 = np.random.normal(mean2, std2, n)
```

### Z-score

```
# Standardize the coordinates using the mean and standard deviation
# of the pooled x1 and y2 samples
mean = np.mean(np.concatenate((x1, y2)))
sigma = np.std(np.concatenate((x1, y2)))

zx1 = (x1 - mean) / sigma
zy1 = (y1 - mean) / sigma
zx2 = (x2 - mean) / sigma
zy2 = (y2 - mean) / sigma

xy1 = np.stack((zx1, zy1), axis=1)
xy2 = np.stack((zx2, zy2), axis=1)

plt.plot(xy1[:,0], xy1[:,1], 'g.', xy2[:,0], xy2[:,1], 'y.')
plt.gca().axis('equal')
```

![](https://i.imgur.com/VclsHgE.png)

### Label

```
# Append the class label as a third column: -1 for class 1, +1 for class 2
minusone = np.insert(xy1, 2, -1, axis=1)
one = np.insert(xy2, 2, 1, axis=1)

df = pd.DataFrame(np.concatenate((minusone, one)), columns=['x', 'y', 'label'])
df['label'] = df['label'].astype(int)
print(df)
```

| index | x | y | label |
| ----- | --------- | --------- | ----- |
| 0 | -1.481550 | -0.958123 | -1 |
| 1 | -1.068351 | -0.479404 | -1 |
| 2 | -0.967985 | -1.150020 | -1 |
| 3 | -0.456959 | -1.184466 | -1 |
| 4 | -1.436903 | -1.098218 | -1 |
| .. | ... | ... | ... |
| 395 | 0.867096 | 0.963210 | 1 |
| 396 | 1.054103 | 0.382790 | 1 |
| 397 | 0.662723 | 0.719258 | 1 |
| 398 | 2.435677 | 1.892383 | 1 |
| 399 | 0.835755 | 1.633615 | 1 |

### Split train and test

```
# Random 80/20 split
train = df.sample(frac=0.8)
test = df.loc[df.index.difference(train.index)]

print(train.shape)  # (320, 3)
print(test.shape)   # (80, 3)
```

### Activation function

```
def sign(z):
    if z > 0:
        return 1
    else:
        return -1
```

### Training

```
def Train(df):
    # Perceptron learning algorithm: keep cycling through the samples
    # and updating w until no sample is misclassified
    w = np.array([0., 0., 0.])
    error = 1
    iterator = 0
    while error != 0:
        error = 0
        for i in range(len(df)):
            # x = [1, x, y] (with bias term), y = label
            x, y = np.concatenate((np.array([1.]), np.array(df.iloc[i])[:2])), \
                   np.array(df.iloc[i])[2]
            if sign(np.dot(w, x)) != y:
                iterator += 1
                error += 1
                w += y * x
    return w

Train(train)
```

* Result: array([1., 3.11812938, 1.41421968])
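The weight vector returned by `Train` can be read as the separating line w0 + w1·x + w2·y = 0. The figures in the Plot section below were generated from the trained model, but the plotting code itself is not included in this note; the following is only a minimal sketch of how such a boundary plot could be drawn, reusing `Train`, `train`, and `plt` from above. The colormap and styling choices here are illustrative assumptions, not the original settings.

```
# Minimal sketch (not the original plotting code) of the decision boundary
w = Train(train)

# Boundary: w0 + w1*x + w2*y = 0  =>  y = -(w0 + w1*x) / w2  (assumes w2 != 0)
xs = np.linspace(train['x'].min(), train['x'].max(), 100)
ys = -(w[0] + w[1] * xs) / w[2]

plt.scatter(train['x'], train['y'], c=train['label'], cmap='coolwarm', s=10)
plt.plot(xs, ys, 'k--', label='decision boundary')
plt.gca().axis('equal')
plt.legend()
plt.show()
```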
### Testing

```
def Test(df, w):
    # Fraction of samples whose predicted sign matches the label
    c = 0
    for i in range(len(df)):
        x, y = np.concatenate((np.array([1.]), np.array(df.iloc[i])[:2])), \
               np.array(df.iloc[i])[2]
        if sign(np.dot(w, x)) == y:
            c += 1
    return c / df.shape[0]

Test(test, Train(train))
```

* Accuracy: 1.0

### Plot

![](https://i.imgur.com/hQCYdyf.png =33%x)![](https://i.imgur.com/noqstfm.png =33%x)![](https://i.imgur.com/8mHfjJW.png =33%x)![](https://i.imgur.com/QYge6Ld.png =33%x)![](https://i.imgur.com/SARXDhA.png =33%x)![](https://i.imgur.com/kyqbHMF.png =33%x)![](https://i.imgur.com/0jafMzY.png =33%x)![](https://i.imgur.com/ym8Q1oh.png =33%x)![](https://i.imgur.com/uCsGUIQ.png =33%x)![](https://i.imgur.com/oQn791E.png =33%x)![](https://i.imgur.com/9v9I47k.png =33%x)

### Scikit-learn

```
from sklearn.linear_model import Perceptron
from sklearn.metrics import accuracy_score

X_train = train.iloc[:, :2]
y_train = train.iloc[:, 2]
X_test = test.iloc[:, :2]
y_test = test.iloc[:, 2]

clf = Perceptron(tol=1e-3, random_state=0)
clf.fit(X_train, y_train)
predicted = clf.predict(X_test)
accuracy_score(y_test, predicted)
```

* Accuracy: 1.0

## Discussion

Stochastic Gradient Descent (SGD) is a widely used optimization algorithm for training machine learning models. Although SGD performs well in practice, it can still run into the following problems:

1. Non-convexity: when the objective function has multiple local minima, SGD may get stuck in a local minimum and fail to converge to the global minimum. More advanced optimization methods, such as Newton's method or the conjugate gradient method, can help in this case.
2. Overfitting: when the model is too complex or the training data are too scarce, a model trained with SGD easily overfits, performing well on the training set but poorly on the test set. This can be addressed with regularization techniques.
3. Hyperparameters: the performance of SGD depends on hyperparameters such as the learning rate and batch size, which have to be tuned to obtain the best performance (a small `SGDClassifier` sketch at the end of this note illustrates the regularization and learning-rate settings).
4. Training speed: when the training set is very large, SGD can take a long time to finish training. More efficient variants, such as Stochastic Average Gradient (SAG) or the Adam optimizer, can mitigate this.

---

https://hackmd.io/rLDSa-aRQ_qCkA9BgyxN3w?view

![](https://i.imgur.com/G0jXvbJ.png =60%x)
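As a supplement to discussion points 2 and 3, the sketch below (not part of the original homework code) shows how scikit-learn's `SGDClassifier` can train the same perceptron loss while exposing regularization strength and learning rate as explicit hyperparameters. It reuses `X_train`, `y_train`, `X_test`, and `y_test` from the Scikit-learn section; the specific parameter values are illustrative, not tuned.

```
from sklearn.linear_model import SGDClassifier
from sklearn.metrics import accuracy_score

# Perceptron loss trained with SGD: penalty/alpha add the regularization
# mentioned in point 2, and learning_rate/eta0 expose the learning-rate
# hyperparameter mentioned in point 3 (values chosen for illustration only)
sgd = SGDClassifier(loss='perceptron',
                    penalty='l2', alpha=1e-4,
                    learning_rate='constant', eta0=0.1,
                    max_iter=1000, tol=1e-3,
                    random_state=0)
sgd.fit(X_train, y_train)
print(accuracy_score(y_test, sgd.predict(X_test)))
```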