# Machine Learning Homework 3 Perceptron
Department : Forestry
Student ID : M10912004
Name : De-Kai Kao (高得愷)
E-mail : block58697@gmail.com
## Python Implementation
### Import
```
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
```
### Generate random data
```
# Set up the parameters for the normal distributions
mean1 = 20
std1 = 5
mean2 = 50
std2 = 10
n = 200
# Generate n random samples of x1 and x2
x1 = np.random.normal(mean1, std1, n)
y1 = np.random.normal(mean1, std1, n)
x2 = np.random.normal(mean2, std2, n)
y2 = np.random.normal(mean2, std2, n)
```
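The samples above differ on every run. If the numbers in the tables and results below should be reproducible, one option (an illustrative addition, not part of the original script) is to seed NumPy's generator before sampling:
```
# Illustrative only: fixing the seed makes the generated data reproducible
np.random.seed(0)
```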
### Z-score
```
# Standardize with the mean and standard deviation of all pooled samples
mean = np.mean(np.concatenate((x1, y1, x2, y2)))
sigma = np.std(np.concatenate((x1, y1, x2, y2)))
zx1 = (x1-mean) / sigma
zy1 = (y1-mean) / sigma
zx2 = (x2-mean) / sigma
zy2 = (y2-mean) / sigma
xy1 = np.stack((zx1, zy1), axis=1)
xy2 = np.stack((zx2, zy2), axis=1)
plt.plot(xy1[:,0], xy1[:,1], 'g.', xy2[:,0], xy2[:,1], 'y.')
plt.gca().axis('equal')
```
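The standardization above is the usual z-score, applied with one mean and standard deviation shared by both features:

$$
z = \frac{x - \mu}{\sigma}
$$

where $\mu$ and $\sigma$ are estimated from the pooled samples, so both clusters end up on a comparable scale around zero.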

### Label
```
minusone = np.insert(xy1, 2, -1, axis=1)
one = np.insert(xy2, 2, 1, axis=1)
df = pd.DataFrame(np.concatenate((minusone, one)), columns=['x', 'y', 'label'])
df['label'] = df['label'].astype(int)
print(df)
```
| index | x | y | label |
| ----- | --------- | --------- | ----- |
| 0 | -1.481550 | -0.958123 | -1 |
| 1 | -1.068351 | -0.479404 | -1 |
| 2 | -0.967985 | -1.150020 | -1 |
| 3 | -0.456959 | -1.184466 | -1 |
| 4 | -1.436903 | -1.098218 | -1 |
| .. | ... | ... | ... |
| 395 | 0.867096 | 0.963210 | 1 |
| 396 | 1.054103 | 0.382790 | 1 |
| 397 | 0.662723 | 0.719258 | 1 |
| 398 | 2.435677 | 1.892383 | 1 |
| 399 | 0.835755 | 1.633615 | 1 |
### Split train and test
```
train = df.sample(frac=0.8)
test = df.loc[df.index.difference(train.index)]
print(train.shape) #(320, 3)
print(test.shape) #(80, 3)
```
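If the split itself should also be reproducible across runs, `df.sample` accepts a `random_state` argument; the value below is illustrative:
```
# Illustrative: a fixed random_state gives the same 80/20 split every run
train = df.sample(frac=0.8, random_state=0)
test = df.loc[df.index.difference(train.index)]
```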
### Activation function
```
def sign(z):
    # Activation: map the weighted sum to a class label (+1 / -1)
    if z > 0:
        return 1
    else:
        return -1
```
### Training
```
def Train(df):
    # Perceptron Learning Algorithm (PLA): keep passing over the data and
    # correcting misclassified points until none remain.
    w = np.array([0., 0., 0.])
    error = 1
    iterator = 0
    while error != 0:
        error = 0
        for i in range(len(df)):
            # Augment the sample with a leading 1 for the bias term
            x = np.concatenate((np.array([1.]), np.array(df.iloc[i])[:2]))
            y = np.array(df.iloc[i])[2]
            if sign(np.dot(w, x)) != y:
                # Misclassified: nudge the weights toward this sample
                iterator += 1
                error += 1
                w += y * x
    return w

Train(train)
```
* Result: array([1. , 3.11812938, 1.41421968])
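The loop in `Train` implements the standard perceptron update: whenever a sample is misclassified, the weights are moved toward it,

$$
\mathbf{w} \leftarrow \mathbf{w} + y\,\mathbf{x}
$$

where $\mathbf{x}$ is the sample augmented with a leading 1 for the bias. Note that the `while error != 0` loop only terminates when the training data are linearly separable; for these two well-separated clusters that is usually the case.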
### Testing
```
def Test(df, w):
    # Return the fraction of samples whose predicted label matches the truth
    c = 0
    for i in range(len(df)):
        x = np.concatenate((np.array([1.]), np.array(df.iloc[i])[:2]))
        y = np.array(df.iloc[i])[2]
        if sign(np.dot(w, x)) == y:
            c += 1
    return c / df.shape[0]

Test(test, Train(train))
```
* Accuracy: 1.0
### Plot
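The figure for this section is not reproduced here. A minimal sketch of how the data and the learned decision boundary could be drawn (assuming the cells above have been run, and storing the trained weights in `w`) is:
```
# Sketch only: scatter the two standardized clusters and overlay the
# decision boundary w0 + w1*x + w2*y = 0 learned by Train()
w = Train(train)
xs = np.linspace(-2.5, 2.5, 100)
plt.plot(xy1[:, 0], xy1[:, 1], 'g.', xy2[:, 0], xy2[:, 1], 'y.')
plt.plot(xs, -(w[0] + w[1] * xs) / w[2], 'b-')
plt.gca().axis('equal')
plt.show()
```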

### Scikit-learn
```
from sklearn import metrics
from sklearn.linear_model import Perceptron
from sklearn.metrics import accuracy_score
X_train = train.iloc[:,:2]
y_train = train.iloc[:,2]
X_test = test.iloc[:,:2]
y_test = test.iloc[:,2]
clf = Perceptron(tol=1e-3, random_state=0)
clf.fit(X_train, y_train)
predicted = clf.predict(X_test)
accuracy_score(predicted, y_test)
```
* Accuracy: 1.0
## Discussion
Stochastic Gradient Descent (SGD) is a widely used optimization algorithm for training machine learning models. Although it performs well in practice, it can still run into the following issues:
1. Non-convexity: when the objective function has several local minima, SGD may get stuck in a local minimum and fail to reach the global minimum. More advanced optimization methods, such as Newton's method or the conjugate gradient method, can mitigate this.
2. Overfitting: when the model is too complex or the training data too scarce, SGD tends to overfit, i.e. it performs well on the training set but poorly on the test set. Regularization techniques help address this (see the sketch after this list).
3. Hyperparameters: SGD's performance depends on hyperparameters such as the learning rate and batch size, which must be tuned to obtain good results (also illustrated below).
4. Training speed: when the training set is very large, SGD can take a long time to converge. More efficient variants such as Stochastic Average Gradient (SAG) or the Adam optimizer can help.
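As a concrete illustration of points 2 and 3, scikit-learn's `SGDClassifier` exposes the regularization strength and the learning rate as explicit hyperparameters. This is only a sketch reusing the `X_train`/`y_train` split from above; the parameter values are illustrative, not tuned:
```
from sklearn.linear_model import SGDClassifier

# Illustrative settings: perceptron loss with L2 regularization (point 2)
# and an explicit constant learning rate eta0 (point 3)
sgd = SGDClassifier(loss='perceptron', penalty='l2', alpha=1e-4,
                    learning_rate='constant', eta0=0.1,
                    max_iter=1000, tol=1e-3, random_state=0)
sgd.fit(X_train, y_train)
print(accuracy_score(y_test, sgd.predict(X_test)))
```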
---
https://hackmd.io/rLDSa-aRQ_qCkA9BgyxN3w?view
