---
title: Learning the 8-bit parity check problem with an MLP
---
# Learning the 8-bit parity check problem with an MLP
###### Table of Contents
[TOC]
## Problem Description
- Learn the 8-bit parity check problem with an MLP, using an even parity bit. For example, the input `00000011` contains two 1s, so its even-parity bit is 0.
## 1. Approach
- Build the **activation functions** the MLP needs
$$o_j = \varphi(x \cdot W + b )$$
The required activation functions:
- ReLU
$$ f(x)=
\begin{cases}
0 & \text{for } x < 0 \\
x & \text{for } x \geq 0
\end{cases}
$$
- Sigmoid
$$\sigma(x) = \frac{1}{1+e^{-x}}$$
- tanh
$$ \tanh(x) = \frac{e^{x}-e^{-x}}{e^{x}+e^{-x}}$$
- Implementation (a numerical sanity check follows the block)
```python=
import numpy as np

class ReLU:
    def __init__(self):
        self.mask = None

    def forward(self, x):
        # Remember which entries were clipped so backward can zero their gradients.
        self.mask = (x <= 0)
        out = x.copy()              # copy so the caller's array is not modified in place
        out[self.mask] = 0
        return out

    def backward(self, dout):
        dx = dout.copy()
        dx[self.mask] = 0           # no gradient flows through clipped entries
        return dx

class Sigmoid:
    def __init__(self):
        self.o = None

    def forward(self, x):
        out = 1.0 / (1.0 + np.exp(-x))
        self.o = out                # cache the output for the backward pass
        return out

    def backward(self, dout):
        # d sigmoid/dx = sigmoid(x) * (1 - sigmoid(x))
        return dout * self.o * (1.0 - self.o)

class tanh:
    def __init__(self):
        self.o = None

    def forward(self, x):
        out = np.tanh(x)
        self.o = out                # cache the output for the backward pass
        return out

    def backward(self, dout):
        # d tanh/dx = 1 - tanh(x)^2
        return dout * (1.0 - self.o ** 2)
```
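As a quick sanity check (not part of the original write-up), the analytic backward passes can be compared against a central-difference estimate of the forward pass; the input shape and tolerance below are arbitrary choices, and ReLU is skipped because of its kink at 0:
```python=
import numpy as np

def numerical_grad(layer, x, eps=1e-6):
    # Central-difference gradient of sum(layer.forward(x)) with respect to x.
    grad = np.zeros_like(x)
    for idx in np.ndindex(*x.shape):
        xp, xm = x.copy(), x.copy()
        xp[idx] += eps
        xm[idx] -= eps
        grad[idx] = (layer.forward(xp).sum() - layer.forward(xm).sum()) / (2 * eps)
    return grad

x = np.random.randn(4, 8)
for layer in (Sigmoid(), tanh()):
    layer.forward(x)                              # populates the cached output
    analytic = layer.backward(np.ones_like(x))    # gradient of sum(output) w.r.t. x
    assert np.allclose(analytic, numerical_grad(layer, x), atol=1e-4)
```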
- Build the MLP API (a usage sketch follows the block)
```python=
class MLP:
    def __init__(self, inDegree:int, layerType:list, outDegree):
        # omitted
    def forward(self, x):
        # omitted
    def backward(self, y):
        # omitted
    def update(self, eta, alpha):
        # omitted
    def predict(self, x):
        # omitted
```
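Since the method bodies are omitted above, the following is only a usage sketch: the `layerType` format, learning rate `eta`, momentum `alpha`, and the squared-error loss are placeholder assumptions, and `Data()` is the generator defined in the next block.
```python=
x, y = Data()                                    # 256 patterns and even-parity labels (defined below)
net = MLP(inDegree=8, layerType=['relu', 'relu'], outDegree=1)   # assumed layerType format

for epoch in range(1, 20001):
    out = net.forward(x)                         # forward pass over the whole batch
    net.backward(y)                              # back-propagate the error signal
    net.update(eta=0.05, alpha=0.9)              # placeholder learning rate and momentum
    if epoch % 1000 == 0:
        loss = 0.5 * np.sum((out - y) ** 2)      # assumed loss; the write-up only logs its value
        print('On epoch %d: loss = %f' % (epoch, loss))
```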
- Generate the data set and labels (a sketch of the `ParityBit` helper follows the block)
```python=
def Data():
    # Enumerate all 256 possible 8-bit patterns and attach the even-parity label.
    data = []
    label = []
    for i in range(256):
        binaryString = '{0:08b}'.format(i)
        data.append(list(map(int, binaryString)))
        label.append([ParityBit(binaryString)])
    return (np.array(data), np.array(label))
```
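`ParityBit` is not shown in the write-up; a minimal sketch consistent with the even-parity convention stated in the problem description (the bit is 1 exactly when the 8-bit input contains an odd number of 1s, so that the total count of 1s becomes even) could be:
```python=
def ParityBit(binaryString: str) -> int:
    # Even parity: return 1 when the input has an odd number of 1s,
    # so that the input bits plus the parity bit contain an even number of 1s.
    return binaryString.count('1') % 2
```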
## 2. Results
- Four different models were used (a hypothetical construction sketch follows the list):
    - Two-layer ReLU architecture
    - Three-layer ReLU architecture
    - Four-layer ReLU architecture
    - Three-layer ReLU->Tanh architecture
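How these four models map onto the MLP API above is not shown; a hypothetical construction, assuming `layerType` lists the hidden-layer activations and the Sigmoid output layer is appended automatically, might be:
```python=
# Hypothetical instantiation of the four models (layerType format is an assumption).
two_relu   = MLP(inDegree=8, layerType=['relu'], outDegree=1)                   # ReLU -> Sigmoid
three_relu = MLP(inDegree=8, layerType=['relu', 'relu'], outDegree=1)           # ReLU -> ReLU -> Sigmoid
four_relu  = MLP(inDegree=8, layerType=['relu', 'relu', 'relu'], outDegree=1)   # ReLU x3 -> Sigmoid
relu_tanh  = MLP(inDegree=8, layerType=['relu', 'tanh'], outDegree=1)           # ReLU -> Tanh -> Sigmoid
```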
---
### Two-layer ReLU architecture
```flow
st=>inputoutput: ReLU
e=>inputoutput: Sigmoid
st(right)->e
```
###### Training results
```
epoch 1000: loss = 59.824089
epoch 2000: loss = 42.496969
epoch 3000: loss = 25.405999
epoch 4000: loss = 15.206905
epoch 5000: loss = 8.968195
epoch 6000: loss = 5.575517
epoch 7000: loss = 3.731786
epoch 8000: loss = 2.669512
epoch 9000: loss = 2.007941
epoch 10000: loss = 1.561317
epoch 11000: loss = 1.256148
epoch 12000: loss = 1.037282
epoch 13000: loss = 0.877471
epoch 14000: loss = 0.753664
epoch 15000: loss = 0.657069
epoch 16000: loss = 0.578303
epoch 17000: loss = 0.513994
epoch 18000: loss = 0.461073
epoch 19000: loss = 0.417002
epoch 20000: loss = 0.379725
```

---
### Three-layer ReLU architecture
```flow
st=>inputoutput: ReLU
op1=>inputoutput: ReLU
e=>inputoutput: Sigmoid
st(right)->op1(right)->e
```
###### Training results
```
On epoch 1000: loss = 62.390846
On epoch 2000: loss = 31.842048
On epoch 3000: loss = 2.435030
On epoch 4000: loss = 0.682203
On epoch 5000: loss = 0.320841
On epoch 6000: loss = 0.193129
On epoch 7000: loss = 0.133233
On epoch 8000: loss = 0.099513
On epoch 9000: loss = 0.078341
On epoch 10000: loss = 0.064007
On epoch 11000: loss = 0.053752
On epoch 12000: loss = 0.046112
On epoch 13000: loss = 0.040244
On epoch 14000: loss = 0.035582
On epoch 15000: loss = 0.031810
On epoch 16000: loss = 0.028717
On epoch 17000: loss = 0.026119
On epoch 18000: loss = 0.023918
On epoch 19000: loss = 0.022037
On epoch 20000: loss = 0.020409
```

---
### Four-layer ReLU architecture
```flow
st=>inputoutput: ReLU
op1=>inputoutput: ReLU
op2=>inputoutput: ReLU
e=>inputoutput: Sigmoid
st(right)->op1(right)->op2(right)->e
```
###### Training results
```
On epoch 1000: loss = 59.028748
On epoch 2000: loss = 1.557150
On epoch 3000: loss = 0.136009
On epoch 4000: loss = 0.058077
On epoch 5000: loss = 0.034693
On epoch 6000: loss = 0.024140
On epoch 7000: loss = 0.018258
On epoch 8000: loss = 0.014552
On epoch 9000: loss = 0.012019
On epoch 10000: loss = 0.010190
On epoch 11000: loss = 0.008816
On epoch 12000: loss = 0.007747
On epoch 13000: loss = 0.006897
On epoch 14000: loss = 0.006202
On epoch 15000: loss = 0.005627
On epoch 16000: loss = 0.005142
On epoch 17000: loss = 0.004730
On epoch 18000: loss = 0.004375
On epoch 19000: loss = 0.004066
On epoch 20000: loss = 0.003795
```

---
### Three-layer ReLU->Tanh architecture
```flow
st=>inputoutput: ReLU
op1=>inputoutput: Tanh
e=>inputoutput: Sigmoid
st(right)->op1(right)->e
```
###### Training results
```
On epoch 1000: loss = 61.276613
On epoch 2000: loss = 35.918907
On epoch 3000: loss = 4.255450
On epoch 4000: loss = 0.682826
On epoch 5000: loss = 0.298602
On epoch 6000: loss = 0.178338
On epoch 7000: loss = 0.123018
On epoch 8000: loss = 0.092134
On epoch 9000: loss = 0.072746
On epoch 10000: loss = 0.059573
On epoch 11000: loss = 0.050165
On epoch 12000: loss = 0.043141
On epoch 13000: loss = 0.037743
On epoch 14000: loss = 0.033469
On epoch 15000: loss = 0.030002
On epoch 16000: loss = 0.027142
On epoch 17000: loss = 0.024750
On epoch 18000: loss = 0.022716
On epoch 19000: loss = 0.020969
On epoch 20000: loss = 0.019454
```

---
## 3. Summary
- The deeper the architecture, the faster the loss drops: the two-, three-, and four-layer models converge at roughly epochs 7000, 4000, and 2000, respectively.
- The training logs also show that combining ReLU with Tanh speeds up convergence (converging at roughly epoch 3000).