---
title: Learning 8-bit parity checking problem with MLP
---

# Learning 8-bit parity checking problem with MLP

###### Table of Contents
[TOC]

## Problem Description
- Learn the 8-bit parity check problem with an MLP, using an even parity bit as the target.

## 1. Approach
- Build the **activation functions** the MLP needs. Each layer computes
  $$o_j = \varphi(x \cdot W + b)$$
  The required activation functions:
    - ReLU
      $$f(x)=\begin{cases} 0 & \text{for } x < 0 \\ x & \text{for } x \geq 0 \end{cases}$$
    - Sigmoid
      $$\sigma(x) = \frac{1}{1+e^{-x}}$$
    - tanh
      $$\tanh(x) = \frac{e^{x}-e^{-x}}{e^{x}+e^{-x}}$$
- Implementation
```python=
import numpy as np

class ReLU:
    def __init__(self):
        pass

    def forward(self, x):
        # Remember which entries are clipped so backward() can zero them out.
        self.mask = (x <= 0)
        out = x.copy()
        out[self.mask] = 0
        return out

    def backward(self, dout):
        dx = dout.copy()
        dx[self.mask] = 0
        return dx

class Sigmoid:
    def __init__(self):
        pass

    def forward(self, x):
        out = 1.0 / (1 + np.exp(-x))
        self.o = out
        return out

    def backward(self, dout):
        # d(sigmoid)/dx = sigmoid(x) * (1 - sigmoid(x))
        dx = dout * self.o * (1 - self.o)
        return dx

class tanh:
    def __init__(self):
        pass

    def forward(self, x):
        out = np.tanh(x)
        self.o = out
        return out

    def backward(self, dout):
        # d(tanh)/dx = 1 - tanh(x)^2
        dx = dout * (1.0 - self.o ** 2)
        return dx
```
- Build the MLP API
```python=
class MLP:
    def __init__(self, inDegree: int, layerType: list, outDegree):
        ...  # omitted

    def forward(self, x):
        ...  # omitted

    def backward(self, y):
        ...  # omitted

    def update(self, eta, alpha):
        ...  # omitted

    def predict(self, x):
        ...  # omitted
```
- Generate the data and labels (a sketch of the `ParityBit` helper follows below)
```python=
def Data():
    data = []
    label = []
    for i in range(256):
        # Enumerate every 8-bit pattern from 00000000 to 11111111.
        binaryString = '{0:08b}'.format(i)
        data.append(list(map(int, binaryString)))
        label.append([ParityBit(binaryString)])
    return (np.array(data), np.array(label))
```
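The `ParityBit` helper called by `Data()` is not shown in this note. A minimal sketch, assuming the even parity bit stated in the problem description (the check bit makes the total number of 1s even), could look like this:
```python=
def ParityBit(binaryString: str) -> int:
    # Even parity: the check bit is 1 exactly when the 8-bit pattern
    # contains an odd number of 1s, so that the data bits plus the
    # check bit always hold an even number of 1s.
    return binaryString.count('1') % 2
```
With this definition, for example, `'00000111'` (three 1-bits) is labelled 1, while `'00000011'` is labelled 0.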
## 2. Results
- Four different models are compared:
    - Two-layer ReLU architecture
    - Three-layer ReLU architecture
    - Four-layer ReLU architecture
    - Three-layer ReLU->Tanh architecture

---

### Two-layer ReLU architecture
```flow
st=>inputoutput: ReLU
e=>inputoutput: Sigmoid

st(right)->e
```
###### Training results
epoch 1000: loss = 59.824089
epoch 2000: loss = 42.496969
epoch 3000: loss = 25.405999
epoch 4000: loss = 15.206905
epoch 5000: loss = 8.968195
epoch 6000: loss = 5.575517
epoch 7000: loss = 3.731786
epoch 8000: loss = 2.669512
epoch 9000: loss = 2.007941
epoch 10000: loss = 1.561317
epoch 11000: loss = 1.256148
epoch 12000: loss = 1.037282
epoch 13000: loss = 0.877471
epoch 14000: loss = 0.753664
epoch 15000: loss = 0.657069
epoch 16000: loss = 0.578303
epoch 17000: loss = 0.513994
epoch 18000: loss = 0.461073
epoch 19000: loss = 0.417002
epoch 20000: loss = 0.379725

![](https://imgur.com/7CeGhuy.png)

---

### Three-layer ReLU architecture
```flow
st=>inputoutput: ReLU
op1=>inputoutput: ReLU
e=>inputoutput: Sigmoid

st(right)->op1(right)->e
```
###### Training results
On epoch 1000: loss = 62.390846
On epoch 2000: loss = 31.842048
On epoch 3000: loss = 2.435030
On epoch 4000: loss = 0.682203
On epoch 5000: loss = 0.320841
On epoch 6000: loss = 0.193129
On epoch 7000: loss = 0.133233
On epoch 8000: loss = 0.099513
On epoch 9000: loss = 0.078341
On epoch 10000: loss = 0.064007
On epoch 11000: loss = 0.053752
On epoch 12000: loss = 0.046112
On epoch 13000: loss = 0.040244
On epoch 14000: loss = 0.035582
On epoch 15000: loss = 0.031810
On epoch 16000: loss = 0.028717
On epoch 17000: loss = 0.026119
On epoch 18000: loss = 0.023918
On epoch 19000: loss = 0.022037
On epoch 20000: loss = 0.020409

![](https://i.imgur.com/gqbf8U4.png)

---

### Four-layer ReLU architecture
```flow
st=>inputoutput: ReLU
op1=>inputoutput: ReLU
op2=>inputoutput: ReLU
e=>inputoutput: Sigmoid

st(right)->op1(right)->op2(right)->e
```
###### Training results
On epoch 1000: loss = 59.028748
On epoch 2000: loss = 1.557150
On epoch 3000: loss = 0.136009
On epoch 4000: loss = 0.058077
On epoch 5000: loss = 0.034693
On epoch 6000: loss = 0.024140
On epoch 7000: loss = 0.018258
On epoch 8000: loss = 0.014552
On epoch 9000: loss = 0.012019
On epoch 10000: loss = 0.010190
On epoch 11000: loss = 0.008816
On epoch 12000: loss = 0.007747
On epoch 13000: loss = 0.006897
On epoch 14000: loss = 0.006202
On epoch 15000: loss = 0.005627
On epoch 16000: loss = 0.005142
On epoch 17000: loss = 0.004730
On epoch 18000: loss = 0.004375
On epoch 19000: loss = 0.004066
On epoch 20000: loss = 0.003795

![](https://i.imgur.com/j4e87BJ.png)

---

### Three-layer ReLU->Tanh architecture
```flow
st=>inputoutput: ReLU
op1=>inputoutput: Tanh
e=>inputoutput: Sigmoid

st(right)->op1(right)->e
```
###### Training results
On epoch 1000: loss = 61.276613
On epoch 2000: loss = 35.918907
On epoch 3000: loss = 4.255450
On epoch 4000: loss = 0.682826
On epoch 5000: loss = 0.298602
On epoch 6000: loss = 0.178338
On epoch 7000: loss = 0.123018
On epoch 8000: loss = 0.092134
On epoch 9000: loss = 0.072746
On epoch 10000: loss = 0.059573
On epoch 11000: loss = 0.050165
On epoch 12000: loss = 0.043141
On epoch 13000: loss = 0.037743
On epoch 14000: loss = 0.033469
On epoch 15000: loss = 0.030002
On epoch 16000: loss = 0.027142
On epoch 17000: loss = 0.024750
On epoch 18000: loss = 0.022716
On epoch 19000: loss = 0.020969
On epoch 20000: loss = 0.019454

![](https://i.imgur.com/wPfWBjP.jpg)

---

## 3. Summary
- The deeper the architecture, the faster the loss drops: the two-, three-, and four-layer models converge at roughly epoch 7000, 4000, and 2000, respectively.
- The plots also show that mixing ReLU with Tanh speeds up convergence compared with the all-ReLU three-layer model (converging at roughly epoch 3000).
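For completeness, below is a minimal sketch of how the pieces above can be wired into one of the experiments. The internals of `MLP` are omitted in this note, so the `layerType` encoding, the learning rate `eta`, the momentum term `alpha`, and the sum-of-squares loss used here are illustrative assumptions rather than the exact settings behind the reported numbers.
```python=
import numpy as np

x, y = Data()                                  # all 256 8-bit patterns and their parity labels

# Hypothetical layerType encoding: one activation name per layer
# (here, the three-layer ReLU architecture).
model = MLP(8, ['ReLU', 'ReLU', 'Sigmoid'], 1)

for epoch in range(1, 20001):
    out = model.forward(x)                     # forward pass over the whole dataset
    model.backward(y)                          # backpropagate the error
    model.update(eta=0.05, alpha=0.9)          # assumed learning rate and momentum
    if epoch % 1000 == 0:
        loss = 0.5 * np.sum((out - y) ** 2)    # assumed sum-of-squares loss
        print('On epoch %d: loss = %f' % (epoch, loss))

# Threshold the sigmoid output at 0.5 and check accuracy over all 256 patterns.
pred = (model.predict(x) > 0.5).astype(int)
print('accuracy =', np.mean(pred == y))
```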