# Learning the 8-bit parity check problem with an MLP
Writer: 王信惟 (Computer Science, Year 3), 410621241
---
## Table of Contents
[TOC]
---
## 1. Problem description
In this assignment, we design a multilayer perceptron (MLP) to learn the **8-bit parity check (8BPC)** problem. Given an 8-bit binary input, the network must produce a 1-bit parity code: when the number of 1s among the 8 input bits is odd, the parity bit is 1; conversely, when the number of 1s is even, the parity bit is 0.
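As a concrete example, the input `01101001` contains four 1s, so its parity bit is 0, while `01101000` contains three 1s, so its parity bit is 1. A minimal reference implementation of the rule itself (a sketch for checking, separate from the MLP):
```python=
# Reference check for the parity rule (a sketch, separate from the MLP):
# the parity bit is 1 iff the number of '1' characters is odd.
def parity(bits):
    return bits.count('1') % 2

assert parity('01101001') == 0   # four 1s  -> even -> parity bit 0
assert parity('01101000') == 1   # three 1s -> odd  -> parity bit 1
```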
---
## 2. Steps to complete the assignment
### 1. Generate Data
- Define the function `data_gen()` to generate the training data: for each integer from 0 to n-1, the 8-bit binary pattern is the input and its parity bit is the label.
```python=
import numpy as np

def data_gen(n):
    arrayX = []
    arrayY = []
    for i in range(n):
        # 8-bit binary representation of i, e.g. 5 -> [0,0,0,0,0,1,0,1]
        num = list("{0:08b}".format(i))
        num = list(map(int, num))
        arrayX.append(num)
        # parity bit: 1 if the number of 1s is odd, 0 otherwise
        arrayY.append([1 if num.count(1) % 2 == 1 else 0])
    return np.array(arrayX), np.array(arrayY)
```
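A quick sanity check of the generator (an assumed snippet, not part of the assignment code) inspects the shapes and one sample:
```python=
# Assumed sanity check: inspect shapes and one sample.
X, y = data_gen(256)
print(X.shape, y.shape)   # (256, 8) (256, 1)
print(X[3], y[3])         # [0 0 0 0 0 0 1 1] [0] -- two 1s, even parity
```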
### 2. The layers of the MLP
- Linear Layer
```python=
class Linear:
    def __init__(self, m, n):
        # small random initialization for an m -> n fully connected layer
        self.W, self.b = np.random.randn(m, n) / 8, np.random.rand(1, n) / 8
        self.dW, self.db = None, None
    def forward(self, x):
        self.x = x              # cache the input for the backward pass
        return np.dot(x, self.W) + self.b
    def linear_backward_propagation(self, dA):
        dx = np.dot(dA, self.W.T)         # gradient w.r.t. the input
        self.dW = np.dot(self.x.T, dA)    # gradient w.r.t. the weights
        self.db = np.sum(dA, axis=0)      # gradient w.r.t. the bias
        return dx
```
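The analytic gradient `dW` can be verified against a finite-difference estimate; the following check is a sketch with arbitrary batch size and tolerance, not part of the original code:
```python=
# Hypothetical finite-difference check of Linear.dW (batch size and
# tolerance are arbitrary choices).
layer = Linear(8, 4)
x = np.random.randn(5, 8)
eps = 1e-6

out = layer.forward(x)
layer.linear_backward_propagation(np.ones_like(out))   # upstream gradient = 1

# numeric estimate of d(sum(output)) / dW[0, 0]
layer.W[0, 0] += eps
plus = np.sum(layer.forward(x))
layer.W[0, 0] -= 2 * eps
minus = np.sum(layer.forward(x))
layer.W[0, 0] += eps

numeric = (plus - minus) / (2 * eps)
assert abs(numeric - layer.dW[0, 0]) < 1e-4
```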
- Sigmoid Layer

```python=
class Sigmoid:
    def __init__(self):
        pass
    def forward(self, x):
        out = 1 / (1 + np.exp(-x))
        self.o = out            # cache the output for the backward pass
        return out
    def sigmoid_backward_propagation(self, dA):
        # d(sigmoid)/dx = sigmoid * (1 - sigmoid)
        dz = dA * self.o * (1 - self.o)
        return dz
```
- ReLU Layer

```python=
class ReLU:
    def __init__(self):
        pass
    def forward(self, x):
        self.mask = (x <= 0)    # remember which entries were clipped
        out = x.copy()          # copy so the caller's array is not modified
        out[self.mask] = 0
        return out
    def relu_backward_propagation(self, dA):
        dz = dA.copy()
        dz[self.mask] = 0       # no gradient flows through clipped entries
        return dz
```
- Tanh Layer

```python=
class Tanh:
    def __init__(self):
        pass
    def forward(self, x):
        # equivalent to (e^x - e^-x) / (e^x + e^-x), but numerically stable
        out = np.tanh(x)
        self.o = out            # cache the output for the backward pass
        return out
    def tanh_backward_propagation(self, dA):
        # d(tanh)/dx = 1 - tanh^2
        dz = dA * (1 - self.o ** 2)
        return dz
```
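The same finite-difference idea verifies the activation derivatives; a small assumed check for `Sigmoid` (the identical pattern works for `ReLU` and `Tanh`):
```python=
# Assumed numeric check of an activation derivative (Sigmoid shown; the
# same pattern applies to ReLU and Tanh).
act = Sigmoid()
x0 = np.array([[0.5, -1.2, 2.0]])
eps = 1e-6

act.forward(x0)
analytic = act.sigmoid_backward_propagation(np.ones_like(x0))
numeric = (act.forward(x0 + eps) - act.forward(x0 - eps)) / (2 * eps)
assert np.allclose(analytic, numeric, atol=1e-4)
```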
- Loss Function
```python=
class Loss_Function:
    def __init__(self):
        pass
    def loss(self, y, yb):
        # sum-of-squares error between targets y and predictions yb
        self.y, self.yb = y, yb   # cache both for the backward pass
        return np.sum((y - yb) ** 2)
    def loss_backward_propagation(self, dA):
        # dL/dyb = -2 * (y - yb); the upstream seed dA is 1 in this report
        return -(2 * (self.y - self.yb))
```
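Likewise, the loss gradient can be checked numerically; a hypothetical test with made-up targets and predictions:
```python=
# Hypothetical check of the loss gradient with made-up values.
lf = Loss_Function()
y_true = np.array([[1.0], [0.0]])
y_pred = np.array([[0.7], [0.2]])
eps = 1e-6

lf.loss(y_true, y_pred)
grad = lf.loss_backward_propagation(1)    # analytic: -2 * (y - yb)

bumped = y_pred.copy()
bumped[0, 0] += eps
numeric = (lf.loss(y_true, bumped) - lf.loss(y_true, y_pred)) / eps
assert abs(numeric - grad[0, 0]) < 1e-4
```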
### 3. Implementing the MLP
```python=
class MLP:
    def __init__(self, m, layer):
        # layer sizes: input width m, then the hidden/output widths in `layer`
        self.layer = [m]
        self.layer += layer
        # report the layer sizes (input, hidden1, hidden2, output)
        print(self.layer[0], self.layer[1], self.layer[2], self.layer[3])
        self.linear = Linear(self.layer[0], self.layer[1])
        self.act = ReLU()
        self.linear1 = Linear(self.layer[1], self.layer[2])
        self.act1 = Tanh()
        self.linear2 = Linear(self.layer[2], self.layer[3])
        self.act2 = Sigmoid()
        self.loss = Loss_Function()
        # previous update steps, kept for the momentum term
        self.last_dW, self.last_db = 0, 0
        self.last_dW1, self.last_db1 = 0, 0
        self.last_dW2, self.last_db2 = 0, 0
    def forward(self, x):
        x = self.linear.forward(x)
        x = self.act.forward(x)
        x = self.linear1.forward(x)
        x = self.act1.forward(x)
        x = self.linear2.forward(x)
        self.yb = self.act2.forward(x)
    def backward(self, y):
        self.L = self.loss.loss(y, self.yb)
        g = self.loss.loss_backward_propagation(1)
        g = self.act2.sigmoid_backward_propagation(g)
        g = self.linear2.linear_backward_propagation(g)
        g = self.act1.tanh_backward_propagation(g)
        g = self.linear1.linear_backward_propagation(g)
        g = self.act.relu_backward_propagation(g)
        g = self.linear.linear_backward_propagation(g)
    def update(self, eta, alpha):
        # gradient descent with momentum: each step is -eta * gradient
        # plus alpha times the previous step
        step_W = -eta * self.linear.dW + alpha * self.last_dW
        step_W1 = -eta * self.linear1.dW + alpha * self.last_dW1
        step_W2 = -eta * self.linear2.dW + alpha * self.last_dW2
        step_b = -eta * self.linear.db + alpha * self.last_db
        step_b1 = -eta * self.linear1.db + alpha * self.last_db1
        step_b2 = -eta * self.linear2.db + alpha * self.last_db2
        self.linear.W += step_W
        self.linear1.W += step_W1
        self.linear2.W += step_W2
        self.linear.b += step_b
        self.linear1.b += step_b1
        self.linear2.b += step_b2
        # remember this step for the next momentum term
        self.last_dW, self.last_db = step_W, step_b
        self.last_dW1, self.last_db1 = step_W1, step_b1
        self.last_dW2, self.last_db2 = step_W2, step_b2
```
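Before the full training run, a short smoke test (assumed, not from the assignment) confirms the wiring with one forward/backward/update cycle on a tiny batch:
```python=
# Assumed smoke test: one forward/backward/update cycle on a tiny batch.
X, y = data_gen(8)
model = MLP(8, [30, 4, 1])
model.forward(X)
print(model.yb.shape)        # (8, 1): one parity prediction per sample
model.backward(y)
model.update(0.001, 0.0001)
print(model.L)               # sum-of-squares loss on the batch
```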
---
## 3. Start training
- Train for 25000 epochs, logging the loss every 1000 epochs, with eta = 0.001 and alpha = 0.0001.
```python=
model = MLP(8, [30, 4, 1])
max_epochs, check_epochs = 25000, 1000
eta = 0.001
alpha = 0.0001
X, y = data_gen(256)          # all 256 8-bit patterns with their parity bits
epochscount = []
losscount = []
for e in range(max_epochs):
    model.forward(X)
    model.backward(y)
    model.update(eta, alpha)
    if (e + 1) % check_epochs == 0:
        print('Epoch %5d: loss %.6f' % (e + 1, model.L))
        losscount.append(model.L)
        epochscount.append(e + 1)
print(model.yb.T)             # final predictions for all 256 inputs
```
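To read the final outputs as bits, they can be thresholded at 0.5 and compared against the labels; this evaluation step is an assumed addition, not in the original script:
```python=
# Assumed evaluation step: threshold the sigmoid outputs at 0.5.
predictions = (model.yb >= 0.5).astype(int)
accuracy = np.mean(predictions == y)
print('accuracy: %.4f' % accuracy)
```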
---
## 4. Results of training
- As the number of epochs increases, the loss steadily decreases. Learning is slow in the early epochs; around epoch 3000 the loss begins to drop rapidly, and by roughly epoch 7000 it slowly approaches 0.

- The training error curve (loss versus epoch) is shown below:
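The curve can be reproduced from the recorded `epochscount` and `losscount` lists; a minimal plotting sketch, assuming matplotlib is available:
```python=
# Assumed plotting snippet; requires matplotlib.
import matplotlib.pyplot as plt

plt.plot(epochscount, losscount)
plt.xlabel('Epoch')
plt.ylabel('Training loss (sum of squared errors)')
plt.title('Training error vs. epoch')
plt.show()
```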

---