---
tags: machine-learning
---

# LeNet-5: Summary and Implementation

<div style="text-align: center">
    <img src="https://raw.githubusercontent.com/valoxe/image-storage-1/master/research-paper-summary/alexnet/0.png?token=AMAXSKLOJ332DVKUCQFEDVS6WMFAW">
</div>

>This post is divided into 2 sections: Summary and Implementation.
>
>We are going to have an in-depth review of the [Gradient-Based Learning Applied to Document Recognition](http://vision.stanford.edu/cs598_spring07/papers/Lecun98.pdf) paper which introduces the LeNet-5 architecture.
>
> The implementation uses Keras as framework. For implementations in other frameworks, please refer to this [repository](https://github.com/3outeille/Research-Paper-Summary).
>
> Also, if you want to read other "Summary and Implementation" posts, feel free to check them out on my [blog](https://ferdinandmom.engineer/deep-learning/).

![Figure 1: Building blocks](https://raw.githubusercontent.com/valoxe/image-storage-1/master/research-paper-summary/lenet-5/1.png?token=AMAXSKIUCCTXYPTLSFUVM726WMEOY)

---

# I/ Summary

- ==LeNet is one of the very first convolutional neural networks (CNNs).==
- The paper describes several versions of LeNet (LeNet-1, LeNet-4, LeNet-5, Boosted LeNet-4), but here we are going to focus only on LeNet-5. Here is its architecture:

![Figure 2: LeNet-5 Architecture](https://raw.githubusercontent.com/valoxe/image-storage-1/master/research-paper-summary/lenet-5/2.png?token=AMAXSKIFVOFJARUUJULNGWS6WMEQG)

- LeNet-5 has:
    - 2 **Convolutional** layers.
    - 3 **Fully connected** layers.
    - 2 **Average pooling** layers.
    - **Tanh** as activation function for the hidden layers.
    - **Softmax** as activation function for the output layer.
    - About 60,000 trainable parameters (a quick check of this figure follows the summary).
    - **Cross-entropy** as cost function.
    - **Gradient descent** as optimizer.

- LeNet-5 is:
    - trained on the MNIST dataset (60,000 training examples).
    - trained over 20 epochs.

- LeNet-5 is expected to:
    - converge after 10–12 epochs.
    - have an error rate of 0.95% on the test set (the implementation below reports accuracy instead).
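As a rough sanity check on the parameter count quoted above, here is the layer-by-layer arithmetic for the simplified architecture implemented in Section II (plain average pooling, so only the convolutional and fully connected layers carry weights); the two helper lambdas are just for illustration:

```python
# Rough parameter count for the simplified LeNet-5 implemented below.
conv = lambda k, c_in, c_out: (k * k * c_in + 1) * c_out  # kernel weights + biases
dense = lambda n_in, n_out: n_in * n_out + n_out          # weights + biases

params = (
    conv(5, 1, 6)      # C1: 156
    + conv(5, 6, 16)   # C2: 2,416
    + dense(400, 120)  # FC1: 48,120
    + dense(120, 84)   # FC2: 10,164
    + dense(84, 10)    # FC3: 850
)
print(params)  # 61706 -- close to the ~60,000 figure quoted in the paper
```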
# II/ Implementation

- We will use a simpler version of LeNet-5 than the one described in the paper. For example, the average pooling layers in the paper use trainable coefficients and biases, which makes them slightly more complex than standard average pooling.
- The implementation is divided as follows:
    1. Import libraries
    2. Loading dataset
    3. Data preprocessing
    4. Data visualization
    5. Architecture build
    6. Training
    7. Evaluating

## 1. Import libraries

```python
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Conv2D, AveragePooling2D, Flatten, Dense
from tensorflow.keras.losses import CategoricalCrossentropy
import numpy as np
import matplotlib.pyplot as plt
import tensorflow as tf

%matplotlib inline
```

## 2. Loading dataset

```python
(X_train, y_train), (X_test, y_test) = tf.keras.datasets.mnist.load_data()

X_val, y_val = X_train[55000:, ..., np.newaxis], y_train[55000:]
X_train, y_train = X_train[:55000, ..., np.newaxis], y_train[:55000]
X_test = X_test[..., np.newaxis]

print("Image Shape: {}".format(X_train[0].shape), end='\n\n')
print("Training Set: {} samples".format(len(X_train)))
print("Validation Set: {} samples".format(len(X_val)))
print("Test Set: {} samples".format(len(X_test)))
```

If everything went well, you should get the following output:

```python
Image Shape: (28, 28, 1)
Training Set: 55000 samples
Validation Set: 5000 samples
Test Set: 10000 samples
```

## 3. Data preprocessing

First, check that your ~/.keras/keras.json file contains the following line:

```
"image_data_format": "channels_last"
```

If it is set to "channels_first", replace it with "channels_last".

Now, we need to:
- pad the images to a 32x32x1 shape.
- normalize our dataset.

### Pad the images to a 32x32x1 shape

```python
# Pad images with 0s (28x28 -> 32x32, the input size expected by LeNet-5).
X_train = np.pad(X_train, ((0,0),(2,2),(2,2),(0,0)), 'constant')
X_val = np.pad(X_val, ((0,0),(2,2),(2,2),(0,0)), 'constant')
X_test = np.pad(X_test, ((0,0),(2,2),(2,2),(0,0)), 'constant')

print("Updated Image Shape for: ", end='\n\n')
print("-Training set: {}".format(X_train.shape))
print("-Validation set: {}".format(X_val.shape))
print("-Test set: {}".format(X_test.shape))
```

### Normalize our dataset

```python
# Scale pixel values to [0, 1], then center each set around its mean.
X_train, X_val, X_test = X_train/float(255), X_val/float(255), X_test/float(255)

X_train -= np.mean(X_train)
X_val -= np.mean(X_val)
X_test -= np.mean(X_test)
```

## 4. Data visualization

Let's visualize some pictures. Here is how they look:

![](https://raw.githubusercontent.com/valoxe/image-storage-1/master/research-paper-summary/lenet-5/3.png)
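The snippet that produced this figure is not included in the post; a minimal sketch that displays a similar grid of training samples (using the `X_train` / `y_train` arrays prepared above) could look like this:

```python
# Plot the first 9 training digits with their labels.
fig, axes = plt.subplots(3, 3, figsize=(6, 6))
for i, ax in enumerate(axes.flat):
    ax.imshow(X_train[i].squeeze(), cmap='gray')
    ax.set_title("Label: {}".format(y_train[i]))
    ax.axis('off')
plt.tight_layout()
plt.show()
```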
## 5. Architecture build

Following Figure 2 above, here is the LeNet-5 architecture in Keras.

```python
def LeNet_5():
    model = Sequential()

    # C1: (None,32,32,1) -> (None,28,28,6).
    model.add(Conv2D(6, kernel_size=(5, 5), strides=(1, 1), activation='tanh', input_shape=(32,32,1), padding='valid'))

    # P1: (None,28,28,6) -> (None,14,14,6).
    model.add(AveragePooling2D(pool_size=(2, 2), strides=(2, 2), padding='valid'))

    # C2: (None,14,14,6) -> (None,10,10,16).
    model.add(Conv2D(16, kernel_size=(5, 5), strides=(1, 1), activation='tanh', padding='valid'))

    # P2: (None,10,10,16) -> (None,5,5,16).
    model.add(AveragePooling2D(pool_size=(2, 2), strides=(2, 2), padding='valid'))

    # Flatten: (None,5,5,16) -> (None, 400).
    model.add(Flatten())

    # FC1: (None, 400) -> (None,120).
    model.add(Dense(120, activation='tanh'))

    # FC2: (None,120) -> (None,84).
    model.add(Dense(84, activation='tanh'))

    # FC3: (None,84) -> (None,10).
    model.add(Dense(10, activation='softmax'))

    # Compile the model. Labels are integers, hence sparse categorical cross-entropy.
    model.compile(loss='sparse_categorical_crossentropy', optimizer='sgd', metrics=['accuracy'])

    return model
```

## 6. Training

Let's train and save our model.

```python
# Build and train the model.
model = LeNet_5()

history = model.fit(X_train, y_train, validation_data=(X_val, y_val), epochs=20)

# Save the model.
model.save("lenet5_model.h5")
```

## 7. Evaluating

Now, let's test our model.

```python
# Restore the model.
model = tf.keras.models.load_model('lenet5_model.h5')

# Make predictions.
predictions = model.predict(X_test)

# Retrieve predicted class indices.
y_pred = np.argmax(predictions, axis=1)

# Print test set accuracy.
print('Test set accuracy: {}'.format(np.mean(y_pred == y_test)))

# Plot some examples with model predictions.
print('\nSome correct classifications:')
plot_example(X_test, y_test, y_pred)

print('\nSome incorrect classifications:')
plot_example_errors(X_test, y_test, y_pred)

# Plot training and validation loss.
print('\nPlot of training error over 20 epochs:')
plt.title('Training error')
plt.ylabel('Cost')
plt.xlabel('epoch')
plt.plot(history.history['loss'])
plt.plot(history.history['val_loss'])
plt.legend(['train loss', 'val loss'], loc='upper right')
plt.show()
```

We should get the following output:

```python
Test set accuracy: 0.9837
```

Some correct classifications:

![](https://raw.githubusercontent.com/valoxe/image-storage-1/master/research-paper-summary/lenet-5/4.png)

Some incorrect classifications:

![](https://raw.githubusercontent.com/valoxe/image-storage-1/master/research-paper-summary/lenet-5/5.png)

Plot of training error over 20 epochs:

![](https://raw.githubusercontent.com/valoxe/image-storage-1/master/research-paper-summary/lenet-5/6.png)
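Note that the `plot_example` and `plot_example_errors` helpers called in step 7 are not defined in this post. Here is a minimal sketch of what they might look like (the names, grid size, and layout are assumptions); they would need to be defined before running the evaluation cell:

```python
def _plot_grid(images, y_true, y_pred):
    # Show up to 9 images with their true and predicted labels.
    fig, axes = plt.subplots(3, 3, figsize=(6, 6))
    for i, ax in enumerate(axes.flat):
        if i < len(images):
            ax.imshow(images[i].squeeze(), cmap='gray')
            ax.set_title("True: {}, Pred: {}".format(y_true[i], y_pred[i]))
        ax.axis('off')
    plt.tight_layout()
    plt.show()

def plot_example(X, y_true, y_pred, n=9):
    # Plot n correctly classified samples.
    idx = np.where(y_pred == y_true)[0][:n]
    _plot_grid(X[idx], y_true[idx], y_pred[idx])

def plot_example_errors(X, y_true, y_pred, n=9):
    # Plot n misclassified samples.
    idx = np.where(y_pred != y_true)[0][:n]
    _plot_grid(X[idx], y_true[idx], y_pred[idx])
```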