###### tags: `College Life` `Coding-X` `AI & Machine Learning`
# Practice - MNIST with MLP
## Introduction
This practice is a homework assignment on MLPs:
**learning to use Keras to build an MLP model for MNIST**,
based on sample source code provided in the class slides.
We need to modify the source code and adjust the parameters and the model;
the model's details are laid out step by step below.

The code below was rewritten by me.
I also added more detailed comments and some of my own thoughts, which I hope will help you if you're interested. Enjoy!
## Code
[Github link](https://github.com/chwchao/2019-Coding-X/tree/master/MachineLearning/MNIST)
1. **Open training file**
```python=1
with open('train.csv', 'r') as file:
    csv_lines = file.readlines()
```
<br>
2. **Access and store the data**
According to the format of the MNIST data,
the 1st column is the label, i.e. the answer for the picture.
The remaining columns are the pixel values of the picture (grayscale, 0~255).
So we store the pictures and the labels in separate lists.
```python=+
pic = []
label = []
for i in range(1, len(csv_lines)):  # skip the header row
    # strip '\n', then split by ',' into a list of strings
    row = csv_lines[i].replace('\n', '').split(',')
    # everything except the label: the 784 pixel values
    pic.append(list(map(int, row[1:])))
    # the label only
    label.append(int(row[0]))
```
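
As a quick aside, the same parsing can be done in one call with NumPy; a minimal sketch, assuming `train.csv` has a header row (as Kaggle's Digit Recognizer file does):
```python
import numpy as np

# Parse every row after the header as integers in one call
data = np.loadtxt('train.csv', delimiter=',', skiprows=1, dtype=int)
pic = data[:, 1:]    # the 784 pixel columns
label = data[:, 0]   # the label column
```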
<br>
3. **One-hot encoding**
Because the labels take only ten values (the digits 0~9), we can use one-hot encoding to represent each label as a 10-dimensional vector.
```python=+
from keras.utils import to_categorical
label = to_categorical(label, num_classes=10)
```
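
To see what the encoding does concretely, the digit 3 becomes a vector with a 1 in position 3:
```python
from keras.utils import to_categorical

print(to_categorical([3], num_classes=10))
# -> [[0. 0. 0. 1. 0. 0. 0. 0. 0. 0.]]
```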
<br>
4. **Transform into numpy.array and normalize**
Because the pixel values are grayscale intensities in the range 0~255,
we divide them by 255 to normalize, re-mapping them to 0~1.
```python=+
import numpy as np
pic = np.array(pic)/255.0
label = np.array(label)
print(pic.shape)
print(label.shape)
```
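
A quick sanity check on the normalization (assuming the raw pixels really span the full 0~255 range):
```python
# After dividing by 255, every pixel value should lie in [0, 1]
print(pic.min(), pic.max())  # expected: 0.0 1.0
```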

<br>
5. **Build the model**
We can see the model as empty at first, and we add layers to it one by one.
>First layer:
>Besides the number of neurons and the activation function, this layer is the interface where the data enters the model, so we also have to tell it the shape of the input.
>Hidden layers:
>The layers between the input and output ones. Only the number of neurons and the activation function need to be set.
>Output layer:
>The last layer. Its number of neurons must equal the number of classes we want the model to choose from. In this case the answer is one of the ten digits 0~9, so the output layer must have 10 neurons.
>We use the softmax activation so the outputs form a probability distribution that sums to 1.
```python=+
from keras.models import Sequential
from keras.layers import Dense, Dropout
# Declare a sequential deep-learning model
model = Sequential()
# Add the first layer (number of neurons, activation function, input shape)
model.add(Dense(512, activation='relu', input_shape=(784,)))
# Add the second layer (number of neurons, activation function)
model.add(Dense(512, activation='relu'))
# Dropout: randomly disable 20% of the neurons during training
model.add(Dropout(0.2))
# Add the third layer (same form as the second one)
model.add(Dense(256, activation='relu'))
# Add the output layer (softmax normalizes the outputs to sum to 1)
model.add(Dense(10, activation='softmax'))
model.summary()
```
*relu -- Rectified Linear Unit*
*softmax -- normalized exponential; turns the outputs into a probability distribution*
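
A dense layer has `inputs × neurons + neurons` trainable parameters (weights plus biases), so we can predict what `model.summary()` will report; a quick check of the arithmetic:
```python
# Trainable parameters per Dense layer: inputs * units + units (biases)
layer_shapes = [(784, 512), (512, 512), (512, 256), (256, 10)]
total = sum(n_in * n_out + n_out for n_in, n_out in layer_shapes)
print(total)  # 798474 -- Dropout adds no parameters
```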

<br>
6. **Compile the model**
Configure the model: the loss function, the optimizer, and the metrics to track.
```python=+
from keras.optimizers import RMSprop
# settings of the model
model.compile(
    loss='categorical_crossentropy',
    optimizer=RMSprop(),
    metrics=['acc']
)
```
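
As a side note, Keras also provides `sparse_categorical_crossentropy`, which works directly with the integer labels, so the one-hot step could be skipped; a sketch of that variant:
```python
from keras.optimizers import RMSprop

# Alternative: keep the integer labels (0~9); the loss handles them directly
model.compile(
    loss='sparse_categorical_crossentropy',
    optimizer=RMSprop(),
    metrics=['acc']
)
```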
<br>
7. **Start training**
>batch_size -- how many samples are used for each gradient update
>epochs -- how many times the whole training set is passed through
>validation_split -- the fraction of the data held out to validate the model's loss, accuracy, etc.
>verbose -- 0: no log, 1: progress bar, 2: one log line per epoch
```python=+
batch_size = 64
epochs = 4
history = model.fit(
    pic,
    label,
    batch_size=batch_size,
    epochs=epochs,
    validation_split=0.2,
    verbose=1
)
```
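
To make the numbers concrete: with `validation_split=0.2`, 80% of the data is trained on, and each epoch performs `ceil(train_samples / batch_size)` gradient updates. Assuming the 42,000-row Kaggle training file:
```python
import math

train_samples = int(42000 * (1 - 0.2))             # 33600 samples for training
updates_per_epoch = math.ceil(train_samples / 64)  # 525 gradient updates
print(train_samples, updates_per_epoch)
```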

<br>
8. **Evaluation**
Plot the accuracy during training and validation.
>acc -- accuracy on the training data
>val_acc -- accuracy on the validation data
```python=+
import matplotlib.pyplot as plt

def show_train_history(train_history):
    plt.plot(train_history.history['acc'])
    plt.plot(train_history.history['val_acc'])
    plt.xticks(range(len(train_history.history['acc'])))
    plt.title('Train History')
    plt.ylabel('acc')
    plt.xlabel('epoch')
    plt.legend(['train', 'validation'], loc='upper left')
    plt.show()

show_train_history(history)
```
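
The same `History` object also records the loss, so a companion loss plot needs only a few more lines:
```python
import matplotlib.pyplot as plt

# Plot training vs. validation loss from the same History object
plt.plot(history.history['loss'])
plt.plot(history.history['val_loss'])
plt.title('Loss History')
plt.ylabel('loss')
plt.xlabel('epoch')
plt.legend(['train', 'validation'], loc='upper right')
plt.show()
```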

<br>
9. **Confusion Matrix**
```python=+
import itertools
from sklearn.metrics import confusion_matrix, classification_report

def plot_confusion_matrix(
    cm,
    classes,
    title='Confusion matrix',
    cmap=plt.cm.Blues
):
    fig = plt.figure()
    plt.imshow(cm, interpolation='nearest', cmap=cmap)
    plt.title(title)
    plt.colorbar()
    tick_marks = np.arange(len(classes))
    plt.xticks(tick_marks, classes, rotation=45)
    plt.yticks(tick_marks, classes)
    thresh = cm.max() / 2.
    for i, j in itertools.product(range(cm.shape[0]), range(cm.shape[1])):
        plt.text(j, i, cm[i, j], horizontalalignment="center",
                 color="white" if cm[i, j] > thresh else "black")
    plt.tight_layout()
    plt.ylabel('True label')
    plt.xlabel('Predicted label')
    # fig.savefig(title + '.eps', format='eps', dpi=600, quality=95)
    # fig.savefig(title + '.png', dpi=600, quality=95)
    plt.show()
    plt.close()

# MNIST labels
classes = ['0', '1', '2', '3', '4', '5', '6', '7', '8', '9']
# Predict on the training data
Y_pred = model.predict(pic)
# Convert the predicted probabilities to class indices
y_pred = np.argmax(Y_pred, axis=1)
# Convert the one-hot training labels back to class indices
Y_true = np.argmax(label, axis=1)
# Compute the confusion matrix
confusion_mtx = confusion_matrix(Y_true, y_pred)
# Plot the confusion matrix
plot_confusion_matrix(
    confusion_mtx,
    classes=classes,
    title='Confusion_matrix_train'
)
```
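
Note that `classification_report` is imported above but never called; it prints per-digit precision, recall, and F1 from the same predictions:
```python
# Per-digit precision, recall, and F1 score
print(classification_report(Y_true, y_pred, target_names=classes))
```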

<br>
10. **Save the model**
```python=+
try:
    # save_weights() stores only the weights, not the architecture
    model.save_weights('mnist.h5')
    print('Saving success!')
except Exception:
    print('Saving failed!')
```
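
One caveat: because `save_weights()` stores only the weight values, reusing the file requires rebuilding the same architecture first; a minimal sketch (`model2` is a hypothetical name):
```python
from keras.models import Sequential
from keras.layers import Dense, Dropout

# Rebuild the identical architecture, then restore the saved weights
model2 = Sequential()
model2.add(Dense(512, activation='relu', input_shape=(784,)))
model2.add(Dense(512, activation='relu'))
model2.add(Dropout(0.2))
model2.add(Dense(256, activation='relu'))
model2.add(Dense(10, activation='softmax'))
model2.load_weights('mnist.h5')
```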

<br>
11. **Predict with the test file**
Notice that in the `predict_classes()` call there's a `[ ]` wrapped between *np.array()* and *list()*. When we called *fit()* above, the data we passed was a 2D array, and the input here must be 2D as well, so we nest the single row in a list to turn it into a batch of one.
```python=+
# Open the test file
with open('test.csv', 'r') as file:
    csv_lines = file.readlines()
# Counter for showing pictures
show_image = 0
for i in range(1, len(csv_lines)):
    # strip '\n', then split by ',' into a list of strings
    row = csv_lines[i].replace('\n', '').split(',')
    # predict (the argument format must match what fit() received)
    result = model.predict_classes(np.array([list(map(int, row))])/255.0)[0]
    # set up the subplot
    ax = plt.subplot(2, 5, (i % 10) if i % 10 != 0 else 10)
    ax.imshow(
        255 - np.array(list(map(int, row))).reshape(28, 28).astype(np.uint8),
        cmap='gray'
    )
    plt.title('result: ' + str(result))
    plt.axis('off')
    show_image += 1
    if show_image == 10:  # show a page every ten pictures
        plt.show()
        plt.draw()
        show_image = 0
plt.show()
plt.draw()
```
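
`predict_classes()` exists only on `Sequential` models in older Keras versions and was removed in newer TensorFlow releases; if your version lacks it, the same result comes from `predict()` plus `argmax`:
```python
# Equivalent of predict_classes(): argmax over the softmax outputs
x = np.array([list(map(int, row))]) / 255.0
result = np.argmax(model.predict(x), axis=1)[0]
```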

<br>
12. **Submit the result to submit.csv**
```python=+
# The header of the submission file: the two column titles
submit = 'ImageId,Label\n'
# Open the test file
with open('test.csv', 'r') as file:
    csv_lines = file.readlines()
# Row counter
image_id = 1
for i in range(1, len(csv_lines)):
    # strip '\n', then split by ',' into a list of strings
    row = csv_lines[i].replace('\n', '').split(',')
    # predict (same as above)
    result = model.predict_classes(np.array([list(map(int, row))])/255.0)[0]
    submit += str(image_id) + ',' + str(result) + '\n'
    image_id += 1
# Write the file
with open('submit.csv', 'w') as file:
    file.write(submit)
```
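
Predicting one row at a time calls the model once per image, which is slow. A sketch of a batched variant that reads and predicts the whole test file in one go (same file names as above):
```python
import numpy as np

# Read all test rows at once, normalize, and predict in a single batch
test = np.loadtxt('test.csv', delimiter=',', skiprows=1) / 255.0
results = np.argmax(model.predict(test), axis=1)
with open('submit.csv', 'w') as file:
    file.write('ImageId,Label\n')
    for image_id, result in enumerate(results, start=1):
        file.write(f'{image_id},{result}\n')
```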


## Reference
Function Arguments
https://keras-cn.readthedocs.io/en/latest/models/model/
What's MNIST?
https://en.wikipedia.org/wiki/MNIST_database