# **Lab 3: Quantize MobileNet**
<span style="color:Red;">**Due Date: 5/9 23:55**</span>
## Introduction
This lab aims to quantize the MobileNetV2 model, resulting in reduced model size and potentially improved inference speed without significant loss in accuracy.
* Please download the provided Jupyter Notebook file using the link below. Follow the prompts and hints provided within the notebook to fill in the empty blocks and answer the questions.
> [lab3.ipynb](https://colab.research.google.com/drive/1uO8nSfVe5SbY-DR0M4o3wBuhTwJZr4gG?usp=drive_link)
## Part 1: Linear Quantization Implementation (70%)
In this part, you will learn how linear quantization works and implement the quantized version of **Fully Connected Layer** (Linear) and **Convolution Layer** (Conv2d).
Refer to **"Part1"** in the provided notebook (ipynb).
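As a warm-up for Part 1, the core of linear quantization is mapping a real-valued range to an integer range via a scale and zero point. Below is a minimal sketch in plain Python (the notebook asks for a PyTorch tensor version; the function names here are illustrative, not the notebook's):

```python
def get_quantization_params(r_min, r_max, q_min=-128, q_max=127):
    """Compute scale and zero point mapping [r_min, r_max] onto [q_min, q_max]."""
    scale = (r_max - r_min) / (q_max - q_min)
    zero_point = round(q_min - r_min / scale)
    # clamp the zero point into the integer range
    return scale, max(q_min, min(q_max, zero_point))

def quantize(values, scale, zero_point, q_min=-128, q_max=127):
    """r -> q = clamp(round(r / scale) + zero_point)."""
    return [max(q_min, min(q_max, round(v / scale) + zero_point)) for v in values]

def dequantize(q_values, scale, zero_point):
    """q -> r ~= scale * (q - zero_point)."""
    return [scale * (q - zero_point) for q in q_values]

weights = [-1.0, -0.5, 0.0, 0.5, 1.0]
scale, zp = get_quantization_params(min(weights), max(weights))
q = quantize(weights, scale, zp)
r = dequantize(q, scale, zp)  # close to the original weights
```

The reconstruction error per element is bounded by roughly half a quantization step (`scale / 2`), which is why int8 quantization of well-ranged weights loses so little accuracy.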
## Part 2: Quantize MobileNetV2 and Export (30%)
You will quantize MobileNetV2 using the XNNPACK library, and you may optionally deploy the quantized model onto a Raspberry Pi.
Refer to **"Part2"** in the provided notebook (ipynb).
* Below is a MobileNetV2 with 96.3% accuracy on CIFAR10, fine-tuned by the TAs. You will use this model's weights as the starting point for quantization:
> [mobilenetv2_0.963.pth](https://drive.google.com/file/d/1k89xAqC1FETperw11xvpxSPGcEwMfZJh/view)
You can load the above model with the following snippet:
```python
import torch
from torchvision.models.quantization import mobilenet_v2

model = torch.load('./mobilenetv2_0.963.pth', map_location="cpu")
```
:::success
Your score will not be calculated from the ExecuTorch `.pte` file, but you can deploy it onto a Raspberry Pi to compare the execution time against the unquantized model.
:::
## Hand-In Policy
You will need to hand-in:
* Your quantized model ***mobilenet_quantized.pth***
* Your completed ***lab3.ipynb***, renamed to ***`<YourID>`.ipynb***
Please organize your submission files into a zip archive structured as follows:
```
YourID.zip
├── model/
│   └── mobilenet_quantized.pth
└── YourID.ipynb
```
## Evaluation Criteria
Upon receiving your zip file:
1. We will use the following code to load your quantized model:
```python
# Load the saved ExportedProgram
loaded_quantized_ep = torch.export.load(pt2e_quantized_model_file_path)
loaded_quantized_model = loaded_quantized_ep.module()
```
2. Evaluate the accuracy of your quantized model using the following code:
```python
acc = evaluate_model(loaded_quantized_model, test_loader, device)
```
$$
\text{Score} = 10 \times \mathrm{step}(\text{Accuracy} - 0.88) + 20 \times \dfrac{\text{Accuracy} - 0.88}{0.96 - 0.88}
$$
The reported accuracy must be higher than *88%* to obtain full score.
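To make the formula concrete, here is an illustrative calculation (`compute_score` is not part of the grading code; it assumes the step function is 1 for non-negative arguments, consistent with the 88% threshold wording):

```python
def compute_score(accuracy, threshold=0.88, full=0.96):
    """Score = 10 * step(acc - threshold) + 20 * (acc - threshold) / (full - threshold)."""
    step = 1.0 if accuracy >= threshold else 0.0
    return 10 * step + 20 * (accuracy - threshold) / (full - threshold)

print(compute_score(0.96))  # 10 + 20 = full 30 points
print(compute_score(0.92))  # 10 + 20 * 0.5 = 20.0 points
```

So each additional percentage point of accuracy above 88% is worth 2.5 points, up to the 96% full-score mark.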