## Problem
- Design a CFU for the MLPerf™ Tiny image classification benchmark model, with the goal of reducing inference latency.
- Your design will be benchmarked with the MLPerf™ Tiny Benchmark Framework. See its [GitHub page](https://github.com/mlcommons/tiny) for detailed information about MLPerf™ Tiny.
### Selected model
- [MLPerf™ Tiny Image Classification Benchmark Model](https://github.com/mlcommons/tiny/tree/master/benchmark/training/image_classification) is a tiny version of ResNet.
- It consists of Conv2D, Add, AvgPool2D, FC, and Softmax.
- You don't need to integrate the model yourself; it is already included in CFU-Playground.
- See `${CFU_ROOT}/common/src/models/mlcommons_tiny_v01/imgc/`
- You can inspect the architecture of the selected model with [Netron](https://netron.app/).
    - Upload the model and you will see a computation graph showing the operators, tensors, and the dependencies between them.
    - It might give you some inspiration for your design.
## Setup
- Clone this [fork of CFU-Playground](https://github.com/liuyy3364/CFU-Playground.git) to get the final project template
    - Final project template path: `${CFU_ROOT}/proj/AAML_final_proj`
- Accuracy and Latency are evaluated by the provided evaluation script
- Script path: `${CFU_ROOT}/proj/AAML_final_proj/eval_script.py`
- Dependency:
```shell=
pip install pyserial tqdm
```
## Requirement
- Files that you can modify
- Kernel API
1. `tensorflow/lite/micro/kernels/add.cc`
2. `tensorflow/lite/micro/kernels/conv.cc`
3. `tensorflow/lite/micro/kernels/fully_connected.cc`
    - Kernel implementation (see the CFU offload sketch after the note below)
1. `tensorflow/lite/kernels/internal/reference/integer_ops/add.h`
2. `tensorflow/lite/kernels/internal/reference/integer_ops/conv.h`
3. `tensorflow/lite/kernels/internal/reference/integer_ops/fully_connected.h`
- HW design
1. `cfu.v`
:::info
:warning: No other source code under `${CFU_ROOT}/common/**` or `${CFU_ROOT}/third_party/**` should be overridden unless you ask for permission first.
:::
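For orientation only, here is a minimal software-side sketch of one common split: the inner multiply-accumulate of the convolution loop in `conv.h` is forwarded to the CFU through the `cfu_op*` macros from CFU-Playground's `cfu.h`. The three sub-operations (0 = reset accumulator, 1 = MAC, 2 = read back), the offset handling, and the helper name `cfu_dot_product` are assumptions of this sketch, not part of the template; the real interface is whatever you decode in your `cfu.v`.

```cpp
// Hypothetical SW-side sketch for conv.h: stream input/filter pairs to the
// CFU and read back the accumulated dot product. Assumes cfu.h's
// cfu_op0/cfu_op1/cfu_op2 macros and a cfu.v that implements
// funct3=0: reset accumulator, funct3=1: MAC, funct3=2: return accumulator.
#include <cstdint>
#include "cfu.h"

static inline int32_t cfu_dot_product(const int8_t* input,   // input patch
                                      const int8_t* filter,  // filter weights
                                      int depth) {
  cfu_op0(0, 0, 0);  // reset the hardware accumulator
  for (int d = 0; d < depth; ++d) {
    // One 8-bit MAC per call; a faster design would pack four int8 values
    // into each 32-bit operand (and add the input offset in hardware).
    cfu_op1(0, input[d], filter[d]);
  }
  return static_cast<int32_t>(cfu_op2(0, 0, 0));  // read the result back
}
```

On the hardware side, `cfu.v` would decode the function id of the CFU handshake to select among these sub-operations; how much of the convolution you move into hardware is entirely your design choice.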
- Your design should pass the golden test
    - After `make prog && make load`, enter `11g` to run the golden test of the MLPerf Tiny imgc model
- Make sure you are running imgc's golden test if multiple models are included
    - The golden test has passed if you see the expected output:
        - *(screenshot of a passing golden-test run)*
- You can modify the architecture or the parameters of the selected model
- The classification accuracy of your design should be evaluated
    - Run `python eval_script.py` in `${CFU_ROOT}/proj/AAML_final_proj`
        - Add `--port {tty_path:-/dev/ttyUSB1}` to select the correct serial port, e.g. `python eval_script.py --port /dev/ttyUSB1`
- Improve the performance of your design to make the latency as low as possible
- Accuracy and Latency are evaluated by the provided evaluation script
- Usage:
        - `make prog && make load` → reboot LiteX → close litex-term → run the evaluation script
:::info
:bulb: If you only want to know the latency of your design, it is easier to run a single test input than the whole evaluation process.
:::
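If you do measure latency manually, one minimal approach (a sketch under assumptions, not the official flow) is to wrap a single inference with cycle-counter reads. This assumes CFU-Playground's `perf.h` exposes `perf_get_mcycle()` (check `${CFU_ROOT}/common/src/perf.h` for the exact API); `run_one_inference()` below is a hypothetical stand-in for invoking the imgc model on one test input.

```cpp
// Rough per-inference latency probe.
// Assumption: perf.h provides perf_get_mcycle() (32-bit cycle-counter read).
#include <cstdio>
#include "perf.h"

extern void run_one_inference();  // hypothetical: run the imgc model once

void measure_one_inference() {
  unsigned start = perf_get_mcycle();  // cycles before inference
  run_one_inference();
  unsigned end = perf_get_mcycle();    // cycles after inference
  printf("latency: %u cycles\n", end - start);  // unsigned diff tolerates one wrap
}
```

The project's test menu typically prints cycle counts after each run as well; a probe like this is mainly useful when you want to time one specific region of a kernel.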
## Presentation
:::info
:warning: You will receive 0 points if you don't present your work
:::
- 30%
- You should give a presentation in the last class of this semester
    - Each team has at most 5 minutes to present
- Your presentation should contain
- The introduction of your design
- SW
- HW
- (Optional) The implementation of your design
- SW
- HW
- The evaluation of your design
- Accuracy (if you modify the selected model)
- Latency
## Grading Policy
- We will compare the performance of your design with our reference design, which is an implementation of HW2 and will not be released.
    - ACC won't be tested if you don't modify the model
- $LAT_{TA} \approx 154M\ cycles \approx 2036000\ \mu s$
- All $ACC_{XX}$ and ${LAT_{XX}}$ are measured by the provided evaluation script
- Ranking will be released with everyone's evaluation result after the deadline.
### Grading formula
- Accuracy:
$$
GOLD=
\begin{cases}
1 & \text{if the golden test passed} \\
0 & \text{if the golden test failed}
\end{cases}
$$
$$
ACC=Min(\frac{ACC_{student}}{ACC_{ori}},\ 100\%)
$$
:::info
:warning: Note that a higher ACC won't give you a better score!
:::
- Latency:
$$
\begin{aligned}
LAT_{base} &= Min\left(80\times\frac{LAT_{TA}}{LAT_{student}},\ 80\right)\\
LAT_{rank} &= Min\left(20\times\frac{\#students-Rank_{student}}{\#students},\ 20\right)\\
&\qquad \text{where } Rank_{student} \in [0,\ \#students-1]
\end{aligned}
$$
- Presentation
$$
Present=
\begin{cases}
-30 & \text{if you submit a plain implementation of Lab 2 with the same performance as the TA's} \\
0 & \text{otherwise}
\end{cases}
$$
- Final score:
$$
\begin{gathered}
Score = GOLD\times ACC \times (LAT_{base}+LAT_{rank}) + Present\\
(\text{Highest score} = 1\times 100\%\times(80+20)+0 = 100)
\end{gathered}
$$
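A hypothetical worked example (all numbers are invented for illustration): suppose the golden test passes, the model is unmodified (so $ACC=100\%$), the measured latency is $77M$ cycles (half of $LAT_{TA}$), and the design ranks 3rd out of 20 students ($Rank_{student}=2$). Then:
$$
\begin{aligned}
LAT_{base} &= Min\left(80\times\tfrac{154M}{77M},\ 80\right) = 80\\
LAT_{rank} &= Min\left(20\times\tfrac{20-2}{20},\ 20\right) = 18\\
Score &= 1\times 100\%\times(80+18)+0 = 98
\end{aligned}
$$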
## Submission
- Please fork my repo and push your work to it
- **If you use your own model**
- Put pretrained model under `${CFU_ROOT}/proj/AAML_final_proj` or somewhere else we can easily find it
- Send us the link to your training/optimization script (Github repo or GoogleDrive ...) via email (yyliu.cs11@nycu.edu.tw)
    - Or you can put them in your final project repo and leave a message about where to find them in the `README.md` under your CFU project directory \([this file](https://github.com/liuyy3364/CFU-Playground/blob/main/proj/AAML_final_proj/README.md)\)
- **Put the link of your fork and your presentation slides to this spreadsheet**
- https://docs.google.com/spreadsheets/d/15Got2YzOi-4sHKineF5v3dHaOsCUczfY464RRLa8qCU/edit#gid=1193696083
- Grading workflow will be:
1. Clone your fork
2. Apply your custom model if needed
    3. `make prog && make load`
4. Run golden test
5. Run evaluation script
6. Record measurements