# Transfer Learning with PyTorch
###### tags: `黃仲璿` `2021/07/16`

> Transfer learning is a technique for re-training a DNN model on a new dataset, which takes less time than training a network from scratch. With transfer learning, the weights of a pre-trained model are fine-tuned to classify a customized dataset.
> ---Dusty from Jetson Inference

> PyTorch is the machine learning framework that we'll be using, and example datasets along with training scripts are provided to use below, in addition to a camera-based tool for collecting and labeling your own training datasets.
> ---Dusty from Jetson Inference

Introduction to PyTorch:
https://medium.com/pyladies-taiwan/%E6%B7%B1%E5%BA%A6%E5%AD%B8%E7%BF%92%E6%96%B0%E6%89%8B%E6%9D%91-pytorch%E5%85%A5%E9%96%80-511df3c1c025

## Verifying PyTorch Installation

We installed PyTorch when setting up the Jetson Nano. The following commands can be used in the shell to check the PyTorch version:

```
$ python3
>>> import torch
>>> print(torch.__version__)
>>> import torchvision
>>> print(torchvision.__version__)
```

The torch version should return 1.6.0, and the torchvision version should return 0.7.0.

## Disable Desktop GUI Temporarily

If memory runs low while training, temporarily disabling the Ubuntu desktop GUI can free up extra memory. Simply restart the Jetson Nano after the training finishes to enable the desktop GUI again.

```
$ sudo init 3     # stop the desktop
# log your user back into the console
# run the PyTorch training scripts
$ sudo init 5     # restart the desktop
```

To **permanently** disable the Ubuntu desktop GUI:

```
$ sudo systemctl set-default multi-user.target     # disable desktop on boot
$ sudo systemctl set-default graphical.target      # enable desktop on boot
```

## Re-Training Image Classification Model

### Re-Training on the Cat/Dog Dataset

Download the example dataset (5000 training images, 1000 validation images, and 200 test images):

```
$ cd jetson-inference/python/training/classification/data
$ wget https://nvidia.box.com/shared/static/o577zd8yp3lmxf5zhm38svrbrv45am3y.gz -O cat_dog.tar.gz
$ tar xvzf cat_dog.tar.gz
```

### Re-training ResNet-18 Model

Launch the training under `jetson-inference/python/training/classification/`:

```
$ cd jetson-inference/python/training/classification
$ python3 train.py --model-dir=models/cat_dog data/cat_dog
```

#### Optional Arguments

* `--epochs` changes the number of epochs to run (default is 35)
* `--arch` changes which network model to train (default is ResNet-18)
* `--batch` changes the size of each batch (default is 8)
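`train.py` is based on PyTorch's standard image-classification training script, and the transfer-learning step it performs shows up in the training output below as the "reshaped ResNet fully-connected layer" line. As a rough sketch of what that step amounts to (illustrative only; the actual script handles this for you, and it assumes the torchvision install verified above):

```python
# Illustrative sketch only -- train.py already does this internally.
import torch
import torchvision.models as models

# Load ResNet-18 with weights pre-trained on ImageNet
model = models.resnet18(pretrained=True)

# Replace the final fully-connected layer so it outputs one score per class
# (2 classes for the cat/dog dataset); this new layer is then fine-tuned
# on the new dataset along with the pre-trained weights.
num_classes = 2
model.fc = torch.nn.Linear(model.fc.in_features, num_classes)

print(model.fc)  # Linear(in_features=512, out_features=2, bias=True)
```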
#### Things You'll See in Shell When Training

Text similar to the following lines should appear in the shell while training:

```
Use GPU: 0 for training
=> dataset classes: 2 ['cat', 'dog']
=> using pre-trained model 'resnet18'
=> reshaped ResNet fully-connected layer with: Linear(in_features=512, out_features=2, bias=True)
Epoch: [0][ 0/625] Time 0.932 ( 0.932) Data 0.148 ( 0.148) Loss 6.8126e-01 (6.8126e-01) Acc@1 50.00 ( 50.00) Acc@5 100.00 (100.00)
Epoch: [0][ 10/625] Time 0.085 ( 0.163) Data 0.000 ( 0.019) Loss 2.3263e+01 (2.1190e+01) Acc@1 25.00 ( 55.68) Acc@5 100.00 (100.00)
Epoch: [0][ 20/625] Time 0.079 ( 0.126) Data 0.000 ( 0.013) Loss 1.5674e+00 (1.8448e+01) Acc@1 62.50 ( 52.38) Acc@5 100.00 (100.00)
Epoch: [0][ 30/625] Time 0.127 ( 0.114) Data 0.000 ( 0.011) Loss 1.7583e+00 (1.5975e+01) Acc@1 25.00 ( 52.02) Acc@5 100.00 (100.00)
Epoch: [0][ 40/625] Time 0.118 ( 0.116) Data 0.000 ( 0.010) Loss 5.4494e+00 (1.2934e+01) Acc@1 50.00 ( 50.30) Acc@5 100.00 (100.00)
Epoch: [0][ 50/625] Time 0.080 ( 0.111) Data 0.000 ( 0.010) Loss 1.8903e+01 (1.1359e+01) Acc@1 50.00 ( 48.77) Acc@5 100.00 (100.00)
Epoch: [0][ 60/625] Time 0.082 ( 0.106) Data 0.000 ( 0.009) Loss 1.0540e+01 (1.0473e+01) Acc@1 25.00 ( 49.39) Acc@5 100.00 (100.00)
Epoch: [0][ 70/625] Time 0.080 ( 0.102) Data 0.000 ( 0.009) Loss 5.1142e-01 (1.0354e+01) Acc@1 75.00 ( 49.65) Acc@5 100.00 (100.00)
Epoch: [0][ 80/625] Time 0.076 ( 0.100) Data 0.000 ( 0.009) Loss 6.7064e-01 (9.2385e+00) Acc@1 50.00 ( 49.38) Acc@5 100.00 (100.00)
Epoch: [0][ 90/625] Time 0.083 ( 0.098) Data 0.000 ( 0.008) Loss 7.3421e+00 (8.4755e+00) Acc@1 37.50 ( 50.00) Acc@5 100.00 (100.00)
Epoch: [0][100/625] Time 0.093 ( 0.097) Data 0.000 ( 0.008) Loss 7.4379e-01 (7.8715e+00) Acc@1 50.00 ( 50.12) Acc@5 100.00 (100.00)
```

`Epoch: [N]`
* An epoch is one complete training pass over the whole dataset.
* Indicates which epoch you are currently on (starting from 0).

`[N/625]`
* The current image batch from the epoch that you are on.

`Loss`
* The accumulated errors that the model made (expected vs. predicted).

`Acc@1`
* The Top-1 classification accuracy over the batch, meaning that the model predicted exactly the correct class.

`Acc@5`
* The Top-5 classification accuracy over the batch, meaning that the correct class was one of the top 5 outputs the model predicted.
* *Note: Since this Cat/Dog example only has **2 classes** (Cat and Dog), Top-5 is always 100%. Other datasets from the tutorial have more than 5 classes, where Top-5 is valid.*

## Collecting Your Own Classification Datasets
###### tags: `2021/07/19` `2021/07/21`

`camera-capture` is a tool for capturing and labeling images on the Jetson Nano from *live video*. It creates datasets with the following directory structure on disk:

```
‣ train/
  • class-A/
  • class-B/
  • ...
‣ val/
  • class-A/
  • class-B/
  • ...
‣ test/
  • class-A/
  • class-B/
  • ...
```

`class-A`, `class-B`, etc. are subdirectories containing the data for each object class defined in `labels.txt`.

### Creating the Label File

Create an empty directory under `jetson-inference/python/training/classification/data` to store the dataset, along with a text file `labels.txt` that defines the class labels. `labels.txt` should contain *one class label per line*.

My `labels.txt`:

```
blue
red
black
pencil
high_lighter
```

The corresponding directory structure that `camera-capture` will create:

```
‣ train/
  • blue/
  • red/
  • black/
  • pencil/
  • high_lighter/
‣ val/
  • blue/
  • red/
  • black/
  • pencil/
  • high_lighter/
‣ test/
  • blue/
  • red/
  • black/
  • pencil/
  • high_lighter/
```

### Launching the Tool

> The source for the `camera-capture` tool can be found under `jetson-inference/tools/camera-capture/`, and like the other programs from the repo it gets built to the `aarch64/bin` directory and installed under `/usr/local/bin/`
> ---Dusty from Jetson Inference

To launch `camera-capture`, open the shell and type the following command:

`$ camera-capture /dev/video0`

![](https://i.imgur.com/0QkWb6D.png)

![](https://i.imgur.com/Xzwl8vw.png)

### Collect Training Images

Select the correct `Dataset Path` and `Class Labels`. Leave `Dataset Type` as `Classification` and `Current Set` as `train`.

Position the camera towards the object and set `Current Class` to whatever the object should be categorized as (for instance, `blue`). Click `Capture (space)` or press the spacebar on your keyboard to capture an image.

![](https://i.imgur.com/2iBANVv.png)

The status bar displays how many images have been saved under that category.

*Note: The path for `Class Label` is incorrect in this photo. It should be `.../classification/data/pen_dataset/labels.txt`*
![](https://i.imgur.com/QR0pZI1.png)

The images will be stored under `/home/huang/jetson-inference/python/training/classification/data/pen_dataset/train/<Class Label>` (for instance, `/home/huang/jetson-inference/python/training/classification/data/pen_dataset/train/blue`).

![](https://i.imgur.com/CaYCI61.png)

*To keep things simple, I only captured 12 images per label.*

> It's recommended to collect at least 100 training images per class before attempting training. A rule of thumb for the validation set is that it should be roughly 10-20% the size of the training set, and the size of the test set is simply dictated by how many static images you want to test on. You can also just run the camera to test your model if you'd like.
>
> It's important that your data is collected from varying object orientations, camera viewpoints, lighting conditions, and ideally with different backgrounds to create a model that is robust to noise and changes in environment. If you find that your model isn't performing as well as you'd like, try adding more training data and playing around with the conditions.
> ---Dusty from Jetson Inference

### Add Validation Images

After capturing all your training images, randomly pick **20%** of the images of each label and copy them to `/home/huang/jetson-inference/python/training/classification/data/pen_dataset/val/<Class Label>` (a scripted version of this split is sketched further below).

For example, I captured 12 images per label for training, so I randomly picked 2 images of each label from `train/<Class Label>` and copied them to `val/<Class Label>`.

### Training Your Model

```
$ cd jetson-inference/python/training/classification
$ python3 train.py --model-dir=models/pen data/pen_dataset
```

This starts the training process. Training uses the ResNet-18 model with PyTorch; the pre-trained network weights are cached at `/home/huang/.cache/torch/hub/checkpoints/resnet18-5c106cde.pth`.

After the training finishes, convert the model from PyTorch to ONNX:

`$ python3 onnx_export.py --model-dir=models/<YOUR-MODEL>`

For instance:

`$ python3 onnx_export.py --model-dir=models/pen`

The converted model will be saved under `models/<YOUR-MODEL>/resnet18.onnx` (for instance, `models/pen/resnet18.onnx`), which you can then load with the imagenet programs like we did in the previous examples:

```
$ imagenet.py --model=<YOUR-MODEL>/resnet18.onnx --input_blob=input_0 --output_blob=output_0 --labels=<YOUR-DATASET>/labels.txt
```

For instance:

```
$ imagenet.py --model=models/pen/resnet18.onnx --input_blob=input_0 --output_blob=output_0 --labels=data/pen_dataset/labels.txt /dev/video0
```

The image classification will start running on live video.

![](https://i.imgur.com/faOD0rp.png)

I only trained the model with 12 images per label, so the classification accuracy is very low.

![](https://i.imgur.com/cpwa06E.png)

### Difficulties Encountered (and Solved)

* `labels.txt` should be located under `data/pen_dataset`, not under `data/`.
* The training didn't work at first since I originally had `labels.txt` under `data/`.
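Rather than picking and copying the validation images by hand as described in the *Add Validation Images* step above, the 20% split can also be scripted. A rough sketch, assuming the directory layout that `camera-capture` creates and the `data/pen_dataset` path used in this example (adjust the path and fraction as needed):

```python
# Rough helper: copy a random ~20% of each class's training images into val/.
# Assumes the layout created by camera-capture (train/<class>, val/<class>).
import os
import random
import shutil

dataset = "data/pen_dataset"   # path used in this example
val_fraction = 0.2

for class_name in os.listdir(os.path.join(dataset, "train")):
    train_dir = os.path.join(dataset, "train", class_name)
    val_dir = os.path.join(dataset, "val", class_name)
    os.makedirs(val_dir, exist_ok=True)

    images = os.listdir(train_dir)
    num_val = max(1, int(len(images) * val_fraction))
    for filename in random.sample(images, num_val):
        # copy (as done above); use shutil.move to remove them from train/ instead
        shutil.copy(os.path.join(train_dir, filename), os.path.join(val_dir, filename))
```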
## Re-Training Object Detection Model
###### tags: `2021/07/22`

### Re-training SSD-Mobilenet

> Next, we'll train our own SSD-Mobilenet object detection model using PyTorch and the Open Images dataset. SSD-Mobilenet is a popular network architecture for realtime object detection on mobile and embedded devices that combines the SSD-300 Single-Shot MultiBox Detector with a Mobilenet backbone.
>
> ![](https://i.imgur.com/5BOKjxz.png)
>
> *Note: first make sure that you have JetPack 4.4 or newer on your Jetson and PyTorch installed for Python 3.6*
> ---Dusty from Jetson Inference

#### Setup

If `jetson-inference/python/training/detection/ssd` is empty, go to https://github.com/dusty-nv/pytorch-ssd/tree/8ed842a408f8c4a8812f430cf8063e0b93a56803 and download everything into `jetson-inference/python/training/detection/ssd`.

```
$ cd jetson-inference/python/training/detection/ssd
$ mkdir models
$ wget https://nvidia.box.com/shared/static/djf5w54rjvpqocsiztzaandq1m3avr7c.pth -O models/mobilenet-v1-ssd-mp-0_675.pth
$ pip3 install -v -r requirements.txt
```

This downloads the base model to `ssd/models` and installs some required Python packages. The base model is already pre-trained; we'll use transfer learning to fine-tune it to detect new object classes of our choosing.

#### Download the Data

> The Open Images dataset contains over 600 object classes that you can pick and choose from. There is a script provided called open_images_downloader.py which will automatically download the desired object classes for you.
>
> The classes that we'll be using are "Apple,Orange,Banana,Strawberry,Grape,Pear,Pineapple,Watermelon", for example for a fruit-picking robot - although you are welcome to substitute your own choices from the class list. The fruit classes have ~6500 images, which is a happy medium.
> ---Dusty from Jetson Inference

```
$ python3 open_images_downloader.py --class-names "Apple,Orange,Banana,Strawberry,Grape,Pear,Pineapple,Watermelon" --data=data/fruit
```

By default, the dataset will be downloaded to the `data/` directory under `jetson-inference/python/training/detection/ssd`, but you can change that by specifying the `--data=<PATH>` option.

#### Limiting the Amount of Data

Open Images can contain far more data than can be trained on in a reasonable amount of time, so it's recommended to first run the downloader script with the `--stats-only` option. This shows how many images there are for your classes **without actually downloading any images**.

For example:

```
$ python3 open_images_downloader.py --stats-only --class-names "Apple,Orange,Banana,Strawberry,Grape,Pear,Pineapple,Watermelon" --data=data/fruit
```

You can limit the amount of data downloaded with the `--max-images` option or the `--max-annotations-per-class` option:

* `--max-images` limits the total dataset to the specified number of images, while keeping the distribution of images per class roughly the same as the original dataset. If one class has more images than another, the ratio will remain roughly the same.
* `--max-annotations-per-class` limits each class to the specified number of bounding boxes, and if a class has less than that number available, all of its data will be used - this is useful if the distribution of data is unbalanced across classes.
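As a rough illustration of the difference (the per-class counts here are assumed, not taken from the actual dataset statistics): if the full download contained ~6,500 images and about 1,300 of them (roughly 20%) were Apple, then `--max-images=1000` would keep on the order of 200 Apple images, still roughly 20% of the reduced set, whereas `--max-annotations-per-class=200` would cap Apple (and every other class) at 200 bounding boxes regardless of its share of the original data.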
For example, if you only want to use 100 images for training:

```
$ python3 open_images_downloader.py --max-images=100 --class-names "Apple,Orange,Banana,Strawberry,Grape,Pear,Pineapple,Watermelon" --data=data/fruit
```

#### Approximate SSD-Mobilenet Training Performance

|             | Images/sec | Time per epoch* |
| ----------- | ---------- | --------------- |
| Jetson Nano | 4.77       | 17 min 55 sec   |

*Note: Time per epoch is measured on the fruit dataset (5145 training images, batch size 4).*

#### Training the SSD-Mobilenet Model

```
$ python3 train_ssd.py --data=data/fruit --model-dir=models/fruit --batch-size=4 --epochs=30
```

##### Common Arguments

| Argument       | Default   | Description |
| -------------- | --------- | ----------- |
| `--data`       | `data/`   | the location of the dataset |
| `--model-dir`  | `models/` | directory to output the trained model checkpoints |
| `--resume`     | None      | path to an existing checkpoint to resume training from |
| `--batch-size` | 4         | try increasing depending on available memory |
| `--epochs`     | 30        | up to 100 is desirable, but will increase training time |
| `--workers`    | 2         | number of data loader threads (0 = disable multithreading) |

After the training finishes, convert the model from PyTorch to ONNX:

```
$ python3 onnx_export.py --model-dir=models/fruit
```

This will save a model called `ssd-mobilenet.onnx` under `jetson-inference/python/training/detection/ssd/models/fruit/`.

#### Object Detection with Images

```
$ detectnet --model=models/fruit/ssd-mobilenet.onnx --labels=models/fruit/labels.txt \
            --input-blob=input_0 --output-cvg=scores --output-bbox=boxes \
            "/home/huang/jetson-inference/data/images/fruit_*.jpg" /home/huang/jetson-inference/data/images/test/fruit_%i.jpg
```

Processed images are saved under `/home/huang/jetson-inference/data/images/test/`.

![](https://i.imgur.com/yxTDIos.jpg)

#### Object Detection with Live Camera

```
$ detectnet --model=models/fruit/ssd-mobilenet.onnx --labels=models/fruit/labels.txt \
            --input-blob=input_0 --output-cvg=scores --output-bbox=boxes \
            /dev/video0
```

![](https://i.imgur.com/XVNQK7w.png)
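If you'd rather drive detection from your own Python script instead of the `detectnet` launcher, the `jetson.inference` Python bindings follow the same pattern as the commands above. Below is a minimal sketch modeled on the repo's `detectnet.py` example; forwarding `sys.argv` so that the `--model`/`--labels`/blob flags are picked up is an assumption about how the bindings parse their arguments, so check the jetson-inference Python docs if it doesn't load your custom ONNX model:

```python
# my_detection.py -- placeholder name; minimal live-camera detection sketch.
# Run it with the same flags as the detectnet command above, e.g.:
#   python3 my_detection.py --model=models/fruit/ssd-mobilenet.onnx \
#       --labels=models/fruit/labels.txt --input-blob=input_0 \
#       --output-cvg=scores --output-bbox=boxes /dev/video0
import sys
import jetson.inference
import jetson.utils

# Forward the command-line flags so the custom ONNX model and labels are loaded
net = jetson.inference.detectNet("ssd-mobilenet-v2", sys.argv, 0.5)

camera = jetson.utils.videoSource("/dev/video0", argv=sys.argv)
display = jetson.utils.videoOutput("display://0", argv=sys.argv)

while display.IsStreaming():
    img = camera.Capture()           # grab the next frame
    detections = net.Detect(img)     # run detection (draws the overlay on img)
    display.Render(img)              # show the result
    display.SetStatus("Object Detection | {:.0f} FPS".format(net.GetNetworkFPS()))
```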