# **2021/07/16** [[7 Running the Live Camera Detection Demo]](https://github.com/dusty-nv/jetson-inference/blob/master/docs/detectnet-camera-2.md) [[8 Coding Your Own Object Detection Program]](https://github.com/dusty-nv/jetson-inference/blob/master/docs/detectnet-example-2.md) [[11 Transfer Learning with PyTorch]](https://github.com/dusty-nv/jetson-inference/blob/master/docs/pytorch-transfer-learning.md) [[12 Re-training on the Cat/Dog Dataset]](https://github.com/dusty-nv/jetson-inference/blob/master/docs/pytorch-cat-dog.md) [[13 Re-training on the PlantCLEF Dataset]](https://github.com/dusty-nv/jetson-inference/blob/master/docs/pytorch-plants.md)

###### tags: `藍柏婷`
###### tags: `2021/07/16`

### **==== Running the Live Camera Detection Demo ====**

    $ ./detectnet.py /dev/video0      # V4L2 camera (this script is the company's; later we'll write our own)

```python=
#!/usr/bin/python3

import jetson.inference
import jetson.utils

import argparse
import sys

# parse the command line
parser = argparse.ArgumentParser(description="Locate objects in a live camera stream using an object detection DNN.",
                                 formatter_class=argparse.RawTextHelpFormatter, epilog=jetson.inference.detectNet.Usage() +
                                 jetson.utils.videoSource.Usage() + jetson.utils.videoOutput.Usage() + jetson.utils.logUsage())

parser.add_argument("input_URI", type=str, default="", nargs='?', help="URI of the input stream")
parser.add_argument("output_URI", type=str, default="", nargs='?', help="URI of the output stream")
parser.add_argument("--network", type=str, default="ssd-mobilenet-v2", help="pre-trained model to load (see below for options)")
parser.add_argument("--overlay", type=str, default="box,labels,conf", help="detection overlay flags (e.g. --overlay=box,labels,conf)\nvalid combinations are: 'box', 'labels', 'conf', 'none'")
parser.add_argument("--threshold", type=float, default=0.5, help="minimum detection threshold to use")

is_headless = ["--headless"] if sys.argv[0].find('console.py') != -1 else [""]

try:
    opt = parser.parse_known_args()[0]
except:
    print("")
    parser.print_help()
    sys.exit(0)

# load the object detection network
net = jetson.inference.detectNet(opt.network, sys.argv, opt.threshold)

# create video sources & outputs
input = jetson.utils.videoSource(opt.input_URI, argv=sys.argv)
output = jetson.utils.videoOutput(opt.output_URI, argv=sys.argv+is_headless)

# process frames until the user exits
while True:
    # capture the next image
    img = input.Capture()

    # detect objects in the image (with overlay)
    detections = net.Detect(img, overlay=opt.overlay)

    # print the detections
    print("detected {:d} objects in image".format(len(detections)))

    for detection in detections:
        print(detection)

    # render the image
    output.Render(img)

    # update the title bar
    output.SetStatus("{:s} | Network {:.0f} FPS".format(opt.network, net.GetNetworkFPS()))

    # print out performance info
    net.PrintProfilerTimes()

    # exit on input/output EOS
    if not input.IsStreaming() or not output.IsStreaming():
        break
```

---

### **==== Coding Your Own Object Detection Program ====**
(understood)

First open the `gedit` text editor and create a new file, "my-detection_1.py".
**Because the original file has an error, it needs a small fix**, as follows:

```python=
import jetson.inference
import jetson.utils

net = jetson.inference.detectNet("ssd-mobilenet-v2", threshold=0.5)
#camera = jetson.utils.videoSource("csi://0")      # CSI camera
camera = jetson.utils.videoSource("/dev/video0")   # USB camera
display = jetson.utils.videoOutput("display://0")  # 'my_video.mp4' for file

while True:
    img = camera.Capture()
    detections = net.Detect(img)
    display.Render(img)
    display.SetStatus("Object Detection | Network {:.0f} FPS".format(net.GetNetworkFPS()))
    if not camera.IsStreaming() or not display.IsStreaming():
        break
```

>```python=4
>net = jetson.inference.detectNet("ssd-mobilenet-v2", threshold=0.5)
>```
>-> 0.5 means only detections with 50% confidence or more are shown
>
>```python=5
>#camera = jetson.utils.videoSource("csi://0")      # CSI camera
>camera = jetson.utils.videoSource("/dev/video0")   # USB camera
>```
>-> this one is our camera (a V4L2 camera)
>
>```python=10
>    img = camera.Capture()
>```
>-> capture one frame from the camera
>```python=11
>    detections = net.Detect(img)
>```
>-> detect the objects in the frame
>```python=12
>    display.Render(img)
>```
>-> render the image, with the detection info drawn on it, to the display
>```python=13
>    display.SetStatus("Object Detection | Network {:.0f} FPS".format(net.GetNetworkFPS()))
>```
>-> `SetStatus` sets the output window's title-bar text
>```python=14
>    if not camera.IsStreaming() or not display.IsStreaming():
>        break
>```
>-> break out of the loop when the camera (or display) stops streaming

That's it!!!

To run it, first go to `Desktop/scifair/image_classifiction_demo/jetson-inference/build/aarch64/bin`; once inside `bin`, just type

    $ python3 my-detection_1.py

If there are no errors, the camera view will appear. Finished!
(Time to write our own code!!!)
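Each element of the `detections` list returned by `net.Detect()` is a detection object whose fields (per the jetson-inference Python API) include `ClassID`, `Confidence`, and the bounding-box coordinates, and `net.GetClassDesc()` maps a class ID back to its label. A minimal sketch of pulling those out for one frame; the print format here is my own, not part of the tutorial:

```python
import jetson.inference
import jetson.utils

net = jetson.inference.detectNet("ssd-mobilenet-v2", threshold=0.5)
camera = jetson.utils.videoSource("/dev/video0")   # USB camera

# grab a single frame and run detection on it
img = camera.Capture()
detections = net.Detect(img)

for d in detections:
    # class label, confidence score, and bounding-box corners of one detection
    label = net.GetClassDesc(d.ClassID)
    print("{:s} ({:.0f}%) at ({:.0f},{:.0f})-({:.0f},{:.0f})".format(
        label, d.Confidence * 100, d.Left, d.Top, d.Right, d.Bottom))
```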
---

### **==== Transfer Learning with PyTorch ====**

#### **== Installing PyTorch ==**

    $ cd jetson-inference/build
    $ ./install-pytorch.sh

Next a green dialog box will appear.
**Note: be sure to check the item (toggle the asterisk "*" on), then press OK.**
(Then I discovered it had already been installed earlier, heh!)

#### **== Verifying PyTorch ==**

(Just checking whether PyTorch was actually installed properly.)

    $ python3
    >>> import torch
    >>> print(torch.__version__)      # -> (1.6.0)
    >>> print('CUDA available: ' + str(torch.cuda.is_available()))
    >>> a = torch.cuda.FloatTensor(2).zero_()
    >>> print('Tensor a = ' + str(a))
    >>> b = torch.randn(2).cuda()
    >>> print('Tensor b = ' + str(b))
    >>> c = a + b
    >>> print('Tensor c = ' + str(c))

#### **== Mounting Swap ==**

(Adds more memory for training.)

    $ sudo systemctl disable nvzramconfig
    $ sudo fallocate -l 4G /mnt/4GB.swap
    $ sudo mkswap /mnt/4GB.swap
    $ sudo swapon /mnt/4GB.swap

#### **== Disabling the Desktop GUI ==**

(Not needed for now.)

:::danger
**Never use this!!!**
:::

#### **= Disable temporarily =**

    $ sudo init 3     # stop the desktop
    # log your user back into the console
    # run the PyTorch training scripts
    $ sudo init 5     # restart the desktop

#### **= Disable permanently =**

    $ sudo systemctl set-default multi-user.target     # disable desktop on boot
    $ sudo systemctl set-default graphical.target      # enable desktop on boot

---

### **==== Re-training on the Cat/Dog Dataset ====**
### **==== Re-training on the PlantCLEF Dataset ====**
(understood)

#### **== Downloading the Data ==**

    $ cd jetson-inference/python/training/classification/data
    $ wget https://nvidia.box.com/shared/static/o577zd8yp3lmxf5zhm38svrbrv45am3y.gz -O cat_dog.tar.gz
    $ tar xvzf cat_dog.tar.gz

>`$ tar xvzf cat_dog.tar.gz`
-> extract (`tar xvzf`) `cat_dog.tar.gz` into the current directory

#### **== Re-training the ResNet-18 Model ==**

    $ cd jetson-inference/python/training/classification
    $ python3 train.py --model-dir=models/cat_dog data/cat_dog

>`train.py` -> the training script
`--model-dir=models/cat_dog` -> directory where the trained model is saved
`data/cat_dog` -> the images to train on

You will see text like the following in the console:

```
Use GPU: 0 for training
=> dataset classes:  2 ['cat', 'dog']
=> using pre-trained model 'resnet18'
=> reshaped ResNet fully-connected layer with: Linear(in_features=512, out_features=2, bias=True)
Epoch: [0][  0/625]  Time  0.932 ( 0.932)  Data  0.148 ( 0.148)  Loss 6.8126e-01 (6.8126e-01)  Acc@1  50.00 ( 50.00)  Acc@5 100.00 (100.00)
Epoch: [0][ 10/625]  Time  0.085 ( 0.163)  Data  0.000 ( 0.019)  Loss 2.3263e+01 (2.1190e+01)  Acc@1  25.00 ( 55.68)  Acc@5 100.00 (100.00)
Epoch: [0][ 20/625]  Time  0.079 ( 0.126)  Data  0.000 ( 0.013)  Loss 1.5674e+00 (1.8448e+01)  Acc@1  62.50 ( 52.38)  Acc@5 100.00 (100.00)
Epoch: [0][ 30/625]  Time  0.127 ( 0.114)  Data  0.000 ( 0.011)  Loss 1.7583e+00 (1.5975e+01)  Acc@1  25.00 ( 52.02)  Acc@5 100.00 (100.00)
Epoch: [0][ 40/625]  Time  0.118 ( 0.116)  Data  0.000 ( 0.010)  Loss 5.4494e+00 (1.2934e+01)  Acc@1  50.00 ( 50.30)  Acc@5 100.00 (100.00)
Epoch: [0][ 50/625]  Time  0.080 ( 0.111)  Data  0.000 ( 0.010)  Loss 1.8903e+01 (1.1359e+01)  Acc@1  50.00 ( 48.77)  Acc@5 100.00 (100.00)
Epoch: [0][ 60/625]  Time  0.082 ( 0.106)  Data  0.000 ( 0.009)  Loss 1.0540e+01 (1.0473e+01)  Acc@1  25.00 ( 49.39)  Acc@5 100.00 (100.00)
Epoch: [0][ 70/625]  Time  0.080 ( 0.102)  Data  0.000 ( 0.009)  Loss 5.1142e-01 (1.0354e+01)  Acc@1  75.00 ( 49.65)  Acc@5 100.00 (100.00)
Epoch: [0][ 80/625]  Time  0.076 ( 0.100)  Data  0.000 ( 0.009)  Loss 6.7064e-01 (9.2385e+00)  Acc@1  50.00 ( 49.38)  Acc@5 100.00 (100.00)
Epoch: [0][ 90/625]  Time  0.083 ( 0.098)  Data  0.000 ( 0.008)  Loss 7.3421e+00 (8.4755e+00)  Acc@1  37.50 ( 50.00)  Acc@5 100.00 (100.00)
Epoch: [0][100/625]  Time  0.093 ( 0.097)  Data  0.000 ( 0.008)  Loss 7.4379e-01 (7.8715e+00)  Acc@1  50.00 ( 50.12)  Acc@5 100.00 (100.00)
```
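The `=> reshaped ResNet fully-connected layer` line in the log is transfer learning in a nutshell: the training script keeps the pre-trained ResNet-18 backbone and swaps its 1000-class ImageNet output layer for a new 2-class one. A minimal sketch of the same idea in plain torchvision (this is not the tutorial's `train.py` itself, which wraps this step in more options; `pretrained=True` matches the torchvision API of this PyTorch 1.6 era):

```python
import torch.nn as nn
from torchvision import models

num_classes = 2  # ['cat', 'dog']

# load ImageNet pre-trained ResNet-18, then replace its final
# fully-connected layer so it outputs 2 classes instead of 1000
model = models.resnet18(pretrained=True)
model.fc = nn.Linear(model.fc.in_features, num_classes)

print(model.fc)   # Linear(in_features=512, out_features=2, bias=True)
```

Only the new layer starts from random weights; everything before it starts from the ImageNet features, which is why training converges with a relatively small dataset.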
#### **== Training Metrics ==**

The statistics printed during training (see the console output above) correspond to the following.

(Ctrl+C: stop training; `--resume` and `--epoch-start`: pick training back up where you left off; `python3 train.py --help`: get more information)

* Epoch: an epoch is one complete pass of training over the dataset (the more epochs you train, the higher the accuracy tends to be)
    * Epoch: [N] means you are currently on epoch 0, 1, 2, etc.
    * The default is to run for 35 epochs (you can change this with the `--epochs=N` flag)
* [N/625] is the current image batch within the epoch
    * Training images are processed in mini-batches to improve performance (here the epoch is split into 625 batches)
    * The default batch size is 8 images, which can be set with the `--batch=N` flag
    * Multiply the number in brackets by the batch size (e.g. batch [100/625] -> image [800/5000])
* Time: processing time of the current image batch (in seconds)
* Data: disk-loading time of the current image batch (in seconds)
* Loss: the accumulated error the model produced (expected vs. predicted) -> the worse the predictions, the larger the loss
* Acc@1: Top-1 classification accuracy over the batch -> the fraction of images judged correctly
    * Top-1 means the model predicted exactly the correct class
* Acc@5: Top-5 classification accuracy over the batch
    * Top-5 means the correct class was among the model's top 5 predicted outputs
    * Since this Cat/Dog example only has 2 classes (cat and dog), Top-5 is always 100%
    * Other datasets in the tutorial have more than 5 classes, where Top-5 is meaningful

ex. (Top-2, judged per image) Suppose an image's correct class is `e`. If the model's two highest-scoring outputs for that image are `a, d`, the image counts as wrong for Acc@2; if they are `a, e`, it counts as correct. Acc@2 for a batch is the percentage of its images that count as correct this way, and Acc@1/Acc@5 work the same with k = 1 and k = 5.
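In PyTorch, Top-k accuracy is usually computed with `torch.topk`; this is the idea behind the Acc@1/Acc@5 numbers in the log, though it is a self-contained sketch rather than the tutorial's exact helper function, and the logits below are made-up example values:

```python
import torch

def topk_accuracy(output, target, k=1):
    """Percentage of samples whose correct class is among the k highest-scoring outputs."""
    # indices of the k largest logits per sample, shape (batch, k)
    _, pred = output.topk(k, dim=1)
    # a sample is correct if any of its top-k predictions equals the target
    correct = pred.eq(target.unsqueeze(1)).any(dim=1)
    return correct.float().mean().item() * 100.0

# made-up logits for a batch of 3 samples over 4 classes
output = torch.tensor([[0.1, 2.0, 0.3, 0.5],    # top-1 = class 1
                       [1.5, 0.2, 1.4, 0.1],    # top-1 = class 0, top-2 = {0, 2}
                       [0.0, 0.1, 0.2, 3.0]])   # top-1 = class 3
target = torch.tensor([1, 2, 3])

print(topk_accuracy(output, target, k=1))   # ~66.67 (samples 0 and 2 correct)
print(topk_accuracy(output, target, k=2))   # 100.0 (sample 1's class 2 is in its top-2)
```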
#### **== Converting the Model to ONNX ==**

(Convert the PyTorch model to ONNX format so that TensorRT can load it. ONNX is an open model format supported by many popular ML frameworks, including PyTorch, TensorFlow, TensorRT, and others, so it simplifies transferring models between tools.)

    $ python3 onnx_export.py --model-dir=models/cat_dog
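Under the hood, an export like this loads the re-trained weights and calls `torch.onnx.export()`. A minimal sketch of that kind of export; the checkpoint file name, its `'state_dict'` layout, and the 224×224 input size are assumptions for illustration, not read from the tutorial's `onnx_export.py`:

```python
import torch
from torchvision import models

# rebuild the 2-class ResNet-18 and load the re-trained weights
# (checkpoint path and format are hypothetical)
model = models.resnet18()
model.fc = torch.nn.Linear(model.fc.in_features, 2)
checkpoint = torch.load('models/cat_dog/model_best.pth')
model.load_state_dict(checkpoint['state_dict'])
model.eval()

# a dummy input fixes the input shape: batch of 1 RGB 224x224 image
dummy = torch.ones(1, 3, 224, 224)

# the blob names match the --input_blob/--output_blob flags used below
torch.onnx.export(model, dummy, 'models/cat_dog/resnet18.onnx',
                  input_names=['input_0'], output_names=['output_0'],
                  verbose=True)
```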
#### **== Processing Images with TensorRT ==**

#### **= Classifying a static image =**

    $ cd jetson-inference/python/training/classification/
    $ NET=models/cat_dog
    $ DATASET=data/cat_dog
    $ imagenet.py --model=$NET/resnet18.onnx --input_blob=input_0 --output_blob=output_0 --labels=$DATASET/labels.txt $DATASET/test/cat/01.jpg cat.jpg

>`imagenet.py` -> the script
`--model=$NET/resnet18.onnx` -> the re-trained model
`--input_blob=input_0` `--output_blob=output_0`
`--labels=$DATASET/labels.txt` -> the class names drawn on the image come from labels.txt
`$DATASET/test/cat/01.jpg` -> input
`cat.jpg` -> output result

#### **= Processing all the Test Images =**

    $ mkdir $DATASET/test_output_cat $DATASET/test_output_dog
    $ imagenet --model=$NET/resnet18.onnx --input_blob=input_0 --output_blob=output_0 --labels=$DATASET/labels.txt \
               $DATASET/test/cat $DATASET/test_output_cat
    $ imagenet --model=$NET/resnet18.onnx --input_blob=input_0 --output_blob=output_0 --labels=$DATASET/labels.txt \
               $DATASET/test/dog $DATASET/test_output_dog

#### **== Running the Live Camera Program ==**

    $ imagenet.py --model=$NET/resnet18.onnx --input_blob=input_0 --output_blob=output_0 --labels=$DATASET/labels.txt csi://0

>`imagenet.py` -> the script
`--model=$NET/resnet18.onnx` -> the re-trained model
`--input_blob=input_0` `--output_blob=output_0`
`--labels=$DATASET/labels.txt` -> the class names drawn on the image come from labels.txt
`csi://0` -> input (use `/dev/video0` for our USB camera)

#### **== Generating More Data (Optional) ==**

(May be useful in the future, so keeping it here.)

The images from the Cat/Dog dataset were randomly pulled from a larger 22.5GB subset of ILSVRC12 using the cat-dog-dataset.sh script. This first Cat/Dog dataset is intentionally kept small to keep the training time down, but by using this script you can re-generate it with additional images to create a more robust model.

Larger datasets take more time to train, so you can proceed to the next example in the meantime, but if you want to expand the Cat/Dog dataset, first download the source data from here:

* https://drive.google.com/open?id=1LsxHT9HX5gM2wMVqPUfILgrqVlGtqX1o

After extracting this archive, edit tools/cat-dog-dataset.sh with the following modifications:

* Substitute the location of the extracted ilsvrc12_subset in the IMAGENET_DIR variable
* Then create an empty folder somewhere for cat_dog, and substitute that location in OUTPUT_DIR
* Change the size of the dataset by modifying the NUM_TRAIN, NUM_VAL, and NUM_TEST variables

The script creates subdirectories for train, val, and test underneath the OUTPUT_DIR, and will then fill those directories with the specified number of images for each. Then you can train the model the same way as above, optionally using the --resume and --epoch-start flags to pick up training where you left off (if you don't want to restart training from the beginning). Remember to re-export the model to ONNX after re-training.
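For reference, the effect of cat-dog-dataset.sh can be approximated in Python: randomly sample images from the extracted subset into train/val/test folders. This is a rough sketch under assumed paths, counts, and a one-folder-per-class source layout; the real script is shell and also merges several ILSVRC12 categories into the 'cat' and 'dog' classes:

```python
import random
import shutil
from pathlib import Path

IMAGENET_DIR = Path('ilsvrc12_subset')   # extracted source images (assumed: one folder per class)
OUTPUT_DIR = Path('cat_dog_big')         # empty output folder
NUM_TRAIN, NUM_VAL, NUM_TEST = 5000, 1000, 200   # images per class per split (example sizes)

for class_name in ['cat', 'dog']:
    # gather and shuffle all source images for this class
    images = sorted((IMAGENET_DIR / class_name).glob('*.jpg'))
    random.shuffle(images)

    # carve the shuffled list into non-overlapping train/val/test slices
    start = 0
    for split_name, count in [('train', NUM_TRAIN), ('val', NUM_VAL), ('test', NUM_TEST)]:
        dest = OUTPUT_DIR / split_name / class_name
        dest.mkdir(parents=True, exist_ok=True)
        for img in images[start:start + count]:
            shutil.copy(img, dest / img.name)
        start += count
```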