# **2021/07/16**
[[7 Running the Live Camera Detection Demo]](https://github.com/dusty-nv/jetson-inference/blob/master/docs/detectnet-camera-2.md)
[[8 Coding Your Own Object Detection Program]](https://github.com/dusty-nv/jetson-inference/blob/master/docs/detectnet-example-2.md)
[[11 Transfer Learning with PyTorch]](https://github.com/dusty-nv/jetson-inference/blob/master/docs/pytorch-transfer-learning.md)
[[12 Re-training on the Cat/Dog Dataset]](https://github.com/dusty-nv/jetson-inference/blob/master/docs/pytorch-cat-dog.md)
[[13 Re-training on the PlantCLEF Dataset]](https://github.com/dusty-nv/jetson-inference/blob/master/docs/pytorch-plants.md)
###### tags: `藍柏婷`
###### tags: `2021/07/16`
### **==== Running the Live Camera Detection Demo ====**
$ ./detectnet.py /dev/video0        # V4L2 camera
(This script was written by the company; later I'll write my own.)
```python=
#!/usr/bin/python3
import jetson.inference
import jetson.utils

import argparse
import sys

# parse the command line
parser = argparse.ArgumentParser(description="Locate objects in a live camera stream using an object detection DNN.",
                                 formatter_class=argparse.RawTextHelpFormatter, epilog=jetson.inference.detectNet.Usage() +
                                 jetson.utils.videoSource.Usage() + jetson.utils.videoOutput.Usage() + jetson.utils.logUsage())

parser.add_argument("input_URI", type=str, default="", nargs='?', help="URI of the input stream")
parser.add_argument("output_URI", type=str, default="", nargs='?', help="URI of the output stream")
parser.add_argument("--network", type=str, default="ssd-mobilenet-v2", help="pre-trained model to load (see below for options)")
parser.add_argument("--overlay", type=str, default="box,labels,conf", help="detection overlay flags (e.g. --overlay=box,labels,conf)\nvalid combinations are: 'box', 'labels', 'conf', 'none'")
parser.add_argument("--threshold", type=float, default=0.5, help="minimum detection threshold to use")

is_headless = ["--headless"] if sys.argv[0].find('console.py') != -1 else [""]

try:
    opt = parser.parse_known_args()[0]
except:
    print("")
    parser.print_help()
    sys.exit(0)

# load the object detection network
net = jetson.inference.detectNet(opt.network, sys.argv, opt.threshold)

# create video sources & outputs
input = jetson.utils.videoSource(opt.input_URI, argv=sys.argv)
output = jetson.utils.videoOutput(opt.output_URI, argv=sys.argv+is_headless)

# process frames until the user exits
while True:
    # capture the next image
    img = input.Capture()

    # detect objects in the image (with overlay)
    detections = net.Detect(img, overlay=opt.overlay)

    # print the detections
    print("detected {:d} objects in image".format(len(detections)))

    for detection in detections:
        print(detection)

    # render the image
    output.Render(img)

    # update the title bar
    output.SetStatus("{:s} | Network {:.0f} FPS".format(opt.network, net.GetNetworkFPS()))

    # print out performance info
    net.PrintProfilerTimes()

    # exit on input/output EOS
    if not input.IsStreaming() or not output.IsStreaming():
        break
```
---
### **==== Coding Your Own Object Detection Program 2 ====**
(Make sure to understand this.)
First open the `gedit` text editor and create a new file, "my-detection_1.py".
**Because the original file has an error, it needs a few changes**
as follows:
```python=
import jetson.inference
import jetson.utils

net = jetson.inference.detectNet("ssd-mobilenet-v2", threshold=0.5)
#camera = jetson.utils.videoSource("csi://0")      # csi camera
camera = jetson.utils.videoSource("/dev/video0")   # usb camera
display = jetson.utils.videoOutput("display://0")  # 'my_video.mp4' for file

while True:
    img = camera.Capture()
    detections = net.Detect(img)
    display.Render(img)
    display.SetStatus("Object Detection | Network {:.0f} FPS".format(net.GetNetworkFPS()))
    if not camera.IsStreaming() or not display.IsStreaming():
        break
```
>```python=4
>net = jetson.inference.detectNet("ssd-mobilenet-v2", threshold=0.5)
>```
>-> 0.5 means a detection is only shown when its confidence is 50% or higher
>
>```python=5
>#camera = jetson.utils.videoSource("csi://0")      # csi camera
>camera = jetson.utils.videoSource("/dev/video0")   # usb camera
>```
>-> this one is our camera (a V4L2 camera)
>
>```python=10
>    img = camera.Capture()
>```
>-> grab one frame from the camera
>```python=11
>    detections = net.Detect(img)
>```
>-> detect the objects in the frame
>```python=12
>    display.Render(img)
>```
>-> render the image, with the detection info drawn on it, to the display
>```python=13
>    display.SetStatus("Object Detection | Network {:.0f} FPS".format(net.GetNetworkFPS()))
>```
>-> `SetStatus` updates the title-bar text
>```python=14
>    if not camera.IsStreaming() or not display.IsStreaming():
>        break
>```
>-> break out of the loop once the camera or display stops streaming
That's it!!!
To run it, first go to `Desktop/scifair/image_classifiction_demo/jetson-inference/build/aarch64/bin`; once inside `bin`, simply enter
$ python3 my-detection_1.py
If there are no errors, the camera view will appear. Finished!
(Now to write my own code!!!)
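If you want to record the result instead of showing it on screen, the same pipeline can render to a video file (the script's own comment mentions 'my_video.mp4' for this). A minimal sketch reusing the same API, with the filename being my own example:
```python
import jetson.inference
import jetson.utils

net = jetson.inference.detectNet("ssd-mobilenet-v2", threshold=0.5)
camera = jetson.utils.videoSource("/dev/video0")    # USB camera
output = jetson.utils.videoOutput("my_video.mp4")   # write frames to a file instead of the display

while camera.IsStreaming() and output.IsStreaming():
    img = camera.Capture()     # grab one frame
    net.Detect(img)            # draws the detection overlay onto img in-place
    output.Render(img)         # append the frame to the output video
```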
---
### **==== Transfer Learning with PyTorch ====**
#### **== Installing PyTorch ==**
$ cd jetson-inference/build
$ ./install-pytorch.sh
Next, a green dialog box will appear.
**Note: be sure to check the option (toggle the asterisk "*" on), then press OK.**
(Then I discovered it was already installed, heh!)
#### **== Verifying PyTorch ==**
(Just checking whether PyTorch was installed correctly.)
$ python3        (PyTorch version here: 1.6.0)
>>> import torch
>>> print(torch.__version__)
>>> print('CUDA available: ' + str(torch.cuda.is_available()))
>>> a = torch.cuda.FloatTensor(2).zero_()
>>> print('Tensor a = ' + str(a))
>>> b = torch.randn(2).cuda()
>>> print('Tensor b = ' + str(b))
>>> c = a + b
>>> print('Tensor c = ' + str(c))
#### **== Mounting Swap ==**
(adds extra memory for training)
sudo systemctl disable nvzramconfig
sudo fallocate -l 4G /mnt/4GB.swap
sudo mkswap /mnt/4GB.swap
sudo swapon /mnt/4GB.swap
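To keep the swap mounted after a reboot, the upstream tutorial also suggests adding it to `/etc/fstab` (quoting the line from memory, so double-check it against the official page):
/mnt/4GB.swap  none  swap  sw 0  0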
#### **== Disabling the Desktop GUI ==**
(not needed for now)
:::danger
**Do NOT use this!!!**
:::
#### **= Disable temporarily =**
$ sudo init 3 # stop the desktop
# log your user back into the console
# run the PyTorch training scripts
$ sudo init 5 # restart the desktop
#### **= Disable permanently =**
$ sudo systemctl set-default multi-user.target # disable desktop on boot
$ sudo systemctl set-default graphical.target # enable desktop on boot
---
### **==== Re-training on the Cat/Dog Dataset ====**
### **==== Re-training on the PlantCLEF Dataset ====**
(make sure to understand this)
#### **== Downloading the Data ==**
$ cd jetson-inference/python/training/classification/data
$ wget https://nvidia.box.com/shared/static/o577zd8yp3lmxf5zhm38svrbrv45am3y.gz -O cat_dog.tar.gz
$ tar xvzf cat_dog.tar.gz
>`$ tar xvzf cat_dog.tar.gz` -> extracts (`tar xvzf`) `cat_dog.tar.gz` into the current directory
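After extraction there should be a `cat_dog/` folder. From memory of the tutorial page (verify against your local copy), its layout is roughly:
```
cat_dog/
├── labels.txt
├── train/
│   ├── cat/
│   └── dog/
├── val/
│   ├── cat/
│   └── dog/
└── test/
    ├── cat/
    └── dog/
```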
#### **== Re-training the ResNet-18 Model ==**
$ cd jetson-inference/python/training/classification
$ python3 train.py --model-dir=models/cat_dog data/cat_dog
>`train.py` -> the training script
`--model-dir=models/cat_dog` -> the directory where the re-trained model gets saved
`data/cat_dog` -> the dataset to train on

You should see text like the following in the console:
```javascript
Use GPU: 0 for training
=> dataset classes: 2 ['cat', 'dog']
=> using pre-trained model 'resnet18'
=> reshaped ResNet fully-connected layer with: Linear(in_features=512, out_features=2, bias=True)
Epoch: [0][ 0/625] Time 0.932 ( 0.932) Data 0.148 ( 0.148) Loss 6.8126e-01 (6.8126e-01) Acc@1 50.00 ( 50.00) Acc@5 100.00 (100.00)
Epoch: [0][ 10/625] Time 0.085 ( 0.163) Data 0.000 ( 0.019) Loss 2.3263e+01 (2.1190e+01) Acc@1 25.00 ( 55.68) Acc@5 100.00 (100.00)
Epoch: [0][ 20/625] Time 0.079 ( 0.126) Data 0.000 ( 0.013) Loss 1.5674e+00 (1.8448e+01) Acc@1 62.50 ( 52.38) Acc@5 100.00 (100.00)
Epoch: [0][ 30/625] Time 0.127 ( 0.114) Data 0.000 ( 0.011) Loss 1.7583e+00 (1.5975e+01) Acc@1 25.00 ( 52.02) Acc@5 100.00 (100.00)
Epoch: [0][ 40/625] Time 0.118 ( 0.116) Data 0.000 ( 0.010) Loss 5.4494e+00 (1.2934e+01) Acc@1 50.00 ( 50.30) Acc@5 100.00 (100.00)
Epoch: [0][ 50/625] Time 0.080 ( 0.111) Data 0.000 ( 0.010) Loss 1.8903e+01 (1.1359e+01) Acc@1 50.00 ( 48.77) Acc@5 100.00 (100.00)
Epoch: [0][ 60/625] Time 0.082 ( 0.106) Data 0.000 ( 0.009) Loss 1.0540e+01 (1.0473e+01) Acc@1 25.00 ( 49.39) Acc@5 100.00 (100.00)
Epoch: [0][ 70/625] Time 0.080 ( 0.102) Data 0.000 ( 0.009) Loss 5.1142e-01 (1.0354e+01) Acc@1 75.00 ( 49.65) Acc@5 100.00 (100.00)
Epoch: [0][ 80/625] Time 0.076 ( 0.100) Data 0.000 ( 0.009) Loss 6.7064e-01 (9.2385e+00) Acc@1 50.00 ( 49.38) Acc@5 100.00 (100.00)
Epoch: [0][ 90/625] Time 0.083 ( 0.098) Data 0.000 ( 0.008) Loss 7.3421e+00 (8.4755e+00) Acc@1 37.50 ( 50.00) Acc@5 100.00 (100.00)
Epoch: [0][100/625] Time 0.093 ( 0.097) Data 0.000 ( 0.008) Loss 7.4379e-01 (7.8715e+00) Acc@1 50.00 ( 50.12) Acc@5 100.00 (100.00)
```
#### **== Training Metrics ==**
Referring to the example console output above:
(Ctrl+C stops training; the `--resume` and `--epoch-start` flags restart training where it left off; run `python3 train.py --help` for more information.)
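For example, resuming from epoch 10 might look like this (the checkpoint filename is my assumption; check models/cat_dog for the file train.py actually saved):
$ python3 train.py --model-dir=models/cat_dog --resume=models/cat_dog/checkpoint.pth.tar --epoch-start=10 data/cat_dog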
* Epoch: one epoch is one complete training pass over the dataset (the more epochs, the higher the accuracy tends to be)
    * Epoch: [N] means you are currently on epoch 0, 1, 2, etc.
    * The default is to run for 35 epochs (you can change this with the --epochs=N flag)
* [N/625] is the current image batch within the epoch
    * Training images are processed in mini-batches for performance (here, 625 batches per epoch)
    * The default batch size is 8 images, which can be set with the --batch=N flag
    * Multiply the number in brackets by the batch size (e.g. batch [100/625] -> image [800/5000])
* Time: processing time of the current image batch (in seconds)
* Data: disk-loading time of the current image batch (in seconds)
* Loss: the accumulated error the model makes (expected vs. predicted) -> the more wrong the prediction, the larger the error
* Acc@1: Top-1 classification accuracy for the batch -> the fraction of images classified correctly
    * Top-1 means the model predicted exactly the correct class
* Acc@5: Top-5 classification accuracy for the batch
    * Top-5 means the correct class was among the model's top 5 predictions
    * Since this Cat/Dog example has only 2 classes (cat and dog), Top-5 is always 100%
    * Other datasets in the tutorial have more than 5 classes, where Top-5 becomes meaningful
ex.
Suppose an image's true class is `c`, and the model ranks the classes by predicted probability as: `a, c, d, b, e, ...`
-> Acc@1 counts this image as wrong (the single top prediction is `a`, not `c`)
-> Acc@5 counts it as correct (`c` appears among the top 5 predictions)
Over a batch, Acc@K is simply the fraction of images whose true class appears in the model's K highest-scoring predictions.
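As a sanity check on the definition, here is a small PyTorch sketch of Top-K accuracy (my own illustration; this is not the code train.py uses):
```python
import torch

def topk_accuracy(logits, target, k=5):
    # logits: (N, num_classes) scores; target: (N,) true class indices
    _, pred = logits.topk(k, dim=1)          # indices of the k highest-scoring classes
    hits = pred.eq(target.unsqueeze(1))      # compare each of the k guesses to the label
    return hits.any(dim=1).float().mean().item() * 100.0

logits = torch.tensor([[0.1, 2.0, 0.3],     # top prediction: class 1 (correct)
                       [1.5, 0.2, 0.9]])    # top prediction: class 0 (label is 2, so wrong)
labels = torch.tensor([1, 2])
print(topk_accuracy(logits, labels, k=1))   # 50.0
print(topk_accuracy(logits, labels, k=2))   # 100.0 -> class 2 has the 2nd-highest score in row 2
```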
#### **== Converting the Model to ONNX ==**
(Convert the PyTorch model to ONNX format so that TensorRT can load it. ONNX is an open model format supported by many popular ML frameworks, including PyTorch, TensorFlow, and TensorRT, so it simplifies transferring models between tools.)
$ python3 onnx_export.py --model-dir=models/cat_dog
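Before handing the file to TensorRT, it can be sanity-checked with the `onnx` Python package, assuming it is installed (`pip3 install onnx`); this is an optional check of my own, not a step from the tutorial:
```python
import onnx

# load the exported model and verify that the graph is well-formed
model = onnx.load("models/cat_dog/resnet18.onnx")
onnx.checker.check_model(model)
print("inputs: ", [i.name for i in model.graph.input])    # should include input_0
print("outputs:", [o.name for o in model.graph.output])   # should include output_0
```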
#### **== Processing Images with TensorRT ==**
#### **= Classifying a static image =**
$ cd jetson-inference/python/training/classification/
$ NET=models/cat_dog
$ DATASET=data/cat_dog
$ imagenet.py --model=$NET/resnet18.onnx --input_blob=input_0 --output_blob=output_0 --labels=$DATASET/labels.txt $DATASET/test/cat/01.jpg cat.jpg
>`imagenet.py` -> the program to run
`--model=$NET/resnet18.onnx` -> the re-trained model to load ($NET is already models/cat_dog, so don't repeat cat_dog in the path)
`--input_blob=input_0`
`--output_blob=output_0`
`--labels=$DATASET/labels.txt` -> the class names drawn on the image come from labels.txt
`$DATASET/test/cat/01.jpg` -> the input image
`cat.jpg` -> the output image
#### **= 處理所有測試圖像 Processing all the Test Images =**
$ mkdir $DATASET/test_output_cat $DATASET/test_output_dog
$ imagenet --model=$NET/resnet18.onnx --input_blob=input_0 --output_blob=output_0 --labels=$DATASET/labels.txt \
$DATASET/test/cat $DATASET/test_output_cat
$ imagenet --model=$NET/resnet18.onnx --input_blob=input_0 --output_blob=output_0 --labels=$DATASET/labels.txt \
$DATASET/test/dog $DATASET/test_output_dog
#### **== Running the Live Camera Program ==**
$ imagenet.py --model=$NET/resnet18.onnx --input_blob=input_0 --output_blob=output_0 --labels=$DATASET/labels.txt csi://0
>`imagenet.py` -> the program to run
`--model=$NET/resnet18.onnx` -> the re-trained model to load
`--input_blob=input_0`
`--output_blob=output_0`
`--labels=$DATASET/labels.txt` -> the class names drawn on the image come from labels.txt
`csi://0` -> the input (for our USB camera, use `/dev/video0` instead)
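The same live classification can also be written in Python, mirroring the my-detection script above but with `jetson.inference.imageNet` (a sketch based on my reading of the jetson-inference Python API; the paths assume the $NET/$DATASET values set earlier):
```python
import jetson.inference
import jetson.utils

# extra argv entries point imageNet at the re-trained ONNX model
argv = ["--model=models/cat_dog/resnet18.onnx",
        "--input_blob=input_0", "--output_blob=output_0",
        "--labels=data/cat_dog/labels.txt"]

net = jetson.inference.imageNet("googlenet", argv)   # --model in argv overrides the network name
camera = jetson.utils.videoSource("/dev/video0")     # our USB camera
display = jetson.utils.videoOutput("display://0")

while camera.IsStreaming() and display.IsStreaming():
    img = camera.Capture()
    class_idx, confidence = net.Classify(img)        # (class index, confidence)
    display.Render(img)
    display.SetStatus("{:s} ({:.1f}%)".format(net.GetClassDesc(class_idx), confidence * 100))
```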
#### **== Generating More Data (Optional) ==**
(Might be needed in the future, so keeping this here.)
The images from the Cat/Dog dataset were randomly pulled from a larger 22.5GB subset of ILSVRC12 by using the cat-dog-dataset.sh script. This first Cat/Dog dataset is intentionally kept smaller to keep the training time down, but by using this script you can re-generate it with additional images to create a more robust model.
Larger datasets take more time to train, so you can proceed to the next example in the meantime, but if you want to expand the Cat/Dog dataset, first download the source data from here:
* https://drive.google.com/open?id=1LsxHT9HX5gM2wMVqPUfILgrqVlGtqX1o
After extracting this archive, edit tools/cat-dog-dataset.sh with the following modifications:
* Substitute the location of the extracted ilsvrc12_subset in the IMAGENET_DIR variable
* Then create an empty folder somewhere for cat_dog, and substitute that location in OUTPUT_DIR
* Change the size of the dataset by modifying the NUM_TRAIN, NUM_VAL, and NUM_TEST variables (sketched below)
The script creates subdirectories for train, val, and test underneath the OUTPUT_DIR, and will then fill those directories with the specified number of images for each. Then you can train the model the same way as above, optionally using the --resume and --epoch-start flags to pick up training where you left off (if you don't want to restart training from the beginning). Remember to re-export the model to ONNX after re-training.
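For instance, the edited variables in tools/cat-dog-dataset.sh might look like this (the paths are placeholders of my own; only the variable names come from the list above):
IMAGENET_DIR=/path/to/extracted/ilsvrc12_subset
OUTPUT_DIR=/path/to/new/empty/cat_dog
NUM_TRAIN=...   # choose the split sizes you want
NUM_VAL=...
NUM_TEST=...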