# YOLOv9 Environment Setup and Hands-On Walkthrough
###### tags: `yolo` `九月份`

[Problem encountered #1](https://chatgpt.com/share/68c27847-184c-8006-914b-8a6913babd04)
[Problem encountered #2](https://chatgpt.com/share/68c27847-184c-8006-914b-8a6913babd04)
> Problems encountered along the way were worked through with ChatGPT's help.

![image](https://hackmd.io/_uploads/By1Sn-Woxl.png)
> A hands-on walkthrough with one dataset.

## Summary
This note documents the complete workflow, on Windows, of building a YOLOv9 deep-learning environment with Anaconda and training it on a public Taiwanese license-plate dataset. It covers environment setup, data preparation, model training, detection and crop output, plus the common problems hit during training and their fixes (label-format conversion, CUDA out-of-memory, and errors caused by PyTorch version differences). The end result is a YOLOv9 model that detects license plates accurately and can be paired with an OCR system for the subsequent recognition step.

## Create a new Anaconda environment
```cmd=
conda create -n yolov9 python=3.9 -y
conda activate yolov9
```

## Install [PyTorch](https://pytorch.org/get-started/locally/) (CUDA build)
![image](https://hackmd.io/_uploads/r1SF2cRqxx.png)
```powershell=
PS C:\Users\user\Documents\Carplatev9> python
```
If the install command below fails, switch to the CUDA 11.8 build instead (`pytorch-cuda=11.8` when installing with conda, or the `cu118` wheel index with pip).
```cmd=
pip3 install torch torchvision --index-url https://download.pytorch.org/whl/cu126
```

### Verify CUDA
```cmd=
python
>>> import torch
>>> print("cuda?", torch.cuda.is_available(), "cuda_ver:", torch.version.cuda, "torch:", torch.__version__)
cuda? True cuda_ver: 12.6 torch: 2.8.0+cu126
>>>
```

## Get the YOLOv9 project and install its requirements
If Git is not installed:
```cmd=
conda install -c anaconda git -y
git --version
```

### Download from GitHub
- [WongKinYiu yolov9](https://github.com/WongKinYiu/yolov9)

Clone into `Documents` so the repository ends up at `C:\Users\user\Documents\yolov9`, the path used in the rest of this note:
```cmd=
cd Documents
git clone https://github.com/WongKinYiu/yolov9.git
cd yolov9
pip install -r requirements.txt
```

### Scroll down to Performance and download the pretrained weights of any model (YOLOv9-M was used at first, later switched to YOLOv9-S)
![image](https://hackmd.io/_uploads/BJVisLpcgg.png)

## Prepare the dataset YAML
This walkthrough uses an open-source online dataset of Taiwanese car and motorcycle license plates.
Source: [Taiwan License Plate Recognition Research Tlprr Dataset](https://universe.roboflow.com/jackresearch0/taiwan-license-plate-recognition-research-tlprr/dataset/7)
```yaml=
# C:\Users\user\Documents\yolov9\Carplatev9\data.yaml
train: train/images
val: valid/images
test: test/images

nc: 1
names: [plate]

roboflow:
  workspace: jackresearch0
  project: taiwan-license-plate-recognition-research-tlprr
  version: 7
  license: CC BY 4.0
  url: https://universe.roboflow.com/jackresearch0/taiwan-license-plate-recognition-research-tlprr/dataset/7
```

## Training
```cmd=
python train.py --device 0 --batch-size 16 --data "C:\Users\user\Documents\yolov9\Carplatev9\data.yaml" --imgsz 640 --weights yolov9m.pt --cfg models/detect/yolov9-m.yaml --hyp data/hyps/hyp.scratch-high.yaml --name yolov9-plates
```

### Problems encountered:
* File placement: `yolov9m.pt` must be in the same directory as `train.py`.
* `torch.load` fails on the checkpoint with a recent PyTorch (since 2.6, `torch.load()` defaults to `weights_only=True`). Either downgrade PyTorch (simplest and most stable) or edit line 108 of `train.py` directly:
  ```python=
  # ckpt = torch.load(weights, map_location='cpu')  # load checkpoint to CPU to avoid CUDA memory leak
  ckpt = torch.load(weights, map_location='cpu', weights_only=False)
  ```
* Label TXT files in the wrong format.
  * The original labels are stored as polygon coordinates (polygon segmentation), while YOLOv9 detection expects rectangular bounding boxes by default.
  * [fixlable.py](https://github.com/yungpei/Car_identification_YOLOv9/blob/main/fixlable.py)
* Out of memory.
  * Lower `batch-size` to 4.
  * Use a smaller model (yolov9-s).
  * Enable `--cache ram` so the dataset is cached in CPU RAM. This speeds up image loading; it does not by itself lower GPU memory use, so the batch size and model size are what relieve the GPU.
  ```cmd=
  python train.py --device 0 --batch-size 4 --data "C:\Users\user\Documents\yolov9\Carplatev9\data.yaml" --imgsz 640 --weights yolov9s.pt --cfg models/detect/yolov9-s.yaml --hyp data/hyps/hyp.scratch-high.yaml --name yolov9-plates --cache ram
  ```
  ![image](https://hackmd.io/_uploads/B1XNf1gogl.png)
* Model structure does not match the config/weights.
  * To train from scratch, drop `--weights yolov9s.pt` and use `--weights '' --cfg models/detect/yolov9-s.yaml` instead:
  ```cmd=
  python train.py --device 0 --batch-size 4 --data "C:\Users\user\Documents\yolov9\Carplatev9\data.yaml" --imgsz 640 --weights '' --cfg models/detect/yolov9-s.yaml --hyp data/hyps/hyp.scratch-high.yaml --name yolov9-plates --cache ram
  ```
* YOLOv9 model output and loss computation.
  * Error message:
    ```text=
    AttributeError: 'list' object has no attribute 'view'
    ```
  * At line 168 of `loss_tal.py`:
    ```python=
    pred_distri, pred_scores = torch.cat([xi.view(feats[0].shape[0], self.no, -1) for xi in feats], 2).split((self.reg_max * 4, self.nc), 1)
    ```
    `xi` should be a Tensor, but here it is a list, so `.view()` fails. In other words, `pred` has the wrong structure.
  * A small script that prints the structure of `pred` before it enters the loss: [check_pred.py](https://github.com/yungpei/Car_identification_YOLOv9/blob/main/check_pred.py)
    ```bash=
    python check_pred.py
    ```
    ![image](https://hackmd.io/_uploads/BJN1Heeill.png)
    The model's `forward()` returns a tuple, which triggers the error: `feats` is actually a list that contains further lists. Inside `__call__` in `loss_tal.py`, unpack `pred` first and keep only `[0]` (the real list of prediction tensors), ignoring `[1]` (intermediate features).
  * In **utils/loss_tal.py**, around line 168 (or inside the `__call__` function), change the code to:
    ```python=
    # If pred is a tuple/list wrapping another list, take the first element
    if isinstance(feats, (tuple, list)) and isinstance(feats[0], list):
        feats = feats[0]  # keep only the first list (the real outputs)

    # Make sure feats is a list of Tensors
    feats = [xi if isinstance(xi, torch.Tensor) else xi[0] for xi in feats]

    pred_distri, pred_scores = torch.cat([xi.view(feats[0].shape[0], self.no, -1) for xi in feats], 2).split(
        (self.reg_max * 4, self.nc), 1)
    ```
  * After the change, run `train.py` again:
    ```cmd=
    python train.py --device 0 --batch-size 6 --data "C:\Users\user\Documents\yolov9\Carplatev9\data.yaml" --imgsz 416 --weights '' --cfg models/detect/yolov9-s.yaml --hyp data/hyps/hyp.scratch-high.yaml --name yolov9-plates --cache ram
    ```
    Training now runs successfully!
    ![image](https://hackmd.io/_uploads/r1wIpJ-sxg.png)

### Parameter breakdown
| Argument | Explanation |
| --------------------------------------- |:---------------------------------------------------------------- |
| `python train.py` | Runs the YOLOv9 training script. |
| `--device 0` | Use GPU 0. `--device 0,1` uses multiple GPUs; `--device cpu` uses the CPU only. |
| `--batch-size 6` | 6 images per batch. Larger values train faster but can exhaust GPU memory; smaller values train slower but stay within GPU memory. |
| `--data "...\Carplatev9\data.yaml"` | The dataset configuration file (data.yaml). |
| `--imgsz 416` | Training image size (inputs are automatically resized to 416×416). The default is 640; a smaller size speeds up training and lowers GPU requirements, but accuracy may drop slightly. |
| `--weights ''` | Do not load pretrained weights; train from scratch, which is harder to converge. Normally this is set to yolov9-s.pt for fine-tuning, which gives better results. |
| `--cfg models/detect/yolov9-s.yaml` | The model architecture (here YOLOv9-s, the small variant). The s model is faster and suits a small dataset such as license plates. |
| `--hyp data/hyps/hyp.scratch-high.yaml` | Training hyperparameters, for example the learning rate (lr), loss-function weights, and data augmentation methods. |
| `--name yolov9-plates` | Results are saved to runs/train/yolov9-plates/, which makes experiments easy to tell apart. |
| `--cache ram` | Caches the dataset in memory (RAM) so images load faster and training speeds up. Needs enough RAM (at least 16 GB recommended). |

## Detection + cropping (for OCR)
```cmd=
python detect.py --source C:\path\to\test.jpg --weights runs\train\yolov9-plates\weights\best.pt --img 416 --save-crop --classes 0 --exist-ok
```
On success, runs\detect\...\crops\plate\*.jpg is created under the current working directory.

-------------------------------

## After training
### Check the training results
* The training curves are in runs/train/yolov9-plates/. There may be a results.png or a metrics folder; if not, install TensorBoard:
  ```cmd=
  conda install tensorboard
  (yolov9) C:\...\yolov9>tensorboard --logdir=./runs
  TensorFlow installation not found - running with reduced feature set.
  Serving TensorBoard on localhost; to expose to the network, use a proxy or pass --bind_all
  TensorBoard 2.19.0 at http://localhost:..../ (Press CTRL+C to quit)
  ```
  Check that box_loss, cls_loss and dfl_loss decrease steadily, with no excessive oscillation and no NaN.
  ![image](https://hackmd.io/_uploads/rJGF4krjel.png)
  ![image](https://hackmd.io/_uploads/r1tkSJHjgx.png)
* Best model: runs/train/yolov9-plates/weights/ contains
  * best.pt → the model saved at the best validation result
  * last.pt → the model at the end of training

### Validate the model
* Test on the validation set:
  ```cmd=
  python detect.py --weights runs/train/yolov9-plates/weights/best.pt --img 416 --conf 0.25 --source Carplatev9/valid/images
  ```
  The output shows the predicted plate boxes with their confidence scores, which lets you confirm that the model detects plates correctly. Metrics such as mAP, precision and recall can also be computed; in this repo that is done by running `val.py` (the metric code itself lives in `utils/metrics.py`).

#### Problems encountered:
* Since PyTorch 2.6, `torch.load()` defaults to `weights_only=True`, so the freshly trained best.pt is treated as an "unsafe file" and loading fails with `_pickle.UnpicklingError`. Find this line in `models/experimental.py` (around line 243):
  ```python=
  ckpt = torch.load(attempt_download(w), map_location='cpu')  # load
  ```
  and change it to
  ```python=
  ckpt = torch.load(attempt_download(w), map_location='cpu', weights_only=False)
  ```
* The `pred` returned by the model's forward pass is a list, but YOLOv9's `detect.py` expects a tensor, so the call to `non_max_suppression(pred, …)` crashes. Open detect.py, find this section (around line 100), and change it to:
  ```python=
  with dt[1]:
      visualize = increment_path(save_dir / Path(path).stem, mkdir=True) if visualize else False
      pred = model(im, augment=augment, visualize=visualize)
      # If pred is a tuple/list, take the first output
      if isinstance(pred, (list, tuple)):
          pred = pred[0]

  # NMS
  with dt[2]:
      pred = non_max_suppression(pred, conf_thres, iou_thres, classes, agnostic_nms, max_det=max_det)
  ```
  The validation output is saved to ==runs\detect\exp==

### Inference / deployment
* Inference on a single image
  ```cmd=
  python detect.py --weights runs/train/yolov9-plates/weights/best.pt --img 416 --conf 0.25 --source path/to/the/test/image
  ```
* Batch inference
  ```cmd=
  python detect.py --weights runs/train/yolov9-plates/weights/best.pt --img 416 --conf 0.25 --source Carplatev9/test/images/
  ```

Export the model for deployment to other environments (for example ONNX / TensorRT). Run as written, without an `--include` option, this command produced a TorchScript file:
```cmd=
python export.py --weights runs/train/yolov9-plates/weights/best.pt --img 416 --batch 1 --device 0
```
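The inference and export commands above all share the same weights folder, image size and confidence threshold, so a small helper can keep them consistent. The following is a minimal sketch, not part of the YOLOv9 repo: `exported_weights` and `detect_command` are hypothetical helper names, and the sketch assumes the exported file sits next to `best.pt` with a `.torchscript` suffix, which is the path this write-up uses for the exported model.

```python
from pathlib import PurePosixPath
from typing import List


def exported_weights(weights: str, suffix: str = ".torchscript") -> str:
    """Return the assumed export path: same folder and stem, new suffix."""
    return str(PurePosixPath(weights).with_suffix(suffix))


def detect_command(weights: str, source: str,
                   imgsz: int = 416, conf: float = 0.25) -> List[str]:
    """Build the detect.py argument list using the flags from this note."""
    return [
        "python", "detect.py",
        "--weights", weights,
        "--img", str(imgsz),
        "--conf", str(conf),
        "--source", source,
    ]


if __name__ == "__main__":
    pt_file = "runs/train/yolov9-plates/weights/best.pt"
    ts_file = exported_weights(pt_file)
    # Print the command line; pass the list to subprocess.run() to execute it
    # from the yolov9 repository root.
    print(" ".join(detect_command(ts_file, "Carplatev9/valid/images")))
```

Building the command as a list (rather than one string) means it can be handed to `subprocess.run` without worrying about quoting paths that contain spaces.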
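![image](https://hackmd.io/_uploads/r1-rXgHsex.png)
> The model was converted to TorchScript format, which makes inference more stable and avoids the earlier "list has no `device`" kind of error. From here on, `detect` can be run with the `.torchscript` file:

```cmd=
python detect.py --weights runs/train/yolov9-plates/weights/best.torchscript --img 416 --conf 0.25 --source Carplatev9/valid/images
```

### Validate model accuracy
Measure mAP directly:
```cmd=
python val.py --weights runs/train/yolov9-plates/weights/best.torchscript --data Carplatev9/data.yaml --img 416
```
result:
```ini=
P = 0.989
R = 0.968
mAP50 = 0.993
mAP50-95 = 0.848
# The model performs very well on the validation set; plate detection is nearly perfect.
```
![image](https://hackmd.io/_uploads/HysEPxBoxl.png)

### Visualize the results
Training or validation images can be inspected with the `plot_images` helper (in `utils/plots.py`) or with detect.py. Check that the box positions and confidence scores look reasonable.

### Save & back up
Back up the trained weights (best.pt, last.pt) to the cloud or to another disk. Exporting the YAML config and the class list as well makes it easier to reproduce the run later or reuse the model in another project.

## Conclusion
This YOLOv9 setup and hands-on exercise covers the whole workflow, from dataset preparation and model training to inference and deployment. Several obstacles came up along the way, such as incompatible label formats, CUDA out-of-memory, and errors caused by code and library version differences, but solving them one by one gave a deeper understanding of how the YOLO family of frameworks works. The final result is a dedicated license-plate detection model with detection and crop output in place, which lays the groundwork for plugging in OCR for text recognition. Possible next steps include model compression, ONNX deployment, or combining the model with OpenCV and PaddleOCR to build a complete automated license-plate recognition system.

## References
- [YOLOv9 模型剖析與使用 - 新的 Object detection 模型](https://blog.infuseai.io/yolov9-object-detection-model-introduction-61dc029d8167)
- [Taiwan License Plate Recognition Research Tlprr Dataset](https://universe.roboflow.com/jackresearch0/taiwan-license-plate-recognition-research-tlprr/dataset/7)
- [【Day 13】前置作業 - 初始化環境](https://ithelp.ithome.com.tw/m/articles/10355341)
- [Pytorch – YOLOv9自定義資料訓練](https://www.wpgdadatong.com/blog/detail/74311)
- [Some v7 example code](https://www.larrysprognotes.com/YOLOv7_2/)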
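As a supplement to the label-format problem described in the training section (polygon labels where bounding boxes were expected), the conversion idea can be sketched in a few lines. This is not the author's `fixlable.py`; it is a minimal illustration that assumes each label line has the form `class x1 y1 x2 y2 ...` with normalized coordinates, and the function names `polygon_line_to_bbox` and `convert_label_text` are hypothetical.

```python
def polygon_line_to_bbox(line: str) -> str:
    """Convert one 'class x1 y1 x2 y2 ...' polygon line to 'class cx cy w h'."""
    parts = line.split()
    cls, coords = parts[0], [float(v) for v in parts[1:]]
    if len(coords) == 4:
        return line.strip()  # already in bounding-box format, keep as-is
    if len(coords) < 6 or len(coords) % 2 != 0:
        raise ValueError(f"not a valid polygon label: {line!r}")
    xs, ys = coords[0::2], coords[1::2]
    x_min, x_max = min(xs), max(xs)
    y_min, y_max = min(ys), max(ys)
    # YOLO boxes are stored as center x, center y, width, height (normalized).
    cx, cy = (x_min + x_max) / 2, (y_min + y_max) / 2
    w, h = x_max - x_min, y_max - y_min
    return f"{cls} {cx:.6f} {cy:.6f} {w:.6f} {h:.6f}"


def convert_label_text(text: str) -> str:
    """Convert every non-empty line of a label file's contents."""
    lines = [polygon_line_to_bbox(ln) for ln in text.splitlines() if ln.strip()]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    sample = "0 0.1 0.2 0.5 0.2 0.5 0.4 0.1 0.4\n"
    print(convert_label_text(sample), end="")
```

To apply it to a dataset, one would read each `.txt` file under the `labels` folders, pass its contents through `convert_label_text`, and write the result back, ideally after backing up the original labels.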