# Mask R-CNN: Training Your Own Dataset with the Object Detection API

- [name=Authors: Jeff, Jacky5112]
- [time=Thu, May 21, 2020 03:01 PM]
- OS: Windows 10

## Table of Contents

[TOC]

## Overall Workflow

![](https://i.imgur.com/S2X00Ll.png)

## Environment Setup

### [1] Create a virtual environment

- **[Anaconda](https://www.anaconda.com/products/individual#download-section)**
  During installation, add Anaconda to the environment variables, then carry out the configuration below from the Command Prompt.
- Create the virtual environment
  `conda create --name myenv python=3.7`
  >name : name of the virtual environment
  >python : Python version to use
- Enter the virtual environment
  `conda activate myenv`
  >myenv : name of the virtual environment

### [2] Install [Git](https://git-scm.com/)

- Makes Linux-style commands available on Windows.

### [3] Set up the TensorFlow Object Detection API

- Download the [TensorFlow Model Garden](https://github.com/tensorflow/models) into the folder where the virtual environment was set up.

```
git clone --recursive https://github.com/tensorflow/models.git
```

### [4] Install Python packages

```
conda install protobuf
conda install pillow
conda install lxml
conda install matplotlib
conda install Cython
conda install contextlib2
conda install -c anaconda pyqt
conda install -c conda-forge opencv
conda install numpy==1.17
conda install scipy==1.5.0
pip install tf-slim
pip install labelme
pip install git+https://github.com/philferriere/cocoapi.git#subdirectory=PythonAPI
```

Versions in the authors' environment:

| Package | Version |
|---|---|
| protobuf | 0.29.17 |
| pillow | 7.1.2 |
| lxml | 4.5.1 |
| matplotlib | 3.1.3 |
| Cython | 0.29.17 |
| contextlib2 | 0.6.0.post1 |
| pyqt | 5.12.3 |
| opencv | 4.2.0 |
| numpy | 1.17.0 |
| scipy | 1.5.0 |
| tf-slim | 1.1.0 |
| labelme | 4.5.6 |

The protobuf version listed is identical to the Cython version and is probably a copy error; check the installed version with `conda list protobuf`.

### [5] GPU-related packages

- TensorFlow
    - This version of the API works with TensorFlow v1.15.0.
    - CPU version
      `pip install tensorflow==1.15.0`
    - GPU version
      `pip install tensorflow-gpu==1.15.0`
- CUDA
    - This version of the API works with the [CUDA Toolkit 10.0 Archive](https://developer.nvidia.com/cuda-10.0-download-archive?target_os=Linux&target_arch=x86_64&target_distro=Ubuntu&target_version=1604&target_type=runfilelocal).
- cuDNN
    - This CUDA version works with [cuDNN v7.6.5](https://developer.nvidia.com/rdp/cudnn-download) for CUDA 10.

### [6] Compile Protobuf

Run from the ./models/research/ directory:

```
protoc object_detection/protos/*.proto --python_out=.
```

If an error appears, try giving the full path to protoc, for example:

```
C:\tensorflow\bin\protoc object_detection/protos/*.proto --python_out=.
```

### [7] Add the Slim directory to the Python path

Run this in the ./models/research/ directory. It must be set again every time a new terminal is opened.

```
set PYTHONPATH=<path to models>;<path to models>\research;<path to models>\research\slim
```

Example:

```
set PYTHONPATH=C:\Users\hanst\Desktop\Lance\tensorflow_api_vino\models;C:\Users\hanst\Desktop\Lance\tensorflow_api_vino\models\research;C:\Users\hanst\Desktop\Lance\tensorflow_api_vino\models\research\slim
```

### [8] Test the installation

Run the installation test in the ./models/research/ directory. If no errors are shown, the required packages are installed and the setup is complete.

```
python object_detection/builders/model_builder_test.py
```

## Preparing the Dataset and Annotations

### Create the object class file

Name the file **\*.pbtxt**. The file format is as follows:

![Class file](https://i.imgur.com/YUesqz0.jpg)

### Annotate the dataset

Create a **training-set folder** and a **validation-set folder** and place the images in them. The generated JSON files also go into the corresponding folders, as shown below:

![Create files](https://i.imgur.com/8odvs72.jpg)

Suggested directory tree for the annotated dataset:

```
|__research/datasets
|   |__Wrench
|       |--train
|           |--img
|               |__image1.jpg, ...
|           |__json
|               |__image1.json, ...
|       |__val
|           |--img
|               |__image2.jpg, ...
|           |__json
|               |__image2.json, ...
```

- **Labelme**
    - A common annotation tool for Mask R-CNN, distributed as a Python package. Labelme produces one JSON file per image.

```
labelme
```

After running `labelme`, choose the folder to annotate and use **Create Polygons** to start polygon annotation. The annotation screen looks like this:

![labelme](https://i.imgur.com/WBFHbje.png)

## [VGG Image Annotator](http://www.robots.ox.ac.uk/~vgg/software/via/via_demo.html) Annotation Format Conversion

Convert annotations from the VIA format to the Labelme format (skip this step if you annotated with Labelme).

- Use `VGG2labelme.py` to convert the annotation file produced by VIA into the Labelme format. Commands in this document are written for the Windows Command Prompt, where `^` continues a line; replace each `<...>` placeholder with a real path.

```
python VGG2labelme.py ^
    --json_file=<path to VIA format json file> ^
    --images_dir=<path to image directory> ^
    --output_folder_path=<new output folder name>
```

> json_file : path to the VIA JSON file.
> images_dir : path to the image folder.
> output_folder_path : folder that receives the converted files.

Converting the training set:

![](https://i.imgur.com/S0rAfKO.png)

Converting the validation set:

![](https://i.imgur.com/wGWs6Ps.png)

## Generating [TFRecord](https://www.tensorflow.org/tutorials/load_data/tfrecord) Files

A TFRecord is a file made up of byte-string records and serves as the preprocessed input for TensorFlow model training. Converting the Labelme annotation files into annotation data that TensorFlow can consume yields the TFRecord.

- Use `create_labelme_tf_record.py` to generate the TFRecord file:

```
python create_labelme_tf_record.py ^
    --images_dir=<path to images> ^
    --annotations_json_dir=<path to annotations> ^
    --label_map_path=<path to *.pbtxt> ^
    --output_path=<path to output>/*.record
```

> images_dir : image folder.
> annotations_json_dir : folder containing the JSON files.
> label_map_path : object class file.
> output_path : full output file name.

Converting the training set to TFRecord:

![](https://i.imgur.com/692d4pJ.png)

Converting the validation set to TFRecord:

![](https://i.imgur.com/LxlA7G1.png)

- Note: format of the Labelme JSON file
    - version : version of the Labelme annotation tool
    - flags : comma-separated list of flags, or a file containing flags
    - shapes
        - label : class name
        - points : coordinates of the annotated shape
        - group_id : id of the group
        - shape_type : annotation mode
        - flags : comma-separated list of flags, or a file containing flags
    - imagePath : image path
    - imageData : Base64-encoded image data
    - imageHeight : image height
    - imageWidth : image width

## Choosing a Mask R-CNN Model and Configuring It

Download the model you need from the [**model zoo**](https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/tf1_detection_zoo.md), then pick the matching model configuration from the ./models/research/object_detection/samples/configs/ folder to replace the configuration that came with the download (using the models included with the API first is recommended). The following parameters in the configuration must be adjusted:

- **Required parameters**

![num_class](https://i.imgur.com/HYAtWFl.png)

    - num_classes : total number of object classes.
    - min_dimension & max_dimension : setting these to the size of the input images is recommended. The image is resized so that its shorter side matches min_dimension; if the longer side would then exceed max_dimension, it is scaled to max_dimension instead.

![fine_tune](https://i.imgur.com/TiF7SDO.png)

    - fine_tune_checkpoint : path to the checkpoint inside the downloaded model folder (Ex : path to configuration directory/model.ckpt).

![train_input_reader](https://i.imgur.com/WohfJyf.png)

    - input_path : path to the **training-set** TFRecord file.
    - label_map_path : path to the object class file.

![eval_map_path](https://i.imgur.com/xhQCcLO.png)

    - input_path : path to the **validation-set** TFRecord file.
    - label_map_path : path to the object class file.

- **Parameters to tune according to training behavior**

![train_config](https://i.imgur.com/TaK47S2.png)

    - batch_size : number of samples trained per iteration.
    - learning_rate : the learning rate changes according to the configured step schedule.
    - momentum_optimizer_value : momentum setting.
    - num_steps : total number of iterations.

## Start Training

In the */models/research/object_detection/legacy folder, use train.py to start training. There are two ways to run it:

- With a **single** configuration file :

```
python train.py ^
    --train_dir=<path to output model directory> ^
    --pipeline_config_path=<path to config file>
```

> train_dir : path where the training output is written.
> pipeline_config_path : path to the configuration file (*.config).

- With **three** separate configuration files :
  The configuration is split into three files: a model file, a training-parameter file, and a training-data-path file. You have to split the single configuration file yourself; the three sections are headed model, train_config, and train_input_reader.

```
python train.py ^
    --train_dir=<path to train directory> ^
    --model_config_path=<path to model config> ^
    --train_config_path=<path to train config> ^
    --input_config_path=<path to train_input_config>
```

> train_dir : path where the training output is written.
> model_config_path : model configuration file (*.pbtxt).
> train_config_path : training-parameter configuration file (*.pbtxt).
> input_config_path : training-data-path configuration file (*.pbtxt).

The actual command as run:

![](https://i.imgur.com/iuyBbD6.png)

## Generating the Inference File

Use the trained model to produce the TensorFlow inference file (*.pb). This is the freeze step:

```
python export_inference_graph.py ^
    --pipeline_config_path=<path to configuration file> ^
    --trained_checkpoint_prefix=<path to checkpoint file> ^
    --output_directory=<path to output directory>
```

> pipeline_config_path : path to the configuration file (*config).
> trained_checkpoint_prefix : checkpoint file (*ckpt-{step}).
> output_directory : path where the inference file is written.

The actual command as run:

![](https://i.imgur.com/7Mdz4Bn.png)

# Converting the Frozen Model to an Inference Model

## Converting to a model for OpenCV

The official OpenCV documentation describes how to use a frozen graph produced by the Object Detection API in OpenCV. The key steps are:

1) In the [official documentation](https://github.com/opencv/opencv/wiki/TensorFlow-Object-Detection-API), go to the **Generate a config file** heading and pick the Python file for your model, as shown below:

![](https://i.imgur.com/kHGrY72.jpg)

2) Copy the required Python scripts and create the corresponding files yourself. For Mask R-CNN these are:
    - tf_text_graph_common.py
    - tf_text_graph_mask_rcnn.py

3) Run tf_text_graph_mask_rcnn.py

```
python tf_text_graph_mask_rcnn.py ^
    --input=/path/to/model.pb ^
    --config=/path/to/example.config ^
    --output=/path/to/graph.pbtxt
```

> input : path to the frozen model (.pb).
> config : path to the configuration file (.config).
> output : path to the output graph file for OpenCV (.pbtxt).

The actual run:

![](https://i.imgur.com/U0Fbx4C.png)

## Converting to an IR model for OpenVINO

The conversion requires **administrator privileges**, the **virtual environment** to be active, and the **OpenVINO environment** to be enabled:

```
cd <file path>\IntelSWTools\openvino_2020.1.033\deployment_tools\model_optimizer
python mo_tf.py ^
    --output_dir=<path to output> ^
    --input_model=<file path>/frozen_inference_graph.pb ^
    --model_name=model_name ^
    --tensorflow_object_detection_api_pipeline_config=<file path>/pipeline.config ^
    --tensorflow_use_custom_operations_config=<file path>/IntelSWTools/openvino_2020.1.033/deployment_tools/model_optimizer/extensions/front/tf/mask_rcnn_support_api_v1.15.json ^
    --log_level=ERROR
```

> input_model : the trained model.
> model_name : name of the output model.
> tensorflow_object_detection_api_pipeline_config : configuration file of the Object Detection API.
> tensorflow_use_custom_operations_config : TensorFlow-specific replacement configuration file (here, the one for Mask R-CNN).
> log_level : logging level (CRITICAL, ERROR, WARN, WARNING, INFO, DEBUG, NOTSET).

The actual command as run:

![](https://i.imgur.com/pTjWHFw.png)

# Testing the Dataset Trained with the Object Detection API

Use the Mask R-CNN demo provided by OpenVINO (the one for Object Detection API models). The executable is located in `C:\Users\<username>\Documents\Intel\OpenVINO\omz_demos_build\intel64\Release`.

The actual test command:

```
mask_rcnn_demo.exe -i checkerboard_0.png -m maskrcnn_v3.xml -d CPU
```

![](https://i.imgur.com/GGJKLr9.png)

Test result:

![](https://i.imgur.com/7asVG93.png)

# Object Detection

## Rebuilding with CMake + Microsoft Visual Studio

### [0] About the OpenCV Backend and Target

From opencv2\dnn\dnn.hpp:

![](https://i.imgur.com/unoSNIR.png)

### [1] Install software and libraries

1. Download the OpenCV source code (https://github.com/opencv/opencv)
2. In CMake, select the compiler (make sure to choose x64)
![](https://i.imgur.com/eZCkuBz.png)
3. Run CMake-GUI, enable "WITH_INF_ENGINE", then click "Configure"
![](https://i.imgur.com/EZ3eoQr.png)
4. Enable "BUILD_opencv_world"
![](https://i.imgur.com/ZE6xtEv.png)
5. Set the actual paths for the IE_..... entries (currently OpenCV 4.4.0 with OpenVINO 2020.3)
![](https://i.imgur.com/ETB1Zte.png)
6. Configure CPU support (check carefully whether your CPU supports the selected instruction set)
![](https://i.imgur.com/GxqgfBH.png)
7. Click "Generate" to produce the Visual Studio project files

### [2] Rebuild OpenCV

1. Open the OpenCV solution (OpenCV.sln)
2. Rebuild the entire solution (building both "Debug" and "Release" is recommended, with the platform set to "x64")
![](https://i.imgur.com/yWhUIGp.png)
3. When the build finishes, the "bin" folder contains "Debug" and "Release" subfolders. Copy the built dynamic libraries into your own project (or add their location to the system PATH so that the OpenCV DLL is loaded by default)

### [3] Development environment settings

1. Under Project > Properties > VC++ Directories, set "Include Directories" and "Library Directories".
   For Include Directories, add the "include" header directory of OpenCV and the "include" header directory of "inference_engine" in OpenVINO. For Library Directories, add the "lib" static library directory of your OpenCV build and the "lib" static library directory of "inference_engine" in OpenVINO (mind the difference between Debug and Release).
![](https://i.imgur.com/8UHeRAT.png)
2. Under Project > Properties > Linker > Input, set "Additional Dependencies".
   Debug configuration:
   opencv_world440d.lib
   inference_engined.lib
   inference_engine_legacyd.lib
   inference_engine_c_apid.lib
   inference_engine_nn_builderd.lib
   Release configuration:
   opencv_world440.lib
   inference_engine.lib
   inference_engine_legacy.lib
   inference_engine_c_api.lib
   inference_engine_nn_builder.lib
![](https://i.imgur.com/isbsQFU.png)
3. Under Project > Properties > Linker > Input, enter "%(IgnoreSpecificDefaultLibraries)" in "Ignore Specific Default Libraries"
4. To deploy your project to other computers, Release mode is recommended. If the target computer does not have the Microsoft Visual Studio redistributable package installed, go to Project > Properties > C/C++ > Code Generation and set "Runtime Library" to "Multi-threaded (/MT)"
![](https://i.imgur.com/AsDkRm6.png)

### [4] Install the Movidius Myriad X drivers

1. Install the Movidius VSC Driver
![](https://i.imgur.com/CAdJdBC.png)
2. Install the myriad driver
![](https://i.imgur.com/lXlZzvg.png)

### [5] Modify the pbtxt file of the Mask R-CNN TensorFlow model to support OpenVINO computation

(If you use OpenCV 4.2.0 or earlier.) The modification is shown in the figure:

![](https://i.imgur.com/r3bARo0.png)

With OpenCV 4.4.0 or later, the (*.pbtxt) file does not need to be modified to support OpenVINO.

Result (tested with the official Mask R-CNN model):

![](https://i.imgur.com/kFTDqEB.jpg)

#### Usage

The program accepts the following command-line options:

```
"{help | | Print help message. }"
"{platform | OpenCV | Using OpenCV or OpenVino platform. }"
"{weight | | (OpenCV and OpenVino Platform) Path to a binary file of model contains trained weights. "
    "It could be a file with extensions .caffemodel (Caffe), .pb (TensorFlow), .t7 or .net (Torch), .weights (Darknet), .bin (OpenVINO). }"
"{graph | | (OpenCV and OpenVino Platform) Path to a text file of model contains network configuration, "
    "It could be a file with extensions .prototxt (Caffe), .pbtxt (TensorFlow), .cfg (Darknet), .xml (OpenVINO). }"
"{classes | | (OpenCV and OpenVino Platform) Path to a text file with names of classes to label detected objects. }"
"{colors | | (OpenCV and OpenVino Platform) Path to a text file of indication that model works with RGB input images instead BGR ones (this is for segmentation). }"
"{scale | 1.0 | (OpenCV and OpenVino Platform) Preprocess input image by multiplying on a scale factor. }"
"{image |<none>| (OpenCV and OpenVino Platform) Path to an image file for input data. }"
"{video |<none>| (OpenCV and OpenVino Platform) Path to a video file for input data. }"
"{camera |<none>| (OpenCV and OpenVino Platform) Use camera's frame for input data. }"
"{conf | 0.5 | (OpenCV and OpenVino Platform) Confidence threshold. }"
"{mask | 0.4 | (OpenCV and OpenVino Platform) Non-maximum suppression threshold. }"
"{backend | OpenCV | (OpenCV Platform) (Optional) Choose one of computation backends: "
    "OpenVINO: Intel Inference Engine, "
    "OpenCV: OpenCV implementation }"
"{target | CPU | (OpenCV and OpenVino Platform) (Optional) Choose one of target computation devices: "
    "CPU: CPU target (by default), "
    "MyRaid: MyRaid, "
    "GPU: GPU, "
    "GPU16: GPU using half-float precision. }"
"{outlayer_names | | (OpenCV Platform) (Optional) Force to set output layer's name (use ',' to separate). }"
"{detection | | (OpenVino Platform) (Optional) Force to set detection layer's name. }"
"{masks_name | | (OpenVino Platform) (Optional) Force to set the mask name. }"
"{time | 1 | (OpenCV and OpenVino Platform) (Optional) Show inference time. }"
```

# Interim Results

## OpenCV + Inference Engine

### Wrench

![](https://i.imgur.com/kEfveau.jpg)
![](https://i.imgur.com/bHgXXci.jpg)
![](https://i.imgur.com/jvxM2Td.jpg)

### Noodles

![](https://i.imgur.com/99aV0wf.jpg)
![](https://i.imgur.com/WgmA7Nf.jpg)
![](https://i.imgur.com/3BdTSyr.jpg)
![](https://i.imgur.com/U1RqJfz.jpg)

### Verifying GPU usage

Use the Task Manager to check whether the running program is actually using the GPU.

![](https://i.imgur.com/iMia4Vv.png)

GPU not in use:

![](https://i.imgur.com/CQFT2tE.png)

GPU computation in use:

![](https://i.imgur.com/h8XRR1f.png)

## Notice

### Inference Engine Plugins.xml (supported Myriad)

![](https://i.imgur.com/F2TdITa.png)

### CPU vs VPU vs GPU (OpenCL)

#### Experiment groups

- A -> Tensorflow -> OpenCV + CPU
- B -> Tensorflow -> OpenCV + GPU
- C -> Tensorflow -> OpenCV (backend OpenVino) + CPU
- D -> Model Optimizer -> OpenVino + CPU
- E -> Model Optimizer -> OpenVino + 1 NSE2
- F -> Model Optimizer -> OpenVino + 2 NSE2
- G -> Model Optimizer -> OpenVino + 3 NSE2
- H -> Model Optimizer -> OpenVino + 4 NSE2

#### Group 1 (note: Jacky's desktop)

CPU -> AMD Ryzen Threadripper 2950X 16-Core Processor 3.42GHz
GPU (discrete) -> GeForce GTX 1080 Ti

Note: ! -> means the configuration could not run; the official demo program could not run either.

##### Wrench (1280 * 720)

| |A|B|C|D|E|F|G|H|
|---|-|-|-|-|-|-|-|-|
|Min|757|2430|463|588|!|3230|3229|!|
|Max|968|7032|580|633|!|3633|3381|!|
|Avg|799|2697|501|593|!|3248|3262|!|

Time: ms

##### Noodles (1280 * 720)

| |A|B|C|D|E|F|G|H|
|---|-|-|-|-|-|-|-|-|
|Min|796|2435|484|649|!|!|3443|3433|
|Max|1250|6804|752|729|!|!|3896|3588|
|Avg|847|2605|536|683|!|!|3481|3456|

Time: ms

##### NCHC AI Hackathon Car Damage (224 * 224)

| |A|B|C|D|E|F|G|H|
|---|-|-|-|-|-|-|-|-|
|Min|618|2079|348|476|2797|!|!|2794|
|Max|1462|11316|2972|515|7931|!|!|7656|
|Avg|707|2928|484|480|2826|!|!|2890|

Time: ms

#### Group 2 (note: Jacky's laptop)

CPU -> Intel(R) Core(TM) i7-8565U CPU @1.80GHz 1.99GHz
GPU (integrated) -> Intel(R) UHD Graphics 620

##### Wrench (1280 * 720)

| |A|B|C|D|E|F|G|H|
|---|-|-|-|-|-|-|-|-|
|Min|1463|1712|796|916|!||||
|Max|1967|13440|1117|1499|!||||
|Avg|1642|4405|941|1159|!||||

Time: ms

##### Noodles (1280 * 720)

| |A|B|C|D|E|F|G|H|
|---|-|-|-|-|-|-|-|-|
|Min|2249|1394|940|1150|!|!|||
|Max|2759|97100|2017|1538|!|!|||
|Avg|2452|4068|1465|1333|!|!|||

Time: ms

##### NCHC AI Hackathon Car Damage (224 * 224)

| |A|B|C|D|E|F|G|H|
|---|-|-|-|-|-|-|-|-|
|Min|1232|906|582|692|2800|!|!||
|Max|1476|8923|2895|856|7938|!|!|7656|
|Avg|1302|1048|802|780|2817|!|!|2890|

Time: ms

# Performance Comparison: OpenVINO vs. OpenCV

![](https://i.imgur.com/iSyqb6J.jpg)

# References

- https://blog.csdn.net/length85/article/details/87917438
- https://blog.csdn.net/length85/article/details/87917361#%E5%B0%86%E6%A0%87%E5%AE%9A%E6%A0%B7%E6%9C%AC%E7%94%9F%E6%88%90%E4%B8%BA.record%E6%A0%BC%E5%BC%8F%E6%96%87%E4%BB%B6
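
The Min/Max/Avg rows in the latency tables above can be collected with a small timing harness. The document does not say how the authors measured their numbers, so the following is only a minimal sketch under assumptions: `time_inference`, `summarize_latency`, and `fake_infer` are hypothetical names, and `fake_infer` is a stand-in for a real forward pass (for example, one call into OpenCV or the Inference Engine).

```python
import time
from statistics import mean


def summarize_latency(samples_ms):
    """Return (min, max, avg) of the latency samples, rounded to whole ms."""
    if not samples_ms:
        raise ValueError("need at least one latency sample")
    return (round(min(samples_ms)), round(max(samples_ms)), round(mean(samples_ms)))


def time_inference(infer, frames):
    """Call infer(frame) once per frame and record wall-clock latency in ms."""
    samples_ms = []
    for frame in frames:
        start = time.perf_counter()
        infer(frame)
        samples_ms.append((time.perf_counter() - start) * 1000.0)
    return samples_ms


def fake_infer(frame):
    # Hypothetical stand-in for a real model forward pass (assumption).
    return sum(frame)


if __name__ == "__main__":
    frames = [list(range(1000))] * 20  # dummy input instead of real images
    lo, hi, avg = summarize_latency(time_inference(fake_infer, frames))
    print(f"|Min|{lo}|")
    print(f"|Max|{hi}|")
    print(f"|Avg|{avg}|")
```

Replacing `fake_infer` with the actual inference call for each configuration (A to H) gives one table column per run. The first call often includes model loading or warm-up, which may be why some Max values in the tables are far above the average.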