---
tags: thesis
---
# Mask R-CNN
:notes: Recorded by `zoanana990`
## Document
:information_source: Open Source: [Thesis](https://drive.google.com/file/d/1XD7hNjN1a05RI56IFL19uMaZWiIFrXOQ/view?usp=sharing), [Journal](https://docs.google.com/document/d/1WqR7ABOj4qvtXUB4IdbSNpb29-N-ziD8/edit?usp=sharing&ouid=102331250265453411054&rtpof=true&sd=true)
## Result
:warning: Access Restrictions: [Result](https://drive.google.com/drive/folders/1LzmR1hiLWmQ_kcyZot-XEqLONq0ZXzuZ?usp=sharing)
## Environment Installation
Windows: [Detectron2 Installation](https://hackmd.io/@zoanana990/SkTWjb19K)
Linux: [Official Website]()
## Usage
- Step 1: Prepare the dataset in COCO format
- Step 2: Clone the code below and modify the dataset paths (see the layout sketch below)
- Step 3: Modify the configuration file
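Based on the dataset paths hard-coded in `main.py` below, the expected layout is roughly as follows (the fold folder names come from the sample code; adjust them to your own setup):
```
Data/
└── Fold/
    ├── F1/
    │   ├── annotations.json
    │   └── JPEGImages/
    │       └── *.jpg
    ├── F2/  (same structure)
    ├── F3/
    └── F4/
```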
## Convert to COCO Dataset
### Dataset
Use Labelme to annotate the images, then convert the annotations to COCO format:
```
git clone https://github.com/wkentaro/labelme.git
```
#### Convert Labelme to COCO Dataset
Format:
```
python3 ./labelme/examples/instance_segmentation/labelme2coco.py <input/data/folder> <output/data/folder> --labels ./labels.txt
```
Example:
:information_source: On Unix-like systems, use `python3`; on Windows, `python` is usually sufficient.
```
python3 ./labelme/examples/instance_segmentation/labelme2coco.py ./Data/Example ./Data/COCO --labels ./labels.txt
python ./labelme/examples/instance_segmentation/labelme2coco.py ./Data/4_folds_fold0 ./Data/F0 --labels ./labels.txt
python ./labelme/examples/instance_segmentation/labelme2coco.py ./Data/4_folds_fold1 ./Data/F1 --labels ./labels.txt
python ./labelme/examples/instance_segmentation/labelme2coco.py ./Data/4_folds_fold2 ./Data/F2 --labels ./labels.txt
python ./labelme/examples/instance_segmentation/labelme2coco.py ./Data/4_folds_fold3 ./Data/F3 --labels ./labels.txt
```
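The `4_folds_fold*` folders used above imply the annotated images were split into four folds before conversion. A minimal sketch of how such a split could be produced, assuming one Labelme `.json` per `.jpg` in `./Data/Example` (the split logic here is an illustration, not the author's actual script):
```python=
import glob
import os
import shutil

from sklearn.model_selection import KFold

# One Labelme .json per image; copy each fold's .json/.jpg pairs into a folder.
files = sorted(glob.glob("./Data/Example/*.json"))

kf = KFold(n_splits=4, shuffle=True, random_state=0)
for fold, (_, fold_idx) in enumerate(kf.split(files)):
    out_dir = f"./Data/4_folds_fold{fold}"
    os.makedirs(out_dir, exist_ok=True)
    for i in fold_idx:
        json_path = files[i]
        image_path = os.path.splitext(json_path)[0] + ".jpg"
        shutil.copy(json_path, out_dir)
        if os.path.exists(image_path):
            shutil.copy(image_path, out_dir)
```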
labels.txt Format:
```
__ignore__
powder_uneven
powder_uncover
scratch
```
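After conversion, it is worth sanity-checking that the generated COCO file loads and contains the expected categories. A minimal check using only the standard library, assuming the output folder from the example above:
```python=
import json

with open("./Data/COCO/annotations.json") as f:
    coco = json.load(f)

print("images:     ", len(coco["images"]))
print("annotations:", len(coco["annotations"]))
print("categories: ", [c["name"] for c in coco["categories"]])
```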
## Source Code for Detectron2 Mask R-CNN
`main.py`
:::spoiler :information_source: Open Source Sample Code
```python=
import detectron2
from detectron2.utils.logger import setup_logger
setup_logger()
import torch
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
# import some common libraries
import matplotlib.pyplot as plt
import numpy as np
import cv2
# import some common detectron2 utilities
from detectron2.engine import DefaultPredictor
from detectron2.config import get_cfg
from detectron2.utils.visualizer import Visualizer
from detectron2.data import MetadataCatalog, DatasetCatalog
from detectron2.data.datasets import register_coco_instances
register_coco_instances(name="F1", metadata={},
json_file="Data/Fold/F1/annotations.json", image_root="Data/Fold/F1")
register_coco_instances(name="F2", metadata={},
json_file="Data/Fold/F2/annotations.json", image_root="Data/Fold/F2")
register_coco_instances(name="F3", metadata={},
json_file="Data/Fold/F3/annotations.json", image_root="Data/Fold/F3")
register_coco_instances(name="F4", metadata={},
json_file="Data/Fold/F4/annotations.json", image_root="Data/Fold/F4")
# coco_val_metadata = MetadataCatalog.get("self_coco_val")
# dataset_dicts = DatasetCatalog.get("self_coco_val")
from detectron2.engine import DefaultTrainer
from detectron2 import model_zoo
cfg = get_cfg()
## Detection: COCO-Detection/faster_rcnn_R_101_C4_3x.yaml
## Detection: COCO-Detection/faster_rcnn_R_101_FPN_3x.yaml
## Segmentation: COCO-InstanceSegmentation/mask_rcnn_R_101_C4_3x.yaml
## Segmentation: COCO-InstanceSegmentation/mask_rcnn_R_101_FPN_3x.yaml
cfg.merge_from_file(model_zoo.get_config_file("COCO-InstanceSegmentation/mask_rcnn_X_101_32x8d_FPN_3x.yaml"))
cfg.MODEL.WEIGHTS = model_zoo.get_checkpoint_url("COCO-InstanceSegmentation/mask_rcnn_X_101_32x8d_FPN_3x.yaml")
cfg.DATASETS.TRAIN = ("F2", "F3", "F4", )
test_dataset = "F1"
cfg.DATASETS.TEST = (test_dataset,)
cfg.DATALOADER.NUM_WORKERS = 2
# Let training initialize from model zoo
cfg.SOLVER.IMS_PER_BATCH = 2
num_gpu = 1
bs = (num_gpu * 2)
cfg.SOLVER.BASE_LR = 0.0002 * bs / 16  # scale the learning rate linearly with total batch size
cfg.SOLVER.MAX_ITER = 80000
cfg.MODEL.ANCHOR_GENERATOR.SIZES = [[16], [48], [96], [216], [480]]  # one size per FPN feature map
cfg.MODEL.ANCHOR_GENERATOR.ASPECT_RATIOS = [[0.1, 0.2, 0.5, 1, 2, 5, 10, 25, 50, 60, 70]]  # same ratios for all feature maps
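# Note (an inference, not stated in the source): the extreme aspect ratios
# above, far beyond detectron2's default [[0.5, 1.0, 2.0]], presumably
# target long, thin defects such as scratches.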
# cfg.MODEL.ROI_HEADS.BATCH_SIZE_PER_IMAGE = 512  # (default: 512)
cfg.MODEL.ROI_HEADS.NUM_CLASSES = 3  # powder_uneven, powder_uncover, scratch
cfg.OUTPUT_DIR = "./output/"
import os
os.makedirs(cfg.OUTPUT_DIR, exist_ok=True)
trainer = DefaultTrainer(cfg)
trainer.resume_or_load(resume=False)
trainer.train()
from detectron2.evaluation import COCOEvaluator, inference_on_dataset
from detectron2.data import build_detection_test_loader
evaluator = COCOEvaluator(test_dataset, cfg, False, output_dir=cfg.OUTPUT_DIR)
val_loader = build_detection_test_loader(cfg, test_dataset)
inference_on_dataset(trainer.model, val_loader, evaluator)
cfg.MODEL.WEIGHTS = os.path.join(cfg.OUTPUT_DIR, "model_final.pth")
cfg.MODEL.ROI_HEADS.SCORE_THRESH_TEST = 0.5 # set the testing threshold for this model
cfg.DATASETS.TEST = (test_dataset, )
predictor = DefaultPredictor(cfg)
from detectron2.utils.visualizer import ColorMode
import glob
import time
t1 = time.time()
count = 0
saveFolder = "./output/" + test_dataset + "/Result"
os.makedirs(saveFolder, exist_ok=True)
for d in glob.glob("./Data/Fold/" + test_dataset + "/JPEGImages/*.jpg"):
    count += 1
    im = cv2.imread(d)
    predictions = predictor(im)
    viz = Visualizer(im[:, :, ::-1],
                     metadata=MetadataCatalog.get(test_dataset),
                     scale=1,
                     instance_mode=ColorMode.IMAGE_BW)
    output = viz.draw_instance_predictions(predictions["instances"].to("cpu"))
    result = output.get_image()[:, :, ::-1]
    cv2.imwrite(saveFolder + '/' + d.split('/')[-1], result)
t2 = time.time()
print("----------------------------------------------------------------------------------------------")
print("Time usage per image: " + str((t2 - t1) / count) + " seconds")
```
:::
:::warning
This sample code works, but it is not clean. If you want to use it, please refactor it for clarity; one possible cleanup is sketched below.
:::
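As one possible cleanup (a sketch, not the author's refactor; `setup_cfg` is a hypothetical helper), the configuration can be factored into a function parameterized by the held-out fold, so each cross-validation run becomes a single call:
```python=
from detectron2 import model_zoo
from detectron2.config import get_cfg

MODEL_YAML = "COCO-InstanceSegmentation/mask_rcnn_X_101_32x8d_FPN_3x.yaml"

def setup_cfg(test_fold, all_folds=("F1", "F2", "F3", "F4"), output_dir="./output/"):
    """Build a config that trains on every registered fold except `test_fold`."""
    cfg = get_cfg()
    cfg.merge_from_file(model_zoo.get_config_file(MODEL_YAML))
    cfg.MODEL.WEIGHTS = model_zoo.get_checkpoint_url(MODEL_YAML)
    cfg.DATASETS.TRAIN = tuple(f for f in all_folds if f != test_fold)
    cfg.DATASETS.TEST = (test_fold,)
    cfg.DATALOADER.NUM_WORKERS = 2
    cfg.SOLVER.IMS_PER_BATCH = 2
    cfg.SOLVER.BASE_LR = 0.0002 * cfg.SOLVER.IMS_PER_BATCH / 16
    cfg.SOLVER.MAX_ITER = 80000
    cfg.MODEL.ROI_HEADS.NUM_CLASSES = 3
    cfg.OUTPUT_DIR = output_dir
    return cfg

cfg = setup_cfg("F1")  # train on F2-F4, evaluate on F1
```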
:warning: Access Restrictions: [github](https://github.com/zoanana990/Mask_RCNN_with_Detectron2)
## Result
### 4-Fold Training with Naive Model
:information_source: Backbone: ResNeXt-101-FPN
| Fold Number | mAP | Dice | FPS | AP.50 | AP.75 | AP small | AP medium | AP large |
|-------------|--------|-------|------|--------|--------|----------|-----------|----------|
| Fold 1 | 77.500 | 92.78 | 8.16 | 91.200 | 81.700 | 32.700 | 27.300 | 88.200 |
| Fold 2 | 75.904 | 92.30 | 8.16 | 90.753 | 80.723 | 30.599 | 27.510 | 81.673 |
| Fold 3 | 75.370 | 91.54 | 8.16 | 91.072 | 80.563 | 27.825 | 53.887 | 83.224 |
| Fold 4 | 75.000 | 92.79 | 8.16 | 89.130 | 78.176 | 12.302 | 51.819 | 85.118 |
#### For Different Classes
| Fold Number | AP Uneven | AP Uncover | AP Scratch |
|-------------|-----------|------------|-------------|
| Fold 1 | 83.972 | 46.531 | 97.869 |
| Fold 2 | 80.358 | 47.353 | 100.00 |
| Fold 3 | 79.623 | 46.719 | 99.767 |
| Fold 4 | 85.340 | 40.703 | 98.819 |
### Cross Validation
| Method | Backbone | mAP | Dice | FPS | AP.50 | AP.75 |
|------------|----------|-------|-------|------|--------|-------|
| Mask R-CNN | R-101    | 87.95 | 92.82 | 9.27 | 98.980 | 92.65 |
| Mask R-CNN | X-101    | 91.28 | 94.34 | 8.16 | 98.980 | 94.36 |
| Mask R-CNN | Swin_T   | 93.49 | 97.41 | 9.54 | 99.980 | 95.27 |
#### For Different Classes
| Backbone | AP Uneven | AP Uncover | AP Scratch |
|----------|---------------|----------------|----------------|
| R-101 | 85.03 | 80.41 | 97.78 |
| X-101 | 93.94 | 82.38 | 97.54 |
| Swin_T | 95.11 | 86.38 | 98.61 |
* R denotes ResNet, a backbone pretrained and provided by Detectron2
* X denotes ResNeXt, a backbone pretrained and provided by Detectron2
* Swin_T denotes Swin Transformer, a custom backbone (see the sketch below)
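The source does not include the Swin backbone code, but Detectron2's documented way to plug in a custom backbone is the `BACKBONE_REGISTRY`. A minimal toy example following the pattern from the Detectron2 docs (this is not the actual Swin implementation):
```python=
import torch.nn as nn
from detectron2.modeling import BACKBONE_REGISTRY, Backbone, ShapeSpec

@BACKBONE_REGISTRY.register()
class ToyBackbone(Backbone):
    def __init__(self, cfg, input_shape):
        super().__init__()
        # A single conv standing in for a real backbone such as Swin.
        self.conv1 = nn.Conv2d(3, 64, kernel_size=16, stride=16)

    def forward(self, image):
        return {"conv1": self.conv1(image)}

    def output_shape(self):
        return {"conv1": ShapeSpec(channels=64, stride=16)}

# Then select it with: cfg.MODEL.BACKBONE.NAME = "ToyBackbone"
```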
### Reference
[Swin Transformer](https://arxiv.org/pdf/2111.09883.pdf)
## Classifier
Model: EfficientNet-B6
Image Size: (512, 512)
Dataset:

| Category | Training Set | Validation Set | Test Set  | Total |
| -------- | ------------ | -------------- | --------- | ----- |
| Normal   | 60%          | 20%            | 180 (20%) | 900   |
| Defect   | 60%          | 20%            | 178 (20%) | 894   |

Pretrained: None
Data Augmentation: Noise, Transpose
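A minimal sketch of how this classifier could be set up with torchvision (>= 0.11). Only EfficientNet-B6, 512x512 inputs, no pretraining, and noise/transpose augmentation are stated in the source; the augmentation parameters below are assumptions:
```python=
import torch
from torchvision import transforms
from torchvision.models import efficientnet_b6

# Two classes (intact vs. defect); no pretrained weights, per the source.
model = efficientnet_b6(weights=None, num_classes=2)

train_transform = transforms.Compose([
    transforms.Resize((512, 512)),
    transforms.ToTensor(),
    # "Transpose": randomly swap H and W (assumed interpretation).
    transforms.Lambda(lambda x: x.transpose(-2, -1) if torch.rand(1) < 0.5 else x),
    # "Noise": additive Gaussian noise (strength is a guess).
    transforms.Lambda(lambda x: (x + 0.02 * torch.randn_like(x)).clamp(0, 1)),
])
```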
### Confusion Matrix
| Actual \ Predicted | Intact | Defect |
| ------------------ | ------ | ------ |
| Intact | 178 | 1 |
| Defect | 2 | 177 |
### Metrics
| Metrics | Value |
| ------- | ----- |
| Accuracy | 99.16% |
| Precision | 99.45% |
| Recall | 98.91% |
| FPS | 71.94 |
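These values can be cross-checked against the confusion matrix above, assuming rows are ground truth and Defect is the positive class (small differences from the table would come from how the source computed them):
```python=
tn, fp = 178, 1   # intact row: predicted intact / predicted defect
fn, tp = 2, 177   # defect row: predicted intact / predicted defect

accuracy = (tp + tn) / (tp + tn + fp + fn)  # ~0.9916
precision = tp / (tp + fp)                  # ~0.9944
recall = tp / (tp + fn)                     # ~0.9888
print(f"accuracy={accuracy:.4f} precision={precision:.4f} recall={recall:.4f}")
```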
### PR-Curve
