---
tags: thesis
---
# Mask R-CNN
:notes: Recorded by `zoanana990`
## Document
:information_source: Open Source: [Thesis](https://drive.google.com/file/d/1XD7hNjN1a05RI56IFL19uMaZWiIFrXOQ/view?usp=sharing), [Journal](https://docs.google.com/document/d/1WqR7ABOj4qvtXUB4IdbSNpb29-N-ziD8/edit?usp=sharing&ouid=102331250265453411054&rtpof=true&sd=true)
## Result
:warning: Access Restrictions: [Result](https://drive.google.com/drive/folders/1LzmR1hiLWmQ_kcyZot-XEqLONq0ZXzuZ?usp=sharing)
## Environment Installation
Windows: [Detectron2 Installation](https://hackmd.io/@zoanana990/SkTWjb19K)
Linux: [Official Website]()
## Usage
- Step 1: Prepare the dataset in COCO format
- Step 2: Clone the code below and modify the dataset paths (see the layout sketch below)
- Step 3: Modify the configuration file
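Based on the dataset paths hard-coded in `main.py` below, the expected layout is roughly as follows (the fold folder names come from the sample code; adjust them to your own setup):
```
Data/
└── Fold/
    ├── F1/
    │   ├── annotations.json
    │   └── JPEGImages/
    │       └── *.jpg
    ├── F2/  (same structure)
    ├── F3/
    └── F4/
```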
## Convert to COCO Dataset
### Dataset
Use Labelme to annotate the images, then convert the annotations to COCO format:
```
git clone https://github.com/wkentaro/labelme.git
```
#### Convert Labelme to COCO Dataset
Format:
```
python3 ./labelme/examples/instance_segmentation/labelme2coco.py <input/data/folder> <output/data/folder> --labels ./labels.txt
```
Example:
:information_source: On Unix-like systems, use `python3`; on Windows, `python` is usually sufficient.
```
python3 ./labelme/examples/instance_segmentation/labelme2coco.py ./Data/Example ./Data/COCO --labels ./labels.txt
python ./labelme/examples/instance_segmentation/labelme2coco.py ./Data/4_folds_fold0 ./Data/F0 --labels ./labels.txt
python ./labelme/examples/instance_segmentation/labelme2coco.py ./Data/4_folds_fold1 ./Data/F1 --labels ./labels.txt
python ./labelme/examples/instance_segmentation/labelme2coco.py ./Data/4_folds_fold2 ./Data/F2 --labels ./labels.txt
python ./labelme/examples/instance_segmentation/labelme2coco.py ./Data/4_folds_fold3 ./Data/F3 --labels ./labels.txt
```
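The `4_folds_fold*` folders used above imply the annotated images were split into four folds before conversion. A minimal sketch of how such a split could be produced, assuming one Labelme `.json` per `.jpg` in `./Data/Example` (the split logic here is an illustration, not the author's actual script):
```python=
import glob
import os
import shutil

from sklearn.model_selection import KFold

# One Labelme .json per image; copy each fold's .json/.jpg pairs into a folder.
files = sorted(glob.glob("./Data/Example/*.json"))

kf = KFold(n_splits=4, shuffle=True, random_state=0)
for fold, (_, fold_idx) in enumerate(kf.split(files)):
    out_dir = f"./Data/4_folds_fold{fold}"
    os.makedirs(out_dir, exist_ok=True)
    for i in fold_idx:
        json_path = files[i]
        image_path = os.path.splitext(json_path)[0] + ".jpg"
        shutil.copy(json_path, out_dir)
        if os.path.exists(image_path):
            shutil.copy(image_path, out_dir)
```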
labels.txt Format:
```
__ignore__
powder_uneven
powder_uncover
scratch
```
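After conversion, it is worth sanity-checking that the generated COCO file loads and contains the expected categories. A minimal check using only the standard library, assuming the output folder from the example above:
```python=
import json

with open("./Data/COCO/annotations.json") as f:
    coco = json.load(f)

print("images:     ", len(coco["images"]))
print("annotations:", len(coco["annotations"]))
print("categories: ", [c["name"] for c in coco["categories"]])
```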
## Source Code for Detectron2 Mask R-CNN
`main.py`
:::spoiler :information_source: Open Source Sample Code
```python=
import detectron2
from detectron2.utils.logger import setup_logger
setup_logger()
import torch
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
# import some common libraries
import matplotlib.pyplot as plt
import numpy as np
import cv2
# import some common detectron2 utilities
from detectron2.engine import DefaultPredictor
from detectron2.config import get_cfg
from detectron2.utils.visualizer import Visualizer
from detectron2.data import MetadataCatalog, DatasetCatalog
from detectron2.data.datasets import register_coco_instances
register_coco_instances(name="F1", metadata={},
json_file="Data/Fold/F1/annotations.json", image_root="Data/Fold/F1")
register_coco_instances(name="F2", metadata={},
json_file="Data/Fold/F2/annotations.json", image_root="Data/Fold/F2")
register_coco_instances(name="F3", metadata={},
json_file="Data/Fold/F3/annotations.json", image_root="Data/Fold/F3")
register_coco_instances(name="F4", metadata={},
json_file="Data/Fold/F4/annotations.json", image_root="Data/Fold/F4")
# coco_val_metadata = MetadataCatalog.get("self_coco_val")
# dataset_dicts = DatasetCatalog.get("self_coco_val")
from detectron2.engine import DefaultTrainer
from detectron2 import model_zoo
cfg = get_cfg()
## Detection: COCO-Detection/faster_rcnn_R_101_C4_3x.yaml
## Detection: COCO-Detection/faster_rcnn_R_101_FPN_3x.yaml
## Segmentation: COCO-InstanceSegmentation/mask_rcnn_R_101_C4_3x.yaml
## Segmentation: COCO-InstanceSegmentation/mask_rcnn_R_101_FPN_3x.yaml
cfg.merge_from_file(model_zoo.get_config_file("COCO-InstanceSegmentation/mask_rcnn_X_101_32x8d_FPN_3x.yaml"))
cfg.MODEL.WEIGHTS = model_zoo.get_checkpoint_url("COCO-InstanceSegmentation/mask_rcnn_X_101_32x8d_FPN_3x.yaml")
cfg.DATASETS.TRAIN = ("F2", "F3", "F4", )
test_dataset = "F1"
cfg.DATASETS.TEST = (test_dataset,)
cfg.DATALOADER.NUM_WORKERS = 2
# Let training initialize from model zoo
cfg.SOLVER.IMS_PER_BATCH = 2
num_gpu = 1
bs = (num_gpu * 2)
cfg.SOLVER.BASE_LR = 0.0002 * bs / 16  # scale the learning rate linearly with total batch size
cfg.SOLVER.MAX_ITER = 80000
cfg.MODEL.ANCHOR_GENERATOR.SIZES = [[16], [48], [96], [216], [480]]  # one size per FPN feature map
cfg.MODEL.ANCHOR_GENERATOR.ASPECT_RATIOS = [[0.1, 0.2, 0.5, 1, 2, 5, 10, 25, 50, 60, 70]]  # same ratios for all feature maps
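# Note (an inference, not stated in the source): the extreme aspect ratios
# above, far beyond detectron2's default [[0.5, 1.0, 2.0]], presumably
# target long, thin defects such as scratches.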
# cfg.MODEL.ROI_HEADS.BATCH_SIZE_PER_IMAGE = 512  # (default: 512)
cfg.MODEL.ROI_HEADS.NUM_CLASSES = 3  # powder_uneven, powder_uncover, scratch
cfg.OUTPUT_DIR = "./output/"
import os
os.makedirs(cfg.OUTPUT_DIR, exist_ok=True)
trainer = DefaultTrainer(cfg)
trainer.resume_or_load(resume=False)
trainer.train()
from detectron2.evaluation import COCOEvaluator, inference_on_dataset
from detectron2.data import build_detection_test_loader
evaluator = COCOEvaluator(test_dataset, cfg, False, output_dir=cfg.OUTPUT_DIR)
val_loader = build_detection_test_loader(cfg, test_dataset)
inference_on_dataset(trainer.model, val_loader, evaluator)
cfg.MODEL.WEIGHTS = os.path.join(cfg.OUTPUT_DIR, "model_final.pth")
cfg.MODEL.ROI_HEADS.SCORE_THRESH_TEST = 0.5 # set the testing threshold for this model
cfg.DATASETS.TEST = (test_dataset, )
predictor = DefaultPredictor(cfg)
from detectron2.utils.visualizer import ColorMode
import glob
import time
t1 = time.time()
count = 0
saveFolder = "./output/" + test_dataset + "/Result"
os.makedirs(saveFolder, exist_ok=True)
for d in glob.glob("./Data/Fold/" + test_dataset + "/JPEGImages/*.jpg"):
    count += 1
    im = cv2.imread(d)
    predictions = predictor(im)
    viz = Visualizer(im[:, :, ::-1],
                     metadata=MetadataCatalog.get(test_dataset),
                     scale=1,
                     instance_mode=ColorMode.IMAGE_BW)
    output = viz.draw_instance_predictions(predictions["instances"].to("cpu"))
    result = output.get_image()[:, :, ::-1]
    cv2.imwrite(saveFolder + '/' + d.split('/')[-1], result)
t2 = time.time()
print("----------------------------------------------------------------------------------------------")
print("Time usage per image: " + str((t2 - t1) / count) + " seconds")
```
:::
:::warning
This sample code works, but it is not clean. If you want to use it, please refactor it for clarity; one possible cleanup is sketched below.
:::
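As one possible cleanup (a sketch, not the author's refactor; `setup_cfg` is a hypothetical helper), the configuration can be factored into a function parameterized by the held-out fold, so each cross-validation run becomes a single call:
```python=
from detectron2 import model_zoo
from detectron2.config import get_cfg

MODEL_YAML = "COCO-InstanceSegmentation/mask_rcnn_X_101_32x8d_FPN_3x.yaml"

def setup_cfg(test_fold, all_folds=("F1", "F2", "F3", "F4"), output_dir="./output/"):
    """Build a config that trains on every registered fold except `test_fold`."""
    cfg = get_cfg()
    cfg.merge_from_file(model_zoo.get_config_file(MODEL_YAML))
    cfg.MODEL.WEIGHTS = model_zoo.get_checkpoint_url(MODEL_YAML)
    cfg.DATASETS.TRAIN = tuple(f for f in all_folds if f != test_fold)
    cfg.DATASETS.TEST = (test_fold,)
    cfg.DATALOADER.NUM_WORKERS = 2
    cfg.SOLVER.IMS_PER_BATCH = 2
    cfg.SOLVER.BASE_LR = 0.0002 * cfg.SOLVER.IMS_PER_BATCH / 16
    cfg.SOLVER.MAX_ITER = 80000
    cfg.MODEL.ROI_HEADS.NUM_CLASSES = 3
    cfg.OUTPUT_DIR = output_dir
    return cfg

cfg = setup_cfg("F1")  # train on F2-F4, evaluate on F1
```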
:warning: Access Restrictions: [github](https://github.com/zoanana990/Mask_RCNN_with_Detectron2)
## Result
### 4-Fold Training with Naive Model
:information_source: Backbone: ResNeXt-101-FPN
| Fold Number | mAP | Dice | FPS | AP.50 | AP.75 | AP small | AP medium | AP large |
|-------------|--------|-------|------|--------|--------|----------|-----------|----------|
| Fold 1 | 77.500 | 92.78 | 8.16 | 91.200 | 81.700 | 32.700 | 27.300 | 88.200 |
| Fold 2 | 75.904 | 92.30 | 8.16 | 90.753 | 80.723 | 30.599 | 27.510 | 81.673 |
| Fold 3 | 75.370 | 91.54 | 8.16 | 91.072 | 80.563 | 27.825 | 53.887 | 83.224 |
| Fold 4 | 75.000 | 92.79 | 8.16 | 89.130 | 78.176 | 12.302 | 51.819 | 85.118 |
#### For Different Classes
| Fold Number | AP Uneven | AP Uncover | AP Scratch |
|-------------|-----------|------------|-------------|
| Fold 1 | 83.972 | 46.531 | 97.869 |
| Fold 2 | 80.358 | 47.353 | 100.00 |
| Fold 3 | 79.623 | 46.719 | 99.767 |
| Fold 4 | 85.340 | 40.703 | 98.819 |
### Cross Validation
| Method | Backbone | mAP | Dice | FPS | AP.50 | AP.75 |
|------------|----------|-------|-------|------|--------|-------|
| Mask R-CNN | R-101    | 87.95 | 92.82 | 9.27 | 98.980 | 92.65 |
| Mask R-CNN | X-101    | 91.28 | 94.34 | 8.16 | 98.980 | 94.36 |
| Mask R-CNN | Swin_T   | 93.49 | 97.41 | 9.54 | 99.980 | 95.27 |
#### For Different Classes
| Backbone | AP Uneven | AP Uncover | AP Scratch |
|----------|---------------|----------------|----------------|
| R-101 | 85.03 | 80.41 | 97.78 |
| X-101 | 93.94 | 82.38 | 97.54 |
| Swin_T | 95.11 | 86.38 | 98.61 |
* R denotes ResNet, a backbone pretrained and provided by Detectron2
* X denotes ResNeXt, a backbone pretrained and provided by Detectron2
* Swin_T denotes Swin Transformer, a custom backbone (see the sketch below)
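The source does not include the Swin backbone code, but Detectron2's documented way to plug in a custom backbone is the `BACKBONE_REGISTRY`. A minimal toy example following the pattern from the Detectron2 docs (this is not the actual Swin implementation):
```python=
import torch.nn as nn
from detectron2.modeling import BACKBONE_REGISTRY, Backbone, ShapeSpec

@BACKBONE_REGISTRY.register()
class ToyBackbone(Backbone):
    def __init__(self, cfg, input_shape):
        super().__init__()
        # A single conv standing in for a real backbone such as Swin.
        self.conv1 = nn.Conv2d(3, 64, kernel_size=16, stride=16)

    def forward(self, image):
        return {"conv1": self.conv1(image)}

    def output_shape(self):
        return {"conv1": ShapeSpec(channels=64, stride=16)}

# Then select it with: cfg.MODEL.BACKBONE.NAME = "ToyBackbone"
```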
### Reference
[Swin Transformer](https://arxiv.org/pdf/2111.09883.pdf)
## Classifier
Model: EfficientNet-B6
Image Size: (512, 512)
Dataset:

| Category | Training Set | Validation Set | Test Set  | Total |
| -------- | ------------ | -------------- | --------- | ----- |
| Normal   | 60%          | 20%            | 180 (20%) | 900   |
| Defect   | 60%          | 20%            | 178 (20%) | 894   |

Pretrained: None
Data Augmentation: Noise, Transpose
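A minimal sketch of how this classifier could be set up with torchvision (>= 0.11). Only EfficientNet-B6, 512x512 inputs, no pretraining, and noise/transpose augmentation are stated in the source; the augmentation parameters below are assumptions:
```python=
import torch
from torchvision import transforms
from torchvision.models import efficientnet_b6

# Two classes (intact vs. defect); no pretrained weights, per the source.
model = efficientnet_b6(weights=None, num_classes=2)

train_transform = transforms.Compose([
    transforms.Resize((512, 512)),
    transforms.ToTensor(),
    # "Transpose": randomly swap H and W (assumed interpretation).
    transforms.Lambda(lambda x: x.transpose(-2, -1) if torch.rand(1) < 0.5 else x),
    # "Noise": additive Gaussian noise (strength is a guess).
    transforms.Lambda(lambda x: (x + 0.02 * torch.randn_like(x)).clamp(0, 1)),
])
```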
### Confusion Matrix
| Actual \ Predicted | Intact | Defect |
| ------------------ | ------ | ------ |
| Intact | 178 | 1 |
| Defect | 2 | 177 |
### Metrics
| Metrics | Value |
| ------- | ----- |
| Accuracy | 99.16% |
| Precision | 99.45% |
| Recall | 98.91% |
| FPS | 71.94 |
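These values can be cross-checked against the confusion matrix above, assuming rows are ground truth and Defect is the positive class (small differences from the table would come from how the source computed them):
```python=
tn, fp = 178, 1   # intact row: predicted intact / predicted defect
fn, tp = 2, 177   # defect row: predicted intact / predicted defect

accuracy = (tp + tn) / (tp + tn + fp + fn)  # ~0.9916
precision = tp / (tp + fp)                  # ~0.9944
recall = tp / (tp + fn)                     # ~0.9888
print(f"accuracy={accuracy:.4f} precision={precision:.4f} recall={recall:.4f}")
```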
### PR-Curve
