# Mask R-CNN Intro
###### tags: `Mask R-CNN` `DL` `Image Segmentation`

## Structure

|Layers| Detail|
|--------|--------|
| Input layer | Image |
| Hidden layer 1 | Takes an image and extracts features using the ResNet 101 architecture (a kind of neural network) |
| Hidden layer 2 | ***Region Proposal Network (RPN)*** Obtains the rectangular regions where the features stand out **>>** takes an image (of any size) as input and outputs a set of rectangular object proposals, each with an objectness score **>>** gets the regions of the feature maps (feature matrices or feature tensors, from layer 1) that the model predicts contain objects |
| Hidden layer 3 | ***Pooling layer*** Pools each region, then computes IoU to obtain a single rectangle **>>** computes the regions of interest (RoI) so that the computation time can be reduced. For all the predicted regions, we compute the Intersection over Union (IoU) with the ground truth boxes. **>>** IoU = Area of the intersection / Area of the union **>>** if (IoU >= 0.5) => region of interest |
| Hidden layer 4 | ***Segmentation Mask*** Explicitly outputs the edges to obtain the object's shape **>>** each object is convolved with a matrix that highlights the object |
| Output layer | Coordinates, bounding box, object mask |

## Computer Vision Comparison

![](https://i.imgur.com/zWwnfr0.jpg)

### Semantic segmentation:
>The goal is to segment and cluster the pixels of the different objects in an image.
>Current mainstream frameworks are all based on Fully Convolutional Networks (FCN).

### Instance segmentation:
>Integrates per-instance region proposals / score maps / RoIs on top of a dense feature map => segmentation
>Closely related to object detection.
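
### IoU example

The IoU rule from hidden layer 3 in the Structure table (IoU = area of the intersection / area of the union, keep the region if IoU >= 0.5) can be sketched in a few lines of plain Python. This is only an illustrative sketch, not the code of any Mask R-CNN implementation: the function names `box_area`, `iou`, and `is_region_of_interest` are hypothetical, and the `(x1, y1, x2, y2)` corner format for boxes is an assumption.

```python
# Illustrative sketch only; the box format (x1, y1, x2, y2) is an assumption.

def box_area(box):
    """Area of an axis-aligned box; 0 if the box is empty or inverted."""
    x1, y1, x2, y2 = box
    return max(0.0, x2 - x1) * max(0.0, y2 - y1)


def iou(box_a, box_b):
    """Intersection over Union of two boxes."""
    # Corners of the overlapping rectangle (may be empty).
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])

    intersection = box_area((ix1, iy1, ix2, iy2))
    union = box_area(box_a) + box_area(box_b) - intersection
    return intersection / union if union > 0 else 0.0


def is_region_of_interest(proposal, ground_truth_boxes, threshold=0.5):
    """True if the proposal overlaps any ground truth box with IoU >= threshold."""
    return any(iou(proposal, gt) >= threshold for gt in ground_truth_boxes)
```

For example, `iou((0, 0, 2, 2), (1, 1, 3, 3))` has an intersection of 1 and a union of 7, so that proposal would fall below the 0.5 threshold and would not be kept as a region of interest.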