# Implementation of semantic-segmentation-pytorch
> colab: [Semantic Segmentation Demo](https://colab.research.google.com/drive/1kUZrjMFuDjlrPsNF2l7FioQKFqTxO9oS?authuser=1#scrollTo=Ic3Belq4zmet)
> repo: [CSAILVision/semantic-segmentation-pytorch](https://github.com/CSAILVision/semantic-segmentation-pytorch) <- credit to those folks
> my note: [Segmentation Overview](https://hackmd.io/Bo8ujt3LSOOK_1pfFaHKMg)
[Toc]
## Disclaimer (?)
Hi~ If you are reading this: this is a "helpme" md that tries to document my notes while I adapt the repo for my own project. **I am not one of the authors of the repo**, and **this page is under development**. Any comments and suggestions are welcome~
## Bug
If you are like me, training on a single GPU, you might encounter [this issue](https://github.com/CSAILVision/semantic-segmentation-pytorch/issues/203). The solution is already in [#203](https://github.com/CSAILVision/semantic-segmentation-pytorch/issues/203#issuecomment-562524601):
Add the following to `model.py`, around line 30:
```python=
class SegmentationModule(SegmentationModuleBase):
    def forward(self, feed_dict, *, segSize=None):
        # training
        if type(feed_dict) is list:
            feed_dict = feed_dict[0]
            # move the batch onto the GPU manually (single-GPU workaround)
            if torch.cuda.is_available():
                feed_dict['img_data'] = feed_dict['img_data'].cuda()
                feed_dict['seg_label'] = feed_dict['seg_label'].cuda()
            else:
                raise RuntimeError('Cannot convert torch.FloatTensor into torch.cuda.FloatTensor')
```
## Configuration
The default configuration of the datasets, models, and train/val/test hyperparameters is located in `mit_semseg/config/default.py`; each option is well documented there.
The official custom configurations can be found in the `config` directory.
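For reference, a minimal sketch of how `train.py` combines the defaults with a custom `.yaml` (assuming the `cfg` object the repo exposes from `mit_semseg.config`; the option names below are from my reading of `default.py`, so double-check them there):
```python
# A sketch, not the repo's exact code: load defaults, then override with a .yaml file.
from mit_semseg.config import cfg

cfg.merge_from_file("config/ade20k-resnet50dilated-ppm_deepsup.yaml")

print(cfg.DATASET.num_class)          # 150 for ADE20K
print(cfg.TRAIN.batch_size_per_gpu)   # option names: see mit_semseg/config/default.py
```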
## Loss
Their loss is [NLLLoss](https://pytorch.org/docs/stable/generated/torch.nn.NLLLoss.html). One might switch to cross-entropy loss, or to a Dice-coefficient loss; whether either brings any efficiency or accuracy improvement is unknown to me.
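For context, the criterion is constructed roughly as below; the swap to cross-entropy is sketched only as a comment (an assumption on my side, not something the repo provides), because the decoders apply `log_softmax` before the loss, and `CrossEntropyLoss` expects raw logits instead.
```python
import torch.nn as nn

# What the repo uses (the decoder outputs log-probabilities):
crit = nn.NLLLoss(ignore_index=-1)

# Possible swap (untested assumption): CrossEntropyLoss = log_softmax + NLLLoss,
# so the decoder would have to return raw logits for this to be equivalent.
# crit = nn.CrossEntropyLoss(ignore_index=-1)
```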
## Model
See [here](https://github.com/CSAILVision/semantic-segmentation-pytorch/blob/9aff40de31ee4b21f18514d31e5d6e4ba056924d/mit_semseg/models/models.py#L50); the model definitions are in `mit_semseg/models/models.py`.
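For orientation, this is roughly how the linked colab demo assembles a model from those definitions (the weight paths here are placeholders):
```python
import torch
from mit_semseg.models import ModelBuilder, SegmentationModule

# Encoder and decoder are built separately, then wrapped together with the criterion.
net_encoder = ModelBuilder.build_encoder(
    arch='resnet50dilated', fc_dim=2048,
    weights='path/to/encoder_weights.pth')
net_decoder = ModelBuilder.build_decoder(
    arch='ppm_deepsup', fc_dim=2048, num_class=150,
    weights='path/to/decoder_weights.pth', use_softmax=True)  # use_softmax=True for inference
crit = torch.nn.NLLLoss(ignore_index=-1)
segmentation_module = SegmentationModule(net_encoder, net_decoder, crit)
```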
## Evaluation
Use `eval_multipro.py` to display the IoU of each class in the terminal, and to compute the mean IoU over all classes as well as the pixel accuracy. If visualization is set to `True` (one can set it in the `.yaml` config file), the prediction results (`.png`) will be saved under `./ckpt/the_model_u_trained/result`. You will get something like the following:
<center>
<img
src=https://i.imgur.com/xu8m6p8.png
width=600>
</center>
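For reference, the invocation looks much like the training one (I haven't listed every flag; check the argparser in `eval_multipro.py`):
```shell
python3 eval_multipro.py --gpus 0 --cfg config/ade20k-resnet50dilated-ppm_deepsup.yaml
```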
## Train on your own dataset
### Test run
:::success
Here I assume that you have already followed the steps in the Train section of the repo's readme, e.g. downloaded the dataset, etc.
:::
To make sure everything is in the right place and ready to go, do a test run with
```shell=
python train.py --gpus 1 --cfg config/ade20k-resnet50dilated-ppm_deepsup.yaml
```
If there is an error message saying the GPU memory is not enough, lowering `imgMaxSize` in `config/ade20k-resnet50dilated-ppm_deepsup.yaml` _usually_ solves the problem.
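For example (the values below are just an illustration, not a recommendation; the keys live under `DATASET` in the default config):
```yaml
DATASET:
  # shrink these if the GPU runs out of memory
  imgSizes: (300, 375, 450, 525, 600)
  imgMaxSize: 800
```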
### Dataset Preparation
#### `.json` to Dataset
Here, I use labelme for annotation ([tutorial here](https://github.com/wkentaro/labelme/tree/master/examples/tutorial#tutorial-single-image-example)). To let the model consume the label data, we need to transform the `.json` files into `.png` label images. The file `json_to_dataset.py` converts a single `.json` file into a single-image dataset.
:::info
It can be found in `~/anaconda3/envs/labelme/lib/pythonX.X/site-packages/labelme/cli/json_to_dataset.py` if labelme was installed into a conda env.
Otherwise, one can find it here: `~/anaconda3/lib/pythonX.X/site-packages/labelme/cli/json_to_dataset.py`.
:::
To run it repeatedly over all the `.json` files, I modified it as below:
```python=
import argparse
import base64
import json
import os
import os.path as osp

import imgviz
import PIL.Image

from labelme.logger import logger
from labelme import utils


def main():
    parser = argparse.ArgumentParser()
    parser.add_argument("json_file")
    # parser.add_argument("-o", "--out", default=None)
    parser.add_argument("--save", type=str)
    args = parser.parse_args()

    json_file = args.json_file

    # if args.out is None:
    #     out_dir = osp.basename(json_file).replace(".", "_")
    #     out_dir = osp.join(osp.dirname(json_file), out_dir)
    # else:
    #     out_dir = args.out
    # if not osp.exists(out_dir):
    #     os.mkdir(out_dir)

    data = json.load(open(json_file))
    imageData = data.get("imageData")

    if not imageData:
        imagePath = os.path.join(os.path.dirname(json_file), data["imagePath"])
        with open(imagePath, "rb") as f:
            imageData = f.read()
            imageData = base64.b64encode(imageData).decode("utf-8")
    img = utils.img_b64_to_arr(imageData)

    # map label names to integer values; background is fixed to 0
    label_name_to_value = {"_background_": 0}
    for shape in sorted(data["shapes"], key=lambda x: x["label"]):
        label_name = shape["label"]
        if label_name in label_name_to_value:
            label_value = label_name_to_value[label_name]
        else:
            label_value = len(label_name_to_value)
            label_name_to_value[label_name] = label_value
    lbl, _ = utils.shapes_to_label(
        img.shape, data["shapes"], label_name_to_value
    )

    label_names = [None] * (max(label_name_to_value.values()) + 1)
    for name, value in label_name_to_value.items():
        label_names[value] = name

    lbl_viz = imgviz.label2rgb(
        label=lbl, img=imgviz.asgray(img), label_names=label_names, loc="rb"
    )  # no longer saved; kept from the original script

    # Modified here: save only the label image, named after the .json file
    # PIL.Image.fromarray(img).save(osp.join(out_dir, "img.png"))
    utils.lblsave(osp.join(args.save, f"{os.path.basename(args.json_file)[:-5]}.png"), lbl)
    # PIL.Image.fromarray(lbl_viz).save(osp.join(out_dir, "label_viz.png"))
    # with open(osp.join(out_dir, "label_names.txt"), "w") as f:
    #     for lbl_name in label_names:
    #         f.write(lbl_name + "\n")

    logger.info("Saved to: {}".format(args.save))


if __name__ == "__main__":
    main()
```
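Because the modified script replaces the installed `json_to_dataset.py` (the path from the info box above), the `labelme_json_to_dataset` command picks up the change, so a single file can be converted like this (paths are placeholders):
```shell
labelme_json_to_dataset path/to/example.json --save path/to/labels/
```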
Note that you can adjust how the background is labeled (here label 0 is assigned to the background). In the following, I call the above script repeatedly; remember that both the training and validation sets have to be prepared:
```python
import os

if __name__ == "__main__":
    # run once with 'train' and once with 'val'
    mode = 'val'  # or 'train'
    path = "path/to/your_json"   # directory holding the .json files for this split
    dest = "path/to/save/"       # where the converted .png labels go
    dirs = os.listdir(path)
    dirs = [d for d in dirs if d.endswith('.json')]
    dirs = [os.path.join(path, d) for d in dirs]
    for item in dirs:
        # call the (modified) labelme CLI for each .json file
        os.system("labelme_json_to_dataset " + item + " --save " + dest)
```
We still need to create several files before training, but first, let's check the directory structure the repo expects.
#### Directory Structure
```
# under the `data` directory
yourdataset
├── annotations
│   ├── training
│   └── validation
└── images
    ├── training
    └── validation

# helper files (directly under `data`)
training.odgt
validation.odgt
```
If you have your own dataset, split the data and create this directory structure under the repo's `data` directory. The `.odgt` files list the image and annotation paths together with their width and height. The content of `training.odgt`/`validation.odgt` looks like the following (one JSON object per line):
```
{"fpath_img": "path/to/.jpg", "fpath_segm": "path/to/.png", "width": 683, "height": 512}
```
To create these files, I wrote the following (note that I presume you have already put the data under the `images`/`annotations` directories):
```python
import os
import json

import cv2


def odgt(img_path):
    """Build one .odgt record: image path, annotation path, width, height."""
    seg_path = img_path.replace('images', 'annotations')
    seg_path = seg_path.replace('.jpg', '.png')
    if os.path.exists(seg_path):
        img = cv2.imread(img_path)
        h, w, _ = img.shape  # cv2 gives (height, width, channels)
        odgt_dic = {}
        odgt_dic["fpath_img"] = img_path
        odgt_dic["fpath_segm"] = seg_path
        odgt_dic["width"] = w   # width is the second shape entry, not the first
        odgt_dic["height"] = h
        return odgt_dic
    else:
        # the corresponding annotation does not exist
        return None


if __name__ == "__main__":
    modes = ['train', 'val']
    saves = ['metal_training.odgt', 'metal_validation.odgt']  # customized
    for mode, save in zip(modes, saves):
        dir_path = f"your/data/{mode}"
        img_list = [os.path.join(dir_path, img) for img in sorted(os.listdir(dir_path))]
        # expanduser so the '~' in the output path is actually resolved
        out_path = os.path.expanduser(f'~/semantic-segmentation-pytorch/data/{save}')
        with open(out_path, mode='wt', encoding='utf-8') as myodgt:
            for img in img_list:
                a_odgt = odgt(img)
                if a_odgt is not None:
                    myodgt.write(f'{json.dumps(a_odgt)}\n')
```
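A quick sanity check on the result (the repo's dataset code reads the list as one JSON object per line, and this mirrors that parsing):
```python
import json

# hypothetical path; use whatever .odgt file the script above produced
with open('data/metal_training.odgt') as f:
    samples = [json.loads(line) for line in f]
print(len(samples), samples[0])
```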
#### Modify `dataset.py`
Note that the annotation file should be converted to mode `L`. I modified this in `__getitem__` of `TrainDataset` (and in the val and test datasets as well):
```python
# Original
# segm = Image.open(segm_path)
segm = Image.open(segm_path).convert('L')
```
In addition, the class labels produced by `labelme` will not start from 0. I had to change the following lines in `BaseDataset` manually (there are only two classes in my case):
```python=
def segm_transform(self, segm):
    # to tensor: -1 to 149 for the default ADE20K dataset
    # for ours: -1 background, 0 and 1 for the two classes
    # after .convert('L') the label values come out as (0, 38, 75)
    # and we need to map them to (-1, 0, 1)
    segm = np.array(segm).astype(np.int64)  # signed dtype so that -1 is representable
    segm = np.where(segm == 0, -1, segm)
    segm = np.where(segm == 38, 0, segm)
    segm = np.where(segm == 75, 1, segm)
    # print(np.unique(segm))
    # Original
    # segm = torch.from_numpy(segm).long() - 1
    segm = torch.from_numpy(segm).long()
    return segm
```
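The magic values `38` and `75` here are simply whatever grayscale values my two classes end up with after `.convert('L')`; they depend on labelme's colormap, so check them on one of your own converted label images first:
```python
import numpy as np
from PIL import Image

# Print the distinct grayscale values of one converted label image,
# e.g. [ 0 38 75] for background plus two classes (your values may differ).
segm = Image.open('path/to/annotations/training/example.png').convert('L')
print(np.unique(np.array(segm)))
```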
#### Modify Config
Customize the config file for your own case:
```yaml=
DATASET:
  root_dataset: "./data/"
  list_train: "./data/metal_training.odgt"
  list_val: "./data/metal_validation.odgt"
  num_class: 2
  ...
```
### Training
```shell=
python3 train.py --gpus 1 --cfg config/custom.yaml
```
### Loading a pre-trained model
TODO