In this document, I describe the research I did while trying to integrate MMSegmentation into our framework. I have written it up in a structured way so that anyone reviewing this task is aware of the limitations and difficulties and does not have to retrace my path.
The MMSegmentation library uses Python dictionaries as configs and provides a way to train any of its pre-equipped models on custom datasets. Usually, you do it in the following way:
All steps are easily visible through the code here. However, in this tutorial they set up a new dataset and mention that they need to override the 'load_annotation' method, yet they do not do this in the code, so I am unsure what they meant by this and how it actually works. Overall, this notebook gives a good overview of the structure of the framework.
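The config-driven workflow described above can be sketched as follows. This is an illustrative, simplified example: the concrete keys and values (`EncoderDecoder`, `CityscapesDataset`, `MyCustomDataset`, the paths) are representative placeholders and are not verified against a specific MMSegmentation version.

```python
# Sketch of the MMSegmentation config-driven workflow (illustrative only).
# MMSegmentation configs are plain Python dictionaries; training on a custom
# dataset typically means overriding a handful of keys from a base config.

base_cfg = {
    # Representative placeholder values, not a real base config.
    "model": {"type": "EncoderDecoder", "decode_head": {"num_classes": 19}},
    "data": {"train": {"type": "CityscapesDataset", "data_root": "data/cityscapes"}},
}

def override(cfg: dict, updates: dict) -> dict:
    """Recursively merge `updates` into a copy of `cfg` (mimics config inheritance)."""
    merged = dict(cfg)
    for key, value in updates.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = override(merged[key], value)
        else:
            merged[key] = value
    return merged

# Adapt the base config to a hypothetical two-class custom dataset.
cfg = override(base_cfg, {
    "model": {"decode_head": {"num_classes": 2}},
    "data": {"train": {"type": "MyCustomDataset", "data_root": "data/my_dataset"}},
})
```

The recursive merge is what makes dictionary configs attractive: a custom-dataset config only states the keys that differ from the base.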
Our framework also uses configuration as the core concept for controlling machine learning tasks. However, our configuration files are much closer to the actual code and are written in YAML. Also, we have a way to point to the particular class we want to use, while MMSegmentation relies on a Registry.
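The difference between the two lookup mechanisms can be sketched in a few lines. Both implementations below are deliberately minimal stand-ins (the real MMSegmentation `Registry` has more features), meant only to contrast a registry keyed by string names with a dotted-import-path lookup of the kind our YAML configs use.

```python
import importlib

# 1) MMSegmentation-style lookup (simplified): classes register themselves
#    in a Registry under a string key, and configs refer to that key.
class Registry:
    def __init__(self):
        self._modules = {}

    def register(self, cls):
        self._modules[cls.__name__] = cls
        return cls

    def build(self, cfg: dict):
        cfg = dict(cfg)
        cls = self._modules[cfg.pop("type")]  # "type" selects the class
        return cls(**cfg)                     # remaining keys are kwargs

DATASETS = Registry()

@DATASETS.register
class ToyDataset:
    def __init__(self, split="train"):
        self.split = split

ds = DATASETS.build({"type": "ToyDataset", "split": "val"})

# 2) Our-style lookup: the YAML config stores a dotted import path,
#    which is resolved directly without any registration step.
def resolve(path: str):
    module_name, _, attr = path.rpartition(".")
    return getattr(importlib.import_module(module_name), attr)

OrderedDict = resolve("collections.OrderedDict")  # e.g. a class named in a YAML file
```

The practical consequence: a Registry only knows classes that were imported and registered at startup, whereas a dotted path can name any importable class, which is one reason the two config systems do not compose trivially.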
Having provided an overview of both approaches, I now proceed to the limitations of integrating them.
Both frameworks are standalone solutions for solving various machine learning tasks, and their size and fundamentally different approaches make it difficult to get them to work together.
It is not an easy task to install and use this library. I set up an environment several times, and only a few of those attempts were successful. Overall, to install and use the library, the following steps should be taken:
There are two ways to install `mmcv-full`:
1. Determine the `torch` and CUDA versions first. Using this information, you retrieve the proper link for `pip` from here. So, if you change the version of either `torch` or CUDA, the link also has to be changed.
2. Install `mim` and then install `mmcv-full` through it. The `mim` package is used anyway, but the major drawback is that I see no way to automate installation through `mim` the way it is done with `poetry` or `pip`.

Then, we need to install `mmsegmentation`. The preferable way to do it is using `pip`:

Another way is to install it from source (refer to this).

In conclusion, installation is not pretty and requires manual work (either retrieving the proper link for `pip`, or using `mim`).
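The version-dependent link from the first installation option can be derived programmatically. The sketch below assumes the find-links URL pattern documented by mmcv at the time of writing (`.../mmcv/dist/<cuda>/<torch>/index.html`); the pattern may change between releases, so treat it as an assumption rather than a stable API.

```python
from typing import Optional

# Sketch: derive the pip find-links URL for mmcv-full from the installed
# torch/CUDA versions. The URL pattern is assumed from the mmcv installation
# docs and may change between releases.

def mmcv_find_links(torch_version: str, cuda_version: Optional[str]) -> str:
    # e.g. torch "1.10.2" -> "torch1.10"; CUDA "11.3" -> "cu113"; no CUDA -> "cpu"
    torch_part = "torch" + ".".join(torch_version.split(".")[:2])
    cuda_part = "cu" + cuda_version.replace(".", "") if cuda_version else "cpu"
    return f"https://download.openmmlab.com/mmcv/dist/{cuda_part}/{torch_part}/index.html"

# Usage: pip install mmcv-full -f <returned URL>
url = mmcv_find_links("1.10.2", "11.3")
```

A helper like this would at least remove the manual step of rebuilding the link whenever the `torch` or CUDA version changes, though it does not solve the `mim` automation problem.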
Another limitation is the format of datasets used in mmsegmentation. The whole data flow is orchestrated with configuration files, including the dataset class and the image/annotation format. After extensive research, I found a function `build_dataloader` that accepts a "PyTorch dataset", as stated in the documentation. However, a problem still arises during training when our `Dataset` object is supplied. I do not understand why they mention a PyTorch dataset when the training loop actually expects a `CustomDataset` (source). The problem is traced as follows: the `__getitem__` method returns the result of the `prepare_train_img` function.

My proposition is to create a wrapper, or an entirely separate class, for working with segmentation data specifically for MMSegmentation. I think this will lead to a more stable codebase and easier OpenMMLab integrations.
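The proposed wrapper could look roughly like the sketch below. This is not the real interface: MMSegmentation's `CustomDataset` also involves annotation loading, transform pipelines, and evaluation, and `OurDataset` here is a toy stand-in for our framework's `Dataset`. The sketch only mimics the `__getitem__` → `prepare_train_img` path traced above.

```python
# Rough sketch of the proposed adapter (illustrative; the real CustomDataset
# interface in mmsegmentation is richer than what is mimicked here).

class OurDataset:
    """Toy stand-in for our framework's Dataset: holds (image, mask) samples."""
    def __init__(self, samples):
        self.samples = samples

    def __len__(self):
        return len(self.samples)

    def __getitem__(self, idx):
        return self.samples[idx]

class MMSegDatasetAdapter:
    """Adapts OurDataset to a CustomDataset-style training interface,
    where __getitem__ must return the output of prepare_train_img."""
    def __init__(self, dataset):
        self.dataset = dataset

    def __len__(self):
        return len(self.dataset)

    def prepare_train_img(self, idx):
        image, mask = self.dataset[idx]
        # The real pipeline would apply transforms and wrap tensors for the
        # training loop; here we only reshape into the expected dict layout.
        return {"img": image, "gt_semantic_seg": mask}

    def __getitem__(self, idx):
        return self.prepare_train_img(idx)

adapter = MMSegDatasetAdapter(OurDataset([("img0", "mask0")]))
sample = adapter[0]
```

Keeping the adaptation in one dedicated class would localize all MMSegmentation-specific assumptions, so future OpenMMLab integrations would only need to touch this wrapper rather than our core `Dataset`.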
Both frameworks use their own approach to managing configurations and building datasets. The only way I see to integrate them is to write an adapter interface dedicated specifically to the mmsegmentation classes. However, this task requires a clear understanding of segmentation as a process and broad knowledge of the project's whole codebase. Working alone, I would spend a significant amount of time incorporating these two solutions.