# ApeerML (PFEE)

## First meeting

1. Understand the current watershed approach in the code
2. Read the state of the art and review it
3. Prototyping and testing
4. Write the software to integrate it

Goal for the next meeting: a summary of how the problem is currently handled and a short literature review. Use Google Scholar.

Parallel marker-based image segmentation with watershed transformation ([link](https://d1wqtxts1xzle7.cloudfront.net/30803553/parallel-marker-based.pdf?1362353716=&response-content-disposition=inline%3B+filename%3DParallel_marker_based_image_segmentation.pdf&Expires=1621428910&Signature=MnD6qhqDV~-hJv0tXzGbsO4F6M6IJSDhsz5CYbtMKohuTzTLmgGAwHOM9vLPDIEWrPR0VicTpCL-PO8YQO1YMVuF-s6JSZHD4Iv7Cchok8hDKAwW4Mi6R5qpCGFB3ZEv1KjVy8rYXAanzgJ-K6M3WLNJ0MRrBpjZCmYlIOMxyp8bND~0jnhWezBFEcpKEDHT~NA8L7mUAVI5Drlx1Ad~KEePC3M7hrlnWPqbdjPbjE9NQ7dMFKX2c9AYUkmtY04wKxRLet6lH28tfnmWPFyDoDz4OBTE3IvzUH2z-K0pZn1NbbfSUDDgIXhPGB5g~t0zumiCEhoKlctBu1Yaa9DJng__&Key-Pair-Id=APKAJLOHF5GGSLRBV4ZA))

## Until second meeting

### Questions

* Is the watershed applied to the predictions? It seems so.
* Why do they apply a watershed to the predictions?

### Understanding of the current postprocessing pipeline

1. **Make predictions**. 1 channel per class -> get masks
2. **Remove small blobs** -> get clean masks
3. **Apply watershed transformation** (using the clean masks) -> get output
   1. Compute markers with connected components
      ```python
      import SimpleITK as sitk
      markers = sitk.ConnectedComponent(objects_mask_image)
      ```
   2. Compute the signed distance map of the mask: for each pixel, the distance to the nearest object boundary (negative inside objects, since `insideIsPositive=False`)
      ```python
      # Euclidean (not squared) distances, measured in pixels rather than physical spacing
      distances = sitk.SignedMaurerDistanceMap(
          objects_mask_image,
          insideIsPositive=False,
          squaredDistance=False,
          useImageSpacing=False)
      ```
   3. Free as much memory as possible
      ```python
      # watershed is a memory-heavy operation, hence the cleanup
      del mask
      del objects_mask_image
      ```
   4. Compute **watershed separation**
      ```python
      # Flood the distance map from the markers; no watershed line is drawn between regions
      objects_labeled = sitk.GetArrayFromImage(
          sitk.MorphologicalWatershedFromMarkers(
              distances,
              markers,
              markWatershedLine=False,
              fullyConnected=False)
      ).astype('uint16')
      ```

The watershed is computed from markers. Since each marker seeds exactly one region, this prevents over-segmentation.

Further documentation:

* [Notebook Watershed](http://insightsoftwareconsortium.github.io/SimpleITK-Notebooks/Python_html/32_Watersheds_Segmentation.html)
* [SimpleITK documentation](https://simpleitk.org/doxygen/latest/html/classitk_1_1simple_1_1MorphologicalWatershedFromMarkersImageFilter.html#details)
* [OpenCV](https://docs.opencv.org/master/d3/db4/tutorial_py_watershed.html)
* [Image Segmentation Using Gray-Scale Morphology and Marker-Controlled Watershed Transformation](https://downloads.hindawi.com/journals/ddns/2008/384346.pdf)

### Questions

* Why use SimpleITK rather than OpenCV or scikit-image?
* We searched for a memory bottleneck in the watershed algorithm but didn't find one (only O(n) memory complexity). Is the memory optimisation only needed when images become very big, e.g. 20M pixels?
* Can we have a very big image?
* Can we have a 3D/volumetric image?
* We don't have access to the paper cited in the README!
* Why compete for pixels (`pred_multichannel`)? Why compute a single watershed rather than one watershed per class?
* Is there an error in the class separation during the post-watershed step?

### Tasks

* Benchmark: Eliaz
* Implementation paper: Johan
* Find another paper: Valentin
* Notebook (show step by step): Ilan

# Towards A Parallel Topological Watershed

GitHub: https://github.com/d-montenegro/Topological-Watershed

## Principle

A topological watershed is a watershed that returns a grayscale image rather than the black-and-white image produced by a classical watershed.
This image preserves one property in particular: **any pixel of the original image that is *$k$-connected* to another pixel remains *$k$-connected* in the resulting image.**

A pixel $p$ is *$k$-connected* to a pixel $q$ if:

- there exists a path from $p$ to $q$ whose maximum pixel value is $k-1$
- there is no path from $p$ to $q$ whose maximum pixel value is below $k-1$
- $p < k$ and $q < k$ (comparing pixel values)

In other words, $p$ and $q$ are two pixels of different basins, separated by at least one pixel of value $k-1$.

A $W$-destruction is an operation applied to a pixel $x$ that lowers the value of $x$ by $1$ while preserving the connection properties ($k$-connection).

A topological watershed is *simply* the application of $W$-destructions over the whole image until no pixel is $W$-destructible any more.

For this implementation to be fast, each pixel must be lowered not by $1$ at each step but by the maximum possible amount in one go. This relies on the MinTree of the image, whose LCA (lowest common ancestor) at a given point gives this value. The MinTree computation can be parallelized and the LCA computation optimized (see the paper, which gives references to other papers).

The watershed can be parallelized because the $W$-destruction algorithm is local (it only needs the LCA, which can be precomputed). Problems arise at tile borders, because a pixel on one tile may belong to a basin whose local minimum lies on another tile. To handle this, each tile has its own priority queue storing the pixels to process. When a pixel is modified, its neighbours located on adjacent tiles are added to the priority queue.

**The priority of a pixel in the priority queue is the level to which it can be lowered.**
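
To make the $W$-destruction loop described above concrete, here is a deliberately naive sketch in pure Python. It lowers pixels by $1$ at a time, without any MinTree or LCA, so it illustrates the definition rather than the fast algorithm. It assumes 4-connectivity and the usual characterization from the topological watershed literature, which these notes do not spell out: a pixel is $W$-destructible when it is adjacent to exactly one connected component of the set of strictly lower pixels. All function names are hypothetical.

```python
from collections import deque


def neighbors4(r, c, height, width):
    """Yield the 4-connected neighbours of (r, c) that lie inside the image."""
    for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        nr, nc = r + dr, c + dc
        if 0 <= nr < height and 0 <= nc < width:
            yield nr, nc


def label_strictly_lower(img, k):
    """Label the 4-connected components of {p : img[p] < k} with a BFS."""
    height, width = len(img), len(img[0])
    labels = {}
    next_label = 0
    for r in range(height):
        for c in range(width):
            if img[r][c] >= k or (r, c) in labels:
                continue
            labels[(r, c)] = next_label
            queue = deque([(r, c)])
            while queue:
                cr, cc = queue.popleft()
                for n in neighbors4(cr, cc, height, width):
                    if img[n[0]][n[1]] < k and n not in labels:
                        labels[n] = next_label
                        queue.append(n)
            next_label += 1
    return labels


def is_w_destructible(img, r, c):
    """Assumed criterion: exactly one strictly lower component touches (r, c)."""
    labels = label_strictly_lower(img, img[r][c])
    touched = {labels[n]
               for n in neighbors4(r, c, len(img), len(img[0]))
               if n in labels}
    return len(touched) == 1


def naive_topological_watershed(img):
    """Apply W-destructions (one level at a time) until none is possible."""
    out = [list(row) for row in img]
    changed = True
    while changed:
        changed = False
        for r in range(len(out)):
            for c in range(len(out[0])):
                if is_w_destructible(out, r, c):
                    out[r][c] -= 1
                    changed = True
    return out


# Two basins (minima 1 and 0) separated by a crest of value 6.
result = naive_topological_watershed([[1, 4, 6, 3, 0]])
print(result)  # [[1, 1, 6, 0, 0]]
```

In this one-row example each basin is flattened to the value of its minimum, while the crest pixel keeps its value 6 because lowering it would merge the two basins. Recomputing the components for every pixel is very slow; avoiding exactly this cost is what the MinTree and LCA are for.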
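
The per-tile priority queue described above could be organised roughly as follows. This is only a data-structure sketch under stated assumptions: square tiles of a hypothetical side `TILE`, 4-connectivity, each neighbour pushed to the queue of the tile it lives on, and the lowest target level popped first (the default `heapq` order). The callback `target_level_of` stands in for the LCA lookup and is hypothetical, as are all other names.

```python
import heapq
from collections import defaultdict

TILE = 4  # hypothetical tile side length, in pixels


def tile_of(pixel):
    """Return the (row, col) index of the tile containing `pixel`."""
    r, c = pixel
    return (r // TILE, c // TILE)


def neighbors4(pixel, height, width):
    """Yield the 4-connected neighbours of `pixel` inside the image."""
    r, c = pixel
    for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        nr, nc = r + dr, c + dc
        if 0 <= nr < height and 0 <= nc < width:
            yield (nr, nc)


class TileQueues:
    """One min-heap per tile; entries are (target_level, pixel)."""

    def __init__(self):
        self.heaps = defaultdict(list)

    def push(self, pixel, target_level):
        heapq.heappush(self.heaps[tile_of(pixel)], (target_level, pixel))

    def pop(self, tile):
        target_level, pixel = heapq.heappop(self.heaps[tile])
        return pixel, target_level


def notify_modified(queues, pixel, height, width, target_level_of):
    """After `pixel` was lowered, queue its neighbours that sit on another tile."""
    for n in neighbors4(pixel, height, width):
        if tile_of(n) != tile_of(pixel):
            queues.push(n, target_level_of(n))


# Pixel (3, 3) sits in the corner of tile (0, 0) of an 8x8 image:
# its neighbours (4, 3) and (3, 4) belong to tiles (1, 0) and (0, 1).
queues = TileQueues()
levels = {(4, 3): 5, (3, 4): 2}  # made-up target levels for the example
notify_modified(queues, (3, 3), 8, 8, levels.get)
```

Each worker would then drain only the heap of its own tile, so cross-tile effects travel through these queues rather than through shared pixel writes.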