# Paper Reproduction - NODEO: A Neural Ordinary Differential Equation Based Optimization Framework for Deformable Image Registration
This reproducibility project was conducted by students from TU Delft as part of the course CS4240 Deep Learning (Q3 2023).
Victoria Leskoschek | 5701627 | v.leskoschek@student.tudelft.nl
- 3D OASIS Reproduction
- 2D OASIS algorithm variant
Ariel Ebersberger | 4874951 | a.ebersberger@student.tudelft.nl
- 3D OASIS Reproduction
- 2D Toy Dataset algorithm variant
The aim of the project was to reproduce the results of the paper *NODEO: A Neural Ordinary Differential Equation Based Optimization Framework for Deformable Image Registration* by Wu et al., based on the source code provided by the authors.
Links to the original NODEO paper:
- [Paper](https://arxiv.org/pdf/2108.03443.pdf)
- [Source Code](https://github.com/yifannnwu/NODEO-DIR)
- [Project Website](https://yifannnwu.com/NODEO-DIR/)
## Useful Information
You can download the OASIS datasets from [medical-datasets](https://github.com/adalca/medical-datasets/blob/master/neurite-oasis.md).
The following requirements need to be installed to run the code:
```
!pip install nibabel --upgrade
!pip install torchio
```
GitHub Folder Structure:
- 3D OASIS: Reproduction of existing code for 3D OASIS dataset
- 2D OASIS: Algorithm variant for 2D OASIS dataset
- 2D Toy Dataset: Algorithm variant for 2D PNG Toy Dataset
You can also play around with our code in Google Colab:
- [3D OASIS](https://colab.research.google.com/drive/1_u26GSM9LS53HlLPgL9dJplOA69hr2mV?usp=sharing)
- [2D OASIS](https://colab.research.google.com/drive/1ZWVnnluZ_chz3IfkF0epPkUtEj5T-rYl?usp=sharing)
- [2D Toy Dataset](https://colab.research.google.com/drive/184T0Vi5YFIXsimwyVSPxceozImJyVVKa?usp=sharing)
Nifti images can be visualised using the software [Freesurfer](https://surfer.nmr.mgh.harvard.edu/).
## Introduction
The NODEO paper (Wu et al.) introduces a novel framework for Deformable Image Registration (DIR) that uses neural ordinary differential equations (NODEs) to cast the search for the transformation between an image pair as the optimization of a dynamical system. Deformable Image Registration is a technique used to align two or more images by deforming one of them (the moving image) to match the other (the fixed image): applying a deformation field to the moving image produces a warped image that aligns with the fixed image. DIR is a crucial task in the medical sector, where it is used to align different medical images, such as CT, MRI, and PET scans, for the diagnosis and monitoring of diseases.
The following figure shows the framework used in the NODEO paper: convolutional layers down-sample the data, dense layers incorporate the time information, and the output is then up-sampled and smoothed using a Gaussian kernel.

## Reproducibility Project
For our project, we focused on the following categories:
**1. Reproduction**
We reproduced the code provided on GitHub for 3D deformable image registration on the OASIS dataset.
**2. New Algorithm Variant**
Since the existing code only functions for 3D images, we modified the code for compatibility with 2D brain images. Additionally, we implemented a variant that supports PNG images, enabling us to assess the model with a self-created toy dataset based on the one provided in the paper.
**3. New Code Variant**
We refactored certain hard-coded snippets to make the code more generalisable and implemented a loop structure that iterates over multiple image pairs and computes the average dice scores. We also added more comments to the code to better explain its functionality.
### Reproduction of existing code (3D - OASIS)
First of all, we focused on reproducing the given source code from the [NODEO GitHub repo](https://github.com/yifannnwu/NODEO-DIR). We set up a Google Colab environment to integrate the codebase there. The given code implements the NODEO framework for 3D Nifti images from the OASIS dataset, and a set of two images is provided in the repository (OASIS001 and OASIS002). We managed to get the code running without any major difficulties. The following figure shows the warped image we obtained when conducting the registration on the provided image pair (ID0001 = fixed, ID0002 = moving), compared to the snippet of the results of the same registration presented in the NODEO paper (p.8, Figure 5).

Since the NODEO repo only provides two images, our second task was to inspect whether the given code also works for the OASIS 3D dataset from [medical-datasets](https://github.com/adalca/medical-datasets/blob/master/neurite-oasis.md). We used ```aligned_norm.nii.gz``` as the equivalent of the ```brain.nii.gz``` file, and ```aligned_seg35.nii.gz``` as the equivalent of the ```brain_aseg.nii.gz``` file. While the images went through largely the same preprocessing stages as the images provided by the NODEO authors, their dimensions slightly differ: the medical-datasets OASIS images are 160 x 192 x 224, while the NODEO images are 160 x 192 x 144. Executing the code on the new dataset without any changes raises an error at the first linear layer, because the number of in-features is hardcoded and depends on the initial image dimensions. While we could simply have cropped the images to the desired size, we opted for a more flexible approach and adjusted the code to dynamically calculate the number of in-features for the linear layer based on the input size:
```python
[...]
        # Calculate number of in-features dynamically from the input size
        self.in_features = self.calculate_in_features()
        # Linear layers
        self.lin1 = nn.Linear(self.in_features, self.bs, bias=bias)
        self.lin2 = nn.Linear(self.bs, self.bottleneck_sz * 3, bias=bias)
[...]
    def calculate_in_features(self):
        """
        Calculates the number of in-features for the first linear layer
        dynamically, based on the input image size.
        """
        x = torch.randn(1, 3, self.img_sz[0], self.img_sz[1], self.img_sz[2])  # dummy input tensor
        x = F.interpolate(x, scale_factor=0.5, mode='trilinear')  # optional down-sampling
        x = self.relu(self.enc_conv1(x))
        x = self.relu(self.enc_conv2(x))
        x = self.relu(self.enc_conv3(x))
        x = self.relu(self.enc_conv4(x))
        x = self.enc_conv5(x)
        return x.view(x.size(0), -1).size(1)  # flatten and return the number of features
```
While the authors mention in the paper that they carried out the registration for 200 image pairs (using IDs 1, 10, 20, 30, 40 as fixed images and the remaining ones with IDs < 50 as moving images), the code only supports the registration of one image pair per run. To facilitate the registration of multiple image pairs and the calculation of the mean dice score, we restructured the code and implemented a loop that iterates over a given set of fixed/moving image pairs:
```python
def createConfig(id_fixed, id_moving):
    """
    Create the configuration for one fixed/moving image pair.
    """
    config = types.SimpleNamespace()
    config.id_fixed = id_fixed    # ID of fixed image
    config.id_moving = id_moving  # ID of moving image
    config.affine = None  # affine transformation matrix (assigned later, after the image has been loaded)
    # File paths
    config.fixed = 'data/3D/OASIS_OAS1_0' + id_fixed + '_MR1/aligned_norm.nii.gz'
    config.moving = 'data/3D/OASIS_OAS1_0' + id_moving + '_MR1/aligned_norm.nii.gz'
    config.fixed_seg = 'data/3D/OASIS_OAS1_0' + id_fixed + '_MR1/aligned_seg35.nii.gz'
    config.moving_seg = 'data/3D/OASIS_OAS1_0' + id_moving + '_MR1/aligned_seg35.nii.gz'
    config.savepath = 'result3d/' + config.id_fixed + config.id_moving
    [...]
    return config

if __name__ == '__main__':
    # Assign image IDs
    fixed_ids = [1, 10, 20, 30, 40]
    moving_ids = [2, 3, 4, 5, 6, 7, 9, 11, 12, 13, 14, 15, 16, 17, 18, 19, 21, 22, 23, 25,
                  26, 27, 28, 29, 31, 32, 33, 34, 35, 37, 38, 39, 41, 42, 43, 44, 45, 46, 47, 49]
    # For running a single image pair, adjust the code as in the following two lines:
    # fixed_ids = [1]
    # moving_ids = [2]
    dice_list = []  # initialise list to store dice scores
    for id_fixed in fixed_ids:
        for id_moving in moving_ids:
            print("-----------------------")
            print("NEW SET")
            print("Fixed Image ID: " + str(id_fixed).zfill(3))
            print("Moving Image ID: " + str(id_moving).zfill(3))
            print("-----------------------")
            print()
            config = createConfig(str(id_fixed).zfill(3), str(id_moving).zfill(3))
            dice_list.append(main(config, dice_list))

    # Calculate mean dice score and standard deviation
    mean_dice_score = np.mean(dice_list)
    squared_diff = (np.array(dice_list) - mean_dice_score) ** 2  # convert to array for element-wise subtraction
    variance = np.mean(squared_diff)
    std_deviation = np.sqrt(variance)
    print("--------------")
    print("Dice Mean Scores Array: " + str(dice_list))
    print("Overall Mean Dice Score: " + str(mean_dice_score))
    print("Standard Deviation of Mean Dice Scores: " + str(std_deviation))
    print("--------------")
```
Another issue we encountered with this adaptation was that the labels in the images provided by NODEO differed from those in the medical-datasets images. We inspected the labels of the segmentation images ```aligned_seg35.nii.gz``` (medical-datasets) and ```brain_aseg.nii.gz``` (NODEO) with the following function:
```python
def getLabels(filepath):
    """Print the unique segmentation labels contained in a NIfTI file."""
    img = nib.load(filepath)
    data = img.get_fdata()
    unique_labels = np.unique(data)
    print(unique_labels)
```
These are the labels attached to the input files:
- Labels for images provided by the NODEO paper:
[ 0. 2. 3. 4. 5. 7. 8. 10. 11. 12. 13. 14. 15. 16. 17. 18. 24. 26. 28. 30. 41. 42. 43. 44. 46. 47. 49. 50. 51. 52. 53. 54. 58. 60. 62. 72. 78. 79. 81. 82. 85.]
- Labels for images from OASIS dataset provided by medical-datasets:
[ 0. 1. 2. 3. 4. 5. 6. 7. 8. 9. 10. 11. 12. 13. 14. 15. 16. 17. 18. 19. 20. 21. 22. 23. 24. 25. 26. 27. 28. 29. 30. 31. 32. 33. 34. 35.]
While a total of 41 labels was found in the NODEO images, the authors seem to use only a selection of 28 labels in the code:
```python
label = [2, 3, 4, 7, 8, 10, 11, 12, 13, 14, 15, 16, 17, 18, 24, 28, 41, 42, 43, 46, 47, 49, 50, 51, 52, 53, 54, 60]
```
However, it is unclear which anatomical structures these labels refer to, since a label.txt file is missing; consequently, it is also not comprehensible how the authors arrived at this subselection. This poses a problem for an accurate reproduction: if we apply the original label list from the NODEO code to the medical-datasets OASIS segmentation images, their inherent labels will not fully match the NODEO label list, and the resulting dice score only takes a fraction of the actually present labels into account. Taken on its own, the most sensible choice would be to use all 35 labels (1-35) of the ```aligned_seg35.nii.gz``` image, thereby including slightly more structures than the original NODEO paper. However, if we want to compare our dice scores to those of the NODEO paper, the anatomical structures represented by the labels should match as closely as possible. We therefore cross-referenced the label names in the ```seg35_labels.txt``` file belonging to ```aligned_seg35.nii.gz``` with the anatomical structures listed in Figure 7, p.12 of the NODEO paper, and arrived at a selection of 27 labels whose names corresponded to the structures in the figure. One structure, 'CFR', could not be found in the 35-label set of the dataset.
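To make the dependence on the chosen label set concrete, the Dice computation can be sketched as follows (a simplified helper of our own, not the NODEO implementation). Labels that appear in neither segmentation contribute nothing, which is why the subselection directly affects the score:

```python
import numpy as np

# Sketch (our own helper, not from the NODEO code): mean Dice score over a
# chosen subset of segmentation labels.
def mean_dice(seg_fixed, seg_warped, labels):
    scores = []
    for lab in labels:
        a = (seg_fixed == lab)
        b = (seg_warped == lab)
        denom = a.sum() + b.sum()
        if denom == 0:
            continue  # label absent in both segmentations -> skipped entirely
        scores.append(2.0 * np.logical_and(a, b).sum() / denom)
    return float(np.mean(scores))

seg = np.array([[1, 1, 2], [2, 3, 3]])
print(mean_dice(seg, seg, labels=[1, 2, 3]))  # identical segmentations -> 1.0
```

Passing a label list that overlaps only partially with the labels actually present in the images means the score is computed over fewer structures, which is exactly the situation described above.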
Running the code over 200 image pairs exceeded the computational resources of Google Colab. Therefore, we only calculated the mean dice scores for 10 image pairs as an example (ID 1 as fixed image; IDs 2, 3, 4, 5, 6, 7, 9, 11, 12, 13 as moving images).
Depending on which labels we chose, we received different dice scores. Using the same labels as in the NODEO paper (which overlap least with the labels actually present), we received a mean dice score in a similar range to the paper:
*Mean Dice Score*: 0.7859361845273766
*Standard Deviation*: 0.0019418613462957035
With the selection of 27 labels, we reached an even higher dice score:
*Mean Dice Score*: 0.8805092460945844
*Standard Deviation*: 0.003485712761554620
With the full set of 35 labels, we also reached a higher dice score:
*Mean Dice Score*: 0.8920384782747582
*Standard Deviation*: 0.0023458283282384848
We cannot fully explain the higher dice scores, but they are most likely due to different anatomical structures being taken into account by the respective label sets.
Another aspect we had to adjust was the handling of the affine transformation matrix of the images. In the save_nii function, the original code applied a hardcoded identity affine matrix:
```python
def save_nii(img, savename):
    affine = np.diag([1, 1, 1, 1])  # hardcoded identity affine
    new_img = nib.nifti1.Nifti1Image(img, affine, header=None)
    nib.save(new_img, savename)
```
However, this affine matrix was hardcoded specifically for the image set provided by the paper and does not match that of the medical-datasets OASIS images. We therefore implemented slightly different load_nii and save_nii functions that assign the affine matrix dynamically:
```python
def load_nii(path):
    """
    Load a NIfTI file from disk and extract the data array and affine matrix.
    """
    X = nib.load(path)
    affine = X.affine
    X = X.get_fdata()
    return X, affine

def save_nii(img, savename, affine):
    """
    Save a NIfTI file to disk, using the affine matrix of the source image.
    """
    new_img = nib.nifti1.Nifti1Image(img, affine, header=None)
    nib.save(new_img, savename)
```
### Algorithm variant for 2D - OASIS
The paper also presents results of the NODEO framework applied to the 2D version of the OASIS dataset, but the provided code only supports 3D images. We therefore made a few adjustments to make it applicable to 2D Nifti images; the detailed changes can be found in the corresponding 2D-OASIS folder. For the 2D code version, we also added a code file that plots the results. We reproduced Figure 4 from the NODEO paper:
**Reproduction**

**NODEO Paper**

### Algorithm variant for 2D - PNG Toy dataset
To verify our code on a toy dataset, we recreated the images used for the illustrative examples shown in the paper and adjusted the code so that it works for 2D black & white PNG images. The PNGs were created using Adobe Illustrator and are based on the toy data from the NODEO paper. With our code adjustments, we managed to obtain similar results as shown in Figure 1, p.1 of the paper.
**Reproduction**

**NODEO Paper**

The visualisation of the deformation field seems to be slightly distorted, but we could not identify the reason behind it.
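For reference, loading a black & white PNG into the tensor shape the pipeline expects could look as follows (a sketch with a hypothetical file path; the actual loading code lives in the 2D Toy Dataset folder):

```python
import numpy as np
import torch
from PIL import Image

def load_png(path):
    """Load a black & white PNG as a normalised tensor of shape 1 x 1 x H x W."""
    img = Image.open(path).convert('L')              # force single-channel grayscale
    arr = np.asarray(img, dtype=np.float32) / 255.0  # scale intensities to [0, 1]
    return torch.from_numpy(arr)[None, None]         # add batch and channel dims

# e.g. fixed = load_png('data/toy/fixed.png')  # hypothetical path
```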
## Conclusion
In general, the reproduction of the paper was unexpectedly difficult. Even though the source code was available, it was only sparsely documented and proper coding best practices were not followed. We were able to reconstruct much of the missing information we felt was necessary for a complete reproduction, though some details, such as the choice of labels and their origin, remain a mystery. The source code also only allowed the reproduction of the 3D variant, even though the paper discusses a 2D variant as well, which we were able to reconstruct in the end. The 3D variant was seemingly developed for one very specific, modified OASIS dataset that cannot be found online, and the modifications are not outlined in the paper in enough detail. We therefore had to generalise the code before we could reproduce the results with the OASIS dataset available to us.
In the end, however, we managed to obtain results quite similar to those of the paper, albeit with some deviations in the dice scores and the visualisations of the deformation field. The project showed us how important it is to document one's code clearly and to provide detailed insights into how the results of a paper were obtained, ensuring that they are reproducible by other researchers.
## References
NODEO: A Neural Ordinary Differential Equation Based Optimization Framework for Deformable Image Registration
```
@inproceedings{wu2022nodeo,
  title={Nodeo: A neural ordinary differential equation based optimization framework for deformable image registration},
  author={Wu, Yifan and Jiahao, Tom Z and Wang, Jiancong and Yushkevich, Paul A and Hsieh, M Ani and Gee, James C},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages={20804--20813},
  year={2022}
}
```