# Neural ordinary differential equations for image registration
Contributors:
* Gianna Eeken (4729234)
* Hanna Post (5607345)
* Nigel Jansen (4501470)
## **1. Introduction**
This blog describes the process of reproducing the paper
[“NODEO: A Neural Ordinary Differential Equation Based Optimization Framework for Deformable Image Registration” (Wu, Jiahao, et al., 2023)](https://arxiv.org/abs/2108.03443).
This was done for the course Deep Learning at Delft University of Technology.
This blog complements our GitHub repository, which contains our reproduction code and a ReadMe. The repository can be found [here](https://github.com/DonPi-Boxer/NODEO-Batch-Registrate).
In addition to the reproduction process itself (methods, results, discussion), we provide some background intuition for understanding the paper. We believe explanation should accompany the code, and appear within the code itself: rather too much than too little. This post therefore explains some things that readers may find superfluous.
### **1.1 Image registration**
Image registration is the process of aligning two images in the same coordinate system. It is very important in the medical field, where the two images can come from different modalities (e.g. CT and PET), from different times (as in the follow-up of cancer treatment), or from two different patients altogether. One image (the moving image) is mapped onto the other (the fixed image); the transformed image is called the warped image. See figure 1 for a visual explanation.

Fig. 1. Source: adapted from NODEO paper and Cao, Yiqin & Zhu, Zhenyu & Rao, Yi & Chenchen, Qin & Lin, Di & Dou, Qi & Ni, Dong & Wang, Yi. (2021). Edge-Aware Pyramidal Deformable Network for Unsupervised Registration of Brain MR Images. Frontiers in Neuroscience. 14. 10.3389/fnins.2020.620235.
Deformable image registration (DIR) covers all transformations that go beyond simple linear ones, such as translations or rotations. Figure 2 gives a clear overview of possible transformation techniques. This paper performs DIR using neural ordinary differential equations.

Fig. 2. Source: Uchida, Seiichi. (2013). Image processing and recognition for biological images. Development, growth & differentiation. 55. 10.1111/dgd.12054.
### **1.2 Neural ODEs**
Neural ordinary differential equations (neural ODEs) were first introduced by [Chen et al. in 2018](https://proceedings.neurips.cc/paper_files/paper/2018/file/69386f6bb1dfed68692a24c8686939b9-Paper.pdf). They form a family of deep neural network models built around ODE solvers. Recall that an ordinary differential equation is of the form:
$$ {dy \over dx} = p(x).$$
It contains one independent variable and its derivative. Solving the ODE means finding the unknown function $y(x)$ using the relation given in the differential equation and the initial condition $y(0) = c$, with c some value.
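To make this concrete, a simple ODE of this form can be solved numerically with the forward Euler method. This is a minimal sketch for intuition only; the choices of $p$, the initial condition, and the step count here are illustrative and not taken from the paper.

```python
# Forward-Euler sketch: solve dy/dx = p(x) with initial condition y(0) = c.
def solve_ode(p, c, x_end, n_steps=1000):
    """Numerically integrate dy/dx = p(x) from x = 0 to x = x_end."""
    h = x_end / n_steps
    x, y = 0.0, c
    for _ in range(n_steps):
        y += h * p(x)  # Euler update: follow the derivative for one small step
        x += h
    return y

# Example: dy/dx = 2x with y(0) = 1 has the exact solution y = x^2 + 1,
# so y(2) = 5; the numerical answer approaches this as n_steps grows.
print(solve_ode(lambda x: 2 * x, c=1.0, x_end=2.0))  # ≈ 5.0
```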
In a neural ODE, a neural network parameterizes the right-hand side of an ODE, i.e. the derivative of the system state. Instead of computing the output through a fixed sequence of discrete layers, the model takes the initial condition of the system and a time interval as input, and an ODE solver integrates the network-defined dynamics to obtain the state at the end of that interval.
The advantage of a neural ODE is that it lets us model complex systems that evolve continuously over time, while still using the approximation power of neural networks. Of course, the function being solved for usually describes a situation of interest. For example, one can model the evolution of the features of an image over time and use the final state of the ODE as a representation of the image for classification: the neural ODE takes the image as input, the network-defined differential equation models how its features evolve, and the ODE solver computes the final state, which captures the image's important features. Something similar happens when neural ODEs are applied to DIR: we input a moving and a fixed image, and the solution of the differential equation yields the deformation field (see figure 3).

Fig. 3
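The idea above can be sketched in a few lines: a small network defines the dynamics, and a fixed-step solver integrates the state. This is a toy example with a single tanh layer as the dynamics and forward Euler as the solver; it is not the paper's architecture or solver.

```python
import numpy as np

# Toy neural ODE: the "network" defines the dynamics dz/dt = tanh(W z),
# and an Euler solver integrates the state from t = 0 to t = T.
# W and the solver settings are illustrative, not taken from NODEO.
rng = np.random.default_rng(0)
W = rng.normal(scale=0.5, size=(4, 4))  # the network's only parameter

def dynamics(z):
    """A one-layer 'network' giving the time derivative of the state."""
    return np.tanh(W @ z)

def odeint_euler(z0, T=1.0, n_steps=100):
    """Integrate dz/dt = dynamics(z); the final state is the model output."""
    z, h = z0.astype(float), T / n_steps
    for _ in range(n_steps):
        z = z + h * dynamics(z)
    return z

z0 = np.ones(4)        # initial condition = model input
zT = odeint_euler(z0)  # model output = state at the end of the interval
```

In practice a dedicated solver library would replace the Euler loop, and the parameters of the dynamics network would be trained by backpropagating through (or around) the solver.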
### **1.3 Neural ODEs for image registration**
To explain what happens, we first give a general description of supervised training of a neural network, using figure 4. One first defines the network architecture and the loss function to be minimized (entire research fields in themselves). Training then starts with the initialization of the parameters, which lets the network generate its first prediction. This output is compared with the ground-truth labels through the loss function, and the resulting error is used to update the parameters (for example via backpropagation). After a number of iterations the loss is sufficiently small, and the state of the parameters is accepted as the final network.

Fig. 4
A neural network used for DIR aims to find a deformation field that adequately maps the moving image onto the fixed image. The output of the network is thus the deformation field and, by extension, the warped image. This is a per-pair optimization process: the deformation field is unique to this pair of images, and for another pair the network has to be trained again.

Fig. 5
## **2. Project goals**
In terms of reproducibility, our main goals were to reproduce figure 4 and table 1 of the original paper. In summary, we have done the following:
* Reproduce figure 4 of the paper
* Reproduce table 1 of the paper
* Add the possibility for 2D registration
* Allow for batch registration tasks to be performed in 3D and 2D
* Write a more detailed ReadMe file about how to use the repository
* Make the registration task possible for input images of various sizes
### **2.1 Figure 4**
The authors of the NODEO paper performed an ablation study to analyze the effects of the regularization terms. In the paper, figure 4 displays the effect of Gaussian smoothing and the $L_{Jdet}$ regularizer on 2D images. This original result from the paper is shown below in figure 6. Here, $J$ and $I$ are the moving and fixed images respectively. The rows show
* $(J_ψ)$ warped moving images,
* $(ψ)$ grid visualization of the deformation field,
* $(|D_ψ|)$ Jacobian determinants of the transformation $ψ$, and
* $(Neg)$ the regions with negative Jacobian determinants.
The columns show the registration of the images
* (a) with both $L_{Jdet}$ regularization and Gaussian smoothing,
* (b) without $L_{Jdet}$ regularization,
* (c) without Gaussian smoothing, and
* (d) with neither.

Fig. 6
Since figure 4 shows registration results of 2D images and the code of the original repository only supports registration in 3D, we had to rewrite the original code into 2D to perform the reproducibility check for figure 4. Since we thought 2D registration would be a valuable addition to the original repository, we have decided to keep this as a feature. Thus, with our forked repository, the user is able to perform 3D registration as well as 2D registration.
### **2.2 Table 1**
Additionally, the authors performed a quantitative study of their network in order to compare its performance to several state-of-the-art learning-based methods. They compared their method to SYMnet and pairwise optimization-based methods including SyN, NiftyReg, and Log-Demons.

Table 1. Table 1 from the original paper
Table 1 above shows the quantitative results of different image registration methods, using three quantitative measures. ‘Avg. Dice (28)’ stands for the average Dice score over 28 structures. The average Dice is a measure of the success of an image registration task: it compares the positions of the deformed segmentations of the moving image to the positions of the segmentations of the fixed image. The Dice score for one segmentation region equals:
$$ \text{Dice score} = {2\,TP \over 2\,TP + FP + FN}.$$
Where
* $TP$ = True Positives, the pixels where both the deformed segmentation and the fixed segmentation are present
* $FP$ = False Positives, the pixels where the deformed segmentation is present but the fixed segmentation is not
* $FN$ = False Negatives, the pixels where the deformed segmentation is not present but the fixed segmentation is

Note that true negatives (pixels where neither segmentation is present) do not enter the Dice score.
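For two binary segmentation masks, the Dice coefficient $2|A \cap B| / (|A| + |B|)$ can be computed directly from the pixel counts. This is a small illustrative sketch, not the evaluation code from the repository.

```python
import numpy as np

# Dice coefficient between two binary masks: 2·TP / (2·TP + FP + FN),
# equivalent to 2|A ∩ B| / (|A| + |B|).
def dice_score(warped_seg, fixed_seg):
    warped_seg = warped_seg.astype(bool)
    fixed_seg = fixed_seg.astype(bool)
    tp = np.logical_and(warped_seg, fixed_seg).sum()   # overlap
    fp = np.logical_and(warped_seg, ~fixed_seg).sum()  # only in warped
    fn = np.logical_and(~warped_seg, fixed_seg).sum()  # only in fixed
    return 2 * tp / (2 * tp + fp + fn)

a = np.array([[1, 1, 0], [0, 1, 0]])
b = np.array([[1, 0, 0], [0, 1, 1]])
print(dice_score(a, b))  # 2·2 / (2·2 + 1 + 1) ≈ 0.667
```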
By averaging the Dice score over the 28 segmentations, the average Dice score of one registration is obtained. By averaging this value over 200 different registrations, the mean and standard deviation of the average Dice are obtained.
The second column of table 1 shows the ratio of negative Jacobian determinant values, averaged over 200 registrations. This ratio is the percentage of pixels whose Jacobian determinant, computed from the deformation field used during registration, is negative. The third column shows the total negative Jacobian, averaged over 200 registrations; this is simply the number of pixels with a negative Jacobian determinant.
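For a 2D deformation field, these two statistics can be sketched as follows. The finite-difference convention (`np.gradient`) is an assumption for illustration; NODEO's own implementation may compute the Jacobian differently.

```python
import numpy as np

# Fraction and count of pixels with a negative Jacobian determinant, for a
# 2D deformation field phi of shape (2, H, W) holding (x, y) coordinate maps.
def neg_jacobian_stats(phi):
    # Partial derivatives of each coordinate map along the two image axes.
    dxdx = np.gradient(phi[0], axis=1)
    dxdy = np.gradient(phi[0], axis=0)
    dydx = np.gradient(phi[1], axis=1)
    dydy = np.gradient(phi[1], axis=0)
    jdet = dxdx * dydy - dxdy * dydx  # 2x2 determinant per pixel
    neg = jdet < 0                    # negative determinant = local folding
    return neg.mean(), neg.sum()      # (ratio, total count)

# The identity deformation has determinant 1 everywhere: no negative values.
ys, xs = np.mgrid[0:32, 0:32].astype(float)
ratio, total = neg_jacobian_stats(np.stack([xs, ys]))
print(ratio, total)  # 0.0 0
```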
### **2.3 Extra goals**
Table 1 of the paper was created by performing 200 registrations using NODEO, so we wrote code to perform the 3D registration in batch. As we thought it would be useful for a user to run registrations in batch, we decided to keep batch registration in our repository, for both the 2D and 3D registration tasks.
Furthermore, we noticed that the ReadMe of the original repository is extremely concise. Especially given the many arguments the original repository offers, it is a shame that these are not well documented. Therefore, another goal was to extend the ReadMe with more explanation of how to run the (batch) registration. In line with our goal of visualizing 2D deformations, the 2D batch registration saves these visuals by default; if the argument `--no-visuals` is given, they are not saved.
Lastly, the original repository hardcodes the size of the linear layer in the network, so it can only perform registration on images of size 160 x 192 x 144. We have changed the declaration of the linear layer so that its size depends on the size of the input images, meaning that batch registration can now be performed on images of varying sizes (the moving and fixed image must still be of identical size).
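The size computation we made input-dependent can be sketched as follows: each of the `ds` strided convolutions halves every spatial dimension (rounding up), so the bottleneck size is the product of the downsampled dimensions. The function name here is illustrative; see our repository for the actual declaration.

```python
import math

# Bottleneck size as a function of the input image size, instead of a
# hardcoded constant: ds strided convolutions each halve every dimension.
def bottleneck_size(img_sz, ds):
    n = 1
    for d in img_sz:
        n *= math.ceil(d / 2 ** ds)  # ceil because convolutions pad
    return n

# Works for 2D and 3D alike:
print(bottleneck_size((160, 192), ds=5))       # 5 * 6 = 30
print(bottleneck_size((160, 192, 224), ds=5))  # 5 * 6 * 7 = 210
```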
## **3. Methods**
### **3.1 Google Virtual Machine**
Especially for the 3D registration task, the availability of NVIDIA GPU drivers accessible through CUDA greatly decreases the run time. We therefore used a Google Virtual Machine to run the registration tasks. The configuration of the Google Virtual Machine is as follows:

### **3.2 Used data**
For both our reproducibility goals, we used data originating from the OASIS dataset, which is also used in the original paper. A notable difference between the original paper and our reproducibility check, however, is that the authors of the original paper performed auto-segmentation of the OASIS dataset themselves using FreeSurfer. Subsequently, they aligned all images to MNI 152 space using an affine transformation, and finally center-cropped the images to a size of 160 x 192 x 144. For our reproduction, we used the Neurite OASIS Sample Data provided by Dalca et al. This dataset is already segmented (also with FreeSurfer) and aligned (using SAMSEG), so it could be used for registration without performing segmentation and alignment ourselves. Furthermore, this dataset contains both 2D and 3D data: the 2D images have size 160x192 and the 3D images have size 160x192x224. Notably, although the 2D slices have the same size, our 3D reproducibility check therefore used images with more slices than those of the original paper.
### **3.3 Reproducing figure 4**
In order to reproduce figure 4 it was necessary to:
* Change the code from 3D to 2D
* Write code to output the specific images (different rows of figure 4)
* Change code to perform ablation studies (different columns of figure 4)
The code provided by the authors of the NODEO paper only supports registration of 3D images, so we had to change the code from 3D into 2D. This involved many small changes throughout all of the files, except neuralODE.py. We have annotated most of these changes in the code. A few examples can be seen in the code of the BrainNet(ODEF) class, shown below. Small changes are made such as changing `Conv3d` into `Conv2d` and modifying the corresponding parameters (often changing a 3 into a 2 or deleting the third dimension).
```
class BrainNet(ODEF):
    def __init__(self, img_sz, smoothing, smoothing_kernel, smoothing_win, smoothing_pass, ds, bs):
        super(BrainNet, self).__init__()
        padding_mode = 'replicate'
        bias = True
        self.smoothing = smoothing
        self.ds = ds
        self.bs = bs
        self.img_sz = img_sz
        self.smoothing_kernel = smoothing_kernel
        self.smoothing_pass = smoothing_pass
        # self.enc_conv1 = nn.Conv3d(3, 32, kernel_size=3, stride=2, padding=1, padding_mode=padding_mode, bias=bias)
        # Changed Conv3d into Conv2d and 3 --> 2
        self.enc_conv2 = nn.Conv2d(2, 32, kernel_size=3, stride=2, padding=1, padding_mode=padding_mode, bias=bias)
        self.enc_conv3 = nn.Conv2d(32, 32, kernel_size=3, stride=2, padding=1, padding_mode=padding_mode, bias=bias)
        self.enc_conv4 = nn.Conv2d(32, 32, kernel_size=3, stride=2, padding=1, padding_mode=padding_mode, bias=bias)
        self.enc_conv5 = nn.Conv2d(32, 32, kernel_size=3, stride=2, padding=1, padding_mode=padding_mode, bias=bias)
        self.enc_conv6 = nn.Conv2d(32, 32, kernel_size=3, stride=2, padding=1, padding_mode=padding_mode, bias=bias)
        self.bottleneck_sz = int(
            math.ceil(img_sz[0] / pow(2, self.ds)) * math.ceil(img_sz[1] / pow(2, self.ds)))  # was: an extra img_sz[2] factor
        # The declaration of self.lin1 is changed w.r.t. the original repository, where the
        # size of the linear layer was hardcoded to 864, meaning the code could only run on
        # images of one size. This change lets it run on images of other sizes as well.
        self.lin1 = nn.Linear(int(img_sz[0] * img_sz[1] / 32), self.bs, bias=bias)
        self.lin2 = nn.Linear(self.bs, self.bottleneck_sz * 2, bias=bias)  # 2 for 2D
        self.relu = nn.ReLU()
        # Create smoothing kernels
        if self.smoothing_kernel == 'AK':
            self.sk = AveragingKernel(win=smoothing_win)
        else:
            self.sk = GaussianKernel(win=smoothing_win, nsig=0.1)
```
To output the images and deformation fields of figure 4, we have written two functions, `save_image` and `save_grid`, using the matplotlib library. The functions are defined in the save.py file. The images of the figure were created by plotting the output variables fixed_mri, moving_mri, df_with_grid, warped_moving, jdet, and neg_Jdet using these functions; for this, the variables were first converted to numpy arrays. The code to visualise the images was inserted in Registration.py. See this file in our repository for the full code.
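The two helpers can be sketched as follows. This is a simplified illustration of the idea; the actual implementations in save.py may differ in details such as figure size, grid spacing, and color maps.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend, so no display is needed
import matplotlib.pyplot as plt
import numpy as np

def save_image(img, path, cmap="gray"):
    """Save a 2D array as an image file, without axes or borders."""
    plt.imsave(path, img, cmap=cmap)

def save_grid(field, path, step=4):
    """Visualize a deformation field of shape (2, H, W) as a deformed grid,
    by drawing every `step`-th row and column of the coordinate maps."""
    x, y = field[0], field[1]
    fig, ax = plt.subplots()
    for i in range(0, x.shape[0], step):
        ax.plot(x[i, :], y[i, :], "k", linewidth=0.5)  # horizontal grid lines
    for j in range(0, x.shape[1], step):
        ax.plot(x[:, j], y[:, j], "k", linewidth=0.5)  # vertical grid lines
    ax.invert_yaxis()  # image convention: origin at the top left
    ax.axis("off")
    fig.savefig(path)
    plt.close(fig)
```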
The authors of the original paper did not mention which subject of the OASIS dataset they used for the creation of figure 4. We used the first (OASIS_OAS1_0001_MR1) and second (OASIS_OAS1_0002_MR1) one as moving and fixed images respectively.
To reproduce figure 4 of the paper, registration had to be run (a) with both $L_{Jdet}$ regularization and Gaussian smoothing, (b) without $L_{Jdet}$ regularization, (c) without Gaussian smoothing, and (d) with neither. An important difference between our reproduction and the original paper is that we used the averaging kernel instead of the Gaussian kernel, as recommended by our external supervisor.
The possibility to perform registration without the $L_{Jdet}$ regularizer was already present in the original repository: the code has to be run with the argument `--lambda_J 0`.
To run registration without any form of smoothing, we added an if-statement around the code section that performs the smoothing. It defaults to true, meaning smoothing is applied, but with the argument `--no-smoothing` no smoothing will be applied.
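The toggle described above can be sketched with Python's argparse: the flag name matches our repository, while the surrounding parser setup is simplified and the other arguments are omitted.

```python
import argparse

# Smoothing defaults to on; passing --no-smoothing disables it.
parser = argparse.ArgumentParser()
parser.add_argument("--no-smoothing", dest="smoothing", action="store_false",
                    help="disable the smoothing step during registration")
parser.set_defaults(smoothing=True)

args = parser.parse_args([])                  # no flag given
print(args.smoothing)                         # True
args = parser.parse_args(["--no-smoothing"])  # flag given
print(args.smoothing)                         # False
```

In the registration code, the smoothing section is then simply wrapped in `if args.smoothing:`.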
### **3.4 Reproducing table 1**
In our given assignment, the goal was to reproduce the results of table 1 of the original paper with respect to the registrations performed over the OASIS dataset. In the original paper, the five images with IDs 1, 10, 20, 30, and 40 were used as fixed images, and the remaining images with IDs < 50 as moving images. With 5 fixed images and 40 moving images, a total of 200 pairs are registered. To come as close as possible to the work of the original authors, we decided to perform these same 200 registrations.
The original repository was already fairly complete for 3D registration tasks. The only change needed to perform a single registration task on our data concerned the size of the linear layer in the network. In the original repository, this layer has a predefined, hardcoded size (864). As our 3D dataset contains more Z-slices than the dataset used by the authors, we made the size of the linear layer dependent on the size of the input images. After this was done, we were able to use the code to perform a single registration task.
In the original repository, the average Dice, the ratio of negative Jacobian values, and the total negative Jacobian were already computed and printed in the terminal after each registration.
The original repository, however, only allowed one registration task at a time. As we had to run 200 registrations to reproduce table 1, we added a Python file that performs registration over multiple images in batch. It loops over the moving and fixed images, which are stored in separate directories for their respective datasets. After each registration, the average Dice, negative Jacobian ratio, and total negative Jacobian are computed and appended to their respective arrays. Once all 200 registration tasks have been performed, the averages of these arrays are calculated.
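The bookkeeping described above can be sketched as follows. Here `register_pair` is a placeholder standing in for the actual NODEO registration call, and the file pattern and metric names are illustrative.

```python
import statistics
from pathlib import Path

def register_pair(moving, fixed):
    # Placeholder: the real function would run NODEO on this pair and
    # return the metrics printed after registration.
    return {"dice": 0.77, "neg_ratio": 0.0002, "neg_total": 60.0}

def run_batch(moving_dir, fixed_dir):
    """Register every moving image against every fixed image and
    aggregate the per-pair metrics."""
    dices, ratios, totals = [], [], []
    for fixed in sorted(Path(fixed_dir).glob("*.npz")):
        for moving in sorted(Path(moving_dir).glob("*.npz")):
            m = register_pair(moving, fixed)
            dices.append(m["dice"])
            ratios.append(m["neg_ratio"])
            totals.append(m["neg_total"])
    return {"avg_dice": statistics.mean(dices),
            "dice_std": statistics.stdev(dices),
            "avg_neg_ratio": statistics.mean(ratios),
            "avg_neg_total": statistics.mean(totals)}
```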
## **4. Results & discussion**
### **4.1 Figure 4**

Our reproduction of figure 4 looks fairly similar to the original, but not identical. Part of the differences can be ascribed to differences in visualization; the deformation fields and the black background of the moving, fixed, and warped moving images are clear examples of this. The authors did not mention which visualization tool they used, so we visualized the output using Python. Another possible explanation for the differences is the use of a different smoothing kernel: we used the averaging kernel instead of the Gaussian kernel, as recommended by our external supervisor.
In the original figure, the images without regularization show noticeably more yellow spots in the negative Jacobian maps. A possible explanation is that we used a different pair of images than the paper, as it was not mentioned which ones the authors used. Furthermore, the deformation fields in our reproduced figure look somewhat different from those in the original paper. Still, we can observe that applying the smoothing kernel makes the deformation field smoother and more regular, and that applying the soft regularizer $L_{Jdet}$ makes the deformation field change smoothly and continuously throughout the image, while preserving the topology and structure of the image.
### **4.2 Table 1**
After performing the registration for λ=2.5, the terminal showed the following result:

After performing the registration for λ=2.0, the terminal showed the following result:

Putting this in table form yields:

| λ<sub>1</sub> | Avg. Dice (28) | Ratio neg. Jdet | Total neg. Jdet |
| -------- | -------- | -------- | -------- |
| 2.5 | 0.773 ± 0.020 | 0.013% | 44.620 |
| 2.0 | 0.773 ± 0.020 | 0.024% | 89.244 |
If we compare these results to those of the original article, we see that the values do not deviate much. For the average Dice, our result for λ<sub>1</sub>=2.5 is 0.64% lower, and for λ<sub>1</sub>=2.0 it is 0.77% lower. For the average Dice we thus come very close to the values obtained in the original paper. Given the stochastic nature of the optimization, it is not surprising that the values are not exactly the same. That the difference is this small is a good sign, and adds to the reproducibility value of the original paper. Furthermore, the average Dice is quite consistent over different registration tasks, as shown by its low standard deviation.
Our obtained value for the ratio of negative Jacobian values is 0.017 percentage points lower for λ<sub>1</sub>=2.5, and 0.006 percentage points lower for λ<sub>1</sub>=2.0. Our values for the total negative Jacobian are 30.5% higher for λ<sub>1</sub>=2.5 and 46.05% higher for λ<sub>1</sub>=2.0.
Our values for the ratio of the negative Jacobian thus do not differ much from those in the original paper, while the values for the total negative Jacobian do differ considerably. Moreover, our negative-Jacobian ratio is lower than in the paper, while our total is higher. We think this is because the images we used for this reproducibility check were larger than those the authors used, so the ratio of negative values can be lower while their total count is higher.
Taking all this into consideration, we think our reproducibility check for table 1 shows a positive result with respect to the original paper, meaning that the algorithm proposed in the original paper and forked from its GitHub page is well reproducible.
## **5 Conclusions**
In conclusion, we think the results of the original paper are well reproducible, for both figure 4 and table 1. The results we obtained are quite similar to those of the paper, and the deviations can be well explained by differences in the data used and in the network (the averaging smoothing kernel in our reproducibility check versus the Gaussian smoothing used in the paper).
While the results of the paper can be reproduced well, the original paper and its complementary GitHub repository could be more elaborate in order to enhance the replicability of the paper. For figure 4, it is nowhere mentioned which exact OASIS subject and which slice were used to create the figure. Furthermore, there is no mention of how the images in the figure were obtained: it is not clear whether the authors performed a 3D registration task and displayed a slice of the result, or created a 2D registration algorithm themselves. For table 1, if one wants to use the original repository without any changes, it is necessary to crop the input data to the appropriate size (160 x 192 x 144). While the authors mention in the paper that they do this, it is not stated explicitly that this is necessary for the registration to run successfully. Instead of cropping our input images, we decided to make the network generally applicable, independent of the input image size.
All in all, our forked repository allows one to run the code once, in either 2D or 3D, and obtain a reproducibility check for figure 4 and table 1 respectively. We think this can be valuable for others who want to perform a reproducibility check themselves.
## **6. Task division**
Nigel worked on the reproducibility of table 1. To do so, he wrote the option to run the registration task in batch. Furthermore, he took care of setting up the Google Cloud virtual machine environment.
Gianna and Hanna changed the code to 2D and added code to visualize the images and deformation fields to reproduce figure 4.
All of us worked on this blogpost.
We think our cooperation as a group was quite successful.
## **7. Sources**
* Wu, Y., Jiahao, T. Z., et al. (2023). NODEO: A Neural Ordinary Differential Equation Based Optimization Framework for Deformable Image Registration.
* Marcus, D. S., Wang, T. H., Parker, J., Csernansky, J. G., Morris, J. C., & Buckner, R. L. (2007). Open Access Series of Imaging Studies (OASIS): Cross-Sectional MRI Data in Young, Middle Aged, Nondemented, and Demented Older Adults. Journal of Cognitive Neuroscience, 19, 1498-1507.
* Hoopes, A., Hoffmann, M., Greve, D. N., Fischl, B., Guttag, J., & Dalca, A. V. (2022). Learning the Effect of Registration Hyperparameters with HyperMorph. MELBA.
* Chen, R. T. Q., Rubanova, Y., Bettencourt, J., & Duvenaud, D. K. (2018). Neural Ordinary Differential Equations. Advances in Neural Information Processing Systems, 31.
* Uchida, S. (2013). Image processing and recognition for biological images. Development, Growth & Differentiation, 55.