Blind Hyperspectral-Multispectral Image Fusion via Graph Laplacian Regularization

# Blind Hyperspectral-Multispectral Image Fusion via Graph Laplacian Regularization Authors: Chandrajit Bajaj, Tianming Wang Link: https://arxiv.org/abs/1902.08224 Written by: Pronoma Banerjee --- ### Abstract and Introduction The paper presents a method for fusing a hyperspectral image (HSI) of low resolution with a multispectral image (MSI) of high resolution to produce a super-resolution image (SRI). The process is similar to the process of hyperspectral pan sharpening, but unlike previous work, it assumes no prior knowledge of the spatial or spectral degradation from the SRI to the HSI. It also doesn't assume a perfectly aligned HSI and MSI pair. The SRI is assumed to be aligned with the MSI. The image fusion is performed iteratively by fusing the HSI with a graph Laplacian of the MSI. The proposed approach searches for the blur kernel and uses the graph Laplacian defined on the MSI to guide the super-resolution of all the bands of the HSI simultaneously. This approach is able to achieve super-resolution without prior knowledge of the spatial nor spectral degradation. --- ### Graph Laplacian Regularization In remote sensing the graph Laplacian has been used to convert a hyperspecrtral image to RGB for better visualization. Assuming the SRI to be spacially aligned with the MSI, the authors exploit the correlation between them using the graph Laplacian. Let $Y \in R^{(n_1n_2)\times N_3}$, $Z \in R^{(N_1N_2)\times n_3}$ and $X \in R^{(N_1N_2)\times N_3}$ be the matrix form of HSI, MSI and SRI of the same scene, respectively. Let $L(Z)$ be the Laplacian on $Z$. Let $W$ be the adjacency matrix of $L(Z)$ such that the entry at the $i$-th row and $j$-th column denotes the similarity of the pixel vectors at those locations. Then the graph Laplacian of the SRI is as follows: Tr$(X^TL(Z)X)= <X, L(Z)X> = \sum^{N_3}_{l=1} (\frac{1}{2} \sum_{i,j} w_{ij}(X^{(i,l)})- X^{(j,l)}))$ $=\frac{1}{2} \sum_{i,j}w_{ij}||X^{(i,:)})- X^{(j,:)}||^2_2$ where $X^{(i,:)}$ and $X^{(j,:)}$ denote the pixel vectors of $X$ at the $i$-th and $j$-th locations, respectively. By minimizing with respect to this regularization, the pixel vectors that are close to each other in the MSI are forced to be close in the SRI. --- ### Blind graph Laplacian Regularized Fusion (BGLRF) When the spatial degradation from SRI to HSI is known, $X$ can be reconstructed by directly solving: min$_X ||(P C)X − Y ||^2_F + α$Tr$(X^T L(Z)X)$. When the spatial degradation is not known, the BGLRF algorithm alternates between updating the blur kernel $K$ and finding the super-resolved image $X$. This update takes place for each band of the HSI. An isotropic TV regularization is applied to the kernel $K$, which helps in the piecewise smoothness of $K$. $K$ is further restricted to be in the simplex $S \subset R^{p \times p}$, where p is an estimate for the maximum blur kernel size and $S = \{K∈R^{p \times p}|K^{(a,b)}≥0,\sum_a \sum_b K^{(a,b)}=1\}$. (1) We take $I$ to be the indicator function, with values as 0 if $K$ lies within the simplex, and infinity otherwise. Hence when the spatial blur is unknown, $X$ is reconstructed by solving: min$_{X, K} ||(PC(K))X − Y ||^2_F + α$Tr$(X^T L(Z)X) + β$TV$(K) + I_S(K)$. (2) where $α, β >0$ are proper constants. The above equation is solved using proximal alternating minimization. Each band of $X$ is initiated as the bucubic interpolation of the corresponding band of $Y$. $K$ is updated by first adding an inertia term at the end, bringing the above equation to a non-smooth convex optimization problem which can be solved using ADMM, by introduicing an operator that zero pads $K$ and shifts it circularly according to the high-resolution image, resulting in equation (4). Equation (1) with the added inertia term can be solved to get an unique $X$ by using conjugate gradient (CG) method. Hence the algorithm for the entire process is as follows: ___ **Algorithm 1: Blind graph Laplacian Regularized Fusion (BGLRF)** Initialize X using bicubic interpolation for iteration = 0,1,··· do 1. Update $K$ by solving (4) using ADMM 2. Update $X$ by solving (7) using CG end for --- Lemma: If the generated sequence $(K,X)$ is bounded, Algorithm 1 converges to a stationary point of the objective function (2). --- ### Numerical Experiments The experiments are executed from MATLAB R2018a on a 64-bit Linux machine with 8 Intel i7-7700 CPUs at 3.60 GHz and 32 GB of RAM. Experiments were conducted from simulated HSI and MSI pairs generated from the Indian Pines, Salinas, Pavia University and the Western Sichuan dataset. The synthesis of HSI and MSI was done using the standard procedure. At first the input image was denoised, to produce the SRI. The HSI was a result of spatial degradation of the SRI. This spatial degradation was modeled by a Gaussian blur, followed by downsampling and then adding noise. The Gaussian blur has a standard deviation such that its full width at half maximum is equal to the downsampling ratio $d$. The size of the blur is assumed to be $(2d+1)$. The downsampling ratio is fixed to 4 for all the experiments. The HSI was then generated by adding Gaussian noise of SNR=30. MSI is the synthesized by performing band selection on the SRI (based on the MNBS algorithm), using spectral responses of a multispectral sensor. For Indian Pines and Salinas, the Landsat_TM5 sensor was used to obtain the spectral responses. Gaussian noise is added such that the final MSI has SNR=40. The graph Laplacian chosen was one that defines the affinity of pixel vectors using correlations between the overlapping windows, using both spectral and spatial information. Conjugate gradient was used to update $K$ and $X$, which was implemented by the authors' own solvers. Further experiments were done to prove the usefulness of the blind kernel estimation, and graph laplacian regularization, by comparing BGLRF with bicubic interpolation, no graph Laplacian regularization and non-blind image fusions, using the HSI and MSI generated from Indian Pines, with a blur kernel shifted by 4 pixels horizontally and vertically to the botton right. When there is no regularization, the TV regularization about the blur kernel isn't as useful. One cannot expect to get good estimation of the blur kernel due to the lack of spatial information, which is provided via the graph Laplacian. When there is no blur kernel estimation, the super-resolution is not as good. Hence, BGLRF outperforms the other methods. All the results were evaluated using the following metrics: - Metrics based on spatial measures:: 1. **ERGAS** (relative dimensionless global error in synthesis): It is defined as $100d \sqrt{ \frac{1}{N_1N_2N_3} \sum^{N_3}_{l=1}\frac{||X^{(:,l)}-\underline{X}^{(:,l)}||^2_F}{\mu^2_l}}$ where $d = N1/n1 = N2/n2$ is the downsampling ratio, and $\mu_l$ is the mean in the $l$-th band. 2. **UIQI** (universal image quality index): Mean UIQI of all the bands, calculated between the estimated SRI and the ground truth. It takes values in [−1, 1], and measures spatial distortion between the two based on correlation, luminance and contrast. - Metrics based on spectral measures: 1. **SAM** (spectral angle mapper): It measures the closeness (in degrees) of the estimated pixel vectors with the ground truth ones. It is defined as: $\frac{1}{N_1N_2}\sum^{N_1N_2}_{n=1}$arccos$(\frac{<X^{(n,:)},\underline{X}^{(n,:)}>}{||X^{(n,:)}||_2||\underline{X}^{(n,:)}||_2})$ 2. **OA**: The overall accuracy of classification. While ground truth labels are available, the accuracy of fusion can be validated by the classification accuracy of the estimated SRI, using binary or multi-class SVMs. BGLRF was compared to other related algorithms (dTV, HySure, STEREO) for aligned and misaligned blur kernels over several datasets in 3 different settings: - (mis-registration: 0) When the spatial degradation of HSI can be accurately estimated. Blur kernel of size 9x9, centered at the origin (ground truth blur kernel) is taken. - (mis-registration: 2) The center of the blur kernel is shifted horizontal and vertically to the top left by 2 pixels. The authors let HySure and our method search among all the blur kernels of size no more than 13. - (mis-registration: 4) The center of the blur kernel is shifted horizontal and vertically to the bottom right by 4 pixels. The authors let HySure and our method search among all the blur kernels of size no more than 17. In all the settings, STEREO and HySure are fed with the ground truth blur kernel and spectral responses. The CP rank of STEREO is set to be 150. For BGLRF, $\alpha$ = 10 in all settings and $\beta$ = 10 for the misregistration cases. BGLRF performs as well as the others in terms of the spatial metrics when blur kernel is aligned and consistently better than the others for mis-registered blur kernel. In terms of the spectral metrics, it outperforms others in all situations. The proposed algorithm proves to be capable of preserving classification accuracy, with the additional advantage of being able to better distinguish different materials in the scene. --- ### Conclusion The authors presented a hyperspectral-multispectral fusion algorithm using graph Laplacian regularization, without assuming prior knowledge about the blur kernel. This algorithm alternates between finding the blur kernel and fusing HSI with MSI. As a byproduct, the algorithm is able to deal with translation mis-alignment between the two input images. Various numerical experiments validate the usefulness of our algorithm, and its ability to preserve spectral information. Such property is desirable for further classification and detection tasks.