# Sample Efficient Hyperspectral Unmixing Design Document

[GitHub Repository](https://github.com/CVC-Lab/SampleEfficientHyperspectralUnmixing)

All floats are `float32` unless otherwise specified.

## To-Do

- Something that will generate a synthetic dataset
  - DONE (Could be improved)
- A PyTorch Dataset that takes as input coordinates for a square and returns the average of the pixels in that square (a sketch follows this section)
  - Still needs doing
  - dataset.initialize()
    - The dataset will be initialized with a hyperspectral image, such as an image from the synthetic dataset
    - The image will be of size $m \times n \times p$ where $p \gg 1$
  - The dataset will take as input an index $(i,j)$, where $i \in [0,m)$ and $j \in [0,n)$ are integers, and an integer width $w \ge 1$
  - The dataset will then output the average of all pixels in the block $(i:i+w-1,\ j:j+w-1)$, with both index ranges inclusive ($w^2$ pixels in total)
  - EXAMPLE:
    - `dataset.get_HSblock([i, j], width)`:
      - `block = HSImage[i : i + width, j : j + width]` (Python slices exclude the stop index, so this covers rows and columns $i$ through $i + w - 1$)
      - sum the pixels in `block` and return the sum divided by the number of pixels in `block`
    - `dataset.get_MATblock([i, j], width)`:
      - same as above with `MATImage` in place of `HSImage`
  - HSImage and MATImage are the outputs of generateDataCube() from the synthetic dataset
  - For now we can assume the blocks are square, but we'll need varying side lengths later
- Endmember extractor
  - We can probably use vertex component analysis (VCA)
  - For now, we can probably use the ground-truth spectral signatures
  - We will also need some sort of compression, because VCA will not work with big data
    - PCA will probably work for this
  - It would be great if we could implement DAEN for this, but let's leave that for later
- The actual machine learner
  - Chase will work on this
  - For now we can use linear unmixing (also sketched below):
    - $Y = AW + E$
    - $W = A^\dagger Y$
    - This will work for now
  - Autoencoder and decoder following [DAEN](https://ieeexplore-ieee-org.ezproxy.lib.utexas.edu/document/8628241)
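Below is a minimal sketch of the block-averaging dataset described above, assuming `HSImage` and `MATImage` are `(m, n, channels)` float32 NumPy arrays returned by `generateDataCube()`. The class name `HSBlockDataset` is a placeholder, and `__len__` / `__getitem__` over an enumeration of blocks are still to be decided.

```python
import numpy as np
import torch
from torch.utils.data import Dataset


class HSBlockDataset(Dataset):
    """Mean spectral signature of square blocks of a hyperspectral image (sketch).

    Assumes HSImage / MATImage are (m, n, channels) float32 NumPy arrays,
    e.g. the outputs of generateDataCube() from the synthetic dataset.
    __len__ / __getitem__ over a block enumeration are still to be decided.
    """

    def __init__(self, hs_image: np.ndarray, mat_image: np.ndarray):
        self.hs_image = torch.from_numpy(hs_image.astype(np.float32))
        self.mat_image = torch.from_numpy(mat_image.astype(np.float32))

    def _block_mean(self, image: torch.Tensor, ij, width: int) -> torch.Tensor:
        # Average pixels i .. i+width-1, j .. j+width-1 (inclusive) over the
        # spatial dimensions, returning one spectral vector of length channels.
        i, j = ij
        block = image[i : i + width, j : j + width]
        return block.reshape(-1, block.shape[-1]).mean(dim=0)

    def get_HSblock(self, ij, width: int) -> torch.Tensor:
        return self._block_mean(self.hs_image, ij, width)

    def get_MATblock(self, ij, width: int) -> torch.Tensor:
        return self._block_mean(self.mat_image, ij, width)


# Usage sketch: average the 4x4 block whose top-left pixel is (10, 20).
# hs, mat = generateDataCube(...)          # from the synthetic dataset
# dataset = HSBlockDataset(hs, mat)
# x = dataset.get_HSblock([10, 20], 4)     # shape (channels,)
```

For the interim linear unmixing step, the pseudo-inverse solution $W = A^\dagger Y$ can be computed directly. In this sketch $A$ is the $p \times k$ endmember matrix (for now, the ground-truth signatures) and $Y$ holds pixel spectra as columns; it is plain least squares with no abundance constraints.

```python
import numpy as np


def linear_unmix(Y: np.ndarray, A: np.ndarray) -> np.ndarray:
    """Least-squares abundances W for Y ~= A @ W, i.e. W = pinv(A) @ Y.

    Y: (p, num_pixels) spectra, A: (p, k) endmember signatures.
    Sketch only; no non-negativity or sum-to-one constraints yet.
    """
    return np.linalg.pinv(A) @ Y
```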
- Sample Efficient Reinforcement Learning
  - ![](https://i.imgur.com/baLHmDO.png)
  - Initialize Dataset
    - Load the image and put it into the Dataset object
  - Initialize the block set $\mathcal{P}$
    - Determine how the image will be blocked off in the beginning and create a list of these blocks
    - Instead of using a done set, we can simply remove blocks from the list $\mathcal{P}$
  - Loop the following until $\mathcal{P}$ is empty:
    - Randomly select a block from $\mathcal{P}$
      - This block will be called $\mathbf{x}_t$; it is a vector representing the average spectral signature in the block
    - Generate some random noise $\epsilon_t$ and concatenate it with $\mathbf{x}_t$ to create $[\mathbf{x}_t, \epsilon_t]$
      - To start, the length of $\epsilon_t$ will be equal to the length of $\mathbf{x}_t$
    - Neural network $\mathcal{T}_{\phi_1}$
      - Input: $2p$
      - Hidden layer: $2p$
      - Output $\psi_t$: $h$
        - where $h$ is the length of the latent dimension
    - Neural network $\mathcal{T}_{\phi_2}$
      - Input: $p$
      - Hidden layer: $p$
      - Output $\Sigma_{\mathbf{z}_t}$: $h$
    - $\mathbf{z}_t$ is now sampled from a Gaussian distribution with mean $\psi_t$ and variance $\Sigma_{\mathbf{z}_t}$
    - Neural network $\mathcal{T}_\theta$
      - Input: $p + h$
      - Hidden layer: $p$
      - Output $\mathbf{\mu}_{\mathbf{r}_t}$: 2 (one predicted reward per action)
    - $a_t = \arg \max \mathbf{\mu}_{\mathbf{r}_t}$
    - IF $a_t$ is 1:
      - Remove $\mathbf{x}_t$ from $\mathcal{P}$
      - What reward should be given??
      - Store $(\mathbf{x}_t, a_t, r_t)$
    - IF $a_t$ is 0:
      - $\mathbf{x}_t$ is split into a number of sub-blocks equal to the square of the largest prime factor of $w$ (e.g. four sub-blocks when $w$ is a power of two)
        - This ensures an even split no matter what $w$ is
      - These sub-blocks are added to $\mathcal{P}$ and $\mathbf{x}_t$ is removed
      - The reward, written here for a four-way split into sub-blocks $\mathbf{x}_t^1, \dots, \mathbf{x}_t^4$, is $r_t = -\sum_{j=1}^4 \|\mathbf{x}_t^j - \mathcal{T}_\text{unmix}(\mathbf{x}_t^j)\|_2^2 + 4\|\mathbf{x}_t - \mathcal{T}_\text{unmix}(\mathbf{x}_t)\|_2^2 - \eta$
        - Here $\mathcal{T}_\text{unmix}$ is the pretrained neural network that unmixes pixels
      - Store $(\mathbf{x}_t, a_t, r_t)$
  - After $t_f$ iterations, we loop through the stored $(\mathbf{x}_i, a_i, r_i)$ to update the parameters of the model (a training-step sketch follows this list)
    - Take a minibatch $\{(\mathbf{x}_i, a_i, r_i)\}_{i=1}^N$
      - This allows some speedup by processing several samples at once
    - Expand $r_i$ into a vector $\mathbf{r}_i$ with $r_i$ in the slot of the chosen action and zeros elsewhere
      - So if $a_i = 0$, then $\mathbf{r}_i = [r_i, 0]$
    - Create a mask for each datapoint in the batch, $\{\mathbf{m}_i\}_{i=1}^N$
      - Each $\mathbf{m}_i$ is a one-hot vector corresponding to the selected action
    - Run $[\mathbf{x}_i, \epsilon_i]$ through $\mathcal{T}_{\phi_1}$ $K + 1$ times using different noise vectors to get $\{\psi_i^{(k)}\}_{k=0}^{K}$
    - Compute $\Sigma_{\mathbf{z}_i} = \mathcal{T}_{\phi_2}(\mathbf{x}_i)$
    - Compute $\mathbf{z}_i = \psi_i^{(0)} + \Sigma_{\mathbf{z}_i} \odot \epsilon_i$
    - Compute $\mathbf{\mu}_{\mathbf{r}_i} = \mathcal{T}_\theta([\mathbf{x}_i, \mathbf{z}_i])$
    - Use autograd to compute the derivatives of the full loss
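To make the update step concrete, here is a minimal PyTorch sketch of one minibatch update. The layer widths follow the architecture listed above, but `p`, `h`, the ReLU activations, the learning rate, and the masked reward-regression loss are assumptions; the sketch runs $\mathcal{T}_{\phi_1}$ once rather than $K + 1$ times, draws fresh noise for the reparameterization step, and does not reproduce the full objective.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Assumed sizes: p spectral bands, latent size h, two actions (remove / split).
p, h, num_actions = 200, 32, 2

T_phi1 = nn.Sequential(nn.Linear(2 * p, 2 * p), nn.ReLU(), nn.Linear(2 * p, h))    # psi
T_phi2 = nn.Sequential(nn.Linear(p, p), nn.ReLU(), nn.Linear(p, h))                # Sigma_z
T_theta = nn.Sequential(nn.Linear(p + h, p), nn.ReLU(), nn.Linear(p, num_actions)) # mu_r

params = list(T_phi1.parameters()) + list(T_phi2.parameters()) + list(T_theta.parameters())
optimizer = torch.optim.Adam(params, lr=1e-3)


def training_step(x, eps, a, r):
    """One minibatch update (sketch).

    x:   (N, p) average block spectra        eps: (N, p) stored noise vectors
    a:   (N,) long tensor of actions {0, 1}  r:   (N,) stored rewards
    Uses a simplified masked reward regression in place of the full objective,
    and a single pass through T_phi1 instead of K + 1 passes.
    """
    psi = T_phi1(torch.cat([x, eps], dim=1))        # (N, h) latent mean
    sigma = T_phi2(x)                               # (N, h) latent scale
    z = psi + sigma * torch.randn_like(sigma)       # reparameterized sample of z
    mu_r = T_theta(torch.cat([x, z], dim=1))        # (N, 2) predicted reward per action

    mask = F.one_hot(a, num_actions).float()        # m_i: one-hot over the chosen action
    r_vec = mask * r.unsqueeze(1)                   # r_i in the chosen-action slot, 0 elsewhere

    # Penalize the predicted reward only for the action that was actually taken.
    loss = ((mask * (mu_r - r_vec)) ** 2).sum(dim=1).mean()

    optimizer.zero_grad()
    loss.backward()                                 # autograd computes all derivatives
    optimizer.step()
    return loss.item()
```

In use, a batch would be assembled from the $(\mathbf{x}_i, a_i, r_i)$ tuples stored during the $t_f$ loop iterations above, with the corresponding $\epsilon_i$ saved alongside them.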