# Sample Efficient Hyperspectral Unmixing Design Document
[GitHub Repository](https://github.com/CVC-Lab/SampleEfficientHyperspectralUnmixing)
All floats are `float32` unless otherwise specified.
## To-Do
- Something that will generate a synthetic dataset
- DONE (Could be improved)
- A PyTorch Dataset that takes as input coordinates for a square and returns the average of pixels in that square
- Still needs doing
- dataset.initialize()
- This dataset will be initialized with a hyperspectral image, such as an image from the synthetic dataset
- The image will be of size m x n x p where p >> 1
- The dataset will take as input an index (i,j) where $i \in [0,m)$ and $j \in [0,n)$ are integers and an integer input $w \ge 1$
- The dataset will then output the average of all pixels in the $w \times w$ block spanning rows $i$ to $i+w-1$ and columns $j$ to $j+w-1$
- EXAMPLE:
- dataset.get_HSblock([i,j],width):
- block = HSImage[i:i+width, j:j+width]
- sum = 0
- for pixel in block:
- sum += pixel
- return sum / (number of pixels in block)
- dataset.get_MATblock([i,j],width):
- block = MATImage[i:i+width, j:j+width]
- sum = 0
- for pixel in block:
- sum += pixel
- return sum / (number of pixels in block)
- HSImage and MATImage are the outputs of generateDataCube() from the synthetic dataset
- For now we can assume that these blocks are square, but we'll need varying side lengths later (a minimal PyTorch-style sketch of this Dataset follows this list)
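A minimal sketch of the block-averaging dataset described above. The class and method names here are assumptions; the synthetic cube is whatever `generateDataCube()` produces, and the actual Dataset in the repo may differ (sampling logic such as `__getitem__`/`__len__` is omitted).

```python
import numpy as np
import torch
from torch.utils.data import Dataset


class HSBlockDataset(Dataset):
    """Holds an (m, n, p) hyperspectral cube and an (m, n, q) abundance map
    and returns block averages."""

    def __init__(self, hs_image: np.ndarray, mat_image: np.ndarray):
        self.hs_image = torch.as_tensor(hs_image, dtype=torch.float32)
        self.mat_image = torch.as_tensor(mat_image, dtype=torch.float32)

    def _block_mean(self, image: torch.Tensor, index, width: int) -> torch.Tensor:
        i, j = index
        block = image[i:i + width, j:j + width]          # (w, w, channels)
        return block.reshape(-1, block.shape[-1]).mean(dim=0)

    def get_HSblock(self, index, width: int) -> torch.Tensor:
        # Average spectral signature of the w x w block starting at (i, j).
        return self._block_mean(self.hs_image, index, width)

    def get_MATblock(self, index, width: int) -> torch.Tensor:
        # Average abundance vector of the same block.
        return self._block_mean(self.mat_image, index, width)
```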
- Endmember extractor
- We can probably use vertex component analysis
- For now, we can probably use the ground truth spectral signatures
- We will also need some form of compression, since VCA will not scale to the full data
- PCA will probably work for this (a rough sketch follows this list)
- It would be great if we could implement DAEN for this, but let's leave that for later
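A rough sketch of the PCA compression step, assuming scikit-learn's `PCA`. The VCA step (or, for now, the ground-truth spectral signatures) would consume the reduced pixels and is not implemented here.

```python
import numpy as np
from sklearn.decomposition import PCA


def compress_pixels(hs_image: np.ndarray, n_components: int) -> np.ndarray:
    """Flatten an (m, n, p) cube to (m*n, p) and project each pixel spectrum
    down to n_components dimensions before endmember extraction."""
    m, n, p = hs_image.shape
    pixels = hs_image.reshape(m * n, p).astype(np.float32)
    pca = PCA(n_components=n_components)
    reduced = pca.fit_transform(pixels)   # (m*n, n_components)
    # VCA (or the ground-truth signatures, for now) would be applied to this
    # reduced representation to recover the endmember spectra.
    return reduced
```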
- The actual machine learner
- Chase will work on this
- For now we can use linear unmixing
- $Y = AW + E$
- $W = A^\dagger Y$
- This will work for now (a small NumPy sketch follows this list)
- Autoencoder and decoder following [DAEN](https://ieeexplore-ieee-org.ezproxy.lib.utexas.edu/document/8628241)
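A small NumPy sketch of the linear unmixing step above. The shapes ($Y$ as a $p \times N$ matrix of pixel spectra, $A$ as a $p \times q$ matrix of endmember signatures) are assumptions for illustration.

```python
import numpy as np


def linear_unmix(Y: np.ndarray, A: np.ndarray) -> np.ndarray:
    """Least-squares solve of Y = A W, i.e. W = A^+ Y (no non-negativity or
    sum-to-one constraints on the abundances yet)."""
    W, *_ = np.linalg.lstsq(A, Y, rcond=None)
    return W


# Tiny usage example with random data: q = 3 endmembers, p = 100 bands, N = 50 pixels.
rng = np.random.default_rng(0)
A = rng.random((100, 3)).astype(np.float32)
W_true = rng.dirichlet(np.ones(3), size=50).T.astype(np.float32)   # (3, 50)
Y = A @ W_true + 0.01 * rng.standard_normal((100, 50)).astype(np.float32)
W_hat = linear_unmix(Y, A)
```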
- Sample Efficient Reinforcement Learning
- Initialize Dataset
- Load the image and put it into the Dataset object
- Initialize the set of blocks $\mathcal{P}$
- Determine how the image will be blocked off in the beginning and create a list of these blocks
- Instead of using a done set, we can simply remove blocks from the list $\mathcal{P}$
- Loop the following until $\mathcal{P}$ is empty
- Randomly select a block from $\mathcal{P}$
- This block will be called $\mathbf{x}_t$ and it will be a vector representing the average spectral signature in the block
- Generate some random noise $\epsilon_t$ and concatenate it with $\mathbf{x}_t$ to create $[\mathbf{x}_t,\epsilon_t]$
- To start we will have the length of $\epsilon_t$ be equal to the length of $\mathbf{x}_t$
- Neural Network $\mathcal{T}_{\phi_1}$
- Input <- $2p$
- Hidden Layer <- $2p$
- Output $\psi_t$ <- $h$
- Where $h$ is the length of the latent dimension
- Neural Network $\mathcal{T}_{\phi_2}$
- Input <- $p$
- Hidden Layer <- $p$
- Output $\Sigma_{\mathbf{z}_t}$ <- $h$
- $\mathbf{z}_t$ is now sampled from a Gaussian distribution with mean $\psi_t$ and variance $\Sigma_{\mathbf{z}_t}$
- Neural Network $\mathcal{T}_\theta$
- Input <- $p + h$
- Hidden Layer <- $p$
- Output $\mathbf{\mu}_{\mathbf{r}_t}$ <- 2 (one predicted reward per action)
- $a_t = \arg \max \mathbf{\mu}_{\mathbf{r}_t}$
- IF $a_t$ is 1
- Remove $\mathbf{x}_t$ from $\mathcal{P}$
- What reward should be given??
- Store $(\mathbf{x}_t,a_t,r_t)$
- IF $a_t$ is 0
- $\mathbf{x}_t$ is split into $f^2$ sub-blocks of side $w/f$, where $f$ is the largest prime factor of $w$
- This ensures an even splitting no matter what w is
- These blocks are then added to $\mathcal{P}$ and $\mathbf{x}_t$ is removed
- The reward $r_t = -\sum_{j=1}^4 [||\mathbf{x}_t^j - \mathcal{T}_\text{unmix}(\mathbf{x}_t^j)||_2^2] + 4||\mathbf{x}_t - \mathcal{T}_\text{unmix}(\mathbf{x}_t)||_2^2 - \eta$
- Here $\mathcal{T}_\text{unmix}$ is the pretrained neural network that unmixes pixels
- Store $(\mathbf{x}_t,a_t,r_t)$
- When we've gone through $t_f$ iterations, we loop through the stored $(\mathbf{x}_i,a_i,r_i)$ to update the parameters of the model
- Take a minibatch $\{(\mathbf{x}_i,a_i,r_i)\}_{i=1}^N$
- This will allow some speedup by doing several at once
- Expand $r_i$ into a vector $\mathbf{r}_i$ that places $r_i$ in the slot of the action taken and 0 elsewhere
- So if $a_i = 0$, then $\mathbf{r}_i = [r_i, 0]$
- Create a mask for each datapoint in the batch $\{\mathbf{m}_i\}_{i=1}^N$
- Each $\mathbf{m}_i$ is a one-hot vector corresponding to the selected action
- Run $[\mathbf{x}_i,\epsilon_i^{(k)}]$ through $\mathcal{T}_{\phi_1}$ for $k = 0,\dots,K$, using a different noise vector each time, to get $\{\psi_i^{(k)}\}_{k=0}^{K}$
- Compute $\Sigma_{\mathbf{z}_i} = \mathcal{T}_{\phi_2}(\mathbf{x}_i)$
- Compute $\mathbf{z}_i = \psi_i^{(0)} + \Sigma_{\mathbf{z}_i} \odot \epsilon_i$
- Compute $\mu_{\mathbf{r}_i} = \mathcal{T}_\theta([\mathbf{x}_i,\mathbf{z}_i])$
- Use autograd to compute the gradients of that (massive) loss with respect to $\phi_1$, $\phi_2$, and $\theta$ (a rough PyTorch sketch of the networks and a stand-in masked update follows this list)
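A rough PyTorch sketch of the three networks, the split reward, and the masked minibatch update described above. Layer sizes follow the outline; the class name `PolicyNets`, the softplus used to keep $\Sigma_{\mathbf{z}}$ positive, the 2-way output of $\mathcal{T}_\theta$, the reuse of the first $h$ entries of $\epsilon$ for the reparameterization, and the plain masked squared error (standing in for the unspecified loss) are all assumptions, not the final design.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class PolicyNets(nn.Module):
    def __init__(self, p: int, h: int, n_actions: int = 2):
        super().__init__()
        # T_phi1: [x_t, eps_t] (2p) -> hidden (2p) -> psi_t (h)
        self.t_phi1 = nn.Sequential(nn.Linear(2 * p, 2 * p), nn.ReLU(), nn.Linear(2 * p, h))
        # T_phi2: x_t (p) -> hidden (p) -> Sigma_z (h); softplus keeps it positive
        self.t_phi2 = nn.Sequential(nn.Linear(p, p), nn.ReLU(), nn.Linear(p, h), nn.Softplus())
        # T_theta: [x_t, z_t] (p + h) -> hidden (p) -> one predicted reward per action
        self.t_theta = nn.Sequential(nn.Linear(p + h, p), nn.ReLU(), nn.Linear(p, n_actions))
        self.h = h

    def forward(self, x: torch.Tensor, eps: torch.Tensor) -> torch.Tensor:
        psi = self.t_phi1(torch.cat([x, eps], dim=-1))       # mean of z_t
        sigma = self.t_phi2(x)                                # (diagonal) scale of z_t
        # Reparameterized sample; reuses the first h entries of eps (assumes h <= p).
        z = psi + sigma * eps[..., : self.h]
        return self.t_theta(torch.cat([x, z], dim=-1))        # mu_r, shape (..., n_actions)


def select_action(nets: PolicyNets, x: torch.Tensor) -> int:
    """a_t = argmax mu_r for a single block signature x of length p."""
    eps = torch.randn_like(x)
    return int(nets(x, eps).argmax())


def split_reward(x: torch.Tensor, sub_blocks, unmix, eta: float) -> float:
    """r_t for a split: parent reconstruction error (scaled by the number of
    sub-blocks) minus the summed sub-block errors, minus the splitting cost eta."""
    parent_err = torch.sum((x - unmix(x)) ** 2)
    sub_err = sum(torch.sum((xs - unmix(xs)) ** 2) for xs in sub_blocks)
    return float(len(sub_blocks) * parent_err - sub_err - eta)


def masked_update_loss(nets: PolicyNets, x: torch.Tensor, a: torch.Tensor,
                       r: torch.Tensor) -> torch.Tensor:
    """Minibatch loss: a is an (N,) long tensor of actions, r an (N,) tensor of
    rewards. The reward is expanded into a vector ([r_i, 0] for a_i = 0) and a
    one-hot mask restricts the error to the action actually taken. The K-sample
    averaging of psi from the outline is omitted for brevity."""
    eps = torch.randn_like(x)
    mu_r = nets(x, eps)                                       # (N, n_actions)
    mask = F.one_hot(a, num_classes=mu_r.shape[-1]).float()   # m_i
    r_vec = mask * r.unsqueeze(-1)                            # expanded reward vector
    return ((mask * (mu_r - r_vec)) ** 2).sum(dim=-1).mean()
```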