# DNBP Simplified
# Index
[TOC]
- If you are unable to understand some of the terms in this hack or want to know more, refer to this hack: https://hackmd.io/@thrib513/rJ7HEsxhJg
# Markov Random Field (MRF) Representation
## Graph
- Undirected Graph $G=\{V,E\}$
- V - Vertices
- E - Edges
```json=
"graph": [[0, 1, 1, 2],
[1, 0, 2, 1]],
```
- The first row represents source nodes.
- The second row represents destination nodes.
- Each column represents an edge between two nodes.
```json=
"edge_set": [0, 2],
```
- This specifies the indices of edges that are considered for belief propagation.
- In this case, Edge 0 (0 → 1) and Edge 2 (1 → 2) are included
```json=
"inc_nghbrs": [[1], [0,2], [1]],
```
- This lists the neighbors for each node:
- Node 0 → Connected to Node 1 ([1]).
- Node 1 → Connected to Nodes 0 and 2 ([0, 2]).
- Node 2 → Connected to Node 1 ([1])
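
As a quick sanity check, `inc_nghbrs` can be derived from `graph` and `edge_set`. The snippet below is a minimal sketch using the JSON values shown above, not the repository's code.
```python=
# Minimal sketch (not the repository's code): derive neighbor lists
# from the "graph" and "edge_set" values shown above.
graph = [[0, 1, 1, 2],   # source node of each edge
         [1, 0, 2, 1]]   # destination node of each edge
edge_set = [0, 2]        # edges used for belief propagation: 0->1 and 1->2

num_nodes = max(max(graph[0]), max(graph[1])) + 1
inc_nghbrs = [[] for _ in range(num_nodes)]
for e in edge_set:
    s, d = graph[0][e], graph[1][e]
    # An undirected edge makes each endpoint a neighbor of the other.
    inc_nghbrs[s].append(d)
    inc_nghbrs[d].append(s)

print(inc_nghbrs)  # [[1], [0, 2], [1]]
```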
- Panel (b) of the figure in the paper shows the MRF.

- A circle represents a node (random variable)
- $Y$ (grey) represents an observed variable
- $X$ (white) represents an unobserved variable
- Each edge represents a pairwise relationship between two random variables in $V$
## Joint probability distribution for Graph G
$$
P(X,Y)=\frac 1 Z \space \prod_{(s,d) \in E} \space \psi_{sd}(X_s,X_d) \space \prod_{d \in V} \space \phi_{d}(X_d,Y_d)
$$
- $X=\{X_d|d \in V\}$ - set of unobserved variables
- $Y=\{Y_d|d \in V\}$ - set of observed variables
- Z - normalizing constant
- $\phi_d$ - **Unary potential**, describes the compatibility (likelihood) of $X_d$ with $Y_d$
- $\psi_{sd}$ - **Pairwise potential**, describes the compatibility (likelihood) of neighboring variables $X_s$ and $X_d$
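
For the 3-node chain configured by `graph` and `edge_set` above (edges 0–1 and 1–2), the joint distribution expands to:
$$
P(X,Y)=\frac 1 Z \space \psi_{01}(X_0,X_1) \space \psi_{12}(X_1,X_2) \space \prod_{d=0}^{2} \phi_{d}(X_d,Y_d)
$$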
## Belief Propagation (BP)
- BP provides an algorithm for inferring the marginal posterior distributions of $P(X|Y)$, known as beliefs $bel_d(X_d)$.
- BP defines a message passing algorithm for calculating these beliefs:
$$
bel_d(X_d) \propto \phi_d\left(X_d,Y_d\right) \prod_{s\in p(d)} m_{s\to d}\left(X_d\right) \tag{eq.2}
$$
- $p(d)$ denotes the set of neighboring nodes of $d$
- The message from node $s$ (source) to node $d$ (destination) is defined as:
$$
m_{s\to d} \left(X_d\right) = \int_{X_s} \phi_s(X_s,Y_s) \psi_{sd}(X_s,X_d) \times \prod_{u\in p(s)\setminus d} m_{u\to s}(X_s) dX_s \tag{eq.3}
$$
- Performing this integration over a continuous space is `intractable (an efficient algorithm for message passing is needed)`
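
To make eq. 2 & 3 concrete before moving to the particle-based approximation, below is a minimal discrete-state sketch of BP on the 3-node chain above. The potentials `phi` and `psi` are arbitrary example arrays (assumptions for illustration), not the learned functions DNBP uses.
```python=
# Illustrative discrete-state BP on the 3-node chain 0 - 1 - 2 (not DNBP itself).
import numpy as np

K = 3                                             # discrete states per node
rng = np.random.default_rng(0)
phi = [rng.random(K) for _ in range(3)]           # example unary potentials phi_d(X_d, Y_d), Y fixed
psi = {(0, 1): rng.random((K, K)),                # example pairwise potentials psi_sd(X_s, X_d)
       (1, 2): rng.random((K, K))}
nbrs = {0: [1], 1: [0, 2], 2: [1]}

def pairwise(s, d):
    # psi_sd(x_s, x_d) with rows indexed by x_s and columns by x_d
    return psi[(s, d)] if (s, d) in psi else psi[(d, s)].T

# Messages m_{s->d}(X_d), initialized uniform; eq. 3 is applied iteratively.
msgs = {(s, d): np.ones(K) / K for s in nbrs for d in nbrs[s]}
for _ in range(5):
    new = {}
    for (s, d) in msgs:
        inc = [msgs[(u, s)] for u in nbrs[s] if u != d]
        incoming = np.prod(inc, axis=0) if inc else np.ones(K)
        # eq. 3: sum over X_s of phi_s * psi_sd * product of other incoming messages
        m = (phi[s] * incoming) @ pairwise(s, d)
        new[(s, d)] = m / m.sum()
    msgs = new

# eq. 2: belief of node d = unary potential times product of incoming messages, normalized.
for d in nbrs:
    bel = phi[d] * np.prod([msgs[(s, d)] for s in nbrs[d]], axis=0)
    print(d, bel / bel.sum())
```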
# Differentiable Non Parametric Belief Propagation (DNBP)
- DNBP is a method for BP that aims to improve how uncertainty in estimates is handled by efficiently approximating marginal posterior distributions in an MRF.
- For the potentials, instead of hand-designing problem-specific functions, `Neural Networks` are used, making the method more flexible and adaptable.
- DNBP uses an iterative, differentiable message passing scheme to infer beliefs over hidden variables in an MRF.
- DNBP approximates the belief & message in Eq. 2 & 3 at iteration t by sets of `N & M weighted particles`.
$$
\begin{align}
\text{bel}_{d}^{t}(X_d) &= \left\{ \left( \mu_d^{(i)}, w_d^{(i)} \right) \right\}_{i=1}^{N} \\
m_{s \to d}^{t} &= \left\{ \left( \mu_{sd}^{(i)}, w_{sd}^{(i)} \right) \right\}_{i=1}^{M}
\end{align}
$$
where:
- $\mu$ represents a particle
- the particle information is basically a `position coordinate` $(x,y)$
- $w$ represents the weight for the various modes described below
- A **PULL strategy** is defined for message passing:
    - A message $m^t_{s\to d}$ is generated by first drawing M independent samples from $bel^{t-1}_d(X_d)$, then resampling and reweighting this set.
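
A schematic sketch of the pull step under simplifying assumptions: a belief is stored as `(particles, weights)` tensors and multinomial resampling is used. The name `pull_message` is made up for illustration and is not the repository's interface; the reweighting with $w_{unary}$ and $w_{neigh}$ (described below) happens afterwards.
```python=
# Schematic pull-style message construction (illustrative, not the repo code).
import torch

def pull_message(belief_particles, belief_weights, M):
    """belief_particles: (N, 2) positions from bel^{t-1}_d, belief_weights: (N,) normalized weights."""
    idx = torch.multinomial(belief_weights, M, replacement=True)   # draw M samples from the belief
    msg_particles = belief_particles[idx]                          # particles pulled from the destination belief
    msg_weights = torch.full((M,), 1.0 / M)                        # reweighted later via w_unary * w_neigh
    return msg_particles, msg_weights

# Example: a 100-particle belief over 2-D positions.
bel_mu = torch.randn(100, 2)
bel_w = torch.softmax(torch.randn(100), dim=0)
mu_sd, w_sd = pull_message(bel_mu, bel_w, M=50)
```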
# Loss Function
- A density estimate is generated with weighted Gaussian kernels
- Function: `density_estimation` in `diffBP/networks/dnbp_synthetic/dnbp.py` (line 645)
1. **Density Estimation Equation**
- The density of the belief can be expressed as a mixture of Gaussians, with a component centered at each particle.
$$
l(x_i|\theta)=\sum^N_{j=1}w_{\theta,i,j}\times\frac 1 {\sigma\sqrt{2\pi}}\times \exp\left(\frac{-1} 2 \left(\frac {x_i-p_{i,j}}{\sigma}\right)^2 \right)
$$
- Where:
    - $x_i$ is the **ground truth** (position of the node)
    - $i$ is the **node id**
    - $j$ is the **particle id**
    - $p_{i,j}$ is the $j^{th}$ **belief particle** of node $i$
    - $\theta$ represents the mode:
        - **belief weights likelihood** (belief_weights_lik (or) w_lik)
        - **message weights unary**
        - **message_weights_neigh** (message weights neighborhood)
        - **belief weights** - although this exists in the code, it is not used, since
            - belief weights = normalize(belief_weights_lik * message_weights_unary * message_weights_neigh)
    - $w_{\theta,i,j}$ is the weight (value) of mode $\theta$ for the $j^{th}$ particle of the $i^{th}$ node.
2. **Final Loss**
- The final loss is computed per node of a sequence; the loss resets to 0 before it is calculated for the next node.
- It is basically the **mean over the losses of all modes**.
$$
L_{i}=\frac 1 N \sum^N_{k=1} -\log\left(l(x_i|\theta_k)+10^{-50}\right)
$$
- Where:
    - $N=4$ (the number of modes)
    - $k$ indexes the modes ($\theta$) described above in density estimation
    - $i$ represents the node id
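
A minimal sketch of this loss, treating $x$ as a scalar exactly as in the formula above (for the 2-D positions the same kernel would be applied per coordinate or with a 2-D Gaussian). Names and shapes are assumptions for illustration, not the exact code in `dnbp.py`.
```python=
# Illustrative density-estimation loss (not the repository's code).
import math
import torch

def density_estimate(x_true, particles, weights, sigma=0.1):
    """Mixture-of-Gaussians density l(x_i | theta): x_true scalar, particles (N,), weights (N,)."""
    kernel = torch.exp(-0.5 * ((x_true - particles) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))
    return (weights * kernel).sum()

def node_loss(x_true, particles, mode_weights, sigma=0.1):
    """mode_weights: one weight tensor per mode theta (e.g. w_lik, w_unary, w_neigh, ...)."""
    losses = [-torch.log(density_estimate(x_true, particles, w, sigma) + 1e-50)
              for w in mode_weights]
    return torch.stack(losses).mean()   # mean over the N=4 modes

# For reference, the combined belief weights mentioned above (not used as a loss mode) would be:
# belief_weights = normalize(belief_weights_lik * message_weights_unary * message_weights_neigh)
```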
## Loss function formulation in paper
- Supervised training is used
- Objective function:
$$
\begin{align}
\overline{bel}_{d}^{t}(x_d) &= \sum_{i=1}^{N} w_{d}^{(i)} \cdot \mathcal{N}(x_d; \mu_{d}^{(i)}, \Sigma) \\
L_{d}^{t} &= -\log\left(\overline{bel}_{d}^{t}(x_{d}^{t,*})\right)
\end{align}
$$
- where,
    - $t$ is the time/frame number
    - $d$ represents the node id
    - $i$ indexes the belief particles
    - $w_d^{(i)}$ is the weight of the $i^{th}$ particle of node $d$
    - $x_d^{t,*}$ is the ground truth for node $d$ at time $t$
    - $\mathcal N$ denotes the Gaussian density

# Particle Information
# Modes $\left (\theta\right)$ / Weight Types
## 1. belief_weights_likelihood
- belief_weights_lik - Shape: (batch_size, num_neighbors, particle_count)
- This stores likelihood computation for beliefs

<details>
<summary>Initialization Code</summary>
```python=
# diffBP/networks/dnbp_synthetic/dnbp.py
self.belief_weights_lik = [torch.ones(self.batch_size, len(self.inc_nghbrs[i]),
                                      self.particle_count).to(device=self.device).type(self.type)
                           / (len(self.inc_nghbrs[i]) * self.particle_count)
                           for i in range(self.num_nodes)]
```
</details>
- Used in the **belief_update** function:
    1. Pass global features ($Y_d$) & message particles ($X_d=\mu_{sd}^{(i)}$) into the node likelihood (**Unary Potential Function** $\phi_d$) to get the message likelihood (`message`)
    2. Normalize the message likelihood to get belief_weights_lik

(From **2. Belief Update**; a schematic sketch of these two steps follows.)
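
The sketch below is illustrative only: `node_likelihood` is a stand-in for the learned unary potential network $\phi_d$, the feature and particle sizes are made up, and the scores are normalized over all incoming particles of the node (consistent with the initialization above); the repository's exact interface may differ.
```python=
# Schematic belief_weights_lik computation (illustrative; names and shapes are assumptions).
import torch

batch, num_nghbrs, particle_count, feat_dim, particle_size = 4, 2, 100, 8, 2

# Stand-in for the unary potential network phi_d (DNBP uses a learned likelihood network).
node_likelihood = torch.nn.Sequential(
    torch.nn.Linear(particle_size + feat_dim, 32), torch.nn.ReLU(),
    torch.nn.Linear(32, 1), torch.nn.Softplus())             # positive scores

message_particles = torch.randn(batch, num_nghbrs, particle_count, particle_size)
glbl_feats = torch.randn(batch, feat_dim)                    # global features Y_d

# 1. Score every incoming message particle against the node's observed features Y_d.
inp = torch.cat((message_particles,
                 glbl_feats.view(batch, 1, 1, feat_dim).expand(-1, num_nghbrs, particle_count, -1)),
                dim=-1)
scores = node_likelihood(inp).squeeze(-1)                    # (batch, num_nghbrs, particle_count)

# 2. Normalize over all incoming particles to get belief_weights_lik.
belief_weights_lik = scores / scores.sum(dim=(1, 2), keepdim=True)
```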
## 2. message weights unary
- message_weights_unary ($w_{unary}$) - Shape: (batch_size, num_neighbors, particle_count)
- This stores the unary density
<details>
<summary>Initialization Code</summary>
```python=
# diffBP/networks/dnbp_synthetic/dnbp.py
self.message_weights = [torch.ones(self.batch_size, len(self.inc_nghbrs[_]),
                                   self.particle_count).to(device=self.device).type(self.type)
                        / (len(self.inc_nghbrs[_]) * self.particle_count)
                        for _ in range(self.num_nodes)]
```
</details>
- Updated in the **message_update** function

(From **1. Message Update**)
1. Resampling Particles
    - A fraction of the belief particles (`frac_resamp`) is resampled to improve performance.
    - Either low-variance (systematic) resampling or random resampling is used.
    - Resampled particles are perturbed by a small time delta to add diversity.
    - The remaining particles are replaced with new random samples.
<details>
<summary>Code</summary>
```python=
if self.mode=='low_var':
    rand_chance = ((torch.arange(resamp_particles) / float(resamp_particles)).repeat(self.batch_size,len(self.inc_nghbrs[dst_]), 1)
                   + (torch.rand(self.batch_size, len(self.inc_nghbrs[dst_]), 1) / float(resamp_particles)))
else:
    rand_chance = torch.rand(self.batch_size, len(self.inc_nghbrs[dst_]), resamp_particles)
# batch x incoming x 1 x resamp_particles
rand_chance = rand_chance.unsqueeze(2).type(self.type).to(device=self.device)
```
</details>
2. Computing Message Unary Weights ($w_{unary}$)
- Likelihood scores for particles are calculated using the likelihood network.
- Scores are normalized to get probability values.
<details>
<summary>Update Code</summary>
```python=
for src_i, src_ in enumerate(self.inc_nghbrs[dst_]):
    # Determine edge index of message pass from src_->dst_
    edge_i = ((self.graph == torch.tensor([[min(src_,dst_)],
                                           [max(src_,dst_)]])).all(dim=0).nonzero().squeeze(0)
              == self.edge_set).nonzero().squeeze(0)
    # Isolate outgoing particles from src_->dst_ for w_unary calculation
    # These outgoing particles are in the frame of dst_
    msgs = self.message_particles[dst_][:,src_i].contiguous().view(-1,1,self.particle_size)
    # Generate delta samples to translate particles from dst_ frame to src_ frame
    if self.multi_edge_samplers:
        if src_ < dst_:
            s = self.edge_samplers[edge_i][0](msgs.shape[0]*num_int)
        else:
            s = self.edge_samplers[edge_i][1](msgs.shape[0]*num_int)
    else:
        if src_ < dst_:
            s = self.edge_samplers[edge_i](msgs.shape[0]*num_int)
        else:
            s = -self.edge_samplers[edge_i](msgs.shape[0]*num_int)
    # Translate particles into src_ frame using sampled deltas
    samples = msgs + s.view(msgs.shape[0],num_int,-1)
    # Change view for feeding into network
    samples = samples.view(msgs.shape[0]*num_int,-1)
    # Concatenate features with particles and feed into likelihood network of src_
    # Remember, samples are now in src_'s frame
    # This is the key step that causes us to turn off likelihood gradients; we don't want the src_ likelihood
    # factor learning to weight particles based on feedback from dst_ ground truth
    if self.shared_feats:
        wgts = self.node_likelihoods[src_](torch.cat((samples,
                                                      glbl_feats.unsqueeze(1).repeat((1, self.particle_count * num_int, 1))
                                                                .view(self.batch_size * self.particle_count * num_int, -1)),
                                                     dim=1)).view(self.batch_size, self.particle_count, num_int)
    else:
        wgts = self.node_likelihoods[src_](torch.cat((samples,
                                                      glbl_feats[src_].unsqueeze(1).repeat((1, self.particle_count * num_int, 1))
                                                                      .view(self.batch_size * self.particle_count * num_int, -1)),
                                                     dim=1)).view(self.batch_size, self.particle_count, num_int)
    # Average over the num_int samples
    wgts = wgts.mean(dim=-1)
    # Normalize scores of outgoing particles, these are the w_unary scores
    wgts = wgts / wgts.sum(dim=1,keepdim=True)
    # Store unary scores for use later
    self.message_weights_unary[dst_][:,src_i] = wgts
```
</details>
## 3. message weights neighborhood
- message_weights_neigh - Shape: (batch_size, num_neighbors, particle_count)
- This stores the neighbor influence weights

<details>
<summary>Initialization Code</summary>
```python=
# diffBP/networks/dnbp_synthetic/dnbp.py
self.message_weights_neigh = [torch.ones(self.batch_size, len(self.inc_nghbrs[i]),
                                         self.particle_count).to(device=self.device).type(self.type)
                              / (len(self.inc_nghbrs[i]) * self.particle_count)
                              for i in range(self.num_nodes)]
```
</details>
- Updated in the **message_update** function

(From **1. Message Update**)
- Computing Neighbor Weights ($w_{neigh}$)
- The effect of neighboring nodes on message particles is computed.
- For each neighboring node:
- The function finds the correct edge index in the graph.
- Translates particles into the neighbor’s frame using **Edge Densities** (**pairwise density function** $\psi_{sd}$) .
- Computes probability scores based on the likelihood network.
- Normalizes and stores the final neighbor influence weights.
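
The repository's neighbor-weight update code is not reproduced here. Below is a schematic sketch of what the weighting can look like, following the usual nonparametric BP scheme in which each message particle is scored against the other incoming messages of the source node; the Gaussian edge kernel and the function name are assumptions for illustration, not the repository's code.
```python=
# Schematic w_neigh computation (illustrative; assumes a Gaussian pairwise kernel).
import torch

def neighbor_weights(msg_particles, other_msgs, sigma=0.5):
    """msg_particles: (P, 2) particles of the message s->d (in s's frame).
    other_msgs: list of (particles (P, 2), weights (P,)) for messages u->s with u != d."""
    w_neigh = torch.ones(msg_particles.shape[0])
    for nbr_particles, nbr_weights in other_msgs:
        # Pairwise Gaussian kernel between each particle and the neighbor's message particles.
        sq_dist = ((msg_particles.unsqueeze(1) - nbr_particles.unsqueeze(0)) ** 2).sum(-1)   # (P, P)
        dens = (nbr_weights * torch.exp(-0.5 * sq_dist / sigma**2)).sum(dim=1)               # (P,)
        w_neigh = w_neigh * dens
    return w_neigh / w_neigh.sum()   # normalized neighbor influence weights
```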
# Message Update

# Belief Update
