# TEASER: Fast and Certifiable Point Cloud Registration
H. Yang, J. Shi, and L. Carlone, *TEASER: Fast and Certifiable Point Cloud Registration*, IEEE Transactions on Robotics, vol. 37, no. 2, pp. 314-333, 2020.
[TOC]
## Abstract
> heuristic and certifiable(可驗證)
>
> 拆解 point cloud registration
> - scale: adaptive voting (convex domain)
> - rotation: SDP/GNC (non-convex domain)
> - translation: adaptive voting (convex domian)
>
> GNC for heuristic, SDP for certify
We propose a fast and **certifiable** algorithm for P-REG named TEASER. We first reformulate the problem using TLS, then decouple scale, rotation, and translation estimation. Despite that each subproblem is still non-convex and combinatorial in nature, we show that 1) TLS scale and translation can be solved in polynomial time via an **adaptive voting** scheme, 2) TLS rotation can be relaxed to a SDP and the relaxation is tight, 3) the graph-theoretic framework allows drastic pruning of outliers by finding the maximum clique. While solving large SDP relaxations is typically slow, we develop TEASER++ that uses GNC to solve the rotation subproblem and leverages Douglas-Rachford Splitting to efficiently certify global optimality. For both algorithms, we provide theoretical bounds on the estimation errors.
> 
## Robust registration with TLS
registration problem: scale, rotation, translation
$$
\textbf{b}_i=s\textbf{R}\textbf{a}_i+\mathbf{t}+\textbf{o}_i+\epsilon_i
$$
registration without outliers ($\mathbf{o}_i=0$)
$$
\min_{s>0,\mathbf{R}\in\text{SO}(3),\mathbf{t}\in\mathbb{R}^3}\sum_{i=1}^N\frac{1}{\sigma_i^2}\|\mathbf{b}_i-s\textbf{R}\textbf{a}_i-\mathbf{t}\|^2
$$
registration with TLS
$$
\min_{s>0,\mathbf{R}\in\text{SO}(3),\mathbf{t}\in\mathbb{R}^3}\sum_{i=1}^N\min\Big(\frac{1}{\beta_i^2}\|\mathbf{b}_i-s\textbf{R}\textbf{a}_i-\mathbf{t}\|^2,\bar{c}^2\Big)\tag{1}
$$
$\|\epsilon_i\|\leq\beta_i$: noise bound
## Scale, Rotation, Translation
- relative distance is invariant to rotation and translation
- relative position is invariant to translation
### Translation Invariant Measurements (TIM)
> 兩點之間的向量不受 $\mathbf{t}$ 影響
relative position between two points: $\mathbf{t}$ is cancelled out
$$
\mathbf{b}_j-\mathbf{b}_i=s\mathbf{R}(\mathbf{a}_j-\mathbf{a}_i)+(\mathbf{t}-\mathbf{t})+(\mathbf{o}_j-\mathbf{o}_i)+(\epsilon_j-\epsilon_i)=s\mathbf{R}(\mathbf{a}_j-\mathbf{a}_i)+(\mathbf{o}_j-\mathbf{o}_i)+(\epsilon_j-\epsilon_i)
$$
define $\bar{\mathbf{b}}_{ij}=\mathbf{b}_j-\mathbf{b}_i$ and $\bar{\mathbf{a}}_{ij}=\mathbf{a}_j-\mathbf{a}_i$
$$
\bar{\mathbf{b}}_{ij}=s\mathbf{R}\bar{\mathbf{a}}_{ij}+\mathbf{o}_{ij}+\epsilon_{ij}\tag{TIM}
$$
$\|\epsilon_{ij}\|\leq\beta_i+\beta_j=\delta_{ij}$
- depends only on $s,\mathbf{R}$
- number of TIMs is upper-bdd by $N(N-1)/2$

**graph interpretation** $\mathcal{G(V,E)}$
$\mathbf{a,b}\in\mathbb{R}^{3N}$ by concatenating all $\mathbf{a}_i,\mathbf{b}_i$ respectively
$\bar{\mathbf{a}},\bar{\mathbf{b}}\in\mathbb{R}^{3N}$ by concatenating all $\bar{\mathbf{a}}_{ij},\bar{\mathbf{b}}_{ij}$ respectively
then $\bar{\mathbf{a}}=(C\otimes I_3)\mathbf{a}$ and $\bar{\mathbf{b}}=(C\otimes I_3)\mathbf{b}$ where $C$ is the incidence matrix of $\mathcal{G}$
### Translation and Rotation Invariant Measurements (TRIM)
> 兩點之間的距離不受 $\mathbf{R,t}$ 影響
compute the norm of TIM
$$
\|\bar{\mathbf{b}}_{ij}\|=\|s\mathbf{R}\bar{\mathbf{a}}_{ij}+\mathbf{o}_{ij}+\epsilon_{ij}\|\tag{2}
$$
for inliers, consider the triangle inequality $\|\epsilon_{ij}\|\leq\delta_{ij}$
$$
\|s\mathbf{R}\bar{\mathbf{a}}_{ij}\|-\delta_{ij}\leq\|s\mathbf{R}\bar{\mathbf{a}}_{ij}+\epsilon_{ij}\|\leq\|s\mathbf{R}\bar{\mathbf{a}}_{ij}\|+\delta_{ij}
$$
rewrite $(2)$ as (error term is now some value instead of vector)
$$
\|\bar{\mathbf{b}}_{ij}\|=\|s\mathbf{R}\bar{\mathbf{a}}_{ij}\|+\tilde{o}_{ij}+\tilde{\epsilon}_{ij}
$$
$|\tilde{\epsilon}_{ij}|\leq\delta_{ij}$
since $\mathbf{R}$ is norm invariant and set $s_{ij}=\|\bar{\mathbf{b}}_{ij}\|/\|\bar{\mathbf{a}}_{ij}\|,\ o_{ij}^s=\tilde{o}_{ij}/\|\bar{\mathbf{a}}_{ij}\|,\ \epsilon_{ij}^s=\tilde{\epsilon}_{ij}/\|\bar{\mathbf{a}}_{ij}\|$
$$
s_{ij}=s+o_{ij}^s+\epsilon_{ij}^s\tag{TRIM}
$$
$|\epsilon_{ij}^s|\leq\delta_{ij}/\|\bar{\mathbf{a}}_{ij}\|=\alpha_{ij}$
> summary of invariant measurements

## <font color=red>T</font>runcated least squares <font color=red>E</font>stimation <font color=red>A</font>nd <font color=red>SE</font>midefinite <font color=red>R</font>elaxation
1. use TRIM to estimate $\hat{s}$
2. use $\hat{s}$ and TIM to estimate $\hat{\mathbf{R}}$
3. use $\hat{s},\hat{\mathbf{R}}$ and TLS to estimate $\hat{\mathbf{t}}$
### Robust scale estimation
$$
\hat{s}=\min_{s>0}\sum_{k=1}^K\min\Big(\frac{1}{\alpha_k^2}(s-s_k)^2,\bar{c}^2\Big)\tag{3}
$$
$k$ is the number of edges, can be solved by adaptive voting
### Robust rotation estimation
$$
\hat{\mathbf{R}}=\min_{\mathbf{R}\in\text{SO}(3)}\sum_{k=1}^K\min\Big(\frac{1}{\delta_k^2}\|\bar{\mathbf{b}}_k-\hat{s}\textbf{R}\bar{\textbf{a}}_k\|^2,\bar{c}^2\Big)\tag{4}
$$
$(4)$ is also known as the **Robust Wahba** or **Robust Rotation Search** problem
### Robust component-wise translation estimation
substitute $\hat{s},\hat{\mathbf{R}}$ back into $(1)$ to estimate $\hat{\mathbf{t}}$
component-wise: $t_1,t_2,t_3$ independently
$$
\hat{t}_j=\min_{t_j}\sum_{i=1}^N\min\Big(\frac{1}{\beta_i^2}([\mathbf{b}_i-\hat{s}\hat{\textbf{R}}\textbf{a}_i]_j-t_j)^2,\bar{c}^2\Big),\ j=1,2,3
$$
can be solved by adaptive voting
### Max Clique Inlier Selection (MCIS)
> 利用圖論性質在scale estimation這步更進一步過濾outliers
after $\hat{s}$ is estimated, we pruned outliers $s_{ij}$ where $|s_{ij}-\hat{s}|>\bar{c}\alpha_{ij}$
that is, we obtained a subgraph $\mathcal{G(V,E^\prime)}$ where $\mathcal{E}^\prime\subseteq\mathcal{E}$
Edges corresponding to inliers form a clique in $\mathcal{E}^\prime$, and there is at least one maximal clique in $\mathcal{E}^\prime$ that contains all inliers.
**solve maximum clique**
> Listing All Maximal Cliques in Sparse Graphs in Near-optimal Time (ISAAC, 2010)
drastically reduces the number of outliers
### Algorithm

## Adaptive voting
> solve scalar TLS estimation in polynomial time
(take scale estimation for example)
given $s\in\mathbb{R}$, define the consensus set of $s$:
$$
\mathcal{T}(s)=\{k\mid\frac{(s-s_k)^2}{\alpha^2_k}\leq\bar{c}^2\}
$$
> Ex. $\{s_1,s_2,s_3\}=\{6,11,10\},\ \bar{c}^2=2$
> $\mathcal{T}(8)=\{1,3\},\ \mathcal{T}(10)=\{2,3\}$(不管$\alpha$)
> 
> 概念是每個區間內$s$算出來的consensus set (inliers) 會一樣,每個區間只要找到一個$\hat{s}_i$代回$(3)$即可
for any $s\in\mathbb{R}$, there are at most $2K-1$ different nonempty consensus sets $\mathcal{T}_1,\dots,\mathcal{T}_{2K-1}$. $(3)$ can be computed by enumeration as:
$$
\hat{s}=\min\Big\{f_s(\hat{s}_i)\Bigm|\hat{s}_i=\frac{\sum_{k\in\mathcal{T}_i}\frac{s_k}{\alpha_k^2}}{\sum_{k\in\mathcal{T}_i}\frac{1}{\alpha^2_k}},\ \forall i\Big\}\tag{5}
$$
>$f_s(\cdot)$ is the objective function of $(3)$(用幾何平均的方式找$\hat{s}_i$)
>
>consider the second case: $\bar{c}=\sqrt{2},\ \alpha_1=\frac{4}{\sqrt{2}},\ \alpha_2=\alpha_3=\frac{2}{\sqrt{2}}$
>
>\begin{align}&\hat{s}_2=\frac{\frac{2}{16}\cdot 6+\frac{2}{4}\cdot 10}{\frac{2}{16}+\frac{2}{4}}=9.2\in I_1\cap I_2\\
&\hat{s}_3=\frac{\frac{2}{16}\cdot 6+\frac{2}{4}\cdot 10+\frac{2}{4}\cdot 11}{\frac{2}{16}+\frac{2}{4}+\frac{2}{4}}=10\in I_1\cap I_2\cap I_3\\
&f(\hat{s}_2)=\frac{2}{16}(9.2-6)^2+\frac{2}{4}(9.2-10)^2+\frac{2}{4}(9.2-11)^2=1.28+0.32+1.62=3.22\\[2pt]
&f(\hat{s}_3)=\underbrace{\frac{2}{16}(10-6)^2}_{\geq\bar{c}^2}+\frac{2}{4}(10-10)^2+\frac{2}{4}(10-11)^2=2+0+0.5=2.5
\end{align}
### Algorithm

> Eq. $(14)$ is $(5)$ in this note
## Semidefinite relaxation & Certification
> solve SE(3) TLS estimation in polynomial time in practical problems
>
> 1. formulate non-convex QCQP by quaternion and binary cloning
> 2. relax into convex SDP with redundant constraints
> 3. relaxation enables a fast optimality certifier
>
> 實際應用TEASER++還是用GNC處理,但利用QCQP、SDP理論做certification
### Quadratically Constrained Quadratic Program
> $\begin{bmatrix}\mathbf{R}x\\0\end{bmatrix}=\mathbf{q}\circ\hat{x}\circ\mathbf{q}^{-1}$, where $\hat{x}$ is the homogeneous coordinate of $x$
rewrite $(4)$ with quaternion formulation:
$$
\hat{\mathbf{q}}=\min_{\mathbf{q}\in\mathbb{S}^3}\sum_{k=1}^K\min\Big(\frac{1}{\delta_k^2}\|\hat{\mathbf{b}}_k-\mathbf{q}\circ\hat{\textbf{a}}_k\circ\mathbf{q}^{-1}\|^2,\bar{c}^2\Big)\tag{6}
$$
where $\hat{\mathbf{a}}=\begin{bmatrix}\hat{s}\bar{\mathbf{a}}_k\\0\end{bmatrix},\ \hat{\mathbf{b}}=\begin{bmatrix}\hat{s}\bar{\mathbf{b}}_k\\0\end{bmatrix}$ and $\mathbb{S}^3\subseteq\mathbb{R}^4$ is the unit sphere
rewrite $\min$ in $(6)$ by:
$$
\min(x,y)=\min_{\theta\in\{+1,-1\}}\frac{1+\theta}{2}x+\frac{1-\theta}{2}y
$$
rewrite $(6)$ as a mixed-integer problem:
$$
\min_{\mathbf{q}\in\mathbb{S}^3,\theta_k=\{\pm 1\}}\sum_{k=1}^K\frac{1+\theta_k}{2}\cdot\frac{1}{\delta_k^2}\|\hat{\mathbf{b}}_k-\mathbf{q}\circ\hat{\textbf{a}}_k\circ\mathbf{q}^{-1}\|^2+\frac{1-\theta_k}{2}\cdot\bar{c}^2\tag{7}
$$
> $\theta_k$ determines inliers ($\theta_k=+1$) or outliers ($\theta_k=-1$)
let $\mathbf{q}_k=\theta_k\mathbf{q}$, then $\mathbf{q}_k^T\mathbf{q}=\theta_k$ and $\mathbf{q}\circ\hat{\mathbf{a}}_k\circ\mathbf{q}_k^{-1}=\theta_k(\mathbf{q}\circ\hat{\mathbf{a}}_k\circ\mathbf{q}^{-1})$:
\begin{align}\implies&\min_{\mathbf{q}\in\mathbb{S}^3,\theta_k=\{\pm 1\}}\sum_{k=1}^K\frac{1}{\delta_k^2}\|\Big(\frac{1+\theta_k}{2}\Big)(\hat{\mathbf{b}}_k-\mathbf{q}\circ\hat{\textbf{a}}_k\circ\mathbf{q}^{-1})\|^2+\frac{1-\mathbf{q}_k^T\mathbf{q}}{2}\bar{c}^2\\\implies&\min_{\mathbf{q}\in\mathbb{S}^3,\theta_k=\{\pm 1\}}\sum_{k=1}^K\frac{1}{4\delta_k^2}\|\hat{\mathbf{b}}_k-\mathbf{q}\circ\hat{\textbf{a}}_k\circ\mathbf{q}^{-1}+\mathbf{q}^T\mathbf{q}_k\hat{\mathbf{b}}_k-\mathbf{q}\circ\hat{\mathbf{a}_k}\circ\mathbf{q}_k^{-1}\|^2+\frac{1-\mathbf{q}_k^T\mathbf{q}}{2}\bar{c}^2\tag{8}\end{align}
> create a quaternion $\mathbf{q}_k$ that determines inliers ($\mathbf{q}_k=\mathbf{q}$) or outliers ($\mathbf{q}_k=-\mathbf{q}$)
---
reformulate $(8)$ into QCQP
> - $\|\mathbf{q}\circ\hat{\textbf{a}}_k\circ\mathbf{q}^{-1}\|^2=\|\mathbf{R}\hat{\mathbf{a}}\|^2=\|\hat{\mathbf{a}}\|^2$
> - $\|\mathbf{q}^T\mathbf{q}_k\hat{\mathbf{b}}_k\|^2=\|\hat{\mathbf{b}}_k\|^2$
> - $\|\mathbf{q}\circ\hat{\mathbf{a}}\circ\mathbf{q}_k^{-1}\|^2=\|\theta_k\cdot(\mathbf{q}\circ\hat{\textbf{a}}_k\circ\mathbf{q}^{-1})\|^2=\|\hat{\mathbf{a}}\|^2$
> - $(\mathbf{q}\circ\hat{\textbf{a}}_k\circ\mathbf{q}^{-1})^T(\mathbf{q}\circ\hat{\textbf{a}}_k\circ\mathbf{q}^{-1}_k)=\theta_k\|\mathbf{q}\circ\hat{\textbf{a}}_k\circ\mathbf{q}^{-1}\|^2=\theta_k\|\hat{\mathbf{a}}\|^2=\mathbf{q}^T\mathbf{q}_k\|\hat{\mathbf{a}}\|$
\begin{aligned}
&\|\hat{\mathbf{b}}_k-\mathbf{q}\circ\hat{\textbf{a}}_k\circ\mathbf{q}^{-1}+\mathbf{q}^T\mathbf{q}_k\hat{\mathbf{b}}_k-\mathbf{q}\circ\hat{\mathbf{a}}\circ\mathbf{q}_k^{-1}\|^2\\[6pt]=&2\|\hat{\mathbf{b}}_k\|^2+2\|\hat{\mathbf{a}}_k\|^2-2\hat{\mathbf{b}}_k^T(\mathbf{q}\circ\mathbf{a}_k\circ\mathbf{q}^{-1})+2\mathbf{q}^T\mathbf{q}_k\|\hat{\mathbf{b}}_k\|^2-2\hat{\mathbf{b}}_k^T(\mathbf{q}\circ\hat{\textbf{a}}_k\circ\mathbf{q}^{-1}_k)\\[4pt]&-2\mathbf{q}^T\mathbf{q}_k\hat{\mathbf{b}}_k^T(\mathbf{q}\circ\mathbf{a}_k\circ\mathbf{q}^{-1})+\mathbf{q}^T\mathbf{q}_k\|\hat{\mathbf{a}}\|-2\mathbf{q}^T\mathbf{q}_k\hat{\mathbf{b}}_k^T(\mathbf{q}\circ\hat{\textbf{a}}_k\circ\mathbf{q}^{-1}_k)
\end{aligned}
> - $\mathbf{q}^T\mathbf{q}_k\hat{\mathbf{b}}_k^T(\mathbf{q}\circ\hat{\textbf{a}}_k\circ\mathbf{q}^{-1}_k)=(\theta_k)^2\hat{\mathbf{b}}_k^T(\mathbf{q}\circ\hat{\textbf{a}}_k\circ\mathbf{q}^{-1})=\hat{\mathbf{b}}_k^T(\mathbf{q}\circ\hat{\textbf{a}}_k\circ\mathbf{q}^{-1})$
> - $\hat{\mathbf{b}}_k^T(\mathbf{q}\circ\hat{\textbf{a}}_k\circ\mathbf{q}^{-1}_k)=\theta_k\hat{\mathbf{b}}_k^T(\mathbf{q}\circ\hat{\textbf{a}}_k\circ\mathbf{q}^{-1})=\mathbf{q}^T\mathbf{q}_k\hat{\mathbf{b}}_k^T(\mathbf{q}\circ\hat{\textbf{a}}_k\circ\mathbf{q}^{-1})$
$$
=2\|\hat{\mathbf{b}}_k\|^2+2\|\hat{\mathbf{a}}_k\|^2+2\mathbf{q}^T\mathbf{q}_k\|\hat{\mathbf{b}}_k\|^2+2\mathbf{q}^T\mathbf{q}_k\|\hat{\mathbf{a}}\|-4\hat{\mathbf{b}}_k^T(\mathbf{q}\circ\mathbf{a}_k\circ\mathbf{q}^{-1})-4\mathbf{q}^T\mathbf{q}_k\hat{\mathbf{b}}_k^T(\mathbf{q}\circ\hat{\textbf{a}}_k\circ\mathbf{q}^{-1}_k)
$$
>let $\Omega_1(\mathbf{q})=\begin{bmatrix}w&-z&y&x\\z&w&-x&y\\-y&x&w&z\\-x&-y&-z&w\end{bmatrix}$ and $\Omega_2(\mathbf{q})=\begin{bmatrix}w&z&-y&x\\-z&w&x&y\\y&-x&w&z\\-x&-y&-z&w\end{bmatrix}$
>
> - $\mathbf{q}_a\circ\mathbf{q}_b=\Omega_1(\mathbf{q}_a)\mathbf{q}_b=\Omega_2(\mathbf{q}_b)\mathbf{q}_a$
> - $\ \Omega_2(\mathbf{q}^{-1})=\Omega_2^{-T}(\mathbf{q})$
> - $-\Omega_1^T(\mathbf{q})=\Omega_1(\mathbf{q})$
> - $\hat{\mathbf{b}}_k^T(\mathbf{q}\circ\hat{\textbf{a}}_k\circ\mathbf{q}^{-1})=\hat{\mathbf{b}}_k^T(\Omega_2^T(\mathbf{q})\Omega_1(\mathbf{q})\hat{\mathbf{a}}_k))=\mathbf{q}^T\Omega_1^T(\hat{\mathbf{b}}_k)\Omega_2(\hat{\mathbf{a}}_k)\mathbf{q}$
\begin{aligned}
&=2\|\hat{\mathbf{b}}_k\|^2+2\|\hat{\mathbf{a}}_k\|^2+2\mathbf{q}^T\mathbf{q}_k\|\hat{\mathbf{b}}_k\|^2+2\mathbf{q}^T\mathbf{q}_k\|\hat{\mathbf{a}}\|-4\mathbf{q}^T\Omega_1^T(\hat{\mathbf{b}}_k)\Omega_2(\hat{\mathbf{a}}_k)\mathbf{q}-4\mathbf{q}^T\mathbf{q}_k\mathbf{q}^T\Omega_1^T(\hat{\mathbf{b}}_k)\Omega_2(\hat{\mathbf{a}}_k)\mathbf{q})\\[6pt]&=2\|\hat{\mathbf{b}}_k\|^2+2\|\hat{\mathbf{a}}_k\|^2+2\mathbf{q}^T\mathbf{q}_k\|\hat{\mathbf{b}}_k\|^2+2\mathbf{q}^T\mathbf{q}_k\|\hat{\mathbf{a}}\|-4\mathbf{q}^T\Omega_1^T(\hat{\mathbf{b}}_k)\Omega_2(\hat{\mathbf{a}}_k)\mathbf{q}-4\mathbf{q}^T\Omega_1^T(\hat{\mathbf{b}}_k)\Omega_2(\hat{\mathbf{a}}_k)\mathbf{q}_k)\\[6pt]&=2\|\hat{\mathbf{b}}_k\|^2+2\|\hat{\mathbf{a}}_k\|^2+2\mathbf{q}^T\mathbf{q}_k\|\hat{\mathbf{b}}_k\|^2+2\mathbf{q}^T\mathbf{q}_k\|\hat{\mathbf{a}}\|+4\mathbf{q}^T\Omega_1(\hat{\mathbf{b}}_k)\Omega_2(\hat{\mathbf{a}}_k)\mathbf{q}+4\mathbf{q}^T\Omega_1(\hat{\mathbf{b}}_k)\Omega_2(\hat{\mathbf{a}}_k)\mathbf{q}_k)
\end{aligned}
is quadratic in terms of $\mathbf{q}$ and $\mathbf{q}_k$, we can now rewrite the cost function in $(8)$:
\begin{aligned}&\frac{1}{4\delta_k^2}\|\hat{\mathbf{b}}_k-\mathbf{q}\circ\hat{\textbf{a}}_k\circ\mathbf{q}^{-1}+\mathbf{q}^T\mathbf{q}_k\hat{\mathbf{b}}_k-\mathbf{q}\circ\hat{\mathbf{a}}\circ\mathbf{q}_k^{-1}\|^2+\frac{1-\mathbf{q}_k^T\mathbf{q}}{2}\bar{c}^2\\[12pt]&=\frac{\|\hat{\mathbf{b}}_k\|^2+\|\hat{\mathbf{a}}_k\|^2+\mathbf{q}^T\mathbf{q}_k\|\hat{\mathbf{b}}_k\|^2+\mathbf{q}^T\mathbf{q}_k\|\hat{\mathbf{a}}\|+2\mathbf{q}^T\Omega_1(\hat{\mathbf{b}}_k)\Omega_2(\hat{\mathbf{a}}_k)\mathbf{q}+2\mathbf{q}^T\Omega_1(\hat{\mathbf{b}}_k)\Omega_2(\hat{\mathbf{a}}_k)\mathbf{q}_k)}{2\delta_k^2}+\frac{1-\mathbf{q}_k^T\mathbf{q}}{2}\bar{c}^2\\[6pt]&=\frac{\mathbf{q}_k^T\mathbf{q}_k\|\hat{\mathbf{b}}_k\|^2+\mathbf{q}_k^T\mathbf{q}_k\|\hat{\mathbf{a}}_k\|^2+\mathbf{q}^T\mathbf{q}_k\|\hat{\mathbf{b}}_k\|^2+\mathbf{q}^T\mathbf{q}_k\|\hat{\mathbf{a}}\|+2\mathbf{q}_k^T\Omega_1(\hat{\mathbf{b}}_k)\Omega_2(\hat{\mathbf{a}}_k)\mathbf{q}+2\mathbf{q}_k^T\Omega_1(\hat{\mathbf{b}}_k)\Omega_2(\hat{\mathbf{a}}_k)\mathbf{q}_k)}{2\delta_k^2}+\frac{\mathbf{q}_k^T\mathbf{q}_k-\mathbf{q}_k^T\mathbf{q}}{2}\bar{c}^2\\[6pt]&=\mathbf{q}_k^T\Big(\underbrace{\frac{(\|\hat{\mathbf{b}}_k\|^2+\|\hat{\mathbf{a}}_k\|^2)\mathbf{I}_4+2\Omega_1(\hat{\mathbf{b}}_k)\Omega_2(\hat{\mathbf{a}}_k)}{2\delta_k^2}+\frac{\bar{c}^2}{2}\mathbf{I}_4}_{\mathbf{Q}_{kk}}\Big)\mathbf{q}_k+2\mathbf{q}^T\Big(\underbrace{\frac{(\|\hat{\mathbf{b}}_k\|^2+\|\hat{\mathbf{a}}_k\|^2)\mathbf{I}_4+2\Omega_1(\hat{\mathbf{b}}_k)\Omega_2(\hat{\mathbf{a}}_k)}{4\delta_k^2}-\frac{\bar{c}^2}{4}\mathbf{I}_4}_{\mathbf{Q}_{0k}}\Big)\mathbf{q}_k\end{aligned}
therefore, $(8)$ is equivalent to:
$$
\min_{\mathbf{q}\in\mathbb{S}^3,\mathbf{q}_k\in\{\pm\mathbf{q}\}}\sum_{k=1}^K\mathbf{q}_k^T\mathbf{Q}_{kk}\mathbf{q}_k+2\mathbf{q}^T\mathbf{Q}_{0k}\mathbf{q}_k\tag{9}
$$
define a $4\times 4$ block symmetric matrix :
$$
[\mathbf{Q}_k]_\mathbf{uv}=\begin{cases}\mathbf{Q}_{kk}&\text{if }\mathbf{u=q}_k\text{ and }\mathbf{v=q}_k\\\mathbf{Q}_{0k}&\text{if }(\mathbf{u=q}\text{ and }\mathbf{v=q}_k)\text{ or }(\mathbf{u=q}_k\text{ and }\mathbf{v=q})\\\mathbf{0}_{4\times4}&\text{otherwise}\end{cases}
$$
let $x=\begin{bmatrix}\mathbf{q}\\\mathbf{q}_1\\\vdots\\\mathbf{q}_k\end{bmatrix}$ and set $\mathbf{Q}=\sum_{k=1}^K\mathbf{Q}_k$, $(9)$ is equivalent to:
$$
\min_{\mathbf{x}\in\mathbb{R}^{4(K+1)}}\sum_{k=1}^K\mathbf{x}^T\mathbf{Q}_k\mathbf{x}\implies\min_{\mathbf{x}\in\mathbb{R}^{4(K+1)}}\mathbf{x}^T\mathbf{Q}\mathbf{x}
$$
where $\mathbf{q}\in\mathbb{S}^3,\mathbf{q}_k\in\{\pm\mathbf{q}\}$ are the constraints left to reformulate:
$$
\begin{cases}\mathbf{q}\in\mathbb{S}^3\\\mathbf{q}_k\in\{\pm\mathbf{q}\}\\k=1,\dots,K\end{cases}\iff\begin{cases}\mathbf{x}_q^T\mathbf{x}_q=1\\\mathbf{x}_{q_k}\mathbf{x}^T_{q_k}=\mathbf{x}_q\mathbf{x}_q^T\\k=1,\dots,K\end{cases}
$$
---
> 總結 包括將 $\mathbf{q}$ 向量擴展與建構半正定矩陣,利用quaternion與 $\min$ 轉換的binary clone性質做reformulation
let $x=\begin{bmatrix}\mathbf{q}\\\mathbf{q}_1\\\vdots\\\mathbf{q}_k\end{bmatrix}$, $(8)$ can be reformulated to a non-convex QCQP as follows:
$$
\begin{aligned}\min_{\mathbf{x}\in\mathbb{R}^{4(K+1)}}&\mathbf{x}^T\mathbf{Qx}\\\text{s.t. }\ \ \ &\mathbf{x}_q^T\mathbf{x}_q=1\\&\mathbf{x}_{q_k}\mathbf{x}^T_{q_k}=\mathbf{x}_q\mathbf{x}_q^T,\ \forall k=1,\dots,K\end{aligned}
$$
where $\mathbf{Q}\in\mathcal{S}^{4(K+1)}\subseteq M_{4(K+1)}(\mathbb{R})$ is a real symmetric matrix
> QCQP is still nonconvex since:
> 1. Q is not psd
> 2. $\mathbf{x}_q\mathbf{x}_q^T$ is not quadratic constraint
### ****Semidefinite Relaxation****
- $\mathbf{x}_{q_i}\mathbf{x}_{q_j}^T=\mathbf{q}_i\mathbf{q}_j^T=(\theta_i\mathbf{q})(\theta_j\mathbf{q}^T)=(\theta_j\mathbf{q})(\theta_i\mathbf{q})=\mathbf{q}_j\mathbf{q}_i^T=(\mathbf{x}_{q_i}\mathbf{x}_{q_j}^T)^T$
let $\mathbf{q}_0=\mathbf{q}$ for simplicity (primal problem):
$$
\begin{aligned}\min_{\mathbf{x}\in\mathbb{R}^{4(K+1)}}&\mathbf{x}^T\mathbf{Qx}\\\text{s.t. }\ \ \ &\mathbf{x}_{q_0}^T\mathbf{x}_{q_0}=1\\&\mathbf{x}_{q_k}\mathbf{x}^T_{q_k}=\mathbf{x}_{q_0}\mathbf{x}_{q_0}^T,\ \forall k=1,\dots,K\end{aligned}\tag{P1-QCQP}
$$
add redundant constraint to QCQP:
$$
\begin{aligned}\min_{\mathbf{x}\in\mathbb{R}^{4(K+1)}}&\mathbf{x}^T\mathbf{Qx}\\\text{s.t. }\ \ \ &\mathbf{x}_{q_0}^T\mathbf{x}_{q_0}=1\\&\mathbf{x}_{q_k}\mathbf{x}^T_{q_k}=\mathbf{x}_{q_0}\mathbf{x}_{q_0}^T,\ \forall k=1,\dots,K\\&\mathbf{x}_{q_i}\mathbf{x}_{q_j}^T=(\mathbf{x}_{q_i}\mathbf{x}_{q_j}^T)^T,\ \forall 0\leq i<j\leq K\end{aligned}\tag{10}
$$
define symmetric positive semidefinite matrix $\mathbf{Z}$:
$$
\mathbf{Z=xx}^T=\begin{bmatrix}\mathbf{q}_0\mathbf{q}_0^T&\mathbf{q}_0\mathbf{q}_1^T&\cdots&\mathbf{q}_0\mathbf{q}_K^T\\\mathbf{q}_1\mathbf{q}_0^T&\mathbf{q}_1\mathbf{q}_1^T&\cdots&\mathbf{q}_1\mathbf{q}_K^T\\\vdots&\vdots&\ddots&\vdots\\\mathbf{q}_K\mathbf{q}_0^T&\mathbf{q}_K\mathbf{q}_1^T&\cdots&\mathbf{q}_K\mathbf{q}_K^T\end{bmatrix}=\begin{bmatrix}\mathbf{q}_0\mathbf{q}_0^T&\mathbf{q}_0\mathbf{q}_1^T&\cdots&\mathbf{q}_0\mathbf{q}_K^T\\\mathbf{q}_0\mathbf{q}_1^T&\mathbf{q}_1\mathbf{q}_1^T&\cdots&\mathbf{q}_1\mathbf{q}_K^T\\\vdots&\vdots&\ddots&\vdots\\\mathbf{q}_0\mathbf{q}_K^T&\mathbf{q}_1\mathbf{q}_K^T&\cdots&\mathbf{q}_K\mathbf{q}_K^T\end{bmatrix}\in\mathcal{S}_+^{4(K+1)}
$$
- $Z\succeq0$
- $\text{rank}(Z)=1$
- $\mathbf{x}^T\mathbf{Qx}=\text{tr}(\mathbf{Qxx}^T)=\text{tr}(\mathbf{QZ})$ (trace of quadratic form)
- $\mathbf{x}_{q_k}^T\mathbf{x}_{q_k}=[\mathbf{Z}]_{kk}$ ($k^{th}$ $4\times4$ diagonal block of $\mathbf{Z}$)
primal problem can be reformed as:
$$
\begin{aligned}\min_{\mathbf{Z}\succeq0}\ \ &\text{tr}(\mathbf{QZ})\\\text{s.t. }\ &\text{tr}([\mathbf{Z}]_{00})=1\\&[\mathbf{Z}]_{kk}=[\mathbf{Z}]_{00},\ \forall k=1,\dots,K\\&\text{rank}(Z)=1\end{aligned}
$$
relax rank-constraint, forms a convex SDP:
$$
\begin{aligned}\min_{\mathbf{Z}\succeq0}\ \ &\text{tr}(\mathbf{QZ})\\\text{s.t. }\ &\text{tr}([\mathbf{Z}]_{00})=1\\&[\mathbf{Z}]_{kk}=[\mathbf{Z}]_{00},\ \forall k=1,\dots,K\end{aligned}\tag{P2-SDP}
$$
add redundant constraint to SDP:
$$
\begin{aligned}\min_{\mathbf{Z}\succeq0}\ \ &\text{tr}(\mathbf{QZ})\\\text{s.t. }\ &\text{tr}([\mathbf{Z}]_{00})=1\\&[\mathbf{Z}]_{kk}=[\mathbf{Z}]_{00},\ \forall k=1,\dots,K\\&[\mathbf{Z}]_{ij}=[\mathbf{Z}]_{ij}^T,\ \forall0\leq i<j\leq K\end{aligned}\tag{11}
$$
## Tight relaxation
> 已經是最終要解的SDP了,尚未討論sufficient and necessary condition,但先說結論
>
- relaxation is tight, i.e., optimal values $f_{P1}^*=f_{P2}^*$
- $Z^*=x^*(x^*)^T$ and $\pm x^*$ are two unique
### Dual problem
> define matrices:
>
>
> $\text{tr}(\mathbf{x_{q_0}}\mathbf{x_{q_0}}^T)=\text{tr}(\mathbf{Jx}\mathbf{x}^T),\ \mathbf{J}=\begin{bmatrix}\mathbf{I}_4&&&\\&0&&\\&&\ddots&\\&&&0\end{bmatrix}$
>
> $\mathbf{x_{q_i}}\mathbf{x_{q_i}}^T-\mathbf{x_{q_0}}\mathbf{x_{q_0}}^T=\text{tr}(\mathbf{\Lambda}_i\mathbf{xx}^T),\ \mathbf{\Lambda}_i=\begin{bmatrix}\mathbf{\Lambda}_{ii}&&&\\&\ddots&&\\&&-\mathbf{\Lambda}_{ii}&\\&&&\ddots\end{bmatrix}$
>
> $\mathbf{\Lambda}=\sum_{i=1}^K\mathbf{\Lambda}_i=\begin{bmatrix}\sum_{i=1}^K\mathbf{\Lambda}_{ii}&&&\\&-\mathbf{\Lambda}_{11}&&\\&&\ddots&\\&&&-\mathbf{\Lambda}_{KK}\end{bmatrix}$
>
rewrite (P1):
$$
\begin{aligned}\min_{\mathbf{x}\in\mathbb{R}^{4(K+1)}}&\text{tr}(\mathbf{Qxx}^T)\\\text{s.t. }\ \ &\text{tr}(\mathbf{Jxx}^T)-1=0\\&\mathbf{x}_{q_k}\mathbf{x}^T_{q_k}-\mathbf{x}_{q_0}\mathbf{x}_{q_0}^T=0,\ \forall i=k,\dots,K\end{aligned}\tag{P}
$$
Lagrangian of (P1):
$$
\begin{aligned}\mathcal{L}(x,\mu,\mathbf{\Lambda})&=\text{tr}(\mathbf{Qxx}^T)-\mu(\text{tr}(\mathbf{Jxx}^T)-1)-\text{tr}(\mathbf{\Lambda}\mathbf{xx}^T)\\&=\text{tr}((\mathbf{Q}-\mu\mathbf{J-\Lambda)xx}^T)+\mu\\&=\mathbf{x}^T\underbrace{(\mathbf{Q}-\mu\mathbf{J-\Lambda)}}_\text{consider p.s.d}\mathbf{x}+\mu\end{aligned}
$$
dual function:
$$
g(\mu,\mathbf{\Lambda})=\inf_{\mathbf{x}}\mathcal{L}(x,\mu,\mathbf{\Lambda})=\begin{cases}\mu&\mathbf{Q}-\mu\mathbf{J-\Lambda}\succeq0
\\-\infty&\text{otherwise}\end{cases}
$$
dual problem (convex SDP):
$$
\begin{aligned}\max_{\mu,\mathbf{\Lambda}}\ \ &\mu\\\text{s.t. }\ &\mathbf{Q}-\mu\mathbf{J-\Lambda}\succeq0\end{aligned}\tag{D}
$$
now reconsider (P2):
$$
\begin{aligned}\min_{\mathbf{Z}}\ \ &\text{tr}(\mathbf{QZ}) \\\text{s.t. }\ &\text{tr}(\mathbf{JZ})-1=0\\&[\mathbf{Z}]_{kk}-[\mathbf{Z}]_{00}=0,\ \forall k=1,\dots,K\\&\mathbf{Z}\succeq0\end{aligned}\tag{P2}
$$
Lagrangian of (P2):
$$
\begin{aligned}\mathcal{L}(\mathbf{Z},\mu,\mathbf{\Lambda,W})&=\text{tr}(\mathbf{QZ})-\mu(\text{tr}(\mathbf{JZ})-1)-\text{tr}(\mathbf{\Lambda Z})-\text{tr}(\mathbf{W Z})\\&=\text{tr}((\mathbf{Q}-\mu\mathbf{J-\Lambda-W)Z})-\mu\end{aligned}
$$
dual function:
$$
g(\mu,\mathbf{\Lambda,W})=\inf_{\mathbf{Z}}\mathcal{L}(x,\mu,\mathbf{\Lambda,W})=\begin{cases}\mu&\mathbf{Q}-\mu\mathbf{J-\Lambda-W}\succeq0
\\-\infty&\text{otherwise}\end{cases}
$$
dual problem (convex SDP):
$$
\begin{aligned}\max_{\mu,\mathbf{\Lambda}}\ \ &\mu\\\text{s.t. }\ &\mathbf{Q}-\mu\mathbf{J-\Lambda-W}\succeq0\end{aligned}
$$
setting $\mathbf{W}=0$ maximizes $\mu$, dual problem (somehow) becomes (D)
$$
\begin{aligned}\max_{\mu,\mathbf{\Lambda}}\ \ &\mu\\\text{s.t. }\ &\mathbf{Q}-\mu\mathbf{J-\Lambda}\succeq0\end{aligned}\tag{D}
$$
that is, dual of (D) is (P2) (i.e., (SDP)) when $\mathbf{W}=0$
### Weak duality
consider the objective function:
$$
\begin{aligned}f_\text{SDP}-f_\text{D}&=\text{tr}(\mathbf{QZ})-\mu\\&=\text{tr}(\mathbf{QZ})-\mu\underbrace{\text{tr}(\mathbf{JZ})}_{=1}-\underbrace{\text{tr}(\mathbf{\Lambda Z})}_{=0}\\&=\text{tr}((\mathbf{Q}-\mu\mathbf{J-\Lambda)Z})\geq0\end{aligned}\tag{12}
$$
by weak duality:
- (D) is convex dual problem of (P): $f_\text{D}^*\leq f_\text{P}^*$
- (SDP) is convex relaxation of (P): $f_\text{D}^*\leq f_\text{SDP}^*\leq f_\text{P}^*$
> 推得(SDP)性質會比(D)好,也是為什麼要解SDP relaxation而不是dual problem
>
### Necessary condition for optimal (KKT)
KKT condition for (P):
- (primal feasibility) $\text{tr}(\mathbf{Jx^*x^*}^T)-1=0$
- (primal feasibility) $\mathbf{x^*}_{q_k}\mathbf{x}^{*T}{q_k}-\mathbf{x^*}_{q_0}\mathbf{x}_{q_0}^{*T}=0$
- (stationary) $(\mathbf{Q}-\mu^*\mathbf{J-\Lambda^*)x^*}=0$ (Lagragian微分=0)
> no complementary slackness, dual feasibility since there is no inequality
>
### Sufficient condition for Strong duality
If $(\mathbf{x}^*,\mu^*,\mathbf{\Lambda}^*)$ satisfy KKT conditions and dual constraint $\mathbf{Q}-\mu^*\mathbf{J-\Lambda^*}\succeq0$ , then:
1. $f_\text{D}^*= f_\text{SDP}^*=f_\text{P}^*$ (strong duality holds)
2. $\mathbf{x}^*$ is a global minimizer for (P)
Moreover, if $\text{rank}(\mathbf{Q}-\mu^*\mathbf{J-\Lambda^*)}=4(K+1)-1$, then:
3. $\pm\mathbf{x}^*$ are the two unique global minimizers for (P)
4. the optimal solution of (SDP), denoted as $\mathbf{Z}^*$, has rank 1 and can be written as $\mathbf{Z^*=x^*x^*}^T$
> pf)
>
> 1. consider the stationary condition:
>
> $$
> \begin{aligned}&(\mathbf{Q}-\mu^*\mathbf{J-\Lambda^*)x^*}=0\\\implies&\mathbf{x^*}^T(\mathbf{Q}-\mu^*\mathbf{J-\Lambda^*)x^*}=\mathbf{x^*}^T0=0\\
> \implies&\mathbf{x^*}^T\mathbf{Qx^*}=\mathbf{x^*}^T(\mu^*\mathbf{J)x^*}+\mathbf{x^*}^T\mathbf{\Lambda x^*}\\\implies&
> \text{tr}(\mathbf{Qx^*x^*}^T)=\mu^*\underbrace{\text{tr}(\mathbf{Jx^*x^*}^T)}_{=1}+\underbrace{\text{tr}(\mathbf{\Lambda x^*x^*}^T)}_{=0}\\\implies&
> \text{tr}(\mathbf{Qx^*x^*}^T)=\mu^*\\\implies&
> {f_\text{P}^*}={f_\text{SDP}^*}
> \end{aligned}
> $$
>
> by weak duality, $f_\text{D}^*= f_\text{D}^*=f_\text{P}^*$
>
> 2. let $\mathbf{x}\in\mathbb{R}^{4(K+1)}$ satisfy primal feasibility, by dual constraint $\mathbf{Q}-\mu^*\mathbf{J-\Lambda^*}\succeq0$ :
>
> $$
> \begin{aligned}&\mathbf{Q}-\mu^*\mathbf{J-\Lambda^*}\succeq0\\\implies&
> \mathbf{x}^T(\mathbf{Q}-\mu^*\mathbf{J-\Lambda^*)x}\geq0\\
> \implies&\mathbf{x}^T\mathbf{Qx}\geq\mathbf{x}^T(\mu^*\mathbf{J)x}+\mathbf{x}^T\mathbf{\Lambda x}\\\implies&
> \text{tr}(\mathbf{Qxx}^T)\geq\mu^*=\text{tr}(\mathbf{Qx^*x^*}^T)
> \end{aligned}
> $$
>
> showing that $\mathbf{x}^*$ is the global minimizer
>
> 3. let $\mathbf{M}^*=\mathbf{Q}-\mu^*\mathbf{J-\Lambda^*}$
> - $\text{rank}(\mathbf{M}^*)=4(K+1)-1\implies\text{nullity}(\mathbf{M}^*)=1$
> - (stationary) $\mathbf{M^*x^*}=0\implies\mathbf{x^*}$ is the eigenvector corresponding to the 0 eigenvalue, i.e., $\mathbf{x^*}$ is in the null space of $\mathbf{M}^*$
>
> let $\mathbf{x}\in\mathbb{R}^{4(K+1)}$ satisfy primal feasibility, $\|\mathbf{x}\|_2=\sqrt{K+1}$
>
> let $\mathbf{x}\in\mathbb{R}^{4(K+1)}$ be a global minimizer, $\mathbf{x}=a\mathbf{x}^*\in N(\mathbf{M}^*)=\text{span}(\mathbf{x}^*)$
>
> together, $a=\pm1$, i.e., $\pm\mathbf{x}^*$ are the only two global minimizer
>
> 4. by $(12)$ and strong duality, $\text{tr}(\mathbf{M^*Z^*})=0$
>
> since $\text{rank}(\mathbf{M^*})=4(K+1)-1$, let $\mathbf{M^*}=\bar{\mathbf{M}}^{T}\bar{\mathbf{M}},\ \bar{\mathbf{M}}\in\mathbb{R}^{(4(K+1)-1)\times(4(K+1))},\ \text{rank}(\bar{\mathbf{M}})=4(K+1)-1$
>
> 假設 $\text{rank}(\mathbf{Z}^{*})=r$, let $\mathbf{Z}^*=\bar{\mathbf{Z}}\bar{\mathbf{Z}}^{T}, \bar{\mathbf{Z}}\in\mathbb{R}^{(4(K+1))\times r},\ \text{rank}(\bar{\mathbf{Z}})=r$
>
> $$
> \begin{aligned}\text{tr}(\mathbf{M^*Z^*})&=\text{tr}(\bar{\mathbf{M}}^T\bar{\mathbf{M}}\bar{\mathbf{Z}}\bar{\mathbf{Z}}^T)\\&=\text{tr}(\bar{\mathbf{Z}}^T\bar{\mathbf{M}}^T\bar{\mathbf{M}}\bar{\mathbf{Z}})\\&=\text{tr}((\bar{\mathbf{M}}\bar{\mathbf{Z}})^T(\bar{\mathbf{M}}\bar{\mathbf{Z}}))=\|\bar{\mathbf{M}}\bar{\mathbf{Z}}\|_F=0\implies\bar{\mathbf{M}}\bar{\mathbf{Z}}=0\end{aligned}
> $$
>
> by rank inequality ($\text{rank}(A)+\text{rank}(B)\leq\text{rank}(AB)+n$):
>
> $$
> \begin{aligned}0=\text{rank}(\bar{\mathbf{M}}\bar{\mathbf{Z}})&\geq\text{rank}(\bar{\mathbf{M}})+\text{rank}(\bar{\mathbf{Z}})-n\\&=4(K+1)-1+r-4(K+1)=r+1\\\implies &r\leq1,\ \bar{\mathbf{Z}}\neq0\implies r=1\end{aligned}
> $$
>
> therefore, $\text{rank}(\mathbf{Z}^{*})=1$
>
### Strong duality in noiseless and outlier-free case
> 截至目前我們有以下性質:
>
> - If $(\mathbf{x}^*,\mu^*,\mathbf{\Lambda}^*)$ satisfy KKT conditions and dual constraint $\mathbf{M^*}\succeq0$ , then strong duality holds
> - If $\text{rank}(\mathbf{M^*})=4(K+1)-1$, then $\mathbf{Z^*=x^*x^*}^T$ is the optimal of (SDP)
>
> 但我們沒有 $(\mathbf{x}^*,\mu^*,\mathbf{\Lambda}^*)$,要用其他方法判斷 KKT conditions and $\mathbf{M^*}\succeq0$
>
since we have no outliers and noise:
- $\mathbf{b}_i-\mathbf{Ra}_i\implies\|\mathbf{b}_i\|^2=\|\mathbf{a}_i\|^2$
- $\sigma_i^2=1$
- $\mathbf{q}_i=\mathbf{q}_0=\mathbf{q},\ i=1,\dots,K$, i.e., $\mathbf{x}=\begin{bmatrix}\mathbf{q}\\\mathbf{q}\\\vdots\\\mathbf{q}\end{bmatrix}$
write the stationary condition:
$$
\mathbf{M^*x^*}=\begin{bmatrix}
-\mu^*\mathbf{I}_4-\sum_{i=1}^K\mathbf{\Lambda}_{ii}^*&\mathbf{Q}_{01}&\cdots&\mathbf{Q}_{0K}\\
\mathbf{Q}_{01}&\mathbf{Q}_{11}+\mathbf{\Lambda}_{11}^*&\cdots&0\\
\vdots&\vdots&\ddots&\vdots\\
\mathbf{Q}_{0K}&0&\cdots&\mathbf{Q}_{KK}+\mathbf{\Lambda}_{KK}^*
\end{bmatrix}\begin{bmatrix}\mathbf{q}^*\\\mathbf{q}^*\\\vdots\\\mathbf{q}^*\end{bmatrix}=0
$$
let $\mathbf{D}=\begin{bmatrix}\Omega_1(\mathbf{q}^*)&0&\cdots&0\\0&\Omega_1(\mathbf{q}^*)&\cdots&0\\\vdots&\vdots&\ddots&\vdots\\0&0&\cdots&\Omega_1(\mathbf{q}^*)\end{bmatrix}$ with $\mathbf{D}^T\mathbf{D}=\mathbf{DD}^T=\mathbf{I}_{4(K+1)},\ \mathbf{e}=\begin{bmatrix}0\\0\\0\\1\end{bmatrix}\in\mathbb{R}^4,\ \mathbf{r}=\begin{bmatrix}\mathbf{e}\\\mathbf{e}\\\vdots\\\mathbf{e}\end{bmatrix}\in\mathbb{R}^{4(K+1)}$
(Similarity Transformation) Define $\mathbf{N^*=D}^T\mathbf{M^*D}$, then:
1. $\mathbf{N^*}$ and $\mathbf{M^*}$ have the same eigenvalues, $\mathbf{N^*}\succeq0\iff\mathbf{M^*}\succeq0$, $\text{rank}(\mathbf{N^*})=\text{rank}(\mathbf{M^*})$
2. $\mathbf{M^*x^*}=0\iff\mathbf{N^*r}=0$
> pf)
>
> 1. $\mathbf{N^*=D}^T\mathbf{M^*D}=\mathbf{D}^{-1}\mathbf{M^*D}$, proved by matrix similarity
> 2. by stationary condition:
>
> $$
> \begin{aligned}\mathbf{M^*x^*}=0&\iff\mathbf{D}^T\mathbf{M^*DD}^T\mathbf{x}^*=\mathbf{D}^T0\\&\iff\underbrace{\mathbf{D}^T\mathbf{M^*D}}_{\mathbf{N^*}}\underbrace{\mathbf{D}^T\mathbf{x^*}}_\mathbf{r}=0,& \mathbf{D}^T\mathbf{x}^*=\begin{bmatrix}\Omega_1(\mathbf{q}^*)\mathbf{q}^*\\\vdots\\\Omega_1(\mathbf{q}^*)\mathbf{q}^*\end{bmatrix}=\begin{bmatrix}\mathbf{e}\\\vdots\\\mathbf{e}\end{bmatrix}=\mathbf{r}\\&\iff \mathbf{N^*r}=0\end{aligned}
> $$
>
> 截至目前我們有以下性質:
>
> - If $(\mathbf{x}^*,\mu^*,\mathbf{\Lambda}^*)$ satisfy KKT conditions and dual constraint, then strong duality holds
> - (primal feasibility) $\text{tr}(\mathbf{Jx^*x^*}^T)-1=0\iff\text{tr}(\mathbf{JZ^*})-1=0$
> - (primal feasibility) $\mathbf{x^*}_{q_k}\mathbf{x}^{*T}_{q_k}-\mathbf{x^*}_{q_0}\mathbf{x}_{q_0}^{*T}=0\iff[\mathbf{Z^*}]_{kk}-[\mathbf{Z^*}]_{00}=0$
> - (stationary) $\mathbf{M^*x^*}=0\iff\mathbf{N^*r}=0$
> - (dual constraint) $\mathbf{M^*}\succeq0\iff \mathbf{N^*}\succeq0$
> - If $\text{rank}(\mathbf{M^*})=\text{rank}(\mathbf{N^*})=4(K+1)-1$, then $\mathbf{Z^*=x^*x^*}^T$ is the optimal of (SDP)
>
> 目前有 $\mathbf{N^*}$與 KKT conditions and dual constraint 的關係,接下來要去找什麼情況會有這個 $\mathbf{N}^*$
>
write down important matrices:
$$
\mathbf{N^*}=\begin{bmatrix}
-\mu^*\mathbf{I}_4-\sum_{i=1}^K\bar{\mathbf{\Lambda}}_{ii}^*&\bar{\mathbf{Q}}_{01}&\cdots&\bar{\mathbf{Q}}_{0K}\\
\bar{\mathbf{Q}}_{01}&\bar{\mathbf{Q}}_{11}+\bar{\mathbf{\Lambda}}_{11}^*&\cdots&0\\
\vdots&\vdots&\ddots&\vdots\\
\bar{\mathbf{Q}}_{0K}&0&\cdots&\bar{\mathbf{Q}}_{KK}+\bar{\mathbf{\Lambda}}_{KK}^*
\end{bmatrix}=\begin{bmatrix}
-\sum_{i=1}^K\bar{\mathbf{\Lambda}}_{ii}^*&\bar{\mathbf{Q}}_{01}&\cdots&\bar{\mathbf{Q}}_{0K}\\
\bar{\mathbf{Q}}_{01}&\bar{\mathbf{Q}}_{11}+\bar{\mathbf{\Lambda}}_{11}^*&\cdots&0\\
\vdots&\vdots&\ddots&\vdots\\
\bar{\mathbf{Q}}_{0K}&0&\cdots&\bar{\mathbf{Q}}_{KK}+\bar{\mathbf{\Lambda}}_{KK}^*
\end{bmatrix}
$$
where $\mu^*=0$
> pf)
>
>
> $$
> \begin{aligned}\mu^*&=\text{tr}(\mathbf{Qx^*x^*}^T)=\mathbf{x^*}^T\mathbf{Qx^*}\\&=\mathbf{q^*}^T\Big(\sum_{i=1}^K\mathbf{Q}_{ii}+2\mathbf{Q}_{0i}\Big)\mathbf{q}^*\\&=\mathbf{q^*}^T\Big(\sum_{i=1}^K(\|\mathbf{a}_i\|^2+\|\mathbf{b}_i\|^2)\mathbf{I}_4+2\Omega_1(\mathbf{b}_i)\Omega_2(\mathbf{a}_i)\Big)\mathbf{q}^*\\&=\mathbf{q^*}^T\Big(2\sum_{i=1}^K\|\mathbf{a}_i\|^2\mathbf{I}_4+\Omega_1(\mathbf{b}_i)\Omega_2(\mathbf{a}_i)\Big)\mathbf{q}^*\\&=2\sum_{i=1}^K\|\mathbf{a}_i\|^2-\mathbf{b}_i^T(\mathbf{q}^*\circ\mathbf{a}_i\circ\mathbf{q}^{*-1})\\&=2\sum_{i=1}^K\|\mathbf{a}_i\|^2-\|\mathbf{b}_i\|^2=0\end{aligned}
> $$
>
$$
\bar{\mathbf{Q}}_{ii}=\begin{bmatrix}(\|\mathbf{a}_i\|^2+\frac{\bar{c}^2}{2})\mathbf{I}_3-[\mathbf{a}_i]^2_\times-\mathbf{a}_i\mathbf{a}_i^T&0\\0&\frac{\bar{c}^2}{2}\end{bmatrix},\ \bar{\mathbf{Q}}_{0i}=\begin{bmatrix}(\frac{\|\mathbf{a}_i\|^2}{2}+\frac{\bar{c}^2}{4})\mathbf{I}_3-\frac{[\mathbf{a}_i]^2_\times}{2}-\frac{\mathbf{a}_i\mathbf{a}_i^T}{2}&0\\0&-\frac{\bar{c}^2}{4}\end{bmatrix},\ \bar{\Lambda}_{ii}^*=\begin{bmatrix}\mathbf{E}_{ii}&\alpha_i\\\alpha_i^T&\lambda_i\end{bmatrix}\in\mathbb{S}^4
$$
(Sparsity Pattern of Dual Variables) $\mathbf{N^*r}=0$ holds$\iff\bar{\Lambda}_{ii}^*=\begin{bmatrix}\mathbf{E}_{ii}&0\\0&-\frac{\bar{c}^2}{4}\end{bmatrix}\in\mathbb{S}^4$ sparsity pattern
> pf) consider each row of $\mathbf{N^*r}=0$
>
>
> $(\implies)$
>
> $$
> \begin{aligned}\mathbf{N^*r}=0&\implies\mathbf{N}_i^*\mathbf{r}=0,\ i=1,\dots,K\\&\implies\bar{\mathbf{Q}}_{0i}\mathbf{e}+(\bar{\mathbf{Q}}_{ii}+\bar{\mathbf{\Lambda}}_{ii}^*)\mathbf{e}=0\\&\implies\begin{bmatrix}\Delta&\alpha\\\alpha^T&\frac{\bar{c}^2}{4}+\lambda\end{bmatrix}\begin{bmatrix}0\\0\\0\\1\end{bmatrix}=0\implies\alpha=0,\ \lambda=-\frac{\bar{c}^2}{4}\end{aligned}
> $$
>
> $(\impliedby)$
>
> $$
> \begin{aligned}\mathbf{N}_i^*\mathbf{r}&=\bar{\mathbf{Q}}_{0i}\mathbf{e}+(\bar{\mathbf{Q}}_{ii}+\bar{\mathbf{\Lambda}}_{ii}^*)\mathbf{e}\\&=\begin{bmatrix}\Delta&0\\0&-\frac{\bar{c}^2}{4}\end{bmatrix}\begin{bmatrix}0\\0\\0\\1\end{bmatrix}=\begin{bmatrix}0\\0\\0\\0\end{bmatrix}=0,\ i=1,\dots,K\implies\mathbf{N}^*\mathbf{r}=0\end{aligned}
> $$
>
(Construction of Dual Variable) choose $\mathbf{E}_{ii}=[\mathbf{a}_i]^2_\times-\frac{\bar{c}^2}{4}\mathbf{I}_3,\ i=1,\dots,K$ and $\bar{\Lambda}_{ii}^*=\begin{bmatrix}\mathbf{E}_{ii}&0\\0&-\frac{\bar{c}^2}{4}\end{bmatrix}$, then:
1. $\mathbf{N^*r}=0$
2. $\mathbf{N^*}\succeq0$
3. $\text{rank}(\mathbf{N^*})=4(K+1)-1$
let $\mathbf{u}=\begin{bmatrix}\mathbf{u}_0\\\mathbf{u}_1\\\vdots\\\mathbf{u}_k\end{bmatrix},\ \mathbf{u}_i\in\mathbb{R}^4$,
$$
\begin{aligned}\mathbf{u}^T\mathbf{N^*u}&=-\sum_{i=1}^K\mathbf{u}_0^T\bar{\mathbf{\Lambda}}_{ii}^*\mathbf{u}_0+2\sum_{i=1}^K\mathbf{u}_0^T\bar{\mathbf{Q}}_{0i}\mathbf{u}_i+\sum_{i=1}^K\mathbf{u}^T_i(\bar{\mathbf{Q}}_{ii}+\bar{\mathbf{\Lambda}}_{ii}^*)\mathbf{u}_i\\&=
\sum_{i=1}^K\underbrace{\Bigg(
\mathbf{u}_0^T(-\bar{\mathbf{\Lambda}}_{ii}^*)\mathbf{u}_0+\mathbf{u}_0^T\bar{\mathbf{Q}}_{0i}\mathbf{u}_i+\mathbf{u}^T_i(\bar{\mathbf{Q}}_{ii}+\bar{\mathbf{\Lambda}}_{ii}^*)\mathbf{u}_i
\Bigg)}_{m_i}
\end{aligned}
$$
let $\mathbf{u}_i=\begin{bmatrix}\bar{\mathbf{u}}_i\\\dot{u}_i\end{bmatrix},\ \bar{\mathbf{u}}_i\in\mathbb{R}^3$,
$$
\begin{aligned}
m_i&=\begin{bmatrix}\bar{\mathbf{u}}_i\\\dot{u}_i\end{bmatrix}^T\begin{bmatrix}\mathbf{-E}_{ii}&0\\0&\frac{\bar{c}^2}{4}\end{bmatrix}\begin{bmatrix}\bar{\mathbf{u}}_i\\\dot{u}_i\end{bmatrix}
+
\begin{bmatrix}\bar{\mathbf{u}}_i\\\dot{u}_i\end{bmatrix}^T
\begin{bmatrix}(\|\mathbf{a}_i\|^2+\frac{\bar{c}^2}{2})\mathbf{I}_3-[\mathbf{a}_i]^2_\times-\mathbf{a}_i\mathbf{a}_i^T&0\\0&-\frac{\bar{c}^2}{4}\end{bmatrix}
\begin{bmatrix}\bar{\mathbf{u}}_i\\\dot{u}_i\end{bmatrix}
+
\begin{bmatrix}\bar{\mathbf{u}}_i\\\dot{u}_i\end{bmatrix}^T
\begin{bmatrix}(\|\mathbf{a}_i\|^2+\frac{\bar{c}^2}{2})\mathbf{I}_3-[\mathbf{a}_i]^2_\times-\mathbf{a}_i\mathbf{a}_i^T+\mathbf{E}_{ii}&0\\0&\frac{\bar{c}^2}{2}\end{bmatrix}
\begin{bmatrix}\bar{\mathbf{u}}_i\\\dot{u}_i\end{bmatrix}\\
&=\frac{\bar{c}^2}{4}\underbrace{(\dot{u}_0^2-2\dot{u}_0\dot{u}_i+\dot{u}_i^2)}_{(\dot{u}_0-\dot{u}_i)^2\geq0}+
\bar{\mathbf{u}}_0^T(-\mathbf{E}_{ii})\bar{\mathbf{u}}_0-\bar{\mathbf{u}}^T_0(
(\|\mathbf{a}_i\|^2+\frac{\bar{c}^2}{2})\mathbf{I}_3-[\mathbf{a}_i]^2_\times-\mathbf{a}_i\mathbf{a}_i^T
)
\bar{\mathbf{u}}_i+\bar{\mathbf{u}}_i^T(
(\|\mathbf{a}_i\|^2+\frac{\bar{c}^2}{2})\mathbf{I}_3-[\mathbf{a}_i]^2_\times-\mathbf{a}_i\mathbf{a}_i^T+\mathbf{E}_{ii}
)\bar{\mathbf{u}}_i\\&=
\cdots\ (\text{substitute }\mathbf{E}_{ii})
\\&=\frac{\bar{c}^2}{4}\underbrace{(\dot{u}_0^2-2\dot{u}_0\dot{u}_i+\dot{u}_i^2)}_{(\dot{u}_0-\dot{u}_i)^2\geq0}+
\|\mathbf{a}_i\times(\bar{\mathbf{u}}_0+\bar{\mathbf{u}}_i)\|^2+\frac{\bar{c}^2}{4}\|\bar{\mathbf{u}}_0-\bar{\mathbf{u}}_i\|^2\geq0
\end{aligned}
$$
since $m_i\geq0\implies\mathbf{u}^T\mathbf{N^*u}=\sum_{i=1}^Km_i\geq0\implies\mathbf{N}^*,\ \forall\ \mathbf{u}\implies\mathbf{N}^*\succeq0$
note that $\mathbf{u}^T\mathbf{N^*u}=0$ holds only when:
1. $\dot{u}_i=\dot{u}_0,\ i=1,\dots,K$
2. $\bar{\mathbf{u}}_i=\bar{\mathbf{u}}_0,\ i=1,\dots,K$
3. $\mathbf{a}_i\times(\bar{\mathbf{u}}_0+\bar{\mathbf{u}}_i)=0,\ i=1,\dots,K$
$$
\begin{cases}\dot{u}_i=\dot{u}_0\\\bar{\mathbf{u}}_i=\bar{\mathbf{u}}_0& i=1,\dots,K\\\mathbf{a}_i\times(\bar{\mathbf{u}}_0+\bar{\mathbf{u}}_i)=0\end{cases}\implies\bar{\mathbf{u}}_i=\bar{\mathbf{u}}_0=0
$$
the set of non-zero vectors that satisfy above conditions:
$$
\{\mathbf{u}\in\mathbb{R}^{4(K+1)}\mid\mathbf{u}=s\mathbf{r},\ s\in\mathbb{R},s\neq0,\mathbf{r}=\begin{bmatrix}\mathbf{e}\\\vdots\\\mathbf{e}\end{bmatrix}\}=\text{span}(\mathbf{r})
$$
that is, $\text{nullity}(\mathbf{N}^*)=1\implies\text{rank}(\mathbf{N}^*)=4(K+1)-1$
> 整個邏輯變從原本看 $(\mathbf{x}^*,\mu^*,\mathbf{\Lambda}^*)$ 變成 set $\mu^*=0$ and construct $\mathbf{\Lambda}^*$ by $\bar{\mathbf{\Lambda}}^*$ (by $\mathbf{E}$) 可推得:
>
> - (stationary) $\mathbf{N^*r}=0\iff\mathbf{M^*x^*}=0$
> - (dual condition) $\mathbf{N^*}\succeq0\iff \mathbf{M^*}\succeq0$
> - $\text{rank}(\mathbf{N}^*)=4(K+1)-1\iff \text{rank}(\mathbf{M}^*)=4(K+1)-1$
>
> 再驗證 (primal feasibility 本來就等價於 SDP constraint)
>
> - (primal feasibility) $\text{tr}(\mathbf{JZ^*})-1=0\iff\text{tr}(\mathbf{Jx^*x^*}^T)-1=0$
> - (primal feasibility) $[\mathbf{Z^*}]_{kk}-[\mathbf{Z^*}]_{00}=0\iff\mathbf{x^*}_{q_k}\mathbf{x}^{*T}_{q_k}-\mathbf{x^*}_{q_0}\mathbf{x}_{q_0}^{*T}=0$
>
> 因此 KKT condition and dual constraint holds
by Sufficient condition for Strong duality, relaxation is tight proved
### Global optimal guarantee
the Lagrangian dual problem of QCQP with redundant constraints $(10)$ is convex semidefinite program:
$$
\begin{aligned}\max_{\mu,\Lambda,\mathbf{W}}\ &\mu\\\text{s.t.}\ \ &(\mathbf{Q-\mu J+\Lambda+W})\succeq0\end{aligned}\tag{13}
$$
where
1. $\mathbf{J}=\begin{bmatrix}\mathbf{I}_4&\mathbf{0}_{4,4K}\\\mathbf{0}_{4K,4}&\mathbf{0}_{4K,4K}\end{bmatrix}\in\mathcal{S}^{4(K+1)}$
2. $\Lambda=\text{blockdiag}(\Lambda_{00},\dots,\Lambda_{KK})\in\mathcal{S}^{4(K+1)},\ \Lambda_{kk}\in\mathcal{S}^4,\ \sum_{k=0}^K\Lambda_{kk}=0$
3. $\mathbf{W}\in\mathcal{S}^{4(K+1)},[\mathbf{W}]_{ii}=0,[\mathbf{W}]_{ij}\in\frak{S}^4$ (skew symmetric matrix)
> - verify $\mathbf{x}^T\mathbf{Jx}=1,\ \mathbf{x}^T\mathbf{\Lambda x}=0,\ \mathbf{x}^T\mathbf{Wx}=0$
the dual SDP of $(12)$ is the same as SDP relaxation with redundant constraints:
$$
\begin{aligned}\min_{\mathbf{Z}\succeq0}\ \ &\text{tr}(\mathbf{QZ})\\\text{s.t. }\ &\text{tr}([\mathbf{Z}]_{00})=1\\&[\mathbf{Z}]_{kk}=[\mathbf{Z}]_{00},\ \forall k=1,\dots,K\\&[\mathbf{Z}]_{ij}=[\mathbf{Z}]_{ij}^T,\ \forall0\leq i<j\leq K\end{aligned}\tag{13}
$$
**Global Optimality Guarantee**
let $\mathbf{Z}^*$ be the optimal of $(13)$, if $\text{rank}(\mathbf{Z}^*)=1$, then the SDP relaxation is said to be tight and:
1. the optimal cost of $(13)$ matches the optimal cost of the QCQP $(10)$
2. $\mathbf{Z}^*$ can be written as $\mathbf{Z}^*= (\mathbf{x}^*)(\mathbf{x}^*)^T$, where $\mathbf{x}^*=[(\mathbf{q}^*_0)^T,(\mathbf{q}^*_1)^T,...,(\mathbf{q}^*_K)^T]^T,\ \mathbf{q}^*_k=\theta_k^*\mathbf{q}^*\theta_k^*\in\{\pm1\}$
3. $\pm\mathbf{x}^*$ are the two unique global minimizers of the original QCQP $(10)$
- $\mathbf{Z=xx}^T$ is rank 1 matrix
> 原問題是找$\mathbf{x}$,relax後的問題是找$\mathbf{Z}$,但不保證$\mathbf{Z}$一定是rank 1 matrix
if $\mathbf{Z}^*$ is a rank 1 solution (tight) by $(13)$, then optimal is guaranteed (in practice, SDP always returns rank 1 solution)
### Fast and Certification
solving SDP is slow, instead solve $(4)$ by GNC(<A href="https://hackmd.io/@doggydoggy0101/GNC">論文筆記</A>)
- QCQP,SDP are not used in TEASER++
- can we certify GNC?
**Optimality Certification**
given a feasible (not necessarily optimal) solution $(\hat{\mathbf{q}},\hat{\theta}_1,\dots,\hat{\theta}_K)$ of $(6)$ with $\hat{\mathbf{x}}=[\hat{\mathbf{q}}^T,\hat{\theta}_1\hat{\mathbf{q}}^T,\cdots,\hat{\theta}_k\hat{\mathbf{q}}^T]^T$ and $\hat{\mu}=\hat{\mathbf{x}}^T\mathbf{Q}\hat{\mathbf{x}}$ is the corresponding solution cost of $(10)$, then the following algorithm produces a suboptimality bound $\eta$ that satisfies:
$$
\frac{\hat{\mu}-\mu^*}{\hat{\mu}}\leq\eta\tag{14}
$$
where $\mu^*$ is the (unknown) global minimum of the nonconvex TLS rotation estimation $(6)$ or QCQP $(10)$

when the relaxation is tight and $0<\gamma<2$, then $(13)$ converges to 0
## Open Source
https://github.com/MIT-SPARK/TEASER-plusplus
Tutorial on TEASER++ (part 1)
https://youtu.be/uwNdLdRozeA
Tutorial on TEASER++ (part 2)
https://youtu.be/lRENNVC-FjM