# Graduated Non-Convexity for Robust Spatial Perception: From Non-Minimal Solvers to Global Outlier Rejection

H. Yang, P. Antonante, V. Tzoumas and L. Carlone, *Graduated Non-Convexity for Robust Spatial Perception: From Non-Minimal Solvers to Global Outlier Rejection*, IEEE Robotics and Automation Letters, vol. 5, no. 2, pp. 1127-1134, 2020.

[TOC]

## Abstract

> Problem:
> - non-minimal solvers rely on least squares and are brittle against outliers
> - applying a robust cost function helps, but robust costs introduce non-convexity
>
> This paper proposes GNC, which uses a robust cost and a non-minimal solver simultaneously
> - does not require an initial guess
> - global optimality is not guaranteed

SDP and Sums-of-Squares relaxations have led to non-minimal solvers; however, most non-minimal solvers rely on **least squares and are brittle against outliers**. While a standard approach is to use robust cost functions, the latter typically **introduce other non-convexities**, preventing the use of existing non-minimal solvers. We enable the **simultaneous use of non-minimal solvers and robust estimation** by providing a general-purpose approach for robust global estimation. We leverage the Black-Rangarajan duality between robust estimation and outlier processes, and show that GNC can be used in conjunction with non-minimal solvers to compute robust solutions, without requiring an initial guess. Although GNC's global optimality cannot be guaranteed, we demonstrate the empirical robustness of the resulting robust non-minimal solvers in applications, including PGO.

### Robust cost

Standard least squares ($\mathbf{x}$: state, $\mathbf{y}$: measurement, $r$: residual) is sensitive to outliers:
$$
\min_{\mathbf{x}\in \mathcal{X}}\sum_{i=1}^Nr^2(\mathbf{x,y}_i)
$$
M-estimation replaces the squared residual with a robust cost $\rho$ to tolerate **outliers**:
$$
\min_{\mathbf{x}\in \mathcal{X}}\sum_{i=1}^N\rho(r(\mathbf{x,y}_i))
$$
Ex.
**convex** Huber loss; **non-convex** truncated least squares, Geman-McClure

- non-minimal solvers can accommodate a convex robust cost
- robust non-minimal solvers are rare

<!--

## Related work

### Outlier-free Estimation

**Minimal solvers**
- use the smallest number of measurements needed to estimate $\mathbf{x}$
- do not leverage data redundancy, so they can be sensitive to noise

Ex. 3-point solver for point cloud registration, 12-point solver for mesh registration

**Non-minimal solvers**
- assume Gaussian noise, which results in least squares optimization
- generally have no **closed form** and admit only locally optimal solutions

**Global optimization**
- Branch and Bound
- Sum-of-Squares relaxation
- SDP relaxations
- Lagrange duality

### Robust Estimation

**Global methods** do not require an initial guess
- Consensus maximization — Ex. RANSAC, ADAPT (linear)
- M-estimation with a robust cost — Ex. truncated least squares with SDP relaxations (suitable for small problems)
- Graduated non-convexity with a (non-convex) robust cost — lacks non-minimal solvers

**Local methods** require an initial guess
- Iterative closest point

-->

## Method

### Black-Rangarajan Duality

Given a robust cost function $\rho$, define $\phi(z)=\rho(\sqrt{z})$. If $\phi(z)$ satisfies $\lim_{z\rightarrow0}\phi^\prime(z)=1$, $\lim_{z\rightarrow\infty}\phi^\prime(z)=0$ and $\phi^{\prime\prime}(z)<0$, then
$$
\min_{\mathbf{x}\in \mathcal{X}}\sum_{i=1}^N\rho(r(\mathbf{x,y}_i))=\min_{\mathbf{x}\in \mathcal{X},w_i\in{[0,1]}}\sum_{i=1}^N[w_ir^2(\mathbf{x,y}_i)+\Phi_\rho(w_i)]
$$
where $w_i\in[0,1]$ are weights and $\Phi_\rho(w_i)$ defines a penalty on $w_i$.

> SLAM methods have used such penalty terms without acknowledging the connection with robust estimation.

Black-Rangarajan provide a procedure to compute $\Phi_\rho(w_i)$:

(step 1) Define $\phi(z)=\rho(\sqrt{z/\tau})$, where $z$ is possibly scaled by $\tau$ in $\rho$.

(step 2) Compute $\phi^\prime(z)$ and $\phi^{\prime\prime}(z)$.
(check) If $\phi$ satisfies $\lim_{z\rightarrow0}\phi^\prime(z)=1$, $\lim_{z\rightarrow\infty}\phi^\prime(z)=0$ and $\phi^{\prime\prime}(z)<0$, proceed. (Otherwise, $\rho$ does not have a simple outlier process formulation.)

(step 3) Solve $w=\phi^\prime(z)$ to get the inverse function $z=(\phi^\prime)^{-1}(w)$.

(step 4) Define $\Phi(w)=\phi(z)-wz=\phi((\phi^\prime)^{-1}(w))-w(\phi^\prime)^{-1}(w)$.

(step 5) The new objective function is $\tau wr^2+\Phi(w)$.

> In an M-estimator, the weight is $w=\phi^\prime(x)/x$.

### Graduated non-convexity

Use a surrogate cost $\rho_\mu$, where $\mu$ is a control parameter:
- for a certain $\mu$, the surrogate function $\rho_\mu$ is convex
- for a certain $\mu$, the surrogate function $\rho_\mu$ recovers the original non-convex $\rho$

Start from the convex surrogate and gradually return to the non-convex function.

Ex. Geman-McClure
$$
\rho(r)=\frac{\bar{c}^2r^2}{\bar{c}^2+r^2}\implies\rho_\mu(r)=\frac{\mu\bar{c}^2r^2}{\mu\bar{c}^2+r^2}
$$
where $\mu\rightarrow\infty$ is convex and $\mu=1$ is the original function

Ex.
Truncated Least Squares
$$
\rho(r)=\begin{cases}r^2&\text{if }r^2\in[0,\bar{c}^2]\\\bar{c}^2&\text{if }r^2\in[\bar{c}^2,\infty)\end{cases}\implies\rho_\mu(r)=\begin{cases}r^2&\text{if }r^2\in[0,\frac{\mu}{\mu+1}\bar{c}^2]\\2\bar{c}|r|\sqrt{\mu(\mu+1)}-\mu(\bar{c}^2+r^2)&\text{if }r^2\in[\frac{\mu}{\mu+1}\bar{c}^2,\frac{\mu+1}{\mu}\bar{c}^2]\\\bar{c}^2&\text{if }r^2\in[\frac{\mu+1}{\mu}\bar{c}^2,\infty)\end{cases}
$$
where $\mu\rightarrow0$ is convex and $\mu\rightarrow\infty$ is the original function

![gnc](https://hackmd.io/_uploads/ByjMJmJrT.gif)

### Outer iteration

Fix $\mu$ and optimize:
$$
\min_{\mathbf{x}\in \mathcal{X}}\sum_{i=1}^N\rho_\mu(r(\mathbf{x,y}_i))=\min_{\mathbf{x}\in \mathcal{X},w_i\in{[0,1]}}\sum_{i=1}^N[w_ir^2(\mathbf{x,y}_i)+\Phi_{\rho_\mu}(w_i)]
$$
The penalty $\Phi_{\rho_\mu}(w_i)$ is derived later in the GM and TLS examples.

### Inner iteration

First optimize over $\mathbf{x}$ (fixed $w_i$), then optimize over all $w_i$ (fixed $\mathbf{x}$).

**variable update** optimize over $\mathbf{x}$ (fixed $w_i$)
$$
\mathbf{x}^{(t)}=\arg\min_{\mathbf{x}\in \mathcal{X}}\sum_{i=1}^Nw_i^{(t-1)}r^2(\mathbf{x,y}_i)
$$
The penalty term is omitted since it does not depend on $\mathbf{x}$; this is solved with non-minimal solvers.

**weight update** optimize over $\mathbf{w}$ (fixed $\mathbf{x}$)
$$
\mathbf{w}^{(t)}=\arg\min_{w_i\in{[0,1]}}\sum_{i=1}^N[w_ir^2(\mathbf{x}^{(t)},\mathbf{y}_i)+\Phi_{\rho_\mu}(w_i)]
$$
Since $r^2(\mathbf{x}^{(t)},\mathbf{y}_i)$ is constant, this is solved in closed form.

The process is repeated while updating $\mu$ to gradually increase the amount of non-convexity.

![image](https://hackmd.io/_uploads/rkfOJm1Ha.png =500x)

## Geman-McClure example

**Black-Rangarajan duality**
$$
\rho_\mu(r)=\frac{\mu\bar{c}^2r^2}{\mu\bar{c}^2+r^2}
$$
(step 1)
$$
\phi(z)=\rho_\mu(\sqrt{z})=\frac{\mu\bar{c}^2z}{\mu\bar{c}^2+z}
$$
(step 2, check)
$$
\phi^\prime(z)=\Big(\frac{\mu\bar{c}^2}{\mu\bar{c}^2+z}\Big)^2,\ \ \phi^{\prime\prime}(z)=-\frac{2(\mu\bar{c}^2)^2}{(\mu\bar{c}^2+z)^3}
$$
(step 3) Solve
$(\phi^\prime)^{-1}:[0,1]\rightarrow \mathbb{R}$
$$
w=\phi^\prime(z)=\Big(\frac{\mu\bar{c}^2}{\mu\bar{c}^2+z}\Big)^2\implies z=(\phi^\prime)^{-1}(w)=\frac{\mu\bar{c}^2}{\sqrt{w}}-\mu\bar{c}^2
$$
(step 4) Define $\Phi(w)=\phi(z)-wz$
$$
\begin{aligned}\Phi(w)&=\frac{\mu\bar{c}^2z}{\mu\bar{c}^2+z}-wz\\[10pt]&=\frac{\frac{(\mu\bar{c}^2)^2}{\sqrt{w}}-(\mu\bar{c}^2)^2}{\frac{\mu\bar{c}^2}{\sqrt{w}}}-\sqrt{w}\mu\bar{c}^2+w\mu\bar{c}^2\\[20pt]&=\mu\bar{c}^2-2\sqrt{w}\mu\bar{c}^2+w\mu\bar{c}^2=\mu\bar{c}^2(\sqrt{w}-1)^2\end{aligned}
$$
(step 5) The new objective function is $wr^2+\Phi(w)$; minimizing it over $w\in[0,1]$ recovers $\rho_\mu(r)$:
$$
\rho_\mu(r)=\min_{w\in[0,1]}\big[wr^2+\mu\bar{c}^2(\sqrt{w}-1)^2\big]
$$

**variable update** solve with non-minimal solvers such as weighted least squares
$$
\mathbf{x}^{(t)}=\arg\min_{\mathbf{x}\in \mathcal{X}}\sum_{i=1}^Nw_i^{(t-1)}r^2
$$

**weight update** setting the derivative with respect to $w$ to zero gives the closed form of $w^{(t)}$
$$
\nabla_w\big[wr^2+\Phi(w)\big]=r^2+\mu\bar{c}^2(\sqrt{w}-1)w^{-\frac{1}{2}}=r^2+\mu\bar{c}^2-\mu\bar{c}^2w^{-\frac{1}{2}}=0\implies w^{(t)}=\Big(\frac{\mu\bar{c}^2}{r^2+\mu\bar{c}^2}\Big)^2
$$

**others**
- initialize $\mu=2(\max_i r^2(\mathbf{x}_0,\mathbf{y}_i))/\bar{c}^2$
- update $\mu\leftarrow\mu/1.4$
- stop when $\mu<1$
- choose $\bar{c}$ based on Remarks 1-2 in [2]

## Truncated Least Squares example

**Black-Rangarajan duality**
$$
\rho_\mu(r)=\begin{cases}r^2&\text{if }r^2\in[0,\frac{\mu}{\mu+1}\bar{c}^2]\\2\bar{c}|r|\sqrt{\mu(\mu+1)}-\mu(\bar{c}^2+r^2)&\text{if }r^2\in[\frac{\mu}{\mu+1}\bar{c}^2,\frac{\mu+1}{\mu}\bar{c}^2]\\\bar{c}^2&\text{if }r^2\in[\frac{\mu+1}{\mu}\bar{c}^2,\infty)\end{cases}
$$
(step 1)
$$
\phi(z)=\rho_\mu(\sqrt{z})=\begin{cases}z&\text{if }z\in[0,\frac{\mu}{\mu+1}\bar{c}^2]\\2\bar{c}\sqrt{z\mu(\mu+1)}-\mu(\bar{c}^2+z)&\text{if }z\in [\frac{\mu}{\mu+1}\bar{c}^2,\frac{\mu+1}{\mu}\bar{c}^2]\\\bar{c}^2&\text{if }z\in [\frac{\mu+1}{\mu}\bar{c}^2,\infty)\end{cases}
$$
(step 2, check)
$$
\phi^\prime(z)=\begin{cases}1&\text{if }z\in [0,\frac{\mu}{\mu+1}\bar{c}^2]\\\bar{c}\sqrt{\mu(\mu+1)}\cdot
z^{-\frac{1}{2}}-\mu&\text{if }z\in [\frac{\mu}{\mu+1}\bar{c}^2,\frac{\mu+1}{\mu}\bar{c}^2]\\0&\text{if }z\in [\frac{\mu+1}{\mu}\bar{c}^2,\infty)\end{cases},\ \ \phi^{\prime\prime}(z)=-\frac{1}{2}\bar{c}\sqrt{\mu(\mu+1)}\cdot z^{-\frac{3}{2}},\ z\in [\frac{\mu}{\mu+1}\bar{c}^2,\frac{\mu+1}{\mu}\bar{c}^2]
$$
(step 3) Solve $(\phi^\prime)^{-1}:[0,1]\rightarrow [\frac{\mu}{\mu+1}\bar{c}^2,\frac{\mu+1}{\mu}\bar{c}^2]$
$$
w=\phi^\prime(z)=\bar{c}\sqrt{\mu(\mu+1)}\cdot z^{-\frac{1}{2}}-\mu\implies z=(\phi^\prime)^{-1}(w)=\frac{\bar{c}^2\mu(\mu+1)}{(w+\mu)^2}
$$
(step 4) Define $\Phi(w)=\phi(z)-wz$
$$
\begin{aligned}\Phi(w)&=2\bar{c}\sqrt{z\mu(\mu+1)}-\mu(\bar{c}^2+z)-wz\\[6pt]&=2\bar{c}^2\frac{\mu(\mu+1)}{(w+\mu)}-\mu\Big(\bar{c}^2+\frac{\bar{c}^2\mu(\mu+1)}{(w+\mu)^2}\Big)-\frac{w\bar{c}^2\mu(\mu+1)}{(w+\mu)^2}\\&=\frac{\mu(1-w)}{w+\mu}\bar{c}^2\end{aligned}
$$
(step 5) The new objective function is $wr^2+\Phi(w)$:
$$
\rho_\mu(r)=\begin{cases}r^2&\text{if }r^2\in[0,\frac{\mu}{\mu+1}\bar{c}^2]\\wr^2+\frac{\mu(1-w)}{w+\mu}\bar{c}^2&\text{if }r^2\in[\frac{\mu}{\mu+1}\bar{c}^2,\frac{\mu+1}{\mu}\bar{c}^2]\\\bar{c}^2&\text{if }r^2\in[\frac{\mu+1}{\mu}\bar{c}^2,\infty)\end{cases}
$$

**variable update** solve with non-minimal solvers such as weighted least squares
$$
\mathbf{x}^{(t)}=\arg\min_{\mathbf{x}\in \mathcal{X}}\sum_{i=1}^Nw_i^{(t-1)}r^2
$$

**weight update** setting the derivative with respect to $w$ to zero
$$
\nabla_w\big[wr^2+\Phi(w)\big]=r^2+\bar{c}^2\frac{-\mu(w^{(t)}+\mu)-\mu(1-w^{(t)})}{(w^{(t)}+\mu)^2}=r^2-\frac{\mu(1+\mu)}{(w^{(t)}+\mu)^2}\bar{c}^2=0\implies w^{(t)}=\frac{\sqrt{\mu(1+\mu)}}{r}\bar{c}-\mu
$$
Noting that $w^{(t)}=1\iff\rho_\mu(r)=r^2$ and $w^{(t)}=0\iff\rho_\mu(r)=\bar{c}^2$, we get the closed form of $w^{(t)}$
$$
w^{(t)}=\begin{cases}1&\text{if }r^2\in[0,\frac{\mu}{\mu+1}\bar{c}^2]\\\frac{\sqrt{\mu(1+\mu)}}{r}\bar{c}-\mu&\text{if }r^2\in[\frac{\mu}{\mu+1}\bar{c}^2,\frac{\mu+1}{\mu}\bar{c}^2]\\0&\text{if }r^2\in[\frac{\mu+1}{\mu}\bar{c}^2,\infty)\end{cases}
$$

**others**
- initialize
$\mu=\bar{c}^2/(2(\max_i r^2(\mathbf{x}_0,\mathbf{y}_i))-\bar{c}^2)$
- update $\mu\leftarrow1.4\mu$
- stop when the weighted sum of squares $\sum_{i=1}^Nw_ir^2$ converges
- choose $\bar{c}$ based on Remarks 1-2 in [2]

[1] L. Carlone and G. Calafiore, *Convex relaxations for pose graph optimization with outliers*, IEEE RA-L, 2018.
[2] H. Yang and L. Carlone, *A quaternion-based certifiably optimal solution to the Wahba problem with outliers*, ICCV, 2019.

## Open Source

https://github.com/MIT-SPARK/GNC-and-ADAPT
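The GNC-TLS recipe above (variable update by weighted least squares, closed-form weight update, $\mu\leftarrow1.4\mu$ schedule, stop when $\sum_i w_ir_i^2$ converges) can be sketched on a toy problem. This is a minimal sketch assuming a scalar linear model $y\approx a\,x$, not the MIT-SPARK implementation; the function name and parameters are hypothetical:

```python
import numpy as np

def gnc_tls_line_fit(x, y, c_bar=0.5, max_outer=100):
    """Sketch of GNC with a truncated-least-squares cost for fitting y ~ a * x."""
    w = np.ones_like(y)                                    # start with all weights = 1
    a = np.sum(w * x * y) / np.sum(w * x**2)               # initial weighted least squares
    r2 = (y - a * x) ** 2
    # initialize mu = c^2 / (2 * max residual^2 - c^2), guarded against a negative denominator
    mu = c_bar**2 / max(2.0 * r2.max() - c_bar**2, 1e-9)
    prev_cost = np.inf
    for _ in range(max_outer):
        # variable update: weighted least squares with fixed weights
        a = np.sum(w * x * y) / np.sum(w * x**2)
        r2 = (y - a * x) ** 2
        # weight update: closed form for the TLS surrogate (three branches)
        lo = mu / (mu + 1) * c_bar**2
        hi = (mu + 1) / mu * c_bar**2
        w = np.clip(c_bar * np.sqrt(mu * (mu + 1) / r2.clip(min=1e-12)) - mu, 0.0, 1.0)
        w[r2 <= lo] = 1.0                                  # confident inliers
        w[r2 >= hi] = 0.0                                  # confident outliers
        cost = np.sum(w * r2)
        if abs(prev_cost - cost) < 1e-9:                   # weighted sum of squares converged
            break
        prev_cost = cost
        mu *= 1.4                                          # increase non-convexity

    return a, w

# toy data: true slope a = 3, 30% of the measurements corrupted by positive outliers
rng = np.random.default_rng(0)
x = rng.uniform(1.0, 2.0, 100)
y = 3.0 * x + 0.01 * rng.standard_normal(100)
y[:30] += rng.uniform(2.0, 5.0, 30)
a, w = gnc_tls_line_fit(x, y)
print(f"slope = {a:.3f}, measurements kept = {int(w.sum())}")
```

No initial guess is needed: the first variable update is plain least squares with unit weights, and the $\mu$ schedule gradually reintroduces the truncation so outliers are driven to $w=0$.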