# $\tilde{O}(\log n)$ - Approximation Algorithm Ideas for Gaussian KDE
$$
\newcommand{\fkp}{\hat{f}_{K, P}}
\newcommand{\fkpi}{\hat{f}_{K, \Pi P}}
\newcommand{\sumpp}{\sum\limits_{p\in P}}
\newcommand{\sumppp}{\sum\limits_{p\in P'}}
\newcommand{\sumxk}{\sum\limits_{x\in K}}
\newcommand{\kxp}{k(\|x^* - p\|^2)}
\newcommand{\gxp}{g(\|x^* - p\|^2)}
\newcommand{\epstwo}{\frac{\epsilon}{2}}
\newcommand{\epsfour}{\frac{\epsilon}{4}}
\newcommand{\epssix}{\frac{\epsilon}{6}}
\newcommand{\xp}{\|x^* - p\|^2}
\newcommand{\maxxd}{\max_{x \in \mathbb{R}^d}}
\newcommand{\maxxm}{\max_{x \in \mathbb{R}^m}}
\newcommand{\norm}[1]{\|#1\|}
\newcommand{\R}{\mathbb{R}}
\newcommand{\B}{\mathcal{B}}
\newcommand{\A}{\mathcal{A}}
\newcommand{\S}{\mathcal{S}}
\newcommand{\oc}{\overline{c}}
\newcommand{\tp}{\tilde{p}}
$$
$$
\newcommand{\gammaone}{\epstwo\cdot\frac{\sumpp \kxp}{\sumpp \gxp \cdot \xp}}
\newcommand{\sumppcondition}[1]{\sum\limits_{p\in P, \norm{x^* -p}^2 \le #1}}
\newcommand{\sumppconditionp}[1]{\sum\limits_{p\in P, \norm{x^* -p}^2 > #1}}
$$
$$
\newcommand{\gammatwo}{\epsfour\cdot\frac{\sumppcondition{\xi} \kxp}{\sumppcondition{\xi} \gxp \cdot \xp}}
\DeclareMathOperator*{\argmax}{arg\,max}
$$
## Definitions
Define $m = \log n \log\log n$ balls $\B_1, \B_2, \ldots, \B_m$ centered at the optimal mode $x^*$. The squared radii $r_1^2, r_2^2, \ldots, r_m^2$ of those balls are as follows:
$$
\begin{align}
r_1^2 &= 1\\
r_2^2 &= (1 + 1/\log n)\\
&\ \ \vdots\\
r_i^2 &= (1 + 1/\log n)\, r_{i-1}^2\\
&\ \ \vdots\\
r_m^2 &= (1 + 1/\log n)^{\log n\log\log n} \approx \log n
\end{align}
$$
Then, we define the region between balls $\B_i$ and $\B_{i-1}$ to be the shell $\S_i$, as shown in the following figure.
<div style="text-align:center; width: 240px; border: green solid 1px;">
<img src=https://i.imgur.com/koQA6Lf.png style="margin: 0 auto;"/>
</div>
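To make the construction concrete, here is a minimal NumPy sketch of the shell assignment; the names (`build_shells`, `x_star`) are ours for illustration, and the mode $x^*$ is assumed to be given.
```python
import numpy as np

def build_shells(points, x_star, n):
    """Sketch: assign each point to one of m = log(n) * log(log(n)) shells
    around x_star, with r_1^2 = 1 and r_i^2 = (1 + 1/log n) * r_{i-1}^2."""
    log_n = np.log(n)
    m = int(np.ceil(log_n * np.log(log_n)))        # number of balls/shells
    r2 = (1.0 + 1.0 / log_n) ** np.arange(m)       # squared radii r_1^2 .. r_m^2
    d2 = np.sum((points - x_star) ** 2, axis=1)    # squared distances to x_star
    # index of the first ball containing each point; value m means "outside B_m"
    shell = np.searchsorted(r2, d2, side="left")
    return r2, shell
```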
## Approach I: a "magic" algorithm that returns all points inside $\S_i$
### Claim 1
The region outside $\B_m$ contributes at most a constant fraction of the total density at $x^*$ (the common center of all balls); equivalently, the shells capture a constant fraction of it:
$$
\sum_{i=1}^m \sum_{p\in \S_i}e^{-\norm{x^* - p}^2}\ge c\cdot \sumpp e^{-\norm{x^* - p}^2} \quad \text{for some constant } c\in (0, 1].
$$
**Proof.** We have $\sumpp e^{-\norm{x^* - p}^2}\ge 1$, because $x^*$ maximizes the density, so its value is at least the density at any data point $p$, which already contains the term $e^{-\norm{p - p}^2} = 1$. For the points located outside $\B_m$, each term is less than $e^{-r_m^2}$ and there are at most $n$ of them, so their total contribution is at most $1$:
$$
\sum_{p\not\in \B_m} e^{-\norm{x^* - p}^2} < n\cdot e^{-(1 + 1/\log n)^{\log n\log\log n}} \approx n\cdot e^{-\log n} = 1.
$$
### Claim 2
For some $i$, the shell $\S_i$ captures at least an average fraction of the optimum:
$$
\sum_{p\in \S_i} e^{-\norm{x^* - p}^2} \ge \frac{c}{\log n\log\log n} \cdot OPT = \frac{1}{\tilde{O}(\log n)}\cdot OPT, \quad \text{where } OPT = \sumpp e^{-\norm{x^* - p}^2}.
$$
**Proof.** By Claim 1, the $\log n\log\log n$ shells jointly capture at least $c\cdot OPT$; by the pigeonhole principle, at least one shell must capture at least the average, i.e., a $\frac{c}{\log n\log\log n}$ fraction of the optimum estimate value.
### Claim 3
First, **we project every point onto the outer boundary** of the shell $\S_i$ it belongs to, and denote the projected points by $\tilde{p}$. Then we set $\tilde{x}^* = \frac{1}{|\S_i|}\sum_{p\in \S_i} \tilde{p}$, the mean of the projected points of shell $\S_i$. For some $i$ we still have
$$
\sum_{p\in \S_i} e^{-\norm{\tilde{x}^* - \tilde{p}}^2} \ge \frac{1}{\tilde{O}(\log n)}\cdot OPT
$$
**Proof.** Since every projected point lies on the outer boundary of $\S_i$, i.e. $\norm{x^* - \tilde{p}}^2 = r_i^2$ for all $\tilde{p}$, the mean $\tilde{x}^*$ equals one mean-shift step from $x^*$ on the projected points of $\S_i$:
$$
\tilde{x}^* = \frac{\sum_{p\in \S_i}\tilde{p}\cdot e^{-\norm{x^* - \tilde{p}}^2}}{\sum_{p\in \S_i} e^{-\norm{x^* - \tilde{p}}^2}} = \frac{1}{|\S_i|} \sum_{p\in \S_i} \tilde{p}, \quad \text{because } \norm{x^* - \tilde{p}}^2 = r_i^2 \text{ for all } \tilde{p}.
$$
Since the Gaussian kernel has a convex, decreasing profile, a mean-shift step never decreases the density, so the step starting from $x^*$ satisfies
$$
\sum_{p\in \S_i}e^{-\norm{\tilde{x}^* - \tilde{p}}^2} \ge \sum_{p\in \S_i}e^{-\norm{x^* - \tilde{p}}^2}.
$$
We also have $\norm{x^* - \tilde{p}}^2 \le (1 + 1/\log n)\cdot\norm{x^* - p}^2$, because $p$ lies in shell $\S_i$, so $\norm{x^* - \tilde{p}}^2 = r_i^2 = (1 + 1/\log n)\, r_{i-1}^2 \le (1 + 1/\log n)\norm{x^* - p}^2$. For the density this gives:
$$
\sum_{p\in \S_i}e^{-\norm{\tilde{x}^* - \tilde{p}}^2} \ge \sum_{p\in \S_i}e^{-\norm{x^* - \tilde{p}}^2} \ge \sum_{p\in \S_i}e^{-(1 + 1/\log n)\cdot\norm{x^* - p}^2} = \sum_{p\in \S_i}e^{-\norm{x^* - p}^2} \cdot e^{-\norm{x^* - p}^2 / \log n}
$$
We have
$$
e^{-\norm{x^* - p}^2 / \log n} \ge e^{-1}, \quad \text{because } \norm{x^* - p}^2 \le r_m^2 \approx \log n.
$$
Thus, based on claim 2, we have
$$
\sum_{p\in \S_i}e^{-\norm{\tilde{x}^* - \tilde{p}}^2} \ge \sum_{p\in \S_i}e^{-\norm{x^* - p}^2} \cdot e^{-\norm{x^* - p}^2 / \log n} \ge e^{-1} \cdot \sum_{p\in \S_i}e^{-\norm{x^* - p}^2} \ge \frac{1}{\tilde{O}(\log n)} \cdot OPT.
$$
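The projection and one-step mean of Claim 3 are easy to simulate. The sketch below (illustrative names; the shell's points and $x^*$ are assumed known) also checks numerically that the projected mean does not decrease the density over the projected points, as claimed.
```python
import numpy as np

def projected_mean(shell_points, x_star, r_i):
    """Project the shell's points onto the sphere of radius r_i around x_star
    and return their mean; since the projected points are equidistant from
    x_star, this mean equals one mean-shift step from x_star (Claim 3)."""
    diff = shell_points - x_star
    p_tilde = x_star + r_i * diff / np.linalg.norm(diff, axis=1, keepdims=True)
    return p_tilde.mean(axis=0), p_tilde

def density(x, pts):
    """Unnormalized Gaussian KDE value at x (bandwidth 1)."""
    return np.exp(-np.sum((pts - x) ** 2, axis=1)).sum()

rng = np.random.default_rng(0)
x_star = np.zeros(5)
pts = rng.normal(size=(50, 5))  # stand-in points; projection puts them at radius r_i anyway
x_tilde, p_tilde = projected_mean(pts, x_star, r_i=1.5)
assert density(x_tilde, p_tilde) >= density(x_star, p_tilde)  # the Claim 3 inequality
```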
### Summary
We assumed we know the points in shell $\S_i$, projected them onto the outer boundary of $\S_i$ to obtain the points $\tilde{p}$, and took $\tilde{x}^*$ to be the mean of the $\tilde{p}$. As proven above, this $\tilde{x}^*$ achieves a $\tilde{O}(\log n)$-approximation of the $OPT$ density.
## Approach II: find the ring $\S_i'$ containing the maximum number of points with outer radius $r_i$ and inner radius $r_{i-1}$
Without the assumption that the "magic" algorithm can give the points inside $\S_i$, we use the Sk-EB algorithm to obtain an approximation $\S_i'$ of $\S_i$.
### Assumption 1: we have the ring $\S_i'$ with the maximum number of points
Here, we assume we have the ring $\S_i'$ with outer radius $r_i$ and inner radius $r_{i-1}$ that contains the most points. (**How to obtain this ring needs further discussion.**)
### Claim 4
Let $x^*$ be the center of $\S_i$ and $x'$ the center of $\S_i'$. As before, we project the points $p'$ inside $\S_i'$ onto the outer boundary of the ring $\S_i'$, obtaining $\tp'$. For all $i$, we have
$$
\sum_{\tilde{p}' \in \S_i'} e^{-\norm{x' - \tilde{p}' }^2} \ge c\cdot \sum_{p\in \S_i} e^{-\norm{x^* - p}^2}
$$
**Proof.** Ring $\S_i'$ has the same radii as ring $\S_i$. Thus, for each $p$,
$$
\norm{x' - \tp'}^2 = \norm{x^* - \tp}^2 \le (1 + 1/\log n)\cdot\norm{x^* - p}^2.
$$
By its choice, ring $\S_i'$ contains at least as many points as $\S_i$. Therefore, for the density value we have, for all $i$,
$$
\sum_{\tilde{p}' \in \S_i'} e^{-\norm{x' - \tilde{p}' }^2} \ge \sum_{p\in \S_i} e^{-\norm{x^* - \tp}^2} \ge \sum_{p\in \S_i} e^{-(1 + 1/\log n)\cdot\norm{x^* - p}^2} = \sum_{p\in \S_i}e^{-\norm{x^* - p}^2} \cdot e^{-\norm{x^* - p}^2 / \log n}.
$$
The last step follows from the same derivation as in Claim 3 (last part). We have
$$
e^{-\norm{x^* - p}^2 / \log n} \ge e^{-1}, \quad \text{because } \norm{x^* - p}^2 \le \log n.
$$
Thus, based on claim 2, we have
$$
\sum_{\tilde{p}' \in \S_i'} e^{-\norm{x' - \tilde{p}' }^2} \ge \sum_{p\in \S_i}e^{-\norm{x^* - p}^2} \cdot e^{-\norm{x^* - p}^2 / \log n} \ge e^{-1} \cdot \sum_{p\in \S_i}e^{-\norm{x^* - p}^2} \ge \frac{1}{\tilde{O}(\log n)} \cdot OPT.
$$
**In fact, the constant satisfies $c \ge e^{-1}$.**
### Claim 5
For some $i$, define $\S_i'$ as the ring with outer radius $r_i$ and inner radius $r_{i-1}$ containing the maximum number of points. With $x'$ as the center of $\S_i'$, we have
$$
\sum_{\tilde{p}' \in \S_i'} e^{-\norm{x' - \tilde{p}'}^2} \ge \frac{1}{\tilde{O}(\log n)}\cdot OPT
$$
**Proof.** Follows directly by combining Claims 1, 2, and 4.
### Claim 6
A constant approximation $\S_i'$ of the optimal ring $\S_i'^*$ gives the same result as Claim 5.
**Proof.** If $\S_i'$ contains at least a $c$ fraction of the points of $\S_i'^*$, then, since all projected points sit at squared distance $r_i^2$ from their respective centers,
$$
\sum_{\tilde{p}' \in \S_i'} e^{-\norm{x' - \tilde{p}'}^2} \ge c \sum_{\tilde{p}'^* \in \S_i'^*} e^{-\norm{x'^* - \tilde{p}'^*}^2}, \quad \text{because } \norm{x' - \tilde{p}'}^2 = \norm{x'^* - \tilde{p}'^*}^2 = r_i^2.
$$
## Find the Ring $\S_i'$
We need an algorithm that returns the center maximizing the number of points satisfying two specific quadratic constraints (the inner and outer radius) in a high-dimensional space.
We are still looking for possible solutions; a brute-force baseline is sketched below.
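For reference, a brute-force placeholder (our own sketch, not a proposed solution): try each data point as a candidate center and keep the ring containing the most points. It runs in $O(n^2 d)$ time and only illustrates what the missing subroutine must return.
```python
import numpy as np

def densest_ring(points, r_inner2, r_outer2):
    """Brute-force baseline: among the data points taken as candidate centers,
    return the one whose ring r_inner2 < ||c - p||^2 <= r_outer2 contains the
    maximum number of points.  O(n^2 d); a placeholder only."""
    best_center, best_count = None, -1
    for c in points:
        d2 = np.sum((points - c) ** 2, axis=1)
        count = int(np.count_nonzero((d2 > r_inner2) & (d2 <= r_outer2)))
        if count > best_count:
            best_center, best_count = c, count
    return best_center, best_count
```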
## Sk-EB Approximation for $\S_i'$
By the definition of Sk-EB, the Sk-EB ball of a fixed radius contains at least as many points as any ring with that outer radius. Thus, let sphere $\B_i''$ with center $x''$ be the Sk-EB ball with the same radius as $\S_i'$ (we project all its points onto the boundary of $\B_i''$); then
$$
\sum_{\tp\in \B_i''} e^{-\norm{x'' - \tp}^2} \ge \sum_{\tilde{p}' \in \S_i'} e^{-\norm{x' - \tilde{p}'}^2}
$$
We cannot obtain the optimum ball $\B_i''^*$ directly, although we want
$$
\sum_{\tp\in \B_i''} e^{-\norm{x'' - \tp}^2} \ge \frac{1}{\tilde{O}(\log n)}\cdot OPT
$$
The paper [A Structural Theorem for Center-Based Clustering in High-Dimensional Euclidean Space](https://rdcu.be/cmPZK) relates the approximation factor to the resulting time complexity (mainly Theorems 1 and 3).


Then, we discuss which approximation ratio between $\sum_{\tp\in \B_i''} e^{-\norm{x'' - \tp}^2}$ and $\sum_{\tp\in \B_i''^*} e^{-\norm{x''^* - \tp}^2}$ yields a polynomial time complexity.
### Time complexity analysis of a constant approximation of the Sk-EB estimate value
Based on $N(n, \epsilon) = O((n/\epsilon)^{\frac{2}{\epsilon}\log \frac{2}{\epsilon}})$, if we want a polynomial time complexity, $\epsilon$ has to be a constant $c$.
We have $\norm{x'' - \tp}^2 \le (1 + c)\norm{x''^* - \tp}^2$. Recall that $\norm{x''^* - \tp}^2$ equals the squared radius of ring $\S_i'$, which is $\norm{x' - \tp}^2 = r_i^2$, the same for every $\tp$, so the factor below can be pulled out of the sum:
$$
\sum_{\tp\in \B_i''} e^{-\norm{x'' - \tp}^2} \ge \sum_{\tp\in \B_i''} e^{-(1 + c)\norm{x' - \tp}^2} = e^{-c\norm{x' - \tp}^2}\cdot \sum_{\tp\in \B_i''} e^{-\norm{x' - \tp}^2} \ge \frac{e^{-c\norm{x' - \tp}^2}}{\tilde{O}(\log n)}\cdot OPT
$$
Then, we analyze $e^{-c\norm{x' - \tp}^2}$: in our case $\norm{x' - \tp}^2 \le \log n$, so $e^{-c\norm{x' - \tp}^2} \ge e^{-c\log n} = n^{-c}$.
We have
$$
\sum_{\tp\in \B_i''} e^{-\norm{x'' - \tp}^2} \ge \frac{1}{O(n^c)} \cdot OPT, \quad \text{for any } c\in (0, 1).
$$
With a constant $\epsilon$, the time complexity is therefore polynomial, but the approximation factor degrades to $1/O(n^c)$. However, with $\epsilon = 1/\log n$ the time complexity is $O((n\log n)^{2\log n \log\log n}\cdot d)$.
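To see the gap concretely, the short computation below evaluates $\log_{10} N(n, \epsilon)$ with the big-$O$ constants dropped (and a natural log assumed inside the exponent) for a constant $\epsilon$ versus $\epsilon = 1/\log n$:
```python
import math

def log10_candidates(n, eps):
    """log10 of (n/eps)^((2/eps) * ln(2/eps)): the candidate-set size
    N(n, eps) with the big-O constants dropped."""
    return (2.0 / eps) * math.log(2.0 / eps) * math.log10(n / eps)

n = 10**6
print(log10_candidates(n, 0.5))              # constant eps: n^{O(1)}, polynomial
print(log10_candidates(n, 1 / math.log(n)))  # eps = 1/log n: super-polynomial
```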
One possible solution is to add the bandwidth $h = \sqrt{\log n}$. The approach above fails mainly because the earlier claims set the cut-off radius to $\sqrt{\log n}$, which forces $\epsilon$ to be $1/\log n$; with $h = \sqrt{\log n}$, the rescaled distances satisfy $\norm{(x' - \tp)/h}^2 \le O(1)$.
With this $h$, we can solve the KDE problem in polynomial time with a $\tilde{O}(\log n)$-approximation of the estimate value, since a constant $\epsilon$ now suffices.
### Directly adapting function $f$ to the KDE problem

Here, $f$ is maximized instead of minimized, so the inequality direction becomes $\ge$.
#### Epanechnikov Kernel
**Assume that after the approximation no points are cut off by the kernel's support boundary.**
For the Epanechnikov kernel, we have $k = 1$ and $f(X; c_1^*) = \sum_{x \in X} \left(1 - \norm{c_1^* - x}^2\right)$, $f(X; c_1') = \sum_{x \in X} \left(1 - \norm{c_1' - x}^2\right)$.
For the distance relation, with $\epsilon$ a constant:
$$
\norm{c_1' - x}^2 \le (1 + \epsilon) \cdot \norm{c_1^* - x}^2.
$$
Then, we want to have
$$
f(X; c_1') = \sum_{x \in X} \left(1 - \norm{c_1' - x}^2\right) \ge \sum_{x \in X} \left(1 - (1 + \epsilon)\norm{c_1^* - x}^2\right) \ge \alpha \sum_{x \in X} \left(1 - \norm{c_1^* - x}^2\right)
$$
So, per point, $\alpha$ can be taken as
$$
\alpha = \frac{1 - (1 + \epsilon)\cdot \norm{c_1^* - x}^2}{1 - \norm{c_1^* - x}^2}
$$
Here, we have $\norm{c_1^* - x}^2 \le 1$ (the kernel's support), and $\epsilon$ is also a constant.
- When $\norm{c_1^* - x}^2$ is a constant smaller than $1/(1 + \epsilon)$, $\alpha$ is a positive constant, too.
- When $\norm{c_1^* - x}^2$ is sub-constant, such as $1/n$ or $1/\log n$, the $(1 + \epsilon)$ factor is negligible and $\alpha \to 1$.
Thus, in every case $\alpha = \mu(1 + \epsilon)$ is a constant, and the time complexity is polynomial.
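A quick numeric check of the per-point factor $\alpha$ from the formula above (illustrative values only); note that $\alpha$ shrinks as $\norm{c_1^* - x}^2$ approaches $1/(1 + \epsilon)$:
```python
# Per-point alpha for the Epanechnikov kernel:
# alpha = (1 - (1 + eps) * d2) / (1 - d2), where d2 = ||c_1^* - x||^2 < 1.
eps = 0.1                             # illustrative constant epsilon
for d2 in (0.01, 0.1, 0.5, 0.9):
    alpha = (1 - (1 + eps) * d2) / (1 - d2)
    print(f"d2 = {d2:>4}: alpha = {alpha:.3f}")
```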
#### Gaussian Kernel
In our Gaussian KDE, we have $k = 1$ and $f(X; c_1^*) = \sum_{x \in X} e^{-\norm{c_1^* - x}^2}$, $f(X; c_1') = \sum_{x \in X} e^{-\norm{c_1' - x}^2}$.
For the distance relation, with $\epsilon$ a constant:
$$
\norm{c_1' - x}^2 \le (1 + \epsilon) \cdot \norm{c_1^* - x}^2.
$$
Then, we have
$$
f(X; c_1') = \sum_{x \in X} e^{-\norm{c_1' - x}^2} \ge \sum_{x \in X} e^{-(1 + \epsilon)\norm{c_1^* - x}^2} \ge e^{-\epsilon \max\limits_{x\in X}\norm{c_1^* - x}^2} \cdot \sum_{x \in X} e^{-\norm{c_1^* - x}^2}
$$
Based on $\norm{c_1^* - x}^2 \le \log n$ and $\epsilon$ being a constant, we have
$$
e^{-\epsilon \max\limits_{x\in X}\norm{c_1^* - x}^2} \ge e^{-\epsilon\log n} = n^{-\epsilon}.
$$
So the approximation factor is $\mu(1 + \epsilon) = n^{-\epsilon}$, which degrades polynomially in $n$ rather than being a constant.
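A one-line check of the identity $e^{-\epsilon \log n} = n^{-\epsilon}$ behind this factor (illustrative values):
```python
import math

n, eps = 10**6, 0.1
# the lost factor e^{-eps * log n} equals n^{-eps}: polynomial in n, not constant
print(math.exp(-eps * math.log(n)), n ** (-eps))   # both ~0.2512
```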