# Problem 4 - Group B

## Presentation

Write your presentation here if you like (you can also use your iPad if one of you has one)

Dummy formula:

$$R(f) - R(f^\star) \leq \mathcal{R}(\mathcal{H}) - \gamma$$

Dummy align environment

\begin{align}
R(f) - R(f^\star) &\leq \alpha \beta \\
&\leq c\tau
\end{align}

```
Some code
```

## Scratchpad for collaboration

You can brainstorm ideas here and delete this section later (or keep it)

## Question 1

- Classical statistics
  - the empirical distribution converges to the true one
  - the degrees of freedom are fixed
  - more data helps
- High-dimensional regime
  - more data does not necessarily help
  - the average distance between data points grows as $\mathcal{O}(\sqrt{d})$
- In modern image datasets, $n$ and $d$ roughly grow together, hence high-dimensional statistics provides better intuition
  - e.g. MNIST -> CIFAR10 -> ImageNet: more features and also more data available
- Thm 1 says that the learned function converges to a polynomial of degree 2
  - => anything that is not representable by a degree-2 polynomial incurs some error
  - we can only interpolate what is interpolatable by a degree-2 polynomial
- We do not expect the bias to vanish as $d \to \infty$ since the problem also becomes harder
  - but shouldn't it actually become *easier* when we increase $d$ and get more degrees of freedom?

## Question 2

### 1)

Trivial since $x_i^T x_i = 1$.

### 2)

We use $k(x,x') = (x^T x')^3$.

Condition on the event $\mathcal{E}_\mathbf{X}$ (for some $\epsilon > 0$). Bounding the operator norm by the Frobenius norm,

$$
\begin{align}
\|M\| &\leq \|M\|_{\mathrm{fro}} \\
&= \sqrt{2\sum_{i < j} k(x_i, x_j)^2} \\
&\leq \sqrt{2}\, n \max_{i\neq j} |x_i^T x_j|^3 \\
&\leq c\, n \cdot n^{-\frac{3}{2}} (\log{n})^{\frac{3(1+\epsilon)}{2}} \\
&= c\, \frac{(\log{n})^{\frac{3(1+\epsilon)}{2}}}{\sqrt{n}} \to 0
\end{align}
$$

for some $c > 0$ (the $\sqrt{2}$ is absorbed into $c$).

<!-- - distribution of $x^T x'$? -->
<!--
&\leq \sqrt{2\sum_{x\neq x'}\mathbb{E}[k(x,x')^2]} \\
&\leq \sqrt{2\sum_{x\neq x'}\mathbb{E}[e^{2x^Tx'}]} \\
&\leq n \sqrt{2 \mathbb{E}[e^{2x^Tx'}]}
-->
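As a numerical sanity check of the bound in 2) (illustration only, not part of the proof): the sketch below samples points uniformly on the unit sphere $S^{d-1}$, checks that $\max_{i\neq j}|x_i^T x_j|$ is of the same order as $\sqrt{\log n / n}$, and tracks $\|M\|$ and $\|M\|_{\mathrm{fro}}$ for the off-diagonal cubic-kernel matrix. The uniform-sphere data model and the scaling $d = 2n$ are assumptions made here for illustration; the actual distribution and the precise form of $\mathcal{E}_\mathbf{X}$ are the ones from the problem statement.

```python
# Sanity check (illustration only): points uniform on S^{d-1}, cubic kernel.
# Assumptions: data uniform on the unit sphere and d = 2n; adjust to the actual setup.
import numpy as np

rng = np.random.default_rng(0)

for n in [200, 800, 3200]:
    d = 2 * n  # assumed proportional growth of dimension with sample size

    # Uniform points on the unit sphere: normalise Gaussian vectors.
    X = rng.standard_normal((n, d))
    X /= np.linalg.norm(X, axis=1, keepdims=True)

    G = X @ X.T          # Gram matrix; diagonal is exactly 1 since ||x_i|| = 1
    K = G ** 3           # cubic kernel k(x, x') = (x^T x')^3
    M = K - np.eye(n)    # off-diagonal part (K has ones on the diagonal)

    max_inner = np.abs(G - np.eye(n)).max()
    spec = np.abs(np.linalg.eigvalsh(M)).max()  # spectral norm of the symmetric matrix M
    print(f"n={n:5d}  max|x_i^T x_j|={max_inner:.3f}  "
          f"sqrt(log n / n)={np.sqrt(np.log(n) / n):.3f}  "
          f"||M||={spec:.4f}  ||M||_fro={np.linalg.norm(M, 'fro'):.4f}")
```

Both norms should visibly shrink as $n$ grows, consistent with the $(\log n)^{3(1+\epsilon)/2}/\sqrt{n}$ rate above; the operator norm comes out noticeably smaller than the Frobenius bound, so the bound is loose but sufficient for the conclusion.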