Course: Neural Nets
Deadline: Feb 20
Collaboration policy. You may collaborate with other students and submit a report as a group of 1 to 5 members. Note that submitting individually provides no grading advantage, so we encourage you to work in groups. Please note that this group policy has been slightly modified from the course syllabus.
Submission policy. Please submit your reports to Canvas for evaluation. Reports can be PDF files or a link to a Markdown page on hackmd.io or a similar website. Sharing a link to a HackMD page allows us to post comments and ask questions about your report.
Generative AI policy. We recommend avoiding generative AI or, at the very least, ensuring that you fully understand any generated solution before including it in your report. If you do not find the exercises interesting, you have other options for credit, such as helping with the lecture notes instead.
Q & A policy. You can add questions as comments on this page: select a part of the text and attach a comment to it, and I will clarify.
Suppose that \(x_1, \dots, x_n \in \mathbb{R}^d\) are drawn i.i.d. from distribution \(\mu\) and \(y_1, \dots, y_n\) are drawn i.i.d. from \(\nu\). In the Introduction Lecture, we defined the empirical Maximum Mean Discrepancy (MMD) between \(\mu\) and \(\nu\) as
\[\text{MMD}_n(\mu,\nu):=\frac{1}{n^2} \sum_{i,j=1}^n k(x_i,x_j) - \frac{2}{n^2} \sum_{i,j=1}^n k(x_i,y_j) + \frac{1}{n^2} \sum_{i,j=1}^n k(y_i,y_j)\]
Recall that this metric can be used as a statistical test of whether \(\mu\) and \(\nu\) are identical. In this exercise, we assume \(k(x,y) = e^{-\|x-y\|^2/2}\).
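For concreteness, here is a minimal sketch of computing the empirical MMD directly from the definition above (assuming both samples have size \(n\), as in the formula; the function name and NumPy usage are our own illustration, not part of the assignment):

```python
import numpy as np

def mmd_direct(X, Y):
    """Empirical MMD with Gaussian kernel k(x, y) = exp(-||x - y||^2 / 2),
    computed directly from the three n-by-n Gram matrices."""
    def gram(A, B):
        # Pairwise squared distances via ||a - b||^2 = ||a||^2 - 2 a.b + ||b||^2.
        sq = np.sum(A**2, axis=1)[:, None] - 2.0 * A @ B.T + np.sum(B**2, axis=1)[None, :]
        return np.exp(-sq / 2.0)
    n = X.shape[0]
    return (gram(X, X).sum() - 2.0 * gram(X, Y).sum() + gram(Y, Y).sum()) / n**2
```

Since each Gram matrix has \(n^2\) entries, this direct approach is what part a asks you to reason about.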
Part a.
Can we compute MMD\(_n\) for \(n=100{,}000\)? What is the time complexity of computing MMD\(_n\) (\(O\)-complexity in \(n\))?
Part b.
To tackle the above challenge, propose a method to approximate MMD\(_n\) for large \(n\). Hint: see the Random Features lecture notes or slides.
Part c.
Implement your proposed method and report the approximate MMD you obtain for the points \(x_1, \dots, x_n\) and \(y_1, \dots, y_n\) available in this colab. You may include your implementation in your report.
X = data[0, 0, :, :]  # n x d matrix containing the x_i's
Y = data[0, 1, :, :]  # n x d matrix containing the y_i's
# Your implementation
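As one possible sketch (not necessarily the intended solution): for the Gaussian kernel, random Fourier features \(\varphi(x) = \sqrt{2/D}\,\cos(W^\top x + b)\) with \(W \sim \mathcal{N}(0, I)\) and \(b \sim \mathrm{Unif}[0, 2\pi)\) satisfy \(k(x,y) \approx \varphi(x)^\top \varphi(y)\), so MMD\(_n\) reduces to a squared distance between the feature-space means of the two samples. The function name, the default \(D\), and the seed below are illustrative choices:

```python
import numpy as np

def mmd_rff(X, Y, D=500, seed=0):
    """Approximate MMD via random Fourier features for k(x,y)=exp(-||x-y||^2/2)."""
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    W = rng.normal(size=(d, D))             # frequencies ~ N(0, I), matching the Gaussian kernel
    b = rng.uniform(0.0, 2 * np.pi, size=D)  # random phases
    phi = lambda Z: np.sqrt(2.0 / D) * np.cos(Z @ W + b)
    mean_x = phi(X).mean(axis=0)            # feature-space mean of the x sample
    mean_y = phi(Y).mean(axis=0)            # feature-space mean of the y sample
    return float(np.sum((mean_x - mean_y) ** 2))
```

Note that the feature maps are computed in a single pass over the samples, which is what makes this approach relevant for part d.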
Part d.
What is the time complexity of your proposed method in \(n\)? Compare with your answer in part a.
Part e.
What is the approximation error of your proposed method? There is no need to derive the error bound; just state it in \(O(\cdot)\) notation. Hint: see Claim 1 in the original paper on random features.
Part f. How can you improve the estimation error in part e?
Recall: A function \(f\) obeys the reproducing property (RP) associated with kernel \(k\) if
\[\textbf{RP:} \quad \quad \quad f(x) = \int k(x,y) f(y) dy\]
Assume \(k(x,y)= e^{-\| x - y \|^2/2}\). We want to provide insights on this property.
Background
A function \(f\) is Lipschitz continuous if there is a finite constant \(L\) such that \(|f(x)-f(z)|\leq L \| x- z \|\) holds for all inputs \(x\) and \(z\).
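For concreteness, a standard one-dimensional example of this definition (our own illustration, not from the lecture notes): \(f(x) = \sin x\) is Lipschitz continuous with \(L = 1\), since
\[|\sin x - \sin z| \le |x - z| \quad \text{for all } x, z \in \mathbb{R},\]
which follows from the derivative bound \(|f'(x)| = |\cos x| \le 1\).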
Part a.
Prove: a differentiable \(f\) is Lipschitz continuous if \(\max_x \| \nabla f(x)\|\) is bounded. Recall that \(\nabla f(x)\) denotes the gradient of \(f\) evaluated at \(x\). Hint: use Taylor's remainder theorem.
Part b.
Prove that \(f(x) = k(x,b)\) is Lipschitz continuous. Hint: use part a.
Part c.
Suppose that \(\int |f(y)|dy\) is finite and \(f\) obeys RP.
Prove: \(f\) is Lipschitz continuous. Hint: use part b.
Part d.
Do you think the following function obeys RP? Why? Hint: use part c.
Part e.
Can you now explain why kernel regression (shown below and studied in the lecture notes on kernel methods) performs poorly for this function? Hint: recall the representer theorem in the lecture notes and use part d.