# Mathematics for Deep Learning

$$ \text{dataset } D = \{(x_1, y_1), (x_2, y_2), \dots\} $$

## Square Loss

### Polynomial Function

$$ f(x;\theta) = \theta_0 + \theta_1 x + \dots + \theta_n x^{n} $$

Construct the matrices $X$, $\Theta$, and $Y$:

$$ X = \left[ \begin{array}{ccccc} 1 & x_1 & x_1^2 & \cdots & x_1^n \\ 1 & x_2 & x_2^2 & \cdots & x_2^n \\ \vdots & \vdots & \vdots & \ddots & \vdots \end{array} \right], \quad \Theta = \left[ \begin{array}{c} \theta_0 \\ \theta_1 \\ \vdots \\ \theta_n \end{array} \right], \quad Y = \left[ \begin{array}{c} y_1 \\ y_2 \\ \vdots \end{array} \right] $$

Then,

$$ Loss(\theta) = \sum_{i=1}^{|D|} (y_i - f(x_i;\theta))^2 = ||X\Theta - Y||^2 $$

Goal: minimize $||X\Theta - Y||^2$.

$$
\begin{split}
||X\Theta - Y||^2 = & (X\Theta - Y)^T(X\Theta - Y) \\
= & (X\Theta)^T(X\Theta) - (X\Theta)^T Y - Y^T(X\Theta) + Y^TY \\
= & \Theta^T (X^TX) \Theta - \Theta^TX^TY - Y^TX\Theta + Y^TY \\
= & \Theta^T(X^TX)\Theta - 2\cdot Y^TX\Theta + Y^TY \\
\frac{\partial}{\partial \Theta} ||X\Theta - Y||^2 = & 2\cdot (X^TX)\Theta - 2 \cdot X^TY
\end{split}
$$

(The middle terms combine because $\Theta^TX^TY$ is a scalar, so $\Theta^TX^TY = (\Theta^TX^TY)^T = Y^TX\Theta$.)

If $X^TX$ is invertible,

$$ \frac{\partial}{\partial \Theta} ||X\Theta - Y||^2 = 0 \implies \Theta = (X^TX)^{-1}(X^TY) $$

## Absolute Loss

$$ f(x;\theta) = \theta_0 + \theta_1 x + \theta_2 x^2 $$

$$
\begin{split}
Loss(\theta) = & \sum_{i=1}^{|D|} |y_i - f(x_i;\theta)| \\
= & \sum_{i = 1}^{|D|} |y_i - (\theta_0 + \theta_1 x_i + \theta_2 x_i^2)| \\
= & \sum_{i = 1}^{|D|} \sqrt{(y_i - (\theta_0 + \theta_1 x_i + \theta_2 x_i^2))^2}
\end{split}
$$

$$
\begin{split}
\frac{\partial}{\partial \theta_0} Loss(\theta) = & \sum_{i=1}^{|D|}\frac{\partial}{\partial \theta_0} \sqrt{(y_i - (\theta_0 + \theta_1 x_i + \theta_2 x_i^2))^2} \\
= & \sum_{i = 1}^{|D|} \left(\frac{1}{2} \cdot \frac{1}{\sqrt{(y_i - (\theta_0 + \theta_1 x_i + \theta_2 x_i^2))^2}}\right)\left(2 \cdot (y_i - (\theta_0 + \theta_1 x_i + \theta_2 x_i^2))\right)(-1) \\
= & \sum_{i=1}^{|D|}\frac{y_i - (\theta_0 + \theta_1 x_i + \theta_2 x_i^2)}{\sqrt{(y_i - (\theta_0 + \theta_1 x_i + \theta_2 x_i^2))^2}} \cdot (-1) \\
= & \sum_{i=1}^{|D|} \frac{y_i - f(x_i;\theta)}{|y_i - f(x_i;\theta)|}\cdot (-1)
\end{split}
$$

$$
\begin{split}
\frac{\partial}{\partial \theta_1} Loss(\theta) = & \sum_{i=1}^{|D|}\frac{\partial}{\partial \theta_1} \sqrt{(y_i - (\theta_0 + \theta_1 x_i + \theta_2 x_i^2))^2} \\
= & \sum_{i = 1}^{|D|} \left(\frac{1}{2} \cdot \frac{1}{\sqrt{(y_i - (\theta_0 + \theta_1 x_i + \theta_2 x_i^2))^2}}\right)\left(2 \cdot (y_i - (\theta_0 + \theta_1 x_i + \theta_2 x_i^2))\right)(-x_i) \\
= & \sum_{i=1}^{|D|}\frac{y_i - (\theta_0 + \theta_1 x_i + \theta_2 x_i^2)}{\sqrt{(y_i - (\theta_0 + \theta_1 x_i + \theta_2 x_i^2))^2}} \cdot (-x_i) \\
= & \sum_{i=1}^{|D|} \frac{y_i - f(x_i;\theta)}{|y_i - f(x_i;\theta)|}\cdot (-x_i)
\end{split}
$$

$$
\begin{split}
\frac{\partial}{\partial \theta_2} Loss(\theta) = & \sum_{i=1}^{|D|}\frac{\partial}{\partial \theta_2} \sqrt{(y_i - (\theta_0 + \theta_1 x_i + \theta_2 x_i^2))^2} \\
= & \sum_{i = 1}^{|D|} \left(\frac{1}{2} \cdot \frac{1}{\sqrt{(y_i - (\theta_0 + \theta_1 x_i + \theta_2 x_i^2))^2}}\right)\left(2 \cdot (y_i - (\theta_0 + \theta_1 x_i + \theta_2 x_i^2))\right)(-x_i^2) \\
= & \sum_{i=1}^{|D|}\frac{y_i - (\theta_0 + \theta_1 x_i + \theta_2 x_i^2)}{\sqrt{(y_i - (\theta_0 + \theta_1 x_i + \theta_2 x_i^2))^2}} \cdot (-x_i^2) \\
= & \sum_{i=1}^{|D|} \frac{y_i - f(x_i;\theta)}{|y_i - f(x_i;\theta)|}\cdot (-x_i^2)
\end{split}
$$

Note that these expressions are subgradients: the absolute loss is not differentiable at any point where $y_i = f(x_i;\theta)$, since $|\cdot|$ has a kink at $0$, so there is no closed-form solution and the loss is typically minimized iteratively.
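The closed-form solution $\Theta = (X^TX)^{-1}(X^TY)$ for the square loss can be sketched in NumPy. This is a minimal illustration, not part of the original notes: the dataset values are hypothetical, generated from $y = 1 + x + x^2$ so the recovered coefficients are easy to check.

```python
import numpy as np

# Hypothetical toy dataset generated from y = 1 + x + x^2
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.0, 3.0, 7.0, 13.0, 21.0])

n = 2                                      # polynomial degree
X = np.vander(x, n + 1, increasing=True)   # rows [1, x_i, x_i^2], as in the matrix X above

# Closed form: Theta = (X^T X)^{-1} (X^T Y).
# Solving the linear system is numerically more stable than forming the inverse.
theta = np.linalg.solve(X.T @ X, X.T @ y)
print(theta)                               # → approximately [1. 1. 1.]
```

In practice one would use `np.linalg.lstsq` (or `np.polyfit`), which solves the same least-squares problem without explicitly forming $X^TX$ and also handles the case where $X^TX$ is singular.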
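The three subgradients derived above stack into a single vector $-X^T s$, where $s_i = (y_i - f(x_i;\theta))/|y_i - f(x_i;\theta)|$ is the sign of the $i$-th residual. The sketch below (my own illustration with hypothetical data, generated from $2 + x - 0.5x^2$ plus one outlier) implements that subgradient and minimizes the absolute loss with subgradient descent.

```python
import numpy as np

def abs_loss(theta, x, y):
    """Sum of |y_i - f(x_i; theta)| for f(x) = theta0 + theta1*x + theta2*x^2."""
    X = np.vander(x, 3, increasing=True)   # rows [1, x_i, x_i^2]
    return np.abs(y - X @ theta).sum()

def grad_abs_loss(theta, x, y):
    """Subgradient from the derivation: sum_i (y_i - f)/|y_i - f| * (-x_i^k)."""
    X = np.vander(x, 3, increasing=True)
    r = y - X @ theta                      # residuals y_i - f(x_i; theta)
    s = np.sign(r)                         # (y_i - f)/|y_i - f|; 0 exactly at a kink
    return -(X.T @ s)                      # stacks the three partial derivatives

# Hypothetical toy data: a quadratic plus one outlier
x = np.linspace(-2.0, 2.0, 50)
y = 2.0 + x - 0.5 * x**2
y[10] += 5.0                               # absolute loss is robust to this point

# Subgradient descent with a decaying step size
theta = np.zeros(3)
for t in range(5000):
    theta -= 0.01 / np.sqrt(t + 1) * grad_abs_loss(theta, x, y)

print(theta)                               # fitted coefficients
```

Because the loss is non-smooth, a decaying step size is used instead of a fixed one: a subgradient step is not guaranteed to decrease the loss, but the iterates still converge with steps of order $1/\sqrt{t}$.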