---
title: Matrix Derivatives
tags: MMDS
---

# Matrix Derivatives

## Gradient $\nabla f$

Let $f: \mathbb{R}^n \mapsto \mathbb{R}$ be a differentiable multivariable function. The gradient of the ==scalar function== $f$ at the point $x=(x_1, x_2,\cdots,x_n)$ is

$$\nabla f\equiv \begin{bmatrix}\frac{\partial f}{\partial x_1} &\frac{\partial f}{\partial x_2} &\cdots &\frac{\partial f}{\partial x_n}\end{bmatrix}^T,$$

an $n$-dimensional column ==vector==.

:::warning
Note: $\nabla$ acts on a **scalar function**.
:::

## Jacobian

In calculus the Jacobian usually appears in change-of-variables arguments, where it measures how a unit of area is scaled. Here we take a different view and read it as the **generalization** of the gradient to ==vector functions==.

Let $f: \mathbb{R}^n \mapsto\mathbb{R}^m$. Then

$$J(\vec f) \equiv \begin{bmatrix} (\nabla f_1)^T\\ (\nabla f_2)^T\\ \vdots\\ (\nabla f_m)^T \end{bmatrix}= \begin{bmatrix} \frac{\partial f_1}{\partial x_1} &\frac{\partial f_1}{\partial x_2} &\cdots &\frac{\partial f_1}{\partial x_n}\\ \frac{\partial f_2}{\partial x_1} &\frac{\partial f_2}{\partial x_2} &\cdots &\frac{\partial f_2}{\partial x_n}\\ \vdots &\vdots &\ddots &\vdots\\ \frac{\partial f_m}{\partial x_1} &\frac{\partial f_m}{\partial x_2} &\cdots &\frac{\partial f_m}{\partial x_n} \end{bmatrix}$$

**Note:** the Jacobian also comes in a transposed layout. An informal mnemonic: vectors are conventionally written as columns, so the derivatives have nowhere left to go but sideways :sweat_smile:

:::warning
Note: the Jacobian acts on a **vector function**.
:::

## Hessian

The Hessian can be seen as the combination of the Jacobian and the gradient. The gradient is itself a vector function: given an $n$-dimensional point, it returns the slope in each coordinate direction, arranged as a vector. We can therefore apply the Jacobian to it:

$$H(f)\equiv J(\nabla f)= \begin{bmatrix} (\nabla \frac{\partial f}{\partial x_1})^T\\ (\nabla \frac{\partial f}{\partial x_2})^T\\ \vdots\\ (\nabla \frac{\partial f}{\partial x_n})^T \end{bmatrix}= \begin{bmatrix} \frac{\partial}{\partial x_1}(\frac{\partial f}{\partial x_1}) &\frac{\partial}{\partial x_2}(\frac{\partial f}{\partial x_1}) &\cdots &\frac{\partial}{\partial x_n}(\frac{\partial f}{\partial x_1})\\ \frac{\partial}{\partial x_1}(\frac{\partial f}{\partial x_2}) &\frac{\partial}{\partial x_2}(\frac{\partial f}{\partial x_2}) &\cdots &\frac{\partial}{\partial x_n}(\frac{\partial f}{\partial x_2})\\ \vdots &\vdots &\ddots &\vdots\\ \frac{\partial}{\partial x_1}(\frac{\partial f}{\partial x_n}) &\frac{\partial}{\partial x_2}(\frac{\partial f}{\partial x_n}) &\cdots &\frac{\partial}{\partial x_n}(\frac{\partial f}{\partial x_n}) \end{bmatrix}$$

:::warning
- Since $\frac{\partial^2 f}{\partial x_i\partial x_j}=\frac{\partial^2 f}{\partial x_j \partial x_i}$ (for $f$ with continuous second derivatives), $H(f)$ is a symmetric matrix.
- By definition, the Hessian acts on a **scalar function**.
- It can also be written as $H(f) = \begin{bmatrix}\nabla\frac{\partial f}{\partial x_1} &\nabla\frac{\partial f}{\partial x_2} &\cdots &\nabla\frac{\partial f}{\partial x_n}\end{bmatrix}$ (because the Hessian is symmetric).
:::

## Other

Conventionally, the gradient is treated as the first derivative of a scalar function, and the Hessian as its second derivative.

<!--
## Chain Rule

Let the scalar $o = g(\vec y)$, and let $\vec v$ be the gradient of $g$ with respect to $\vec y$.
Further let $\vec y = f(\vec x)$, with $J^y_x$ the Jacobian of $f$.
Then $(J^y_x)^T\vec v$ is the gradient of $o$ (that is, of $g\circ f$) with respect to $\vec x$.

Proof:

$$(J^y_x)^{T}\vec v=\begin{bmatrix} \frac{\partial y_{1}}{\partial x_{1}} & \cdots & \frac{\partial y_{m}}{\partial x_{1}}\\ \vdots & \ddots & \vdots\\ \frac{\partial y_{1}}{\partial x_{n}} & \cdots & \frac{\partial y_{m}}{\partial x_{n}} \end{bmatrix}\begin{bmatrix} \frac{\partial o}{\partial y_{1}}\\ \vdots\\ \frac{\partial o}{\partial y_{m}} \end{bmatrix}=\begin{bmatrix} \frac{\partial o}{\partial x_{1}}\\ \vdots\\ \frac{\partial o}{\partial x_{n}} \end{bmatrix}$$

:::warning
The above is simply the effect of function composition (equivalently, a change of variables) on the gradient.
:::
-->
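
## Checking the definitions with JAX

Each definition above can be verified numerically with an automatic-differentiation library. The sketches below use JAX; the library choice and all test functions are illustrative assumptions of mine, not part of the original note. First, the gradient of a scalar function:

```python
# Minimal sketch: gradient of a scalar function f: R^2 -> R.
# The test function is hypothetical, chosen for easy hand-checking.
import jax
import jax.numpy as jnp

def f(x):
    return x[0] ** 2 + 3.0 * x[1]

x = jnp.array([1.0, 2.0])
# jax.grad(f) returns a function that evaluates the gradient vector.
print(jax.grad(f)(x))  # [2. 3.], matching (2*x_1, 3)
```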
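
For a vector function, `jax.jacobian` stacks the gradients of the component functions as rows, giving exactly the $m \times n$ layout defined above (again a sketch with an assumed test function):

```python
# Minimal sketch: Jacobian of a vector function F: R^2 -> R^3.
import jax
import jax.numpy as jnp

def F(x):
    return jnp.array([x[0] * x[1], jnp.sin(x[0]), x[1] ** 2])

x = jnp.array([1.0, 2.0])
# Rows are (∇F_1)^T, (∇F_2)^T, (∇F_3)^T.
print(jax.jacobian(F)(x))
# [[2.     1.    ]
#  [0.5403 0.    ]
#  [0.     4.    ]]
```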
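
Finally, `jax.hessian` composes two differentiations, mirroring $H(f) = J(\nabla f)$, and the symmetry noted in the warning box can be checked directly (test function again hypothetical):

```python
# Minimal sketch: Hessian of a scalar function f: R^2 -> R.
import jax
import jax.numpy as jnp

def f(x):
    return x[0] ** 2 * x[1] + x[1] ** 3

x = jnp.array([1.0, 2.0])
H = jax.hessian(f)(x)
print(H)                     # [[ 4.  2.]
                             #  [ 2. 12.]]
print(jnp.allclose(H, H.T))  # True: the Hessian is symmetric
```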