# Information-Theoretic Machine Learning

## 2020-01-07

<details>
<summary>Lecture content</summary>

Perceptron convergence theorem.
Backpropagation: Prof. Amari published it in 1967, but Rumelhart (1986) seems to be the better-known reference.
SVM, Soft-SVM, ReLU, CNN.
Hierarchical Latent Variable Optimization.

1 ![](https://i.imgur.com/mO37pJd.jpg)
2 ![](https://i.imgur.com/LLkPpp6.jpg)
3 ![](https://i.imgur.com/EEYwbdb.jpg)
4 ![](https://i.imgur.com/oanlpZH.jpg)
5 ![](https://i.imgur.com/gb2CQVV.jpg)
6 ![](https://i.imgur.com/1260Kt5.jpg)
7 ![](https://i.imgur.com/nLNVSjT.jpg)
8 ![](https://i.imgur.com/bAdhkhJ.jpg)
9 ![](https://i.imgur.com/1FhS3UN.jpg)
10 ![](https://i.imgur.com/qxT5k5v.jpg)
11 ![](https://i.imgur.com/1HDCBGp.jpg)
12 ![](https://i.imgur.com/Xb47igJ.jpg)
13 ![](https://i.imgur.com/2wzYHhD.jpg)
14 ![](https://i.imgur.com/xgYzg8w.jpg)
15 ![](https://i.imgur.com/v8GOCZZ.jpg)
16 ![](https://i.imgur.com/5C6XtyM.jpg)
17 ![](https://i.imgur.com/WvlfFNK.jpg)
18 ![](https://i.imgur.com/McAzE1T.jpg)
19 ![](https://i.imgur.com/uyDJyzf.jpg)
20 ![](https://i.imgur.com/5x3WXDM.jpg)
21 ![](https://i.imgur.com/Or81CnY.jpg)

</details>

## 2019-12-17

<details>
<summary>Lecture content</summary>

1 ![](https://i.imgur.com/DJH24E5.jpg)
2 ![](https://i.imgur.com/3tmXQBg.jpg)
3 ![](https://i.imgur.com/kMdlxD6.jpg)
4 ![](https://i.imgur.com/joFuNsK.jpg)
5 ![](https://i.imgur.com/BQGJevm.jpg)
6 ![](https://i.imgur.com/isAIFEC.jpg)
7 ![](https://i.imgur.com/8vrWonz.jpg)
8 ![](https://i.imgur.com/qPeKxmB.jpg)
9 ![](https://i.imgur.com/3yOU858.jpg)
10 ![](https://i.imgur.com/V0knBHK.jpg)

Correction to an earlier slide: the $\Gamma^{\frac{n+1}{2}}$ in the numerator should be $\pi^{\frac{n+1}{2}}$.

11 ![](https://i.imgur.com/NWUrHTX.jpg)
12 ![](https://i.imgur.com/HHxRc0I.jpg)
13 ![](https://i.imgur.com/7EBZLdE.jpg)
14 ![](https://i.imgur.com/BE7lora.jpg)
15 ![](https://i.imgur.com/NQlfsuj.jpg)
16 ![](https://i.imgur.com/shFnIru.jpg)

</details>

## 2019-12-10

<details>
<summary>Lecture content</summary>

1 ![](https://i.imgur.com/N5JU1xP.jpg)
2 ![](https://i.imgur.com/20C9Fme.jpg)
3 ![](https://i.imgur.com/OWyhrqL.jpg)
4 ![](https://i.imgur.com/haVVkwF.jpg)
5 ![](https://i.imgur.com/okeZvzl.jpg)
6 ![](https://i.imgur.com/fNvCTk2.jpg)
7 ![](https://i.imgur.com/JgHnilM.jpg)
8 ![](https://i.imgur.com/SBXgEWn.jpg)

Anything that can be determined deterministically should be computed directly; sampling is used for quantities that are too expensive to compute.

9 ![](https://i.imgur.com/pycL4lU.jpg)
10 ![](https://i.imgur.com/aEZ0HGT.jpg)
11 ![](https://i.imgur.com/RLblz0U.jpg)
12 ![](https://i.imgur.com/gPHoH6a.jpg)
13 ![](https://i.imgur.com/w3B9A3R.jpg)
14 ![](https://i.imgur.com/xbTSmu8.jpg)
15 ![](https://i.imgur.com/W2J0LkP.jpg)
16 ![](https://i.imgur.com/5nidyr1.jpg)
17 ![](https://i.imgur.com/s3lTBFb.jpg)
18 ![](https://i.imgur.com/MqUS3n7.jpg)

</details>

## 2019-12-03

<details>
<summary>Lecture content</summary>

The material toward the end of the previous lecture was incorrect.

1 ![](https://i.imgur.com/xmrcYVp.jpg)
2 ![](https://i.imgur.com/LK0tBeE.jpg)
3 ![](https://i.imgur.com/3SrwJtj.jpg)
4 ![](https://i.imgur.com/woINDT0.jpg)
5 ![](https://i.imgur.com/dsRZYt8.jpg)
6 ![](https://i.imgur.com/EwW2p4U.jpg)
7 ![](https://i.imgur.com/ue3WLpd.jpg)

$z^n$ here is for the generating function and has nothing to do with the latent variables.

8 ![](https://i.imgur.com/WlPuJ0s.jpg)
9 ![](https://i.imgur.com/XIoCJdf.jpg)
10 ![](https://i.imgur.com/c00YpX4.jpg)
11 ![](https://i.imgur.com/vDpu5cy.jpg)
12 ![](https://i.imgur.com/qLWXRhc.jpg)
13 ![](https://i.imgur.com/lNjgLI6.jpg)
14 ![](https://i.imgur.com/6VuFXI3.jpg)
15 ![](https://i.imgur.com/nMMo7RU.jpg)
16 ![](https://i.imgur.com/SElnpbZ.jpg)
17 ![](https://i.imgur.com/IENTa0N.jpg)
18 ![](https://i.imgur.com/EZohzYo.jpg)
19 ![](https://i.imgur.com/VPcw81s.jpg)
20 ![](https://i.imgur.com/2KzVF9z.jpg)

</details>

## 2019-11-26
<details>
<summary>Lecture content</summary>

Finite mixture models.

1 ![](https://i.imgur.com/Whwn0iq.jpg)
2 ![](https://i.imgur.com/BkIY7AK.jpg)
3 ![](https://i.imgur.com/10ZOivT.jpg)
4 ![](https://i.imgur.com/WXtiWKv.jpg)
5 ![](https://i.imgur.com/uMcMX39.jpg)
6 ![](https://i.imgur.com/TWKALbp.jpg)
7 ![](https://i.imgur.com/wu3G9qr.jpg)
8 ![](https://i.imgur.com/GKvLrug.jpg)
9 ![](https://i.imgur.com/WjewQTT.jpg)
10 ![](https://i.imgur.com/vNiW3AS.jpg)
11 ![](https://i.imgur.com/yqKJMDt.jpg)
12 ![](https://i.imgur.com/OPWmNap.jpg)
13 ![](https://i.imgur.com/nUzyD7p.jpg)
14 ![](https://i.imgur.com/hLB8mS5.jpg)
15 ![](https://i.imgur.com/XRo3LZH.jpg)
16 ![](https://i.imgur.com/vhgrRhU.jpg)
17 ![](https://i.imgur.com/C4QFLcV.jpg)
18 ![](https://i.imgur.com/yekbgkc.jpg)
19 ![](https://i.imgur.com/EDHzOm6.jpg)
20 ![](https://i.imgur.com/nlhneWm.jpg)
21 ![](https://i.imgur.com/Z6OJV9N.jpg)
22 ![](https://i.imgur.com/9kaF6Zc.jpg)
23 ![](https://i.imgur.com/CuvxUt2.jpg)
24 ![](https://i.imgur.com/Y25UnV1.jpg)
25 ![](https://i.imgur.com/UlBEhJ5.jpg)

</details>

## 2019-11-19

<details>
<summary>Lecture content</summary>

Minimize the worst-case value over the data.

1 ![](https://i.imgur.com/XRT4zHn.jpg)
2 ![](https://i.imgur.com/Wq6DUKt.jpg)
3 ![](https://i.imgur.com/Vxf8TA8.jpg)
4 ![](https://i.imgur.com/7maIzHq.jpg)
5 ![](https://i.imgur.com/a8xZqEz.jpg)
6 ![](https://i.imgur.com/zzbbfuf.jpg)
7 ![](https://i.imgur.com/lBdJbAS.jpg)
8 ![](https://i.imgur.com/8KNokH3.jpg)
9 ![](https://i.imgur.com/BRdLEea.jpg)
10 ![](https://i.imgur.com/OTcfPZD.jpg)
11 ![](https://i.imgur.com/8oq2H1S.jpg)
12 ![](https://i.imgur.com/li59jLS.jpg)
13 ![](https://i.imgur.com/fXxhAf9.jpg)
14 ![](https://i.imgur.com/mlYEmzj.jpg)
15 ![](https://i.imgur.com/QkO6OPV.jpg)
16 ![](https://i.imgur.com/WmSfUw9.jpg)
17 ![](https://i.imgur.com/jcS6dGp.jpg)
18 ![](https://i.imgur.com/unOBfQp.jpg)
19 ![](https://i.imgur.com/TxYPUWn.jpg)
20 ![](https://i.imgur.com/ROEffzv.jpg)
21 ![](https://i.imgur.com/pLbfQZ3.jpg)
22 ![](https://i.imgur.com/sNLUd8T.jpg)
23 ![](https://i.imgur.com/zzhWEjW.jpg)
24 ![](https://i.imgur.com/dvWbCzs.jpg)
25 ![](https://i.imgur.com/Qlf1SLi.jpg)

</details>

## 2019-11-12

<details>
<summary>Lecture content</summary>

1 ![](https://i.imgur.com/kGKDSGj.jpg)
2 ![](https://i.imgur.com/kGSFDko.jpg)
3 ![](https://i.imgur.com/nobTm31.jpg)

Holds with probability 1 when $n$ is sufficiently large.

4 ![](https://i.imgur.com/pZmDmNP.jpg)
5 ![](https://i.imgur.com/xCIrJ7C.jpg)
6 ![](https://i.imgur.com/ipfwBP1.jpg)
7 ![](https://i.imgur.com/nvJi1sJ.jpg)
8 ![](https://i.imgur.com/jE7bHUJ.jpg)
9 ![](https://i.imgur.com/lisuY6L.jpg)
10 ![](https://i.imgur.com/o6dFxsy.jpg)
11 ![](https://i.imgur.com/x4Jm7zK.jpg)
12 ![](https://i.imgur.com/9GxlJEU.jpg)
13 ![](https://i.imgur.com/hX0aVYv.jpg)
14 ![](https://i.imgur.com/fPAR8O9.jpg)
15 ![](https://i.imgur.com/V5iBnlz.jpg)
16 ![](https://i.imgur.com/HdPaZAv.jpg)
17 ![](https://i.imgur.com/I5GrsRk.jpg)
18 ![](https://i.imgur.com/24YTmwU.jpg)
19 ![](https://i.imgur.com/sX7NOcQ.jpg)
20 ![](https://i.imgur.com/CRWAHo1.jpg)
21 ![](https://i.imgur.com/0RQaxtQ.jpg)
22 ![](https://i.imgur.com/x6wbHYA.jpg)
23 ![](https://i.imgur.com/ZikzJAo.jpg)
24 ![](https://i.imgur.com/Yhp7deu.jpg)
25 ![](https://i.imgur.com/rwaX4fS.jpg)
26 ![](https://i.imgur.com/1zEBFcx.jpg)
27 ![](https://i.imgur.com/WYTHYF0.jpg)

</details>

## 2019-11-05

<details>
<summary>Lecture content</summary>

1 ![](https://i.imgur.com/nfFkWi0.jpg)
2 ![](https://i.imgur.com/RCJtm8C.jpg)
3 ![](https://i.imgur.com/N2QmTnb.jpg)
4 ![](https://i.imgur.com/rIHzkcG.jpg)
5 ![](https://i.imgur.com/S8CHBmo.jpg)
6 ![](https://i.imgur.com/6Mlj7VZ.jpg)
7 ![](https://i.imgur.com/UsaBwvc.jpg)
8 ![](https://i.imgur.com/CJasNr1.jpg)
9 ![](https://i.imgur.com/QbONUiY.jpg)
10 ![](https://i.imgur.com/wfcQGTH.jpg)
11 ![](https://i.imgur.com/4cHlSIH.jpg)
12 ![](https://i.imgur.com/21epDy3.jpg)
13 ![](https://i.imgur.com/eCVMO6k.jpg)
14 ![](https://i.imgur.com/QMoUqTG.jpg)
15 ![](https://i.imgur.com/b8RduZG.jpg)
16 ![](https://i.imgur.com/Uudc5HJ.jpg)
17 ![](https://i.imgur.com/kNl9773.jpg)
18 ![](https://i.imgur.com/3CTvjmD.jpg)
19 ![](https://i.imgur.com/h0iq251.jpg)
20 ![](https://i.imgur.com/tkILfdk.jpg)
21 ![](https://i.imgur.com/5dzAMso.jpg)

Proof of the asymptotic approximation: see the text (情学), pp. 17-19; proved via the central limit theorem.

22 ![](https://i.imgur.com/nWC9ZnL.jpg)
23 ![](https://i.imgur.com/k0wVfnM.jpg)
24 ![](https://i.imgur.com/y3lpkkA.jpg)
25 ![](https://i.imgur.com/OWGwSn7.jpg)
26 ![](https://i.imgur.com/bUYXVxL.jpg)
27 ![](https://i.imgur.com/WJlFT8u.jpg)

</details>

## 2019-10-29

<details>
<summary>Lecture content</summary>

Model selection: use an information criterion to determine the optimal model $k$.
AIC (Akaike's Information Criterion) [Akaike 1973].

1 ![](https://i.imgur.com/S0kwyCn.jpg)
2 ![](https://i.imgur.com/XfNX4j2.jpg)
3 ![](https://i.imgur.com/TEb8Dia.jpg)
4 ![](https://i.imgur.com/U3EPjiq.jpg)
5 ![](https://i.imgur.com/JVkn5yO.jpg)

Proof, continued:

6 ![](https://i.imgur.com/0t1TJv6.jpg)
7 ![](https://i.imgur.com/xwGsAKv.jpg)
8 ![](https://i.imgur.com/7mknkDJ.jpg)
9 ![](https://i.imgur.com/hTXJkWn.jpg)
10 ![](https://i.imgur.com/4pm3fo9.jpg)
11 $P_{NML}$ ![](https://i.imgur.com/Kip5gTJ.jpg)
12 MDL ![](https://i.imgur.com/7AsCWBm.jpg)
13 (mixed in from another set) ![](https://i.imgur.com/yVwSWBg.jpg)
14 (mixed in from another set) ![](https://i.imgur.com/sKNUfL7.jpg)
15 ![](https://i.imgur.com/rVCHWgo.jpg)
16 ![](https://i.imgur.com/u4a9v77.jpg)
17 ![](https://i.imgur.com/FNwl4xl.jpg)
18 ![](https://i.imgur.com/07tTL2E.jpg)
19 ![](https://i.imgur.com/p055NK0.jpg)
20 ![](https://i.imgur.com/uf5DXX9.jpg)
21 ![](https://i.imgur.com/N6Etuxh.jpg)
22 ![](https://i.imgur.com/HIsly8J.jpg)
23 ![](https://i.imgur.com/qNAZssM.jpg)
24 ![](https://i.imgur.com/uAWJ00z.jpg)
25 ![](https://i.imgur.com/xX8FFCQ.jpg)

</details>

## 2019-10-15

<details>
<summary>Lecture content</summary>

LASSO: the word also means a lasso (throwing rope).

1 LASSO ![](https://i.imgur.com/hVvyXkT.jpg)
2 ![](https://i.imgur.com/Y6tZS9S.jpg)
3 ![](https://i.imgur.com/k7xTtGG.jpg)
4 ![](https://i.imgur.com/4uRduMM.jpg)
5 Graphical LASSO ![](https://i.imgur.com/DN41Bvd.jpg)
6 ![](https://i.imgur.com/JBzMm17.jpg)
7 ![](https://i.imgur.com/tQohlUT.jpg)
8 ![](https://i.imgur.com/tQJlKAI.jpg)
9 ![](https://i.imgur.com/sQTQIcD.jpg)
10 ![](https://i.imgur.com/89wHPRI.jpg)
11 ![](https://i.imgur.com/bRGRPzx.jpg)
12 ![](https://i.imgur.com/7Mhp9SY.jpg)
13 ![](https://i.imgur.com/VgbvYVJ.jpg)
14 ![](https://i.imgur.com/Z0fRavG.jpg)

Correction: the loss functions used so far should be scaled by $\frac{1}{n}$.

15 ![](https://i.imgur.com/p1ECGDh.jpg)
16 ![](https://i.imgur.com/2OVnwaL.jpg)
17 ![](https://i.imgur.com/TzBdzF8.jpg)
18 ![](https://i.imgur.com/gTxwy3R.jpg)
19 ![](https://i.imgur.com/T9DHmlO.jpg)
20 ![](https://i.imgur.com/mU2NfuV.jpg)
21 ![](https://i.imgur.com/kCnocoz.jpg)

</details>

## 2019-10-08

<details>
<summary>Lecture content</summary>

Chap. 1: What is information-theoretic learning theory?
Extract the intrinsic structure of data from the viewpoint of information content (= description length).

Chap. 2: Parameter estimation

2.1 Maximum likelihood estimation (M.L.E.)
Probabilistic models as knowledge representation. $H$: parameter space, $x^n = x_1 \cdots x_n$.

Unsupervised learning: $\mathcal{P} = \{p(x^n; \theta) \mid \theta \in H \subset \mathbb{R}^k\}$
- $x$ continuous: probability density function (p.d.f.)
- $x$ discrete: probability mass function (p.m.f.)

Supervised learning: $\mathcal{P} = \{p(y^n|x^n; \theta) \mid \theta \in H \subset \mathbb{R}^k\}$
- $y$ continuous: regression
- $y$ discrete: classification

$\underline{\mathrm{Problem}}$
$x^n = x_1 \cdots x_n$: observed sequence, $x_i \sim p(X; \theta)$, i.i.d. (independently, identically distributed), $\theta$ unknown.
Given $x^n$, estimate $\theta$.

Likelihood function: given $x^n$,
$\mathcal{L}(\theta) = p(x^n; \theta) = \prod_{i=1}^n p(x_i; \theta) \rightarrow \max$
$\hat{\theta} = \underset{\theta}{\mathrm{argmax}}\ p(x^n; \theta)$: maximum likelihood estimator (m.l.e.)
Equivalently, $\hat{\theta} = \underset{\theta}{\mathrm{argmin}}\{-\log p(x^n; \theta)\}$

Ex) m.l.e. of the multivariate normal distribution
$\mu$: mean vector, $\Sigma$: variance-covariance matrix
$x \in \mathbb{R}^d$, $\mu \in \mathbb{R}^d$, $\Sigma \in \mathbb{R}^{d\times d}$, $\theta = (\mu, \Sigma)$

p.d.f. $p(x; \theta) = \frac{1}{(2\pi)^{\frac{d}{2}} |\Sigma|^{\frac12}} \exp\left(-\frac{(x - \mu)^T \Sigma^{-1} (x - \mu)}{2}\right)$

Given $x^n = x_1 \cdots x_n$:
\begin{align}
\mathcal{L}(\mu, \Sigma) &= -\log p(x^n; \theta) = -\log \prod_{i=1}^n p(x_i; \theta) \\
&= -\frac{n}{2} \log{|\Sigma|}^{-1} + \frac12 \sum_{i=1}^n (x_i - \bar{x})^T \Sigma^{-1} (x_i - \bar{x}) \\
&\quad + \frac{n}{2} (\bar{x} - \mu)^T \Sigma^{-1} (\bar{x} - \mu) + \frac{nd}{2}\log(2\pi)
\end{align}
Setting $\bar{x} := \frac{1}{n}\sum_{i=1}^n x_i$ and $S := \sum_{i=1}^n (x_i - \bar{x})(x_i - \bar{x})^T$, and dropping the constant term,
$$L(\mu, \Sigma) = -\frac{n}{2}\log{|\Sigma|}^{-1} + \frac12 \mathrm{tr}(\Sigma^{-1}S) + \frac{n}{2}(\bar{x} - \mu)^T \Sigma^{-1} (\bar{x} - \mu)$$
where the general identity $x^T A x = \mathrm{tr}(Axx^T)$ was used.
The last term is nonnegative and vanishes at $\mu = \bar{x}$, so $\hat{\mu} = \bar{x}$.
Writing $\Lambda = \Sigma^{-1}$, solve $\frac{\partial L(\hat{\mu}, \Lambda)}{\partial \Lambda} = 0$ for $\Lambda$:
$L = -\frac{n}{2}\log|\Lambda| + \frac12 \mathrm{tr}(\Lambda S)$
$\frac{\partial L(\hat{\mu}, \Lambda)}{\partial \Lambda} = -\frac{n}{2}(\Lambda^{-1})^T + \frac12 S = 0$
$\hat{\Sigma} = \hat{\Lambda}^{-1} = \frac{S}{n}$
$\hat{\theta} = (\hat{\mu}, \hat{\Sigma}) = (\bar{x}, \frac{S}{n})$
In general, $\frac{\partial}{\partial A}\mathrm{tr}(A^T B) = B$ and $\frac{\partial}{\partial A}\log|A| = (A^{-1})^T$.

$\underline{\mathrm{Note}}$ Outlier detection.
The degree of outlier of a new data point $x$ relative to $x^n = x_1 \cdots x_n$ ($\hat{\mu}, \hat{\Sigma}$: m.l.e. from $x^n$):
\begin{align}
-\log p(x; \hat{\mu}, \hat{\Sigma}) = \frac12 (x - \hat{\mu})^T \hat{\Sigma}^{-1} (x - \hat{\mu}) + \log\left((2\pi)^{\frac{d}{2}} {|\hat{\Sigma}|}^{\frac12}\right)
\end{align}
The first term is half the squared Mahalanobis distance from $x$ to $\hat{\mu}$ with metric $\hat{\Sigma}^{-1}$.

Ex) Regression analysis
$x = \begin{bmatrix} 1 \\ x^1 \\ \vdots \\ x^{d - 1}\end{bmatrix} \in \mathbb{R}^d$, $\theta = \begin{bmatrix} \theta^0 \\ \theta^1 \\ \vdots \\ \theta^{d - 1}\end{bmatrix} \in \mathbb{R}^d$, $y \in \mathbb{R}$, $\sigma$: known.

p.d.f.
\begin{align}
p(y|x; \theta) &= \frac{1}{\sqrt{2\pi} \sigma}\exp\left(- \frac{(y - \theta^T x)^2}{2\sigma^2}\right) \\
\Leftrightarrow y &= \theta^T x + \epsilon, \quad \epsilon \sim \mathcal{N}(0, \sigma^2) \\
&= \theta^0 + \theta^1 x^1 + \cdots + \theta^{d-1} x^{d-1} + \epsilon
\end{align}

Given $(x_1, y_1), \cdots, (x_n, y_n)$:
\begin{align}
\mathcal{L}(\theta) = \prod_{i=1}^n p(y_i|x_i; \theta) = \prod_{i=1}^n \frac{1}{\sqrt{2\pi}\sigma}\exp\left(-\frac{(y_i - \theta^T x_i)^2}{2\sigma^2}\right)
\end{align}
\begin{align}
l(\theta) = -\log \mathcal{L}(\theta) = n \log(\sqrt{2\pi}\sigma) + \sum_{i=1}^n \frac{(y_i - \theta^T x_i)^2}{2\sigma^2} \rightarrow \min \text{ w.r.t. } \theta
\end{align}
With $X = (x_1, \cdots, x_n)^T$ and $Y = (y_1, \cdots, y_n)^T$,
\begin{align}
\underset{\theta}{\min} \left\{\sum_{i=1}^n (y_i - \theta^T x_i)^2\right\} \Leftrightarrow \underset{\theta}{\min}\left\{(Y - X\theta)^T (Y - X\theta)\right\}
\end{align}
Writing $l(\theta)$ for the quantity inside the $\min$,
\begin{align}
\frac{\partial l}{\partial \theta} = -2 X^T Y + 2(X^T X)\theta
\end{align}
If $X^T X$ is nonsingular, the m.l.e. of $\theta$ is $\hat{\theta} = (X^T X)^{-1} X^T Y$: the normal equation.

Ex) Discrete distribution (multinomial distribution)
$x \in \{0, 1, \cdots, m\}$, $p(X = i) = \theta_i$ $(i = 0, 1, \cdots, m)$
Parameter space $H = \{\theta \in \mathbb{R}^{m+1} \mid \sum_{i=0}^m \theta_i = 1,\ \theta_i \geq 0\}$
Given $x^n = x_1 \cdots x_n$, let $n_i$ be the number of occurrences of $X = i$.
Likelihood function $L(\theta) = \prod_{i=0}^m {\theta_i}^{n_i}$
\begin{align}
l(\theta) &= -\log \prod_{i=0}^m {\theta_i}^{n_i} \\
&= n\left\{H\left(\frac{n_0}{n}, \cdots, \frac{n_m}{n}\right) + D\left(\left\{\frac{n_i}{n}\right\} \| \{\theta_i\}\right)\right\}
\end{align}
where $H(z_0, \cdots, z_m) = -\sum_{i=0}^m z_i \log z_i$ (for $z_i \geq 0$ with $\sum_{i=0}^m z_i = 1$) and $D(\{z_i\} \| \{w_i\}) = \sum_{i=0}^m z_i \log{\frac{z_i}{w_i}}$ is the Kullback-Leibler divergence.

$\underline{\mathrm{Note}}$ $D \geq 0$, with equality iff $z_i = w_i$ for all $i$. By $\log t \geq 1 - \frac{1}{t}$,
\begin{align}
D(\{z_i\} \| \{w_i\}) = \sum_{i=0}^m z_i \log{\frac{z_i}{w_i}} \geq \sum_{i=0}^m z_i\left(1 - \frac{w_i}{z_i}\right) = 0
\end{align}
Hence the m.l.e. of $\theta$ is $\hat{\theta}_i = \frac{n_i}{n}$ $(i = 0, \cdots, m)$.

$\underline{\mathrm{Thm}}$ (consistency of the m.l.e.)
$x_i \sim p(\cdot; \theta)$ i.i.d., $\hat{\theta} = \hat{\theta}(x^n)$ the m.l.e. Under certain regularity conditions on $\mathcal{P} = \{p(x; \theta)\}$,
\begin{align}
\forall \epsilon > 0 \quad \underset{n \rightarrow \infty}{\lim} P\left[\|\hat{\theta}(x^n) - \theta\| > \epsilon\right] = 0 \quad (\hat{\theta} \underset{p}{\rightarrow} \theta)
\end{align}
where $\|\theta\| = \sqrt{\theta^T\theta}$.

$\underline{\mathrm{Thm}}$ (asymptotic normality and efficiency of the m.l.e.)
Under certain regularity conditions on $\mathcal{P}$,
\begin{align}
\sqrt{n}(\hat{\theta} - \theta) \rightarrow \mathcal{N}(0, I(\theta)^{-1}) \quad \text{in distribution}
\end{align}
where $I(\theta)$ is the Fisher information matrix.
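As a numerical companion to the multivariate-normal m.l.e. and the outlier score above, here is a minimal numpy sketch. The helper name `outlier_score` and all parameter values are illustrative choices of mine, not part of the lecture; the formulas are exactly the ones derived above.

```python
import numpy as np

rng = np.random.default_rng(0)

# Draw x^n i.i.d. from N(mu, Sigma), d = 2.
mu_true = np.array([1.0, -2.0])
Sigma_true = np.array([[2.0, 0.6],
                       [0.6, 1.0]])
X = rng.multivariate_normal(mu_true, Sigma_true, size=1000)
n, d = X.shape

# m.l.e.: mu_hat = sample mean, Sigma_hat = S / n (n in the denominator, not n - 1).
mu_hat = X.mean(axis=0)
S = (X - mu_hat).T @ (X - mu_hat)
Sigma_hat = S / n

def outlier_score(x):
    """Degree of outlier: -log p(x; mu_hat, Sigma_hat)."""
    diff = x - mu_hat
    maha2 = diff @ np.linalg.solve(Sigma_hat, diff)   # squared Mahalanobis distance
    _, logdet = np.linalg.slogdet(Sigma_hat)
    return 0.5 * maha2 + 0.5 * (d * np.log(2.0 * np.pi) + logdet)

print(mu_hat, Sigma_hat)            # close to mu_true, Sigma_true
print(outlier_score(mu_hat))        # small: x at the center of the data
print(outlier_score(mu_hat + 5.0))  # large: x far from the data
```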
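The normal equation from the regression example can be checked the same way. `np.linalg.solve` is used in place of the explicit inverse $(X^TX)^{-1}$, which is numerically preferable but computes the same estimator; all names and values are again illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)

# Design matrix X = (x_1, ..., x_n)^T with a leading column of ones (intercept).
n, d = 200, 3
theta_true = np.array([0.5, 2.0, -1.0])
X = np.hstack([np.ones((n, 1)), rng.normal(size=(n, d - 1))])
y = X @ theta_true + rng.normal(scale=0.1, size=n)   # y = theta^T x + eps

# Normal equation: theta_hat = (X^T X)^{-1} X^T y, assuming X^T X is nonsingular.
theta_hat = np.linalg.solve(X.T @ X, X.T @ y)
print(theta_hat)   # close to theta_true
```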
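For the discrete example, the m.l.e. is the vector of empirical frequencies $n_i/n$, and the decomposition $l(\theta) = n\{H + D\}$ shows that maximizing the likelihood means driving the KL term to zero. A small sketch under arbitrary parameter values:

```python
import numpy as np

rng = np.random.default_rng(2)

theta_true = np.array([0.5, 0.3, 0.2])   # P(X = i) = theta_i, m + 1 = 3 outcomes
x = rng.choice(len(theta_true), size=5000, p=theta_true)
counts = np.bincount(x, minlength=len(theta_true))
theta_hat = counts / counts.sum()        # m.l.e.: empirical frequencies n_i / n

def kl(z, w):
    """D({z_i} || {w_i}) = sum_i z_i log(z_i / w_i); terms with z_i = 0 contribute 0."""
    mask = z > 0
    return float(np.sum(z[mask] * np.log(z[mask] / w[mask])))

print(theta_hat)                  # close to theta_true
print(kl(theta_hat, theta_true))  # small but nonnegative
print(kl(theta_hat, theta_hat))   # exactly 0 (the equality case)
```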
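Finally, the asymptotic normality theorem can be illustrated with a Bernoulli model, where the Fisher information is $I(\theta) = \frac{1}{\theta(1-\theta)}$, so the variance of $\sqrt{n}(\hat{\theta} - \theta)$ should approach $\theta(1-\theta)$. A simulation sketch with arbitrary parameter values:

```python
import numpy as np

rng = np.random.default_rng(3)

theta, n, trials = 0.3, 2000, 5000
# The Bernoulli m.l.e. is the sample mean; simulate it `trials` times.
theta_hat = rng.binomial(n, theta, size=trials) / n
z = np.sqrt(n) * (theta_hat - theta)

print(z.mean())  # ~ 0
print(z.var())   # ~ theta * (1 - theta) = 0.21 = I(theta)^{-1}
```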
![](https://i.imgur.com/TybirRi.jpg)
![](https://i.imgur.com/Sed4mJz.jpg)
![](https://i.imgur.com/K8nKtWa.jpg)
![](https://i.imgur.com/GMFzfHH.jpg)
![](https://i.imgur.com/0KBOvmW.jpg)

</details>