###### tags: `Paper Summary` `Machine Learning`
# Hyperparameter Optimization for Machine Learning Models Based on Bayesian Optimization
* Year: 2019
* Journal: Journal of Electronic Science and Technology
* Link: https://www.sciencedirect.com/science/article/pii/S1674862X19300047
* MLA: Wu, Jia, et al. "Hyperparameter optimization for machine learning models based on Bayesian optimization." Journal of Electronic Science and Technology 17.1 (2019): 26-40.
## Overview
Tunes a machine-learning model's hyperparameters with Bayesian optimization, using a Gaussian process to approximate the objective.
## Method
Let $x$ denote the hyperparameters to be tuned. We first assume the objective $f(x)$ follows a Gaussian distribution, so the model can be approximated by a Gaussian process:
$$f(x)\sim \text{GP}(m(x), k(x, x'))$$
where $k(x, x')$ is the covariance function, here a squared-exponential kernel:
$$k(x_i, x_j)=\exp\left(-\frac{1}{2}\left\|x_i-x_j\right\|^2\right)$$
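As a concrete illustration, here is a minimal NumPy sketch of this squared-exponential kernel and the resulting Gram matrix (the function names are my own, not from the paper):

```python
import numpy as np

def sq_exp_kernel(xi, xj):
    """Squared-exponential kernel: k(x_i, x_j) = exp(-||x_i - x_j||^2 / 2)."""
    diff = np.asarray(xi, dtype=float) - np.asarray(xj, dtype=float)
    return np.exp(-0.5 * np.dot(diff, diff))

def kernel_matrix(X):
    """Gram matrix K with K[i, j] = k(x_i, x_j), for X of shape (t, d)."""
    X = np.atleast_2d(X)
    sq_dists = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    return np.exp(-0.5 * sq_dists)
```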
Next, from $x$ and $f(x)$ we can define the dataset $D$ (assuming $t$ hyperparameter evaluations):
$$D_{1:t}=\{(x_n, f_n)\}^{t}_{n=1},\qquad f_n=f(x_n)$$
It then follows that $f_{t+1}$ and the observed values are jointly Gaussian:
$$
\begin{bmatrix}
\mathbf{f}_{1:t} \\
f_{t+1}
\end{bmatrix}
\sim N\left(\mathbf{0},\,
\begin{bmatrix}
\mathbf{K} & \mathbf{k} \\
\mathbf{k}^T & k(x_{t+1}, x_{t+1})
\end{bmatrix}
\right)
$$
where $\mathbf{f}_{1:t}=[f_1, f_2, \dots, f_t]^T$, $\mathbf{K}$ is the $t\times t$ kernel matrix with entries $[\mathbf{K}]_{ij}=k(x_i, x_j)$, and
$$\mathbf{k}=[k(x_{t+1}, x_1),\, k(x_{t+1}, x_2),\, \dots,\, k(x_{t+1}, x_t)]$$
The posterior mean and variance at $x_{t+1}$ are then, respectively:
$$\mu_{t+1}(x_{t+1})=\mathbf{k}^T\mathbf{K}^{-1}\mathbf{f}_{1:t}$$
$$\sigma^2_{t+1}(x_{t+1})=k(x_{t+1}, x_{t+1})-\mathbf{k}^T\mathbf{K}^{-1}\mathbf{k}$$
This completes the definition of the Gaussian process surrogate model.
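A minimal sketch of this posterior update, reusing the kernel helpers above (the jitter added to $\mathbf{K}$ is a standard numerical-stability trick, not something the paper specifies):

```python
import numpy as np

def gp_posterior(X, f, x_new, jitter=1e-8):
    """Posterior mean/variance at x_new, given observations (X, f).

    Implements mu = k^T K^{-1} f and sigma^2 = k(x_new, x_new) - k^T K^{-1} k.
    """
    K = kernel_matrix(X) + jitter * np.eye(len(X))
    k = np.array([sq_exp_kernel(x_new, xi) for xi in np.atleast_2d(X)])
    mu = k @ np.linalg.solve(K, f)
    var = sq_exp_kernel(x_new, x_new) - k @ np.linalg.solve(K, k)
    return mu, max(var, 0.0)  # clamp tiny negative variances from round-off
```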
The next step is to choose an acquisition function $u$ and optimize it over the surrogate to select the next hyperparameters to evaluate:
$$x_{t+1}=\mathop{\arg\max}\limits_{x\in A}u(x\mid D)$$
where $u$ is defined relative to the incumbent $x^+$, i.e. the best hyperparameters observed so far:
$$x^+=\mathop{\arg\min}\limits_{x_i\in x_{1:t}}f(x_i)$$
Two commonly used acquisition functions are the following (their standard forms and a short implementation sketch are given after the list):
* Probability of improvement (PI)
* Expected improvement (EI)
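For reference, the standard forms of these two criteria for a minimization problem (the paper uses its own notation; these are the textbook versions, with $\Phi$ and $\phi$ the standard normal CDF and PDF):
$$\text{PI}(x)=P\big(f(x)\le f(x^+)\big)=\Phi\left(\frac{f(x^+)-\mu(x)}{\sigma(x)}\right)$$
$$\text{EI}(x)=\big(f(x^+)-\mu(x)\big)\,\Phi(Z)+\sigma(x)\,\phi(Z),\qquad Z=\frac{f(x^+)-\mu(x)}{\sigma(x)}$$
And a hedged sketch of one EI-driven proposal step, reusing `gp_posterior` from above (optimizing the acquisition over a fixed candidate grid is a simplification of mine; in practice a continuous optimizer is typically used):

```python
import numpy as np
from scipy.stats import norm

def expected_improvement(mu, var, f_best):
    """EI for minimization, given posterior mean/variance and incumbent value."""
    sigma = np.sqrt(var)
    if sigma == 0.0:
        return 0.0
    z = (f_best - mu) / sigma
    return (f_best - mu) * norm.cdf(z) + sigma * norm.pdf(z)

def propose_next(X, f, candidates):
    """Pick the candidate hyperparameters that maximize EI under the GP posterior."""
    f_best = np.min(f)  # f(x^+): best (lowest) objective value observed so far
    scores = [expected_improvement(*gp_posterior(X, f, x), f_best) for x in candidates]
    return candidates[int(np.argmax(scores))]

# Hypothetical usage: one proposal step for a single hyperparameter on a 1-D grid.
candidates = np.linspace(0.0, 1.0, 100).reshape(-1, 1)
X = np.array([[0.2], [0.8]])   # hyperparameter values tried so far
f = np.array([0.35, 0.60])     # e.g. validation losses at those values
x_next = propose_next(X, f, candidates)
```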