# Optimization
## 0606:
- [Theoretical guarantees for approximate sampling from smooth and log-concave densities](https://arxiv.org/pdf/1412.7392.pdf)
- the Markov chain is the Euler discretisation of a continuous-time diffusion process with $\pi$ as invariant density
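Written out (a standard sketch of this setup, not copied from the paper): the Langevin diffusion with invariant density $\pi \propto e^{-U}$ and its Euler discretisation with step size $h$:

```latex
% Langevin diffusion with invariant density \pi \propto e^{-U}
dX_t = -\nabla U(X_t)\,dt + \sqrt{2}\,dB_t
% Euler discretisation with step size h > 0: this is the Markov chain
x_{k+1} = x_k - h\,\nabla U(x_k) + \sqrt{2h}\,\xi_k,
\qquad \xi_k \sim \mathcal{N}(0, I)
```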
## 0530:
I will first explain why Bayesian inference requires sampling (still deciding whether and where to put the other Bayesian-inference material),
then cover where MCMC's properties beat other methods, ~~and finally compare Langevin with other MCMC methods.~~
### Why are the optimization problems interesting?
In most settings of Bayesian inference, we need to generate large samples from the posterior to update our prior.
(formulate the statement about Bayesian inference more precisely)
[Taboga, Marco (2021). "Bayesian inference", Lectures on probability theory and mathematical statistics. Kindle Direct Publishing. Online appendix.](https://www.statlect.com/fundamentals-of-statistics/Bayesian-inference.)
[Deep Bayes](https://github.com/bayesgroup/deepbayes-2019/tree/master/lectures)
(compare MCMC with other methods such as importance sampling)
(argue that MCMC is more effective because it focuses on regions of higher probability)
One well-known category of sampling methods is Markov chain Monte Carlo (MCMC). An MCMC method proceeds as follows:
1. Construct a Markov transition function.
2. Start the sampling at some point $x_0$.
3. Generate sample points $x_1, x_2, \dots, x_t, \dots$ by following the Markov transition function.
Eventually, the draws we generate appear as if they come from our target distribution (the posterior) **(need source)**.
There are many ways to construct a Markov transition function; examples include Gibbs sampling, Hamiltonian Monte Carlo, and Langevin Monte Carlo.
(compare these methods)
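The three steps above can be sketched with the simplest instance, random-walk Metropolis–Hastings; the target (a 1-D standard normal) and all parameter values below are illustrative assumptions, not from any of the papers:

```python
import math
import random

def metropolis_hastings(log_target, x0, n_steps, step=0.5, seed=0):
    """Generate a Markov chain x_0, x_1, ... whose draws eventually
    look as if they come from the density proportional to exp(log_target)."""
    rng = random.Random(seed)
    x, samples = x0, []
    for _ in range(n_steps):
        # step 1's transition function: a Gaussian random-walk proposal ...
        prop = x + rng.gauss(0.0, step)
        # ... accepted with probability min(1, target(prop) / target(x))
        if rng.random() < math.exp(min(0.0, log_target(prop) - log_target(x))):
            x = prop
        samples.append(x)
    return samples

# Illustrative target: standard normal, log-density up to an additive constant.
draws = metropolis_hastings(lambda x: -0.5 * x * x, x0=3.0, n_steps=20000)
mean = sum(draws[5000:]) / len(draws[5000:])  # discard burn-in; mean ≈ 0
```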
## 0527:
## Sampling from a log-concave distribution with Projected Langevin Monte Carlo
https://arxiv.org/pdf/1507.02564.pdf
### Markov chain Monte Carlo
When we use sampling as a way to compute integrals (e.g. expectations), MCMC is a class of methods whose sequence of samples approaches the target distribution.
- random walk MC: the sequence of points is generated by a random walk
- Langevin MC: the sequence of points is generated by Langevin dynamics
1. the dynamics follow the Langevin SDE $dX_t = -\nabla U(X_t)\,dt + \sqrt{2}\,dB_t$
2. each move of a point is a negative-gradient step plus Brownian motion
3. the next point is a gradient-descent step plus noise, where $U$ is the potential
4. integrating (the Fokker–Planck equation) gives the evolution of the density
5. the stationary solution (set the time derivative to $0$) is $\pi \propto e^{-U}$
- https://blog.csdn.net/qq_37473939/article/details/108331856
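The steps in the list above (negative-gradient drift plus Brownian noise) can be sketched as the unadjusted Langevin algorithm; the quadratic potential $U(x) = x^2/2$ and the step size below are illustrative assumptions:

```python
import math
import random

def ula(grad_U, x0, n_steps, gamma=0.05, seed=0):
    """Unadjusted Langevin algorithm: Euler discretisation of
    dX_t = -grad U(X_t) dt + sqrt(2) dB_t."""
    rng = random.Random(seed)
    x, samples = x0, []
    for _ in range(n_steps):
        # gradient-descent step plus Gaussian noise
        x = x - gamma * grad_U(x) + math.sqrt(2 * gamma) * rng.gauss(0.0, 1.0)
        samples.append(x)
    return samples

# Potential U(x) = x^2 / 2, so the stationary density is the standard normal.
draws = ula(grad_U=lambda x: x, x0=5.0, n_steps=20000)
var = sum(d * d for d in draws[2000:]) / len(draws[2000:])  # ≈ Var = 1
```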
### In paper
- f: potential (in this paper: convex, Lipschitz, smooth)
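For a constrained convex body $K$, the paper's projected chain adds a Euclidean projection after each Langevin step. A minimal sketch, assuming the interval $K = [-1, 1]$ and $f(x) = x^2/2$ purely for illustration:

```python
import math
import random

def projected_lmc(grad_f, project, x0, n_steps, eta=0.01, seed=0):
    """Projected Langevin Monte Carlo: take a Langevin step, then
    project back onto the convex constraint set K."""
    rng = random.Random(seed)
    x, samples = x0, []
    for _ in range(n_steps):
        y = x - eta * grad_f(x) + math.sqrt(2 * eta) * rng.gauss(0.0, 1.0)
        x = project(y)  # Euclidean projection onto K
        samples.append(x)
    return samples

# K = [-1, 1]: the projection is clipping.  f(x) = x^2 / 2.
draws = projected_lmc(lambda x: x, lambda y: max(-1.0, min(1.0, y)),
                      x0=0.0, n_steps=10000)
```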
---
### Stochastic Gradient Langevin Algorithms
http://proceedings.mlr.press/v134/lamperski21a/lamperski21a.pdf
- other work
1. f non-convex in x and z iid, no constraint
2. f convex, no z, convex constraint
3. f nonconvex, no z, KL-divergence
- this work
- f non-convex, with iid z, compact convex constraint
- What are the optimization problems of interest?
- Markov Chain Monte Carlo sampling?
- Monte Carlo sampling is a way to integrate a pdf with many variables; since the integral is hard to compute directly, we use sampling instead
- a Markov chain is a systematic way of generating the variates (with the Markov property)
- Why are the optimization problems interesting?
- What is the algorithm you want to introduce?
- Why do you not consider any other existing algorithms?
- a more general scenario
- Why can we expect the algorithm to work?
- How do we analyze the algorithm?
- uses many internal variables to bound the $W_1$ (Wasserstein-1) distance
- What are the key ideas in the analysis?
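The Monte Carlo idea noted above, integrating a many-variable pdf by averaging over samples rather than computing the integral directly, can be sketched in one dimension; the uniform target and $f(x) = x^2$ are illustrative assumptions:

```python
import random

def mc_expectation(f, sampler, n, seed=0):
    """Estimate E[f(X)] = ∫ f(x) p(x) dx by averaging f over draws of X."""
    rng = random.Random(seed)
    return sum(f(sampler(rng)) for _ in range(n)) / n

# E[X^2] for X ~ Uniform(0, 1) is exactly 1/3; the estimate converges
# at rate O(1/sqrt(n)) regardless of dimension.
est = mc_expectation(lambda x: x * x, lambda rng: rng.random(), n=100000)
```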
<!-- ###
forward-backward splitting
- [An inertial forward-backward algorithm for monotone inclusions](https://arxiv.org/pdf/1403.3522.pdf)
- split T to A+B such that the resolvent is easy to compute
- [Efficient Online and Batch Learning Using Forward Backward Splitting](https://www.jmlr.org/papers/volume10/duchi09a/duchi09a.pdf)
- FOBOS
- [Efficient Learning using Forward-Backward Splitting](https://web.stanford.edu/~jduchi/projects/DuchiSi09b.pdf)
- empirical loss + regularization term
- [A FIELD GUIDE TO FORWARD-BACKWARD SPLITTING WITH A FASTA IMPLEMENTATION](https://arxiv.org/pdf/1411.3406.pdf)
- covers several special cases
- split into diff. and non-diff.
- [A Generalized Forward-Backward Splitting](https://hal.science/hal-00613637/document)
- mentions some application scenarios
- [Relaxed Forward–Backward Splitting Methods for Solving Variational Inclusions and Applications](https://link.springer.com/article/10.1007/s10915-021-01608-7) -->