# Optimization
## 0606:
- [Theoretical guarantees for approximate sampling from smooth and log-concave densities](https://arxiv.org/pdf/1412.7392.pdf)
- the Markov chain is the Euler discretisation of a continuous-time diffusion process with $\pi$ as invariant density
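Written out (a standard sketch of this setup, not copied from the paper): the Langevin diffusion with invariant density $\pi \propto e^{-U}$ and its Euler discretisation with step size $h$:

```latex
% Langevin diffusion with invariant density \pi \propto e^{-U}
dX_t = -\nabla U(X_t)\,dt + \sqrt{2}\,dB_t
% Euler discretisation with step size h > 0: this is the Markov chain
x_{k+1} = x_k - h\,\nabla U(x_k) + \sqrt{2h}\,\xi_k,
\qquad \xi_k \sim \mathcal{N}(0, I)
```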
## 0530:
I will first explain why Bayesian inference requires sampling (still deciding whether and where to put the other Bayesian-inference material),
then cover where MCMC's properties beat other methods, ~~and finally compare Langevin with other MCMC methods.~~
### Why are the optimization problems interesting?
In most settings of Bayesian inference, we need to generate large samples from the posterior to update our prior.
(formulate the statement about Bayesian inference more precisely)
[Taboga, Marco (2021). "Bayesian inference", Lectures on probability theory and mathematical statistics. Kindle Direct Publishing. Online appendix.](https://www.statlect.com/fundamentals-of-statistics/Bayesian-inference.)
[Deep Bayes](https://github.com/bayesgroup/deepbayes-2019/tree/master/lectures)
(compare MCMC with other methods such as importance sampling)
(argue that MCMC is more effective because it focuses on regions of higher probability)
One well-known category of sampling methods is Markov chain Monte Carlo (MCMC). An MCMC method proceeds as follows:
1. Construct a Markov transition function.
2. Start the sampling at some point $x_0$.
3. Generate sample points $x_1, x_2, \dots, x_t, \dots$ by following the Markov transition function.
Eventually, the draws we generate appear as if they come from our target distribution (the posterior) **(need source)**.
There are many ways to construct a Markov transition function; examples include Gibbs sampling, Hamiltonian Monte Carlo, and Langevin Monte Carlo.
(compare these methods)
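The three steps above can be sketched with the simplest instance, random-walk Metropolis–Hastings; the target (a 1-D standard normal) and all parameter values below are illustrative assumptions, not from any of the papers:

```python
import math
import random

def metropolis_hastings(log_target, x0, n_steps, step=0.5, seed=0):
    """Generate a Markov chain x_0, x_1, ... whose draws eventually
    look as if they come from the density proportional to exp(log_target)."""
    rng = random.Random(seed)
    x, samples = x0, []
    for _ in range(n_steps):
        # step 1's transition function: a Gaussian random-walk proposal ...
        prop = x + rng.gauss(0.0, step)
        # ... accepted with probability min(1, target(prop) / target(x))
        if rng.random() < math.exp(min(0.0, log_target(prop) - log_target(x))):
            x = prop
        samples.append(x)
    return samples

# Illustrative target: standard normal, log-density up to an additive constant.
draws = metropolis_hastings(lambda x: -0.5 * x * x, x0=3.0, n_steps=20000)
mean = sum(draws[5000:]) / len(draws[5000:])  # discard burn-in; mean ≈ 0
```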
## 0527:
## Sampling from a log-concave distribution with Projected Langevin Monte Carlo
https://arxiv.org/pdf/1507.02564.pdf
### Markov chain Monte Carlo
When we use sampling as a way to compute integrals (e.g. expectations), MCMC is a class of methods whose sequence of samples approaches the target distribution.
- random walk MC: the sequence of points is generated by a random walk
- Langevin MC: the sequence of points is generated by Langevin dynamics
1. the dynamics follow the Langevin SDE $dX_t = -\nabla U(X_t)\,dt + \sqrt{2}\,dB_t$
2. each move of a point is a negative-gradient step plus Brownian motion
3. the next point is a gradient-descent step plus noise, where $U$ is the potential
4. integrating (the Fokker–Planck equation) gives the evolution of the density
5. the stationary solution (set the time derivative to $0$) is $\pi \propto e^{-U}$
- https://blog.csdn.net/qq_37473939/article/details/108331856
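The steps in the list above (negative-gradient drift plus Brownian noise) can be sketched as the unadjusted Langevin algorithm; the quadratic potential $U(x) = x^2/2$ and the step size below are illustrative assumptions:

```python
import math
import random

def ula(grad_U, x0, n_steps, gamma=0.05, seed=0):
    """Unadjusted Langevin algorithm: Euler discretisation of
    dX_t = -grad U(X_t) dt + sqrt(2) dB_t."""
    rng = random.Random(seed)
    x, samples = x0, []
    for _ in range(n_steps):
        # gradient-descent step plus Gaussian noise
        x = x - gamma * grad_U(x) + math.sqrt(2 * gamma) * rng.gauss(0.0, 1.0)
        samples.append(x)
    return samples

# Potential U(x) = x^2 / 2, so the stationary density is the standard normal.
draws = ula(grad_U=lambda x: x, x0=5.0, n_steps=20000)
var = sum(d * d for d in draws[2000:]) / len(draws[2000:])  # ≈ Var = 1
```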
### In paper
- f: potential (in this paper: convex, Lipschitz, smooth)
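For a constrained convex body $K$, the paper's projected chain adds a Euclidean projection after each Langevin step. A minimal sketch, assuming the interval $K = [-1, 1]$ and $f(x) = x^2/2$ purely for illustration:

```python
import math
import random

def projected_lmc(grad_f, project, x0, n_steps, eta=0.01, seed=0):
    """Projected Langevin Monte Carlo: take a Langevin step, then
    project back onto the convex constraint set K."""
    rng = random.Random(seed)
    x, samples = x0, []
    for _ in range(n_steps):
        y = x - eta * grad_f(x) + math.sqrt(2 * eta) * rng.gauss(0.0, 1.0)
        x = project(y)  # Euclidean projection onto K
        samples.append(x)
    return samples

# K = [-1, 1]: the projection is clipping.  f(x) = x^2 / 2.
draws = projected_lmc(lambda x: x, lambda y: max(-1.0, min(1.0, y)),
                      x0=0.0, n_steps=10000)
```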
---
### Stochastic Gradient Langevin Algorithms
http://proceedings.mlr.press/v134/lamperski21a/lamperski21a.pdf
- other work
1. f non-convex in x and z iid, no constraint
2. f convex, no z, convex constraint
3. f nonconvex, no z, KL-divergence
- this work
- f non-convex, with iid z, compact convex constraint
- What are the optimization problems of interest?
- Markov Chain Monte Carlo sampling?
- Monte Carlo sampling is a way to integrate a pdf with many variables; since the integral is hard to compute directly, we use sampling instead
- a Markov chain is a systematic way of generating the variates (with the Markov property)
- Why are the optimization problems interesting?
- What is the algorithm you want to introduce?
- Why do you not consider any other existing algorithms?
- a more general scenario
- Why can we expect the algorithm to work?
- How do we analyze the algorithm?
- uses many internal variables to bound the $W_1$ (Wasserstein-1) distance
- What are the key ideas in the analysis?
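The Monte Carlo idea noted above, integrating a many-variable pdf by averaging over samples rather than computing the integral directly, can be sketched in one dimension; the uniform target and $f(x) = x^2$ are illustrative assumptions:

```python
import random

def mc_expectation(f, sampler, n, seed=0):
    """Estimate E[f(X)] = ∫ f(x) p(x) dx by averaging f over draws of X."""
    rng = random.Random(seed)
    return sum(f(sampler(rng)) for _ in range(n)) / n

# E[X^2] for X ~ Uniform(0, 1) is exactly 1/3; the estimate converges
# at rate O(1/sqrt(n)) regardless of dimension.
est = mc_expectation(lambda x: x * x, lambda rng: rng.random(), n=100000)
```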
<!-- ###
forward-backward splitting
- [An inertial forward-backward algorithm for monotone inclusions](https://arxiv.org/pdf/1403.3522.pdf)
- split T to A+B such that the resolvent is easy to compute
- [Efficient Online and Batch Learning Using Forward Backward Splitting](https://www.jmlr.org/papers/volume10/duchi09a/duchi09a.pdf)
- FOBOS
- [Efficient Learning using Forward-Backward Splitting](https://web.stanford.edu/~jduchi/projects/DuchiSi09b.pdf)
- empirical loss + regularization term
- [A FIELD GUIDE TO FORWARD-BACKWARD SPLITTING WITH A FASTA IMPLEMENTATION](https://arxiv.org/pdf/1411.3406.pdf)
- covers several special cases
- split into diff. and non-diff.
- [A Generalized Forward-Backward Splitting](https://hal.science/hal-00613637/document)
- mentions some application scenarios
- [Relaxed Forward–Backward Splitting Methods for Solving Variational Inclusions and Applications](https://link.springer.com/article/10.1007/s10915-021-01608-7) -->