# Stochastic Iteration Convergence
:::success
Let $\{Y_t\}_t$ be a process in $\mathbb{R}^n$, adapted to a filtration $\{\mathcal{F}_t\}$, defined by the recursion
$$
Y_{t+1} = (1-\gamma_t)Y_t + \gamma_t v_t,
$$ where $\{\gamma_t\}$ is a deterministic sequence in $[0,1]$ satisfying
$$
\sum_{t} \gamma_t=\infty, \quad \sum_{t} \gamma_t^2 <\infty
$$ and $v_t$ is $\mathcal{F}_{t+1}$-measurable and satisfies
$$
\mathbb{E} [v_t \mid \mathcal{F}_t]=0, \quad \mathbb{E} [\|v_t\|_2^2 \mid \mathcal{F}_t]\leq A
$$ for some constant $A>0$. We wish to show that $Y_t\to 0$ almost surely.
:::
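Before the proof, a quick numerical sanity check is reassuring. The sketch below uses illustrative assumptions not fixed by the statement: $\gamma_t = 1/(t+2)$ (not summable, but square-summable) and i.i.d. standard Gaussian noise $v_t$, so $A = n$.

```python
import numpy as np

rng = np.random.default_rng(0)
n, T = 3, 20000
gam = 1.0 / (np.arange(T) + 2)   # gamma_t = 1/(t+2): sum diverges, sum of squares converges
Y = np.array([5.0, -3.0, 2.0])   # arbitrary starting point Y_0
for t in range(T):
    v = rng.standard_normal(n)   # E[v_t | F_t] = 0, E[||v_t||^2 | F_t] = n =: A
    Y = (1 - gam[t]) * Y + gam[t] * v
assert np.linalg.norm(Y) < 0.1   # the iterate has contracted toward 0
```

With this step size the recursion telescopes to $(T+1)Y_T = Y_0 + \sum_{t=0}^{T-1} v_t$, so the simulation is essentially the strong law of large numbers in disguise.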
1. Let $\Gamma_n:=\prod_{t=0}^n (1-\gamma_t)$. Since $\log(1-x)\leq -x$ for $x\in[0,1]$ (with the convention $\log 0 = -\infty$), we have
\begin{align*}
\log \prod_{t=0}^\infty (1-\gamma_t) = \sum_{t=0}^\infty\log (1-\gamma_t) \leq \sum_{t=0}^\infty -\gamma_t = -\infty,
\end{align*} and hence $\lim_n \Gamma_n = \prod_{t=0}^\infty (1-\gamma_t)=0$.
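For a concrete instance (the step size $\gamma_t = 1/(t+2)$ is an assumption for illustration), the partial products even have a closed form, $\Gamma_n = \prod_{t=0}^n \frac{t+1}{t+2} = \frac{1}{n+2}$, which makes the decay to $0$ easy to verify numerically:

```python
import numpy as np

N = 10000
gam = 1.0 / (np.arange(N) + 2)   # gamma_t = 1/(t+2)
Gam = np.cumprod(1.0 - gam)      # Gamma_n = prod_{t=0}^n (1 - gamma_t)
# telescoping: 1 - gamma_t = (t+1)/(t+2), so Gamma_n = 1/(n+2)
assert np.allclose(Gam, 1.0 / (np.arange(N) + 2))
assert Gam[-1] < 1e-3            # Gamma_n -> 0
```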
2. Since $\mathbb{E}[v_t \mid \mathcal{F}_t]=0$, the tower property gives $\mathbb{E} [Y_{t+1}]=(1-\gamma_t) \mathbb{E}[Y_t]$, so by induction $\mathbb{E} [Y_{t+1}] = \Gamma_t \mathbb{E} [Y_0] \to 0$ as $t\to\infty$.
3. The squared $2$-norm of the process is given by
\begin{align*}
\|Y_{t+1}\|_2^2 = (1-\gamma_t)^2\|Y_t\|_2^2 + \gamma_t^2 \|v_t\|_2^2 + 2\gamma_t(1-\gamma_t) \langle Y_t, v_t \rangle
\end{align*} and thus, since the cross term has zero conditional mean ($\mathbb{E}[\langle Y_t, v_t \rangle \mid \mathcal{F}_t] = \langle Y_t, \mathbb{E}[v_t \mid \mathcal{F}_t]\rangle = 0$),
\begin{align*}
\mathbb{E} \|Y_{t+1}\|_2^2 & = \mathbb{E} \big[\mathbb{E} [\|Y_{t+1}\|_2^2 \mid \mathcal{F}_t]\big] = (1-\gamma_t)^2\mathbb{E} \|Y_t\|_2^2 + \gamma_t^2 \mathbb{E}\|v_t\|_2^2 \\
& \leq (1-\gamma_t)^2\mathbb{E} \|Y_t\|_2^2 + \gamma_t^2 A.
\end{align*}
Writing $y_k:=\mathbb{E} \|Y_k\|_2^2$, we have $y_{k+1}\leq (1-\gamma_k)^2y_k + \gamma_k^2 A$ and hence
\begin{align*}
y_1 & \leq (1-\gamma_0)^2 y_0 + \gamma_0^2 A \leq (1-\gamma_0)^2 y_0 + \gamma_0 A\\
y_2 & \leq (1-\gamma_1)^2 (1-\gamma_0)^2 y_0 + (1-\gamma_1)^2\gamma_0 A + \gamma_1^2 A \\
& = \Gamma_1^2 y_0 + A\left[(1-\gamma_1)\cdot(1-\gamma_1)\gamma_0 + \gamma_1\cdot\gamma_1\right] \\
& \leq \Gamma_1^2 y_0 + A \max \left\{(1-\gamma_1)\gamma_0, \gamma_1\right\} \\
y_3 & \leq \Gamma_2^2 y_0 + A \left[ (1-\gamma_2)^2\max \left\{(1-\gamma_1)\gamma_0, \gamma_1\right\} + \gamma_2^2\right] \\
& = \Gamma_2^2 y_0 + A \max\left\{(1-\gamma_2)\max \left\{(1-\gamma_1)\gamma_0, \gamma_1\right\} , \gamma_2 \right\} \\
& \leq \Gamma_2^2 y_0 + A \max \left\{(1-\gamma_2)(1-\gamma_1)\gamma_0, (1-\gamma_2)\gamma_1, \gamma_2 \right\} \\
y_4 & \leq \Gamma_3^2y_0 + A \max \left\{(1-\gamma_3)(1-\gamma_2)(1-\gamma_1)\gamma_0, (1-\gamma_3)(1-\gamma_2)\gamma_1, (1-\gamma_3)\gamma_2 , \gamma_3\right\}\\
& = \Gamma_3^2 y_0 + A \max \left\{ \frac{\Gamma_3}{\Gamma_0} \gamma_0, \frac{\Gamma_3}{\Gamma_1} \gamma_1, \frac{\Gamma_3}{\Gamma_2} \gamma_2, \frac{\Gamma_3}{\Gamma_3} \gamma_3\right\}.
\end{align*}
In each step, a convex combination of two terms (the weights $(1-\gamma_k)$ and $\gamma_k$ sum to $1$) is bounded by their maximum; by induction we have
\begin{align*}
y_{k+1} \leq \Gamma_k^2y_0 + A \max_{j=0,\ldots, k}\left\{\frac{\Gamma_k}{\Gamma_j}\gamma_j \right\}.
\end{align*}
For any $\epsilon>0$: since $\sum_j \gamma_j^2<\infty$ forces $\gamma_j\to 0$ as $j\to\infty$, there exists $J$ such that $\gamma_j < \epsilon$ for all $j\geq J$. We may assume $\gamma_t < 1$ for every $t$ (square-summability leaves at most finitely many indices with $\gamma_t = 1$, and the analysis can be restarted after the last of them), so that $\Gamma_J>0$. Since $\Gamma_k\to 0$ as $k\to\infty$, there exists $K>J$ such that $\Gamma_k < \epsilon \Gamma_J \leq \epsilon$ for all $k\geq K$. Notice that $\Gamma_k/\Gamma_j \in [0,1]$ and is nonincreasing in $k$ for fixed $j$. Thus for $k\geq K$ we have
$$
\max_{j=0,\ldots, J}\left\{\frac{\Gamma_k}{\Gamma_j}\gamma_j \right\} \leq \frac{\Gamma_k}{\min_{j=0,\ldots, J} \Gamma_j} \cdot 1 = \frac{\Gamma_k}{\Gamma_J} < \epsilon
$$ and
$$
\max_{j=J+1, \ldots, k}\left\{\frac{\Gamma_k}{\Gamma_j}\gamma_j \right\} \leq \max_{j=J+1, \ldots, k} \gamma_j < \epsilon,
$$ which means for $k\geq K$ we have
$$
y_{k+1} \leq \epsilon^2 y_0 + A\epsilon.
$$ This shows that $\mathbb{E} \|Y_t\|_2^2\to 0$ as $t\to\infty$. In particular, the sequence $\{Y_t\}$ is bounded in $L^2$, and hence each component is bounded in $L^1$.
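The induction bound on $y_{k+1}$ can be checked against the recursion it was derived from. The script below (again assuming the illustrative step size $\gamma_t = 1/(t+2)$, and running the recursion with equality in place of $\leq$) confirms $y_{k+1} \leq \Gamma_k^2 y_0 + A\max_{j\leq k}(\Gamma_k/\Gamma_j)\gamma_j$ at every step:

```python
import numpy as np

A, y0, K = 1.0, 4.0, 200
gam = 1.0 / (np.arange(K) + 2)   # gamma_t = 1/(t+2)
Gam = np.cumprod(1.0 - gam)      # Gamma_k = prod_{t=0}^k (1 - gamma_t)

y = y0
for k in range(K):
    # recursion y_{k+1} = (1-gamma_k)^2 y_k + gamma_k^2 A, taken with equality
    y = (1 - gam[k]) ** 2 * y + gam[k] ** 2 * A
    bound = Gam[k] ** 2 * y0 + A * max(Gam[k] / Gam[j] * gam[j] for j in range(k + 1))
    assert y <= bound + 1e-12    # the closed-form bound dominates
```

For this step size the recursion itself telescopes ($(k+2)^2 y_{k+1} = (k+1)^2 y_k + A$, so $y_K = (y_0 + KA)/(K+1)^2$), and the bound evaluates to $y_0/(k+2)^2 + A/(k+2)$, so both sides decay at the same $O(1/k)$ rate.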
4. Note that $Y_t$ itself need not be a supermartingale: $\mathbb{E}[Y_{t+1}\mid\mathcal{F}_t]=(1-\gamma_t)Y_t$, which exceeds $Y_t$ in any component where $Y_t$ is negative. Instead consider
$$
Z_t := \|Y_t\|_2^2 + A\sum_{s=t}^\infty \gamma_s^2,
$$ which is finite since $\sum_s \gamma_s^2<\infty$. Conditioning the expansion in 3. on $\mathcal{F}_t$ gives
$$
\mathbb{E}[Z_{t+1}\mid \mathcal{F}_t] \leq (1-\gamma_t)^2\|Y_t\|_2^2 + \gamma_t^2 A + A\sum_{s=t+1}^\infty \gamma_s^2 \leq Z_t,
$$ so $\{Z_t\}$ is a nonnegative supermartingale, and in particular bounded in $L^1$ ($\mathbb{E} Z_t \leq \mathbb{E} Z_0 < \infty$). The convergence theorem for supermartingales (Theorem 11.5 in [1]), which relies on an upcrossing argument, shows that $Z_t$ converges almost surely. Since the tail sums $A\sum_{s=t}^\infty \gamma_s^2$ vanish as $t\to\infty$, it follows that $\|Y_t\|_2^2$ converges almost surely to some nonnegative, $\mathcal{F}_\infty$-measurable random variable $L$. By 3. and Fatou's lemma,
$$
\mathbb{E} [L] \leq \liminf_{t\to\infty}\mathbb{E} \|Y_t\|_2^2=0.
$$ This means $L=0$ almost surely, since Markov's inequality gives $\mathbb{P}(L \geq \frac{1}{n}) \leq n\,\mathbb{E}[L] = 0$ for every $n$, and hence
$$
\mathbb{P}(L\neq 0) = \lim_{n\to\infty} \mathbb{P} \left(L \geq \frac{1}{n}\right) = 0.
$$ Therefore $Y_t\to 0$ almost surely.
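The almost-sure statement can also be eyeballed numerically: every simulated path should settle near $0$, not just the average over paths. The sketch below (same illustrative assumptions as before: $\gamma_t = 1/(t+2)$ and standard Gaussian noise) runs several independent paths from independent starting points:

```python
import numpy as np

rng = np.random.default_rng(1)
paths, n, T = 10, 2, 10000
gam = 1.0 / (np.arange(T) + 2)               # gamma_t = 1/(t+2)
Y = rng.uniform(-5.0, 5.0, size=(paths, n))  # independent starting points Y_0
for t in range(T):
    v = rng.standard_normal((paths, n))      # mean-zero noise, E[||v_t||^2] = n
    Y = (1 - gam[t]) * Y + gam[t] * v
assert np.abs(Y).max() < 0.2                 # every path, not just the mean, is near 0
```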
# References
[1] Williams, David. Probability with Martingales. Cambridge University Press, 1991.