# Concentration for the sum of a random-length sequence

:::info
## Exercise 7.1 from Bandit Algorithms [1]
Let $(X_t)_t$ be a sequence of i.i.d. Gaussian random variables with mean $\mu$ and unit variance defined on the probability space $(\Omega,\mathcal{F}, \mathbb{P})$. Let $T:\Omega\to \mathbb{N}=\{1,2,3,\ldots\}$ be a random variable and let $\hat{\mu} := \frac{1}{T} \sum_{t=1}^T X_t$.

1. Show that if $T$ and $(X_t)_t$ are independent, then for any $\delta \in (0,1)$,
$$
\mathbb{P} \left( \hat{\mu} -\mu \geq \sqrt{\frac{2\log \delta^{-1}}{T}} \right) \leq \delta.
$$
2. Now drop the assumption that $T$ and $(X_t)_t$ are independent. Let $\mathcal{F}_t:= \sigma(X_1,\ldots, X_t)$. For each $\delta\in (0,1)$ find a $T$ such that $\{T=t\}\in \mathcal{F}_t$ for all $t=1,2,\ldots$ and
$$
\mathbb{P} \left( \hat{\mu} -\mu\geq \sqrt{\frac{2\log \delta^{-1}}{T}} \right) =1.
$$
3. Show that in general we have
$$
\mathbb{P} \left( \hat{\mu} -\mu\geq \sqrt{\frac{2\log (T(T+1)/\delta)}{T}} \right) \leq \delta.
$$
:::

Without loss of generality assume $\mu=0$. Recall that if $Y\sim \mathcal{N}(0, \sigma^2)$ then for $y\geq 0$ we have
\begin{align*}
\mathbb{P}(Y\geq y)\leq \exp\left(-\frac{y^2}{2\sigma^2}\right).
\end{align*}
Since the $X_i$'s are i.i.d. standard normal, for any fixed $t$ the sample mean $\frac1t \sum_{i=1}^t X_i$ is $\mathcal{N}(0, 1/t)$-distributed. Analogous results can be obtained for subgaussian random variables using Hoeffding's inequality instead. We also remark that tighter bounds in (3) are possible, but they require more sophisticated tools.

## (1)

By independence we can write
\begin{align*}
\mathbb{P} \left( \hat{\mu} \geq \sqrt{\frac{2\log \delta^{-1}}{T}} \right)
&= \sum_{t=1}^\infty \mathbb{P} \left( \hat{\mu} \geq \sqrt{\frac{2\log \delta^{-1}}{T}} \Bigg\vert T=t \right) \mathbb{P}(T=t) \\
& = \sum_{t=1}^\infty \mathbb{P} \left( \frac1t \sum_{i=1}^t X_i \geq \sqrt{\frac{2\log \delta^{-1}}{t}} \right) \mathbb{P}(T=t) \\
& \leq \sum_{t=1}^\infty \exp\left(-\dfrac{\frac2t \log \frac1\delta}{2\cdot\frac1t} \right)\mathbb{P}(T=t) = \delta.
\end{align*}

## (2)

For $\omega\in\Omega$ define
\begin{align}
T(\omega):= \inf \left\{ n\geq 1 : \frac1n \sum_{t=1}^n X_t(\omega) \geq \sqrt{\frac{2\log\delta^{-1}}{n} }\right\}.
\end{align}
Then $\mathbb{I}\{T=t\}$ is a function of $X_1,\ldots, X_t$ and hence is $\mathcal{F}_t$-measurable. By the definition of $T$ we have $\hat{\mu}\geq \sqrt{2 (\log \delta^{-1})/T}$ on the event $\{T<\infty\}$. It thus remains to check that $T$ is well-defined, namely that $T<\infty$ almost surely. By the law of the iterated logarithm, for almost every $\omega\in\Omega$,
\begin{align}
\limsup_{n\to\infty} \frac{\sum_{t=1}^n X_t(\omega)}{\sqrt{2n\log\log n}}=1.
\end{align}
Thus for every $\epsilon \in (0,1)$ and $m\geq 1$ there exists $n\geq m$ such that
\begin{align}
\frac{\sum_{t=1}^n X_t(\omega)}{\sqrt{2n\log\log n}}\geq 1-\epsilon,
\end{align}
which implies
\begin{align}
\frac{\frac1n \sum_{t=1}^n X_t(\omega)}{\sqrt{(2\log\delta^{-1})/n}}\geq (1-\epsilon)\sqrt{\frac{\log\log n}{\log \delta^{-1}}}.
\end{align}
Since $\log\log n\to\infty$, the right-hand side exceeds $1$ once $m$ (and hence $n$) is sufficiently large. This means that the set in the definition of $T$ is a nonempty subset of $\mathbb{N}$, so $T<\infty$ almost surely.
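To make (2) concrete, here is a minimal numerical sketch (not from [1]; the values of `delta`, `horizon`, and `n_paths` are arbitrary choices). On each sample path we look for the first time the running mean crosses the threshold $\sqrt{(2\log\delta^{-1})/n}$. Whenever a crossing occurs, the event $\hat{\mu}\geq \sqrt{(2\log\delta^{-1})/T}$ holds by construction; the proof guarantees a crossing on almost every infinite path, which a truncated simulation can only approximate.

```python
import numpy as np

rng = np.random.default_rng(0)
delta = 0.1
horizon = 200_000   # finite truncation; the proof gives T < infinity a.s., not a bound on T
n_paths = 200

n = np.arange(1, horizon + 1)
threshold = np.sqrt(2 * np.log(1 / delta) / n)   # sqrt((2 log(1/delta)) / n)

stopped = 0
for _ in range(n_paths):
    x = rng.standard_normal(horizon)          # X_t with mu = 0 (WLOG)
    running_mean = np.cumsum(x) / n
    crossings = np.nonzero(running_mean >= threshold)[0]
    if crossings.size > 0:                    # T(omega) = first crossing time
        stopped += 1                          # here mu_hat >= sqrt(2 log(1/delta)/T) by construction

print(f"paths that stopped within the horizon: {stopped}/{n_paths}")
```

With these settings only part of the paths stop within the horizon, and the stopped fraction creeps toward one only slowly as `horizon` grows, consistent with the $\sqrt{\log\log n}$ growth in the argument above.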
## (3)

Since the events $\{T=t\}_{t\geq 1}$ partition the sample space, we have
\begin{align*}
\mathbb{P} \left( \hat{\mu} \geq \sqrt{\frac{2\log (T(T+1)/\delta)}{T}} \right)
& = \sum_{t=1}^\infty \mathbb{P} \left( \hat{\mu} \geq \sqrt{\frac{2\log (T(T+1)/\delta)}{T}} \text{ and } T=t \right) \\
& \leq \sum_{t=1}^\infty \mathbb{P} \left( \frac1t\sum_{i=1}^t X_i\geq \sqrt{\frac{2\log (t(t+1)/\delta)}{t}} \right) \\
& \leq \sum_{t=1}^\infty \exp\left(-\dfrac{\frac{2\log (t(t+1)/\delta)}{t}}{2\cdot \frac1t}\right) \\
& = \sum_{t=1}^\infty \frac{\delta}{t(t+1)}=\delta,
\end{align*}
where the last equality uses the telescoping sum $\sum_{t=1}^\infty \frac{1}{t(t+1)} = \sum_{t=1}^\infty \left(\frac1t - \frac{1}{t+1}\right) = 1$. A quick Monte Carlo sanity check of this anytime bound is sketched after the references.

# References

[1] Lattimore, Tor, and Csaba Szepesvári. *Bandit Algorithms*. Cambridge University Press, 2020.
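As promised, here is a small Monte Carlo check of (3) (again not from [1]; `delta`, `horizon`, and `n_paths` are arbitrary choices). The worst case over all rules with $\{T=t\}\in\mathcal{F}_t$ is to stop at the first $t$ where the running mean crosses the anytime threshold $\sqrt{2\log(t(t+1)/\delta)/t}$, so the bound says exactly that the probability of ever crossing is at most $\delta$; the simulation estimates this crossing probability up to a finite horizon.

```python
import numpy as np

rng = np.random.default_rng(1)
delta = 0.1
horizon = 5_000     # finite truncation of the supremum over all t
n_paths = 10_000

n = np.arange(1, horizon + 1)
threshold = np.sqrt(2 * np.log(n * (n + 1) / delta) / n)   # anytime threshold from (3)

violations = 0
for _ in range(n_paths):
    x = rng.standard_normal(horizon)            # X_t with mu = 0 (WLOG)
    running_mean = np.cumsum(x) / n
    if np.any(running_mean >= threshold):       # worst case: stop at the first crossing
        violations += 1

print(f"empirical crossing frequency: {violations / n_paths:.4f}  (bound: delta = {delta})")
```

The empirical frequency should come out below $\delta$, typically with room to spare, since the union-bound argument in (3) is not tight.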