# [Efficient Algorithms for Online Decision Problems](https://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.93.3588&rep=rep1&type=pdf)

## Background

1. Randomized Weighted Majority (RWM) achieves no-regret but is not necessarily efficient, since the number of experts can be exponentially large.
    - It is expensive to maintain a probability distribution over such a large space!
2. This paper gives an efficient algorithm, "Follow the Perturbed Leader" (FPL), that achieves no-regret if the loss function is linear.
    - i.e., given the states $\{c_t\}_{t=1}^T \in \mathcal{C}^{*}$ and an action $x \in S$, the loss function $\ell: \mathcal{C} \times S \to \mathbb{R}$ satisfies $\sum_{t=1}^T \ell(c_t, x) = \ell(\sum_{t=1}^T c_t, x)$ (assuming $\mathcal{C}$ is closed under addition).
    - Note that the environment may be adversarial, so any deterministic algorithm can be exploited and suffer linear regret.

## Model

- We have a set $S$ of feasible points in $\mathbb{R}^m$. At each time step $t$, we (possibly randomly) pick $x_t \in S$. We then observe the state $c_t$ and pay $\ell(c_t, x_t) = c_t \cdot x_t$.
    - Below we will also refer to $c_t$ as the cost vector.
- The set $S$ may have exponentially many points. However, we have an oracle $M: \mathcal{C} \to S$ that, given any cost vector $c$, finds the $x \in S$ minimizing $\ell(c, x)$. We will use it to produce an *online* algorithm whose overall expected cost is not much worse than that of the best single $x \in S$ in hindsight. Specifically, for any sequence $c_1, \dots, c_T$ we want:
$$\mathbb{E}[\mathrm{Loss~of~our~Alg.}] \leq \min_{x \in S}~ \sum_{t=1}^T \ell(c_t, x) + o(T)$$
- We will need to assume that $S$ is bounded, and so are the cost vectors $c_t$.

## Results

**Algorithm (Follow the Perturbed Leader)**

1. At time $t$, sample a random vector $c_0$ uniformly from $[0, 2/\epsilon]^m$, where $\epsilon > 0$ is a tunable parameter.
2. Output $x_t = M(c_0 + c_1 + \dots + c_{t-1})$.

**Analysis Outline**

1. Show that outputting $M(c_1 + \dots + c_t)$ at time $t$ gives a good result, i.e., peek one step into the future and use that today (**be-the-leader**). Note this is *not* the same as $M(c_t)$, which would obviously be super-optimal. Now let algorithm $A$ (follow-the-leader) output $\arg\min_{x \in S} \ell(c_1 + \dots + c_{t-1}, x)$, and let algorithm $B$ (be-the-leader) output $\arg\min_{x \in S} \ell(c_1 + \dots + c_t, x)$. If we add random vectors $c_{0A}, c_{0B} \sim_{\text{i.i.d.}} U([0, 2/\epsilon]^m)$ into their respective loss functions and rename $A \to A'$ and $B \to B'$, then as $\epsilon \to 0$, $A'$ and $B'$ behave more and more alike.
    - The support of the random vector can also be a ball, but the cube is simpler to analyze.
2. We then show that for an appropriate value of $\epsilon$:
    a. $B'$ is close to $B$ (be-the-leader).
    b. $A'$ is close to $B'$.

    Together these imply that $A'$ (follow-the-perturbed-leader) is good.

**Applications**

###### tags: `Digest`
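The algorithm above can be sketched in a few lines. The following is a minimal illustration (not from the paper) in the experts setting: $S$ is the set of standard basis vectors of $\mathbb{R}^m$, so the oracle $M(c)$ reduces to picking the coordinate with the smallest cumulative cost. The cost sequence, the parameter value, and the function name `fpl` are all assumptions chosen for the demo.

```python
import numpy as np

def fpl(costs, epsilon, rng):
    """Follow the Perturbed Leader on a (T, m) array of cost vectors.

    Experts setting: S = standard basis vectors of R^m, so the oracle
    M(c) is simply argmin_i c_i. Returns the algorithm's total loss.
    """
    T, m = costs.shape
    cumulative = np.zeros(m)  # running sum c_1 + ... + c_{t-1}
    total_loss = 0.0
    for t in range(T):
        # Fresh perturbation c_0 ~ U([0, 2/epsilon]^m) each round.
        c0 = rng.uniform(0.0, 2.0 / epsilon, size=m)
        x_t = np.argmin(cumulative + c0)  # oracle on perturbed cumulative cost
        total_loss += costs[t, x_t]       # linear loss c_t . x_t
        cumulative += costs[t]
    return total_loss

rng = np.random.default_rng(0)
# An alternating cost pattern for m = 2 experts: any deterministic
# follow-the-leader can be made to pay every round on such sequences.
T, m = 2000, 2
costs = np.zeros((T, m))
costs[::2, 0] = 1.0
costs[1::2, 1] = 1.0

loss = fpl(costs, epsilon=0.05, rng=rng)
best_in_hindsight = costs.sum(axis=0).min()  # best fixed expert pays T/2
print(loss, best_in_hindsight)
```

On this sequence plain follow-the-leader flip-flops and can pay on every round, while the randomized perturbation keeps FPL's expected loss within a sublinear additive term of the best fixed expert.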