# [Efficient Algorithms for Online Decision Problems](https://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.93.3588&rep=rep1&type=pdf)
## Background
1. Randomized Weighted Majority (RWM) achieves no regret but is not necessarily efficient, since the number of experts can be exponentially large.
- Expensive to maintain a probability distribution over such a large space!
2. This paper gives an efficient algorithm, "Follow the Perturbed Leader", that achieves no-regret if the loss function is linear.
- i.e., given states $c_1, \dots, c_T \in \mathcal{C}$ and an action $x \in S$, the loss function $\ell: \mathcal{C} \times S \to \mathbb{R}$ satisfies $\sum_{t=1}^T \ell(c_t, x) = \ell\left(\sum_{t=1}^T c_t, x\right)$ (assuming $\mathcal{C}$ is closed under addition)
- Note that the environment may be adversarial, so any deterministic algorithm can be exploited and suffer linear regret.
## Model
- We have a set $S$ of feasible points in $\mathbb{R}^m$. At each time step $t$, we (possibly randomly) pick $x_t \in S$. We then observe the state $c_t$ and pay $\ell(c_t, x_t) = c_t \cdot x_t$.
- Below we will also refer to $c_t$ as the cost vector
- The set $S$ may have exponentially many points. However, we have an oracle $M: \mathcal{C} \to S$ that, given any cost vector $c$, finds an $x \in S$ minimizing $\ell(c, x)$. We will use it to produce an *online* algorithm whose overall expected cost is not much worse than that of the best fixed $x \in S$ in hindsight. Specifically, for any sequence $c_1, \dots, c_T$ we want:
$$\mathbb{E}[\mathrm{Loss~of~our~Alg.}] \leq \min_{x \in S}~ \sum_{t=1}^T \ell(c_t, x) + o(T)$$
- We will need to assume that $S$ is bounded and so are the cost vectors $c_t$.
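As a concrete (hypothetical, illustration-only) instance of this model, take $S$ to be the $m$ standard basis vectors — the classic experts setting — so $c \cdot x$ is the cost of the chosen expert and the oracle $M$ is just a coordinate argmin:

```python
import numpy as np

# Toy instance (an assumption for illustration, not from the paper):
# S = {e_1, ..., e_m}, the standard basis vectors of R^m, i.e. the
# "experts" setting. The oracle M is then a simple argmin over coordinates.
def oracle_M(c):
    """Return the x in S minimizing the linear loss c . x."""
    x = np.zeros(len(c))
    x[np.argmin(c)] = 1.0
    return x

c = np.array([0.4, 0.1, 0.7])
x = oracle_M(c)
loss = float(c @ x)  # the oracle picks expert 1, paying cost 0.1
```

For richer decision sets (paths in a graph, spanning trees, etc.) the same interface applies; only the implementation of $M$ changes.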
## Results
**Algorithm (Follow the Perturbed Leader)**
1. At time $t$, sample a random vector $c_0$ uniformly from $[0, 2/\epsilon]^m$, where $\epsilon > 0$ is a parameter of the algorithm.
2. Output $x_t = M(c_0 + c_1 + ... + c_{t-1})$
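The two steps above can be sketched as follows (a minimal sketch: the oracle is the hypothetical experts-setting argmin, and resampling the perturbation fresh each round is one standard variant):

```python
import numpy as np

rng = np.random.default_rng(0)

def oracle_M(c):
    """Toy oracle (an assumption): S = standard basis vectors of R^m."""
    x = np.zeros(len(c))
    x[np.argmin(c)] = 1.0
    return x

def fpl_step(past_costs, eps, m):
    """One round of Follow the Perturbed Leader.

    past_costs: the observed cost vectors c_1, ..., c_{t-1}.
    eps: parameter; the perturbation c_0 is uniform on [0, 2/eps]^m.
    """
    c0 = rng.uniform(0.0, 2.0 / eps, size=m)                     # step 1: perturb
    total = c0 + (np.sum(past_costs, axis=0) if past_costs else 0.0)
    return oracle_M(total)                                       # step 2: follow the perturbed leader

x1 = fpl_step([], eps=0.5, m=3)                                  # first round: decision driven by noise alone
x2 = fpl_step([np.array([0.9, 0.1, 0.5])], eps=0.5, m=3)
```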
**Analysis Outline**
1. Show that outputting $M(c_1 + \dots + c_t)$ gives a good result, i.e., peek one step into the future and use that today (**be-the-leader**). Note this is *not* the same as $M(c_t)$, which obviously would be super-optimal.
Now, let algorithm $A$ (follow-the-leader) output $\arg\min_{x \in S} \ell(c_1 + \dots + c_{t-1}, x)$, and let algorithm $B$ (be-the-leader) output $\arg\min_{x \in S} \ell(c_1 + \dots + c_t, x)$. Add random vectors $c_{0A}, c_{0B} \sim_{\text{i.i.d.}} U([0, 2/\epsilon]^m)$ to their respective cumulative cost vectors, and call the resulting algorithms $A'$ and $B'$. As $\epsilon \to 0$ (i.e., as the perturbation grows), $A'$ and $B'$ behave more and more alike.
- The support of the random vector could also be a ball, but a cube is simpler to analyze
2. We then show that, for an appropriate value of $\epsilon$:
a. $B'$ is close to $B$ (be-the-leader)
b. $A'$ is close to $B'$.
This will imply that $A'$ (follow-the-perturbed-leader) is good.
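The claimed no-regret behavior can be checked empirically with a small simulation (a hypothetical demo: experts setting, i.i.d. uniform costs, fixed seed; the choice $\epsilon = 1/\sqrt{T}$ is a rough assumption, whereas the paper tunes $\epsilon$ to balance the perturbation and switching terms):

```python
import numpy as np

rng = np.random.default_rng(1)

def oracle_M(c):
    """Toy oracle (an assumption): S = standard basis vectors of R^m."""
    x = np.zeros(len(c))
    x[np.argmin(c)] = 1.0
    return x

m, T = 4, 2000
eps = 1.0 / np.sqrt(T)                 # rough parameter choice for this demo
costs = rng.uniform(0.0, 1.0, size=(T, m))

fpl_loss, cum = 0.0, np.zeros(m)
for t in range(T):
    c0 = rng.uniform(0.0, 2.0 / eps, size=m)   # fresh perturbation each round
    x = oracle_M(c0 + cum)                     # follow the perturbed leader
    fpl_loss += float(costs[t] @ x)
    cum += costs[t]

best_in_hindsight = float(cum.min())           # loss of the best fixed expert
regret = fpl_loss - best_in_hindsight
print(f"average per-round regret: {regret / T:.4f}")
```

The average per-round regret shrinks as $T$ grows, consistent with the $o(T)$ total-regret guarantee in the Model section.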
**Applications**
###### tags: `Digest`