# Ranking with Long-Term Constraints $$ \sum_{t=1}^T r_t^\top \Sigma_t u - \phi^\top \left(\tau - \sum_{t=1}^T W\Sigma_t e \right)_+ $$ <hr /> ### Cost-Oblivious P-Controller #### Constraint: $$ (\frac{t}{T}\tau - s_{t})_+ $$ #### Tracking Error: $$ \left[ \sum_{i=1}^m w_{ij} \left( \frac{t}{T}\tau_j - s^j_{t} \right)_+ \right]_{j=1}^n = W^\top \left(\frac{t}{T}\tau - s_{t} \right)_+ $$ #### Controller: $$ \Pi_{\text{CO}}(x_t,s_{t-1},t) = \arg\max_\Sigma~ \left[ r_t + \gamma W^\top \frac{\left( \frac{t-1}{T}\tau - s_{t-1} \right)_+}{\color{red}{Z}}\right]^\top \Sigma u $$ where $Z \in \mathbb{R}^m$ is the number of items in each constriant group and $\gamma \in \mathbb{R}$ is a scalar value. <hr/> ### Oracle Controller $$ \sum_{t=1}^T r_t^\top \Sigma_t u - \phi^\top \left(\tau - \sum_{t=1}^T W\Sigma_t e \right)_+ $$ ### Cost-Aware Reactive-Controller #### Objective: $$ \min_{0\leq \lambda\leq \phi} \frac{1}{T}\sum_{t=1}^T u(a_t|x_t) - \lambda^T \left(\frac{1}{T}\tau - \frac{1}{T}\sum_{t=1}^T c(a_t|x_t)\right), $$ #### Iterative Solution: $$ \Sigma_t = \arg\max_\Sigma r_t^\top \Sigma u - (\lambda_{t-1})_{[0,\phi]}^\top \left(\color{red}{\frac{1}{T}\tau} - W\Sigma e \right) $$ $$\lambda_t = \lambda_{t-1} + \gamma \left(\frac{1}{T}\tau - W\Sigma_t e \right), $$ #### Lambda: $$ \lambda_t = \lambda_{0} + \sum_{k=1}^{t} \gamma \left(\frac{1}{T}\tau - W \Sigma_k e \right) = \lambda_0 + \gamma\left(\frac{t}{T}\tau - s_{t}\right) $$ #### Controller: $$ \Pi_{\text{CA}}(x_t,s_{t-1},t) = \arg\max_\Sigma r_t^\top \Sigma u +\gamma\left(\tfrac{t-1}{T}\tau - s_{t-1}\right)_{[0,\phi]}^\top W\Sigma e $$ #### <span style="color:red"> Substitute:</span> \begin{align*} \Sigma_t &= \arg\max_\Sigma r_t^\top \Sigma u - (\lambda_{t-1})_{[0,\phi]}^\top \left(\frac{1}{T}\tau - W\Sigma e \right) \\ \Sigma_t &= \arg\max_\Sigma r_t^\top \Sigma u - \left( \lambda_0 + \gamma\left[\frac{t-1}{T}\tau - s_{t-1}\right] \right)_{[0,\phi]}^\top \left(\frac{1}{T}\tau - W\Sigma e \right) \color{red}{\text{definition of}~\lambda_{t-1}}\\ \Sigma_t &= \arg\max_\Sigma r_t^\top \Sigma u - \left(\gamma\left[\frac{t-1}{T}\tau - s_{t-1}\right] \right)_{[0,\phi]}^\top \left(\frac{1}{T}\tau - W\Sigma e \right) \color{red}{\lambda_{0}=0}\\ \Sigma_t &= \arg\max_\Sigma r_t^\top \Sigma u + \left(\gamma\left[\frac{t-1}{T}\tau - s_{t-1}\right] \right)_{[0,\phi]}^\top W\Sigma e~ \color{red}{\text{distribute and simply}}\\ \end{align*} <hr/> ### Cost-Aware Reactive-Controller (Online BPC No Error) #### <span style="color:red"> Iterative Solution (No error on $\frac{1}{T}\tau$):</span> $$ \Sigma_t = \arg\max_\Sigma r_t^\top \Sigma u + (\lambda_{t-1})_{[0,\phi]}^\top \color{red}{ W\Sigma e} $$ $$\lambda_t = \lambda_{t-1} + \gamma \left(\frac{1}{T}\tau - W\Sigma_t e \right), $$ #### <span style="color:red"> Substitute (No error on $\frac{1}{T}\tau$):</span> \begin{align*} \Sigma_t &= \arg\max_\Sigma r_t^\top \Sigma u + (\lambda_{t-1})_{[0,\phi]}^\top W\Sigma e \\ \Sigma_t &= \arg\max_\Sigma r_t^\top \Sigma u + \left( \lambda_0 + \gamma\left[\frac{t-1}{T}\tau - s_{t-1}\right] \right)_{[0,\phi]}^\top W\Sigma e ~\color{red}{\text{definition of}~\lambda_{t-1}}\\ \Sigma_t &= \arg\max_\Sigma r_t^\top \Sigma u + \left(\gamma\left[\frac{t-1}{T}\tau - s_{t-1}\right] \right)_{[0,\phi]}^\top W\Sigma e ~ \color{red}{\lambda_{0}=0}\\ \end{align*} <hr /> ### Online BPC No Error with Hinge $$ \Sigma_t = \arg\max_\Sigma r_t^\top \Sigma u - (\lambda_{t-1})_{[0,\phi]}^\top \color{red}{ \left( \frac{1}{T}\tau - W\Sigma_t e \right)_+} $$ $$\lambda_t = \lambda_{t-1} + \gamma \left(\frac{1}{T}\tau - W\Sigma_t e \right), $$ <hr /> ### BPC \begin{align*} \Sigma_t = &\arg\max_\Sigma r_t^\top \Sigma u \\ & \text{s.t.} W \Sigma e + s_{t-1} \geq \frac{t}{T} \delta \end{align*} <hr /> ### BPC Relax \begin{align*} \Sigma_t = &\arg\max_\Sigma r_t^\top \Sigma u - \kappa \\ & \text{s.t.} W \Sigma e + s_{t-1} - \frac{t}{T} \delta + \kappa \geq 0 \end{align*} <hr /> ### BPC Phi \begin{align*} \Sigma_t = &\arg\max_\Sigma r_t^\top \Sigma u + \phi( \frac{t}{T} \delta - W \Sigma e + s_{t-1})_+ \end{align*} <hr /> ### P-controller No-State \begin{align*} \Sigma_t = \text{argmax}_{\Sigma \in \Delta} \left[ \left( r + { W^\top \lambda^\top \odot \left[\frac{t}{T} \tau - s_{t-1}\right] }_+ \right)^\top \Sigma u \right] \end{align*} <!--- #### <span style="color:red"> Assume that $u=e$:%</span> \begin{align*} \Pi_{\text{CA}}(x_t,s_{t-1},t) &= \arg\max_\Sigma r_t^\top \Sigma u + \gamma\left(\tfrac{t-1}{T}\tau - s_{t-1}\right)_{[0,\phi]}^\top W\Sigma e\\ \Pi_{\text{CA}}(x_t,s_{t-1},t) &= \arg\max_\Sigma r_t^\top \Sigma u + \gamma\left(\tfrac{t-1}{T}\tau - s_{t-1}\right)_{[0,\phi]}^\top W\Sigma \color{red}{u}\\ \Pi_{\text{CA}}(x_t,s_{t-1},t) &= \arg\max_\Sigma \left[ r_t^\top + \gamma\left(\tfrac{t-1}{T}\tau - s_{t-1}\right)_{[0,\phi]}^\top W \right] \color{red}{\Sigma u} \end{align*} <span style="color:red">Does the scale of $u$ and $e$ matter? It seems like we only care about position so why not set $u$ and $e$ to a position vector, i.e. $u=e=<1,2,3,4,5,6,7,8>$ </span> --> <hr /> ### Cost-Aware Predictive-Controller #### Objective: $$ \sum_{t=1}^T c(a_t|x_t) = s_{h-1} +c(a_h|x_h) + C(A_h|C_h) $$ #### Objective depending on $a_t$: $$ r_t^\top \Sigma_t u - \phi^T \left(\tau - \left[s_{t-1} + r_t^\top \Sigma_t e + \sum_{t'=t+1}^T r_t^\top \Sigma_{t'} e \right]\right)_+ $$ #### Multi-forecast: $$ \max_a \frac{1}{B}\sum_{b=1}^B u(a|x_t) - \phi^T \left(\tau - s_{t-1} - c(a|x_t) -\hat{C}_t^b \right)_+ $$ #### Iterative Solution: $$ a_t = \arg\max_a~ u(a|x_t) + \frac{1}{B}\sum_{b=1}^B \left(\lambda^b_{t-1} \right)_{[0,\phi]}^\top c(a|x_t) $$ $$ \lambda_t^b = \lambda_{t-1}^b + \gamma \left(\tau - s_{t-1} - c(a_t|x_t) - \hat{C}_t^b \right),~~b=1,\dots,B. $$ #### <span style="color:red"> Iterative Solution(Error):</span> $$ a_t = \arg\max_a~ u(a|x_t) + \frac{1}{B}\sum_{b=1}^B \left(\lambda^b_{t-1} \right)_{[0,\phi]}^\top \color{red}{\left(\tau - s_{t-1} - c(a|x_t) - \hat{C}_t^b \right)} $$ $$ \lambda_t^b = \lambda_{t-1}^b + \gamma \left(\tau - s_{t-1} - c(a_t|x_t) - \hat{C}_t^b \right),~~b=1,\dots,B. $$ #### <span style="color:blue"> Iterative Solution (Exact hinge):</span> $$ a_t = \arg\max_a~ u(a|x_t) + \color{blue}{\phi^\top \left(\tau - s_{t-1} - c(a|x_t) - C_t \right)_+} $$ #### <span style="color:blue"> Iterative Solution (Approximate Hinge):</span> $$ a_t = \arg\max_a~ u(a|x_t) + \color{blue}{\frac{1}{B}\sum_{b=1}^B \phi^\top \left(\tau - s_{t-1} - c(a|x_t) - \hat{C}_t^b \right)_+} $$ #### <span style="color:orange"> Iterative Solution (Exact $C$):</span> $$ a_t = \arg\max_a~ u(a|x_t) + \left(\lambda_{t-1} \right)_{[0,\phi]}^\top c(a|x_t) $$ $$ \lambda_t = \lambda_{t-1} + \gamma \left(\tau - s_{t-1} - c(a_t|x_t) - \color{orange}{C}_t \right) $$ #### <span style="color:orange"> Iterative Solution (Exact Hinge $C$):</span> $$ a_t = \arg\max_a~ u(a|x_t) + \left(\lambda_{t-1} \right)_{[0,\phi]}^\top \left(\tau - s_{t-1} - c(a_t|x_t) - \color{orange}{C}_t \right)_+ $$ $$ \lambda_t = \lambda_{t-1} + \gamma \left(\tau - s_{t-1} - c(a_t|x_t) - \color{orange}{C}_t \right) $$ #### Test: $\Sigma_t = \text{argmax}_{\Sigma \in \Delta} \left[ r^\top \Sigma e - \lambda^\top (\frac{t}{T} \tau - [s_{t-1} + W\Sigma e] )_+ \right]$ $a = \text{argmax}_{a} \left[ u(a| x_t) - \lambda^\top (\frac{t}{T} \tau - [s_{t-1} + c(a|x_t)]) \right]$ }