Game Theory
===
https://math.meta.stackexchange.com/questions/5020/mathjax-basic-tutorial-and-quick-reference

# Week 1: Introduction and Overview

:::success
**Best Response**
Definition:
$a_{-i} = (a_{1},a_{2},...,a_{i-1},a_{i+1},...,a_{n})$
$a = (a_{-i}, a_i)$
$a_i^* \in BR(a_{-i}) \iff \forall a_{i} \in A_{i},\ u_{i}(a_i^* , a_{-i}) \geq u_{i}(a_i , a_{-i})$
> $A$ is the set of all actions; $a$ is the entire action vector, which we call the action profile.
> $a_i$ is player $i$'s own action; $BR(a_{-i})$ is the set of best responses to $a_{-i}$.
> $u_i$ is player $i$'s utility function.
:::

:::success
**Nash Equilibrium**
Definition: $a = \langle a_1,...,a_n \rangle$ is a ("pure strategy") $Nash \ equilibrium \iff \forall i,\ a_i \in BR(a_{-i})$.
:::

:::success
**Strictly Dominates**
Definition: $s_i$ strictly dominates $s_i^{'}$ if $\forall s_{-i} \in S_{-i},\ u_i(s_i, s_{-i}) \gt u_i(s_i^{'},s_{-i})$
:::

:::success
**Very Weakly Dominates**
Definition: $s_i$ very weakly dominates $s_i^{'}$ if $\forall s_{-i} \in S_{-i},\ u_i(s_i, s_{-i}) \ge u_i(s_i^{'},s_{-i})$
:::

:::success
**Pareto Optimality**
Definition: An outcome $o^*$ is Pareto optimal if there is no other outcome that Pareto dominates it.
:::

# Week 2: Mixed Strategies and Nash Equilibrium

:::success
**Best Response**
Definition: $s_i^* \in BR(s_{-i}) \iff \forall s_{i} \in S_{i},\ u_{i}(s_i^* , s_{-i}) \geq u_{i}(s_i , s_{-i})$
:::

:::success
**Nash Equilibrium**
Definition (for mixed strategies): $s = \langle s_1,...,s_n \rangle$ is a ("mixed strategy") $Nash \ equilibrium \iff \forall i,\ s_i \in BR(s_{-i})$.
:::

:::success
**Support**: the set of pure strategies that receive positive probability under a player's mixed strategy.
:::

How do we find the equilibrium mixture? Choose probabilities that make the opponent indifferent: whatever the opponent plays, their expected payoff is the same.

# Week 3: Alternate Solution Concepts

:::success
**Strictly Dominated Strategies**
Definition: A strategy $a_i \in A_i$ is strictly dominated by $a_i^{'} \in A_i$ if $\forall a_{-i} \in A_{-i},\ u_i(a_i, a_{-i}) \lt u_i(a_i^{'},a_{-i})$
:::

:::success
**Weakly Dominated**
Definition: $a_i$ is weakly dominated by $a_i^{'}$ if $u_i(a_i, a_{-i}) \le u_i(a_i^{'},a_{-i})\ \forall a_{-i} \in A_{-i}$, and $\exists\ a_{-i} \in A_{-i}$ such that $u_i(a_i, a_{-i}) \lt u_i(a_i^{'},a_{-i})$
:::

Weakly dominated strategies can also be removed iteratively, but:
1. They can be best replies.
2. The order of removal can matter.
3. At least one equilibrium is preserved.

:::success
**Maxmin**
Definition: The $maxmin \ strategy$ for player $i$ is $\arg\max_{s_i} \min_{s_{-i}} u_i(s_i,s_{-i})$, and the $maxmin \ value$ for player $i$ is $\max_{s_i}\min_{s_{-i}} u_i(s_i,s_{-i})$.
:::

Why would $i$ want to play a maxmin strategy?
* a conservative agent maximizing worst-case payoff

:::success
**Minmax**
Definition: In a two-player game, the $minmax \ strategy$ for player $i$ against player $-i$ is $\arg\min_{s_i}\max_{s_{-i}} u_{-i}(s_i,s_{-i})$, and player $-i$'s $minmax \ value$ is $\min_{s_i} \max_{s_{-i}} u_{-i}(s_i,s_{-i})$.
:::

:::info
Theorem (**Minimax theorem** (von Neumann, 1928))
In any finite, two-player, zero-sum game, in any $Nash \ equilibrium$ each player receives a payoff that is equal to both his maxmin value and his minmax value.
:::

1. Each player's maxmin value is equal to his minmax value. The maxmin value for player 1 is called **the value of the game**.
2. For both players, the set of maxmin strategies coincides with the set of minmax strategies.
3. Any maxmin strategy profile (or, equivalently, minmax strategy profile) is a Nash equilibrium. Furthermore, these are all the Nash equilibria. Consequently, all Nash equilibria have the same payoff vector (namely, those in which player 1 gets the value of the game).
---

# Week 4: Extensive-Form Games

A (finite) perfect-information game (in extensive form) is defined by the tuple $(N,A,H,Z,χ,ρ,σ,u)$, where:
* Players: $N$ is a set of $n$ players
* Actions: $A$ is a (single) set of actions
* Choice nodes and labels for these nodes:
    * Choice nodes: $H$ is a set of non-terminal choice nodes.
    * Action function: $χ : H → 2^{A}$ assigns to each choice node a set of possible actions
    * Player function: $ρ : H \to N$ assigns to each non-terminal node $h$ a player $i \in N$ who chooses an action at $h$
* Terminal nodes: $Z$ is a set of terminal nodes, disjoint from $H$
* Successor function: $σ : H \times A \to H \cup Z$ maps a choice node and an action to a new choice node or terminal node such that $\forall h_1,h_2 \in H$ and $a_1,a_2 \in A$, if $σ(h_1,a_1) = σ(h_2,a_2)$ then $h_1 = h_2$ and $a_1 = a_2$
* Utility function: $u = (u_1,...,u_n)$; $u_i : Z \to \Bbb R$ is a utility function for player $i$ on the terminal nodes $Z$

:::success
**Pure strategies**
Definition: Let $G = (N,A,H,Z,χ,ρ,σ,u)$ be a perfect-information extensive-form game. Then the pure strategies of player $i$ consist of the cross product
$$\prod_{h \in H, \ ρ(h)=i}χ(h)$$
:::

:::success
**Subgame** of $G$ rooted at $h$
Definition: The subgame of $G$ rooted at $h$ is the restriction of $G$ to the descendants of $h$.
:::

:::success
**Subgames** of $G$
Definition: The set of subgames of $G$ is defined by the subgames of $G$ rooted at each of the nodes in $G$.
:::

$s$ is a **subgame perfect equilibrium** of $G \iff \forall$ **subgames** $G^′$ of $G$, the restriction of $s$ to $G^′$ is a Nash equilibrium of $G^′$
* NOTE:
    * since $G$ is its own subgame, every **SPE** is a **NE**.
    * this definition rules out “non-credible threats”

:::success
**Imperfect-information Game (in extensive form)**
Definition: An imperfect-information game (in extensive form) is a tuple $(N,A,H,Z,χ,ρ,σ,u,I)$, where $(N,A,H,Z,χ,ρ,σ,u)$ is a perfect-information extensive-form game, and $I = (I_1,...,I_n)$, where $I_i = (I_{i,1},...,I_{i,k_i})$ is an $equivalence \ relation$ on (that is, a partition of) $\{h \in H : ρ(h) = i\}$ with the property that $χ(h) = χ(h^′)$ <font size=2 color=red>(the same actions are available at these nodes)</font> and $ρ(h) = ρ(h^′)$ <font size=2 color=red>(the same player moves at these nodes)</font> whenever there exists a $j$ for which $h \in I_{i,j}$ and $h^′ \in I_{i,j}$.
:::

# Week 5: Repeated Games

:::success
**Average reward**
Definition: Given an infinite sequence of payoffs $r_1,r_2,...$ for player $i$, the average reward of $i$ is:
$$\lim_{k \to \infty} \sum^k_{j=1} {r_j\over k}$$
:::

:::success
**Future discounted reward**
Definition: Given an infinite sequence of payoffs $r_1,r_2,...$ for player $i$ and discount factor $\beta$ with $0 \lt \beta \lt 1$, $i$'s future discounted reward is
$$\sum^{\infty}_{j=1} {\beta}^j r_j$$
:::

Two equivalent interpretations of the discount factor:
1. the agent cares more about his well-being in the near term than in the long term
2. the agent cares about the future just as much as the present, but with probability $1-\beta$ the game will end in any given round

:::success
**Stochastic game**
Definition: A stochastic game is a tuple $(Q,N,A,P,R)$, where
* $Q$ is a finite set of states
* $N$ is a finite set of $n$ players
* $A = A_1 \times ···\times A_n$, where $A_i$ is a finite set of actions available to player $i$
* $P : Q \times A \times Q \to [0,1]$ is the transition probability function; $P(q,a, \hat q)$ is the probability of transitioning from state $q$ to state $\hat q$ after joint action $a$
* $R = (r_1,...,r_n)$, where $r_i : Q \times A \to \Bbb R$ is a real-valued payoff function for player $i$.
:::

:::success
**Fictitious play**
For every $a \in A$, let $w(a)$ be the number of times the opponent has played action $a$. (The counts can be initialized to non-zero starting values.)
Assess the opponent's strategy using these counts:
$$σ(a) = {w(a) \over \sum_{a^{'} \in A} w(a^{'})}$$
:::

:::info
Theorem (about fictitious play): **If** the empirical distribution of each player's strategies **converges** in fictitious play, **then** it converges to a $Nash \ equilibrium$.

--

Each of the following is a **sufficient condition** for the empirical frequencies of play to converge in fictitious play:
1. The game is zero sum;
2. The game is solvable by iterated elimination of strictly dominated strategies;
3. The game is a potential game;
4. The game is $2 \times n$ and has generic payoffs.
:::

:::success
**Regret**
Definition: The regret an agent experiences at time $t$ for not having played $s$ is $R^t(s) = \max(α^t(s)−α^t, 0)$, where $α^t$ is the agent's average payoff through time $t$ and $α^t(s)$ is the average payoff he would have obtained by playing $s$ instead.
:::

:::success
**No-regret learning rule**
Definition: A learning rule exhibits no regret if for any pure strategy of the agent $s$ it holds that
$$\Pr\left(\lim_{t \to \infty} R^t(s) \le 0\right) = 1$$
:::

Regret matching: at each time step each action is chosen with probability proportional to its regret. That is:
$${\sigma}^{t+1}_i (s) = {R^t(s) \over \sum_{s^{'} \in S_i} R^t(s^{'})}$$
where ${\sigma}^{t+1}_i (s)$ is the probability that agent $i$ plays pure strategy $s$ at time $t + 1$. Regret matching converges to a correlated equilibrium for finite games.

### Equilibria of Infinitely Repeated Games

Consider any $n$-player game $G = (N,A,u)$ and any payoff vector $r = (r_1,r_2,...,r_n)$. Let
$$v_i = \min_{s_{-i} \in S_{-i}} \max_{s_i \in S_i} u_i(s_{-i},s_i)$$

:::success
**Enforceable**
Definition: A payoff profile $r$ is enforceable if $r_i \ge v_i$ for all $i$.
:::

:::success
**Feasible**
Definition: A payoff profile $r$ is feasible if there exist rational, non-negative values ${\alpha}_a$ such that $\forall i$ we can express $r_i$ as $\sum_{a \in A} {\alpha}_a\, u_i(a)$, with $\sum_{a \in A} {\alpha}_a = 1$.
:::

:::info
Folk theorem: Consider any $n$-player game $G$ and any payoff vector $r = (r_1,r_2,...,r_n)$.
1. If $r$ is the payoff in any Nash equilibrium of the infinitely repeated $G$ with average rewards, then for each player $i$, $r_i$ is enforceable.
2. If $r$ is both feasible and enforceable, then $r$ is the payoff in some Nash equilibrium of the infinitely repeated $G$ with average rewards.
:::

### Discounted Repeated Games

* Stage game: $(N,A,u)$
* Discount factors: $β_1,...,β_n$, with $β_i \in [0,1]$
* Assume a common discount factor for now: ${\beta}_i = {\beta} \ \forall i$
* Payoff from a play of actions $a_1,...,a_t,...$:
$$\sum_t {\beta}^t_i \, u_i(a_t)$$

Setting the payoff from cooperating equal to the payoff from defecting gives the critical value of $\beta$; if the cooperation payoff is below the defection payoff regardless of $\beta$, the player always defects.

# Week 6: Bayesian Games

We'll still assume:
1. All possible games have the same number of agents and the same strategy space for each agent; **differing only in payoffs**.
2. Agents' beliefs are posteriors, obtained by conditioning a common prior on individual private signals.

:::success
**Bayesian Game**
Definition: A Bayesian game is a tuple $(N,G,P,I)$ where
* $N$ is a set of agents,
* $G$ is a set of games with $N$ agents each such that if $g, g^{'} \in G$ then for each agent $i \in N$ the strategy space in $g$ is identical to the strategy space in $g^{'}$, <font size=2 color=red>(for these, $G = (N,A,u)$)</font>
* $P \in {\Pi}(G)$ is a common prior over games, where ${\Pi}(G)$ is the set of all probability distributions over $G$ <font size=2 color=red>(the probability of each game $g$)</font>, and
* $I = (I_1,...,I_n)$ is a set of partitions of $G$, one for each agent.
<font size=2 color=red>(so each $I_i$ is a set of equivalence classes)</font>
:::

:::success
**Bayesian Game**
Definition (in epistemic types): A Bayesian game is a tuple $(N,A,{\Theta},p,u)$ where
* $N$ is a set of agents,
* $A = (A_1,...,A_n)$, where $A_i$ is the set of actions available to player $i$,
* ${\Theta} = ({\Theta}_1,...,{\Theta}_n)$, where ${\Theta}_i$ is the type space of player $i$,
* $p : {\Theta} \to [0,1]$ is the common prior over types,
* $u = (u_1,...,u_n)$, where $u_i : A \times {\Theta} \to \Bbb R$ is the utility function for player $i$.
:::

### Strategies

Given a Bayesian game $(N,A,{\Theta},p,u)$ with finite sets of players, actions, and types, strategies are defined as follows:
* Pure strategy: $s_i : {\Theta}_i \to A_i$
    * a choice of a pure action for player $i$ as a function of his or her type.
* Mixed strategy: $s_i : {\Theta}_i \to {\Pi}(A_i)$
    * a choice of a mixed action for player $i$ as a function of his or her type.
* $s_i(a_i|{\theta}_i)$
    * the probability under mixed strategy $s_i$ that agent $i$ plays action $a_i$, given that $i$'s type is ${\theta}_i$.

### Expected Utility

Three standard notions of expected utility:
1. ex-ante
    * the agent knows nothing about anyone's actual type;
2. interim
    * an agent knows her own type but not the types of the other agents;
3. ex-post
    * the agent knows all agents' types.
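The difference between the ex-ante and interim notions can be illustrated with a toy Bayesian game. All of the numbers, type names, and strategies below are made up for the example; it assumes player 1 has two equally likely types while player 2 has a single type.

```python
# Toy illustration of interim vs. ex-ante expected utility.
# Player 1 has two equally likely types; player 2 has one type.
import itertools

prior = {"t1": 0.5, "t2": 0.5}   # common prior over player 1's types
# u1[type][(a1, a2)]: player 1's payoff (illustrative numbers)
u1 = {"t1": {(0, 0): 2, (0, 1): 0, (1, 0): 0, (1, 1): 1},
      "t2": {(0, 0): 0, (0, 1): 2, (1, 0): 1, (1, 1): 0}}

# s1[type][action]: player 1's (type-contingent) mixed strategy
s1 = {"t1": {0: 1.0, 1: 0.0}, "t2": {0: 0.0, 1: 1.0}}
# s2[action]: player 2's mixed strategy (no private type)
s2 = {0: 0.5, 1: 0.5}

def interim_eu(theta):
    """Player 1's expected utility, knowing her own type theta."""
    return sum(s1[theta][a1] * s2[a2] * u1[theta][(a1, a2)]
               for a1, a2 in itertools.product([0, 1], repeat=2))

def ex_ante_eu():
    """Expectation of the interim utility over player 1's own types."""
    return sum(prior[t] * interim_eu(t) for t in prior)

print(interim_eu("t1"))   # 1.0
print(interim_eu("t2"))   # 0.5
print(ex_ante_eu())       # 0.75
```

The ex-ante value is just the prior-weighted average of the interim values, which is the relationship the equivalence of the two equilibrium formulations relies on.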
Given a Bayesian game $(N,A,{\Theta},p,u)$ with finite sets of players, actions, and types, $i$'s **interim expected utility** with respect to type ${\theta}_i$ and a mixed strategy profile $s$ is
$${EU}_i(s|{\theta}_i) = \sum_{ {\theta}_{-i} \in {\Theta}_{-i} } p({\theta}_{-i}|{\theta}_i) \sum_{a \in A}\left(\prod_{j \in N} s_j(a_j|{\theta}_j)\right) u_i(a,{\theta}_i,{\theta}_{-i})$$

$i$'s **ex-ante** expected utility with respect to a mixed strategy profile $s$ is:
$$EU_i(s) = \sum_{{\theta}_i \in {\Theta}_i } p({\theta}_i) \cdot {EU}_i(s|{\theta}_i)$$

A Bayesian equilibrium is a mixed strategy profile $s$ that satisfies, $\forall i$ and ${\theta}_i \in {\Theta}_i$:
$$s_i \in \arg\max_{s^′_i} EU_i(s^′_i,s_{-i}|{\theta}_i)$$
The above is defined based on interim maximization. It is equivalent to an ex-ante formulation: if $p({\theta}_i) \gt 0 \ \forall {\theta}_i \in {\Theta}_i$, then this is equivalent to requiring that $\forall i$
$$s_i \in \arg\max_{s^′_i} EU_i(s^′_i,s_{-i}) = \arg\max_{s^{'}_i} \sum_{{\theta}_i} p({\theta}_i)\,EU_i(s^′_i,s_{-i}|{\theta}_i)$$

# Week 7: Coalitional Games

We are not concerned with:
* how the agents make individual choices within a coalition
* how they coordinate

:::warning
**Transferable utility** assumption:
* payoffs may be redistributed among a coalition's members.
* satisfied whenever payoffs are dispensed in a universal currency.
* each coalition can be assigned a single value as its payoff.
:::

:::success
**Coalitional game with transferable utility**
Definition: A coalitional game with transferable utility is a pair $(N,v)$, where
* $N$ is a finite set of players, indexed by $i$, and
* $v : 2^{N} \to \Bbb R$ associates with each coalition $S \subseteq N$ a real-valued payoff $v(S)$ that the coalition's members can distribute among themselves. We assume that $v(\emptyset) = 0$.
:::

:::success
**Superadditive game**
Definition: A game $G = (N,v)$ is superadditive if $\forall S,T \subset N$, if $S \cap T = \emptyset$, then $v(S \cup T) \ge v(S) + v(T)$.
:::

Superadditivity is justified when coalitions can always work without interfering with one another.

## Shapley's axioms

Players $i$ and $j$ are interchangeable if $\forall S$ that contains neither $i$ nor $j$, $v(S \cup \{i\}) = v(S \cup \{j\})$.
:::warning
**Symmetry**: $\forall v$, if $i$ and $j$ are interchangeable then $\psi_i(N,v) = \psi_j(N,v)$.
:::

$i$ is a dummy player if the amount that $i$ contributes to any coalition is 0.
:::warning
**Dummy player**: $\forall v$, if $i$ is a dummy player then $\psi_i(N,v) = 0$.
:::

:::warning
**Additivity**: For any two $v_1$ and $v_2$, $\psi_i(N,v_1 + v_2) = \psi_i(N,v_1) + \psi_i(N,v_2)$ for each $i$, where the game $(N,v_1 + v_2)$ is defined by $(v_1 + v_2)(S) = v_1(S) + v_2(S)$ for every coalition $S$.
:::

## Shapley Value

:::info
**Shapley Value** Theorem: Given a coalitional game $(N,v)$, there is a unique payoff division $x(v) = \psi(N,v)$ that divides the full payoff of the grand coalition and that satisfies the **Symmetry**, **Dummy player**, and **Additivity** axioms: the Shapley Value.
:::

For each $i$,
$$\psi_i(N,v) = {1 \over \vert N\vert !} \sum_{S \subseteq N \setminus \{i\}} \vert S\vert ! \cdot (\vert N\vert - \vert S\vert - 1)! \cdot [v(S \cup \{i\}) - v(S)]$$

## Core

The core, unlike the Shapley value, is a set of payoff vectors.
:::success
**Core**
Definition: A payoff vector $x$ is in the core of a coalitional game $(N,v)$ $\iff \forall S \subseteq N$, $\sum_{i \in S}x_i \ge v(S)$
:::

Is the core always nonempty?
> **NO.**
> Consider again the voting game (four parties share 100 votes, and any coalition with a majority may decide how to divide \$100 million).
> The set of minimal coalitions that meet the required 51 votes is $\{A,B\}, \{A,C\}, \{A,D\}$, and $\{B,C,D\}$.
> If the sum of the payoffs to parties $B$, $C$, and $D$ is less than \$100 million, then this set of agents has incentive to deviate.
> If $B$, $C$, and $D$ get the entire payoff of \$100 million, then $A$ will receive \$0 and will have incentive to form a coalition with whichever of $B$, $C$, and $D$ obtained the smallest payoff.
> Thus, the core is empty for this game.

Is the core always unique?
> **NO.**
> Consider changing the example so that an 80% majority is required.
> The minimal winning coalitions are now $\{A,B,C\}$ and $\{A,B,D\}$.
> Any complete distribution of the \$100 million among $A$ and $B$ now belongs to the core:
> all winning coalitions need the support of these two parties.
> Thus, the core is not unique for this game.

:::success
**Simple game**
Definition: A game $G = (N,v)$ is simple if $\forall S \subset N$, $v(S) \in \{0,1\}$.
:::

:::success
**Veto**
Definition: A player $i$ is a veto player if $v(N \setminus \{i\}) = 0$.
:::

:::info
In a **simple game** the **core** is **empty** $\iff$ there is **no veto player**.
If there are veto players, the core consists of all payoff vectors in which the nonveto players get 0.
:::

:::success
**Convex game**
Definition: A game $G = (N,v)$ is convex if $\forall S,T \subset N$, $v(S \cup T) \ge v(S) + v(T) - v(S \cap T)$.
:::

> Convexity is a stronger condition than superadditivity.

:::info
Every convex game has a nonempty core.
:::

:::info
In every convex game, the Shapley value is in the core.
:::

2/16/2019 done.
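As a coda to Week 7, the Shapley value formula can be evaluated directly by summing marginal contributions over all coalitions. A minimal sketch; the "glove game" used as the example is a standard illustration, not one of the games from these notes.

```python
# Direct evaluation of the Shapley value formula:
# psi_i = sum over S not containing i of
#         |S|! (|N|-|S|-1)! / |N|! * [v(S ∪ {i}) - v(S)]
from itertools import combinations
from math import factorial

def shapley(players, v):
    """Shapley value of each player in the coalitional game (players, v)."""
    n = len(players)
    phi = {}
    for i in players:
        others = [j for j in players if j != i]
        total = 0.0
        for k in range(len(others) + 1):
            for S in combinations(others, k):
                weight = factorial(k) * factorial(n - k - 1) / factorial(n)
                total += weight * (v(frozenset(S) | {i}) - v(frozenset(S)))
        phi[i] = total
    return phi

# Glove game: players 1 and 2 each own a left glove, player 3 a right
# glove; a coalition is worth the number of matched pairs it can form.
def v(S):
    return min(len(S & {1, 2}), len(S & {3}))

# Player 3 gets 2/3 of the surplus; players 1 and 2 get 1/6 each.
print(shapley([1, 2, 3], v))
```

The game is superadditive but not symmetric, and the scarce right glove earns its owner most of the value, which matches the marginal-contribution reading of the formula.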