Game Theory
===
https://math.meta.stackexchange.com/questions/5020/mathjax-basic-tutorial-and-quick-reference
# Week 1: Introduction and Overview
:::success
**Best Response** Definition:
$a_{-i} = (a_{1},\dots,a_{i-1},a_{i+1},\dots,a_{n})$
$a = (a_{-i}, a_i)$
$a_i^* \in BR(a_{-i}) \iff \forall a_{i} \in A_{i}, u_{i}(a_i^* , a_{-i}) \geq u_{i}(a_i , a_{-i})$
> $A_i$ is player $i$'s set of actions; $a$ is the entire action vector, which we call the action profile.
> $a_i$ is player $i$'s own action, and $BR(a_{-i})$ is the set of best responses to $a_{-i}$.
> $u_i$ is player $i$'s utility function.
:::
:::success
**Nash Equilibrium** Definition:
$a = (a_1,\dots,a_n)$ is a ("pure strategy")
$Nash \ equilibrium \iff \forall i, \ a_i \in BR(a_{-i}).$
:::
:::success
**Strictly Dominates** Definition:
$s_i$ strictly dominates $s_i'$ if $\forall s_{-i} \in S_{-i}, \ u_i(s_i, s_{-i}) \gt u_i(s_i', s_{-i})$
:::
:::success
**Very Weakly Dominates** Definition:
$s_i$ very weakly dominates $s_i'$ if $\forall s_{-i} \in S_{-i}, \ u_i(s_i, s_{-i}) \ge u_i(s_i', s_{-i})$
:::
:::success
**Pareto Optimality** Definition:
An outcome $o^*$ is Pareto optimal if there is no other outcome that Pareto dominates it.
:::
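These definitions can be checked mechanically on small normal-form games by enumerating action profiles. A minimal sketch (the Prisoner's Dilemma payoffs are an assumed example, not from the notes):

```python
from itertools import product

def pure_nash_equilibria(payoffs, n_actions):
    """Enumerate pure-strategy Nash equilibria of an n-player game.

    payoffs[i][a] is player i's utility at action profile a (a tuple).
    A profile is a Nash equilibrium iff every player's action is a
    best response to the others' actions.
    """
    n = len(payoffs)
    equilibria = []
    for a in product(*[range(k) for k in n_actions]):
        def is_best_response(i):
            # u_i(a_i*, a_-i) >= u_i(a_i, a_-i) for every alternative a_i
            deviations = (a[:i] + (ai,) + a[i+1:] for ai in range(n_actions[i]))
            return all(payoffs[i][a] >= payoffs[i][d] for d in deviations)
        if all(is_best_response(i) for i in range(n)):
            equilibria.append(a)
    return equilibria

# Prisoner's Dilemma (assumed payoffs): action 0 = Cooperate, 1 = Defect.
u1 = {(0, 0): -1, (0, 1): -4, (1, 0): 0, (1, 1): -3}
u2 = {(0, 0): -1, (0, 1): 0, (1, 0): -4, (1, 1): -3}
print(pure_nash_equilibria([u1, u2], [2, 2]))  # [(1, 1)]: mutual defection
```

Brute-force enumeration like this only works for small finite games, but it follows the best-response definition literally.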
# Week 2: Mixed Strategies and Nash Equilibrium
:::success
**Best Response** Definition (for mixed strategies):
$s_i^* \in BR(s_{-i}),\iff \forall s_{i} \in S_{i}, u_{i}(s_i^* , s_{-i}) \geq u_{i}(s_i , s_{-i})$
:::
:::success
**Nash Equilibrium** Definition (for mixed strategies):
$s = (s_1,\dots,s_n)$ is a ("mixed strategy")
$Nash \ equilibrium \iff \forall i, \ s_i \in BR(s_{-i}).$
:::
:::success
**Support** Definition:
The support of a mixed strategy $s_i$ is the set of pure strategies that receive positive probability under $s_i$.
:::
How do you mix optimally? Choose your probabilities so that your opponent's expected payoff is the same no matter which action they pick (the indifference condition).
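In a 2×2 game, the equilibrium mix makes the opponent indifferent between their two pure strategies, which gives one linear equation in one unknown. A minimal sketch (Matching Pennies is an assumed example):

```python
def indifference_mix(opp_payoffs):
    """Solve for p = Pr(row 0) that makes the COLUMN player indifferent
    in a 2x2 game.  opp_payoffs[r][c] is the column player's payoff.

    Indifference: p*u(0,0) + (1-p)*u(1,0) = p*u(0,1) + (1-p)*u(1,1)
    """
    u = opp_payoffs
    denom = (u[0][0] - u[1][0]) - (u[0][1] - u[1][1])
    p = (u[1][1] - u[1][0]) / denom
    return p

# Matching Pennies: the column player's payoffs (zero-sum vs. the row player).
col = [[-1, 1],
       [1, -1]]
print(indifference_mix(col))  # 0.5: mix the two actions uniformly
```

This assumes an interior solution exists (no pure-strategy saddle point); otherwise the denominator can be zero.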
# Week 3: Alternate Solution Concepts
:::success
**Strictly Dominated Strategies** Definition:
A strategy $a_i \in A_i$ is strictly dominated by $a_i' \in A_i$ if $\forall a_{-i} \in A_{-i}, \ u_i(a_i, a_{-i}) \lt u_i(a_i', a_{-i})$
:::
:::success
**Weakly Dominated** Definition:
$a_i$ is weakly dominated by $a_i'$ if $u_i(a_i, a_{-i}) \le u_i(a_i', a_{-i}) \ \forall a_{-i} \in A_{-i}$, and
$\exists \ a_{-i} \in A_{-i}$ such that $u_i(a_i, a_{-i}) \lt u_i(a_i', a_{-i})$
:::
Weakly dominated strategies can be removed iteratively too, but:
1. They can be best replies.
2. Order of removal can matter.
3. At least one equilibrium preserved.
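By contrast, iterated removal of *strictly* dominated strategies is order-independent and preserves all Nash equilibria. A sketch for two-player matrix games (the Prisoner's Dilemma payoffs are an assumed example; only domination by pure strategies is checked):

```python
def iterated_strict_dominance(u1, u2):
    """Iteratively remove pure strategies strictly dominated by another
    pure strategy.  u1, u2: payoff matrices (row player, column player).
    Returns the surviving row and column index sets."""
    rows = set(range(len(u1)))
    cols = set(range(len(u1[0])))
    changed = True
    while changed:
        changed = False
        for r in list(rows):
            # r is dominated if some other row r2 does strictly better
            # against every surviving column.
            if any(all(u1[r2][c] > u1[r][c] for c in cols)
                   for r2 in rows if r2 != r):
                rows.discard(r); changed = True
        for c in list(cols):
            if any(all(u2[r][c2] > u2[r][c] for r in rows)
                   for c2 in cols if c2 != c):
                cols.discard(c); changed = True
    return rows, cols

# Prisoner's Dilemma: Cooperate (index 0) is strictly dominated by Defect (1).
u1 = [[-1, -4], [0, -3]]
u2 = [[-1, 0], [-4, -3]]
print(iterated_strict_dominance(u1, u2))  # ({1}, {1})
```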
:::success
**Maxmin** Definition:
The $maxmin \ strategy$ for player $i$ is $arg\max_{s_i} \min_{s_{-i}} u_i(s_i,s_{-i})$, and the $maxmin \ value$ for player $i$ is $\max_{s_i}\min_{s_{-i}} u_i(s_i,s_{-i})$.
:::
Why would I want to play a maxmin strategy?
* a conservative agent maximizing worst-case payoff
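Over pure strategies, the maxmin value is straightforward to compute: take the worst case of each row, then the best of those. A sketch (the payoff matrix is an assumed example; the maxmin over *mixed* strategies generally needs linear programming):

```python
def pure_maxmin(u):
    """Maxmin strategy and value for the row player over PURE strategies:
    argmax_{s_i} min_{s_-i} u_i(s_i, s_-i).  u[r][c] is the row player's payoff."""
    worst = [min(row) for row in u]   # min over the opponent's actions
    value = max(worst)                # best of the worst cases
    strategy = worst.index(value)
    return strategy, value

# An assumed 2x3 payoff matrix for the row player.
u = [[3, 0, 2],
     [1, 4, 1]]
print(pure_maxmin(u))  # (1, 1): row 1 guarantees payoff at least 1
```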
:::success
**Minmax** Definition:
In a two-player game, the $minmax \ strategy$ for player i against player −i is $arg\min_{s_i}\max_{s_{−i}} u_{−i}(s_i,s_{-i})$, and player −i’s $minmax \ value$ is $\min_{s_i} \max_{s_{-i}} u_{-i}(s_i,s_{-i})$.
:::
:::info
Theorem (**Minimax theorem** (von Neumann, 1928))
In any finite, two-player, zero-sum game, in any $Nash \ equilibrium$ each player receives a payoff that is equal to both his maxmin value and his minmax value.
:::
1. Each player’s maxmin value is equal to his minmax value. The maxmin value for player 1 is called **the value of the game**.
2. For both players, the set of maxmin strategies coincides with the set of minmax strategies.
3. Any maxmin strategy profile (or, equivalently, minmax strategy profile) is a Nash equilibrium. Furthermore, these are all the Nash equilibria. Consequently, all Nash equilibria have the same payoff vector (namely, those in which player 1 gets the value of the game).
---
# Week 4: Extensive-Form Games
A (finite) perfect-information game (in extensive form) is defined by the tuple $(N,A,H,Z,χ,ρ,σ,u)$, where:
* Players: $N$ is a set of n players
* Actions: $A$ is a (single) set of actions
* Choice nodes and labels for these nodes:
* Choice nodes: $H$ is a set of non-terminal choice nodes.
* Action function: $χ : H → 2^{A}$ assigns to each choice node a set of possible actions
* $ρ : H \to N$ assigns to each non-terminal node $h$ a player $i ∈ N$ who chooses an action at $h$
* Terminal nodes: $Z$ is a set of terminal nodes, disjoint from $H$
* Successor function: $σ : H \times A \to H \bigcup Z$ maps a choice node and an action to a new choice node or terminal node such that $\forall$ $h_1,h_2 \in H$ and $a_1,a_2 \in A$, if $σ(h_1,a_1) = σ(h_2,a_2)$ then $h_1 = h_2$ and $a_1 = a_2$
* Utility function: $u = (u_1,...,u_n)$; $u_i : Z \to \Bbb R$ is a utility function for player $i$ on the terminal nodes $Z$
:::success
**Pure strategies** Definition:
Let $G = (N,A,H,Z,χ,ρ,σ,u)$ be a perfect-information extensive-form game. Then the pure strategies of player $i$ consist of the cross product
$$\prod_{h \in H, \ ρ(h)=i}χ(h)$$
:::
:::success
**subgame** of $G$ rooted at $h$ Definition:
The subgame of $G$ rooted at $h$ is the restriction of $G$ to the descendants of $h$.
:::
:::success
**subgames** of $G$ Definition:
The set of subgames of $G$ is defined by the subgames of $G$ rooted at each of the nodes in $G$.
:::
$s$ is a **subgame perfect equilibrium** of $G \iff \forall$ **subgame** $G'$ of $G$, the restriction of $s$ to $G'$ is a Nash equilibrium of $G'$
* NOTE:
* since $G$ is its own subgame, every **SPE** is a **NE**.
* this definition rules out “non-credible threats”
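For finite perfect-information games, a subgame perfect equilibrium outcome can be computed by backward induction: solve every subgame from the leaves up. A minimal sketch over a nested-tuple tree encoding (the encoding and the example game are assumptions, not the notes' formalism):

```python
def backward_induction(node):
    """Solve a perfect-information game tree by backward induction.

    A terminal node is a payoff tuple (u_1, ..., u_n).
    An internal node is (player, [child, ...]) with a 0-based player index.
    Returns the payoff vector of the subgame perfect outcome at this node.
    """
    if isinstance(node, tuple) and all(isinstance(x, (int, float)) for x in node):
        return node                      # terminal: just the payoffs
    player, children = node
    # The mover picks the child whose subgame value is best for them.
    values = [backward_induction(child) for child in children]
    return max(values, key=lambda v: v[player])

# Assumed example tree: player 0 chooses Left/Right, then player 1 moves.
tree = (0, [
    (1, [(3, 1), (0, 0)]),   # Left subtree: player 1 prefers (3, 1)
    (1, [(2, 2), (1, 3)]),   # Right subtree: player 1 prefers (1, 3)
])
print(backward_induction(tree))  # (3, 1): player 0 goes Left
```

Because every subgame is solved optimally, the resulting play contains no non-credible threats.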
:::success
**Imperfect-information Game (in extensive form)** Definition:
An imperfect-information game (in extensive form) is a tuple (N,A,H,Z,χ,ρ,σ,u,I), where (N,A,H,Z,χ,ρ,σ,u) is a perfect-information extensive-form game, and
$I = (I_1,...,I_n)$, where $I_i = (I_{i,1},...,I_{i,k_i})$ is an $equivalence \ relation$ on (that is, a partition of) $\{h \in H : ρ(h) = i\}$ with the property that $χ(h) = χ(h')$ <font size=2 color=red>(the nodes offer the same actions)</font> and $ρ(h) = ρ(h')$ <font size=2 color=red>(the nodes belong to the same player)</font> whenever there exists a $j$ for which $h \in I_{i,j}$ and $h' \in I_{i,j}$.
:::
# Week 5: Repeated Games
:::success
**Average reward** Definition:
Given an infinite sequence of payoffs $r_1,r_2,...$ for player $i$, the average reward of $i$ is:
$$\lim_{k \to \infty} \sum^k_{j=1} {r_j\over k}$$
:::
:::success
**Future discounted reward** Definition:
Given an infinite sequence of payoffs $r_1,r_2,...$ for player $i$ and discount factor ${\beta}$ with $0 \lt \beta \lt 1$, $i$'s future discounted reward is
$$\sum^{\infty}_{j=1} {\beta}^j r_j$$
:::
Two equivalent interpretations of the discount factor:
1. the agent cares more about his well-being in the near term than in the long term
2. the agent cares about the future just as much as the present, but with probability $1-{\beta}$ the game will end in any given round
:::success
**Stochastic game** Definition
A stochastic game is a tuple $(Q,N,A,P,R)$, where
$Q$ is a finite set of states
$N$ is a finite set of n players
$A = A_1 \times ···\times A_n$, where $A_i$ is a finite set of actions available to player $i$
$P : Q \times A \times Q \to [0,1]$ is the transition probability function; $P(q,a,\hat q)$ is the probability of transitioning from state $q$ to state $\hat q$ after joint action $a$
$R = \{r_1,...,r_n\}$, where $r_i : Q \times A \to \Bbb R$ is a real-valued payoff function for player $i$.
:::
:::success
**Fictitious Play**:
For every $a \in A$, let $w(a)$ be the number of times the opponent has played action $a$.
Counts can be initialized to non-zero starting values.
Assess the opponent's strategy using these counts:
$$σ(a) = {w(a) \over \sum_{a' \in A} w(a')}$$
:::
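The counting scheme can be sketched in a short simulation where each player best-responds to the empirical frequency of the opponent's past actions (Matching Pennies is an assumed example; counts start at the non-zero value 1):

```python
def fictitious_play(u1, u2, rounds=2000):
    """Run fictitious play.  u1[r][c], u2[r][c]: row/column player's payoffs.
    Each player tracks counts w(a) of the opponent's actions and best-responds
    to the empirical mixed strategy sigma(a) = w(a) / sum w(a')."""
    n_r, n_c = len(u1), len(u1[0])
    w_col = [1] * n_c   # row player's counts of the column player's actions
    w_row = [1] * n_r   # column player's counts of the row player's actions
    for _ in range(rounds):
        # Best respond to the opponent's empirical distribution (counts
        # are proportional to it, so no normalization is needed).
        r = max(range(n_r), key=lambda i: sum(w_col[c] * u1[i][c] for c in range(n_c)))
        c = max(range(n_c), key=lambda j: sum(w_row[i] * u2[i][j] for i in range(n_r)))
        w_row[r] += 1
        w_col[c] += 1
    total_r, total_c = sum(w_row), sum(w_col)
    return ([x / total_r for x in w_row], [x / total_c for x in w_col])

# Matching Pennies (zero-sum): empirical frequencies approach (1/2, 1/2).
u1 = [[1, -1], [-1, 1]]
u2 = [[-1, 1], [1, -1]]
print(fictitious_play(u1, u2))
```

Zero-sum games are one of the sufficient conditions for convergence listed below, so the frequencies here approach the mixed equilibrium even though actual play keeps cycling.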
:::info
Theorem About Fictitious Play:
**If** the empirical distribution of each player’s strategies **converges** in fictitious play, **then** it converges to a $Nash \ equilibrium$.
--
Each of the following are a **sufficient conditions** for the empirical frequencies of play to converge in fictitious play:
1. The game is zero sum;
2. The game is solvable by iterated elimination of strictly dominated strategies;
3. The game is a potential game;
4. The game is $2 \times n$ and has generic payoffs.
:::
:::success
**Regret** Definition:
The regret an agent experiences at time $t$ for not having played $s$ is $R^t(s) = \max({\alpha}^t(s)−{\alpha}^t, 0)$, where ${\alpha}^t$ is the agent's average per-period payoff up to time $t$ and ${\alpha}^t(s)$ is the average payoff it would have received by playing $s$ in every period instead.
:::
:::success
**No-regret learning rule** Definition:
A learning rule exhibits no regret if for any pure strategy of the agent $s$ it holds that $$\Pr\Big(\lim_{t \to \infty} R^t(s) \le 0\Big) = 1$$
:::
At each time step each action is chosen with probability proportional to its regret. That is:
$${\sigma}^{t+1}_i (s) = {R^t(s) \over {\sum}_{s^{'} \in S_i} R^t(s^{'})}$$
where ${\sigma}^{t+1}_i (s)$ is the probability that agent $i$ plays pure strategy $s$ at time $t + 1$.
The empirical distribution of play converges to the set of correlated equilibria in finite games.
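The regret-matching update can be sketched in self-play (Rock-Paper-Scissors is an assumed example; when no regret is positive, this sketch falls back to a uniform random choice):

```python
import random

def regret_matching(u, rounds=5000, seed=0):
    """Self-play regret matching in a symmetric two-player matrix game.
    u[a][b]: payoff to a player choosing a when the opponent chooses b.
    Returns each player's empirical action frequencies."""
    rng = random.Random(seed)
    n = len(u)
    regrets = [[0.0] * n, [0.0] * n]
    counts = [[0] * n, [0] * n]

    def sample(reg):
        pos = [max(r, 0.0) for r in reg]          # only positive regrets count
        if sum(pos) == 0:
            return rng.randrange(n)               # uniform fallback
        return rng.choices(range(n), weights=pos)[0]

    for _ in range(rounds):
        a = sample(regrets[0])
        b = sample(regrets[1])
        counts[0][a] += 1
        counts[1][b] += 1
        for s in range(n):  # cumulative regret for not having played s instead
            regrets[0][s] += u[s][b] - u[a][b]
            regrets[1][s] += u[s][a] - u[b][a]
    return [[c / rounds for c in cs] for cs in counts]

# Rock-Paper-Scissors: frequencies approach the uniform (1/3, 1/3, 1/3).
rps = [[0, -1, 1], [1, 0, -1], [-1, 1, 0]]
print(regret_matching(rps))
```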
### Equilibria of Infinitely Repeated Games
Consider any n-player game $G = (N,A,u)$ and any payoff vector $r = (r_1,r_2,...,r_n)$.
Let $$v_i = \min_{s_{-i} \in S_{-i}} \max_{s_i \in S_i} u_i(s_{-i},s_i)$$
:::success
**Enforceable** definition:
A payoff profile $r$ is **enforceable** if $r_i \ge v_i$ for all $i$.
:::
:::success
**feasible** definition:
A payoff profile $r$ is feasible if there exist rational, non-negative values ${\alpha}_a \ such \ that \ \forall i$, we can express $r_i$ as ${\sum}_{a \in A} \ {\alpha}_a \ u_i(a)$, with ${\sum}_{a \in A} \ {\alpha}_a = 1$.
:::
:::info
Folk theorem:
Consider any n-player game $G$ and any payoff vector $r = (r_1,r_2,...,r_n)$.
1. If $r$ is the payoff in any Nash equilibrium of the infinitely repeated G with average rewards, then for each player $i$, $r_i$ is enforceable.
2. If $r$ is both feasible and enforceable, then $r$ is the payoff in some Nash equilibrium of the infinitely repeated $G$ with average rewards.
:::
### Discounted Repeated Games
Stage game: $(N,A,u)$
Discount factors: $β_1,...,β_n$, with $β_i \in [0,1]$
Assume a common discount factor for now: ${\beta}_i = {\beta} \ \forall i$
Payoff from a play of actions $a_1,...,a_t,...$:
$${\sum}_t {\beta}^t_i \, u_i(a_t)$$
There is a critical value of $\beta$ at which the payoff from cooperating equals the payoff from defecting.
When the cooperation payoff is below the defection payoff, the player defects forever.
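The critical $\beta$ for sustaining cooperation with a grim-trigger strategy can be computed in closed form. A sketch under assumed Prisoner's Dilemma payoffs (cooperate $c$ per round, one-shot temptation $d$, punishment $p$ per round after any defection): cooperation is sustainable iff $c/(1-\beta) \ge d + \beta p/(1-\beta)$, i.e. $\beta \ge (d-c)/(d-p)$.

```python
from fractions import Fraction

def critical_beta(c, d, p):
    """Smallest discount factor at which grim trigger sustains cooperation.

    c: per-round payoff from mutual cooperation
    d: one-shot payoff from defecting against a cooperator
    p: per-round punishment payoff once the trigger fires (mutual defection)
    Cooperation is sustainable iff c/(1-b) >= d + b*p/(1-b),
    which rearranges to b >= (d - c) / (d - p).
    """
    return Fraction(d - c, d - p)

# Assumed payoffs: cooperate 3 per round, temptation 5, punishment 1.
print(critical_beta(3, 5, 1))  # 1/2
```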
# Week 6: Bayesian Games
We’ll still assume:
1. All possible games have the same number of agents and the same strategy space for each agent; **differing only in payoffs**.
2. Agents’ beliefs are posteriors, obtained by conditioning a common prior on individual private signals.
:::success
**Bayesian Game** Definition:
A Bayesian game is a tuple $(N,G,P,I)$ where
$N$ is a set of agents,
$G$ is a set of games with $N$ agents each such that if $g, \ g^{'} \in G$ then for each agent $i \in N$ the strategy space in $g$ is identical to the strategy space in $g^{'}$, <font size=2 color=red>(for these $G = (N,A,u)$)</font>
$P \in {\Pi}(G)$ is a common prior over games, where ${\Pi}(G)$ is the set of all probability distributions over $G$ <font size=2 color=red>(the probability of each game $g$)</font>, and
$I = (I_1,...,I_n)$ is a set of partitions of $G$, one for each agent. <font size=2 color=red>(so each $I_i$ induces an equivalence relation on $G$)</font>
:::
:::success
**Bayesian Game** Definition(in Epistemic type):
A Bayesian game is a tuple $(N,A,{\Theta},p,u)$ where
$N$ is a set of agents,
$A = (A_1,...,A_n)$, where $A_i$ is the set of actions available to player $i$,
${\Theta} = {\Theta}_1 \times \cdots \times {\Theta}_n$, where ${\Theta}_i$ is the type space of player $i$,
$p : {\Theta} \to [0,1]$ is the common prior over types,
$u = (u_1,...,u_n)$, where $u_i : A \times {\Theta} \to \Bbb R$ is the utility function for player $i$.
:::
### Strategies
Given a Bayesian game $(N,A,{\Theta},p,u)$ with finite sets of players, actions, and types, strategies are defined as follows:
* Pure strategy: $s_i : {\Theta}_i \to A_i$
* a choice of a pure action for player $i$ as a function of his or her type.
* Mixed strategy: $s_i : {\Theta}_i \to {\Pi}(A_i)$
* a choice of a mixed action for player $i$ as a function of his or her type.
* $s_i(a_i|{\theta}_i)$
* the probability under mixed strategy $s_i$ that agent $i$ plays action $a_i$, given that $i$'s type is ${\theta}_i$.
### Expected Utility
Three standard notions of expected utility:
1. ex-ante
* the agent knows nothing about anyone’s actual type;
2. interim
* an agent knows her own type but not the types of the other agents;
3. ex-post
* the agent knows all agents’ types.
Given a Bayesian game $(N,A,{\Theta},p,u)$ with finite sets of players, actions, and types, $i$’s **interim expected utility** with respect to type ${\theta}_i$ and a mixed strategy profile $s$ is
$${EU}_i(s|{\theta}_i) = \sum_{ {\theta}_{-i} \in {\Theta}_{-i} } p({\theta}_{-i}|{\theta}_i) \cdot \Big(\sum_{a \in A}\big(\prod_{j \in N} s_j(a_j|{\theta}_j)\big) \cdot u_i(a,{\theta}_i,{\theta}_{-i})\Big)$$
$i$'s **ex-ante** expected utility with respect to a mixed strategy profile $s$ is:
$${EU}_i(s) = \sum_{{\theta}_i \in {\Theta}_i} p({\theta}_i) \cdot {EU}_i(s|{\theta}_i)$$
A Bayesian equilibrium is a mixed strategy profile s that satisfies: $\forall i$ and ${\theta}_i \in {\Theta}_i$
$$s_i \in arg\max_{s^′_i} EU_i(s^′_i,s_{-i}|{\theta}_i)$$
The above is defined based on interim maximization. It is equivalent to an ex-ante formulation:
If $p({\theta}_i) \gt 0, \ \forall \ {\theta}_i \in {\Theta}_i$, then this is equivalent to requiring that $\forall i$
$$s_i \in arg\max_{s^′_i} EU_i(s^′_i,s_{-i}) = arg\max_{s^{'}_i} \sum_{{\theta}_i} (p({\theta}_i)EU_i(s^′_i,s_{-i}|{\theta}_i))$$
# Week 7: Coalitional Games
We are not concerned with:
* how the agents make individual choices within a coalition
* how they coordinate
:::warning
**Transferable utility** assumption:
* payoffs may be redistributed among a coalition’s members.
* satisfied whenever payoffs are dispensed in a universal currency.
* each coalition can be assigned a single value as its payoff.
:::
:::success
**Coalitional game with transferable utility** Definition:
A coalitional game with transferable utility is a pair $(N,v)$, where
$N$ is a finite set of players, indexed by $i$, and
$v : 2^{N} \to \Bbb R$ associates with each coalition $S \subseteq N$ a real-valued payoff $v(S)$ that the coalition's members can distribute among themselves.
We assume that $v(\emptyset) = 0$.
:::
:::success
**Superadditive game** Definition:
A game $G = (N,v)$ is superadditive if $\forall S,T \subseteq N$, if $S \cap T = \emptyset$, then $v(S \cup T) \ge v(S) + v(T)$.
:::
Superadditivity is justified when coalitions can always work without interfering with one another.
## Shapley’s axioms
Players $i$ and $j$ are interchangeable if $\forall S$ that contains neither $i$ nor $j$, $v(S \cup \{i\}) = v(S \cup \{j\})$.
:::warning
**Symmetry**
$\forall v$, if $i$ and $j$ are interchangeable then $\psi_i(N,v) = \psi_j(N,v)$.
:::
$i$ is a dummy player if the amount that $i$ contributes to any coalition is 0.
:::warning
**Dummy player**
$\forall v$, if $i$ is a dummy player then $\psi_i(N,v) = 0$.
:::
:::warning
**Additivity**
For any two $v_1$ and $v_2$, $\psi_i(N,v_1 + v_2) = \psi_i(N,v_1) + \psi_i(N,v_2)$ for each $i$, where the game $(N,v_1 + v_2)$ is defined by $(v_1 + v_2)(S) = v_1(S) + v_2(S)$ for every coalition $S$.
:::
## Shapley Value
:::info
**Shapley Value** Theorem:
Given a coalitional game $(N,v)$, there is a unique payoff division $x(v) = \psi(N,v)$ that divides the full payoff of the grand coalition and that satisfies the **Symmetry**, **Dummy player** and **Additivity** axioms: the Shapley Value
:::
for each $i$,
$$\psi_i(N,v) = {1 \over {\vert N \vert !}} \sum_{S \subseteq N \setminus \{i\}} \vert S\vert! \cdot (\vert N\vert − \vert S\vert −1)! \cdot [v(S \cup \{i\})−v(S)]$$
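The formula sums each player's marginal contribution $v(S \cup \{i\}) - v(S)$ over all coalitions $S$ not containing $i$, weighted by how often $S$ precedes $i$ in a random ordering. A sketch (the 3-player majority game is an assumed example):

```python
from itertools import combinations
from math import factorial

def shapley_values(n, v):
    """Shapley value for each player in a coalitional game (N, v).

    n: number of players (players are 0..n-1)
    v: function from a frozenset of players to a real payoff, v(empty) = 0
    """
    phi = [0.0] * n
    players = range(n)
    for i in players:
        others = [j for j in players if j != i]
        for k in range(len(others) + 1):
            for S in combinations(others, k):
                S = frozenset(S)
                # |S|! (|N| - |S| - 1)! / |N|!
                weight = factorial(len(S)) * factorial(n - len(S) - 1) / factorial(n)
                phi[i] += weight * (v(S | {i}) - v(S))
    return phi

# Assumed 3-player majority game: a coalition wins 1 if it has >= 2 members.
v = lambda S: 1 if len(S) >= 2 else 0
print(shapley_values(3, v))  # each player gets 1/3 by symmetry
```

Enumerating all coalitions takes $O(2^{|N|})$ evaluations of $v$, so this is only practical for small games.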
## Core
The core is a set of payoff vectors, not a single value; it may be empty.
:::success
**Core** definition:
A payoff vector $x$ is in the core of a coalitional game $(N,v)$
$\iff \forall S \subseteq N$, $\sum_{i \in S}x_i \ge v(S)$
:::
Is the core always nonempty?
> **NO.**
> Consider again the voting game (parties $A$, $B$, $C$, and $D$ hold 45, 25, 15, and 15 votes respectively, and a winning coalition splits \$100 million).
> The set of minimal coalitions that meet the required 51 votes is $\{A,B\}, \{A,C\}, \{A,D\}$, and $\{B,C,D\}$.
> If the sum of the payoffs to parties $B$, $C$, and $D$ is less than \$100 million, then this set of agents has an incentive to deviate.
> If $B$, $C$, and $D$ get the entire payoff of \$100 million, then $A$ will receive \$0 and will have an incentive to form a coalition with whichever of $B$, $C$, and $D$ obtained the smallest payoff.
> Thus, the core is empty for this game.
Is the core always unique?
> **NO.**
> Consider changing the example so that an 80% majority is required.
> The minimal winning coalitions are now $\{A,B,C\}$ and $\{A,B,D\}$.
> Any complete distribution of the \$100 million between $A$ and $B$ now belongs to the core.
> All winning coalitions need the support of these two parties.
> Thus, the core is not unique for this game.
:::success
**Simple game** Definition:
A game $G = (N,v)$ is simple if $\forall S \subseteq N$, $v(S) \in \{0,1\}$.
:::
:::success
**Veto** Definition:
A player $i$ is a veto player if $v(N \setminus \{i\}) = 0$.
:::
:::info
In a **simple game** the **core** is **empty** $\iff$ there is **no veto player**. If there are veto players, the core consists of all payoff vectors in which the nonveto players get 0.
:::
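This characterization is easy to check computationally for small simple games. A sketch (the majority and unanimity games are assumed examples):

```python
def veto_players(n, v):
    """Veto players of a simple game: i is a veto player iff v(N \\ {i}) = 0."""
    N = frozenset(range(n))
    return [i for i in range(n) if v(N - {i}) == 0]

def core_is_empty(n, v):
    """In a simple game, the core is empty iff there is no veto player;
    otherwise it consists of the payoff vectors giving non-veto players 0."""
    return len(veto_players(n, v)) == 0

# Assumed example: 3-player majority game (any 2 of 3 players win).
majority = lambda S: 1 if len(S) >= 2 else 0
print(core_is_empty(3, majority))   # True: no veto player, empty core

# Assumed example: unanimity game (only the grand coalition wins).
unanimity = lambda S: 1 if len(S) == 3 else 0
print(veto_players(3, unanimity))   # [0, 1, 2]: everyone is a veto player
```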
:::success
**Convex game** Definition:
A game $G = (N,v)$ is convex if $\forall S,T \subseteq N$, $v(S \cup T) \ge v(S) + v(T) − v(S \cap T)$.
:::
> Convexity is a stronger condition than superadditivity.
:::info
Every convex game has a nonempty core.
:::
:::info
In every convex game, the Shapley value is in the core.
:::
2/16/2019 done.