Week 4a - Adversarial Search I
===
{%hackmd theme-dark %}
Outline & Objective
---
- Understand the differences between standard search problems and adversarial search problems
- Understand the workings of the Minimax algorithm and α-β pruning
- Be able to use the Minimax algorithm to solve an adversarial/game search problem
- Be able to use α-β pruning to speed up adversarial/game search
Introduction
---
For the previous parts on search, we assumed a simple environment, that is:
- Fully observable
- Deterministic
- Sequential
- Static
- Discrete
- Single-agent
In adversarial search, we no longer assume that the environment is single-agent: there are other agents whose actions affect the outcome of our agent's actions. In addition, the environment could also be stochastic rather than deterministic.
Adversarial/Games vs General Search Problems
---
- Multi-agent vs Single-agent
- "Unpredictable" opponent → solution is a contingency plan (contingent on the opponent's moves). It is also possible for us to look ahead (expect opponent's moves).
- Time limits → unlikely to reach the goal in time, so we must approximate. This is because the search space is usually very large.
- Plan of attack
- Minimax: Algorithm for perfect play (best possible move given the opponent's behavior). This algorithm assumes that the agent can search all the way down to the game's terminal states.
- α-β pruning: Finite time, approximate evaluation (some utility score to evaluate the current state compared to others), pruning (based on utility score)
- Expectiminimax: handling chance (stochastic elements) in games/search
Types of Games
---

In this lecture, we will focus on fully observable, deterministic games.
Game Search Problem Formulation
---

A strategic two-player game can be formally defined by:
- Initial State
- Actions
- Terminal Test (Win/Lose/Draw)
- Utility Function (numerical reward for the outcome)
- Chess: +1, 0, -1
- Poker: Cash won or lost
- In a two-player zero-sum game, the players' utilities for any state are equal and opposite (when the opponent gains 10 points, we lose 10 points)
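A tiny illustrative check of the zero-sum property, using the chess-style +1/0/−1 utilities above (the `chess_outcomes` dictionary is hypothetical, just for demonstration):

```python
# In a two-player zero-sum game, the two players' utilities for any
# terminal state sum to zero. Chess-style outcomes (+1 win, 0 draw, -1 loss):
chess_outcomes = {
    "white wins": (+1, -1),
    "draw": (0, 0),
    "black wins": (-1, +1),
}
assert all(u1 + u2 == 0 for u1, u2 in chess_outcomes.values())
print("zero-sum property holds")
```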
### Example: Game Tree for Tic-tac-toe

Upper bound on the number of possible move sequences in tic-tac-toe: $9!$ (9 choices for the first move, then 8 for the second, and so on; fewer in practice, since games can end before the board is full)
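The $9!$ bound is a quick computation:

```python
import math

# Upper bound on distinct move sequences in tic-tac-toe:
# 9 choices for the first mark, 8 for the second, and so on.
print(math.factorial(9))  # -> 362880
```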
Minimax
---
- Perfect play for deterministic, perfect-information (fully observable) games
- Idea: choose move to position with highest minimax value
- = best achievable payoff against best play
### Minimax Algorithm

`a` will try to maximize their score, `b` will try to minimize `a`'s score.
Based on this logic, `a` chooses the `max` of the `min` possible scores, which in this case is `max([3, 2, 2]) = 3`.
More formally,

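The logic above can be sketched as a short recursive function. The nested-list tree representation here is hypothetical (inner lists are internal nodes, numbers are terminal utilities for the maximizing player), not the lecture's notation:

```python
def minimax(node, maximizing):
    """Return the minimax value of a game-tree node.

    Terminal states (numbers) return their utility directly; internal
    nodes (lists) take the max or min of their children's values,
    alternating between the two players at each ply.
    """
    if isinstance(node, (int, float)):  # terminal state
        return node
    values = [minimax(child, not maximizing) for child in node]
    return max(values) if maximizing else min(values)

# Two-ply example: MAX moves at the root, MIN at each child.
# The min of each subtree is [3, 2, 2], so the root value is 3.
tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
print(minimax(tree, True))  # -> 3
```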
### Minimax Properties
- Completeness: Yes, if tree is finite
- Optimality: Yes, against an optimal (rational) opponent
- Time Complexity: $O(b^m)$
- Space Complexity: $O(bm)$ (depth-first exploration)
The search component of Minimax is essentially DFS: the tree is explored depth-first down to the terminal states, and the minimax values are then propagated back up toward the root.
Time complexity of this algorithm is terrible.
- For chess, b ≈ 35, m ≈ 100 for "reasonable" games
- Time complexity ≈ $O(35^{100})$
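To get a feel for how large $35^{100}$ is, a quick back-of-the-envelope computation:

```python
import math

# Number of decimal digits in 35^100: log10(35^100) = 100 * log10(35)
digits = 100 * math.log10(35)
print(round(digits))  # about 154, i.e. 35^100 is roughly 10^154 nodes
```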