Week 4a - Adversarial Search I
===
{%hackmd theme-dark %}
Outline & Objective
---
- Understand the differences between standard search problems and adversarial search problems
- Understand the workings of the Minimax algorithm and α-β pruning
- Be able to use the Minimax algorithm to solve an adversarial/game search problem
- Be able to use α-β pruning to speed up adversarial/game search
Introduction
---
For the previous parts on search, we assumed a simple environment, that is:
- Fully observable
- Deterministic
- Sequential
- Static
- Discrete
- Single-agent
In adversarial search, we no longer assume that the environment is single-agent: there are other agents whose actions affect the outcome of our agent's actions. In addition, the environment could also be stochastic rather than deterministic.
Adversarial/Games vs General Search Problems
---
- Multi-agent vs Single-agent
- "Unpredictable" opponent → solution is a contingency plan (contingent on the opponent's moves). It is also possible for us to look ahead (expect opponent's moves).
- Time limits → unlikely to reach the goal in time, so we must approximate. This is because the search space is usually very large.
- Plan of attack
- Minimax: Algorithm for perfect play (best possible move given the opponent's behavior). This algorithm assumes that the agent can search all the way down to the game's terminal states.
- α-β pruning: Finite time, approximate evaluation (some utility score to evaluate the current state compared to others), pruning (based on utility score)
- Expectiminimax: handling chance (stochastic elements) in games/search
Types of Games
---

In this lecture, we will focus on fully observable, deterministic games.
Game Search Problem Formulation
---

A strategic two-player game can be formally defined by:
- Initial State
- Actions
- Terminal Test (Win/Lose/Draw)
- Utility Function (numerical reward for the outcome)
- Chess: +1, 0, -1
- Poker: Cash won or lost
- In a two-player zero-sum game, the players' utilities for any state are equal and opposite (when the opponent gains 10 points, we lose 10 points)
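A tiny illustrative check of the zero-sum property, using the chess-style +1/0/−1 utilities above (the `chess_outcomes` dictionary is hypothetical, just for demonstration):

```python
# In a two-player zero-sum game, the two players' utilities for any
# terminal state sum to zero. Chess-style outcomes (+1 win, 0 draw, -1 loss):
chess_outcomes = {
    "white wins": (+1, -1),
    "draw": (0, 0),
    "black wins": (-1, +1),
}
assert all(u1 + u2 == 0 for u1, u2 in chess_outcomes.values())
print("zero-sum property holds")
```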
### Example: Game Tree for Tic-tac-toe

Upper bound on the number of possible move sequences in tic-tac-toe: $9!$ (9 choices for the first move, then 8 for the second, and so on; fewer in practice, since games can end before the board is full)
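The $9!$ bound is a quick computation:

```python
import math

# Upper bound on distinct move sequences in tic-tac-toe:
# 9 choices for the first mark, 8 for the second, and so on.
print(math.factorial(9))  # -> 362880
```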
Minimax
---
- Perfect play for deterministic, perfect-information (fully observable) games
- Idea: choose move to position with highest minimax value
- = best achievable payoff against best play
### Minimax Algorithm

`a` will try to maximize their score, `b` will try to minimize `a`'s score.
Based on this logic, `a` chooses the `max` of the `min` possible scores, which in this case is `max([3, 2, 2]) = 3`.
More formally,

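The logic above can be sketched as a short recursive function. The nested-list tree representation here is hypothetical (inner lists are internal nodes, numbers are terminal utilities for the maximizing player), not the lecture's notation:

```python
def minimax(node, maximizing):
    """Return the minimax value of a game-tree node.

    Terminal states (numbers) return their utility directly; internal
    nodes (lists) take the max or min of their children's values,
    alternating between the two players at each ply.
    """
    if isinstance(node, (int, float)):  # terminal state
        return node
    values = [minimax(child, not maximizing) for child in node]
    return max(values) if maximizing else min(values)

# Two-ply example: MAX moves at the root, MIN at each child.
# The min of each subtree is [3, 2, 2], so the root value is 3.
tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
print(minimax(tree, True))  # -> 3
```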
### Minimax Properties
- Completeness: Yes, if tree is finite
- Optimality: Yes, against an optimal (rational) opponent
- Time Complexity: $O(b^m)$
- Space Complexity: $O(bm)$ (depth-first exploration)
The search component of Minimax is essentially DFS: the tree is explored depth-first down to the terminal states, and the minimax values are then propagated back up toward the root.
Time complexity of this algorithm is terrible.
- For chess, b ≈ 35, m ≈ 100 for "reasonable" games
- Time complexity ≈ $O(35^{100})$
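To get a feel for how large $35^{100}$ is, a quick back-of-the-envelope computation:

```python
import math

# Number of decimal digits in 35^100: log10(35^100) = 100 * log10(35)
digits = 100 * math.log10(35)
print(round(digits))  # about 154, i.e. 35^100 is roughly 10^154 nodes
```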