DSA HW3 - HackMD

# DSA HW3 1. minimax 2. alpha-beta pruning 3. memorization 4. heuristic method --- ## minimax ---- ### game tree - A node represents a state of the game - An edge represents the player's feasible strategy and state transition - Root node is the initial state of the game - Leaf nodes are the states at the end of the game - Each player moves optimally ---- sample: 3 DQ H3 H2 HA D6 D3 ---- ![](https://i.imgur.com/W9xhiFy.png =x600) ---- For each player, when he wins, let the score be the remaining points of the loser when he loses, let the score be the remaining points of himself Then the goal becomes to maximize this score ---- In addition, you can find that it is a zero-sum game Therefore, we can use the minimax algorithm ---- ![](https://i.imgur.com/KObbHhL.png =x600) --- ## alpha-beta pruning ---- ![](https://i.imgur.com/INcK1CI.png) ---- ![](https://i.imgur.com/2dPhxNe.png) ---- ![](https://i.imgur.com/oLpB0Tk.png =x600) ---- Max layers generate lower bound to the subtree Min layers generate upper bound to the subtree Each node receives two additional parameters: $alpha,beta$, which represent the lower bound and upper bound of the answer respectively The answer is useful only if it is between $alpha\sim beta$ ---- Max layers update $alpha$ after computing a branch Min layers update $beta$ after computing a branch if $beta\le alpha$, it means that the node is impossible to return a useful answer So there is no need to calculate remaining branches. --- ## memorization ---- Consider this case: 4 SA S2 D3 D4 D5 D6 H7 H8 The possible order of playing cards are: 1. SA->pass->S2->pass->D3... 2. S2->pass->SA->pass->D3... 3. D3->D5->D4->D6... 4. D4->D5->D3->D6... ... ---- We can find that the state of 1,2 are the same: - both of them have the same remaining cards - the previous card is also the same(D3) Thus, they will lead to the same result So we should only calculate once ---- Store the answer after computing each state (use data structures such as unordered_map) When entering a node, if the state of the node has been computed before, return the answer directly In this way we can avoid repeating computing same states ---- ### How to express a state 1. the remaining cards of both players 2. whose turn it is 3. the previous card 4. $alpha,beta$ ---- #### remaining cards: Each of them has $n$ cards initially Remaining cards are a subset of initial cards Thus, we can use $n$ bits to represent it $i{\text -}th$ bit means whether he still has the $i{\text -}th$ card ---- We need $2n$ bits to store remaining cards of both players $1$ bit to represent the current player $2+4$ bits to represent the suit and the number of previous card respectively $9$ bits for both $alpha$ and $beta$ Thus we can use $59$ bits to express a state: store all of them in an unsigned long long --- ## heuristic method ---- For a game, there is often some intuitive strategies If you can make sure that it is an optimal strategy, you may greatly reduce the states to be searched Besides, choose the search order properly may let alpha-beta pruning algorithm cut more branches Think and test it by yourself !