* State space: all possible ways the game/situation could be
* Initial/Goal State: start and end states (there could be more than one goal state)
* Action Space: set of all actions, where an action is…
* Transition function: takes in the current state + an action, returns the updated state

**<span style="text-decoration:underline;">Uninformed Search (Tree-Search Algos)</span>**

* On tables page

**<span style="text-decoration:underline;">Informed Search (Extra Information from Heuristics)</span>**

* On tables page

**<span style="text-decoration:underline;">Heuristics</span>**

* Some function, theory, or strategy that leads us toward the goal
* Good heuristics can dramatically reduce search cost
* Usually devised by relaxing the problem - take out the constraints, simplify the issue
    * Ex. Euclidean distance (straight-line distance), Manhattan distance (grid distance), number of objects left before the goal state, etc.
* 2 types: a heuristic either overestimates or underestimates the min-cost solution

**<span style="text-decoration:underline;">Admissible</span>**

* A heuristic is admissible if it never overestimates the min cost to the goal
* h(G) = 0 must be true - a good trick to assess admissibility
    * f(G) = g(G) + h(G) = g(G) + 0

**<span style="text-decoration:underline;">Consistency</span>**

* A heuristic is consistent if the heuristic difference never overestimates the actual step cost between neighbors: h(x) ≤ c(x, y) + h(y)
* If h is consistent & admissible, then:
    * f is non-decreasing along every path
    * If f(y) &lt; c∗ then y will eventually be expanded
    * If f(y) > c∗ then y will never be expanded
    * If f(y) = c∗ then y might be expanded
    * c∗ = cost of the optimal solution

**<span style="text-decoration:underline;">Dominance</span>**

* Basically a way to evaluate which heuristic is better
* h_a dominates h_b if h_a(n) ≥ h_b(n) for every node n (assuming both are admissible) - the larger, more informed heuristic is better
* Ex.
    * h_a and h_b are admissible, and h_a dominates h_b
        * During A* search, every node expanded with h_a will also be expanded with h_b
        * The same is true for A∗ graph search if h_a and h_b are consistent
        * Number of expansions using h_a ≤ number of expansions using h_b
    * To get dominance: let h(x) = max(h_a(x), h_b(x)) - the larger-valued heuristic dominates if both are admissible

**<span style="text-decoration:underline;">Constraint Satisfaction Problems (CSPs)</span>**

* Variables {x1, x2, ..., xn} that can take values in domains {D1, D2, ..., Dn}
* State: a set of assignments of values to variables
    * e.g., {x1 = v1, x4 = v3, x2 = v2}
* Constraints: specify allowable combinations of assignments to variables
    * Unary constraint: single variable
        * e.g., x1 ≠ 10
    * Binary constraint: two variables
        * e.g., SA ≠ WA
    * Higher-order constraint: 3 or more variables
        * e.g., Σi xi ≤ 1
* Goal: a state in which all variables are assigned and all constraints are satisfied
* When setting these up, it is better to constrain the domains as much as possible so the actual constraints are the least complicated; try to limit them to unary or binary constraints
* Solving CSPs:
    * Naïve algorithm: Generate-and-Test
        * Generate all possible assignments and test whether they satisfy the constraints
        * n variables, |D| domain size, m constraints ⇒ O(m·|D|^n) running time
    * Solving a CSP as a standard search problem
        * Set up the problem like the first thing on this page -> states, actions, transition function, and goal
    * Backtracking (basically DFS)
        * Select a variable
            * Minimum Remaining Values (MRV): choose the variable with the fewest consistent values remaining (fewest possibilities for assignment based on the constraints)
                * Leads to a shorter tree/pruning
            * Maximum Degree: use this to break ties in MRV - pick the variable with the maximum number of edges (constraints) connected to it
        * Select a value for it
            * Least Constraining Value (LCV): pick the value that leaves the neighboring variables the most choices of values
        * Keep going until a solution or failure, and backtrack
    * Propagation
        * Forward checking (forward propagation): propagates information from assigned variables to unassigned variables, but does not provide early detection for all failures
        * Constraint propagation: does provide early detection, by repeatedly enforcing constraint consistency locally
            * We check consistency and throw out values that cause inconsistencies
            * Arc consistency checks two variables
                * A CSP is arc-consistent if all pairs of variables are arc-consistent
                * The pair (X, Y) of constraint variables is arc-consistent if for each value x of X there exists a value y of Y such that the assignments X = x and Y = y satisfy all binary constraints between X and Y
            * Path consistency checks three variables
            * k-consistency checks k variables

**<span style="text-decoration:underline;">Adversarial Search</span>**

* Focus on sequential, 2-player, zero-sum games
    * Sequential: players take turns (e.g., chess, not rock/paper/scissors)
    * Zero-sum: one player wins at the loss of the other - only one can win, no ties (e.g., Poker, not the Prisoner's Dilemma)

**<span style="text-decoration:underline;">MiniMax</span>**

* 2 players - Min and Max
* The focus is to always take the min/max of the items at each level
* Shown in a tree format; play the game bottom-up
* Start with all the options (numbers) - known as the terminal states
* Turn: give a set of children a parent node by minimizing/maximizing over the children; do this for the entire level
* Repeat for every level, alternating min and max turns, until there is only one node left - the winner (the value of the game)
* No chance - everything is deterministic
* Optimal; full tree known beforehand
* Examines O(b^m) nodes

**<span style="text-decoration:underline;">Alpha-Beta Pruning (Exact) Heuristic - adapted mostly for MiniMax</span>**

* Every node has a range [alpha, beta], starting at (-inf, inf)
    * For the rest [x, y]: x (alpha) is updated on a max turn and y (beta) on a min turn
* Start at the leftmost set, go through every child in the terminal state in this set, and set alpha_1,1 and beta_1,1 = min/max(set 1)
* Prune any other set by:
    * If a child is [&lt; at a min level / > at a max level] alpha_1,1 = beta_1,1:
        * Prune all the rest in the set; go to the next set
    * Else:
        * Update beta_1,2 (min level) or alpha_1,2 (max level) with the value
* General rule:
    * Pruned sets
        * Min level: (-inf, value]
        * Max level: [value, inf)
    * Full sets
        * Min level: (min val, min val)
        * Max level: (max val, max val)
* Optimal; full tree known beforehand
* Examines O(b^(m/2)) nodes in the best case (with good move ordering)

**<span style="text-decoration:underline;">Three-Player Games</span>**

* Each player has an individual utility
* Since it is not possible for player C to minimize the utilities of both A and B, the natural strategy is for C to maximize his/her own utility; the same holds for the other players
* Stochastic Games
    * The third player is Chance; for our purposes this player performs in an expected way (e.g., instead of randomly picking between numbers, take the average on Chance's turn)
    * E.g., Backgammon
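The minimax and alpha-beta ideas above can be sketched in a few lines of Python. This is a minimal illustration, not the course's reference code: the tree below is a made-up 3-ply example (MAX root, MIN children, integer leaf utilities), and the `stats` counter is just there to show that the cutoff really skips leaves.

```python
# Minimal sketch of minimax with alpha-beta pruning on a fixed game tree.
# A leaf is an int (terminal utility); an internal node is a list of children.
import math

def alphabeta(node, maximizing, alpha=-math.inf, beta=math.inf, stats=None):
    """Return the minimax value of `node`; optionally count leaf evaluations."""
    if isinstance(node, int):              # terminal state
        if stats is not None:
            stats["leaves"] += 1
        return node
    if maximizing:
        value = -math.inf
        for child in node:
            value = max(value, alphabeta(child, False, alpha, beta, stats))
            alpha = max(alpha, value)      # alpha is updated on MAX turns
            if alpha >= beta:              # cutoff: MIN will never allow this
                break
        return value
    else:
        value = math.inf
        for child in node:
            value = min(value, alphabeta(child, True, alpha, beta, stats))
            beta = min(beta, value)        # beta is updated on MIN turns
            if beta <= alpha:              # cutoff: MAX will never allow this
                break
        return value

# Hypothetical example tree: MAX root over three MIN "sets" of leaves.
tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
stats = {"leaves": 0}
print(alphabeta(tree, True, stats=stats))  # → 3 (the minimax value)
print(stats["leaves"])                     # → 7 (cutoff skips 2 of the 9 leaves)
```

Plain minimax would evaluate all b^m = 9 leaves here; after the second set's first leaf (2) drops beta to 2 ≤ alpha = 3, the rest of that set is pruned, matching the "prune the rest of the set; go to the next set" rule above.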