# Search in Complex Environment

## Local Search and Optimization Problems

- The search algorithms introduced previously share a property:
  - When a goal is found, the *path* to that goal constitutes a *solution* to the problem.
- However, in many problems the path to the goal is **irrelevant**; the goal state itself is the solution.
- **Local search** algorithms operate by searching from a start state to neighboring states, without keeping track of the paths or the set of states that have been reached.
  - They are **not systematic**: they might never explore a portion of the search space where a solution actually resides.
  - Two advantages:
    - They use **very little memory**.
    - They can often find **reasonable solutions** in large or infinite state spaces, where systematic algorithms may be unsuitable.
- They are also useful for solving pure **optimization problems**, which aim to find the best state according to an **objective function**.

---

- A **complete** local search algorithm always finds a goal if one exists.
- An **optimal** algorithm always finds a global minimum/maximum.

### Hill-Climbing

![](https://hackmd.io/_uploads/SJdbkw1fp.png)

- Simply a loop that continually moves in the direction of increasing (or decreasing) value.
- It **does not maintain a search tree**; only the current state and the value of the objective function are recorded.

:::info
#### 8-queens with hill-climbing
- Local search usually uses a **complete-state formulation**:
  - Each state has all 8 queens on the board, one per column.
- The successors of a state are all possible states generated by moving a single queen to another square in the same column, which gives 8 × 7 = 56 successors.
  (8 queens to move, and 7 other squares in each column)
- The heuristic cost function *h* is the number of pairs of queens that are attacking each other.
- The global minimum is zero.
- ![](https://hackmd.io/_uploads/BJ90SDJz6.png =600x)
:::

- Hill climbing often gets stuck for the following reasons:
  - **Local maxima**
  - **Ridges** result in a sequence of local maxima that is very difficult for greedy algorithms to navigate.
  - **Plateaux** are flat areas of the state-space landscape on which hill climbing can get lost. A plateau can be:
    - a flat local maximum, from which no uphill exit exists, or
    - a shoulder, from which progress is possible.
- In each case, the algorithm reaches a point at which no progress is being made.
  - In 8-queens, hill climbing gets stuck 86% of the time and solves only 14% of instances.
  - But it works quickly: 4 steps on average when it succeeds.
  - Not bad, since the state space has about 8^8^ ≈ 17 million states.
- **Sideways moves**
  - Hope that the plateau is really a shoulder.
  - The search is trapped in an infinite loop if the plateau is a flat local maximum rather than a shoulder.
  - Improvement: put a limit on the number of consecutive sideways moves allowed.
    - Allowing 100 consecutive sideways moves in 8-queens raises the percentage solved from 14% to 94%.
    - Cost: an average of roughly 21 steps per successful instance and 64 per failure.

### Variations of Hill Climbing

- **Stochastic hill climbing**
  - Chooses at random from among the uphill moves.
- **First-choice hill climbing**
  - Implements stochastic hill climbing by generating successors randomly until one is generated that is better than the current state.
- **Random-restart hill climbing**
  - Adopts the well-known adage "If at first you don't succeed, try, try again."
  - It conducts a series of hill-climbing searches from randomly generated initial states until a goal is found.

### Simulated Annealing

- Combines hill climbing with a random walk in a way that yields both efficiency and completeness.
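One way to realize this combination in code is sketched below. The acceptance rule $e^{\Delta E / T}$ is the standard simulated-annealing formulation; the toy objective, Gaussian neighbor step, and linear cooling schedule are illustrative assumptions, not from the notes.

```python
import math
import random

def simulated_annealing(initial, neighbor, value, schedule):
    """Picks a random successor; always accepts improvements, and accepts
    worse moves with probability e^(delta/T), where T follows `schedule`."""
    current, t = initial, 0
    while True:
        T = schedule(t)
        if T <= 0:
            return current  # the "shaking" has stopped
        candidate = neighbor(current)
        delta = value(candidate) - value(current)
        # Improving moves are always accepted; worse ones only with
        # probability e^(delta/T), which shrinks as T cools.
        if delta > 0 or random.random() < math.exp(delta / T):
            current = candidate
        t += 1

# Toy usage: maximize -(x - 3)^2, whose global maximum is at x = 3.
# (The Gaussian step and linear cooling schedule are illustrative choices.)
random.seed(1)
best = simulated_annealing(
    initial=0.0,
    neighbor=lambda x: x + random.gauss(0, 1),
    value=lambda x: -(x - 3) ** 2,
    schedule=lambda t: max(0.0, 1.0 - t / 5000),
)
```

Early on, the high temperature lets the search escape local maxima; as T approaches zero the behavior becomes nearly greedy hill climbing.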
- In metallurgy, annealing is the process used to temper or harden metals and glass by heating them to a high temperature and then gradually cooling them, thus allowing the material to reach a low-energy crystalline state.
- Consider **gradient descent** as getting a ping-pong ball into the deepest crevice of a bumpy surface:
  - If we let the ball roll, it will come to rest at a local minimum.
  - If we shake the surface, we can bounce the ball out of the local minimum.
- Simulated annealing:
  - Start by shaking hard and then gradually reduce the intensity of the shaking.
  - Quite similar to hill climbing, but instead of picking the best move, it picks a random move.
    - If the move improves the situation, it is always accepted.
    - Otherwise, the algorithm accepts the move with some probability less than 1.

![](https://hackmd.io/_uploads/rystdDJMa.png =800x)

### Local Beam Search

- It keeps track of *k* states rather than just one.
- It begins with *k* randomly generated states.
- At each step, all the successors of all *k* states are generated. If any one is a goal, the algorithm halts.
- Otherwise, it selects the *k* best successors from the complete list and repeats.
- How it differs from random-restart search:
  - Random restarts run independently.
  - In local beam search, useful information is passed among the parallel search threads.
- Problems:
  - "Come over here, the grass is greener!"
    - Lack of diversity among the *k* states.
  - **Stochastic beam search** helps alleviate this problem:
    - Choose *k* successors at random, with the probability of choosing a given successor being an increasing function of its value.
  - This resembles the process of natural selection:
    - "Successors" (offspring) of a "state" (organism) populate the next generation according to its "value" (fitness).

### Genetic Algorithm

- A variant of **stochastic beam search** in which successor states are generated by combining **two parent states** rather than by modifying a **single state**.
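This idea can be sketched for 8-queens as follows. Only the fitness function (nonattacking pairs) and fitness-proportional selection come from the notes; the population size, mutation rate, and generation limit are illustrative assumptions.

```python
import random

N = 8  # board size; an individual is a tuple of 8 digits, each in 1..8

def fitness(ind):
    """Number of nonattacking pairs of queens (28 for a solution)."""
    attacks = sum(
        1
        for i in range(N)
        for j in range(i + 1, N)
        if ind[i] == ind[j] or abs(ind[i] - ind[j]) == j - i
    )
    return N * (N - 1) // 2 - attacks

def reproduce(x, y):
    """Cross the two parent strings over at a random point."""
    c = random.randrange(1, N)
    return x[:c] + y[c:]

def mutate(ind, rate=0.05):
    """Each position is replaced by a random digit with small probability."""
    return tuple(
        random.randrange(1, N + 1) if random.random() < rate else g for g in ind
    )

def genetic_algorithm(k=20, generations=500):
    """Population of k individuals; parents are drawn with probability
    proportional to fitness (so one may be picked twice, another not at all)."""
    population = [
        tuple(random.randrange(1, N + 1) for _ in range(N)) for _ in range(k)
    ]
    for _ in range(generations):
        best = max(population, key=fitness)
        if fitness(best) == N * (N - 1) // 2:
            return best  # a solution: no attacking pairs
        weights = [fitness(p) for p in population]
        population = [
            mutate(reproduce(*random.choices(population, weights=weights, k=2)))
            for _ in range(k)
        ]
    return max(population, key=fitness)
```

Note that the digit-string representation matters: crossover only produces useful offspring because contiguous blocks of columns carry meaningful partial arrangements.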
- Starts with a set of *k* randomly generated states, called the **population**.
- Each state, or **individual**, is represented as a string over a finite alphabet.
  - In 8-queens, a state can be represented using $8 \times \log_2 8 = 24$ bits, or as 8 digits, each in the range from 1 to 8.
- Each state is rated by the objective function, here called the **fitness function**.
  - e.g. the number of nonattacking pairs of queens in 8-queens: 28 for a solution. The values of the 4 states below are 24, 23, 20, and 11.

![](https://hackmd.io/_uploads/H1SVsDyfa.png =800x)

- Two pairs are selected at random for reproduction, in accordance with the probabilities in (b).
  - Notice that one individual is selected twice and one not at all.
- For each pair to be mated, a **crossover** point is chosen randomly from the positions in the string.
- The offspring are created by crossing over the parent strings at the crossover point.
- Finally, each location is subject to random **mutation** with a small independent probability.
- Successful use of genetic algorithms requires careful engineering of the representation.

![](https://hackmd.io/_uploads/Sy4fMu1fp.png =600x)

## Local Search in Continuous Spaces

- Example: place three airports anywhere in Romania such that the sum of squared distances from each city on the map to its nearest airport is minimized.
- This is a six-dimensional space, and the objective function is

$$
f(x_1, y_1, x_2, y_2, x_3, y_3) = \sum^3_{i = 1} \sum_{c \in C_i} (x_i - x_c)^2 + (y_i - y_c)^2
$$

- $C_i$ is the set of cities whose closest airport is airport $i$.

---

- To discretize the neighborhood of each state, we can move only one airport at a time in either the $x$ or $y$ direction by a fixed amount $\pm\delta$.
  - With 6 variables, this gives 12 possible successors for each state.
  - Local search algorithms can then be applied.
- Many methods attempt to use the **gradient** of the landscape to find a maximum. The gradient of the objective function is
$$
\nabla f = \left(\frac{\partial f}{\partial x_1}, \frac{\partial f}{\partial y_1}, \frac{\partial f}{\partial x_2}, \frac{\partial f}{\partial y_2}, \frac{\partial f}{\partial x_3}, \frac{\partial f}{\partial y_3}\right)
$$

- A maximum can be found by solving the equation $\nabla f = 0$.

## Search with Nondeterministic Actions

- In a **partially observable** environment, every percept helps narrow down the set of possible states the agent might be in, thus making it easier for the agent to achieve its goals.
- When the environment is **nondeterministic**, percepts tell the agent which of the possible outcomes of its actions has actually occurred.
- In both cases, future percepts cannot be determined in advance.
- The solution to a problem is then not a sequence but a **contingency plan** (also known as a **strategy**):
  - It specifies what to do depending on what percepts are received.

### The Erratic Vacuum World

- In the erratic vacuum world, the Suck action works as follows:
  - When applied to a dirty square, the action cleans the square and sometimes cleans up dirt in an adjacent square, too.
  - When applied to a clean square, the action sometimes deposits dirt on the carpet.
- The transition model therefore uses a **RESULTS** function that returns a *set* of possible outcome states.
  - e.g. the Suck action in state 1 leads to a state in the set {5, 7}.
- The solution to the problem is generalized to a contingency plan.

![](https://hackmd.io/_uploads/rkEHt_1G6.png =400x)
![](https://hackmd.io/_uploads/S1-NcOyf6.png =400x)

### AND-OR Search Trees

### Slippery Vacuum World

## Searching with Partial Observations

### Sensorless Version of the Vacuum World
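Looking back at the erratic vacuum world above, its nondeterministic RESULTS function can be modeled directly. In this sketch the 1-8 state numbering is an assumption matching the usual two-square figure: odd numbers put the agent on the left square, states 1-2 are fully dirty, and 7-8 are fully clean.

```python
# State n -> (agent location, left square dirty?, right square dirty?).
# Assumed numbering: odd = agent on the left; 1-2 both dirty; 7-8 both clean.
STATES = {
    1: ("L", True, True),   2: ("R", True, True),
    3: ("L", True, False),  4: ("R", True, False),
    5: ("L", False, True),  6: ("R", False, True),
    7: ("L", False, False), 8: ("R", False, False),
}
NUMBER = {s: n for n, s in STATES.items()}

def results(n, action):
    """RESULTS(s, a): the SET of possible outcome states of an action."""
    agent, left, right = STATES[n]
    if action == "Left":
        return {NUMBER[("L", left, right)]}
    if action == "Right":
        return {NUMBER[("R", left, right)]}
    # Erratic Suck:
    if (agent == "L" and left) or (agent == "R" and right):
        # Cleans this square, and sometimes the adjacent square too.
        here = ("L", False, right) if agent == "L" else ("R", left, False)
        both = (agent, False, False)
        return {NUMBER[here], NUMBER[both]}
    # On a clean square, Suck sometimes deposits dirt.
    dirtied = ("L", True, right) if agent == "L" else ("R", left, True)
    return {n, NUMBER[dirtied]}
```

With this model, `results(1, "Suck")` yields the set {5, 7} mentioned above, which is exactly why a solution must be a contingency plan rather than a fixed action sequence.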