# CS410 Homework 1: Heuristic Search
**Due Date: 9/18/2024 at 12pm**
**Need help?** Feel free to post on [Edstem](https://edstem.org/us/courses/61309) for TA assistance.
## Assignment Overview
The last assignment was concerned with blind search. Although blind search will eventually find an optimal solution to a search problem, it may take more seconds to do so than there are atoms in the universe.
This assignment is designed to introduce you to heuristic search. Heuristic search is not always guaranteed to find an optimal solution, but heuristics are useful nevertheless because they can speed up search and/or they can save on memory requirements.
In this assignment, you will experiment with BFS, DFS, and an additional blind search algorithm, *iterative deepening*, on small instances of a simple Tile Game to evaluate their run times and memory usage. Additionally, you will implement *A\* search*, a heuristic search algorithm, which you will seed with two different heuristics in order to compare their performance on larger instances of the Tile Game.
**N.B.** These two ideas, iterative deepening and A* search, can be combined to solve even larger search problems!
### Learning Objectives
What you will know:
- what informed/heuristic search is, and how it differs from blind search
- what an admissible heuristic is, and how it differs from inadmissible ones
What you will be able to do:
- implement an informed search algorithm
- develop heuristics for informed search, both admissible and inadmissible
- reason about the tradeoffs between a search algorithm's optimality and its efficiency, both in terms of run time and memory usage
## The Puzzle
The Sliding Tiles puzzle is a famous Tile Game. It is played on an $n$-by-$n$ grid ($n$ is usually 3 or 4, or perhaps 5), with numbered tiles occupying all but one of the $n^2$ cells, and an empty space occupying the last. Starting from a state with the numbers arranged at random, the goal is to rearrange all the tiles into ascending order, with the empty space in the bottom right-hand corner. The name derives from the fact that the tiles can slide over the empty space.
Here is an illustration of the classic Sliding Tiles puzzle:
<div style="text-align:center">
<img src="https://hackmd.io/_uploads/rJQFt4OFA.gif" style="width: 30%"/>
</div>
<br>
The Tile Game for this assignment is a variant of this classic puzzle. It is also played on an $n$-by-$n$ grid, with numbers occupying the cells. But there is no empty space; all $n^2$ cells are occupied by tiles. Our version is played by swapping any two adjacent tiles, where adjacency is defined grid-wise--diagonal tiles are not adjacent.
As in the original Sliding Tiles puzzle, the goal of our simple Tile Game is to arrange the numbers in ascending order. The goal state on a 3-by-3 board is:
$$
\begin{bmatrix}
1 & 2 & 3 \\
4 & 5 & 6 \\
7 & 8 & 9
\end{bmatrix}
$$
Here is one way to arrive at this goal, from a random start state:
$$
\begin{array}{ccccccccccc}
\begin{bmatrix}
1 & 6 & 2 \\
7 & 8 & 3 \\
5 & 4 & 9
\end{bmatrix}
& \to &
\begin{bmatrix}
1 & 2 & 6 \\
7 & 8 & 3 \\
5 & 4 & 9
\end{bmatrix}
& \to &
\begin{bmatrix}
1 & 2 & 3 \\
7 & 8 & 6 \\
5 & 4 & 9
\end{bmatrix}
& \to &
\begin{bmatrix}
1 & 2 & 3 \\
7 & 8 & 6 \\
4 & 5 & 9
\end{bmatrix}
& \to &
\begin{bmatrix}
1 & 2 & 3 \\
4 & 8 & 6 \\
7 & 5 & 9
\end{bmatrix}
& \to &
\begin{bmatrix}
1 & 2 & 3 \\
4 & 5 & 6 \\
7 & 8 & 9
\end{bmatrix}
\end{array}
$$
The search space (i.e., the number of states) of a Tile Game with $m = n^2$ tiles has size $m!$, which grows faster than any exponential function of $m$ (a 3-by-3 game has $9! = 362{,}880$ states; a 4-by-4 game has $16! \approx 2.1 \times 10^{13}$). As we cannot possibly enumerate this space for all but the smallest choices of $m$, you will solve instances of the Tile Game using heuristic search. To do so, you will experiment with both blind and informed search algorithms. The informed algorithms rely on heuristics, which you will develop by encoding domain-specific knowledge. By their very nature, the informed algorithms are more sophisticated than the blind algorithms. Indeed, if your heuristics are well designed, you will find that you can solve larger Tile Game instances using informed search algorithms.
## Data Structures & Algorithms
In this assignment, we have worked out how to represent the data--in this case, a Tile Game--for you. Unsurprisingly, we represent the game as a basic search problem (recall the `SearchProblem` abstract class from Homework 0), with states, actions, transitions, an initial state, and a goal state.
Given a dimension $n$, we represent a `State` in the Tile Game as an $n$-tuple of $n$-tuples. The outer $n$-tuple represents the rows in a Tile Game, while the inner $n$-tuples represent the entries in the corresponding row's columns. For example, the goal state of a 3-by-3 game is represented in Python as `((1,2,3),(4,5,6),(7,8,9))`, while the start state is some random arrangement of the tiles (i.e., the numbers 1 through 9 arranged on the 3-by-3 grid). The transition correspondence maps a state to the set of all states that can be reached by swapping two adjacent tiles.
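To make the transition correspondence concrete, here is a minimal sketch of how successors could be generated from a state represented as a tuple of tuples. It is illustrative only: the actual logic lives in the support code (`tile_game_problem.py`), and the function name below is made up.

```python
from typing import List, Tuple

TileGameState = Tuple[Tuple[int, ...], ...]  # n rows, each an n-tuple of tile numbers

def adjacent_swaps(state: TileGameState) -> List[TileGameState]:
    """Return every state reachable by swapping two grid-adjacent tiles.

    Illustrative sketch only: the support code's successor function plays
    this role (and may differ in its details).
    """
    n = len(state)
    successors = []
    for r in range(n):
        for c in range(n):
            # Swap with the right neighbor and the neighbor below;
            # this enumerates each adjacent pair exactly once.
            for dr, dc in ((0, 1), (1, 0)):
                r2, c2 = r + dr, c + dc
                if r2 < n and c2 < n:
                    grid = [list(row) for row in state]
                    grid[r][c], grid[r2][c2] = grid[r2][c2], grid[r][c]
                    successors.append(tuple(tuple(row) for row in grid))
    return successors

# Example: the 2-by-2 goal state has 4 adjacent pairs, hence 4 successors.
print(len(adjacent_swaps(((1, 2), (3, 4)))))  # 4
```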
<!-- The actions are choosing two adjacent tiles to swap. The transitions are making the swap. -->
:::spoiler **Hint: Data Structures**
The data structures in this assignment are the same as the ones in Homework 0. In particular, a `TileGame` is a `SearchProblem`, and a `TileGameState` is (trivially) a `State`.
:::
In this assignment, you will be experimenting with iterative deepening search (IDS) and implementing A* search. These algorithms are described in Sections 3.4 (blind search) and 3.5 (informed search) of Russell and Norvig. As in HW 0, your A* implementation should be designed to operate on (undirected) graphs. That is, to the extent that memory permits, it should keep track of previously visited states, and should not visit them again.
Unlike BFS, DFS, and IDS, which are **blind** search algorithms, A* is an **informed** search algorithm, meaning it relies on a **heuristic** to guide its search. A heuristic is an estimate of the cost to a nearest goal.
<!-- Building on the definition of a basic search problem from Homework 0, a **search problem** is a tuple $\langle X, T, S, G, c \rangle$, where
- $\langle X, T, S, G \rangle$ is a basic search problem
- $c: X \times X \to \mathbb{R}$ is a cost function
A search problem generalizes a basic search problem only in that it incorporates a cost function, which may vary when transitioning from one state (or city) to another. For example, the distance from Providence to Boston along 95 differs from that of Providence to New York, so a shortest path algorithm should not merely count the number of hops, but rather, it should accumulate distances. -->
A* is optimal when it is endowed with an admissible (i.e., optimistic) heuristic. When its heuristic is inadmissible, optimality is no longer guaranteed. However, inadmissible heuristics can significantly reduce the number of states searched, and therefore the run time of A*. Moreover, they can be faster to compute than admissible heuristics. You will investigate tradeoffs between efficiency and solution quality in this assignment (and in most CS 410 assignments).
<!-- In practice, however, it can happen that the only known admissible heuristics are expensive to compute, while quick-and-dirty inadmissible heuristics are easy to develop. In such cases, it may make sense to use inadmissible heuristics to more quickly arrive at possibly suboptimal solutions. -->
<!-- Since inadmissible heuristics are less constrained than admissible ones, they can be faster, so when they are not very inadmissible (i.e., when they are only ever slightly pessimistic), they may find optimal or only slightly suboptimal solutions faster than admissible heuristics.-->
## Tasks
In this assignment, you will experiment with more and more sophisticated algorithms capable of solving larger and larger instances of the Tile Game. To do so, you will run a series of experiments with both blind and informed search algorithms on games of varying sizes, using various heuristics, and then you will summarize your findings in your `README`.
### Part 1: Blind Search Revisited
In `bfs_and_dfs.pyc`, you can find obscured implementations of BFS and DFS, while in `blind_search.py`, you can find a readable implementation of IDS.
:::spoiler In case you're not using Python 3.10.6
If you're not using Python 3.10.6, you may get errors when running `blind_search.py`, since it imports from `bfs_and_dfs.pyc`. If you went through the setup guide and completed it successfully, running `blind_search.py` should not be a problem. However, if it does cause issues, feel free to copy over your BFS/DFS code from Homework 0 and change the import statement in `blind_search.py` to use your code.
:::
<!-- :::info
**Task 1a**
Implement iterative deepening (`ids`) in `blind_search.py`.
:::
:::spoiler **Hint: Signature**
Like `bfs` and `dfs`, the `ids` function takes as input a `SearchProblem` and outputs a solution to this problem, meaning a path through the search tree (represented as a list of states, where the first element of the list is the start state and the last is a goal state), together with two search statistics, namely the number of states visited and the maximum number of states on the search frontier. Recall that the former is a proxy for run time, while the latter measures memory use.
::: -->
:::info
**Task 1**
Run the aforementioned blind search algorithms, BFS, DFS, and IDS, on a few sample Tile Games by executing `python blind_search.py --size 2 --trials 10`. The `main` method in this file will run all three algorithms, in this case on 10 randomly generated Tile Games of size 2, and will output a table summarizing the results (e.g., the maximum number of states on the search frontier for each algorithm).
Modify the command line arguments to experiment with Tile Games of different sizes. Does increasing or decreasing the number of trials change the results? Summarize your findings in a brief paragraph in your README.
In your summary, you should answer questions like the following: Is IDS more or less expensive than BFS and DFS in terms of the maximum size of the search frontier? What about in terms of the number of states visited? Does the extra effort seem worthwhile?
:::
:::spoiler **Hint: Randomness**
Each call to the `TileGame(n)` constructor generates a random permutation of the numbers 1 through $n^2$ on an $n \times n$ grid.
:::
:::spoiler **Hint: Termination**
Feel free to terminate search after 1 minute. You can do so either by programming in this stopping time, or by simply terminating the run manually after 1 minute using `Ctrl-C`.
:::
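If you'd rather program in the stopping time than reach for `Ctrl-C`, a minimal sketch of a time-based cutoff is below. Where exactly you place the check depends on how your search loop is structured, so treat the usage lines as an assumption.

```python
import time

def timed_out(start: float, limit_seconds: float = 60.0) -> bool:
    """Return True once limit_seconds have elapsed since start."""
    return time.time() - start > limit_seconds

# Usage sketch: record the start time before your loop, check it each iteration,
# and bail out (e.g., return None) when the cutoff fires.
start = time.time()
while not timed_out(start):
    # ... expand the next state from the frontier ...
    break  # placeholder so this snippet terminates immediately
```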
<!--
:::spoiler **Hint: Sample findings**
After running ten experiments with Tile Games of size 2 and 3, I found that IDS visited only a few more states (on average) than BFS and DFS, and that it maintained a much smaller search frontier. As a result, BFS and DFS could only reliably solve Tile Games of size 2, while IDS solved Tile Games of size 3.
:::
:::spoiler **Hint: Sample explanation**
Think of blind search like mining for diamonds in Minecraft. BFS is like strip mining: clearing off the first layer of blocks, then the second layer, and so on. DFS is like digging straight down until you hit an obstacle (i.e., lava). IDS is a mix of the two, you dig straight down to a fixed depth, then go back up and dig in another location to a fixed depth. When you haven't found a diamond at that fixed depth, you repeat your process with an increased depth.
OLD: IDS is like mining for diamonds in Minecraft: you dig down carefully, layer by layer, so you don't miss a block. BFS and DFS, on the other hand, are like charging into a cave with nothing but a wooden pickaxe and a potato. You may eventually find diamonds, but you might have to mine a lot more extra blocks before finding one.
::: -->
<!--
:::warning
**Important:** As in HW 0, your implementations of all search algorithms should be **problem agnostic**. They should work on any `SearchProblem`: e.g., mazes, the Tile Game, and even a routing problem like finding a driving route from Providence to Boston or a flight from Boston to Tokyo.
::: -->
### Part 2: Informed Search
Informed search algorithms are so-called because they rely on information about the cost of reaching a goal state. It is usually straightforward for an agent to calculate the cost of reaching a state once it arrives there, but a challenge remains: to estimate the cost of reaching the goal from wherever it currently is!
A **heuristic** is a function that produces an estimate of the cost of reaching the goal from the current state. An **admissible heuristic** is optimistic: it never overestimates this cost. That is, to be on the safe/conservative side, it always estimates the goal to be closer (technically, no farther away) than it actually is.
*Manhattan distance* is a distance measure in which all movements are constrained to be axis-aligned (i.e., at right angles), as if you were walking from one location to another in Manhattan. At any non-goal state in the Tile Game, each tile is some number of adjacent swaps (i.e., steps) away from its goal position: i.e., some Manhattan distance from its goal. We call the sum of all these per-tile distances the *total Manhattan distance*.
:::spoiler **Is total Manhattan distance an admissible heuristic? Why or why not?**
The answer is no. A single swap can reduce the total Manhattan distance by 2, not just 1, when the two swapped tiles *both* get closer to their goal positions. Since each swap costs only one step, total Manhattan distance can therefore *overestimate* the number of steps required to reach the goal.
For example, consider a Tile Game that is almost solved and requires one final swap to reach the goal state. The total Manhattan distance of this state is 2, since two tiles are each one step away from their goal positions, but this distance (2) overestimates the number of steps (1) it actually takes to reach the goal!
:::
:::spoiler **Describe a variant of total Manhattan distance that is admissible.**
The aforementioned example is the worst case: a single swap can only ever reduce the total Manhattan distance by at most 2 (when both swapped tiles move closer to their goal positions). Therefore, we can make the heuristic admissible by dividing the total Manhattan distance by 2! This new *half-Manhattan distance* heuristic can no longer overestimate the number of steps needed to reach the goal.
:::
Your next task is to implement an informed search algorithm, namely A*, using *half-Manhattan distance* as your admissible heuristic. We have implemented this heuristic for you in the `admissible_heuristic` function in the `heuristics.py` file.
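To make the heuristic concrete, here is a sketch of how half-Manhattan distance could be computed from a tuple-of-tuples state. It is illustrative only; the provided `admissible_heuristic` in `heuristics.py` is the version you should actually use, and its exact signature may differ.

```python
from typing import Tuple

TileGameState = Tuple[Tuple[int, ...], ...]

def half_manhattan_distance(state: TileGameState) -> float:
    """Sum of each tile's Manhattan distance to its goal position, divided by 2.

    Illustrative sketch of the admissible heuristic described above.
    """
    n = len(state)
    total = 0
    for r, row in enumerate(state):
        for c, tile in enumerate(row):
            goal_r, goal_c = divmod(tile - 1, n)  # tile k belongs at row (k-1)//n, column (k-1)%n
            total += abs(r - goal_r) + abs(c - goal_c)
    return total / 2

# Example: one swap away from the goal, so the heuristic is (1 + 1) / 2 = 1.0.
print(half_manhattan_distance(((1, 2, 3), (4, 5, 6), (7, 9, 8))))  # 1.0
```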
:::info
**Task 2**
Implement A* (`astar`) in `informed_search.py`.
:::
:::success
**Signature**
The `astar` function takes as input a `HeuristicSearchProblem`, which extends `SearchProblem` with a `heuristic` method, and returns a solution to this problem or `None`, where, as in Homework 0, a solution is a path through the search tree from the start state to a goal, together with a dictionary of summary statistics.
:::
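If it helps you get oriented, here is a high-level sketch of graph-based A*. The method names (`get_start_state`, `is_goal_state`, `get_successors`, `heuristic`), the unit transition cost, and the statistic names are all assumptions based on Homework 0 and this handout, not the stencil's actual API, so check the stencil before borrowing anything from it.

```python
import itertools
from queue import PriorityQueue

def astar_sketch(problem):
    """Illustrative graph-based A*: order the frontier by f(s) = g(s) + h(s).

    Assumed (unverified) methods: problem.get_start_state(), problem.is_goal_state(s),
    problem.get_successors(s), and problem.heuristic(s), with unit-cost transitions.
    Returns (path, stats), where path is a list of states or None.
    """
    start = problem.get_start_state()
    frontier = PriorityQueue()
    counter = itertools.count()          # tie-breaker so equal-priority entries never compare states
    frontier.put((problem.heuristic(start), next(counter), start))
    parent = {start: None}
    cost_so_far = {start: 0}             # doubles as the record of states seen so far
    stats = {"states_expanded": 0, "max_frontier_size": 1}

    while not frontier.empty():
        stats["max_frontier_size"] = max(stats["max_frontier_size"], frontier.qsize())
        _, _, state = frontier.get()
        if problem.is_goal_state(state):
            path = []                    # reconstruct the path by walking parents back to the start
            while state is not None:
                path.append(state)
                state = parent[state]
            return list(reversed(path)), stats
        stats["states_expanded"] += 1
        for successor in problem.get_successors(state):
            new_cost = cost_so_far[state] + 1   # unit cost per swap
            if successor not in cost_so_far or new_cost < cost_so_far[successor]:
                cost_so_far[successor] = new_cost
                parent[successor] = state
                frontier.put((new_cost + problem.heuristic(successor), next(counter), successor))
    return None, stats
```

The `itertools.count()` tie-breaker is there because `PriorityQueue` compares whole tuples: without it, two entries with equal priority would fall through to comparing states, which may not be comparable.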
:::spoiler **Hint: Testing & Debugging**
You can test and debug your implementation of A* on sample Tile Games by running `informed_search.py`, which will call `astar` on a Tile Game of size 3 with both half-Manhattan distance (admissible) and Manhattan distance (inadmissible). You can modify `main` within `informed_search.py` to experiment with other sizes or heuristics.
:::
### Part 3: Admissible vs. Inadmissible Heuristics
Not all heuristics are admissible. An **inadmissible heuristic** is one that may be pessimistic: it might overestimate the true cost of reaching the goal state. That is, it is more liberal than an admissible heuristic, in that it sometimes estimates the goal to be farther away than it actually is.
For example, in the Tile Game, multiplying half-Manhattan distance (i.e., the sum of the per-tile Manhattan distances, divided by 2) by any value $\lambda > 1$ yields an *inadmissible* heuristic, because the product can overestimate the number of swaps required to reach the goal state. Indeed, $\lambda$ is a measure of how far from admissible the heuristic becomes.
We have also implemented this $\lambda$-parameterized inadmissible variant of Manhattan distance for you, in the `inadmissible_heuristic` function in `heuristics.py`.
**N.B.** Recall that our admissible heuristic is half-Manhattan distance. When $\lambda=2$, the scaled heuristic is precisely total Manhattan distance!
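In code, the $\lambda$-scaled family might look something like the following sketch (illustrative only; the provided `inadmissible_heuristic` in `heuristics.py` is the version actually used, and it may be parameterized differently):

```python
from typing import Callable

def scaled_heuristic(base_heuristic: Callable, state, lam: float = 2.0) -> float:
    """Scale an admissible base heuristic by lam.

    With half-Manhattan distance as the base: lam = 1 recovers the admissible
    heuristic, lam = 2 gives total Manhattan distance, and lam > 1 is
    (potentially) inadmissible. Sketch only.
    """
    return lam * base_heuristic(state)
```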
:::info
**Task 3**
Run the `compare_lambdas` function in the `compare_heuristics.py` file, which runs `astar` using both heuristics, `admissible_heuristic` and `inadmissible_heuristic`, on a large number of sample instances (50 by default), and calculates some average statistics. View the output file `heuristics_lambda_states.jpg`, which depicts the average solution lengths vs. the average number of states visited.
:::
Hopefully, your graph resembles the one we generated:

Observe that A* with lower values of $\lambda$ expands significantly more states, and thus takes longer to run, than A* with higher values of $\lambda$. For example, A* with $\lambda=5$, an overly pessimistic heuristic, ran 100 times faster in our experiments than A* with $\lambda = 1$, an admissible heuristic. On the other hand, solution quality, i.e., the length of the solution found, suffers as $\lambda$ (i.e., the degree of inadmissibility) increases.
<!-- // RELEVANT TO IDA* ONLY
:::info
**Task 3b**
View the output file `heuristics_lambda_frontier.jpg`, which depicts the average solution lengths vs. the average maximum sizes of the frontiers. Explain the information depicted in this graph in 1-2 sentences in your README.
::: -->
<!-- Observe that A* with an admissible heuristic is slower (i.e., expands more states) than A* with an inadmissible heuristic. Moreover, A* with an inadmissible heuristic might find optimal solutions often enough. -->
**N.B.** Although not apparent in this example, another reason to use inadmissible heuristics, beyond expanding fewer states, is that they can be faster to compute than admissible heuristics *per-state*, thus leading to further run-time savings (i.e., increasing efficiency), once again, at the cost of optimality.
### Part 4: A Heuristic of Your Own
Your next and final task is to develop your own inadmissible heuristic for the Tile Game. This heuristic should provide an educated guess about the cost of reaching a goal state from an arbitrary game state, `TileGameState`.
:::info
**Task 4a**
Design and implement an inadmissible heuristic for the `TileGame` in the `my_heuristic` function in `heuristics.py`. In your `README`, include a 1-2 sentence description of your heuristic.
:::
:::success
**Hint: Signature**
The `my_heuristic` function takes as input a `TileGameState` and returns a heuristic value for this state.
:::
:::spoiler **It goes without saying ...**
that you should implement something other than Manhattan distance multiplied by a constant, which we have already implemented for you.
:::
:::info
**Task 4b**
After you implement `my_heuristic`, run the functions in `compare_heuristics.py` again. Your heuristic should now appear on the generated plot alongside the Manhattan distance heuristic variants. Once again, view the output file, `heuristics_lambda_states.jpg`.
<!-- `heuristics_lambda_frontier.jpg` -->
Summarize your findings in 1-2 sentences in your `README`. In particular, explain how the performance of `my_heuristic`--by which we mean *your* heuristic!--compares to that of the Manhattan distance heuristic variants. Be sure to touch on the tradeoffs between efficiency and optimality.
:::
## Downloads
Please click [here](https://classroom.github.com/a/i324hAhp) to access the assignment code.
### Support Code
* `bfs_and_dfs.pyc`: Contains obscured implementations of BFS and DFS
* `blind_search.py`: Contains a readable implementation of iterative deepening
* `search_problem.py`: Provides an abstract base class, `SearchProblem`, that represents search problems (as in Homework 0), given a `State`
* `heuristic_search_problem.py`: Provides an abstract base class, `HeuristicSearchProblem`, that endows search problems with heuristics
* `tile_game_problem.py`: Contains the `TileGame` class, which implements `SearchProblem`, and the `HeuristicTileGame` class, which implements `HeuristicSearchProblem`
* `compare_heuristics.py`: Contains functions that you can run to compare the performance of A* using different heuristics
### Stencil Code
* `informed_search.py`: This is where you will implement A* search.
* `heuristics.py`: This is where you will implement your heuristic (`my_heuristic`).
:::spoiler **Hint: Using `Queues` in Python**
Python's `queue` module includes implementations of FIFO queues, LIFO stacks, and priority queues. Each provides three basic methods: `put`, `get`, and `empty`.
To add an item to a `PriorityQueue` with a given priority, add a tuple in the form of *(priority, item)*. By default, `PriorityQueue` retrieves the tuple with the *lowest* priority value.
Visit [the official Python documentation](https://docs.python.org/3/library/queue.html) for more information about the module.
:::
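For example (a minimal usage sketch; the tie-breaking counter is a common trick for keeping equal-priority items from being compared directly, not something the stencil requires):

```python
import itertools
from queue import PriorityQueue

pq = PriorityQueue()
counter = itertools.count()  # tie-breaker: avoids comparing the items themselves on equal priorities

# put() stores (priority, tie_breaker, item); get() returns the entry with the lowest priority.
pq.put((3, next(counter), "expand later"))
pq.put((1, next(counter), "expand first"))

priority, _, item = pq.get()
print(priority, item)  # 1 expand first
```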
### Testing Code
Our implementation of the Tile Game is parameterized by its dimension. You should take advantage of this parameterization while developing your code: you can test it on 2-by-2 games, which should run more quickly than 3-by-3 games or 4-by-4 games.
- `unit_tests.py`: A file in which to write unit tests for your code. As in HW 0, we have provided a simple test, which checks only that the start and goal states of a returned path are correct, and that it is of the correct length--when the length of the optimal path happens to be known.
You should add more tests to further verify the correctness of your code. Does it succeed at finding solutions when they exist? And does it terminate gracefully when they do not? Does it find optimal solutions when it should? Etc.
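For instance, a test along these lines might look like the sketch below. The constructor call, the `astar` return shape (a path plus a statistics dictionary), and the method names are assumptions based on this handout, so adjust them to match the stencil and support code.

```python
import unittest

from informed_search import astar
from tile_game_problem import HeuristicTileGame  # constructor arguments below are assumed

class TestAStar(unittest.TestCase):
    def test_astar_path_is_valid(self):
        game = HeuristicTileGame(2)   # 2-by-2 games solve almost instantly
        path, stats = astar(game)     # assumed return shape: (list of states, stats dict)
        self.assertIsNotNone(path)
        # The path should begin at the start state and end at a goal state
        # (method names here are assumptions -- match them to the support code).
        self.assertEqual(path[0], game.get_start_state())
        self.assertTrue(game.is_goal_state(path[-1]))

if __name__ == "__main__":
    unittest.main()
```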
We use directed graphs to test your implementation of A*, and you can too!
- `dgraph.py`: A file containing the class `DirectedGraph`.
:::spoiler **Hint: Graph Data Structure**
In the `DirectedGraph` class, graphs are represented as adjacency matrices. Each matrix $M$ has $n$ rows and $n$ columns, with each index $i \in \{ 1, \ldots, n \}$ representing a vertex, so that $M_{ij} = 1$ if there is an edge from $i$ to $j$, and $M_{ij} = 0$ otherwise.
:::
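For example, a three-vertex graph with edges $1 \to 2$, $2 \to 3$, and $1 \to 3$ corresponds to the matrix
$$
M = \begin{bmatrix}
0 & 1 & 1 \\
0 & 0 & 1 \\
0 & 0 & 0
\end{bmatrix}
$$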
:::info
**Task**
Describe your tests in words in your README.
:::
:::warning
**Test early and often!** And don't forget to write your tests *before* you write any code. Never, and we mean *never*, write tests based on the output of your program.
:::
:::danger
**⚠️WARNING⚠️** Your solution should modify the stencil and testing code *only*. You will be penalized for modifying the support code, especially if it interferes with the autograder.
:::
## Submission
### Handin
Your handin should contain the following:
- all modified files, including comments describing the logic of your implementations, your tests, and any graphs you generated
- a README containing:
    - a brief paragraph summarizing your implementation and highlighting the outcomes of all tests
    - your responses to any conceptual questions, including summaries of your experimental findings
    - known problems in the code
    - anyone you worked with
    - any outside resources used (e.g., Stack Overflow, ChatGPT)
### Gradescope
Submit your assignment via Gradescope.
To submit through GitHub, follow this sequence of commands:
1. `git add -A`
2. `git commit -m "commit message"`
3. `git push`
Now, you are ready to upload your repo to Gradescope.
:::danger
**⚠️WARNING⚠️** Make sure you have the correct assignment repo and branch selected before submitting.
:::
*Tip*: If you are having difficulties submitting through GitHub, you may submit by zipping up your hw folder.
### Rubric
| Component | Points | Notes |
|-------------------|------|--------------------------------|
| Blind search analysis (Task 1)| 10 | Points awarded for reporting findings in README |
| A* implementation (Task 2) | 40 | Points awarded for passing tests |
| Inadmissible Heuristic Implementation (Task 4a) | 10 | Points awarded for implementing an inadmissible heuristic (other than the one provided). No points awarded for an admissible heuristic.|
| Heuristic Comparison (Task 4b)| 20 | Points awarded for a summarized comparison of runs with admissible vs. inadmissible heuristics in README|
| Tests | 10 | Points awarded for testing A* search implementation |
| README | 10 | Points awarded for including all required components |
### Grading
We will check each of your search algorithms to ensure that states are expanded in a correct order. The autograder considers a state to be expanded whenever `get_successors` is called on it. So be sure that `get_successors` is not called unnecessarily.
For most search problems, there will be many correct orders of expansion. Our autograder will give you full points if you expand states in any of the proper orders.
<!--* For your admissible Tile Game heuristic, you will be graded based on how many states A* expands. Your score for this part will be determined by the following formula:
$$ 10 \cdot \frac{n_{\rm{ours}}}{n_{\rm{yours}}} \enspace , $$ where \\(n_{\rm{ours}}\\) and \\(n_{\rm{yours}}\\) are the number of states expanded by our heuristic and your heuristic, respectively, on a pre-selected suite of Tile Game instances.-->
:::success
Congrats on submitting your homework; Steve is proud of you!!


:::