# CS410 Homework 0: Blind Search **Due Date: 9/18/2024 at 12pm** **Need help?** Feel free post on [Edstem](https://edstem.org/us/courses/61309) for TA assistance. ## Assignment Overview Welcome to CS410! In this assignment, you will prepare your local development environment, and then use it to complete your first programming task. <!-- : implementing breadth-first and depth-first search to solve mazes. --> <!-- TODO: Insert Maze solving gif --> In the first part of this assignment, you will install and setup the software you will be using throughout the course, and you will familiarize yourselves with the platforms you will be using to download, develop, and submit assignments. <!-- :::warning The set of applications that is used to develop and run software is called a **technical stack**. In this part of the assignment, you will be introduced to the CS410 technical stack. In particular, you will learn how to download support code; develop, run, and debug your own code; and submit your assignments. ::: --> In the second part of this assignment, you will use your newly set up development environment to solve mazes using breadth-first and depth-first search. <!--First, you will be given a search problem, and will implement search algorithms to find a solution. Then, you will formalize a search problem, which you can (optionally) solve with (a variant of) your search algorithm implementations.--> ### Learning Objectives What you will know: - two blind search algorithms, depth-first and breadth-first search, which can be used to solve basic search problems. - how to formalize an informal description of a problem (specifically, maze generation) as a basic search problem What you will be able to do: - use python in an IDE (vscode) with the CS410 virtual environment - download support code for your assignments from Github Classroom and upload your completed work to Gradescope ## Part I: Installation and Course Setup 1. Before diving into the assignment, please read the [CS 410 Course Missive](https://hackmd.io/@cs410/BknqjaNhR) carefully. 1. After reading, sign [this form](https://forms.gle/k9ENULmfatWFi5d27) to confirm that you have understood and agree to the CS410 Course Collaboration policy. 1. Please complete the course [onboarding form](https://forms.gle/7UHdPeyfybnbdzdAA). :::warning **Please complete the course onboarding form by 9/11!** This will allow us to schedule your discussion sections, which start on 9/16. ::: 4. Be sure to follow the link to [Gradescope](https://www.gradescope.com/) on the course onboarding form. You will submit all your assignments on Gradescope. Please sign up! :::warning **Please set up your Gradescope account now**! That way, you will have ample time to debug any account-related issues. ::: 5. Be sure to join **EdStem**, which you can access through Canvas, a forum where you can pose public or private questions to the course staff. 1. You can find the **TAs office hours schedule** and room locations on the [course website](https://browncsci410.github.io/ai-website-f24/hours.html) or by adding the [CS410 Google Calendar](https://calendar.google.com/calendar/u/1?cid=Y18zOWRmMDU1ZjI1Nzc5ZWZkYWZkNTFjZmI0ZDUwNGMxZjllNGViOGZiNDM3Nzk3N2JkMzJmM2I2YzZjZjhjNWE0QGdyb3VwLmNhbGVuZGFyLmdvb2dsZS5jb20). 1. Assignment materials (i.e., support code, conceptual questions, and data) will be posted on **Github Classroom**. You can learn more about using Git, Github, and GitHub Classroom [here](https://hackmd.io/@cs410/ry3PhMEtC). 1. Follow the instructions on the [CS410 webpage](https://browncsci410.github.io/ai-website-f24/) under the link "[Installation and Setup](https://hackmd.io/@cs410/BJvhqHXuR)". These instructions will guide you through downloading and installing Python 3.10.6 and VSCode, and setting up your virtual environment. :::danger **⚠️WARNING⚠️** You must activate the CS410 virtual environment every time you work on a CS410 assignment! You can find the activation instructions [here](https://hackmd.io/s3r91ZwvS9G1KhSv_FZWCA?both#Activating-the-Environment). If you forget to activate this virtual environment, your code may be bug-ridden, or it may not run at all. ::: Congratulations! You're now ready to code! ![MinecraftSteveDancingGif (1)](https://hackmd.io/_uploads/rJEuSgXK0.gif) ## Part II: Blind Search Search and optimization lie at the core of AI. The goal of this part of this assignment is to introduce you to blind search. Blind search is so-called, because you search for goals without any hints as to where they might be. You will use blind search to solve mazes that we formalize as search problems for you, and you will then formalize a search problem of your own. Did you ever happen to play the children's game "hot or cold," where a hider hides an object, and a seeker searches for it. That game was *not* blind search, because the hider gives the seeker clues as to whether she is hot (close) or cold (far) from the object. Blind search would be like searching without any clues! We start off the course with blind search (HW0), i.e., search sans clues, before moving on to informed search (HW1), which relies on heuristics designed to estimate clues like "hot" and "cold". ## Data Structures & Algorithms Recall from lecture that a **basic search problem** is a tuple $\langle X, T, S, G \rangle$, where - $X$ is a set of states - $T: X \rightrightarrows X$ is a transition correspondence* - $S \subseteq X$ is a set of start states - $G \subseteq X$ is a set of goal states *A **correspondence** is a set-valued function; it need not map its input into a unique output. <!-- The actions are the agent's possible movements: north, south, east, or west. --> The problem of solving a maze is a basic search problem. In particular, <!-- The states are mazes, which are large squares inhabited by many smaller square rooms, each with up to four walls. --> - The states are the agent's possible locations in the maze. - The transition correspondence maps a state to an adjacent one to its north, south, east, *and* west, as long as there is no wall in the specified direction. - The start state is the top left corner of the maze. - The goal state is the bottom right corner of the maze. Search problems are solved with, you guessed it!, search algorithms. The most rudimentary approach to search is *exhaustive* search, where *all* solutions are eventually considered. Most search problems are too big to solve exhaustively--imagine searching for an optimal strategy for playing chess--so often a heuristic/suboptimal alternative is used. But using exhaustive search to solve small problems is a good way to start learning about search. This assignment is concerned with two blind search algorithms, namely breadth-first search (BFS) and depth-first search (DFS). These algorithms are described in the course textbook, Russell and Norvig, 3rd or 4th edition, Section 3.4. :::spoiler **Visualization Tool** You might find this [tool](https://qiao.github.io/PathFinding.js/visual/), which can be used to visualize search algorithms, helpful in understanding the difference between BFS and DFS. ::: Your task is to implement BFS and DFS on (undirected) graphs, which requires that you keep track of already visited states (to the extent that memory allows) to avoid revisits. :::spoiler **Hint: BFS vs. DFS** Although DFS is a recursive algorithm, it is very similar to BFS when viewed as an iterative algorithm instead. They differ only in their choice of data structure, a stack vs. a queue. For this, your first assignment, it suffices to implement iterative DFS, as doing so requires only a one-line change from BFS. ::: If you think about it for a moment, you may realize that not only is maze solving a search problem, but maze generation is also a search problem. Imagine an agent trying to move from the top left to the bottom right of a blank canvas. That would not make for a very interesting maze. Where should you draw walls and obstacles to make the maze more challenging? And how can you answer this question? Easy: search! The mazes in this assignment are generated using ... drumroll! ... (randomized) DFS! Instead of starting from a blank canvas, maze generation starts with an $m \times n$ sheet of graph paper (i.e., an $m \times n$ grid with all its walls). An agent traverses this grid using DFS, but randomizing the order of the agent's actions (north, south, east, or west). The agent moves in the next specified direction as long as there is a wall to erase on its path and it has not already visited that grid cell; otherwise, it backtracks. The algorithm terminates after visiting all cells, at which point it is guaranteed to have generated a **perfect maze**: i.e., one with exactly one path between any two cells. ![57186230-3e952180-6ea9-11e9-916f-1de455bb63ed](https://hackmd.io/_uploads/HyTvkDl9R.gif) ## Tasks Your first task is to implement BFS and DFS, and to use them to find paths through a maze. You second task is to conceptually formulate maze generation as a search problem. You can then optionally use (randomized BFS or) DFS to search for mazes, thereby generating mazes of your own. The search problems are coded as instances of `SearchProblem`, an abstract class with abstract methods `get_start_state`, `is_goal_state`, and `get_successors`. Although it is technically an abstract class in python, note that `SearchProblem` functions as an interface: i.e., any `SearchProblem` instance, such as a `Maze`, must implement these methods. This abstract class can be constructed from an optional `State`, which in the case of mazes, stores their dimensions. :::info **Task 1** Implement BFS (`bfs`) and DFS (`dfs`) in `search.py`. ::: :::success **Signatures** Each search method in `search.py` takes as input a `SearchProblem` and outputs an option, which is either a solution to this problem or `None`. When a solution is found, meaning a path through the search tree from the start state to a goal, it is returned, together with statistics summarizing the search. Regarding the statistics, we ask that you report 3 things: 1. `path_length`: the length of the solution path from the start to the goal 2. `states_expanded`: the total number of states expanded (a node is **expanded** when `get_successors` is called on it) 3. `max_frontier_size`: the maximum number of states on the search **frontier** (i.e., the maximum size of the data structure that stores the yet-to-be-visited states) The number of states expanded is a proxy for run time, while the maximum size of the frontier measures memory usage. ::: <!-- :::warning **Note!** Your implementations of BFS and DFS should be **problem agnostic**. They should work on any `SearchProblem`: e.g., mazes, routing problems, like finding a driving route from Providence to Boston or a flight from Boston to Tokyo, etc. ::: --> :::spoiler **Hint: BFS vs. DFS** The BFS and DFS implementations are very similar. The only difference is their data structure. So it's best to focus on them one at time, because once one of them is working, it should not be hard to get the other one working as well. ::: :::info **Task 2:** In your README, describe maze generation as a basic search problem. What are the states, actions, and transitions? What are the start and goal states? ::: :::info **Extra Credit: Task 3** Implement maze generation as search by implementing `MazeGenerator`, a class that implements the `SearchProblem` interface (i.e., `get_start_state`, `is_goal_state`, and `get_successors`) in `maze_generator.py`. Then run BFS or DFS on `MazeGenerator` to generate mazes and solve them. You can do so by executing the following command in your terminal: `python maze_generator.py`. **N.B.** You can modify the size of the mazes you generate and solve using the command-line `size` argument: `python maze_generator.py --size 10`. ::: <!-- ## Solution to Task 2 <font style="color: red"> To describe maze generation as a basic search problem, we need to identify the states, actions, transitions, start state, and goal state. 1. There are three parts to each state: 1. The location of the "agent" (i.e., the maze generator). Since the agent can only inhabit one cell at a time, the states are the set $S^1 = \{ 0, 1 \}^{m \times n}$ s.t. $S^1_{ij} = 1$ for only one pair $(i,j) \in [m] \times [n]$, while $S^1_{ij} = 0$ otherwise. 1. The walls of the maze, where they exist, represented as zeros or ones on cell boundaries, so $S^2 = \{ 0, 1 \}^{(m+1) \times (n+1)}$. 1. Indicator variables attached to all cells indicating whether they have been visited or not: $S^3 = \{ 0, 1 \}^{m \times n}$. Thus, the set of all states is $S = S^1 \times S^2 \times S^3$. 1. The actions are north, south, east, and west. 1. The agent moves to an adjacent cell in the direction specified. However, an agent can only move if it has not already visited the cell and there is a "temporary" wall in its way for it to erase, i.e., moving would not take it outside the confines of the grid. 1. The start state is a bounded grid: i.e., a grid with permanent walls all around it, with temporary walls inserted everwhere else, the agent in an arbitrary location (i.e., grid cell), and all cells marked as unvisited except the initial location of "the agent". 1. The goal state is a perfect maze: i.e., one with exactly one path between any two cells in the grid. There is a theorem that states that visiting all cells is a sufficient condition for reaching this goal, which reduces the goal state to exactly that: "all cells visited". </font> --> <!-- Below is the pseudocode for the maze generation algorithm: ```plaintext def drunken_walk(row, col): directions ← [north, south, east, west] shuffle the directions array set the cell at [row, col] to visited for each direction in directions: new_row, new_col ← row + Δ_row, col + Δ_col (based on the direction) if the cell is out of bounds then place a wall from the original coord to where you move > in my description, i start with boundaries all around the grid. then you just do nothing when a cell is out of bounds. else if the cell is unvisited then remove the wall in that direction from the original cell remove the wall in the opposite direction from the new cell drunken_walk(new_row, new_col) ``` --> ## Downloads Please click [here](https://classroom.github.com/a/Wyhrsi-l) to access the assignment code. :::warning **Nomenclature:** In CS410, code for all assignments comprises two parts: **support** code, which you should *not* edit, and **stencil** code, which you *must* edit to complete your assignments. You should also edit the **testing** code, by adding your own tests. ::: ### Support Code - `search_problem.py`: Provides the abstract base class `SearchProblem`, which functions as an interface for a basic search problem. It specifies three key methods: `get_start_state`, `is_goal_state`, and `get_successors`, which any instance of `SearchProblem` must implement. :::warning Note that `SearchProblem` depends on a `State`. Indeed, given a `State`, a `SearchProblem` defines the start, the goals, and how to transition among them. ::: - `maze.py`: Provides an implementation of `Maze` based on `MazeState`. A `MazeState` is a collection of `MazeRooms`, each of which is a grid cell with up to four walls. We implement `MazeState` as a `Tuple[Tuple[MazeRoom]]`. <!-- Because it is a `SearchProblem`, `Maze` implements `get_start_state`, `is_goal_state`, and `get_successors`. --> :::spoiler Why do we implement `MazeState` using tuples instead of lists? In python, tuples are immutable data structures, while lists are mutable. - **Immutability**: A `MazeState` is a immutable because its dimensions are constant. - **Performance**: Tuples are more efficient than lists when data is immutable (though lists may be more efficient when data is subject to frequent changes). - **Hashability**: Immutable objects like tuples are `Hashable`, so they can be used as dictionary keys or in sets. - **Design Clarity**: Using a tuple signals that a maze's structure should remain unchanged. ::: ### Stencil Code - `search.py:` Templates for search algorithms. This is where you will implement BFS and DFS. :::spoiler **Hint: Data Structures** `search.py` imports the queue module, which contains Queue, which as expected implements queues, and LifoQueue, which implements stacks. (It also imports `PriorityQueue`, which will come in handy when you work on HW 1.) ::: - `maze_generator.py`: Template for maze generation. This is where you can (optionally) code a `MazeGenerator` by implementing the `SearchProblem` interface. ### Testing Code Our implementation of `Maze` is parameterized by its dimension. You should take advantage of this parameterization while developing your code. Your algorithms should run more quickly on smaller mazes than on larger ones. - `unit_tests.py`: A file in which to write unit tests for your code. As an example, we have provided a simple test, which checks only that the start and goal states of a returned path are correct, and that it is of the correct length--when the length of the optimal path happens to be known. You should add more tests to further verify the correctness of your code. Does it succeed at finding solutions when they exist? And does it terminate gracefully when they do not? Does it find optimal solutions when it should? Etc. We use directed graphs to test your implementations of BFS and DFS, and you can too! - `directed_graph.py`: A file containing the class `DirectedGraph`. :::spoiler **Hint: Graph Data Structure** In the `DirectedGraph` class, graphs are represented as adjacency matrices. Each matrix $M$ has $n$ rows and $n$ columns, with each index $i \in \{ 1, \ldots, n \}$ representing a vertex, so that $M_{ij} = 1$ if there is an edge from $i$ to $j$, and $M_{ij} = 0$ otherwise. ::: :::info **Task** Describe your tests in words in your README. ::: :::warning **Test early and often!** And don't forget to write your tests *before* you write any code. Never, and we mean *never*, write tests based on the output of your program. ::: :::danger **⚠️WARNING⚠️** Your solution should modify the stencil and testing code *only*. You will be penalized for modifying the support code, especially if it interferes with the autograder. ::: ## Submission ### Handin Your handin should contain the following: - all files, including comments describing the logic of your implementations and tests - a README containing: - a summary of your tests, explaining their outcomes - your responses to any conceptual questions - known problems in your code - anyone you worked with - any outside resources used (eg. Stack Overflow, ChatGPT) ### Gradescope Submit your assignment via Gradescope. To submit through GitHub, follow this sequence of commands: 1. `git add -A` 2. `git commit -m "commit message"` 3. `git push` Now, you are ready to upload your repo to Gradescope. :::danger **⚠️WARNING⚠️** Make sure you have the correct assignment repo and branch selected before submitting. ::: *Tip*: If you are having difficulties submitting through GitHub, you may submit by zipping up your hw folder. ### Grading Your code will be graded on correctness. We will test it on a variety of mazes. You will receive no credit for your implementations if you call functions in an external library (e.g., NetworkX) for BFS or DFS. ### Rubric | Component | Points | Notes | | -------- | -------- | -------- | | BFS | 25 | Points awarded for passing tests| | DFS | 25 | Points awarded for passing tests| | Search Stats | 0 | Compares your stats to our implementations and reports the results. Your stats don't have to match ours perfectly, but if your stats are far off, we may manually regrade your solutions to check the inconsistoncies. | | Maze Generation | 30 | Points awarded for identifying components of maze generation as a search problem| | Tests | 10 | Points awarded for testing BFS and DFS implementations| | README | 10 | Points awarded for including all required components| | Extra Credit | 10 | Points awarded for coding maze generation using `MazeGenerator`, i.e., as a `SearchProblem`| :::success Congrats on submitting your homework; Steve is proud of you!! ![image](https://hackmd.io/_uploads/S1Um8iJ3C.png) ![image](https://hackmd.io/_uploads/S1OQ2aCwA.png) :::