Thuc Nguyen's ADSA Course - Lecture 10. Graph Traversal

## Lecture 10. Graph Traversal - Introduction to Graph Theory - Graph Representation - BFS - DFS - Graph Connectivity - Flood Fill --- ### Graph Theory - A graph is a set of **vertices** and **edges** that form connections between the vertices. - More formal: a graph $G$ is an ordered pair of a set $V$ of vertices and a set $E$ of edges given as $G = (V, E)$. --- #### Graph Theory - **Undirected** graph vs **Directed** graph (**digraph**) - A directed graph is a graph in which edges have **orientations**. <img src="https://cdn.ucode.vn/uploads/1/upload/rbOfidpv.png" class="element-left content-img" /> --- #### Graph Theory - **Simple** graph vs **Multigraph** - A multigraph is a generalization that allows **multiple edges** to have the same pair of endpoints. <img src="https://cdn.ucode.vn/uploads/1/upload/gyedyqBW.png" class="element-left content-img" /> --- #### Graph Theory - **Weighted** graph vs **Unweighted** - In a weighted graph, each edge is assigned a weight. <img src="https://cdn.ucode.vn/uploads/1/upload/ipyYEnLP.png" class="element-left content-img" /> --- #### Graph Terminologies <img src="https://cdn.ucode.vn/uploads/1/upload/uUQgAqZg.png" class="element-left content-img" /> - **Node** or **vertex**: A, B, C, D, and E. - **Edge**: a connection between two vertices - **Degree** of a vertex: This is the number of vertices that are adjacent to a given vertex. - **Adjacency**: This refers to the connection(s) between a node and its neighbor. - **Path**: A sequence of vertices where each adjacent pair is connected by an edge. - **Cycle**: A path, in which the first and last node are the same. - **Loop**: When an edge from a node is adjacent to itself, that edge forms a loop. --- #### Graph Connectivity - A graph is **connected** if there is a path between any two nodes. For example, the following graph is connected: <img src="https://cdn.ucode.vn/uploads/1/upload/kccdPLOI.png" class="element-left content-img" /> <img src="https://cdn.ucode.vn/uploads/1/upload/BHBanfrT.png" class="element-left content-img" /> --- #### Graph Connectivity - The connected parts of a graph are called its **components**. For example, the following graph contains three components: {1, 2, 3}, {4, 5, 6, 7} and {8}. <img src="https://cdn.ucode.vn/uploads/1/upload/qZguTLQl.png" class="element-left content-img" /> --- #### Graph Theory - A **tree** is a **connected graph** that consists of $n$ nodes and $n − 1$ edges. - There is a **unique path** between any two nodes of a tree. For example, the following graph is a tree: <img src="https://cdn.ucode.vn/uploads/1/upload/mMkRjxip.png" class="element-left content-img" /> --- #### Neighbors and degrees - Two nodes are **neighbors** or **adjacent** if there is an edge between them. - The **degree** of a node is the number of its neighbors. - The sum of degrees in a graph is always $2m$, where $m$ is the number of edges. <img src="https://cdn.ucode.vn/uploads/1/upload/yFzJFoVZ.png" class="element-left content-img" /> --- #### Neighbors and degrees - A graph is **regular** if the degree of every node is a constant $d$. - A graph is **complete** if the degree of every node is $n − 1$. - In a directed graph: - The **indegree** of a node is the number of edges that end at the node, - The **outdegree** of a node is the number of edges that start at the node. <img src="https://cdn.ucode.vn/uploads/1/upload/tXGGfVge.png" class="element-left content-img" /> --- ### Graph representation **Adjacency list**: A simple list, in which each index (represents a vertex), stores the adjacent nodes to that vertex. <img src="https://cdn.ucode.vn/uploads/1/upload/DjYMVRLo.png" class="element-left content-img" /> <img src="https://cdn.ucode.vn/uploads/1/upload/YvUSmwpA.png" class="element-left content-img" /> --- #### Graph representation **Adjacency list**: A dictionary/map can also be used: ```python= graph = dict() graph['A'] = ['B', 'C'] graph['B'] = ['E','A', 'C'] graph['C'] = ['A', 'B', 'E','F'] graph['E'] = ['B', 'C'] graph['F'] = ['C'] ``` --- #### Graph representation **Adjacency matrix**: <img src="https://cdn.ucode.vn/uploads/1/upload/DjYMVRLo.png" class="element-left content-img" /> <img src="https://cdn.ucode.vn/uploads/1/upload/fyAEbnIQ.png" class="element-left content-img" /> --- #### Graph representation **Adjacency matrix**: ```python= vertices = sorted(graph.keys()) n = len(vertices) adjacency_matrix = [([0] * n) for _ in range(n)] for i, v in enumerate(vertices): for neighbor in graph[v]: j = vertices.index(neighbor) adjacency_matrix[i][j] = adjacency_matrix[j][i] = 1 print(*adjacency_matrix, sep="\n") # [0, 1, 1, 0, 0] # [1, 0, 1, 1, 0] # [1, 1, 0, 1, 1] # [0, 1, 1, 0, 0] # [0, 0, 1, 0, 0] ``` --- #### Graph representation **Edge list**: <img src="https://cdn.ucode.vn/uploads/1/upload/DjYMVRLo.png" class="element-left content-img" /> [('A', 'B'), ('A', 'C'), ('B', 'E'), ('B', 'A'), ('C', 'A'), ('C', 'B'), ('C', 'E'), ('C', 'F'), ('E', 'B'), ('E', 'C'), ('F', 'C')] --- #### Graph representation **Edge list**: ```python= vertices = sorted(graph.keys()) edges_list = [] for i, v in enumerate(vertices): for neighbor in graph[v]: edges_list.append((v, neighbor)) print(edges_list) # [('A', 'B'), ('A', 'C'), ('B', 'E'), ('B', 'A'), # ('C', 'A'), ('C', 'B'), ('C', 'E'), ('C', 'F'), # ('E', 'B'), ('E', 'C'), ('F', 'C')] ``` --- ### Graph traversal - Traversal: keeping track of which vertices have already been **visited** and which ones have not. - Two popular approaches: - Breadth-first search - Depth-first search --- #### Breadth-first search - The BFS algorithm starts at a node (chooses as its root node) - And visits the **neighboring nodes**, after which it explores neighbors on the **next level** of the graph. --- #### Breadth-first search <img src="https://cdn.ucode.vn/uploads/1/upload/tonVPDiA.png" class="element-left content-img" /> --- #### Breadth-first search - Adjacency list: ```python= graph = dict() graph['1'] = ['2', '4'] graph['2'] = ['1', '3', '5'] graph['3'] = ['2', '6'] graph['4'] = ['1'] graph['5'] = ['2', '6'] graph['6'] = ['3', '5'] ``` --- #### Breadth-first search - Suppose that we start at node 1 - First, we process all nodes that can be reached from node 1 using a single edge: <img src="https://cdn.ucode.vn/uploads/1/upload/GFRXrrPH.png" class="element-left content-img" /> --- #### Breadth-first search - After that, we proceed to nodes 3 and 5: <img src="https://cdn.ucode.vn/uploads/1/upload/LJIqXNjN.png" class="element-left content-img" /> --- #### Breadth-first search - Finally, we visit node 6: <img src="https://cdn.ucode.vn/uploads/1/upload/mFqjKFXI.png" class="element-left content-img" /> --- #### Breadth-first search - Distances from the starting node: <img src="https://cdn.ucode.vn/uploads/1/upload/tonVPDiA.png" class="element-left content-img" /> <img src="https://cdn.ucode.vn/uploads/1/upload/DKIGTckg.png" class="element-left content-img" /> --- #### Breadth-first search <img src="https://cdn.ucode.vn/uploads/1/upload/omKkMkpU.png" class="element-left content-img" /> --- #### Breadth-first search <img src="https://cdn.ucode.vn/uploads/1/upload/aofrYjjB.png" class="element-left content-img" /> --- #### Breadth-first search <img src="https://cdn.ucode.vn/uploads/1/upload/wWkrQxuI.png" class="element-left content-img" /> --- #### Breadth-first search - $n$ is the number of nodes and $m$ is the number of edges. - Time complexity of BFS is $O(n + m)$ when using Adjacency list - When using Adjacency matrix: $O(n^2)$ - When using Edge list: $O(n \times m)$ --- #### Breadth-first search BFS Tree: <img src="https://cdn.ucode.vn/uploads/1/upload/jCRZlGcK.png" class="element-left content-img" /> --- #### Breadth-first search Let $G$ be an undirected or directed graph on which a BFS traversal starting at vertex $s$ has been performed. Then: - The traversal visits all vertices of $G$ that are reachable from $s$. - For each vertex $v$ at level $i$, the path of the **BFS tree T** between $s$ and $v$ has $i$ edges, and is the **shortest** path of G from s to v. - If $(u,v)$ is an edge that is not in the BFS tree, then the level number of $v$ can be **at most $1$** greater than the level number of $u$. --- #### Breadth-first search ```python= from collections import deque def breadth_first_search(graph, root): visited = {root: 0} # node: distance queue = deque([root]) while len(queue) > 0: s = queue.popleft() for u in graph[s]: if u not in visited: visited[u] = visited[s] + 1 queue.append(u) return visited graph = dict() graph['1'] = ['2', '4'] graph['2'] = ['1', '3', '5'] graph['3'] = ['2', '6'] graph['4'] = ['1'] graph['5'] = ['2', '6'] graph['6'] = ['3', '5'] print(breadth_first_search(graph, "1")) # {'1': 0, '2': 1, '4': 1, '3': 2, '5': 2, '6': 3} ``` --- ### Depth-first search - Traverse the **depth** of any particular path in the graph **before** traversing its **breadth**. - As such, child nodes are visited first before sibling nodes. --- #### Depth-first search Consider this graph: <img src="https://cdn.ucode.vn/uploads/1/upload/hgDBzFgH.png" class="element-left content-img" /> --- #### Depth-first search - Assume that we begin the search at node $1$. - The search first proceeds to node $2$: <img src="https://cdn.ucode.vn/uploads/1/upload/ahsnbRdo.png" class="element-left content-img" /> --- #### Depth-first search - After that, nodes $3$ and then $5$ will be visited: <img src="https://cdn.ucode.vn/uploads/1/upload/RcevZSpY.png" class="element-left content-img" /> --- #### Depth-first search - Neighbors of node $5$, $3$ and $2$ have all been visited, so we next move from node $1$ to node $4$. <img src="https://cdn.ucode.vn/uploads/1/upload/zkXvFGqx.png" class="element-left content-img" /> --- #### Depth-first search <img src="https://cdn.ucode.vn/uploads/1/upload/ifdgmIJT.png" class="element-left content-img" /> --- #### Depth-first search <img src="https://cdn.ucode.vn/uploads/1/upload/EATTXwkJ.png" class="element-left content-img" /> --- #### Depth-first search <img src="https://cdn.ucode.vn/uploads/1/upload/IXdMUdps.png" class="element-left content-img" /> --- #### Depth-first search DFS Tree: <img src="https://cdn.ucode.vn/uploads/1/upload/FdogIFPN.png" class="element-left content-img" /> --- #### Depth-first search - $n$ is the number of nodes and $m$ is the number of edges. - Time complexity of DFS is $O(n + m)$ when using Adjacency list - When using Adjacency matrix: $O(n^2)$ - When using Edge list: $O(n \times m)$ --- #### Depth-first search ```python= def depth_first_search(graph, root): visited = [] # can use set/dict for better performance def dfs(s): if s in visited: return visited.append(s) for u in graph[s]: dfs(u) dfs(root) return visited graph = dict() graph['1'] = ['2', '4'] graph['2'] = ['1', '3', '5'] graph['3'] = ['2', '5'] graph['4'] = ['1'] graph['5'] = ['2', '3'] print(depth_first_search(graph, "1")) ``` --- ### Graph Connectivity - A graph is **connected** if there is a path between any two nodes. - The connected parts of a graph are called its **components**. For example, the following graph contains three components: {1, 2, 3}, {4, 5, 6, 7} and {8}. <img src="https://cdn.ucode.vn/uploads/1/upload/qZguTLQl.png" class="element-left content-img" /> --- #### Cut vertex - A single vertex whose removal **disconnects** a graph is called a cut-vertex. - Let $G$ be a connected graph. A vertex $v$ of $G$ is called a **cut vertex** (khớp) of $G$, if $G-v$ (Remove $v$ from $G$) results a disconnected graph. <img src="https://cdn.ucode.vn/uploads/1/upload/fTLjcTaW.png" class="element-left content-img" /> --- #### Cut vertex - The right-most node in the gray area on the left is a cut vertex. <img src="https://cdn.ucode.vn/uploads/1/upload/fTLjcTaW.png" class="element-left content-img" /> --- #### Cut vertex - A connected graph $G$ may have maximum $n-2$ cut vertices. - Removing a vertex may increase the number of components in a graph by at least one. - Every non-pendant vertex of a **tree** is a cut vertex. Note: - A vertex v of G is called a pendant (lủng lẳng, lòng thòng) vertex if and only if v has degree 1. - Non-Pendant Vertices are the vertices that have degrees other than 1. - An edge of a graph is said to be a pendant edge if and only if one of its vertices is a pendant vertex. - An edge of a graph is said to be a non-pendant edge if it does not contain a pendant vertex as one of its vertexes. --- #### Cut edge (bridge) - A **cut-edge** or **bridge** is a single edge whose removal disconnects a graph. - Let $G$ be a connected graph. An edge $e$ of $G$ is called a cut edge of $G$, if $G-e$ (Remove $e$ from $G$) results a disconnected graph. <img src="https://cdn.ucode.vn/uploads/1/upload/KSxWHDfN.png" class="element-left content-img" /> --- #### Cut edge (bridge) - A connected graph $G$ may have at most $n-1$ cut edges. - Removal of a cut edge may increase the number of components in a graph by at most one. - A cut edge $e$ must not be the part of any cycle in G. - If a cut edge exists, then a cut vertex must also exist because at least one vertex of a cut edge is a cut vertex. - If a cut vertex exists, then the existence of any cut edge is not necessary. --- #### Cut Set In a connected graph $G$, a **cut set** is a set $S$ of **edges** with the following properties: - The removal of **all** the edges in $S$ **disconnects** $G$. - The removal of **some** of edges (but not all) in $S$ **does not disconnect** $G$. <img src="https://cdn.ucode.vn/uploads/1/upload/rptJWaGR.png" class="element-left content-img" /> --- #### Edge Connectivity - The **edge connectivity** of a connected graph $G$ is the minimum number of edges whose removal makes $G$ disconnected. - It is denoted by $λ(G)$. When $λ(G) = k$, then graph $G$ is said to be **k-edge-connected**. It remains connected whenever fewer than $k$ edges are removed. <img src="https://cdn.ucode.vn/uploads/1/upload/ksNAeSGz.png" class="element-left content-img" /> Note: 2-edge-connected graph --- #### Vertex Connectivity - The connectivity (or **vertex connectivity**) of a connected graph $G$ is the minimum number of vertices whose removal makes G disconnects or reduces to a trivial graph. - It is denoted by $K(G)$. When $K(G) = k$, then graph $G$ is said to be **k-vertex-connected**. The graph remains connected whenever fewer than $k$ vertices are removed. <img src="https://cdn.ucode.vn/uploads/1/upload/esEZtidY.png" class="element-left content-img" /> Note: A graph with connectivity 4. --- #### Connectivity in directed graph - A **directed graph** is **strongly connected** if there is a path from each vertex to another vertex. - **Unilaterally Connected**: every pair of vertices $u$, $v$, there is a directed path from $u$ to $v$ **OR** a directed path from $v$ to $u$. - **Weakly Connected**: A directed graph is weakly connected the underlying undirected graph is connected. --- #### Connectivity in directed graph <img src="https://cdn.ucode.vn/uploads/1/upload/TvTRrWii.png" class="element-left content-img" /> <img src="https://cdn.ucode.vn/uploads/1/upload/aOENBzqs.png" class="element-left content-img" /> <img src="https://cdn.ucode.vn/uploads/1/upload/IAGuOamn.png" class="element-left content-img" /> --- #### Connectivity in directed graph <img src="https://cdn.ucode.vn/uploads/1/upload/RYAJUHbH.png" class="element-left content-img" /> --- #### Strongly connected components <img src="https://cdn.ucode.vn/uploads/1/upload/coMMQqny.png" class="element-left content-img" /> --- ### Flood fill technique - Find (shortest) path between cells in a matrix. - Intuitive and simple implementation. - Can do without knowledge of graph theory. - Use the same technique as in BFS. <img src="https://cdn.ucode.vn/uploads/1/upload/OofTiQhS.png" class="element-left content-img" /> --- ### Flood fill technique - Find shortest path between two cells in a maze (may have obstacles) <img src="https://cdn.ucode.vn/uploads/1/upload/lSppOjAK.png" class="element-left content-img" /> --- ### Flood fill technique - Find shortest path between two cells in a maze (may have obstacles) <img src="https://cdn.ucode.vn/uploads/1/upload/qvPfeTRY.png" class="element-left content-img" /> --- ### Flood fill technique - Queue-based implementation ```python= dirs = [(-1, 0), (1, 0), (0, -1), (0, 1)] queue = deque([(x0, y0)]) while queue: cur_cell = queue.popLeft() for (dx, dy) in dirs: visit(cur_cell[0] + dx, cur_cell[1] + dy) queue.append([cur_cell[0] + dx, cur_cell[1] + dy]) ``` --- ## The end ---