# Introduction to Graphs - EDITS # Introduction Graphs have widespread use within software engineering. They're a powerful data structure utilized to represent relationships between different types of data. In a graph, data points are stored as a **nodes** and each relationship between two data points is represented by an **edge**. For example, a social network is a graph in which people are nodes and friendships between two people are considered edges. Graphs are best suited for problems in which there are binary relationships between objects. Once a problem can be represented as a graph, the problem can usually be solved using one of the key graph algorithms. For interviews, it is vital to know how to implement a graph, basic graph traversals (BFS, DFS) and how to topologically sort graph(s). # Graph Terminology ## Graph components Graphs consist of a set of.. * **vertices**, also referred to as nodes * Nodes that are directly connected by an edge are commonly referred to as **neighbors**. * **edges**, connections between pairs of vertices <img src="https://lh3.google.com/u/0/d/1pjpXfzKUMLtCMbN-lajELJtxxouGmOYG=w2312-h1560-iv1" width="472" height="350"/> ## Graph types ### Directed & undirected graphs A **directed** graph is a graph in which all edges have directions. An example of a directed edge is a one way street. An **undirected** graph is a graph in which all edges do not have a direction. An example of this is a friendship! <img src="https://lh3.google.com/u/0/d/1DFNg6q729fy_8HuufwSggfi6EzTGiBrs=w2312-h1560-iv1" width="486" height="380"/> ### Cyclic & acyclic graphs Before reviewing what cyclic and acyclic graphs are, there are two key terms to cover: **paths** and **cycles**. A **path** is a sequence of vertices connected by edges and a **cycle** is a path whose first and last vertices are the same. A **cyclic** graph contains at least one cycle. An **acyclic** graph has no cycles. <img src="https://lh3.google.com/u/0/d/1sGBAQhiOBf--NkM9cOAjAWAJ3d2LaUd2=w2312-h1560-iv1" width="486" height="380"/> **Directed acylic graphs (DAGs)**, directed graphs without cycles, commonly appear in interview questions. In a DAG, the following terms are often used to denote nodes with special properties: * **Sink** nodes have no outgoing edges, only incoming edges * **Source** nodes have no incoming edges, only outgoing edges # Graph representations ## Adjacency lists An **adjacency list** is the most common way to represent a graph. In an adjacency list, each node stores a list of its adjacent vertices. For undirected graphs, each edge from _u_ to _v_ is stored twice: once in _u_'s list of neighbors and once in _v_'s list of neighbors. <img src="https://lh3.google.com/u/0/d/1C_BKwFN-Uny1zkZIxN1tEtKoq9czJ__5=w2312-h1560-iv1" width="647" height="300"/> ## Edge sets/ lists An **edge set** simply represents a graph as a collection of all its edges. <img src="https://lh3.google.com/u/0/d/12yCXaO4bD97hoOzZ5TTX6ITW_RLbgF10=w2312-h1560-iv1" width="647" height="300"/> ## Adjacency matrix An **adjacency matrix** represents a graph with _n_ nodes as a _n_ by _n_ boolean matrix, in which matrix[_u_][_v_] is set to true if an edge exists between node _u_ to node _v_. <img src="https://lh3.google.com/u/0/d/1VBnJtc6nnYFA-qdkr2A8dXCA7eQkFWfp=w2312-h1560-iv1" width="685" height="350"/> This representation of a graph is efficient for checking if an edge exists between a pair of vertices. However, it may be less efficient for search algorithms because it requires iterating through all the nodes in the graph to identify a node's neighbors. ## Runtime Analysis Below is a chart of the most common graph operations and their runtimes for each of the graph representations. In the chart below, _V_ represents the number of verticies in the graph and _E_ represents the number of edges in the graph. | Representation | Getting all adjacent edges for a vertex| Traversing entire graph | hasEdge(u, v) | Space | | -------- | -------- | -------- | -------- | -------- | | **Adjacency matrix** | O(V) | O(V<sup>2</sup>) | O(1) | O(V<sup>2</sup>)| | **Edge Set** | O(E) | O(E) | O(E) | O(E) | **Adjacency List** | O(1) | O(V + E) | O(max number of edges a vertex has) | O(E + V) Credit: [UC Berkeley data structures course](https://docs.google.com/presentation/d/1GOOt1Ierm9jJFq9o26uRW20GdU6E5hrAZvsoQIreJew)