{%hackmd theme-dark %}
F74077138 Irina Dabaeva
F74067036 Leonardo Toshi Elementi
# Graphs
Graphs are discrete structures consisting of vertices and edges that connect these vertices. There are many different types of graphs, such as connected and disconnected graphs, bipartite graphs, weighted graphs, directed and undirected graphs, and simple graphs. They are used in a variety of areas. In computer science, for example, graphs are used to represent networks, data organization, and the flow of computation.
Here is an example of a nondeterministic finite automaton:

## Graph Theory
A **graph** is a pair G = (V, E), where V is a set whose elements are called *vertices*, and E is a set of paired vertices, whose elements are called *edges*.
Here is an example of a graph. This graph is called a *simple graph*.

A **simple graph** contains at most one edge between any two vertices, and no edge connects a vertex to itself. In other words, a simple graph is a graph without loops or multiple edges.
Let V be the set of vertices and E be the set of edges. We say that the pair (V, E) determines a **multigraph** G with vertex set V and edge set E if, for some x, y ∈ V, there are two or more edges in E of the form:
- (x, y) for a directed multigraph,
- {x, y} for an undirected multigraph.
Clearly, a simple graph is not a multigraph; we can also say that it is *loop-free*.
Unlike a simple graph, a **pseudograph** allows loops as well as multiple edges.

We mentioned directed and undirected graphs earlier, and it is time to define these terms.
Let V be a finite nonempty set, and let E ⊆ V × V. The pair (V, E) is called a **directed graph** (or **digraph**), where V is the set of vertices and E is the set of directed edges, each associated with an ordered pair of vertices. The directed edge associated with (a, b) starts at a and ends at b. We denote a directed graph as G = (V, E).
In a directed graph:
- the *in-degree* of a vertex v, written deg-(v), is the number of edges with v as their terminal vertex;
- the *out-degree* of a vertex v, written deg+(v), is the number of edges with v as their initial vertex.
When there is no concern about edge directions, we still write G = (V, E), but in this case E is a set of unordered pairs of elements of V, and G is called an **undirected graph**.
 
The figure above (on the left-hand side) provides an example of a directed graph on V = {a, b, c, d, e} with E = {(a, a), (a, b), (a, d), (b, c)}. For any edge, such as (b, c), we say that the edge is *incident* with vertices b, c; b is said to be *adjacent* to c, whereas c is *adjacent* from b. In addition, vertex b is called the *source* of the edge (b, c), and vertex c is the *terminating* vertex. The edge (a, a) is an example of a *loop*, and the vertex e, which has no incident edges, is called an *isolated* vertex.
The *degree* of a vertex is the number of edges connecting it. In the example below, vertex a has degree 3, and the rest have degree 1. A vertex that has degree 1 is called an *end vertex*.

You may also notice a number associated with each edge. These are the weights. The *weight* of an edge is often referred to as the "cost" of the edge. So we can say it is a **weighted graph**: a graph in which each edge is given a numerical weight.
**Handshaking Theorem:**
The handshaking theorem states that the sum of the degrees of the vertices of a graph is twice the number of edges: Σ(v ∈ V) deg(v) = 2|E|.

**Proof:**
Since the degree of a vertex is the number of edges incident with that vertex, the sum of the degrees counts the total number of times an edge is incident with a vertex. Since every edge is incident with exactly two vertices, each edge gets counted twice, once at each end. Thus the sum of the degrees equals twice the number of edges.
**Corollary:**
- In a graph, the total number of odd degree vertices is even.
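The theorem and its corollary are easy to check numerically. Here is a minimal sketch in Python; the edge list is a made-up example chosen only for illustration:

```python
# Verify the handshaking theorem on a small (hypothetical) example graph.
edges = [("a", "b"), ("a", "c"), ("a", "d"), ("b", "c")]

# Count the degree of each vertex: each edge contributes 1 to both endpoints.
degree = {}
for u, v in edges:
    degree[u] = degree.get(u, 0) + 1
    degree[v] = degree.get(v, 0) + 1

# Handshaking theorem: the degrees sum to twice the number of edges.
assert sum(degree.values()) == 2 * len(edges)

# Corollary: the number of odd-degree vertices is even (here: a and d).
odd = [v for v in degree if degree[v] % 2 == 1]
assert len(odd) % 2 == 0
```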
Let x, y be (not necessarily distinct) vertices in an undirected graph G = (V, E). An x-y *walk* in G is a (loop-free) finite alternating sequence of vertices and edges from G, starting at vertex x and ending at vertex y and involving n edges e(i) = {x(i-1), x(i)}, where 1 <= i <= n.
The *length* of this walk is n, the number of edges in the walk. When n = 0, there are no edges, x = y, and the walk is called *trivial*.
Any x-y walk where x = y (and n > 1) is called a *closed* walk. Otherwise the walk is *open*.
There are two special types of walks. Consider any x-y walk in an undirected graph G = (V, E):
- if no edge in the x-y walk is repeated, then the walk is called an x-y *trail*. A closed x-x trail is called a *circuit*;
- if no vertex of the x-y walk occurs more than once, then the walk is called an x-y *path*. When x = y, the term *cycle* is used to describe such a closed path.
**Example:**

Walk: A -> B -> C -> D -> B -> A -> C

Circuit: A -> B -> C -> D -> A

Path: A -> B -> C -> D
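These distinctions can be checked mechanically. The sketch below classifies a vertex sequence as a walk, trail, or path; the edge set is an assumption (the original figure is not shown here), chosen so the example sequences are valid:

```python
# Assumed undirected edge set for the example above (figure not available).
edges = {frozenset(e) for e in
         [("A", "B"), ("B", "C"), ("C", "D"), ("D", "B"), ("A", "C"), ("D", "A")]}

def classify(seq):
    """Classify a vertex sequence as a path, trail, walk, or not a walk."""
    steps = [frozenset((seq[i], seq[i + 1])) for i in range(len(seq) - 1)]
    if any(s not in edges for s in steps):
        return "not a walk"
    if len(set(seq)) == len(seq):
        return "path"   # no vertex repeated
    if len(set(steps)) == len(steps):
        return "trail"  # no edge repeated (a vertex may repeat)
    return "walk"

assert classify(["A", "B", "C", "D", "B", "A", "C"]) == "walk"   # edge A-B reused
assert classify(["A", "B", "C", "D", "A"]) == "trail"            # a closed trail (circuit)
assert classify(["A", "B", "C", "D"]) == "path"
```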

**Theorem:**
Let G = (V, E) be an undirected graph, with a, b ∈ V, a ≠ b. If there exists a trail (in G) from a to b, then there is a path (in G) from a to b.
**Proof:**
Since there is a trail from a to b, we select one of shortest length, say {a, x(1)}, {x(1), x(2)}, ..., {x(n), b}. If this trail is not a path, we have the situation {a, x(1)}, {x(1), x(2)}, ..., {x(k-1), x(k)}, {x(k), x(k+1)}, {x(k+1), x(k+2)}, ..., {x(m-1), x(m)}, {x(m), x(m+1)}, ..., {x(n), b}, where k < m and x(k) = x(m), possibly with k = 0 and a (= x(0)) = x(m), or m = n+1 and x(k) = b (= x(n+1)). But then we have a contradiction, because {a, x(1)}, {x(1), x(2)}, ..., {x(k-1), x(k)}, {x(m), x(m+1)}, ..., {x(n), b} is a shorter trail from a to b.
Let G = (V, E) be an undirected graph or multigraph with no isolated vertices. Then G is said to have an *Euler circuit* if there is a circuit in G that traverses every edge of the graph exactly once.
*Eulerian path* is a trail in a finite graph that visits every edge exactly once (allowing for revisiting vertices).
A *Hamiltonian path*, also called a *Hamilton path*, is a graph path between two vertices of a graph that visits each vertex exactly once.
A *Hamiltonian circuit* is a circuit that visits every vertex once with no repeats.
A **complete graph** on n vertices, denoted K(n), is a simple graph that contains exactly one edge between each pair of distinct vertices.

In graph theory, the **hypercube graph** Q(n) is the graph formed from the vertices and edges of an n-dimensional hypercube. For instance, the cubical graph Q(3) is the graph formed by the 8 vertices and 12 edges of a three-dimensional cube. Q(n) has 2^n vertices and n·2^(n-1) edges.
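These counts (and the regularity and bipartiteness properties discussed below) can be verified by building Q(n) directly; vertices are the n-bit strings, with an edge between strings that differ in exactly one bit:

```python
from collections import Counter
from itertools import product

def hypercube(n):
    """Build Q(n): n-bit string vertices, edges join strings differing in one bit."""
    vertices = ["".join(bits) for bits in product("01", repeat=n)]
    edges = [(u, v) for i, u in enumerate(vertices) for v in vertices[i + 1:]
             if sum(a != b for a, b in zip(u, v)) == 1]
    return vertices, edges

V, E = hypercube(4)
assert len(V) == 2 ** 4            # 2^n vertices
assert len(E) == 4 * 2 ** (4 - 1)  # n * 2^(n-1) edges

# Q(n) is n-regular: every vertex lies on exactly n edges.
deg = Counter(v for e in E for v in e)
assert all(d == 4 for d in deg.values())

# Q(n) is bipartite: every edge joins an even-parity string to an odd-parity one.
assert all(u.count("1") % 2 != v.count("1") % 2 for u, v in E)
```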
Here is an example of the Q(4) hypercube:

Properties of the hypercube:
- Q(n) is *n-regular*. A **regular graph** is a graph in which each vertex has the same number of neighbors. A regular graph with vertices of degree k is called a k-regular graph or a regular graph of degree k. Also, by the handshaking lemma, a regular graph contains an even number of vertices of odd degree;

- Q(n) is *bipartite*. A simple graph is called **bipartite** if its vertex set, V, can be partitioned into two disjoint subsets such that each edge connects a vertex from one subset to a vertex of the other.

- Every hypercube Q(n) with n > 1 has a *Hamiltonian cycle*, a cycle that visits each vertex exactly once.
**Proof:**
We prove that any hypercube of dimension n >= 2 has a Hamiltonian cycle using induction.
*Base case:*
The 2-dimensional hypercube is shown below:

It has 4 edges connected in a cycle; this is itself the Hamiltonian cycle.
*Inductive step:*
We will assume that an n-dimensional hypercube has a Hamiltonian cycle, say v(1), ..., v(2^n). The (n+1)-dimensional hypercube C(n+1) is formed from two n-dimensional hypercubes, say C(n) with vertices v(i) and D(n) with vertices w(i), for i = 1, ..., 2^n. Then to the union of C(n) and D(n), we add edges connecting v(i) to w(i) for each i, to form the (n+1)-dimensional hypercube C(n+1). The case of a 3D cube formed from two squares is shown here:

Now, we claim that the cycle v(1), ..., v(2^n), w(2^n), w(2^n − 1), ..., w(1) is a Hamiltonian cycle. This is shown in the example above. We must check that all vertices appear, no vertex is repeated, and every vertex is adjacent to the previous one. Most edges are contained in either C(n) or D(n), so for them this follows from the inductive hypothesis. The only new edges are those joining the pairs (v(2^n), w(2^n)) and (v(1), w(1)). But these exist by the construction of C(n+1). Hence this is a Hamiltonian cycle.
*Euler’s Theorem:*
1. If a graph has more than 2 vertices of odd degree then it has no Euler paths;
2. If a graph is connected and has 0 or exactly 2 vertices of odd degree, then it has at least one Euler path;
3. If a graph is connected and has 0 vertices of odd degree, then it has at least one Euler circuit.
A graph is *Eulerian* iff it is connected and every vertex has even degree. The k-dimensional hypercube is connected and every vertex has degree k; hence the hypercube is Eulerian iff k is even. An Eulerian circuit is an Eulerian path that starts and ends on the same vertex, so if a hypercube has an Eulerian circuit, it also has an Eulerian path.
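The even-degree condition is easy to test in code. A minimal sketch (the helper name `is_eulerian` is ours, and it assumes the input graph is already connected):

```python
from collections import Counter

def is_eulerian(edges):
    """A connected graph is Eulerian iff every vertex has even degree.
    Connectivity is assumed and not checked here."""
    deg = Counter(v for e in edges for v in e)
    return all(d % 2 == 0 for d in deg.values())

# A 4-cycle: every degree is 2, so an Euler circuit exists.
assert is_eulerian([("a", "b"), ("b", "c"), ("c", "d"), ("d", "a")])

# A path a-b-c: two odd-degree vertices, so no Euler circuit
# (though an Euler path from a to c exists).
assert not is_eulerian([("a", "b"), ("b", "c")])
```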
We call graph G **connected** if there is a path between any two distinct vertices of G. A graph that is not connected is called **disconnected**.
 
If G = (V, E) is a graph (directed or undirected), then G1 = (V1, E1) is called a *subgraph* of G if ∅ != V1 ⊆ V and E1 ⊆ E, where each edge in E1 is incident with vertices in V1.

Given a (directed or undirected) graph G = (V, E), let G1 = (V1, E1) be a subgraph of G. If V1 = V, then G1 is called a *spanning subgraph* of G.

The idea of a subgraph lets us define the *complement* of an undirected loop-free graph.
Let G be a loop-free graph on n vertices. The *complement of G*, denoted G', is a subgraph of K(n) consisting of the n vertices in G and all edges that are not in G.

Let G1 = (V1, E1) and G2 = (V2, E2) be two undirected graphs. A function f: V1->V2 is called a *graph isomorphism* if
- f is one-to-one and onto, and
- for all a, b ∈ V1, {a, b} ∈ E1 iff {f(a), f(b)} ∈ E2.
When such a function exists, G1 and G2 are called **isomorphic graphs**.

For the graph above, the function f defined by
f(A) = E, f(B) = F, f(C) = G, f(D) = H
provides an isomorphism.
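For small graphs, the definition can be tested directly by trying every bijection V1 -> V2; a brute-force sketch (the graphs below are made-up examples, not the ones in the figure):

```python
from itertools import permutations

def isomorphic(v1, e1, v2, e2):
    """Brute-force graph isomorphism: try every bijection f: V1 -> V2 and
    check that {a, b} is an edge iff {f(a), f(b)} is an edge."""
    e1 = {frozenset(e) for e in e1}
    e2 = {frozenset(e) for e in e2}
    if len(v1) != len(v2) or len(e1) != len(e2):
        return False
    for perm in permutations(v2):
        f = dict(zip(v1, perm))
        if {frozenset((f[a], f[b])) for a, b in e1} == e2:
            return True
    return False

# A relabeled 4-cycle is isomorphic to a 4-cycle; a 4-vertex path is not.
c4 = (["A", "B", "C", "D"], [("A", "B"), ("B", "C"), ("C", "D"), ("D", "A")])
c4r = (["E", "F", "G", "H"], [("E", "F"), ("F", "G"), ("G", "H"), ("H", "E")])
p4 = (["E", "F", "G", "H"], [("E", "F"), ("F", "G"), ("G", "H")])
assert isomorphic(*c4, *c4r)
assert not isomorphic(*c4, *p4)
```

This is exponential in |V| and only practical for tiny graphs, but it follows the definition exactly.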
A graph (or multigraph) G is called **planar** if G can be drawn in the plane with its edges intersecting only at vertices of G. In other words, no two edges cross each other.
Here is the example of planar graph:

*Euler's formula* states that if a finite, connected, planar graph is drawn in the plane without any edge intersections, and V is the number of vertices, E is the number of edges and F is the number of faces (regions bounded by edges, including the outer, infinitely large region), then **V - E + F = 2**.
**Proof of Euler's formula:**
We prove V + F = E + 2 by induction on V.
Induction basis:
- If V = 1 then the graph consists of E loops.
- The number of faces in the graph is F = 1 + E.
- We indeed have V + F = E + 2.
Induction Step:
- We consider the case of V > 1.
- Since G is connected, there exists an edge {a, b} that is not a loop.
- We contract {a, b}, but without merging parallel edges.
- We obtain a graph G' with V − 1 vertices, E − 1 edges, and F faces.
- By applying the induction hypothesis to G', we have (V − 1) + F = (E − 1) + 2.
- Increasing both sides by 1 completes the proof.
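The formula is easy to spot-check numerically. A tiny sketch, with face counts (including the outer, unbounded region) taken from standard planar drawings:

```python
# Spot-check Euler's formula V - E + F = 2 on two standard planar graphs.
examples = {
    "cube": (8, 12, 6),  # 8 vertices, 12 edges, 6 faces
    "K4":   (4, 6, 4),   # 4 vertices, 6 edges, 4 faces (3 bounded + outer)
}
for name, (V, E, F) in examples.items():
    assert V - E + F == 2, name
```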
In graph theory, *graph coloring* is a special case of graph labeling; it is an assignment of labels traditionally called "colors" to elements of a graph subject to certain constraints. In its simplest form, it is a way of coloring the vertices of a graph such that no two adjacent vertices are of the same color; this is called a vertex coloring. Similarly, an edge coloring assigns a color to each edge so that no two adjacent edges are of the same color, and a face coloring of a planar graph assigns a color to each face or region so that no two faces that share a boundary have the same color.
Here is the example of vertex graph coloring:

Computer networks are one application of graph coloring. We can use a vertex coloring algorithm to find a proper coloring of a map with four colors. A vertex coloring algorithm may also be used to assign at most four different frequencies in GSM (Global System for Mobile Communications, originally Groupe Spécial Mobile) mobile phone networks.
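One simple way to produce a proper vertex coloring is the greedy strategy: scan the vertices in some order and give each the smallest color not used by an already-colored neighbor. This yields a proper (though not necessarily optimal) coloring; a sketch with a made-up 5-cycle:

```python
def greedy_coloring(adj):
    """Greedy proper vertex coloring: colors are the integers 0, 1, 2, ..."""
    color = {}
    for v in adj:
        used = {color[u] for u in adj[v] if u in color}
        color[v] = next(c for c in range(len(adj)) if c not in used)
    return color

# A 5-cycle (an odd cycle) needs 3 colors; greedy finds a proper coloring.
adj = {"a": ["b", "e"], "b": ["a", "c"], "c": ["b", "d"],
       "d": ["c", "e"], "e": ["d", "a"]}
col = greedy_coloring(adj)
# Proper coloring: no two adjacent vertices share a color.
assert all(col[u] != col[v] for u in adj for v in adj[u])
```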
---
## Graph Algorithms
### Motivation
- As more and more people enjoy online shopping due to lockdown, we were curious about how packages could be delivered efficiently.
### Representations of graphs
- We can choose between two standard ways to represent a graph G = (V,E): as a collection of adjacency lists or as an adjacency matrix. Either way applies to both directed and undirected graphs.
***Adjacency-list representation***: provides a compact way to represent *++sparse++* graphs (those for which |E| is much less than |V|^2^); it is usually the method of choice.
***Adjacency-matrix representation***: preferred when the graph is *++dense++* (|E| is close to |V|^2^), or when we need to be able to tell quickly whether there is an edge connecting two given vertices.

Two representations of an undirected graph. ( a ) An undirected graph G with 5 vertices and 7 edges. ( b ) An adjacency-list representation of G. ( c ) The adjacency-matrix representation of G

Two representations of a directed graph. ( a ) A directed graph G with 6 vertices and 8 edges. ( b ) An adjacency-list representation of G. ( c ) The adjacency-matrix representation of G.
### Dijkstra's Algorithm
- We thought we could use this algorithm's characteristics to find the shortest path for a delivery from point A to point B.
#### Example
- Suppose we want to move from T to Y.

1. Vertex Y has distance 0 from Y.
2. Find all of the vertices that connect to Y.
3. Among the paths found in step 2, pick the shortest (closest to the end); in this case Y-P, 76.
4. Then find the shortest path that leads toward T. In this case, P-E is 96 and P-MR is 27; path P-MR seems to be the one, but path Y-MR (96) is shorter than path Y-P-MR (103). Therefore, path P-E (96) is chosen.
5. Choose the next shortest path from step 2; in this case Y-MR, 96.
6. From step 5, find the shortest path that leads toward T. We can see that path MR-A (79) is the shortest.
7. As in step 5, choose the next shortest path from step 2; in this case, the next is path Y-NB, 104.
8. From step 7, find the shortest path that leads toward T. We can see that path NB-A (36) is the shortest.
9. Now we have found two paths that start from Y and lead to A: Y-MR-A (96 + 79 = 175) and Y-NB-A (104 + 36 = 140). Y-NB-A is clearly shorter, so we replace path Y-MR-A with Y-NB-A, because we are looking for the shortest path and not the others.
10. As in step 4, we look for the next vertices that lead to T; in this case, vertex A and vertex E.
11. Comparing path A-T (20) and E-T (57), path A-T is shorter, so we choose A-T. As we found previously, the shortest path from Y to A is 140, so Y-NB-A-T (104 + 36 + 20 = 160) is shorter than path Y-P-E (172), which implies Y-NB-A-T is the shortest path.
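The walkthrough above can be reproduced with an actual implementation of Dijkstra's algorithm. The sketch below uses Python's `heapq` as the priority queue; the adjacency dictionary contains only the edges and weights mentioned in the steps above (the full map is in the figure, so treat this graph as an assumption):

```python
import heapq

# Edge weights taken from the walkthrough (assumed to be the whole graph).
graph = {
    "Y":  {"P": 76, "MR": 96, "NB": 104},
    "P":  {"Y": 76, "MR": 27, "E": 96},
    "MR": {"Y": 96, "P": 27, "A": 79},
    "NB": {"Y": 104, "A": 36},
    "A":  {"MR": 79, "NB": 36, "T": 20},
    "E":  {"P": 96, "T": 57},
    "T":  {"A": 20, "E": 57},
}

def dijkstra(graph, source):
    """Single-source shortest paths with a binary-heap priority queue."""
    dist = {source: 0}
    prev = {}
    pq = [(0, source)]
    while pq:
        d, u = heapq.heappop(pq)
        if d > dist.get(u, float("inf")):
            continue  # stale queue entry, already relaxed via a shorter path
        for v, w in graph[u].items():
            if d + w < dist.get(v, float("inf")):
                dist[v] = d + w
                prev[v] = u
                heapq.heappush(pq, (d + w, v))
    return dist, prev

dist, prev = dijkstra(graph, "Y")
assert dist["T"] == 160  # Y -> NB -> A -> T: 104 + 36 + 20

# Reconstruct the path by following predecessors back from T.
path = ["T"]
while path[-1] != "Y":
    path.append(prev[path[-1]])
assert path[::-1] == ["Y", "NB", "A", "T"]
```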
### Dijkstra's Algorithm on a table
#### Example
- Using the following table, find the shortest time to ship from Baltimore to Bakersfield.


1. As in the previous example, we start from the destination, Bakersfield, and mark its distance as 0.
2. Two vertices lead to Bakersfield: Denver and Dallas. We choose the shorter path, which goes through Denver.
3. Following step 2, Denver is connected to two vertices that lead toward Baltimore, Atlanta and Chicago, and we choose whichever gives the shorter path, which is Chicago.
4. From Chicago there are two options, Atlanta and Baltimore; however, going to Atlanta would mean going back, so we remove Atlanta as an option and choose Baltimore, which completes the path.
5. Going back to step 2, we now choose Dallas.
6. From Dallas, we choose Atlanta, because it is the shortest and Bakersfield-Dallas-Chicago (43) is longer than Bakersfield-Denver-Chicago (37).
7. Now, from Atlanta, as in step 4, choosing Chicago would mean going back, so we choose Baltimore.
8. Finally, the shortest path is Denver-Chicago-Baltimore, with total time 52.
### Breadth-First Search (BFS)
- An algorithm for traversing or searching tree or graph data structures.
- Given a graph G = (V,E) and a distinguished source vertex s, breadth-first search ==systematically explores the edges of G to “discover” every vertex that is reachable from s.== It computes the distance (smallest number of edges) from s to each reachable vertex. It also produces a “breadth-first tree” with root s that contains all reachable vertices. For any vertex v reachable from s, the simple path in the breadth-first tree from s to v corresponds to a “shortest path” from s to v in G, that is, a path containing the smallest number of edges. The algorithm works on both directed and undirected graphs.
- A simpler version:
1. Visit an adjacent unvisited vertex. Mark it as visited. Display it. Insert it into a queue.
2. If no adjacent unvisited vertex is found, remove the first vertex from the queue.
3. Repeat steps 1 and 2 until the queue is empty.
Pseudocode of BFS:
```c=
for each vertex u ∈ G.V - {s}
    u.color = WHITE
    u.d = ∞
    u.π = NIL
s.color = GRAY
s.d = 0
s.π = NIL
Q = ∅
ENQUEUE(Q, s)
while Q ≠ ∅
    u = DEQUEUE(Q)
    for each v ∈ G.Adj[u]
        if v.color == WHITE
            v.color = GRAY
            v.d = u.d + 1
            v.π = u
            ENQUEUE(Q, v)
    u.color = BLACK
```
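The pseudocode translates almost line for line into Python: dictionary membership stands in for the white/gray coloring, and `collections.deque` plays the role of the FIFO queue Q. The sample graph is modeled on the undirected example from CLRS and is only an illustration:

```python
from collections import deque

def bfs(adj, s):
    """Breadth-first search from s: returns distances d and predecessors pi.
    A vertex is 'white' exactly when it is not yet a key of d."""
    d = {s: 0}
    pi = {s: None}
    q = deque([s])
    while q:
        u = q.popleft()
        for v in adj[u]:
            if v not in d:       # white vertex: discovered for the first time
                d[v] = d[u] + 1
                pi[v] = u
                q.append(v)
    return d, pi

# Sample 8-vertex undirected graph (assumed, based on the CLRS BFS figure).
adj = {"r": ["s", "v"], "s": ["r", "w"], "t": ["u", "w", "x"],
       "u": ["t", "x", "y"], "v": ["r"], "w": ["s", "t", "x"],
       "x": ["t", "u", "w", "y"], "y": ["u", "x"]}
d, pi = bfs(adj, "s")
assert d["s"] == 0 and d["v"] == 2 and d["y"] == 3
```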
#### Time complexity
The operations of enqueuing and dequeuing take O(1) time, and so the total time devoted to queue operations is O(V). Because the procedure scans the adjacency list of each vertex only when the vertex is dequeued, it scans each adjacency list at most once. Since the sum of the lengths of all the adjacency lists is Θ(E), the total time spent in scanning adjacency lists is O(E). The overhead for initialization is O(V), and thus the ==total running time of the BFS procedure is O(V + E).==
#### Example
Operation of BFS on an undirected graph. Tree edges are shown shaded as they are produced by BFS. The value of u.d appears within each vertex u. The queue Q is shown at the beginning of each iteration of the while loop of lines 10–18. Vertex distances appear below vertices in the queue.

### Depth First Search (DFS)
- The strategy followed by depth-first search is, as its name implies, ==to search “deeper” in the graph whenever possible.==
Pseudocode for DFS:
```c=
DFS(G)
    for each vertex u ∈ G.V
        u.color = WHITE
        u.π = NIL
    time = 0
    for each vertex u ∈ G.V
        if u.color == WHITE
            DFS_VISIT(G, u)
```
```c=
DFS_VISIT(G, u)
    // white vertex u has just been discovered
    time = time + 1
    u.d = time
    u.color = GRAY
    // explore edge (u, v)
    for each v ∈ G.Adj[u]
        if v.color == WHITE
            v.π = u
            DFS_VISIT(G, v)
    // blacken u; it is finished
    u.color = BLACK
    time = time + 1
    u.f = time
```
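As with BFS, this maps directly onto Python; a `nonlocal` counter stands in for the global `time`, and a vertex is "white" exactly when it has no discovery time yet. The sample digraph is modeled on the CLRS DFS figure and is only an illustration:

```python
def dfs(adj):
    """Depth-first search: returns discovery times d, finishing times f,
    and predecessors pi, mirroring the pseudocode above."""
    d, f, pi = {}, {}, {}
    time = 0

    def visit(u):
        nonlocal time
        time += 1
        d[u] = time          # u has just been discovered
        for v in adj[u]:
            if v not in d:   # white vertex
                pi[v] = u
                visit(v)
        time += 1
        f[u] = time          # u is finished

    for u in adj:
        if u not in d:
            visit(u)
    return d, f, pi

# Sample 6-vertex digraph (assumed, based on the CLRS DFS figure).
adj = {"u": ["v", "x"], "v": ["y"], "w": ["y", "z"],
       "x": ["v"], "y": ["x"], "z": ["z"]}
d, f, pi = dfs(adj)
assert d["u"] == 1 and f["u"] == 8  # u is discovered first, finishes last in its tree
```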
##### Running time
- The loops on lines 2–4 and lines 6–8 of DFS take time Θ(V), exclusive of the time to execute the calls to DFS_VISIT. As we did for breadth-first search, we use aggregate analysis. The procedure DFS_VISIT is called exactly once for each vertex v ∈ V, since the vertex u on which DFS_VISIT is invoked must be white and the first thing DFS_VISIT does is paint vertex u gray.
- During an execution of DFS_VISIT(G, v), the loop on lines 7–10 executes |Adj[v]| times. Since
$$
\sum_{v ∈ V}|Adj[v]| = Θ(E)
$$
the total cost of executing lines 7–10 of DFS_VISIT is Θ(E). ==The running time of DFS is therefore Θ(V + E).==
#### Example
The progress of the depth-first-search algorithm DFS on a directed graph. As edges are explored by the algorithm, they are shown as either shaded (if they are tree edges) or dashed (otherwise). Nontree edges are labeled B, C, or F according to whether they are back, cross, or forward edges. Timestamps within vertices indicate discovery time/finishing times.

### Topological sort
- A topological sort of a dag (directed acyclic graph) G = (V,E) is a linear ordering of all its vertices such that if G contains an edge (u,v), then u appears before v in the ordering. (If the graph contains a cycle, then no linear ordering is possible.) We can view a topological sort of a graph as an ordering of its vertices along a horizontal line so that all directed edges go from left to right.
#### Algorithm
1. Call DFS(G) to compute finishing times v.f for each vertex v
2. As each vertex is finished, insert it onto the front of a linked list
3. return the linked list of vertices
#### Time complexity
==Topological sort runs in Θ(V + E) time==, since depth-first search takes Θ(V + E) time and it takes O(1) time to insert each of the |V| vertices onto the front of the linked list.
#### Topological Sorting vs DFS
- DFS prints a vertex and then recursively calls DFS for its adjacent vertices.
- In topological sorting, we need to print a vertex before its adjacent vertices.
- For example, in the graph shown below, vertex ‘5’ must be printed before vertex ‘0’; unlike in DFS, vertex ‘4’ must also be printed before vertex ‘0’. So topological sorting is different from DFS.
- A DFS of the graph shown is “5 2 3 1 0 4”, but it is not a topological sorting.
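The three-step algorithm above can be sketched directly in Python; a list with `insert(0, u)` stands in for "insert onto the front of a linked list". The edge set below is an assumption reconstructed from the discussion of vertices 5, 2, 3, 1, 0, 4 (the figure itself is not shown):

```python
def topological_sort(adj):
    """Topological sort by DFS: prepend each vertex as it finishes,
    so the output is in decreasing finishing-time order."""
    order = []
    seen = set()

    def visit(u):
        seen.add(u)
        for v in adj.get(u, ()):
            if v not in seen:
                visit(v)
        order.insert(0, u)  # "insert onto the front of a linked list"

    for u in adj:
        if u not in seen:
            visit(u)
    return order

# Assumed edges matching the example: 5->2, 5->0, 4->0, 4->1, 2->3, 3->1.
adj = {5: [2, 0], 4: [0, 1], 2: [3], 3: [1], 0: [], 1: []}
order = topological_sort(adj)

# In a valid topological order, every edge goes left to right.
pos = {v: i for i, v in enumerate(order)}
assert all(pos[u] < pos[v] for u in adj for v in adj[u])
```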

#### Example
(a) Professor Bumstead topologically sorts his clothing when getting dressed. Each directed edge (u,v) means that garment u must be put on before garment v. The discovery and finishing times from a depth-first search are shown next to each vertex.

(b) The same graph shown topologically sorted, with its vertices arranged from left to right in order of decreasing finishing time. All directed edges go from left to right.

### Strongly connected components
- A directed graph is strongly connected if there is a path between all pairs of vertices. A strongly connected component (SCC) of a directed graph is a maximal strongly connected subgraph.
#### Algorithm
1. Call DFS(G) to compute finishing times u.f for each vertex u
2. Compute G^T^
3. Call DFS(G^T^), but in the main loop of DFS, consider the vertices in order of decreasing u.f (as computed in line 1)
4. Output the vertices of each tree in the depth-first forest formed in line 3 as a separate strongly connected component
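The four steps above (often called Kosaraju's algorithm) can be sketched compactly in Python: one DFS on G to record finishing order, then a DFS on the transpose in decreasing finishing time. The small digraph at the end is a made-up example whose SCCs are {1, 2, 3}, {4}, and {5}:

```python
def strongly_connected_components(adj):
    """Kosaraju-style SCC computation, following the four steps above."""
    # Step 1: DFS on G, recording vertices in finishing order.
    order, seen = [], set()

    def dfs1(u):
        seen.add(u)
        for v in adj.get(u, ()):
            if v not in seen:
                dfs1(v)
        order.append(u)

    for u in adj:
        if u not in seen:
            dfs1(u)

    # Step 2: compute the transpose G^T by reversing every edge.
    radj = {u: [] for u in adj}
    for u in adj:
        for v in adj[u]:
            radj.setdefault(v, []).append(u)

    # Step 3: DFS on G^T in decreasing finishing time; each tree is one SCC.
    sccs, assigned = [], set()

    def dfs2(u, comp):
        assigned.add(u)
        comp.append(u)
        for v in radj.get(u, ()):
            if v not in assigned:
                dfs2(v, comp)

    for u in reversed(order):
        if u not in assigned:
            comp = []
            dfs2(u, comp)
            sccs.append(comp)  # Step 4: output each tree as one component
    return sccs

# Cycle 1 -> 2 -> 3 -> 1, plus a tail 3 -> 4 -> 5.
adj = {1: [2], 2: [3], 3: [1, 4], 4: [5], 5: []}
sccs = strongly_connected_components(adj)
assert sorted(sorted(c) for c in sccs) == [[1, 2, 3], [4], [5]]
```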
#### Example
( a ) A directed graph G. Each shaded region is a strongly connected component of G. Each vertex is labeled with its discovery and finishing times in a depth-first search, and tree edges are shaded.

( b ) The graph G^T^, the transpose of G, with the depth-first forest computed in line 3 of STRONGLY-CONNECTED-COMPONENTS shown and tree edges shaded. Each strongly connected component corresponds to one depth-first tree. Vertices b, c, g, and h, which are heavily shaded, are the roots of the depth-first trees produced by the depth-first search of G^T^.

( c ) The acyclic component graph G^SCC^ obtained by contracting all edges within each strongly connected component of G so that only a single vertex remains in each component.

#### Thoughts
- We noticed that Dijkstra's algorithm could be used in micro cases, such as finding the shortest path to the next delivery point, while an Euler path could be used in a macro way, such as when you are about to start a delivery run and want to figure out an efficient order in which to go through all of the delivery points.
## References:
* https://www.wikiwand.com/en/Graph_theory
* https://www.wikiwand.com/en/Hypercube_graph
* https://www.wikiwand.com/en/Regular_graph
* Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest, Clifford Stein, *Introduction to Algorithms* (The MIT Press, 2009)
* Ralph P. Grimaldi, *Discrete and Combinatorial Mathematics: An Applied Introduction*, Fifth Edition (Pearson, 2003)
* https://www.wikiwand.com/en/Planar_graph
* https://www.geeksforgeeks.org/topological-sorting/
* https://www.geeksforgeeks.org/strongly-connected-components/