【資工考研】DSA: Graph (1) BFS & DFS

Graph Representations
BFS & DFS
Shortest Path Problem
Minimum Spanning Tree
Articulation Point
Flow Network

About Graph

一個圖 (Graph)

G

定義為

G = (V, E)

，其中

V

代表頂點 (vertices) 之集合；

E

代表邊 (edges) 之集合
兩頂點

v_{i}

v_{j}

若相連，表示為

(v_{i}, v_{j})

圖中邊集合

E

中元素若為無序對，則稱

G

為無向圖 (Undirected Graph)；反之，若為有序對，則稱

G

為有向圖 (Directed Graph)：

無向圖中與某頂點
$v$ 相連的邊數稱為該頂點之 degree (
$\deg (v)$ )
有向途中自某頂點
$v$ 出發與其他頂點相連之邊數稱為該頂點之 out-degree；自其他頂點出發與
$v$ 相連之邊數稱為該頂點之 in-degree

我們定義一條自

v_{0}

到

v_{k}

之 path 為一序列

v_{0}, v_{1}, \dots, v_{k}

此序列中任相鄰之兩頂點

v_{i}

v_{j}

均存在邊

(v_{i}, v_{j})

使之相連

若在 path 中每一個頂點均只出現一次，則此 path 為一 simple path
Cycle 為起終點相同頂點之 path

給定

G = (V, E)

，稱兩頂點

v_{i}, v_{j} \in V

為 reachable 若且唯若存在一條從

v_{i}

到

v_{j}

之 path

若
$G$ 中任取兩頂點均 reachable 則稱此圖為連通的 (connected)
若
$G$ 中任取兩頂點均存在兩條以上互斥的 path 使之 reachable 則稱此圖為雙連通 (biconnected)
若移除
$G$ 中某頂點
$v$ 可使得
$G$ 不連通，則稱
$v$ 為切點 (cut point)

G = (V, E)

之子圖

H = (V^{'}, E^{'})

亦為一圖，滿足

V^{'} \subseteq V

且

E^{'} \subseteq E

Graph Representations

Adjacency Matrix
Adjacency List
Adjacency multi-list
Incidence Matrix
Index + Array

給定

G = (V, E)

為有

n

個頂點、

m

個邊之圖，那麼寫程式時該如何表達圖呢？

Adjacency Matrix

用一個

n \times n

的二維陣列 A，若

v_{i}

v_{j}

為 reachable 則 A[i,j] == 1；反之，則 A[i,j] == 0

若把二維陣列看作矩陣，則無向圖之 adjacency matrix 必為 symmetric matrix

Adjacency matrix 需要

O (n^{2})

的空間，適用於邊數多的圖 (dense graph)

std::vector<std::vector<int>> adjMatrix;

Adjacency List

用一個長度 n 的陣列，其中每格代表

G

中對應的頂點，陣列元素裝該與頂點相連的頂點所構成的 linked list
需要

O (n + m)

的空間，適合邊少的圖 (sparse graph)

std::vector<std::list<int>> adjList;

雖然我上面 adjacency matrix 是給 vector<std::vector<int>> 而 adjacency list 我是給 vector<std::list<int>>
但其實通常在 C++ 中要表示 adjacency list 還是更習慣使用 vector<std::vector<int>>

畢竟每列 vector 長度可變，我們只要裝填元素的邏輯是清楚的就好

Adjacency Multilist

很像 adjacency list ，不過我們對長度 n 的陣列裝的東西有所不同，改試著裝邊 (邊節點)：

Edge Node (可有可無)

Link for V1

Link for V2

// Forward declaration
struct VertexNode;
struct EdgeNode;

struct EdgeNode
{
    int v1;
    int v2;
    EdgeNode* linkV1;
    EdgeNode* linkV2;
    
    EdgeNode(int v1, int v2) : v1(v1), v2(v2), linkV1(nullptr), linkV2(nullptr) {}
};

struct VertexNode
{
    int vertex;
    EdgeNode* edge;
    
    VertexNode(int v) : vertex(v), edge(nullptr) {}
};

std::vector<VertexNode*> adjMultilist

Incidence Matrix

有向圖

G

的 incidence matrix

M

是一個

n \times m

的矩陣滿足：

M [i] [j] = {\begin{cases} 1 & if vertex v_{i} is incident to edge e_{j} and the edge is directed away from v_{i}, \\ - 1 & if vertex v_{i} is incident to edge e_{j} and the edge is directed toward v_{i}, \\ 0 & if vertex v_{i} is not incident to edge e_{j} . \end{cases}

如果是無向圖：

M [i] [j] = {\begin{cases} 1 & if vertex v_{i} is incident to edge e_{j}, \\ 0 & if vertex v_{i} is not incident to edge e_{j} . \end{cases}

Index + Array

陣列前

n

項是 index 用來表示陣列中從那開始是記錄與該頂點相鄰的頂點，後面則全都紀錄相鄰頂點

很少考，但考出來你要記得上述規則，不然你會看不出圖長怎樣

BFS & DFS

接下來是圖論中相當頻繁使用的兩個演算法，用在圖之遍歷：

Breadth-First Search (廣度優先搜尋)
Depth-First Search (深度優先搜尋)

在演算法中，因為它有要說明別的東西，所以多了些標示及定義，實際上像是 LeetCode 刷題時，則如資料結構中那樣直接寫

BFS

假設

G = (V, E)

為一個 graph，

s \in V

為一個 source vertex ，則

\forall u \in V

：

$u . π$ 為
$u$ 之 predecessor (breadth-first tree 中之 parent)，
$s . π =$ nil (在 BFS 中，
$u$ 是被
$u . π$ 所發現的)

u . color = {\begin{cases} white, if u is undiscovered. \\ gray, if u is discovered, but \exists v, a neighbor of u, which is undiscovered. \\ black, if u and all of its neighbors are discovered. \end{cases}

u . d

為 BFS 所求出，自

s

到

u

之距離
4.

δ (s, v)

為自

s

到

v

所需經過之最少邊數，

\forall v \in V

adj [u] = {v | (u, v) \in E}

enum class Color
{
    WHITE,
    GRAY,
    BLACK,
};

struct Graph
{
    int num;
    struct Vertex
    {
        int d;
        Vertex* pi;
        Color color;
    };
    
    std::vector<std::vector<int>> adjList;
    std::vector<Vertex> vertices;
    
    Graph(int n) : num(n), adjList(n), vertices(n) {}
    
    void addEdge(int u, int v)
    {
        adjList[u].push_back(v);
        adjList[v].push_back(u); // Remove this for directed graphs
    }
    
    const std::vector<int>& neighbors(int u) const noexcept
    {
        return adjList[u];
    }
};

void BFS(Graph &g, int s)
{
    
    for (int i = 0; i < g.num; ++i)
    {
        g.vertices[i].d = INT_MAX;
        g.vertices[i].pi = nullptr;
        g.vertices[i].color = Color::WHITE;
    }
    
    g.vertices[s].d = 0;
    g.vertices[s].pi = nullptr;
    g.vertices[s].color = Color::GRAY;
    
    std::queue<int> q;
    q.push(s);
    
    while (!q.empty())
    {
        const int u = q.front();
        q.pop();
        
        for (int v : g.neighbors(u))
        {
            // If v is undiscovered
            if (g.vertices[v].color == Color::WHITE)
            {
                g.vertices[v].color == Color::GRAY;
                g.vertices[v].d = g.vertices[u].d + 1;
                g.vertices[v].pi = &g.vertices[u];
                
                q.push(v);
            }
        }
        
        // After processing all neighbors, mark u as fully explored (black)
        g.vertices[u].color = Color::BLACK;
    }
}

Time complexity:

O (| V | + | E |)

(

∵

我們使用 adjacency list 來表示圖，如果是 adjacency matrix，則會是

O ({| V |}^{2})

)

BFS 只能找出所有與

s

為 reachable 的頂點，即

s

所在之 component 中的所有頂點
所以要注意，依此定義不見得能遍歷圖中所有頂點
也因此， BFS 可以用來判斷

G

是否為連通圖

Diameter

假設

u

v

為

G

中距離最遠的兩頂點，則該兩點之距離即為 diameter (

max_{u, v \in V} δ (u, v)

)

在這邊，可以延伸到兩件事：

在 tree 上找 diameter (即 tree 上找 longest path) 存在 linear time algorithm 可解
在一般圖上找 longest path 為 NP-Complete problem ，並不好解

關於 tree 上找 diameter 的演算法可參考下列：

int treeDiameter(Graph &tree)
{
    // Perform BFS from an arbitrary vertex (e.g. vertex 0)
    BFS(tree, 0);
    
    auto compare = [](const Graph::Vertex& a, const Graph::Vertex& b)
    {
        return a.d < b.d;
    };
    
    // Find the vertex with the largest d value
    int t = std::distance(tree.vertices.begin(), std::max_element(tree.vertices.begin(), tree.vertices.end(), compare));
    
    BFS(tree, t);
    
    int u = std::distance(tree.vertices.begin(), std::max_element(tree.vertices.begin(), tree.vertices.end(), compare));
    
    return tree.vertices[u].d;
}

DFS

與 BFS 不同的是，對於搜尋到的頂點

u

，我們會紀錄其發現時間

u . d

，接著對其在遞迴樹上的 child 繼續遞迴搜尋，直到後代都被搜尋到才結束對

u

的的搜尋，並且紀錄該結束時間

u . f

u . color = {\begin{cases} white, if u is undiscovered. \\ gray, if u is discovered, but \exists v, a descendant of u, which is undiscovered. \\ black, if u and all of its descendants are discovered. \end{cases}

也由此可知，在此定義下的 DFS 能遍歷

G

中所有的頂點

enum class Color
{
    WHITE,
    GRAY,
    BLACK,
};

struct Graph
{
    int num;
    struct Vertex
    {
        int d;
        int f;
        Vertex* pi;
        Color color;
    };
    
    std::vector<std::vector<int>> adjList;
    std::vector<Vertex> vertices;
    
    Graph(int n) : num(n), adjList(n), vertices(n) {}
    
    void addEdge(int u, int v)
    {
        adjList[u].push_back(v);
        adjList[v].push_back(u); // Remove this for directed graphs
    }
};

// Forward declaration
void DFSVisit(Graph &g, int u, int &t);

void DFS(Graph &g)
{
    for (auto &u : g.vertices)
    {
        u.color = Color::WHITE;
        u.pi = nullptr;
        u.d = 0;
        u.f = 0;
    }
    
    int t = 0;
    
    for (int u = 0; u < g.num; ++u)
    {
        if (g.vertices[u].color == Color::WHITE)
            DFSVisit(g, u, t);
    }
}

void DFSVisit(Graph &g, int u, int &t)
{
    g.vertices[u].color = Color::GRAY;
    g.vertices[u].d = ++t;
    
    for (int v : g.adjList[u])
    {
        if (g.vertices[v].color == Color::WHITE)
        {
            g.vertices[v].pi = &g.vertices[u];
            DFSVisit(g, v, t);
        }
    }
    
    g.vertices[u].color = Color::BLACK;
    g.vertices[u].f = ++t;
}

Time complexity:

O (| V | + | E |)

Note

雖然上面的 code 是基於無向圖，但我們接著的討論會先基於有向圖

以此定義執行完 DFS ，我們可以根據 depth-first forest 將

G

中所有

(u, v) \in E

分為以下四類：

Tree edge: if
$v . π = u$ , then
$(u, v)$ is a tree edge
Back edge: if
$v$ is an ancestor of
$u$ in the depth-first forest (including self-loops)
Forward edge: if
$v$ is a descendant of
$u$ in the depth-first forest, but
$v . π \neq u$
Cross edge: otherwise

依據我們 DFS 執行完後的

G

，此時顏色就派上用場了

\forall (u, v) \in E {\begin{cases} If v . color = white, then it's a tree edge \\ If v . color = gray, then it's a back edge \\ If v . color = black and u . d < v . d, then it's a forward edge \\ If v . color = black and u . d > v . d, then it's a cross edge \end{cases}

此外，

forward edge 中因為
$v$ 為
$u$ 之 descendant ，所以
$u . d < v . d < v . f < u . f$
cross edge 中因為
$v$ 不為
$u$ 之 descendant ，所以
$v . d < v . f < u . d < u . f$

如果

G

是無向圖，則想想會發現，forward edge 可以想成 back edge 的反向而 cross edge 可以想成 tree edge 的反向
故如果

G

為無向圖則

G

中僅含 tree edge 與 back edge

此外，不難想像，

G

is acyclic

\Leftrightarrow G

(經過 DFS 後)不含 back edge
acyclic 表示

G

沒有 cycle，讀者如果想像不出來，可以試著用反證法去證明看看

常考的應用：

判斷
$G$ 是否連通並找出所有 connected component
判斷
$G$ 是否 acyclic
在 directed acyclic graph (DAG) 上找出一個 topological sort
在有向圖中找出所有 strongly connected component
在無向圖中找出 biconnected component 與 acticulation point

判斷
$G$ 是否 acyclic

bool detectCycleUtil(Graph &g, int u)
{
    g.vertices[u].color = Color::GRAY;
    
    for (int v : g.adjList[u])
    {
        if (g.vertices[v].color == Color::WHITE)
        {
            if (detectCycleUtil(g, v))
                return true;
        }
        else if (g.vertices[v].color == Color::GRAY) // (u, v) is a  back edge
            return true;
    }
    
    g.vertices[u].color = Color::BLACK;
    return false;
}

bool detectCycle(Graph &g)
{
    for (auto &u : g.vertices)
    {
        u.color = Color::WHITE;
    }
    
    for (int u = 0; u < g.num; ++u)
    {
        if (g.vertices[u].color == Color::WHITE)
        {
            if (detectCycleUtil(g, u))
                return true;
        }
    }
    
    return false;
}

Important

值得注意的是，上面的 algo 在有向圖不會有問題，但在無向圖中，判斷是否為 back edge 還要多一個條件是

v

並非

u

之 parent (g.vertices[v].color == Color::GRAY && g.vertices[u].pi != &g.vertices[v])

Tip

對無向圖

G

做 DFS 的過程中如果偵測到 back edge ，則

G

含有 cycle
若走完

| V | - 1

個邊後尚有邊未被走訪，表示

G

中含有 cycle
也因此對於無向圖，偵測是否含有 cycle 的演算法之時間複雜度為

O (| V |)

，與

| E |

無關

找出 topological sort

給定一個有向圖

G = (V, E)

，若

L

為

V

中元素之一種排序方式滿足：若

(u, v) \in E

則在

L

中

u

出現在

v

的前面，則稱

L

為一個 topological sort
由此定義可知對於有向圖

G

，存在

G

之 topological sort 若且唯若

G

is acyclic

所以，給定 DAG

G

，對其執行 DFS 後，無論

(u, v)

為 tree edge, forward edge 還是 cross edge ，皆可推得

v . f < u . f

故我們找出 topological sort 的演算法只需要對 DFS 後所有頂點的 finishing time 做排序即可

void DFSVisit(Graph &g, int u, std::list<int> &l)
{
    g.vertices[u].color = Color::GRAY;
    
    for (int v : g.adjList[u])
    {
        if (g.vertices[v].color == Color::WHITE)
            DFSVisit(g, v, l);
    }
    
    g.vertices[u].color = Color::BLACK; // Finished
    l.push_front(u); // Insert vertex at the front of the list (topological order)

}

void DFS(Graph &g, std::list<int> &l)
{
    for (auto &u : g.vertices)
    {
        u.color = Color::WHITE;
    }
    
    for (int u = 0; u < g.num; ++u)
    {
        if (g.vertices[u].color == Color::WHITE)
            DFSVisit(g, u, l);
    }
}

std::list<int> topologicalSort(Graph &g)
{
    std::list<int> l;
    DFS(g, l);
    return l;
}

找出所有 strongly connected component

給定有向圖

G = (V, E)

，若

C \subseteq V

為 maximal 滿足：

\forall u, v \in C, \exists

path 自

u

走到

v

且亦

\exists

path 自

v

走到

u

則稱

C

為一個 strongly connected component (SCC) of

G

如果給定有向圖

G

，建立一個新圖

G^{'} = (V^{'}, E^{'})

稱之為 component graph
顧名思義，

V^{'}

中元素對應

G

中的 components
則

(v_{i}, v_{j}) \in E^{'} \Leftrightarrow \exists u \in C_{i}, w \in C_{j}

使得

G

中存在自

u

到

w

的 path

這樣的 component graph 必為 DAG

此時離想出演算法只差最後的巧思：我們對

G

中所有邊取反向建成新的

G^{T} = (V, E^{T})

我們先對

G

做 DFS 再對

G^{T}

做 DFS，過程中依照第一次 DFS 求出的 finishing time 由大到小選點，則最後所得的 depth-first forest 中每棵 tree 即一個 SCC

這便是 Kosaraju's algorithm

void DFSVisit(Graph &g, int u, std::vector<int> &finishStack)
{
    g.vertices[u].color = Color::GRAY;
    
    for (int v : g.adjList[u])
    {
        if (g.vertices[v].color == Color::WHITE)
            DFSVisit(g, v, finishStack);
    }
    
    g.vertices[u].color = Color::BLACK;
    finishStack.push_back(u);
}

void DFS(Graph &g, std::vector<int> &finishStack)
{
    for (auto &u : g.vertices)
    {
        u.color = Color::WHITE;
    }
    
    for (int u = 0; u < g.num; ++u)
    {
        if (g.vertices[u].color == Color::WHITE)
            DFSVisit(g, u, finishStack)
    }
}

Graph getTransposeGraph(Graph &g)
{
    Graph gT(g.num);
    for (int u = 0; u< g.num; ++u)
    {
        for (int v : g.adjList[u])
        {
            gT.addEdge(v, u);
        }
    }
    
    return gT;
}

void DFSOnTranspose(Graph &gT, int u, std::vector<int> &component)
{
    gT.vertices[u].color = Color::GRAY;
    component.push_back(u);
    
    for (int v : gT.adjList[u])
    {
        if (gT.vertices[v].color == Color::WHITE)
            DFSOnTranspose(gT, v, component);
    }
}

std::vector<std::vector<int>> SCC(Graph &g)
{
    std::vector<int> finishStack;
    finishStack.reserve(g.num);
    DFS(g, finishStack);
    
    Graph gT = getTransposeGraph(g);
    
    for (auto &vertex : gT.vertices)
    {
        vertex.color = Color::WHITE;
    }
    
    std::vector<std::vector<int>> scc;
    
    while (!finishStack.empty())
    {
        const int u = finishStack.back();
        finishStack.pop_back();
        
        if (gT.vertices[u].color == Color::WHITE)
        {
            std::vector<int> component;
            DFSOnTranspose(gT, u, component);
            scc.push_back(component);
        }
    }
    
    return scc;
}

【資工考研】DSA: Graph (1) BFS & DFS

About Graph

Graph Representations

Adjacency Matrix

Adjacency List

Adjacency Multilist

Incidence Matrix

Index + Array

BFS & DFS

BFS

Diameter

DFS

判斷 G 是否 acyclic

找出 topological sort

找出所有 strongly connected component

【資工考研】DSA: Graph 全系列

Read more

DM Generating Function

DSA Search & Sort

DM 遞迴關係式

DM 二元關係

判斷
$G$ 是否 acyclic