## Continue Graph Thinking
https://hackmd.io/@teacher144123/r1-V5Bn-5
---
## Graph Representation Learning ch5-6
- Present: Dragon Chen
- Date: 2022/03/25
---
### Recap: Graph Data

- Zachary's karate club dataset
----
Basic types:
- undirected
- directed
- weighted
----
### Recap: Adjacency Matrix

An undirected graph has a **symmetric** adjacency matrix
A weighted graph has **non-binary entries** (the edge weights)
----
Advanced kinds of graph
- multi-relation
- heterogeneous
- multiplex
- ...
---
### Recap: Multi-Relation Graph

Edges can have different types
----
### Recap: Decoder

We make our decoder **multi-relational**.
Without multi-relation support, the decoder only looks at the node pair:
$DEC(ENC(u), ENC(v)) = DEC(z_u, z_v) \approx S[u, v]$
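
A minimal sketch of what a multi-relational decoder could look like: a per-relation bilinear score $z_u^\top R_\tau z_v$. The class name `BilinearDecoder` and the dimensions are just for illustration.

```python
import torch
import torch.nn as nn

class BilinearDecoder(nn.Module):
    """Per-relation bilinear decoder: DEC(z_u, tau, z_v) = z_u^T R_tau z_v."""
    def __init__(self, embed_dim: int, num_relations: int):
        super().__init__()
        # one trainable matrix R_tau per relation type
        self.rel_mats = nn.Parameter(torch.randn(num_relations, embed_dim, embed_dim) * 0.01)

    def forward(self, z_u, z_v, rel):
        # z_u, z_v: (batch, d); rel: (batch,) relation-type indices
        R = self.rel_mats[rel]                             # (batch, d, d)
        return torch.einsum('bi,bij,bj->b', z_u, R, z_v)   # one score per edge

decoder = BilinearDecoder(embed_dim=16, num_relations=3)
z_u, z_v = torch.randn(5, 16), torch.randn(5, 16)
rel = torch.randint(0, 3, (5,))
print(decoder(z_u, z_v, rel).shape)  # torch.Size([5])
```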
----
Break

----
## Intro
So, how do we build a model?
---
Enter the **Graph Neural Network**
----
## Message Passing

----
$$
h_u^{k + 1} = UPDATE^k \left( h_u^k, AGGREGATE^k (\{h_v^k, \forall v \in N(u)\}) \right) \\
= UPDATE^k \left( h_u^k, m_{N(u)}^k \right)
$$
UPDATE and AGGREGATE are differentiable functions (e.g., neural networks).
$m_{N(u)}^k$ is the message aggregated from $u$'s neighborhood $N(u)$.
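
A minimal sketch of one message-passing step in plain Python. The dict-based graph and the toy `aggregate`/`update` stand-ins are my own; in practice both are learned neural networks.

```python
import torch

def message_passing_step(h, neighbors, aggregate, update):
    """One step of h_u^{k+1} = UPDATE(h_u^k, AGGREGATE({h_v^k : v in N(u)}))."""
    h_next = {}
    for u in h:
        msgs = [h[v] for v in neighbors[u]]  # collect h_v^k from u's neighborhood N(u)
        m_u = aggregate(msgs)                # m_{N(u)}^k
        h_next[u] = update(h[u], m_u)        # h_u^{k+1}
    return h_next

# toy example: sum aggregation, "add and halve" update (stand-ins for learned NNs)
h = {0: torch.ones(2), 1: 2 * torch.ones(2), 2: 3 * torch.ones(2)}
neighbors = {0: [1, 2], 1: [0], 2: [0]}
h_new = message_passing_step(
    h, neighbors,
    aggregate=lambda msgs: torch.stack(msgs).sum(0),
    update=lambda h_u, m: 0.5 * (h_u + m),
)
print(h_new[0])  # tensor([3., 3.])
```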
----
### Then, a basic GNN
(skippable)

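For reference, a sketch of the basic GNN layer, assuming sum aggregation over neighbors and separate self/neighbor weight matrices (the name `BasicGNNLayer` is illustrative):

```python
import torch
import torch.nn as nn

class BasicGNNLayer(nn.Module):
    """Basic GNN layer: h^{k+1} = sigma(W_self h_u^k + W_neigh * sum_{v in N(u)} h_v^k)."""
    def __init__(self, in_dim: int, out_dim: int):
        super().__init__()
        self.W_self = nn.Linear(in_dim, out_dim)
        self.W_neigh = nn.Linear(in_dim, out_dim, bias=False)

    def forward(self, H, A):
        # H: (n, in_dim) node embeddings; A: (n, n) adjacency matrix
        return torch.relu(self.W_self(H) + self.W_neigh(A @ H))

layer = BasicGNNLayer(in_dim=4, out_dim=8)
A = torch.tensor([[0., 1., 1.], [1., 0., 0.], [1., 0., 0.]])
print(layer(torch.randn(3, 4), A).shape)  # torch.Size([3, 8])
```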
----
### What's next: self-loops

They are common in modern GNNs
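
A sketch of the self-loop trick, assuming we simply add the identity matrix to the adjacency so each node aggregates itself along with its neighbors (which removes the need for a separate self weight matrix):

```python
import torch
import torch.nn as nn

class SelfLoopGNNLayer(nn.Module):
    """Self-loop variant: aggregate over N(u) plus u itself, so only one weight matrix is needed."""
    def __init__(self, in_dim: int, out_dim: int):
        super().__init__()
        self.W = nn.Linear(in_dim, out_dim)

    def forward(self, H, A):
        A_hat = A + torch.eye(A.size(0))      # add self-loops to the adjacency matrix
        return torch.relu(self.W(A_hat @ H))  # single shared transform for self and neighbors
```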
----
### Generalization is a powerful trick of NNs
The issue with the most basic aggregation, summing up, is that it is **unstable** and **highly sensitive to node degrees**.
The fix is simple: just change the sum to an average.
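
A sketch of that change, assuming mean aggregation implemented by dividing the summed messages by the node degree:

```python
import torch

def mean_aggregate(H, A):
    """Average (instead of sum) neighbor embeddings: m_u = (1/|N(u)|) * sum_{v in N(u)} h_v^k."""
    deg = A.sum(dim=1, keepdim=True).clamp(min=1)  # node degrees, avoid division by zero
    return (A @ H) / deg
```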
----
## GCN (Graph Convolutional Network)
Averaging alone is not enough.
GCN (2016) is coming: it normalizes symmetrically by the degrees of both nodes. It's still a basic baseline nowadays.

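A sketch of a GCN layer with the symmetric normalization $D^{-1/2}(A+I)D^{-1/2}$ (a dense-matrix version for readability, not an efficient implementation):

```python
import torch
import torch.nn as nn

class GCNLayer(nn.Module):
    """GCN layer (Kipf & Welling): H^{k+1} = sigma(D^{-1/2} (A + I) D^{-1/2} H^k W)."""
    def __init__(self, in_dim: int, out_dim: int):
        super().__init__()
        self.W = nn.Linear(in_dim, out_dim)

    def forward(self, H, A):
        A_hat = A + torch.eye(A.size(0))          # add self-loops
        d = A_hat.sum(dim=1)                      # degrees of A_hat
        D_inv_sqrt = torch.diag(d.pow(-0.5))      # D^{-1/2}
        A_norm = D_inv_sqrt @ A_hat @ D_inv_sqrt  # symmetric normalization
        return torch.relu(self.W(A_norm @ H))
```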
----
Take a break

----

---
Let's get more complex
----
### Set Pooling

A small improvement, but it adds the chance of overfitting (because of the MLP).
e.g. GraphSAGE-pool
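
A sketch of a GraphSAGE-pool-style aggregator, assuming an MLP applied to each neighbor followed by an element-wise max (a permutation-invariant pooling):

```python
import torch
import torch.nn as nn

class PoolAggregator(nn.Module):
    """Set-pooling aggregation: transform each neighbor with an MLP, then max-pool."""
    def __init__(self, dim: int, hidden: int):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(dim, hidden), nn.ReLU())

    def forward(self, neighbor_h):
        # neighbor_h: (num_neighbors, dim) embeddings of one node's neighborhood
        return self.mlp(neighbor_h).max(dim=0).values  # permutation-invariant pooling
```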
----
### Janossy Pooling (Permutation-Sensitive)
Instead of a **permutation-invariant** aggregation (where the order doesn't matter), we can use a permutation-sensitive approach.

Sum over all possible orderings of the neighbors.
$\Pi$ is the set of all permutations.
$\rho_\phi$ is a **permutation-sensitive** function. Some researchers use an LSTM.
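
A sketch of Janossy pooling with an LSTM as $\rho_\phi$. Here I sample a few random permutations instead of enumerating all of them, which is the usual practical approximation; `num_samples` is my own parameter name.

```python
import torch
import torch.nn as nn

class JanossyPooling(nn.Module):
    """Janossy pooling: run a permutation-SENSITIVE model (here an LSTM) over several
    random orderings of the neighbors and average the results."""
    def __init__(self, dim: int, num_samples: int = 5):
        super().__init__()
        self.lstm = nn.LSTM(dim, dim, batch_first=True)  # rho_phi: the order matters
        self.num_samples = num_samples                   # sampled permutations instead of all |N(u)|!

    def forward(self, neighbor_h):
        # neighbor_h: (num_neighbors, dim)
        outs = []
        for _ in range(self.num_samples):
            perm = torch.randperm(neighbor_h.size(0))
            _, (h_n, _) = self.lstm(neighbor_h[perm].unsqueeze(0))  # run LSTM over one ordering
            outs.append(h_n.squeeze(0).squeeze(0))
        return torch.stack(outs).mean(dim=0)             # average over the sampled orderings
```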
----
### Node Attention - GAT (Graph Attention Network)



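A sketch of a single-head GAT-style layer, assuming the attention score is a LeakyReLU of a learned linear function of the concatenated pair $[W h_u \,\|\, W h_v]$ (a dense version; it assumes every node has at least one neighbor, e.g. via self-loops):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GATLayer(nn.Module):
    """Single-head graph attention: weight each neighbor by a learned attention score."""
    def __init__(self, in_dim: int, out_dim: int):
        super().__init__()
        self.W = nn.Linear(in_dim, out_dim, bias=False)
        self.a = nn.Linear(2 * out_dim, 1, bias=False)  # scores the pair [Wh_u || Wh_v]

    def forward(self, H, A):
        Wh = self.W(H)                                  # (n, out_dim)
        n = Wh.size(0)
        # pairwise attention logits e_{uv} = LeakyReLU(a^T [Wh_u || Wh_v])
        pairs = torch.cat([Wh.unsqueeze(1).expand(n, n, -1),
                           Wh.unsqueeze(0).expand(n, n, -1)], dim=-1)
        e = F.leaky_relu(self.a(pairs)).squeeze(-1)     # (n, n)
        e = e.masked_fill(A == 0, float('-inf'))        # only attend to real neighbors
        alpha = torch.softmax(e, dim=1)                 # attention weights per node
        return torch.relu(alpha @ Wh)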
----
### Transformer
You can also use multi-head attention
----

---
### What have we got so far?
- Basic GNN
- Add self-loops
- Normalize by the degrees of nodes $u$ and $v$
- Set Pooling (use a more complex MLP)
- Janossy Pooling (use a permutation-sensitive model)
- Node Attention (use an MLP to compute attention scores)
- Multi-Head Transformer
---
We know there are two key operations: **Aggregation and Update**.
The update stage also has choices.
----
### The Biggest Problem - Over-Smoothing

When $K \rightarrow \infty$, the embeddings converge toward a distribution determined by the whole graph (based on random-walk theory).
In the process, the local information from each node's neighborhood is lost.
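
A tiny demo of the effect: repeatedly applying mean aggregation on a toy graph shrinks the spread between node embeddings toward zero (the graph and numbers are illustrative only).

```python
import torch

# Repeated averaging over neighbors: after many layers every node's embedding
# drifts toward the same graph-level vector, washing out local information.
A = torch.tensor([[0., 1., 1., 0.],
                  [1., 0., 1., 0.],
                  [1., 1., 0., 1.],
                  [0., 0., 1., 0.]])
A_hat = A + torch.eye(4)                       # self-loops
P = A_hat / A_hat.sum(dim=1, keepdim=True)     # row-normalized: mean aggregation
H = torch.randn(4, 3)
for k in [1, 5, 50]:
    Hk = torch.linalg.matrix_power(P, k) @ H
    print(k, Hk.std(dim=0).mean().item())      # spread across nodes shrinks toward 0
```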
----
### Concatenation and Skip-Connections

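A sketch of a concatenation-based skip-connection, assuming we wrap any base layer (e.g. the `BasicGNNLayer` sketched earlier) and concatenate its output with the previous embedding:

```python
import torch
import torch.nn as nn

class SkipConnectionLayer(nn.Module):
    """Concatenation skip-connection: h_u^{k+1} = [ base(H^k, A)_u || h_u^k ],
    so the previous embedding is preserved directly alongside the new one."""
    def __init__(self, base_layer: nn.Module):
        super().__init__()
        self.base = base_layer

    def forward(self, H, A):
        return torch.cat([self.base(H, A), H], dim=-1)  # output dim grows by in_dim
```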
----
### Gated Update
Use GRU to update $h$

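A sketch of the gated update, assuming the aggregated message is fed to a `GRUCell` as the input while the current node embedding plays the role of the hidden state:

```python
import torch
import torch.nn as nn

class GatedUpdateLayer(nn.Module):
    """Gated update: h_u^{k+1} = GRU(m_{N(u)}^k, h_u^k)."""
    def __init__(self, dim: int):
        super().__init__()
        self.gru = nn.GRUCell(input_size=dim, hidden_size=dim)

    def forward(self, H, A):
        m = A @ H             # aggregated neighborhood messages m_{N(u)}
        return self.gru(m, H)  # message as GRU input, current embedding as hidden state
```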
----
### Jumping Knowledge Connections

To improve the quality, simply build the **final representation from every layer of message passing** (e.g., by concatenation), not only the last one.

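A sketch of a JK-style network that concatenates the output of every layer; concatenation is just one possible combination function (max-pooling and attention are others). It assumes each layer takes `(H, A)`, like the layers sketched above.

```python
import torch
import torch.nn as nn

class JumpingKnowledge(nn.Module):
    """Jumping Knowledge: the final node representation combines the outputs of
    every message-passing layer (here by concatenation) instead of only the last one."""
    def __init__(self, layers: nn.ModuleList):
        super().__init__()
        self.layers = layers

    def forward(self, H, A):
        all_h = [H]
        for layer in self.layers:
            H = layer(H, A)
            all_h.append(H)
        return torch.cat(all_h, dim=-1)  # z_u = [h_u^0 || h_u^1 || ... || h_u^K]
```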
---
## Edge Feature and Multi-Relation
----
### Relational GNN

The number of parameters increases, since we now have one trainable matrix per relation type.
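
A sketch of a relational (RGCN-style) layer, assuming one adjacency matrix and one weight matrix per relation type; the names are illustrative.

```python
import torch
import torch.nn as nn

class RGCNLayer(nn.Module):
    """Relational GNN layer: one trainable weight matrix W_tau per relation type.
    h_u^{k+1} = sigma( W_self h_u^k + sum_tau (1/|N_tau(u)|) sum_{v in N_tau(u)} W_tau h_v^k )"""
    def __init__(self, in_dim: int, out_dim: int, num_relations: int):
        super().__init__()
        self.W_rel = nn.ModuleList(nn.Linear(in_dim, out_dim, bias=False)
                                   for _ in range(num_relations))
        self.W_self = nn.Linear(in_dim, out_dim)

    def forward(self, H, A_per_rel):
        # A_per_rel: list of (n, n) adjacency matrices, one per relation type tau
        out = self.W_self(H)
        for A_tau, W_tau in zip(A_per_rel, self.W_rel):
            deg = A_tau.sum(dim=1, keepdim=True).clamp(min=1)  # per-relation normalization
            out = out + W_tau((A_tau @ H) / deg)
        return torch.relu(out)
```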
----
### Attention and Feature Concatenation

----
### Graph Pooling
{"metaMigratedAt":"2023-06-16T21:17:34.650Z","metaMigratedFrom":"YAML","title":"Graph Representation Learning ch5~6","breaks":true,"slideOptions":"{\"theme\":\"serif\",\"transition\":\"slide\"}","contributors":"[{\"id\":\"1155b8f4-1581-4900-adcf-1b4abbb2c7fd\",\"add\":9017,\"del\":4716}]"}