# NARS v2 Development Document
###### tags: `Cathay Project`

## Goal
1. Port the implementation from **DGL** $\rightarrow$ **PyG**.
## Feature Averaging Steps
1. **Message Passing**: perform message passing over a single relation type.
2. **Message Aggregation-1**: aggregate (sum) the messages of that single relation type, returning a tensor of shape `(num_dst_nodes, feat_dim)`.
3. **Message Aggregation-2**: aggregate (sum) the per-relation aggregated features across the different relation types.
4. Compute each node's degree.
5. Divide each node's aggregated feature by its degree (summarized as a formula below the figure).
<center>
<img src=https://i.imgur.com/ffhIeTP.png width=300 height=200>
</center>
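Steps 1–5 amount to neighbor averaging over the chosen relation subset. Written compactly (the notation here is mine, not taken from the original write-up): for a node $v$ at hop $k$,

$$
h_v^{(k)} = \frac{1}{\deg(v)} \sum_{r \in \mathcal{R}} \sum_{u \in \mathcal{N}_r(v)} h_u^{(k-1)}
$$

where $\mathcal{R}$ is the set of relation types in the subset, $\mathcal{N}_r(v)$ is the set of $v$'s neighbors under relation $r$, and $\deg(v)$ is $v$'s total in-degree over all relation types in the subset.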
## Feature Averaging Code Overview
1. **SumConv(x, edge_index, size)** (a usage sketch follows the flow diagram below)
    + **x**: `Union[Tensor, OptPairTensor]`, the node features; either a single tensor, or a `(src, dst)` pair for bipartite message passing.
    + **edge_index**: the edge indices.
    + **size**: the shape `(num_src_nodes, num_dst_nodes)` of the underlying adjacency; inferred automatically when left as `None`.
```python=
from torch import Tensor
from torch_geometric.typing import OptPairTensor
from torch_geometric.nn.conv import MessagePassing

class SumConv(MessagePassing):
    def __init__(self):
        super().__init__(aggr='add', flow="source_to_target")  # "add" aggregation

    def forward(self, x, edge_index, size=None):
        # Normalize x into a (src, dst) pair so bipartite graphs are handled uniformly
        if isinstance(x, Tensor):
            x: OptPairTensor = (x, x)
        return self.propagate(edge_index, x=x, size=size)

    def message(self, x_j):  # x_i: center node, x_j: neighbor node
        return x_j
```
```mermaid
graph LR
A["propagate"] --> B["message"] --> C["aggregation"]
```
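A minimal usage sketch of `SumConv` on a toy bipartite graph (the tensors, sizes, and edges here are made up for illustration): each destination row ends up holding the sum of its incoming source features.
```python=
import torch

# Hypothetical toy graph: 3 source nodes, 2 destination nodes, feat_dim = 4
x_src = torch.randn(3, 4)
x_dst = torch.randn(2, 4)               # destination features (unused by the sum itself)
edge_index = torch.tensor([[0, 1, 2],   # source node ids
                           [0, 0, 1]])  # destination node ids

sum_conv = SumConv()
out = sum_conv((x_src, x_dst), edge_index, size=(3, 2))
print(out.shape)                                    # torch.Size([2, 4])
assert torch.allclose(out[0], x_src[0] + x_src[1])  # dst 0 receives src 0 and src 1
assert torch.allclose(out[1], x_src[2])             # dst 1 receives src 2
```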
2. **Aggregate (sum) the features passed to each node across the different relation types**
```python=
for stype, etype, dtype in new_data.edge_types:
    # (source, destination) features from the previous hop
    x_pair = (new_data[stype][f'hop_{hop-1}'], new_data[dtype][f'hop_{hop-1}'])
    edges = new_data[stype, etype, dtype].edge_index
    new_feat = sum_conv(x_pair, edges)
    # Accumulate the per-relation sums into one tensor per destination node type
    if dtype in ntype2feat:
        ntype2feat[dtype] += new_feat
    else:
        ntype2feat[dtype] = new_feat
```
3. **Compute degrees** (see the worked example after the code block)
```python=
for ntype in ntypes:
    deg = 0
    # Total in-degree of ntype, summed over every relation type in the subset
    for stype, etype, dtype in new_data.edge_types:
        if ntype == dtype:
            deg = deg + degree(new_data[stype, etype, dtype].edge_index[1, :],
                               new_data[dtype]['hop_0'].shape[0])
    norm = 1.0 / deg.float()
    norm[torch.isinf(norm)] = 0  # isolated nodes: use 0 instead of inf
    new_data[ntype]["norm"] = norm.view(-1, 1).to(device)
```
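A small worked example of the degree/normalization logic above, using `torch_geometric.utils.degree` (the edge list is made up for illustration). The `isinf` line is what keeps isolated nodes from producing `inf` features: their norm becomes `0`, so their averaged feature is simply zeroed.
```python=
import torch
from torch_geometric.utils import degree

# Hypothetical destination ids of 4 edges among 4 nodes; node 2 has no incoming edge
dst = torch.tensor([0, 0, 1, 3])
deg = degree(dst, num_nodes=4)   # tensor([2., 1., 0., 1.])

norm = 1.0 / deg.float()         # tensor([0.5000, 1.0000, inf, 1.0000])
norm[torch.isinf(norm)] = 0      # tensor([0.5000, 1.0000, 0.0000, 1.0000])
```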
## Usage Example
```python=
# upload nars_trainer, nars_utils, nars_model
import nars_trainer
from nars_utils import read_relation_subsets
...
...
trainer = nars_trainer.NARS_TRAINER()
trainer.build_model(partial, sample_size, num_feats, in_feats, num_hops, num_hidden, num_classes, ff_layer)
trainer.compile(loss_fcn, optimizer, lr, weight_decay, device)
trainer.fit(g, labels, target_node_type, rel_subsets, train_nid, val_nid, test_nid, num_epochs, batch_size)
```
## Code Review
1. **Fix the hop index `i`**: in `recompute_selected_subsets`, the per-hop features must be indexed by `hop`, i.e. `feats[hop][:, i, :] = rel_feats[hop]` rather than `rel_feats[i]`, since `gen_rel_subset_feature` returns one feature tensor per hop. The corrected code is shown below.
```python=
import time
import torch
from torch_geometric.data import HeteroData
from torch_geometric.utils import degree
# SumConv is defined in the "Feature Averaging Code Overview" section above

def gen_rel_subset_feature(data, target_node_type, rel_subset, R, device):
    new_data = HeteroData()
    ntypes = set()
    sum_conv = SumConv()
    for etype in rel_subset:
        stype, _, dtype = list(filter(lambda x: x[1] == etype, data.edge_types))[0]
        # add nodes
        new_data[stype]['hop_0'] = data[stype]['feat']
        new_data[dtype]['hop_0'] = data[dtype]['feat']
        ntypes.add(stype)
        ntypes.add(dtype)
        # add edges (plus their reverse, so the subgraph is effectively undirected)
        src, dst = list(data[(stype, etype, dtype)].edge_index)
        new_data[stype, etype, dtype].edge_index = torch.stack([src, dst], dim=0)
        new_data[dtype, etype + "_r", stype].edge_index = torch.stack([dst, src], dim=0)
    for ntype in ntypes:
        deg = 0
        for stype, etype, dtype in new_data.edge_types:
            if ntype == dtype:
                deg = deg + degree(new_data[stype, etype, dtype].edge_index[1, :],
                                   new_data[dtype]['hop_0'].shape[0])
        norm = 1.0 / deg.float()
        norm[torch.isinf(norm)] = 0
        new_data[ntype]["norm"] = norm.view(-1, 1).to(device)
    res = []
    for hop in range(1, R + 1):
        ntype2feat = {}  # holds the k-hop feature sums per node type
        for stype, etype, dtype in new_data.edge_types:
            x_pair = (new_data[stype][f'hop_{hop-1}'], new_data[dtype][f'hop_{hop-1}'])
            edges = new_data[stype, etype, dtype].edge_index
            new_feat = sum_conv(x_pair, edges)
            if dtype in ntype2feat:
                ntype2feat[dtype] += new_feat
            else:
                ntype2feat[dtype] = new_feat
        for ntype in new_data.node_types:
            assert ntype in ntype2feat  # because the subgraph is not directional
            feat_dict = new_data[ntype]
            old_feat = feat_dict.pop(f'hop_{hop-1}')
            if ntype == target_node_type:
                res.append(old_feat.cpu())
            feat_dict[f"hop_{hop}"] = ntype2feat.pop(ntype).mul_(feat_dict["norm"])
    res.append(new_data[target_node_type].pop(f"hop_{R}").cpu())
    return res

def recompute_selected_subsets(data, target_node_type, selected_subsets, R, feat_size, device):
    # TODO: recompute in parallel using multi-processing
    # Or we should save neighbor-averaged features to disk and load them back instead of re-computing
    start = time.time()
    num_nodes, _ = data[target_node_type]["feat"].shape
    with torch.no_grad():
        feats = [
            torch.zeros(num_nodes, len(selected_subsets), feat_size)
            for _ in range(R + 1)
        ]
        for i, subset in enumerate(selected_subsets):
            rel_feats = gen_rel_subset_feature(data, target_node_type, subset, R, device)
            for hop in range(R + 1):
                feats[hop][:, i, :] = rel_feats[hop]  # fixed: index by hop, not i
    end = time.time()
    print("Recompute takes {:.4f} sec".format(end - start))
    return feats
```
+ `feats`: a list of length `R+1`; each element has shape `(num_nodes, len(selected_subsets), feat_size)`.
+ `gen_rel_subset_feature`: returns a list of length `R+1`; each element has shape `(num_nodes, feat_size)`.
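A quick sanity check of these shapes (all names and numbers here are hypothetical):
```python=
# Assume R = 2, two selected relation subsets, feat_size = 128
feats = recompute_selected_subsets(data, target_node_type, selected_subsets,
                                   R=2, feat_size=128, device="cpu")
num_nodes = data[target_node_type]["feat"].shape[0]
assert len(feats) == 3                        # R + 1 hop levels
assert feats[0].shape == (num_nodes, 2, 128)  # (num_nodes, len(selected_subsets), feat_size)
```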
## References
1. [PyG Message Passing](https://pytorch-geometric.readthedocs.io/en/latest/notes/create_gnn.html)
2. [Code (Colab)](https://colab.research.google.com/drive/1E1vFUhHTGgaQbKaZA6OtGS5ru43b1WHf)
3. [Message Passing explained](https://zqfang.github.io/2021-08-07-graph-pyg/)