feature manipulation: feature augmentation, e.g., we can use cycle counts as augmented node features
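A minimal sketch of this kind of augmentation, using triangle (3-cycle) counts from networkx as the concrete cycle feature; the karate-club graph and all names here are just illustrative:

```python
import networkx as nx
import torch

# Append a per-node cycle count (here: the number of 3-cycles, i.e.
# triangles, the node participates in) to the existing node features.
G = nx.karate_club_graph()                     # stand-in dataset
tri = nx.triangles(G)                          # {node: #triangles through it}

x = torch.ones(G.number_of_nodes(), 1)         # original (constant) features
cycle_count = torch.tensor([[float(tri[v])] for v in G.nodes()])
x_aug = torch.cat([x, cycle_count], dim=1)     # [num_nodes, 2] augmented features
```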
structure manipulation:
sparse graph: add virtual nodes or edges (see the sketch after this list)
dense graph: sample neighbors when doing message passing
large graph: sample subgraphs to compute embeddings
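For the sparse-graph case, a minimal sketch of adding a virtual node, assuming a PyG-style `edge_index` in COO format; `add_virtual_node` is a hypothetical helper, not a library API:

```python
import torch

def add_virtual_node(edge_index: torch.Tensor, num_nodes: int) -> torch.Tensor:
    """Connect one extra virtual node to every real node (both directions)."""
    v = num_nodes                                  # id of the new virtual node
    nodes = torch.arange(num_nodes)
    vs = torch.full((num_nodes,), v)
    extra = torch.stack([torch.cat([nodes, vs]), torch.cat([vs, nodes])])
    return torch.cat([edge_index, extra], dim=1)   # [2, E + 2*num_nodes]

edge_index = torch.tensor([[0, 1], [1, 2]])        # edges 0->1 and 1->2
print(add_virtual_node(edge_index, num_nodes=3))
```

The virtual node shortens the distance between any two nodes to at most 2 hops, which speeds up message passing on sparse graphs.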
2. GNN training
(1)Node-level
After GNN computation, we have $d$-dim node embeddings: $\{\mathbf{h}_v^{(L)} \in \mathbb{R}^d, \forall v \in G\}$
$\mathbf{W}^{(H)} \in \mathbb{R}^{k \times d}$: we map node embeddings from $\mathbf{h}_v^{(L)} \in \mathbb{R}^d$ to predictions $\widehat{\boldsymbol{y}}_v = \mathbf{W}^{(H)} \mathbf{h}_v^{(L)} \in \mathbb{R}^k$
then compute the loss on $\widehat{\boldsymbol{y}}_v$
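A minimal sketch of this node-level head in PyTorch; all sizes and tensors are made up for illustration:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

d, k, num_nodes = 16, 5, 100                  # illustrative dimensions
h = torch.randn(num_nodes, d)                 # stand-in for {h_v^(L)}
head = nn.Linear(d, k)                        # plays the role of W^(H)
y_hat = head(h)                               # y_hat_v in R^k for every node
labels = torch.randint(0, k, (num_nodes,))    # dummy node labels
loss = F.cross_entropy(y_hat, labels)         # k-way classification loss
```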
(2)Edge-level
use pairs of node embeddings
such as k-way prediction: $\widehat{\boldsymbol{y}}_{uv} = \operatorname{Head}_{\text{edge}}(\mathbf{h}_u^{(L)}, \mathbf{h}_v^{(L)})$
Concatenation + Linear: $\widehat{\boldsymbol{y}}_{uv} = \operatorname{Linear}(\operatorname{Concat}(\mathbf{h}_u^{(L)}, \mathbf{h}_v^{(L)}))$, where $\operatorname{Linear}$ maps the $2d$-dim concatenated embedding to a $k$-dim prediction
Dot product: $\widehat{y}_{uv} = (\mathbf{h}_u^{(L)})^{\top} \mathbf{h}_v^{(L)}$; this approach only applies to 1-way prediction (e.g., predicting whether an edge exists)
k-way prediction: apply $k$ trainable matrices $\mathbf{W}^{(1)}, \ldots, \mathbf{W}^{(k)}$ to get $\widehat{y}_{uv}^{(i)} = (\mathbf{h}_u^{(L)})^{\top} \mathbf{W}^{(i)} \mathbf{h}_v^{(L)}$, then $\widehat{\boldsymbol{y}}_{uv} = \operatorname{Concat}(\widehat{y}_{uv}^{(1)}, \ldots, \widehat{y}_{uv}^{(k)}) \in \mathbb{R}^k$
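A minimal sketch of these edge-level heads in PyTorch (all shapes and names are illustrative):

```python
import torch
import torch.nn as nn

d, k, num_pairs = 16, 4, 32                        # illustrative sizes
h_u, h_v = torch.randn(num_pairs, d), torch.randn(num_pairs, d)

# Option 1: Concatenation + Linear, maps 2d-dim to k-dim (k-way)
edge_head = nn.Linear(2 * d, k)
y_kway = edge_head(torch.cat([h_u, h_v], dim=-1))  # [num_pairs, k]

# Option 2: dot product, 1-way prediction (does the edge exist?)
y_1way = (h_u * h_v).sum(dim=-1)                   # [num_pairs]

# Option 3: k-way dot product with trainable W^(1)..W^(k)
W = nn.Parameter(torch.randn(k, d, d))
y_kway_dot = torch.einsum('nd,kde,ne->nk', h_u, W, h_v)  # [num_pairs, k]
```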
(3)Graph-level
use all the node embeddings in our graph
such as k-way prediction: $\widehat{\boldsymbol{y}}_G = \operatorname{Head}_{\text{graph}}(\{\mathbf{h}_v^{(L)} \in \mathbb{R}^d, \forall v \in G\})$
$\operatorname{Head}_{\text{graph}}(\cdot)$ is similar to $\operatorname{AGG}(\cdot)$ in a GNN layer
Global pooling: use global mean, max, or sum pooling as $\operatorname{Head}_{\text{graph}}$
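A minimal sketch of the three pooling options for a single graph; for batched graphs, libraries such as PyG provide scatter-based equivalents (e.g. `global_mean_pool`), but the idea is the same:

```python
import torch

h = torch.randn(5, 16)           # node embeddings {h_v^(L)} of one graph

y_mean = h.mean(dim=0)           # global mean pooling
y_max = h.max(dim=0).values     # global max pooling
y_sum = h.sum(dim=0)             # global sum pooling
```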
3. Issue of Global pooling
(1)The problem with global pooling
Using global pooling over a large graph will lose information
toy example (1-dim node embeddings):
Node embeddings for $G_1$: $\{-1,-2,0,1,2\}$; global sum pooling gives 0
Node embeddings for $G_2$: $\{-10,-20,0,10,20\}$; global sum pooling also gives 0
The problem: sum pooling only reflects the mean of the embeddings, not their variance, so $G_1$ and $G_2$ cannot be distinguished
so we can use hierarchical pooling instead
toy example: we will aggregate via $\operatorname{ReLU}(\operatorname{Sum}(\cdot))$
We first separately aggregate the first 2 nodes and the last 3 nodes; then we aggregate again to make the final prediction
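A minimal sketch reproducing this toy example, with `relu_sum` as a stand-in for the $\operatorname{ReLU}(\operatorname{Sum}(\cdot))$ aggregator:

```python
import torch

def relu_sum(x: torch.Tensor) -> torch.Tensor:
    return torch.relu(x.sum())

g1 = torch.tensor([-1., -2., 0., 1., 2.])      # node embeddings of G1
g2 = torch.tensor([-10., -20., 0., 10., 20.])  # node embeddings of G2

for g in (g1, g2):
    global_pool = relu_sum(g)                        # flat pooling: 0 for both
    left = relu_sum(g[:2])                           # first 2 nodes
    right = relu_sum(g[2:])                          # last 3 nodes
    hierarchical = relu_sum(torch.stack([left, right]))  # aggregate again
    print(global_pool.item(), hierarchical.item())
# G1 -> 0.0 and 3.0; G2 -> 0.0 and 30.0
```

Flat pooling maps both graphs to 0, while the hierarchical version yields 3 for $G_1$ and 30 for $G_2$, so the two graphs become distinguishable.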