Graph neural network: Difference between revisions

=== Build model ===
 
* '''Propagation module''': updates the information carried by nodes and/or edges using some aggregation method.
* '''Sampling module''': when a graph is too large, sampling modules are needed to keep the computation tractable.
* '''Pooling module''': when higher-level (sub-graph-level or graph-level) information is needed for the task, pooling modules aggregate low-level information and provide hierarchical propagation.
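
As a toy illustration of a propagation module, the following hypothetical NumPy sketch updates each node's state by mean-aggregating its neighbours' states (one of many possible aggregation methods; the graph and feature values here are arbitrary):

```python
import numpy as np

# Hypothetical 3-node star graph: node 0 is connected to nodes 1 and 2.
A = np.array([[0., 1., 1.],
              [1., 0., 0.],
              [1., 0., 0.]])
H = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [5.0, 6.0]])                 # per-node state vectors

deg = A.sum(axis=1, keepdims=True)         # node degrees
H_next = (A @ H) / deg                     # mean over each node's neighbours
```

Node 0's new state is the mean of the states of nodes 1 and 2, while nodes 1 and 2 each receive node 0's state.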
 
== Architecture ==
 
=== Convolution based methods ===
The main idea of these methods is to generalize [[Convolutional neural network|standard CNNs]] or [[Attention (machine learning)|attentional methods]] to graph-structured data; that is, to define [[receptive fields]] with respect to given nodes and propagate information within them. Based on the operation domain, these methods can be further divided into two groups: '''spectral approaches''' and '''spatial approaches'''.
 
==== Spectral approaches ====
where <math>L</math> is the [[Graph Laplacian|original graph Laplacian]], <math>I</math> is the [[identity matrix]], and <math>\Lambda</math> is the diagonal matrix of eigenvalues.
 
Therefore, based on the [[Graph Fourier Transform|convolution's property]]<ref>{{Cite book|last=Mallat|first=Stephane|url=https://books.google.com.tw/books?hl=zh-TW&lr=&id=hbVOfWQNtB8C&oi=fnd&pg=PP1&ots=qunXivn446&sig=-CBvvP9HIfpAFLnQU_d5R0RKLpg&redir_esc=y#v=onepage&q&f=false|title=A Wavelet Tour of Signal Processing|date=1999-09-14|publisher=Elsevier|isbn=978-0-08-052083-4|language=en}}</ref>, the convolution of the signal <math>x</math> and a [[Kernel|learnable kernel function]] <math>g</math> is defined as:
 
<math>g\star x = \mathcal{F^{-1}}(\mathcal{F}(g)\odot \mathcal{F}(x)) = U(U^Tg\odot U^Tx)</math>,
Taking the kernel in the spectral domain directly as a diagonal matrix of learnable parameters <math>g_{\theta}</math>, the convolution simplifies to:
<math>g_{\theta}\star x = Ug_{\theta}U^Tx</math>.
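
This spectral filtering step can be sketched numerically. The following minimal NumPy illustration uses a hypothetical 3-node path graph with arbitrary filter values standing in for a learned <math>g_{\theta}</math>:

```python
import numpy as np

# Hypothetical 3-node path graph (illustration only).
A = np.array([[0., 1., 0.],
              [1., 0., 1.],
              [0., 1., 0.]])
D = np.diag(A.sum(axis=1))
L = D - A                                # combinatorial graph Laplacian

# Graph Fourier basis: eigenvectors U of the symmetric Laplacian;
# eigh returns eigenvalues in ascending order.
eigvals, U = np.linalg.eigh(L)

x = np.array([1., 0., -1.])              # a signal on the nodes
g_theta = np.diag([1.0, 0.5, 0.1])       # stand-in diagonal spectral filter

# g_theta * x = U g_theta U^T x: transform, filter, transform back
filtered = U @ g_theta @ U.T @ x
```

Here <math>x</math> happens to be an eigenvector of <math>L</math> (eigenvalue 1), so filtering simply scales it by the corresponding filter value 0.5.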
 
===== GCN<ref>{{Cite journal|last=Kipf|first=Thomas N.|last2=Welling|first2=Max|date=2017-02-22|title=Semi-Supervised Classification with Graph Convolutional Networks|url=http://arxiv.org/abs/1609.02907|journal=International Conference on Learning Representations (ICLR), 2017}}</ref> =====
Given an input graph with <math>N</math> nodes and node feature matrix <math>X</math>, GCN simplifies the convolution layer to <math>H^{(l+1)} = \sigma (\tilde{D}^{-1/2}\tilde{A}\tilde{D}^{-1/2}H^{(l)}W^{(l)})</math>, where
 
* <math>\tilde{A} = A + I</math>;
* <math>\tilde{D}_{ii} = \sum_{j} \tilde{A}_{ij}</math>;
* <math>H^{(l)}\in\mathbb{R}^{N\times D}</math> is the input [[Feature (machine learning)|feature map]] of the <math>l</math>-th layer, with <math>H^{(0)} = X</math>;
* <math>H^{(l+1)}\in\mathbb{R}^{N\times D'}</math> is the output [[Feature (machine learning)|feature map]] of the <math>l</math>-th layer;
* <math>W^{(l)}\in\mathbb{R}^{D\times D'}</math> is the trainable weight matrix of the <math>l</math>-th layer;
* <math>\sigma(\cdot)</math> is the [[activation function]].
 
Since both <math>\tilde{A}</math> and <math>\tilde{D}</math> can be pre-computed, this graph convolution is easily accelerated on GPUs. '''Note that this method is only suitable for undirected graphs without edge features.'''
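
The propagation rule above can be sketched directly in NumPy. The following minimal example uses a hypothetical 3-node undirected graph, random features, ReLU as <math>\sigma</math>, and a random matrix standing in for the trained <math>W^{(l)}</math>:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical undirected 3-node graph (illustration only): N = 3, D = 4.
A = np.array([[0., 1., 1.],
              [1., 0., 0.],
              [1., 0., 0.]])
X = rng.normal(size=(3, 4))                 # node feature matrix, H^(0) = X

A_tilde = A + np.eye(3)                     # add self-loops
d = A_tilde.sum(axis=1)                     # degrees of the self-loop graph
D_inv_sqrt = np.diag(d ** -0.5)
A_hat = D_inv_sqrt @ A_tilde @ D_inv_sqrt   # can be pre-computed once per graph

W = rng.normal(size=(4, 2))                 # stand-in trainable weights, D' = 2
H1 = np.maximum(A_hat @ X @ W, 0.0)         # one GCN layer with ReLU
```

Because <math>\tilde{A}</math> is symmetric and <math>\tilde{D}</math> is diagonal, the normalized matrix is symmetric and depends only on the graph, so it can be computed once and reused across layers and training steps.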
 
== References ==