The outputs of one or more MPNN layers are node representations <math>\mathbf{h}_u</math> for each node <math>u \in V</math> in the graph. Node representations can be employed for any downstream task, such as node/graph [[Statistical classification|classification]] or edge prediction.
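
As an illustrative sketch of this update (not drawn from any particular library; the function and parameter names are hypothetical, and sum aggregation with <math>\tanh</math> nonlinearities is assumed for concreteness), one message-passing layer might be written as:

<syntaxhighlight lang="python">
import numpy as np

def mpnn_layer(h, edges, W_msg, W_upd):
    """One message-passing layer with sum aggregation.

    h     : (n, d) array of node representations h_u
    edges : iterable of directed pairs (v, u), "v sends a message to u"
    W_msg : (d, d) weights of the message function
    W_upd : (2*d, d) weights of the update function
    """
    n, d = h.shape
    messages = np.zeros((n, d))
    for v, u in edges:
        # Message from immediate neighbour v to node u (sum aggregation).
        messages[u] += np.tanh(h[v] @ W_msg)
    # Update each node from its previous state and its aggregated message.
    return np.tanh(np.concatenate([h, messages], axis=1) @ W_upd)
</syntaxhighlight>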
 
Graph nodes in an MPNN update their representations by aggregating information from their immediate neighbours. As such, stacking <math>n</math> MPNN layers means that one node will be able to communicate with nodes that are at most <math>n</math> "hops" away. In principle, to ensure that every node receives information from every other node, one would need to stack a number of MPNN layers equal to the graph [[Distance (graph theory)|diameter]]. However, stacking many MPNN layers may cause issues such as oversmoothing<ref name=chen2021 /> and oversquashing.<ref name=alon2021 /> Oversmoothing refers to the issue of node representations becoming indistinguishable. Oversquashing refers to the bottleneck that is created by squeezing long-range dependencies into fixed-size representations. Countermeasures such as skip connections<ref name=hamilton2017 /><ref name=xu2021 /> (as in [[residual neural network]]s), gated update rules<ref name=li2016 /> and jumping knowledge<ref name=xu2018 /> can mitigate oversmoothing. Modifying the final layer to be a fully-adjacent layer, i.e., by considering the graph as a [[complete graph]], can mitigate oversquashing in problems where long-range dependencies are required.<ref name=alon2021 />
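
A sketch of two of these countermeasures, reusing the hypothetical <code>mpnn_layer</code> above, might combine residual skip connections with a fully-adjacent final layer (again an assumed illustration, not a reference implementation):

<syntaxhighlight lang="python">
def stacked_mpnn(h, edges, layer_params):
    """Stack MPNN layers with residual (skip) connections, then apply
    a final fully-adjacent layer (message passing on the complete graph)."""
    n = h.shape[0]
    for W_msg, W_upd in layer_params[:-1]:
        # Skip connection: helps keep node representations distinguishable
        # as depth grows (a countermeasure against oversmoothing).
        h = h + mpnn_layer(h, edges, W_msg, W_upd)
    # Fully-adjacent final layer: every ordered pair of distinct nodes
    # exchanges a message, easing long-range bottlenecks (oversquashing).
    full_edges = [(v, u) for v in range(n) for u in range(n) if v != u]
    W_msg, W_upd = layer_params[-1]
    return mpnn_layer(h, full_edges, W_msg, W_upd)
</syntaxhighlight>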
 
Other "flavours" of MPNN have been developed in the literature,<ref name=bronstein2021 /> such as graph convolutional networks<ref name=kipf2016 /> and graph attention networks,<ref name=velickovic2018 /> whose definitions can be expressed in terms of the MPNN formalism.