Graph neural network
====== KC-Net<ref name=":5" /> ======
The graph in KC-Net is constructed by kNN. KC-Net designs a 3D kernel composed of several learnable 3D points; the convolution at a given center point is computed from the similarity between each pair of "'''one kernel point and the relative position of one neighboring point with respect to the center point'''". KC-Net also provides a graph max-pooling module to better capture higher-level features of point clouds.
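The kernel-correlation idea can be sketched in a few lines of NumPy. This is an illustrative simplification, not KC-Net's exact implementation: the function name and the Gaussian form of the similarity are assumptions, and the learnable kernel points would normally be optimized during training.

```python
import numpy as np

def kernel_correlation(center, neighbors, kernel_points, sigma=0.1):
    """Similarity between learnable kernel points and the neighbors'
    relative positions around one center point (illustrative sketch)."""
    rel = neighbors - center  # (M, 3) relative positions of neighbors
    # pairwise squared distances: kernel point k vs. relative position m
    d2 = ((kernel_points[:, None, :] - rel[None, :, :]) ** 2).sum(-1)  # (K, M)
    # Gaussian similarity, accumulated over neighbors -> one response per kernel point
    return np.exp(-d2 / (2.0 * sigma ** 2)).sum(axis=1)  # (K,)
```

A kernel point that coincides with a neighbor's relative position contributes a similarity of 1; kernel points far from every neighbor contribute almost nothing.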
 
====== DGCNN<ref name=":1" /> ======
The convolution in DGCNN uses two sets of learnable weights to separately aggregate the feature of a given center point and the feature difference between that point and each of its neighbors. Although DGCNN does not use a pooling module to obtain multi-scale features, it dynamically re-defines the graph at each layer by recomputing nearest neighbors using distances in feature space instead of the original 3D space. This dynamic graph construction helps DGCNN better capture semantic information.
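The two-weight aggregation (often called EdgeConv) can be sketched as follows. This is a minimal NumPy illustration under simplifying assumptions: the weight matrices `theta` (applied to feature differences) and `phi` (applied to the center feature) stand in for learnable linear layers, and max is used as the aggregation over neighbors.

```python
import numpy as np

def edge_conv(features, knn_idx, theta, phi):
    """One EdgeConv-style layer (illustrative sketch).

    features: (N, F) point features; knn_idx: (N, k) neighbor indices;
    theta, phi: (F, F_out) learnable weight matrices.
    """
    x_i = features[:, None, :]               # (N, 1, F) center features
    x_j = features[knn_idx]                  # (N, k, F) neighbor features
    # separate weights for the feature difference and the center feature
    msgs = (x_j - x_i) @ theta + x_i @ phi   # (N, k, F_out)
    return msgs.max(axis=1)                  # max aggregation over neighbors
```

In the dynamic-graph variant, `knn_idx` would be recomputed from the output features before the next layer, so later layers connect points that are close in feature space rather than in 3D space.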
 
====== KP-Conv<ref name=":2" /> ======
In KP-Conv, the neighbors of a given center point are chosen by a fixed radius: all points inside the ball around the center are regarded as its neighbors. While the key idea is similar to KC-Net, the kernel points of KP-Conv can be either rigid or deformable depending on the task. To explore geometry at different scales, KP-Conv provides a pooling module based on regional sampling.
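The radius neighborhood and the kernel points' influence on each neighbor can be sketched as below. The function names are illustrative; the linear influence (decaying to zero at distance `sigma` from a kernel point) is one simple choice of correlation, shown here as an assumption rather than KP-Conv's exact formulation.

```python
import numpy as np

def radius_neighbors(points, center, radius):
    """Indices of all points within the ball of given radius around center."""
    dists = np.linalg.norm(points - center, axis=1)
    return np.where(dists <= radius)[0]

def kernel_influence(rel_positions, kernel_points, sigma):
    """Linear influence of each kernel point on each neighbor's
    relative position; zero beyond distance sigma (illustrative sketch)."""
    d = np.linalg.norm(rel_positions[:, None, :] - kernel_points[None, :, :],
                       axis=-1)                      # (M, K)
    return np.maximum(0.0, 1.0 - d / sigma)
```

Unlike kNN graphs, the radius neighborhood gives every center a spatially consistent receptive field regardless of local point density.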
 
====== 3D-GCN<ref name=":3" /><ref name=":4" /> ======
[[File:Illustration of 3D-GCN's receptive field and kernel.png|thumb|390x390px|Illustration of 3D-GCN's receptive field and kernel]]
Graphs are constructed by kNN. 3D-GCN designs deformable 3D kernels, where each kernel has one center kernel point <math>k_C\in \mathbb{R}^3</math> and several support points <math>k_1, k_2, ... k_S\in \mathbb{R}^3</math>. Given a data point <math>p_n\in \mathbb{R}^3</math> and its neighbor points <math>p_1, p_2, ... p_m \in \mathbb{R}^3</math>, the convolution takes the direction vector from the center point to each neighbor <math>p_m - p_n</math> and the direction vector from the center kernel point to each support point <math>k_S - k_C</math>, computes their [[cosine similarity]], and then maps this similarity to feature space with learnable parameters <math>\mathcal{w}</math>. Since the convolution depends on cosine similarity rather than exact coordinates, 3D-GCN captures a 3D object's geometry rather than its ___location, and is '''shift and scale invariant.''' Similar to KC-Net, 3D-GCN also designs a graph max-pooling module to explore multi-resolution information while preserving the largest activation.
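The direction-based response can be sketched as below. This is an illustrative simplification (function name and max aggregation are assumptions; the learnable mapping to feature space is omitted): because both direction sets are normalized before the dot product, translating or uniformly scaling the point cloud leaves the response unchanged.

```python
import numpy as np

def direction_response(center, neighbors, kernel_center, supports):
    """Cosine similarity between neighbor directions and kernel-support
    directions, reduced with max (illustrative sketch)."""
    dirs = neighbors - center          # (M, 3) center -> neighbor directions
    kdirs = supports - kernel_center   # (S, 3) kernel center -> support directions
    # normalize so only directions matter, not distances (scale invariance)
    dirs = dirs / np.linalg.norm(dirs, axis=1, keepdims=True)
    kdirs = kdirs / np.linalg.norm(kdirs, axis=1, keepdims=True)
    sim = dirs @ kdirs.T               # (M, S) cosine similarities
    return sim.max()
```

A neighbor lying exactly along a support direction yields a response of 1, no matter how far away it is or where the object sits in space.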
 
=== Recurrent-based methods ===
 
== References ==