Region Based Convolutional Neural Networks: Difference between revisions

OAbot (talk | contribs)
m Open access bot: url-access=subscription updated in citation with #oabot.
Architecture: {{-}} to stop images from misleadingly going into the wrong sections
Line 23:
=== Selective search ===
Given an image (or an image-like feature map), '''selective search''' (also called Hierarchical Grouping) first segments the image using the graph-based algorithm of Felzenszwalb and Huttenlocher (2004),<ref>{{Cite journal |last1=Felzenszwalb |first1=Pedro F. |last2=Huttenlocher |first2=Daniel P. |date=2004-09-01 |title=Efficient Graph-Based Image Segmentation |url=https://link.springer.com/article/10.1023/B:VISI.0000022288.19776.77 |journal=International Journal of Computer Vision |language=en |volume=59 |issue=2 |pages=167–181 |doi=10.1023/B:VISI.0000022288.19776.77 |issn=1573-1405|url-access=subscription }}</ref> then performs the following:<ref name=":1" />
 
 
'''Input:''' (colour) image
Line 46 ⟶ 45:
[[File:R-cnn.svg|thumb|272x272px|R-CNN architecture]]
Given an input image, R-CNN begins by applying selective search to extract [[Region of interest|regions of interest]] (ROIs), where each ROI is a rectangle that may represent the boundary of an object in the image. Depending on the scenario, there may be as many as {{nobr|two thousand}} ROIs. Each ROI is then fed through a neural network to produce output features. For each ROI's output features, an ensemble of [[support-vector machine]] classifiers is used to determine what type of object (if any) is contained within the ROI.<ref name=":2">{{Cite journal |last1=Girshick |first1=Ross |last2=Donahue |first2=Jeff |last3=Darrell |first3=Trevor |last4=Malik |first4=Jitendra |date=2016-01-01 |title=Region-Based Convolutional Networks for Accurate Object Detection and Segmentation |url=https://ieeexplore.ieee.org/document/7112511 |journal=IEEE Transactions on Pattern Analysis and Machine Intelligence |volume=38 |issue=1 |pages=142–158 |doi=10.1109/TPAMI.2015.2437384 |pmid=26656583 |issn=0162-8828|url-access=subscription }}</ref>
{{-}}
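The three stages described above (region proposal, per-ROI feature extraction, per-class SVM scoring) can be sketched as follows. This is only an illustrative outline, not the published implementation: <code>propose_regions</code> and <code>extract_features</code> are hypothetical stand-ins for selective search and the neural network, and the linear classifier weights are assumed to be given.

```python
import numpy as np

def propose_regions(image, max_rois=2000):
    """Hypothetical stand-in for selective search.

    Returns boxes as (x0, y0, x1, y1). A fixed 2x2 grid is used here
    only so that the sketch runs end to end.
    """
    h, w = image.shape[:2]
    boxes = [(x, y, x + w // 2, y + h // 2)
             for y in (0, h // 2) for x in (0, w // 2)]
    return boxes[:max_rois]

def extract_features(crop):
    """Hypothetical stand-in for the neural network: mean value per channel."""
    return crop.reshape(-1, crop.shape[-1]).mean(axis=0)

def classify(features, weights, biases):
    """Score one linear SVM per class.

    Returns the index of the best-scoring class, or None when no
    classifier gives a positive score (treated as background).
    """
    scores = weights @ features + biases
    best = int(np.argmax(scores))
    return best if scores[best] > 0 else None

def rcnn_detect(image, weights, biases):
    """Run the proposal -> features -> classification pipeline on one image."""
    detections = []
    for (x0, y0, x1, y1) in propose_regions(image):
        crop = image[y0:y1, x0:x1]
        label = classify(extract_features(crop), weights, biases)
        if label is not None:
            detections.append(((x0, y0, x1, y1), label))
    return detections
```

Because every ROI is passed through the feature extractor separately, the cost of this loop grows with the number of proposals.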
 
=== Fast R-CNN ===
Line 51:
[[File:RoI_pooling_animated.gif|thumb|268x268px|RoI pooling to size 2x2. In this example, the region proposal (an input parameter) has size 7x5.]]
At the end of the network is a '''ROIPooling''' module, which slices out each ROI from the network's output tensor, reshapes it to a fixed size, and classifies it. As in the original R-CNN, Fast R-CNN uses selective search to generate its region proposals.
{{-}}
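The slicing and reshaping step can be illustrated with a minimal NumPy sketch of max pooling over one ROI. It assumes a single-channel feature map, ROI coordinates already expressed in feature-map cells, and an ROI at least as large as the output grid; the function name <code>roi_max_pool</code> is hypothetical, and how bin edges are rounded is a choice made here for illustration.

```python
import numpy as np

def roi_max_pool(feature_map, roi, out_h=2, out_w=2):
    """Max-pool the region roi = (x0, y0, x1, y1) down to out_h x out_w.

    The region is split into a grid of bins; when its height or width is
    not divisible by the output size, the bins differ in size.
    """
    x0, y0, x1, y1 = roi
    region = feature_map[y0:y1, x0:x1]
    h, w = region.shape
    # Integer bin edges, e.g. a width of 7 split in two gives [0, 3, 7].
    ys = [(i * h) // out_h for i in range(out_h + 1)]
    xs = [(j * w) // out_w for j in range(out_w + 1)]
    out = np.empty((out_h, out_w), dtype=feature_map.dtype)
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = region[ys[i]:ys[i + 1], xs[j]:xs[j + 1]].max()
    return out

# A 7x5 region (7 wide, 5 high) of an 8x8 map pooled to 2x2.
feature_map = np.arange(64).reshape(8, 8)
pooled = roi_max_pool(feature_map, roi=(0, 3, 7, 8))
```

Whatever the size of the ROI, the output has the same fixed shape, which is what allows ROIs of different sizes to share the layers that follow.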
 
=== Faster R-CNN ===
[[File:Faster-rcnn.svg|thumb|Faster R-CNN]]
While Fast R-CNN used selective search to generate ROIs, Faster R-CNN integrates the ROI generation into the neural network itself.<ref name=":4" />
{{-}}
 
=== Mask R-CNN ===