Revision as of 04:54, 20 June 2025, OAbot: m Open access bot: url-access=subscription updated in citation with #oabot.
Revision as of 06:15, 7 August 2025, 174.138.218.72: →Architecture: {{-}} to stop images from misleadingly going into the wrong sections
Line 23:
=== Selective search ===
Given an image (or an image-like feature map), '''selective search''' (also called Hierarchical Grouping) first segments the image using the algorithm of Felzenszwalb and Huttenlocher (2004),<ref>{{Cite journal |last1=Felzenszwalb |first1=Pedro F. |last2=Huttenlocher |first2=Daniel P. |date=2004-09-01 |title=Efficient Graph-Based Image Segmentation |url=https://link.springer.com/article/10.1023/B:VISI.0000022288.19776.77 |journal=International Journal of Computer Vision |language=en |volume=59 |issue=2 |pages=167–181 |doi=10.1023/B:VISI.0000022288.19776.77 |issn=1573-1405 |url-access=subscription }}</ref> then performs the following:<ref name=":1" />

'''Input:''' (colour) image

Line 46 ⟶ 45:
[[File:R-cnn.svg|thumb|272x272px|R-CNN architecture]]
Given an input image, R-CNN begins by applying selective search to extract [[Region of interest|regions of interest]] (ROI), where each ROI is a rectangle that may represent the boundary of an object in the image. Depending on the scenario, there may be as many as {{nobr|two thousand}} ROIs. After that, each ROI is fed through a neural network to produce output features. For each ROI's output features, an ensemble of [[support-vector machine]] classifiers is used to determine what type of object (if any) is contained within the ROI.<ref name=":2">{{Cite journal |last1=Girshick |first1=Ross |last2=Donahue |first2=Jeff |last3=Darrell |first3=Trevor |last4=Malik |first4=Jitendra |date=2016-01-01 |title=Region-Based Convolutional Networks for Accurate Object Detection and Segmentation |url=https://ieeexplore.ieee.org/document/7112511 |journal=IEEE Transactions on Pattern Analysis and Machine Intelligence |volume=38 |issue=1 |pages=142–158 |doi=10.1109/TPAMI.2015.2437384 |pmid=26656583 |issn=0162-8828 |url-access=subscription }}</ref>

{{-}}

=== Fast R-CNN ===
Line 51:
[[File:RoI_pooling_animated.gif|thumb|268x268px|RoI pooling to size 2x2. In this example, the region proposal (an input parameter) has size 7x5.]]
At the end of the network is a '''ROIPooling''' module, which slices out each ROI from the network's output tensor, reshapes it, and classifies it. As in the original R-CNN, Fast R-CNN uses selective search to generate its region proposals.

{{-}}

=== Faster R-CNN ===
[[File:Faster-rcnn.svg|thumb|Faster R-CNN]]
While Fast R-CNN used selective search to generate ROIs, Faster R-CNN integrates the ROI generation into the neural network itself.<ref name=":4" />

{{-}}

=== Mask R-CNN ===

Region Based Convolutional Neural Networks: Difference between revisions