Revision as of 03:30, 11 June 2024 by AnomieBOT (m, Dating maintenance tags: {{Cn}})
Revision as of 01:27, 15 June 2024 by Sirgeorge The 1ST (added links to page describing the aforementioned "Selective Search" algorithm.)
Line 7:
The original goal of R-CNN was to take an input image and produce a set of bounding boxes as output, where each bounding box contains an object and also the category (e.g. car or pedestrian) of the object. More recently, R-CNN has been extended to perform other computer vision tasks. The following covers some of the versions of R-CNN that have been developed.

* November 2013: '''R-CNN'''. Given an input image, R-CNN begins by applying a mechanism called [[Selective Search (Object Recognition)|Selective Search]] to extract [[Region of interest|regions of interest]] (ROI), where each ROI is a rectangle that may represent the boundary of an object in the image. Depending on the scenario, there may be as many as {{nobr|two thousand}} ROIs. After that, each ROI is fed through a neural network to produce output features. For each ROI's output features, a collection of [[support-vector machine]] classifiers is used to determine what type of object (if any) is contained within the ROI.{{cn|date=June 2024}}
* April 2015: '''Fast R-CNN'''. While the original R-CNN independently computed the neural network features on each of as many as two thousand regions of interest, Fast R-CNN runs the neural network once on the whole image. At the end of the network is a novel method called ROIPooling, which slices out each ROI from the network's output tensor, reshapes it, and classifies it. As in the original R-CNN, Fast R-CNN uses [[Selective Search (Object Recognition)|Selective Search]] to generate its region proposals.<ref name=":0">{{Cite news|last=Bhatia|first=Richa|url=https://analyticsindiamag.com/what-is-region-of-interest-pooling/|title=What is region of interest pooling?|date=September 10, 2018|work=Analytics India|access-date=March 12, 2020}}</ref>
* June 2015: '''Faster R-CNN'''. While Fast R-CNN used [[Selective Search (Object Recognition)|Selective Search]] to generate ROIs, Faster R-CNN integrates the ROI generation into the neural network itself.<ref name=":0" />
* March 2017: '''Mask R-CNN'''. While previous versions of R-CNN focused on object detection, Mask R-CNN adds instance segmentation. Mask R-CNN also replaced ROIPooling with a new method called ROIAlign, which can represent fractions of a pixel.<ref>{{Cite news|last=Farooq|first=Umer|url=https://medium.com/@umerfarooq_26378/from-r-cnn-to-mask-r-cnn-d6367b196cfd|title=From R-CNN to Mask R-CNN|date=February 15, 2018|work=Medium|access-date=March 12, 2020}}</ref><ref>{{Cite news|last=Weng|first=Lilian|url=https://lilianweng.github.io/lil-log/2017/12/31/object-recognition-for-dummies-part-3.html|title=Object Detection for Dummies Part 3: R-CNN Family|date=December 31, 2017|work=Lil'Log|access-date=March 12, 2020}}</ref>
* June 2019: '''Mesh R-CNN''' adds the ability to generate a 3D mesh from a 2D image.<ref>{{Cite news|last=Wiggers|first=Kyle|url=https://venturebeat.com/2019/10/29/facebook-highlights-ai-that-converts-2d-objects-into-3d-shapes/|title=Facebook highlights AI that converts 2D objects into 3D shapes|date=October 29, 2019|work=VentureBeat|access-date=March 12, 2020}}</ref>
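
The difference between ROIPooling and ROIAlign described in the list above can be illustrated with a minimal NumPy sketch. It is not the implementation from any of the R-CNN papers or libraries: the function names <code>roi_max_pool</code> and <code>bilinear_sample</code>, the single-channel 2D feature map, the end-exclusive integer ROI format <code>(y0, x0, y1, x1)</code>, and the 2×2 output grid are all assumptions made for illustration. The first function slices an ROI out of a feature map and max-pools it to a fixed size using integer bin boundaries; the second reads the feature map at a fractional location by bilinear interpolation, which is the kind of sub-pixel lookup that lets ROIAlign avoid rounding ROI coordinates.

```python
import numpy as np


def roi_max_pool(feature_map, roi, output_size=(2, 2)):
    """Max-pool one ROI of a 2D feature map into a fixed-size grid.

    Hypothetical helper: roi is (y0, x0, y1, x1) in integer feature-map
    coordinates, end-exclusive, and must cover at least one cell.
    """
    y0, x0, y1, x1 = roi
    region = feature_map[y0:y1, x0:x1]
    out_h, out_w = output_size
    # Integer bin edges: this rounding is what ROIAlign later avoids.
    y_edges = np.linspace(0, region.shape[0], out_h + 1).astype(int)
    x_edges = np.linspace(0, region.shape[1], out_w + 1).astype(int)
    pooled = np.empty(output_size, dtype=feature_map.dtype)
    for i in range(out_h):
        for j in range(out_w):
            # Make sure every bin covers at least one cell.
            ys, ye = y_edges[i], max(y_edges[i + 1], y_edges[i] + 1)
            xs, xe = x_edges[j], max(x_edges[j + 1], x_edges[j] + 1)
            pooled[i, j] = region[ys:ye, xs:xe].max()
    return pooled


def bilinear_sample(feature_map, y, x):
    """Interpolate a 2D feature map at a fractional (y, x) location.

    Hypothetical helper: assumes 0 <= y <= H - 1 and 0 <= x <= W - 1.
    """
    h, w = feature_map.shape
    y0, x0 = int(np.floor(y)), int(np.floor(x))
    y1, x1 = min(y0 + 1, h - 1), min(x0 + 1, w - 1)
    dy, dx = y - y0, x - x0
    # Blend the four neighbouring cells, first along x, then along y.
    top = (1 - dx) * feature_map[y0, x0] + dx * feature_map[y0, x1]
    bottom = (1 - dx) * feature_map[y1, x0] + dx * feature_map[y1, x1]
    return (1 - dy) * top + dy * bottom


if __name__ == "__main__":
    fmap = np.arange(36, dtype=float).reshape(6, 6)  # toy feature map
    print(roi_max_pool(fmap, (1, 1, 5, 5)))          # fixed 2x2 output
    print(bilinear_sample(fmap, 1.5, 2.5))           # sub-pixel lookup
```

In this sketch every ROI, whatever its size, is reduced to the same fixed grid, which is what allows a single classifier head to follow the pooling step. A full ROIAlign layer would evaluate a lookup like <code>bilinear_sample</code> at several points inside each output bin and aggregate the results; that step is omitted here.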

Region Based Convolutional Neural Networks: Difference between revisions