{{Short description|Machine learning model family}}
[[File:R-cnn.svg|thumb|272x272px|R-CNN architecture]]
'''Region-based Convolutional Neural Networks (R-CNN)''' are a family of machine learning models for [[computer vision]], and specifically [[object detection]] and localization.<ref name=":0">{{Cite book |
R-CNN has been extended to perform other computer vision tasks, such as: tracking objects from a drone-mounted camera,<ref>{{Cite news |last=Nene |first=Vidi |date=Aug 2, 2019 |title=Deep Learning-Based Real-Time Multiple-Object Detection and Tracking via Drone |url=https://dronebelow.com/2019/08/02/deep-learning-based-real-time-multiple-object-detection-and-tracking-via-drone/ |access-date=Mar 28, 2020 |work=Drone Below}}</ref> locating text in an image,<ref>{{Cite news |last=Ray |first=Tiernan |date=Sep 11, 2018 |title=Facebook pumps up character recognition to mine memes |url=https://www.zdnet.com/article/facebook-pumps-up-character-recognition-to-mine-memes/ |access-date=Mar 28, 2020 |publisher=[[ZDNET]]}}</ref> and enabling object detection in [[Google Lens]].<ref>{{Cite news |last=Sagar |first=Ram |date=Sep 9, 2019 |title=These machine learning methods make google lens a success |url=https://analyticsindiamag.com/these-machine-learning-techniques-make-google-lens-a-success/ |access-date=Mar 28, 2020 |work=Analytics India}}</ref>
* November 2013: '''R-CNN'''.<ref name=":2" />
* April 2015: '''Fast R-CNN'''.<ref name=":3">{{Cite
* June 2015: '''Faster R-CNN'''.<ref name=":4">{{Cite journal |
* March 2017: '''Mask R-CNN'''.<ref name=":5">{{Cite
* June 2019: '''Mesh R-CNN''' adds the ability to generate a 3D mesh from a 2D image.<ref>{{Cite journal |last1=Gkioxari |first1=Georgia |last2=Malik |first2=Jitendra |last3=Johnson |first3=Justin |date=2019 |title=Mesh R-CNN |url=https://openaccess.thecvf.com/content_ICCV_2019/html/Gkioxari_Mesh_R-CNN_ICCV_2019_paper.html |pages=9785–9795|arxiv=1906.02739 }}</ref>
== Architecture ==
=== Selective search ===
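Given an image (or an image-like feature map), '''selective search''' (also called Hierarchical Grouping) first segments the image into initial regions using the algorithm of Felzenszwalb and Huttenlocher (2004), then greedily merges the most similar neighbouring regions, as follows: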
<syntaxhighlight lang="text">
Input: (colour) image
Output: Set of object location hypotheses L

Segment image into initial regions R = {r₁, ..., rₙ} using Felzenszwalb and Huttenlocher (2004)
Initialise similarity set S = ∅
foreach Neighbouring region pair (rᵢ, rⱼ) do
    Calculate similarity s(rᵢ, rⱼ)
    S = S ∪ s(rᵢ, rⱼ)
while S ≠ ∅ do
    Get highest similarity s(rᵢ, rⱼ) = max(S)
    Merge corresponding regions rₜ = rᵢ ∪ rⱼ
    Remove similarities regarding rᵢ: S = S \ s(rᵢ, r∗)
    Remove similarities regarding rⱼ: S = S \ s(r∗, rⱼ)
    Calculate similarity set Sₜ between rₜ and its neighbours
    S = S ∪ Sₜ
    R = R ∪ rₜ
Extract object location boxes L from all regions in R
</syntaxhighlight>
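The grouping loop can be illustrated with a minimal Python sketch. It is not the reference implementation: the initial segmentation is assumed to be given as sets of pixel coordinates, and the similarity measure (negated difference of mean intensities) is a toy stand-in chosen for this example. The names <code>hierarchical_grouping</code>, <code>are_neighbours</code>, <code>similarity</code> and <code>bounding_box</code> are hypothetical.

```python
from itertools import combinations


def bounding_box(pixels):
    """Return (min_x, min_y, max_x, max_y) for a set of (x, y) pixels."""
    xs = [x for x, _ in pixels]
    ys = [y for _, y in pixels]
    return (min(xs), min(ys), max(xs), max(ys))


def are_neighbours(a, b):
    """True if some pixel of a is 4-adjacent to some pixel of b."""
    steps = ((1, 0), (-1, 0), (0, 1), (0, -1))
    return any((x + dx, y + dy) in b for x, y in a for dx, dy in steps)


def mean_intensity(region, image):
    return sum(image[y][x] for x, y in region) / len(region)


def similarity(a, b, image):
    """Toy similarity (an assumption of this sketch, not the published measure)."""
    return -abs(mean_intensity(a, image) - mean_intensity(b, image))


def hierarchical_grouping(image, initial_regions):
    """Greedily merge the most similar neighbouring regions; return all boxes."""
    regions = {i: frozenset(r) for i, r in enumerate(initial_regions)}  # R
    active = set(regions)  # regions that have not been merged away yet
    S = {}                 # similarity set, keyed by a pair of region ids
    for i, j in combinations(sorted(active), 2):
        if are_neighbours(regions[i], regions[j]):
            S[(i, j)] = similarity(regions[i], regions[j], image)

    next_id = len(regions)
    while S:
        i, j = max(S, key=S.get)            # highest similarity in S
        merged = regions[i] | regions[j]    # r_t = r_i ∪ r_j
        # Remove every similarity that involves r_i or r_j.
        S = {p: s for p, s in S.items() if i not in p and j not in p}
        active -= {i, j}
        # Similarities between r_t and its remaining neighbours (S_t).
        for k in active:
            if are_neighbours(merged, regions[k]):
                S[(k, next_id)] = similarity(merged, regions[k], image)
        regions[next_id] = merged           # R = R ∪ r_t
        active.add(next_id)
        next_id += 1

    return [bounding_box(r) for r in regions.values()]


# Toy 2x4 image whose four columns serve as the initial regions.
image = [[10, 10, 200, 205],
         [10, 12, 200, 205]]
columns = [{(x, 0), (x, 1)} for x in range(4)]
boxes = hierarchical_grouping(image, columns)
```

In this toy run the two dark columns merge first, then the two bright columns, and finally the two halves, so <code>boxes</code> holds the four initial boxes plus three merged ones. Keeping every intermediate region, rather than only the final one, is what lets the method propose boxes at several scales.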
=== R-CNN ===
[[File:R-cnn.svg|thumb|272x272px|R-CNN architecture]]
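Given an input image, R-CNN begins by applying selective search to extract regions of interest (ROIs), of which there may be as many as two thousand. A neural network then computes features for each ROI independently, and these features are used to classify the object, if any, that the ROI contains.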
{{-}}
=== Fast R-CNN ===
[[File:Fast-rcnn.svg|thumb|Fast R-CNN]]While the original R-CNN independently computed the neural network features on each of as many as two thousand regions of interest, Fast R-CNN runs the neural network once on the whole image.<ref name=":3" />
[[File:RoI_pooling_animated.gif|thumb|268x268px|RoI pooling to size 2x2. In this example, the region proposal (an input parameter) has size 7x5.]]
At the end of the network is a '''ROIPooling''' module, which slices out each ROI from the network's output tensor, reshapes it to a fixed size, and classifies it. As in the original R-CNN, Fast R-CNN uses selective search to generate its region proposals.
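The slicing and reshaping step can be sketched in plain Python as max pooling over a grid of bins. This is an illustration under stated assumptions, not the code of any particular library: the function name <code>roi_max_pool</code> and the <code>(x0, y0, width, height)</code> ROI layout are hypothetical, and the bin edges use integer floor division, which is only one of several rounding conventions found in practice.

```python
def roi_max_pool(feature_map, roi, out_h=2, out_w=2):
    """Max-pool one ROI of a 2D feature map to a fixed out_h x out_w grid.

    roi is (x0, y0, width, height) in feature-map cells (an assumed layout).
    Requires height >= out_h and width >= out_w so that no bin is empty.
    """
    x0, y0, w, h = roi
    # Integer bin edges; floor division is an assumption of this sketch.
    row_edges = [y0 + (i * h) // out_h for i in range(out_h + 1)]
    col_edges = [x0 + (j * w) // out_w for j in range(out_w + 1)]
    return [
        [
            max(
                feature_map[y][x]
                for y in range(row_edges[i], row_edges[i + 1])
                for x in range(col_edges[j], col_edges[j + 1])
            )
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


# 8x8 feature map whose value encodes its position: 10 * row + column.
feature_map = [[10 * y + x for x in range(8)] for y in range(8)]

# A 7-wide, 5-tall region with its top-left corner at column 0, row 3.
pooled = roi_max_pool(feature_map, roi=(0, 3, 7, 5))
```

Here the 5 rows split into bins of 2 and 3 rows and the 7 columns into bins of 3 and 4 columns, so <code>pooled</code> is <code>[[42, 46], [72, 76]]</code>. Whatever the size of the ROI, the output has the same shape, which is what allows a single classifier head to process every proposal.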
{{-}}
=== Faster R-CNN ===
[[File:Faster-rcnn.svg|thumb|Faster R-CNN]]While Fast R-CNN used selective search to generate ROIs, Faster R-CNN integrates the ROI generation into the neural network itself.<ref name=":4" />
{{-}}
=== Mask R-CNN ===
[[File:Mask-rcnn.svg|thumb|Mask R-CNN]]While previous versions of R-CNN focused on object detection, Mask R-CNN adds instance segmentation.
== References ==
<references />
== Further reading ==
* {{Cite web |last=Parthasarathy |first=Dhruv |date=2017-04-27 |title=A Brief History of CNNs in Image Segmentation: From R-CNN to Mask R-CNN |url=https://blog.athelas.com/a-brief-history-of-cnns-in-image-segmentation-from-r-cnn-to-mask-r-cnn-34ea83205de4 |access-date=2024-09-11 |website=Medium |language=en}}
[[Category:Object recognition and categorization]]