Object categorization from image search: Difference between revisions

BG19bot (talk | contribs)
m Performance: Remove blank line(s) between list items per WP:LISTGAP to fix an accessibility issue for users of screen readers. Do WP:GENFIXES and cleanup if needed. Discuss this at... using AWB
Bender the Bot (talk | contribs)
m Model: HTTP to HTTPS for Brown University
 
(19 intermediate revisions by 13 users not shown)
Line 1:
{{update|date=September 2019}}
In [[computer vision]], '''object categorization from image search''' is the problem of training a [[Statistical classification|classifier]] to recognize categories of objects using only [[image search]] results, i.e., images retrieved automatically with an Internet [[search engine]]. Ideally, automatic image collection would allow classifiers to be trained with nothing but the category names as input. This problem is closely related to that of [[content-based image retrieval]] (CBIR), where the goal is to return better image search results rather than to train a classifier for image recognition.
 
Traditionally, classifiers are trained using sets of images that are labeled by hand. Collecting such a set of images is often a very time-consuming and laborious process. The use of Internet search engines to automate the process of acquiring large sets of labeled images has been described as a potential way of greatly facilitating computer vision research.<ref name = "fergus">
{{cite conference
| last = Fergus
| first = R. |author2=Fei-Fei, L. |author3=Perona, P. |author4=Zisserman, A.
| title = Learning Object Categories from Google's Image Search
| book-title = Proc. IEEE International Conference on Computer Vision
| url = http://vision.cs.princeton.edu/documents/FergusFei-FeiPeronaZisserman_ICCV05.pdf
| year = 2005}}
Line 14 ⟶ 15:
 
=== Unrelated images ===
One problem with using Internet image search results as a training set for a classifier is the high percentage of unrelated images within the results. It has been estimated that, when a search engine such as Google Images is queried with the name of an object category (such as ''airplane''), up to 85% of the returned images are unrelated to the category.<ref name = "fergus"/>
 
=== Intra-class variability ===
Line 29 ⟶ 30:
<math>\displaystyle P(w|d) = \sum_{z=1}^Z P(w|z)P(z|d)</math>
 
An important assumption made in this model is that <math>\displaystyle w</math> and <math>\displaystyle d</math> are conditionally independent given <math>\displaystyle z</math>. Given a topic, the probability of a certain word appearing as part of that topic is independent of the rest of the image.<ref name = "hofmann">{{cite conference
| first = Thomas
| last = Hofmann
| title = Probabilistic Latent Semantic Analysis
| book-title = Uncertainty in Artificial Intelligence
| year = 1999
| url = https://www.cs.brown.edu/~th/papers/Hofmann-UAI99.pdf
|url-status = dead
|archive-url = https://web.archive.org/web/20070710083034/http://www.cs.brown.edu/~th/papers/Hofmann-UAI99.pdf
|archive-date = 2007-07-10
}}</ref>
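
The mixture above can be illustrated with a minimal sketch in plain Python. The function name <code>word_given_document</code>, the nested-list layout, and all probability values are hypothetical choices made for this illustration; they are not taken from the cited papers. Here the "documents" are images and the "words" are visual words.

```python
def word_given_document(p_w_given_z, p_z_given_d):
    """Return P(w|d) = sum_z P(w|z) * P(z|d), one row per document.

    p_w_given_z[z][w] holds P(w|z); p_z_given_d[d][z] holds P(z|d).
    Both layouts are assumptions made for this sketch.
    """
    num_words = len(p_w_given_z[0])
    result = []
    for topic_mix in p_z_given_d:
        # Because w and d are conditionally independent given z,
        # the document influences the word only through its topic mix.
        row = [
            sum(topic_mix[z] * p_w_given_z[z][w] for z in range(len(topic_mix)))
            for w in range(num_words)
        ]
        result.append(row)
    return result


# Hypothetical toy values: Z = 2 topics, 3 visual words, 2 images.
p_w_given_z = [
    [0.7, 0.2, 0.1],  # word distribution of topic 0
    [0.1, 0.3, 0.6],  # word distribution of topic 1
]
p_z_given_d = [
    [0.9, 0.1],  # image 0 is dominated by topic 0
    [0.5, 0.5],  # image 1 mixes both topics equally
]

p_w_given_d = word_given_document(p_w_given_z, p_z_given_d)
for d, row in enumerate(p_w_given_d):
    print(d, [round(p, 3) for p in row])
```

Each row of the result is itself a probability distribution over words, since it is a convex combination of the per-topic word distributions.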
 
Training this model involves finding <math>\displaystyle P(w|z)</math> and <math>\displaystyle P(z|d)</math> that maximize the likelihood of the observed words in each document. To do this, the [[expectation maximization]] algorithm is used, with the following [[objective function]]:
Line 63 ⟶ 68:
==== Selecting words ====
Words in an image were selected using 4 different feature detectors:<ref name = "fergus"/>
* [[Kadir–Brady saliency detector]]
* [[Corner detection|Multi-scale Harris detector]]
* [[Difference of Gaussians]]
Line 83 ⟶ 88:
| first = Li-Jia |author2=Wang, Gang |author3=Fei-Fei, Li
| title = OPTIMOL: automatic Online Picture collection via Incremental MOdel Learning
| book-title = Proc. IEEE Conference on Computer Vision and Pattern Recognition
| year = 2007
| url = http://vision.cs.princeton.edu/documents/LiWangFei-Fei_CVPR2007.pdf}}
Line 104 ⟶ 109:
{{cite journal
| last = Teh
| first = Yw |author2=Jordan, MI |author3=Beal, MJ |author4=Blei, David
| title = Hierarchical Dirichlet Processes
| journal = Journal of the American Statistical Association
Line 113 ⟶ 118:
| issue = 476
| page = 1566
| citeseerx = 10.1.1.5.9094 | s2cid = 7934949 }}
</ref>
 
Line 157 ⟶ 162:
{{cite conference
| last = Fergus
| first = R. |author2=Perona, P. |author3=Zisserman, A.
| title = A visual category filter for Google images
| book-title = Proc. 8th European Conf. on Computer Vision
| year = 2004
| url = http://www.robots.ox.ac.uk/~fergus/papers/Fergus_ECCV4.pdf
Line 169 ⟶ 174:
|author2=Forsyth, D.
| title = Animals on the web
| book-title = Proc. Computer Vision and Pattern Recognition
| year = 2006
| doi = 10.1109/CVPR.2006.57
| url = http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=1640929
}}</ref>
* Yanai and Barnard, 2005 <ref>
Line 179 ⟶ 184:
|author2=Barnard, K.
| title = Probabilistic web image gathering
| book-title = ACM SIGMM workshop on Multimedia information retrieval
| year = 2005
| url = http://portal.acm.org/citation.cfm?id=1101838
Line 186 ⟶ 191:
== References ==
<references/>
 
== See also ==
Line 198 ⟶ 200:
 
[[Category:Object recognition and categorization]]
[[Category:Image search]]