m WP:CHECKWIKI error fix for #61. Punctuation goes before References. Do general fixes if a problem exists. - using AWB (9421) |
m →Model: HTTP to HTTPS for Brown University |
(25 intermediate revisions by 18 users not shown) | |||
Line 1:
{{update|date=September 2019}}
In [[computer vision]],
Traditionally, classifiers are trained using sets of images that are labeled by hand. Collecting such a set is often time-consuming and laborious. Using Internet search engines to automate the acquisition of large sets of labeled images has been described as a potential way to greatly facilitate computer vision research.<ref name = "fergus">
{{cite conference
| last = Fergus
| coauthors = Fei-Fei, L.; Perona, P.; Zisserman, A.
| title = Learning Object Categories from Google's Image Search
|
| url = http://vision.cs.princeton.edu/documents/FergusFei-FeiPeronaZisserman_ICCV05.pdf
| year = 2005}}
Line 15:
=== Unrelated images ===
One problem with using Internet image search results as a training set for a classifier is the high percentage of unrelated images within the results. It has been estimated that, when a search engine such as Google Images is queried with the name of an object category (such as ''airplane
=== Intra-class variability ===
Line 30:
<math>\displaystyle P(w|d) = \sum_{z=1}^Z P(w|z)P(z|d)</math>
An important assumption made in this model is that <math>\displaystyle w</math> and <math>\displaystyle d</math> are conditionally independent given <math>\displaystyle z</math>. Given a topic, the probability of a certain word appearing as part of that topic is independent of the rest of the image.<ref name
|
|
|
|book-title
|
|
|url-status = dead
|archive-url = https://web.archive.org/web/20070710083034/http://www.cs.brown.edu/~th/papers/Hofmann-UAI99.pdf
|archive-date = 2007-07-10
}}</ref>
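The decomposition above can be illustrated with a minimal numerical sketch. The probability tables below are hypothetical toy values (three visual words, two topics, two images) chosen only for illustration; they do not come from the cited papers. The sketch shows that mixing the per-topic word distributions <math>\displaystyle P(w|z)</math> with an image's topic weights <math>\displaystyle P(z|d)</math> yields a valid word distribution <math>\displaystyle P(w|d)</math>.

```python
# Hypothetical toy model: 3 visual words, 2 topics, 2 documents (images).
# p_w_given_z[z][w] = P(w|z); each row sums to 1.
p_w_given_z = [
    [0.7, 0.2, 0.1],
    [0.1, 0.3, 0.6],
]
# p_z_given_d[d][z] = P(z|d); each row sums to 1.
p_z_given_d = [
    [0.9, 0.1],
    [0.25, 0.75],
]


def word_distribution(d):
    """Return [P(w|d) for every word w] via P(w|d) = sum_z P(w|z) P(z|d)."""
    n_topics = len(p_w_given_z)
    n_words = len(p_w_given_z[0])
    return [
        sum(p_w_given_z[z][w] * p_z_given_d[d][z] for z in range(n_topics))
        for w in range(n_words)
    ]


for d in range(len(p_z_given_d)):
    dist = word_distribution(d)
    # Each mixture is itself a probability distribution over words.
    print("image", d, [round(p, 3) for p in dist], "sum =", round(sum(dist), 6))
```

Because each word's probability depends on the image only through the topic weights, two images with the same <math>\displaystyle P(z|d)</math> would receive identical word distributions under this sketch, which is the conditional independence assumption in concrete form.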
Training this model involves finding the <math>\displaystyle P(w|z)</math> and <math>\displaystyle P(z|d)</math> that maximize the likelihood of the observed words in each document. To do this, the [[expectation maximization]] algorithm is used, with the following [[objective function]]:
Line 64 ⟶ 68:
==== Selecting words ====
Words in an image were selected using four different feature detectors:<ref name = "fergus"/>
* [[
* [[Corner detection|Multi-scale Harris detector]]
* [[Difference of Gaussians]]
Line 82 ⟶ 86:
{{cite conference
| last = Li
| first = Li-Jia |author2=Wang, Gang |author3=Fei-Fei, Li
| title = OPTIMOL: automatic Online Picture collection via Incremental MOdel Learning
|
| year = 2007
| url = http://vision.cs.princeton.edu/documents/LiWangFei-Fei_CVPR2007.pdf}}
Line 106 ⟶ 109:
{{cite journal
| last = Teh
| coauthors = Jordan, MI; Beal, MJ; Blei, David
| title = Hierarchical Dirichlet Processes
| journal = Journal of the American Statistical Association
Line 116 ⟶ 118:
| issue = 476
| page = 1566
| citeseerx = 10.1.1.5.9094 | s2cid = 7934949 }}
</ref>
Line 147 ⟶ 149:
* ''Ability to collect images'': OPTIMOL was found to automatically collect large numbers of good images from the web. The size of the OPTIMOL-retrieved image sets surpasses that of large human-labeled image sets for the same categories, such as those found in [[Caltech 101]].
* ''Classification accuracy'': Classification accuracy was compared with that of the classifier produced by the pLSA methods discussed earlier. OPTIMOL achieved slightly higher accuracy, obtaining 74.8% on 7 object categories, compared with 72.0%.
* ''Comparison with batch learning'': An important question to address is whether OPTIMOL's incremental learning gives it an advantage over traditional batch learning methods, when everything else about the model is held constant. When the classifier learns incrementally, by selecting the next images based on what it learned from the previous ones, three important results are observed:
** Incremental learning allows OPTIMOL to collect a better dataset
Line 162:
{{cite conference
| last = Fergus
| first = R. |author2=Perona, P. |author3=Zisserman, A.
| title = A visual category filter for Google images
|
| year = 2004
| url = http://www.robots.ox.ac.uk/~fergus/papers/Fergus_ECCV4.pdf
Line 173 ⟶ 172:
| last = Berg
| first = T.
|
| title = Animals on the web
|
| year = 2006
| doi = 10.1109/CVPR.2006.57
}}</ref>
* Yanai and Barnard, 2005<ref>
Line 183 ⟶ 182:
| last = Yanai
| first = K
|
| title = Probabilistic web image gathering
|
| year = 2005
| url = http://portal.acm.org/citation.cfm?id=1101838
Line 192 ⟶ 191:
== References ==
<references/>
== See also ==
Line 204 ⟶ 200:
[[Category:Object recognition and categorization]]
[[Category:Image search]]