No edit summary
m →Model: HTTP to HTTPS for Brown University
(20 intermediate revisions by 14 users not shown)
Line 1:
{{update|date=September 2019}}
In [[computer vision]],
Traditionally, classifiers are trained using sets of images that are labeled by hand. Collecting such a set is often time-consuming and laborious. Using Internet search engines to automate the acquisition of large sets of labeled images has been described as a potential way of greatly facilitating computer vision research.<ref name = "fergus">
{{cite conference
| last = Fergus
| first = R. |author2=Fei-Fei, L. |author3=Perona, P. |author4=Zisserman, A.
| title = Learning Object Categories from Google's Image Search
|
| url = http://vision.cs.princeton.edu/documents/FergusFei-FeiPeronaZisserman_ICCV05.pdf
| year = 2005}}
Line 14 ⟶ 15:
=== Unrelated images ===
One problem with using Internet image search results as a training set for a classifier is the high percentage of unrelated images within the results. It has been estimated that, when a search engine such as Google Images is queried with the name of an object category (such as ''airplane
=== Intra-class variability ===
Line 29 ⟶ 30:
<math>\displaystyle P(w|d) = \sum_{z=1}^Z P(w|z)P(z|d)</math>
An important assumption made in this model is that <math>\displaystyle w</math> and <math>\displaystyle d</math> are conditionally independent given <math>\displaystyle z</math>. Given a topic, the probability of a certain word appearing as part of that topic is independent of the rest of the image.<ref name
|
|
|
|book-title
|
|
|url-status = dead
|archive-url = https://web.archive.org/web/20070710083034/http://www.cs.brown.edu/~th/papers/Hofmann-UAI99.pdf
|archive-date = 2007-07-10
}}</ref>
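The decomposition above can be sketched numerically as a matrix product. The following is a minimal illustration, not the cited authors' implementation: the function name <code>plsa_word_given_doc</code>, the sizes <code>W</code>, <code>Z</code>, <code>D</code>, and the random distributions are all hypothetical and chosen only to show that mixing <math>\displaystyle P(w|z)</math> with <math>\displaystyle P(z|d)</math> yields a valid distribution over words for each document.

```python
import numpy as np


def plsa_word_given_doc(p_w_given_z, p_z_given_d):
    """Return P(w|d) = sum_z P(w|z) P(z|d) as a (W, D) matrix.

    p_w_given_z: (W, Z) array, each column is a distribution over words.
    p_z_given_d: (Z, D) array, each column is a distribution over topics.
    """
    return p_w_given_z @ p_z_given_d


# Hypothetical sizes: 5 visual words, 2 topics, 3 documents (images).
W, Z, D = 5, 2, 3
rng = np.random.default_rng(0)

# Random non-negative tables, normalized so every column sums to 1.
p_w_given_z = rng.random((W, Z))
p_w_given_z /= p_w_given_z.sum(axis=0, keepdims=True)
p_z_given_d = rng.random((Z, D))
p_z_given_d /= p_z_given_d.sum(axis=0, keepdims=True)

p_w_given_d = plsa_word_given_doc(p_w_given_z, p_z_given_d)

# Same quantity computed term by term, mirroring the summation over z.
p_w_given_d_loop = np.zeros((W, D))
for z in range(Z):
    p_w_given_d_loop += np.outer(p_w_given_z[:, z], p_z_given_d[z, :])
```

Because each column of both input tables sums to 1, each column of <code>p_w_given_d</code> also sums to 1, so every document receives a proper distribution over words.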
Training this model involves finding the <math>\displaystyle P(w|z)</math> and <math>\displaystyle P(z|d)</math> that maximize the likelihood of the observed words in each document. To do this, the [[expectation maximization]] algorithm is used, with the following [[objective function]]:
Line 63 ⟶ 68:
==== Selecting words ====
Words in an image were selected using 4 different feature detectors:<ref name = "fergus"/>
* [[
* [[Corner detection|Multi-scale Harris detector]]
* [[Difference of Gaussians]]
Line 83 ⟶ 88:
| first = Li-Jia |author2=Wang, Gang |author3=Fei-Fei, Li
| title = OPTIMOL: automatic Online Picture collection via Incremental MOdel Learning
|
| year = 2007
| url = http://vision.cs.princeton.edu/documents/LiWangFei-Fei_CVPR2007.pdf}}
Line 104 ⟶ 109:
{{cite journal
| last = Teh
| first = Yw |author2=Jordan, MI |author3=Beal, MJ |author4=Blei, David
| title = Hierarchical Dirichlet Processes
| journal = Journal of the American Statistical Association
Line 113 ⟶ 118:
| issue = 476
| page = 1566
| citeseerx = 10.1.1.5.9094 | s2cid = 7934949 }}
</ref>
Line 144 ⟶ 149:
* ''Ability to collect images'': OPTIMOL was found to automatically collect large numbers of good images from the web. The size of the OPTIMOL-retrieved image sets surpasses that of large human-labeled image sets for the same categories, such as those found in [[Caltech 101]].
* ''Classification accuracy'': OPTIMOL's classification accuracy was compared with that of the classifier produced by the pLSA methods discussed earlier. OPTIMOL achieved slightly higher accuracy: 74.8% on 7 object categories, compared with 72.0%.
* ''Comparison with batch learning'': An important question to address is whether OPTIMOL's incremental learning gives it an advantage over traditional batch learning methods, when everything else about the model is held constant. When the classifier learns incrementally, by selecting the next images based on what it learned from the previous ones, three important results are observed:
** Incremental learning allows OPTIMOL to collect a better dataset
Line 159 ⟶ 162:
{{cite conference
| last = Fergus
| first = R. |author2=Perona, P. |author3=Zisserman, A.
| title = A visual category filter for Google images
|
| year = 2004
| url = http://www.robots.ox.ac.uk/~fergus/papers/Fergus_ECCV4.pdf
Line 171 ⟶ 174:
|author2=Forsyth, D.
| title = Animals on the web
|
| year = 2006
| doi = 10.1109/CVPR.2006.57
}}</ref>
* Yanai and Barnard, 2005<ref>
Line 181 ⟶ 184:
|author2=Barnard, K.
| title = Probabilistic web image gathering
|
| year = 2005
| url = http://portal.acm.org/citation.cfm?id=1101838
Line 188 ⟶ 191:
== References ==
<references/>
== See also ==
Line 200:
[[Category:Object recognition and categorization]]
[[Category:Image search]]