Object categorization from image search: Difference between revisions

Revision as of 23:24, 13 May 2016 by BG19bot (1,005,055 edits) m →Performance: Remove blank line(s) between list items per WP:LISTGAP to fix an accessibility issue for users of screen readers. Do WP:GENFIXES and cleanup if needed. Discuss this at... using AWB
Latest revision as of 09:28, 20 August 2025 by Bender the Bot (Bots, 1,064,377 edits) m →Model: HTTP to HTTPS for Brown University Tag: AWB
(19 intermediate revisions by 13 users not shown)
Line 1:
{{update|date=September 2019}}
In [[computer vision]], ~~the problem of~~ '''object categorization from image search''' is the problem of training a [[Statistical classification|classifier]] to recognize categories of objects, using only ~~the~~[[image search]], i.e., images retrieved automatically with an Internet [[search engine]]. Ideally, automatic image collection would allow classifiers to be trained with nothing but the category names as input. This problem is closely related to that of [[content-based image retrieval]] (CBIR), where the goal is to return better image search results rather than training a classifier for image recognition. Traditionally, classifiers are trained using sets of images that are labeled by hand. Collecting such a set of images is often a very time-consuming and laborious process. The use of Internet search engines to automate the process of acquiring large sets of labeled images has been described as a potential way of greatly facilitating computer vision research.<ref name = "fergus"> {{cite conference | last = Fergus | first = R. |author2=Fei-Fei, L. |author3=Perona, P. |author4=Zisserman, A. | title = Learning Object Categories from Google's Image Search | ~~booktitle~~book-title = Proc. IEEE International Conference on Computer Vision | url = http://vision.cs.princeton.edu/documents/FergusFei-FeiPeronaZisserman_ICCV05.pdf | year = 2005}}
Line 14 ⟶ 15:
=== Unrelated images ===
One problem with using Internet image search results as a training set for a classifier is the high percentage of unrelated images within the results.
It has been estimated that, when a search engine such as Google Images is queried with the name of an object category (such as ''airplane''), up to 85% of the returned images are unrelated to the category.<ref name = "fergus"/>
=== Intra-class variability ===
Line 29 ⟶ 30:
<math>\displaystyle P(w|d) = \sum_{z=1}^Z P(w|z)P(z|d)</math>
An important assumption made in this model is that <math>\displaystyle w</math> and <math>\displaystyle d</math> are conditionally independent given <math>\displaystyle z</math>. Given a topic, the probability of a certain word appearing as part of that topic is independent of the rest of the image.<ref name = "hofmann">{{cite conference | first = Thomas | last = Hofmann | title = Probabilistic Latent Semantic Analysis | ~~booktitle~~book-title = Uncertainty in Artificial Intelligence | year = 1999 | url = ~~http~~https://www.cs.brown.edu/~th/papers/Hofmann-UAI99.pdf~~}}</ref>~~ |url-status = dead |archive-url = https://web.archive.org/web/20070710083034/http://www.cs.brown.edu/~th/papers/Hofmann-UAI99.pdf |archive-date = 2007-07-10 }}</ref>
Training this model involves finding <math>\displaystyle P(w|z)</math> and <math>\displaystyle P(z|d)</math> that maximize the likelihood of the observed words in each document. To do this, the [[expectation maximization]] algorithm is used, with the following [[objective function]]:
Line 63 ⟶ 68:
==== Selecting words ====
Words in an image were selected using 4 different feature detectors:<ref name = "fergus"/>
* [[~~Kadir-Brady~~Kadir–Brady saliency detector]]
* [[Corner detection|Multi-scale Harris detector]]
* [[Difference of Gaussians]]
Line 83 ⟶ 88:
| first = Li-Jia |author2=Wang, Gang |author3=Fei-Fei, Li | title = OPTIMOL: automatic Online Picture collection via Incremental MOdel Learning | ~~booktitle~~book-title = Proc.
IEEE Conference on Computer Vision and Pattern Recognition | year = 2007 | url = http://vision.cs.princeton.edu/documents/LiWangFei-Fei_CVPR2007.pdf}}
Line 104 ⟶ 109:
{{cite journal | last = Teh | first = Yw |author2=Jordan, MI |author3=Beal, MJ |author4=Blei, David | title = Hierarchical Dirichlet Processes | journal = Journal of the American Statistical Association
Line 113 ⟶ 118:
| issue = 476 | page = 1566 | citeseerx = 10.1.1.5.9094 | s2cid = 7934949 }} }} </ref>
Line 157 ⟶ 162:
{{cite conference | last = Fergus | first = R. |author2=Perona, P. |author3=Zisserman, A. | title = A visual category filter for Google images | ~~booktitle~~book-title = Proc. 8th European Conf. on Computer Vision | year = 2004 | url = http://www.robots.ox.ac.uk/~fergus/papers/Fergus_ECCV4.pdf
Line 169 ⟶ 174:
|author2=Forsyth, D. | title = Animals on the web | ~~booktitle~~book-title = Proc. Computer Vision and Pattern Recognition | year = 2006 | doi = 10.1109/CVPR.2006.57 ~~| url = http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=1640929~~ }}</ref>
* Yanai and Barnard, 2006 <ref>
Line 179 ⟶ 184:
|author2=Barnard, K. | title = Probabilistic web image gathering | ~~booktitle~~book-title = ACM SIGMM workshop on Multimedia information retrieval | year = 2005 | url = http://portal.acm.org/citation.cfm?id=1101838
Line 186 ⟶ 191:
== References ==
<references/>
~~== External links ==~~
~~{{Empty section|date=July 2010}}~~
== See also ==
Line 198 ⟶ 200:
[[Category:Object recognition and categorization]]
[[Category:Image search]]
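
Editor's illustration (not part of either revision): the hunk at Line 29 ⟶ 30 describes a pLSA mixture, P(w|d) = Σ_z P(w|z) P(z|d), fitted with expectation maximization. The sketch below shows one way that fit might look in plain Python. The function names `plsa_em` and `log_likelihood`, the toy count matrix, and the random initialization are assumptions made for illustration. The log-likelihood used here is the standard pLSA form, Σ_d Σ_w n(w,d) log P(w|d), which the diff context does not show and which may differ in presentation from the article's objective function.

```python
import math
import random


def plsa_em(counts, num_topics, iterations=50, seed=0):
    """Fit a pLSA model with EM (illustrative sketch).

    counts[d][w] is the number of times visual word w occurs in image d.
    Returns (p_w_z, p_z_d) with p_w_z[z][w] = P(w|z) and p_z_d[d][z] = P(z|d).
    """
    rng = random.Random(seed)
    num_docs, num_words = len(counts), len(counts[0])

    def random_distribution(size):
        # Strictly positive weights, normalized to sum to 1.
        weights = [rng.random() + 1e-3 for _ in range(size)]
        total = sum(weights)
        return [x / total for x in weights]

    p_w_z = [random_distribution(num_words) for _ in range(num_topics)]
    p_z_d = [random_distribution(num_topics) for _ in range(num_docs)]

    for _ in range(iterations):
        acc_wz = [[0.0] * num_words for _ in range(num_topics)]
        acc_zd = [[0.0] * num_topics for _ in range(num_docs)]
        for d in range(num_docs):
            for w in range(num_words):
                n = counts[d][w]
                if n == 0:
                    continue
                # E-step: posterior P(z|w,d) is proportional to P(w|z) P(z|d).
                joint = [p_w_z[z][w] * p_z_d[d][z] for z in range(num_topics)]
                total = sum(joint)
                if total == 0.0:
                    continue
                for z in range(num_topics):
                    expected = n * joint[z] / total
                    acc_wz[z][w] += expected
                    acc_zd[d][z] += expected
        # M-step: renormalize the expected counts into distributions.
        for z in range(num_topics):
            total = sum(acc_wz[z])
            if total > 0.0:
                p_w_z[z] = [x / total for x in acc_wz[z]]
        for d in range(num_docs):
            total = sum(acc_zd[d])
            if total > 0.0:
                p_z_d[d] = [x / total for x in acc_zd[d]]
    return p_w_z, p_z_d


def log_likelihood(counts, p_w_z, p_z_d):
    """Sum over d, w of n(w,d) * log P(w|d), with P(w|d) = sum_z P(w|z) P(z|d)."""
    total = 0.0
    for d, row in enumerate(counts):
        for w, n in enumerate(row):
            if n:
                p_w_d = sum(p_w_z[z][w] * p_z_d[d][z] for z in range(len(p_w_z)))
                total += n * math.log(p_w_d)
    return total


if __name__ == "__main__":
    # Toy data: 4 "images" over a 4-word visual vocabulary (made-up counts).
    toy_counts = [[4, 3, 0, 0], [5, 2, 0, 1], [0, 0, 3, 4], [0, 1, 4, 4]]
    p_w_z, p_z_d = plsa_em(toy_counts, num_topics=2)
    print(log_likelihood(toy_counts, p_w_z, p_z_d))
```

The E-step relies on the conditional independence of w and d given z stated in the hunk: the posterior over topics for a (word, image) pair factors into P(w|z) and P(z|d). Pure-Python lists keep the sketch dependency-free; a practical implementation would more likely use array operations.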