The term "content-based image retrieval" seems to have originated in 1992 when it was used by Japanese [[Electrotechnical Laboratory]] engineer Toshikazu Kato to describe experiments into automatic retrieval of images from a database, based on the colors and shapes present.<ref name="Eakins"/><ref>{{cite journal |last1=Kato |first1=Toshikazu |title=Database architecture for content-based image retrieval |journal=Image Storage and Retrieval Systems |date=April 1992 |volume=1662 |pages=112–123 |doi=10.1117/12.58497 |bibcode=1992SPIE.1662..112K |publisher=International Society for Optics and Photonics|s2cid=14342247 }}</ref> Since then, the term has been used to describe the process of retrieving desired images from a large collection on the basis of syntactical image features. The techniques, tools, and algorithms that are used originate from fields such as statistics, pattern recognition, signal processing, and computer vision.<ref name="Survey" />
{{anchor|Content-based video browsing}}Content-based [[video browsing]] was introduced by Iranian engineer Farshid Arman, Taiwanese computer scientist Arding Hsu, and computer scientist Ming-Yee Chiu, while working at [[Siemens]], and it was presented at the [[Association for Computing Machinery|ACM International Conference]] in August 1993.<ref>{{cite journal |last1=Arman |first1=Farshid |last2=Hsu |first2=Arding |last3=Chiu |first3=Ming-Yee |title=Image Processing on Compressed Data for Large Video Databases |journal=Proceedings of the First ACM International Conference on Multimedia |date=August 1993 |pages=267–272 |doi=10.1145/166266.166297 |isbn=0897915968 |url=https://dl.acm.org/citation.cfm?id=166297 |publisher=[[Association for Computing Machinery]]|s2cid=10392157 }}</ref><ref name="Arman1994">{{cite journal |last1=Arman |first1=Farshid |last2=Depommier |first2=Remi |last3=Hsu |first3=Arding |last4=Chiu |first4=Ming-Yee |title=Content-based Browsing of Video Sequences |journal=Proceedings of the Second ACM International Conference on Multimedia |date=October 1994 |pages=97–103 |doi=10.1145/192593.192630 |citeseerx=10.1.1.476.7139 |isbn=0897916867 |url=https://dl.acm.org/citation.cfm?id=192630 |publisher=[[Association for Computing Machinery]]|s2cid=1360834 }}</ref> They described a [[shot detection]] algorithm for [[compressed video]] that was originally encoded with [[discrete cosine transform]] (DCT) [[video coding standards]] such as [[JPEG]], [[MPEG]] and [[H.26x]]. The basic idea was that, since the DCT coefficients are mathematically related to the spatial ___domain and represent the content of each frame, they can be used to detect the differences between video frames. In the algorithm, a subset of blocks in a frame and a subset of DCT coefficients for each block are used as [[motion vector]] representation for the frame. By operating on compressed DCT representations, the algorithm significantly reduces the computational requirements for decompression and enables effective video browsing.<ref>{{cite book |last1=Zhang |first1=HongJiang |chapter=Content-Based Video Browsing And Retrieval |editor-last1=Furht |editor-first1=Borko |title=Handbook of Internet and Multimedia Systems and Applications |date=1998 |publisher=[[CRC Press]] |isbn=9780849318580 |pages=[https://archive.org/details/handbookofintern0000unse_a3l0/page/83 83–108 (89)] |chapter-url=https://books.google.com/books?id=5zfC1wI0wzUC&pg=PA89 |url=https://archive.org/details/handbookofintern0000unse_a3l0/page/83 }}</ref> The algorithm represents separate shots of a video sequence by an r-frame, a thumbnail of the shot framed by a motion tracking region. A variation of this concept was later adopted for QBIC video content mosaics, where each r-frame is a salient still from the shot it represents.<ref>{{cite journal |last1=Steele |first1=Michael |last2=Hearst |first2=Marti A. |last3=Lawrence |first3=A. Rowe |s2cid=18212394 |title=The Video Workbench: a direct manipulation interface for digital media editing by amateur videographers |journal=[[Semantic Scholar]] |date=1998 |pages= }}</ref>
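The core idea can be illustrated with a minimal sketch (an illustration only, not the published algorithm), assuming grayscale frames as NumPy arrays and recomputing the block DCTs with SciPy; in the original setting the coefficients would instead be read directly from the compressed bitstream:

<syntaxhighlight lang="python">
import numpy as np
from scipy.fft import dctn

def block_dct_features(frame, block=8, keep=4):
    """Collect the first few DCT coefficients from a subset of 8x8 blocks of a grayscale frame."""
    h, w = frame.shape
    feats = []
    for y in range(0, h - block + 1, 2 * block):       # sample every other block row
        for x in range(0, w - block + 1, 2 * block):   # sample every other block column
            coeffs = dctn(frame[y:y + block, x:x + block].astype(float), norm='ortho')
            feats.append(coeffs[:keep, :keep].ravel()) # keep only low-frequency coefficients
    return np.concatenate(feats)

def is_shot_boundary(prev_frame, curr_frame, threshold=0.3):
    """Declare a shot cut when the DCT feature vectors of consecutive frames diverge."""
    a, b = block_dct_features(prev_frame), block_dct_features(curr_frame)
    similarity = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9)
    return 1.0 - similarity > threshold
</syntaxhighlight>

Here the subset of blocks and coefficients, the similarity measure, and the threshold are illustrative assumptions; the point is only that frame-to-frame differences can be measured on the compressed-___domain coefficients without full decompression.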
==={{Visible anchor|QBIC}} - Query By Image Content===
===Shape===
Shape does not refer to the shape of an image but to the shape of a particular region that is being sought out. Shapes will often be determined by first applying [[Segmentation (image processing)|segmentation]] or [[edge detection]] to an image. Other methods use shape filters to identify given shapes of an image.<ref>{{cite book | last=Tushabe | first=F. |author2=M.H.F. Wilkinson | title=Content-based Image Retrieval Using Combined 2D Attribute Pattern Spectra }}</ref>
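A minimal sketch of this approach, assuming OpenCV, is shown below; the use of Canny edges and Hu moment invariants here is an illustrative choice rather than a method prescribed by the cited sources:

<syntaxhighlight lang="python">
import cv2
import numpy as np

def shape_descriptor(gray_image):
    """Edge-detect an 8-bit grayscale image, take its largest contour,
    and describe that region's shape with Hu moment invariants."""
    edges = cv2.Canny(gray_image, 100, 200)                      # edge detection
    contours, _ = cv2.findContours(edges, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    if not contours:
        return None
    largest = max(contours, key=cv2.contourArea)                 # the region being sought out
    hu = cv2.HuMoments(cv2.moments(largest)).flatten()           # 7 rotation/scale-invariant values
    # log-scale the moments so descriptors can be compared with a simple distance
    return -np.sign(hu) * np.log10(np.abs(hu) + 1e-30)

def shape_distance(desc_a, desc_b):
    """Smaller distance means more similar region shapes."""
    return float(np.linalg.norm(desc_a - desc_b))
</syntaxhighlight>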
Some shape descriptors include:<ref name="Rui"/>
== Vulnerabilities, attacks and defenses ==
Like other tasks in [[computer vision]] such as recognition and detection, recent neural network based retrieval algorithms are susceptible to [[generative adversarial network|adversarial attacks]], on both the candidate and the query images: small perturbations that are imperceptible to humans can dramatically alter the retrieved ranking.
Conversely, the resistance to such attacks can be improved via adversarial defenses such as the Madry defense.<ref name="Madry Makelov Schmidt Tsipras 2017">{{cite arXiv | last1=Madry | first1=Aleksander | last2=Makelov | first2=Aleksandar | last3=Schmidt | first3=Ludwig | last4=Tsipras | first4=Dimitris | last5=Vladu | first5=Adrian | title=Towards Deep Learning Models Resistant to Adversarial Attacks | date=2017-06-19 | class=stat.ML | eprint=1706.06083v4 }}</ref>
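The Madry defense is, at its core, adversarial training on worst-case perturbations found by projected gradient descent (PGD). The sketch below, assuming PyTorch and a <code>model</code> that maps an image tensor to a retrieval embedding, shows one way such perturbations can be generated; the loss function and hyper-parameters are illustrative assumptions, not values from the cited paper:

<syntaxhighlight lang="python">
import torch
import torch.nn.functional as F

def pgd_perturb(model, image, reference_embedding, eps=8/255, alpha=2/255, steps=10):
    """Projected gradient descent: search, within an eps-ball around the image,
    for a perturbation that maximally shifts its embedding away from a reference."""
    x_adv = image.clone().detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.mse_loss(model(x_adv), reference_embedding)   # distance in embedding space
        grad, = torch.autograd.grad(loss, x_adv)
        with torch.no_grad():
            x_adv = x_adv + alpha * grad.sign()                 # gradient ascent step
            x_adv = image + (x_adv - image).clamp(-eps, eps)    # project back into the eps-ball
            x_adv = x_adv.clamp(0.0, 1.0)                       # keep pixel values valid
    return x_adv.detach()

# In Madry-style adversarial training, such perturbed images are generated for each
# mini-batch and the retrieval model is trained on them rather than on the clean images.
</syntaxhighlight>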
* Retail catalogs
* Nudity-detection filters<ref>{{cite journal | last=Wang |first = James Ze |author2=Jia Li |author3=Gio Wiederhold |author4=Oscar Firschein|title=System for Screening Objectionable Images|journal=Computer Communications|year = 1998|volume=21|issue=15|pages=1355–1360|doi=10.1016/s0140-3664(98)00203-5|citeseerx = 10.1.1.78.7689 }}</ref>
* [[
* Textiles Industry<ref name="Bird">{{cite journal | last=Bird | first=C.L. | author2=P.J. Elliott, Griffiths | title=User interfaces for content-based image retrieval | year=1996}}</ref>
* "[https://web.archive.org/web/20141129085237/http://identify.plantnet-project.org/en/ Pl@ntNet: Interactive plant identification based on social image data]" (Joly, Alexis et al.)
* "[https://link.springer.com/book/10.1007%2F978-981-10-6759-4 Content based Image Retrieval]'' (Tyagi, V, 2017)
* ''[https://dx.doi.org/10.1145/2578726.2578741 Superimage: Packing Semantic-Relevant Images for Indexing and Retrieval]'' (Luo, Zhang, Huang, Gao, Tian, 2014)
* ''[https://dx.doi.org/10.1145/2461466.2461470 Indexing and searching 100M images with Map-Reduce]'' (Moise, Shestakov, Gudmundsson, and Amsaleg, 2013)