Content-based image retrieval: Difference between revisions

The term "content-based image retrieval" seems to have originated in 1992 when it was used by Japanese [[Electrotechnical Laboratory]] engineer Toshikazu Kato to describe experiments into automatic retrieval of images from a database, based on the colors and shapes present.<ref name="Eakins"/><ref>{{cite journal |last1=Kato |first1=Toshikazu |title=Database architecture for content-based image retrieval |journal=Image Storage and Retrieval Systems |date=April 1992 |volume=1662 |pages=112–123 |doi=10.1117/12.58497 |bibcode=1992SPIE.1662..112K |publisher=International Society for Optics and Photonics|s2cid=14342247 }}</ref> Since then, the term has been used to describe the process of retrieving desired images from a large collection on the basis of syntactical image features. The techniques, tools, and algorithms that are used originate from fields such as statistics, pattern recognition, signal processing, and computer vision.<ref name="Survey" />
 
{{anchor|Content-based video browsing}}Content-based [[video browsing]] was introduced by Iranian engineer Farshid Arman, Taiwanese computer scientist Arding Hsu, and computer scientist Ming-Yee Chiu while they were working at [[Siemens]]; it was presented at the [[Association for Computing Machinery|ACM International Conference]] in August 1993.<ref>{{cite journal |last1=Arman |first1=Farshid |last2=Hsu |first2=Arding |last3=Chiu |first3=Ming-Yee |title=Image Processing on Compressed Data for Large Video Databases |journal=Proceedings of the First ACM International Conference on Multimedia |series=Multimedia '93 |date=August 1993 |pages=267–272 |doi=10.1145/166266.166297 |isbn=0897915968 |url=https://dl.acm.org/citation.cfm?id=166297 |publisher=[[Association for Computing Machinery]]|s2cid=10392157 }}</ref><ref name="Arman1994">{{cite conference |last1=Arman |first1=Farshid |last2=Depommier |first2=Remi |last3=Hsu |first3=Arding |last4=Chiu |first4=Ming-Yee |title=Content-based Browsing of Video Sequences |book-title=Proceedings of the Second ACM International Conference on Multimedia |date=October 1994 |pages=97–103 |doi=10.1145/192593.192630 |citeseerx=10.1.1.476.7139 |isbn=0897916867 |url=https://dl.acm.org/citation.cfm?id=192630 |publisher=[[Association for Computing Machinery]]|s2cid=1360834 }}</ref> They described a [[shot detection]] algorithm for [[compressed video]] that had originally been encoded with [[discrete cosine transform]] (DCT) [[video coding standards]] such as [[JPEG]], [[MPEG]] and [[H.26x]]. The basic idea was that, since the DCT coefficients are mathematically related to the spatial ___domain and represent the content of each frame, they can be used to detect differences between video frames. In the algorithm, a subset of the blocks in a frame and a subset of the DCT coefficients of each block serve as the [[motion vector]] representation of the frame. By operating directly on the compressed DCT representation, the algorithm largely avoids the cost of decompression and enables efficient video browsing.<ref>{{cite book |last1=Zhang |first1=HongJiang |chapter=Content-Based Video Browsing And Retrieval |editor-last1=Furht |editor-first1=Borko |title=Handbook of Internet and Multimedia Systems and Applications |date=1998 |publisher=[[CRC Press]] |isbn=9780849318580 |pages=[https://archive.org/details/handbookofintern0000unse_a3l0/page/83 83–108 (89)] |chapter-url=https://books.google.com/books?id=5zfC1wI0wzUC&pg=PA89 |url=https://archive.org/details/handbookofintern0000unse_a3l0/page/83 }}</ref> The algorithm represents each shot of a video sequence by an r-frame, a thumbnail of the shot framed by a motion tracking region. A variation of this concept was later adopted for QBIC video content mosaics, where each r-frame is a salient still from the shot it represents.<ref>{{cite journal |last1=Steele |first1=Michael |last2=Hearst |first2=Marti A. |last3=Lawrence |first3=A. Rowe |s2cid=18212394 |title=The Video Workbench: a direct manipulation interface for digital media editing by amateur videographers |journal=[[Semantic Scholar]] |date=1998 |pages=1–19 (14) }}</ref>
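The following sketch illustrates the general idea in Python: a signature built from low-frequency DCT coefficients of a sparse subset of blocks is computed for each frame, and a large change between consecutive signatures is flagged as a candidate shot boundary. The block subset, coefficient subset, and threshold used here are illustrative assumptions rather than the parameters of the published algorithm.

<syntaxhighlight lang="python">
import numpy as np
from scipy.fftpack import dct


def block_dct_signature(frame, block_size=8, n_coeffs=3, block_stride=4):
    """Low-frequency DCT coefficients from a sparse subset of blocks.

    Sampling every ``block_stride``-th block and keeping the top-left
    ``n_coeffs`` x ``n_coeffs`` coefficients is an illustrative choice,
    not the selection published by Arman et al.
    """
    h, w = frame.shape
    parts = []
    step = block_size * block_stride
    for by in range(0, h - block_size + 1, step):
        for bx in range(0, w - block_size + 1, step):
            block = frame[by:by + block_size, bx:bx + block_size]
            # Separable 2-D DCT of the block (rows, then columns).
            coeffs = dct(dct(block, axis=0, norm='ortho'), axis=1, norm='ortho')
            parts.append(coeffs[:n_coeffs, :n_coeffs].ravel())
    return np.concatenate(parts)


def detect_shot_boundaries(frames, threshold=0.3):
    """Flag frame indices where consecutive signatures differ sharply.

    ``threshold`` is an assumed value chosen for this toy example.
    """
    boundaries = []
    prev = block_dct_signature(frames[0])
    for i, frame in enumerate(frames[1:], start=1):
        cur = block_dct_signature(frame)
        change = np.abs(cur - prev).sum() / (np.abs(prev).sum() + 1e-9)
        if change > threshold:
            boundaries.append(i)
        prev = cur
    return boundaries


# Toy sequence: five dark frames followed by five bright frames.
frames = [np.full((64, 64), 50.0)] * 5 + [np.full((64, 64), 200.0)] * 5
print(detect_shot_boundaries(frames))  # [5]
</syntaxhighlight>

In a true compressed-___domain implementation, the DCT coefficients would be read directly from the JPEG or MPEG bitstream rather than recomputed from decoded pixels, which is what makes full decompression unnecessary.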
 
==={{Visible anchor|QBIC}} - Query By Image Content===