m Added new information about search engine indexing including sources |
rmv odd, poorly written addition to lead |
Popular search engines focus on the [[Full-text search|full-text]] indexing of online, [[Natural language processing|natural language]] documents.<ref>Clarke, C., Cormack, G.: Dynamic Inverted Indexes for a Distributed Full-Text Retrieval System. TechRep MT-95-01, University of Waterloo, February 1995.</ref> [[Media type]]s such as pictures, video,<ref>{{cite journal |last=Sikos |first=L. F. |date=August 2016 |title=RDF-powered semantic video annotation tools with concept mapping to Linked Data for next-generation video indexing |journal=Multimedia Tools and Applications |doi=10.1007/s11042-016-3705-7 |s2cid=254832794 |url=https://ap01.alma.exlibrisgroup.com/view/delivery/61USOUTHAUS_INST/12165436490001831 }}{{Dead link|date=August 2023 |bot=InternetArchiveBot |fix-attempted=yes }}</ref> audio,<ref>Avery Li-Chun Wang. [http://www.ee.columbia.edu/~dpwe/papers/Wang03-shazam.pdf An Industrial-Strength Audio Search Algorithm]. 2003.</ref> and graphics<ref>Charles E. Jacobs, Adam Finkelstein, David H. Salesin. [http://grail.cs.washington.edu/projects/query/mrquery.pdf Fast Multiresolution Image Querying]. Department of Computer Science and Engineering, University of Washington. 1995. Verified Dec 2006</ref> are also searchable.
[[Metasearch engine|Meta search engines]] reuse the indices of other services and do not store a local index, whereas cache-based search engines permanently store the index along with the [[text corpus|corpus]]. Unlike full-text services, partial-text services restrict the depth indexed to reduce index size. Larger services typically perform indexing at predetermined intervals because of the time and processing costs involved, while [[Intelligent agent|agent]]-based search engines index in [[Real time business intelligence|real time]].
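
The trade-off between full-text and partial-text indexing can be illustrated with a minimal sketch. It assumes that "depth" means the number of leading tokens indexed per document, which is only one possible interpretation. The function name <code>build_depth_limited_index</code> and the parameter <code>max_depth</code> are hypothetical and chosen for illustration.

```python
def build_depth_limited_index(corpus, max_depth=None):
    """Map each term to the set of document IDs containing it.

    Assumption: "depth" is the number of leading tokens indexed per
    document. max_depth=None indexes the full text.
    """
    index = {}
    for doc_id, text in corpus.items():
        tokens = text.lower().split()
        if max_depth is not None:
            # Partial-text indexing: ignore everything past the cutoff.
            tokens = tokens[:max_depth]
        for token in tokens:
            index.setdefault(token, set()).add(doc_id)
    return index


corpus = {1: "alpha beta gamma delta", 2: "beta gamma"}
full_index = build_depth_limited_index(corpus)
partial_index = build_depth_limited_index(corpus, max_depth=2)
```

In this sketch the partial index holds fewer terms than the full index, at the cost of missing any term that appears only beyond the cutoff.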
==Indexing==
The purpose of storing an index is to optimize speed and performance in finding [[relevance (information retrieval)|relevant]] documents for a search query. Without an index, the search engine would [[Lexical analysis|scan]] every document in the [[Text corpus|corpus]], which would require considerable time and computing power.
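
The difference between scanning the corpus and consulting an index can be sketched as follows. This is a simplified illustration, not a description of any particular search engine: it assumes whitespace tokenization and an in-memory dictionary, and the names <code>linear_scan</code>, <code>build_index</code>, and <code>lookup</code> are hypothetical.

```python
from collections import defaultdict


def tokenize(text):
    # Simplifying assumption: lowercase and split on whitespace.
    return text.lower().split()


def linear_scan(corpus, term):
    """Without an index: tokenize and inspect every document per query."""
    term = term.lower()
    return sorted(doc_id for doc_id, text in corpus.items()
                  if term in tokenize(text))


def build_index(corpus):
    """Build the term -> document IDs mapping once, ahead of query time."""
    index = defaultdict(set)
    for doc_id, text in corpus.items():
        for token in tokenize(text):
            index[token].add(doc_id)
    return index


def lookup(index, term):
    """With an index: a single dictionary lookup per query term."""
    return sorted(index.get(term.lower(), set()))


corpus = {
    1: "the cow says moo",
    2: "the cat and the hat",
    3: "the dish ran away with the spoon",
}
index = build_index(corpus)
```

Both <code>linear_scan(corpus, "cat")</code> and <code>lookup(index, "cat")</code> return the same documents, but the scan touches every document on every query, while the lookup does its corpus-wide work once, when the index is built.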
===Index design factors===