Search engine indexing: Difference between revisions

m Replaced 1 bare URLs by {{Cite web}}; Replaced "Archived copy" by actual titles
Mstary (talk | contribs)
'''Search engine indexing''' is the collecting, [[parsing]], and storing of data to facilitate fast and accurate [[information retrieval]]. Index design incorporates interdisciplinary concepts from [[linguistics]], [[cognitive psychology]], mathematics, [[informatics]], and [[computer science]]. An alternate name for the process, in the context of [[search engine]]s designed to find [[web page]]s on the Internet, is ''[[web indexing]]''.
 
Popular search engines focus on the [[Full-text search|full-text]] indexing of online, [[Natural language processing|natural language]] documents.<ref>Clarke, C., Cormack, G.: Dynamic Inverted Indexes for a Distributed Full-Text Retrieval System. TechRep MT-95-01, University of Waterloo, February 1995.</ref> [[Media type]]s such as pictures, video,<ref>{{cite journal |last=Sikos |first=L. F. |date=August 2016 |title=RDF-powered semantic video annotation tools with concept mapping to Linked Data for next-generation video indexing |journal=Multimedia Tools and Applications |doi=10.1007/s11042-016-3705-7 |s2cid=254832794 |url=https://ap01.alma.exlibrisgroup.com/view/delivery/61USOUTHAUS_INST/12165436490001831 }}{{Dead link|date=August 2023 |bot=InternetArchiveBot |fix-attempted=yes }}</ref> audio,<ref>{{Cite web| title=An Industrial-Strength Audio Search Algorithm | url=http://www.ee.columbia.edu/~dpwe/papers/Wang03-shazam.pdf | archive-url=https://web.archive.org/web/20060512074748/http://www.ee.columbia.edu:80/~dpwe/papers/Wang03-shazam.pdf | archive-date=2006-05-12}}</ref> and graphics<ref>Charles E. Jacobs, Adam Finkelstein, David H. Salesin. [http://grail.cs.washington.edu/projects/query/mrquery.pdf Fast Multiresolution Image Querying]. Department of Computer Science and Engineering, University of Washington. 1995. Verified Dec 2006</ref> are also searchable.
 
[[Metasearch engine|Meta search engines]] reuse the indices of other services and do not store a local index, whereas cache-based search engines permanently store the index along with the [[text corpus|corpus]]. Unlike full-text indices, partial-text services restrict the depth indexed to reduce index size. Larger services typically perform indexing at predetermined time intervals because of the time and processing costs involved, while [[Intelligent agent|agent]]-based search engines index in [[Real time business intelligence|real time]].
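
The trade-off between full-text and partial-text indexing can be illustrated with a minimal Python sketch. It assumes a simple mapping from each term to the documents that contain it (an inverted index, one common index structure) and is not a description of how any particular search engine works. The function name <code>build_index</code>, the <code>max_tokens</code> parameter, and the sample documents are hypothetical and chosen only for illustration.

```python
from collections import defaultdict
import re


def build_index(docs, max_tokens=None):
    """Map each term to the sorted list of document ids that contain it.

    max_tokens=None indexes the full text of every document. An integer
    value indexes only the first max_tokens tokens of each document,
    which is one simple way to model a partial-text index.
    """
    index = defaultdict(set)
    for doc_id, text in docs.items():
        # Naive tokenizer (an assumption): lowercase runs of word characters.
        tokens = re.findall(r"\w+", text.lower())
        if max_tokens is not None:
            tokens = tokens[:max_tokens]
        for token in tokens:
            index[token].add(doc_id)
    return {term: sorted(ids) for term, ids in index.items()}


# Hypothetical sample corpus.
docs = {
    1: "Search engines index web pages",
    2: "An index maps terms to the pages that contain them",
}

full_index = build_index(docs)                   # every token is indexed
partial_index = build_index(docs, max_tokens=3)  # first three tokens only
```

In this sketch the partial index holds fewer terms than the full index, so it is smaller, but a query for a term that appears only beyond the depth limit (here, "pages") finds nothing.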