Search engine indexing: Difference between revisions

'''Search engine indexing''' is the collecting, [[parsing]], and storing of data to facilitate fast and accurate [[information retrieval]]. Index design incorporates interdisciplinary concepts from [[linguistics]], [[cognitive psychology]], mathematics, [[informatics]], and [[computer science]]. An alternative name for the process, in the context of [[search engine]]s designed to find [[web page]]s on the Internet, is ''[[web indexing]]''.
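
The collect, parse, and store steps can be illustrated with a minimal sketch. It assumes an inverted index (a mapping from each token to the documents containing it), which is one common structure rather than the only possible design. The function names <code>tokenize</code>, <code>build_index</code>, and <code>search</code>, the naive regular-expression parser, and the sample documents are all hypothetical and chosen for illustration.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Parse raw text into lowercase word tokens (a deliberately naive parser)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(documents):
    """Store, for each token, the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for token in tokenize(text):
            index[token].add(doc_id)
    return index


def search(index, query):
    """Return the ids of documents that contain every token in the query."""
    tokens = tokenize(query)
    if not tokens:
        return set()
    # index.get avoids creating empty entries for unknown tokens.
    postings = [index.get(token, set()) for token in tokens]
    return set.intersection(*postings)


# Hypothetical collected documents, keyed by document id.
documents = {
    1: "Search engines index web pages",
    2: "An index makes retrieval fast",
    3: "Web pages contain text",
}
index = build_index(documents)
```

With this sketch, a query is answered by intersecting the stored sets instead of rescanning every document, which is where the speed advantage of an index comes from.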
 
Popular search engines focus on the [[Full-text search|full-text]] indexing of online, [[Natural language processing|natural language]] documents.<ref>Clarke, C., Cormack, G.: Dynamic Inverted Indexes for a Distributed Full-Text Retrieval System. TechRep MT-95-01, University of Waterloo, February 1995.</ref> [[Media type]]s such as pictures, video,<ref>{{cite web|url=https://superstarseo.com/how-search-engines-actually-find-and-index-video-content/|title=How Search Engines Actually Find and Index Video Content|website=superstarseo.com}}</ref> audio,<ref>{{cite web|title=An Industrial-Strength Audio Search Algorithm|url=http://www.ee.columbia.edu/~dpwe/papers/Wang03-shazam.pdf|archive-url=https://web.archive.org/web/20060512074748/http://www.ee.columbia.edu:80/~dpwe/papers/Wang03-shazam.pdf|archive-date=2006-05-12}}</ref> and graphics<ref>Charles E. Jacobs, Adam Finkelstein, David H. Salesin. [http://grail.cs.washington.edu/projects/query/mrquery.pdf Fast Multiresolution Image Querying]. Department of Computer Science and Engineering, University of Washington. 1995. Verified Dec 2006.</ref> are also searchable.
 
[[Metasearch engine|Meta search engines]] reuse the indices of other services and do not store a local index, whereas cache-based search engines permanently store the index along with the [[text corpus|corpus]]. Unlike full-text indices, partial-text services restrict the depth indexed to reduce index size. Larger services typically perform indexing at predetermined time intervals because of the time and processing costs involved, while [[Intelligent agent|agent]]-based search engines index in [[Real time business intelligence|real time]].
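
The size trade-off of partial-text indexing can be sketched as follows. The sketch assumes that "depth" means the number of leading tokens indexed per document, which is one possible interpretation; the <code>max_tokens</code> parameter, the helper <code>posting_count</code>, and the sample documents are hypothetical.

```python
import re
from collections import defaultdict


def build_index(documents, max_tokens=None):
    """Map each token to the ids of documents containing it.

    If max_tokens is given, only the first max_tokens tokens of each
    document are indexed (an assumed reading of "restricting depth").
    """
    index = defaultdict(set)
    for doc_id, text in documents.items():
        tokens = re.findall(r"[a-z0-9]+", text.lower())
        if max_tokens is not None:
            tokens = tokens[:max_tokens]
        for token in tokens:
            index[token].add(doc_id)
    return index


def posting_count(index):
    """Total number of (token, document) pairs, a rough measure of index size."""
    return sum(len(doc_ids) for doc_ids in index.values())


# Hypothetical documents used only to compare the two index sizes.
documents = {
    "a": "indexing stores parsed tokens so that later queries run quickly",
    "b": "partial indexing keeps only the start of each document",
}
full = build_index(documents)
partial = build_index(documents, max_tokens=3)
```

In this sketch the partial index is smaller, but a term that appears only late in a document (such as "quickly" in document "a") can no longer be found through it.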