Revision as of 23:07, 21 July 2024 edit Lynch44 (talk \| contribs) Extended confirmed users, Pending changes reviewers, Rollbackers, Temporary account IP viewers 16,446 edits Reverted 1 edit by 174.214.49.165 (talk): Way too long Tags: Twinkle Undo Mobile edit Mobile web edit ← Previous edit		Revision as of 02:52, 24 July 2024 edit undo Kvng (talk \| contribs) Extended confirmed users, New page reviewers 116,060 edits m WP:ADOPTYPO Next edit →
Line 61: \|} This index can only determine whether a word exists within a particular document, since it stores no information regarding the frequency and position of the word; it is therefore considered to be a [[~~boolean~~Boolean ~~datatype~~data type\|~~boolean~~Boolean]] index. Such an index determines which documents match a query but does not rank matched documents. In some designs the index includes additional information such as the frequency of each word in each document or the positions of a word in each document.<ref>Grossman, Frieder, Goharian. [http://www.cs.clemson.edu/~juan/CPSC862/Concept-50/IR-Basics-of-Inverted-Index.pdf IR Basics of Inverted Index]. 2002. Verified Aug 2011.</ref> Position information enables the search algorithm to identify word proximity to support searching for phrases; frequency can be used to help in ranking the relevance of documents to the query. Such topics are the central research focus of [[information retrieval]]. The inverted index is a [[sparse matrix]], since not all words are present in each document. To reduce [[Computer Storage\|computer storage]] memory requirements, it is stored differently from a two dimensional [[Array data structure\|array]]. The index is similar to the [[document-term matrix\|term document matrices]] employed by [[latent semantic analysis]]. The inverted index can be considered a form of a hash table. In some cases the index is a form of a [[binary tree]], which requires additional storage but may reduce the lookup time. In larger indices the architecture is typically a [[distributed hash table]].<ref>Tang, Hunqiang. [[Sandhya Dwarkadas\|Dwarkadas, Sandhya]]. "Hybrid Global Local Indexing for Efficient

Search engine indexing: Difference between revisions