Boolean model of information retrieval: Difference between revisions

m Typo/general fixes, replaced: et. al. → et al.
Line 4:
==Definitions==
 
An ''index term'' is a word or expression, which may be [[stemming|stemmed]], describing or characterizing a document, such as a keyword given for a journal article. Let <math display="block">T = \{t_1, t_2,\ \ldots,\ t_m\}</math> be the set of all such index terms.
 
A ''document'' is any subset of <math>T</math>. Let <math display="block">D = \{D_1,\ \ldots,\ D_n\}</math> be the set of all documents.
 
A ''query'' is a Boolean expression <math display="inline">Q</math> in [[conjunctive normal form]]: <math display="block">Q = (W_1\ \or\ W_2\ \or\ \cdots) \and\ \cdots\ \and\ (W_i\ \or\ W_{i+1}\ \or\ \cdots)</math> where <math display="inline">W_i</math> is true for <math>D_j</math> when <math>t_i \in D_j</math>. (Equivalently, <math display="inline">Q</math> could be expressed in [[disjunctive normal form]].)
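
The definitions above can be illustrated with a minimal sketch in Python. It assumes that each document is stored as a set of index terms and that a query is given as a list of clauses, where the clauses are joined by AND and the terms inside a clause are joined by OR. The collection <code>DOCS</code>, the names <code>matches</code> and <code>retrieve</code>, and the sample terms are hypothetical and chosen only for illustration; negated terms are not handled.

```python
from typing import Dict, FrozenSet, List

# Hypothetical toy collection: each document D_j is a subset of the term set T.
DOCS: Dict[str, FrozenSet[str]] = {
    "D1": frozenset({"boolean", "retrieval", "model"}),
    "D2": frozenset({"vector", "retrieval", "model"}),
    "D3": frozenset({"boolean", "algebra"}),
}

# A query in conjunctive normal form: a list of clauses (AND),
# where each clause is a set of index terms (OR).
Query = List[FrozenSet[str]]


def matches(doc: FrozenSet[str], query: Query) -> bool:
    """Return True if every clause contains at least one term of the document."""
    return all(any(term in doc for term in clause) for clause in query)


def retrieve(docs: Dict[str, FrozenSet[str]], query: Query) -> List[str]:
    """Return the sorted names of all documents that satisfy the query."""
    return sorted(name for name, terms in docs.items() if matches(terms, query))


# (boolean OR vector) AND (retrieval)
query: Query = [frozenset({"boolean", "vector"}), frozenset({"retrieval"})]
print(retrieve(DOCS, query))  # ['D1', 'D2']
```

In this sketch a document either satisfies the query or it does not, so the result is an unranked set of matching documents. An empty query is treated as true for every document, because a conjunction of zero clauses is vacuously satisfied.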
Line 91:
[https://people.eng.unimelb.edu.au/jzobel/fulltext/acmtods98.pdf "Inverted Files Versus Signature Files for Text Indexing"].
</ref><ref name="goodwin" >
Bob Goodwin; et al.
[https://safari.ethz.ch/architecture/fall2018/lib/exe/fetch.php?media=p605-goodwin.pdf "BitFunnel: Revisiting Signatures for Search"].
2017.