add reference to H H Williams paper |
No edit summary |
Line 17:
In 2004 Anna Lynn Patterson filed a number of patents on the subject of "Phrase based indexing and retrieval", to which Google subsequently acquired the rights. A full discussion of the patents is available at [http://www.webmasterwoman.com/search-engines/phrase-based-indexing.html Webmaster Woman]. The patents themselves can be found online, for example US Patent 20060031195.<ref>[http://appft1.uspto.gov/netacgi/nph-Parser?Sect1=PTO1&Sect2=HITOFF&d=PG01&p=1&u=%2Fnetahtml%2FPTO%2Fsrchnum.html&r=1&f=G&l=50&s1=%2220060031195%22.PGNR.&OS=DN/20060031195&RS=DN/20060031195] US Patent: 20060031195</ref>
Statistical Compound Term Processing is more adaptive than the "phrase based indexing and retrieval" described by Anna Lynn Patterson in her patent applications. Phrase based indexing is targeted at searching the World Wide Web, where extensive statistical knowledge of common searches can be used to identify candidate phrases. Statistical Compound Term Processing is better suited to [[Enterprise Search]] applications, where such [[a priori]] knowledge is not available.
Statistical Compound Term Processing is also more adaptive than the linguistic approach taken by the CLAMOUR project, which considers the syntactic properties of the terms (part of speech, gender, number) and their combination. CLAMOUR is highly language-dependent, whereas the statistical approach is language-independent.