Compound-term processing: Difference between revisions

Compound term processing is a new approach to an old problem: how can one improve the relevance of search results while maintaining ease of use? Forming compound terms and placing them in a search engine's index allows searches to be performed with a higher degree of accuracy, because the ambiguity inherent in single words is avoided. Using this technique, a search for ''survival rates following a triple heart bypass in elderly people'' will locate documents about this topic even if no document contains this precise phrase. Such a search can be performed by a [[concept search]], which itself uses compound term processing: it extracts the key concepts automatically (in this case "survival rates", "triple heart bypass" and "elderly people") and uses them to select the most relevant documents.
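The idea can be illustrated with a minimal Python sketch. It is not the method of any particular search engine: the stop-word list, the rule of splitting a query at stop words, the choice of indexing every two- and three-word sequence, and all function names are assumptions made only for this example.

```python
import re

# Hypothetical stop-word list, chosen only so that the example query splits cleanly.
STOP_WORDS = {"a", "an", "the", "in", "of", "on", "for", "and", "to", "following"}


def tokenize(text):
    """Lower-case the text and return its alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def extract_compound_terms(query):
    """Split the query at stop words; keep runs of two or more words."""
    terms, run = [], []
    for word in tokenize(query) + [None]:  # None is a sentinel that flushes the last run
        if word is None or word in STOP_WORDS:
            if len(run) >= 2:
                terms.append(" ".join(run))
            run = []
        else:
            run.append(word)
    return terms


def build_index(documents):
    """Map every contiguous 2- and 3-word sequence to the documents containing it."""
    index = {}
    for doc_id, text in documents.items():
        tokens = tokenize(text)
        for n in (2, 3):
            for i in range(len(tokens) - n + 1):
                index.setdefault(" ".join(tokens[i:i + n]), set()).add(doc_id)
    return index


def search(index, query):
    """Rank documents by how many of the query's compound terms they contain."""
    scores = {}
    for term in extract_compound_terms(query):
        for doc_id in index.get(term, ()):
            scores[doc_id] = scores.get(doc_id, 0) + 1
    return sorted(scores, key=lambda d: (-scores[d], d))


documents = {
    "doc1": "Elderly people have lower survival rates after a triple heart bypass.",
    "doc2": "Heart rates of people who drink a triple espresso.",
}
query = "survival rates following a triple heart bypass in elderly people"
index = build_index(documents)
results = search(index, query)
```

In this sketch the second document shares the single words "heart", "rates", "people" and "triple" with the query, but none of the compound terms, so only the first document is returned, although it does not contain the query as an exact phrase.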
 
In 2004, Anna Lynn Patterson filed a number of patents on "phrase-based searching in an information retrieval system"<ref>[http://appft1.uspto.gov/netacgi/nph-Parser?Sect1=PTO1&Sect2=HITOFF&d=PG01&p=1&u=%2Fnetahtml%2FPTO%2Fsrchnum.html&r=1&f=G&l=50&s1=%2220060031195%22.PGNR.&OS=DN/20060031195&RS=DN/20060031195] US Patent: 20060031195</ref> to which Google subsequently acquired the rights. A full discussion of the patents can be found at <ref>[http://www.seobythesea.com/2012/02/google-acquires-cuil-patent-applications/ Google Acquires Cuil Patent Applications]</ref>.
 
Statistical compound term processing is a more adaptive method than the process described by Patterson in her patent applications. Her process targets searching the World Wide Web, where extensive statistical knowledge of common searches can be used to identify candidate phrases. Statistical compound term processing is better suited to [[enterprise search]] applications, where such [[A priori and a posteriori|a priori]] knowledge is not available.
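One way to picture identifying compound terms without a priori knowledge is to derive candidates from the document collection itself. The sketch below is only an assumption for illustration: the text above does not say which statistic is used, so pointwise mutual information over adjacent word pairs stands in for it, and the function name, the `min_count` threshold, and the sample corpus are hypothetical.

```python
import math
import re
from collections import Counter


def score_bigrams(documents, min_count=2):
    """Score adjacent word pairs by pointwise mutual information (PMI).

    PMI is an illustrative stand-in here, not a documented part of
    statistical compound term processing.
    """
    unigrams, bigrams = Counter(), Counter()
    for text in documents:
        tokens = re.findall(r"[a-z]+", text.lower())
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))

    total_unigrams = sum(unigrams.values())
    total_bigrams = sum(bigrams.values())
    scores = {}
    for (first, second), count in bigrams.items():
        if count < min_count:
            continue  # pairs seen too rarely give unreliable estimates
        p_pair = count / total_bigrams
        p_first = unigrams[first] / total_unigrams
        p_second = unigrams[second] / total_unigrams
        scores[f"{first} {second}"] = math.log2(p_pair / (p_first * p_second))
    return scores


corpus = [
    "heart bypass surgery is common",
    "the heart bypass was a success",
    "recovery after heart bypass is slow",
]
candidates = score_bigrams(corpus)
```

With this small corpus, "heart bypass" is the only pair that occurs at least twice, so it is the only candidate compound term returned, and its score is positive because the two words co-occur more often than their individual frequencies would suggest.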