{{Cleanup|date=March 2011}}
A '''
== Difficulties ==
=== Derive an appropriate feature representation for Web queries ===
Many queries are short, and query terms are often noisy.{{Clarify|reason=what
Query-enrichment-based methods<ref>Shen et al. [http://www.sigkdd.org/sites/default/files/issues/7-2-2005-12/KDDCUP2005Report_Shen.pdf "Q2C@UST: Our Winning Solution to Query Classification"]. ''ACM SIGKDD Exploration, December 2005, Volume 7, Issue 2''.</ref><ref>Shen et al. [http://portal.acm.org/ft_gateway.cfm?id=1165776 "Query Enrichment for Web-query Classification"]. ''ACM TOIS, Vol. 24, No. 3, July 2006''.</ref> start by expanding each user query into a collection of text documents obtained through [[search engines]]. Each query is thus represented by a pseudo-document consisting of the snippets of the top-ranked result pages retrieved by a search engine. The pseudo-documents are then classified into the target categories using either a synonym-based classifier or a statistical classifier, such as [[Naive Bayes]] (NB) or [[Support Vector Machines]] (SVMs).
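The enrichment step described above can be illustrated with a minimal sketch. It is not the implementation from the cited papers: the snippet table <code>FAKE_SNIPPETS</code> is a hypothetical stand-in for a real search engine, the category names and training texts are invented for illustration, and the classifier is a plain multinomial Naive Bayes with add-one smoothing.

```python
import math
from collections import Counter, defaultdict

# Hypothetical stand-in for a search engine: query -> snippets of top results.
FAKE_SNIPPETS = {
    "jaguar price": [
        "new jaguar car price list and dealer offers",
        "compare car prices for jaguar sedan models",
    ],
    "python tutorial": [
        "learn python programming with this free tutorial",
        "python code examples for software beginners",
    ],
}

def enrich(query, search=FAKE_SNIPPETS.get):
    """Build a pseudo-document by joining the snippets returned for a query."""
    snippets = search(query) or []
    # Fall back to the raw query when no snippets are available.
    return " ".join(snippets) if snippets else query

def tokenize(text):
    return text.lower().split()

class NaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # label -> token counts
        self.doc_counts = Counter()              # label -> number of documents
        self.vocab = set()

    def fit(self, docs, labels):
        for doc, label in zip(docs, labels):
            tokens = tokenize(doc)
            self.word_counts[label].update(tokens)
            self.doc_counts[label] += 1
            self.vocab.update(tokens)
        return self

    def predict(self, doc):
        total_docs = sum(self.doc_counts.values())
        best_label, best_score = None, -math.inf
        for label in self.doc_counts:
            score = math.log(self.doc_counts[label] / total_docs)
            denom = sum(self.word_counts[label].values()) + len(self.vocab)
            for tok in tokenize(doc):
                if tok in self.vocab:  # ignore tokens never seen in training
                    score += math.log((self.word_counts[label][tok] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label

# Invented training texts for two illustrative target categories.
TRAIN = [
    ("car dealer price engine sedan models", "Autos"),
    ("used car prices and dealer reviews", "Autos"),
    ("software programming code tutorial", "Computers"),
    ("learn programming language code examples", "Computers"),
]
model = NaiveBayes().fit([d for d, _ in TRAIN], [c for _, c in TRAIN])

def classify_query(query, model=model):
    """Classify a short query through its enriched pseudo-document."""
    return model.predict(enrich(query))

print(classify_query("jaguar price"))
print(classify_query("python tutorial"))
```

The raw query "jaguar price" shares only one token with the training texts, whereas its pseudo-document also contains terms such as "car", "dealer" and "sedan"; this is the effect that enrichment is meant to achieve for short, ambiguous queries.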