Revision as of 21:19, 7 October 2016 by Shiv4nsh (no edit summary)
Revision as of 14:59, 22 December 2016 by MrOllie (edit summary: WP:NOT)
Line 21:
==Clustering in search engines==
A [[web search engine]] often returns thousands of pages in response to a broad query, making it difficult for users to browse the results or to identify relevant information. Clustering methods can automatically group the retrieved documents into a list of meaningful categories, as open source software such as [[Carrot2]] does.

~~Examples:~~
* Clustering divides the results of a search for "cell" into groups like "biology," "battery," and "prison."
* [http://FirstGov.gov FirstGov.gov], the official Web portal for the U.S. government, uses document clustering to automatically organize its search results into categories. For example, if a user submits “immigration”, next to their list of results they will see categories for “Immigration Reform”, “Citizenship and Immigration Services”, “Employment”, “Department of Homeland Security”, and more.
* The Noggle search and clustering engine has grouped over 2000 TED Talks into automatically generated clusters, for example to show what all TED talks from 2006 to 2016 about "happiness" had in common. The results are available for further review.<ref>{{cite news|last1=von Thienen|first1=Lars|title=What would a robot see in TED talks?|url=https://www.noggle.online/knowledge-base/robot-see-ted-talks/|work=noggle.online|agency=TED.com}}</ref>

==Procedures==
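The grouping of search results described under "Clustering in search engines" can be illustrated with a minimal sketch. It is not the method used by Carrot2, FirstGov.gov, or Noggle. It assumes a simple greedy, threshold-based clustering over TF-IDF vectors, and the snippets, the stop-word list, the threshold, and all function names are hypothetical choices made for illustration.

```python
import math
import re
from collections import Counter

# Hypothetical result snippets for the query "cell" (illustrative data only).
SNIPPETS = [
    "cell membrane and nucleus in biology",
    "lithium battery cell stores charge",
    "prison cell holds one inmate",
    "biology of the cell nucleus",
    "battery cell loses charge in cold weather",
    "inmate escapes prison cell",
]

# Tiny illustrative stop-word list; a real system would use a fuller one.
STOPWORDS = {"a", "and", "in", "of", "the"}


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens that are not stop words."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOPWORDS]


def tfidf_vectors(docs):
    """Build sparse TF-IDF vectors (dicts); terms found in every document are dropped."""
    tokenized = [tokenize(d) for d in docs]
    n = len(docs)
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        vectors.append(
            {t: c * math.log(n / df[t]) for t, c in tf.items() if df[t] < n}
        )
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors; 0.0 if either is empty."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def cluster_snippets(snippets, threshold=0.1):
    """Greedy clustering: each snippet joins the cluster containing its most
    similar earlier snippet, or starts a new cluster if no similarity
    exceeds the threshold. Returns lists of snippet indices."""
    vectors = tfidf_vectors(snippets)
    clusters = []
    for i, vec in enumerate(vectors):
        best, best_sim = None, threshold
        for cluster in clusters:
            sim = max(cosine(vec, vectors[j]) for j in cluster)
            if sim > best_sim:
                best, best_sim = cluster, sim
        if best is None:
            clusters.append([i])
        else:
            best.append(i)
    return clusters


if __name__ == "__main__":
    for group in cluster_snippets(SNIPPETS):
        print([SNIPPETS[i] for i in group])
```

Because the query term "cell" occurs in every snippet, its inverse document frequency is zero and it does not influence the grouping. The remaining vocabulary separates the biology, battery, and prison senses. Production systems typically also generate a readable label for each cluster, which this sketch omits.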

Document clustering: Difference between revisions