Revision as of 22:05, 7 July 2010 by Ohnoitsjamie (m; Reverted edits by 72.21.131.205 (talk) to last version by Bonadea)
Revision as of 09:58, 8 July 2010 by 59.162.126.133 (→Methods; Tag: references removed)
Line 47:

{{Main\|search engine optimization methods}}

=== Getting indexed ===

The leading search engines, such as Google, Bing and Yahoo!, use [[Web crawler\|crawlers]] to find pages for their algorithmic search results. Pages that are linked from other search engine indexed pages do not need to be submitted because they are found automatically. Some search engines, notably Yahoo!, operate a paid submission service that guarantees crawling for either a set fee or [[Pay per click\|cost per click]].<ref>{{cite web\|url=http://searchenginewatch.com/showPage.html?page=2167871\|title= Submitting To Search Crawlers: Google, Yahoo, Ask & Microsoft's Live Search \|date= 2007-03-12\|accessdate=2007-05-15\|publisher=[[Search Engine Watch]]}}</ref> Such programs usually guarantee inclusion in the database, but do not guarantee specific ranking within the search results.{{Dead link\|date=April 2010}}<ref>{{Dead link\|date=April 2010}}{{cite web\|title=Search Submit\|url=http://searchmarketing.yahoo.com/srchsb/index.php\|publisher=searchmarketing.yahoo.com\|accessdate=2007-05-09}}</ref>

Two major ~~directories, the Yahoo Directory and the [[Open Directory Project]] both require manual submission and human editorial review.<ref>{{cite web\|url=http://searchenginewatch.com/showPage.html?page=2167881\|title= Submitting To Directories: Yahoo & The Open Directory \|date= 2007-03-12\|accessdate=2007-05-15\|publisher=[[Search Engine Watch]]}}</ref> Google offers [[Google Webmaster Tools]], for which an XML [[Sitemap]] feed~~ directh engine]] crawlers may look at a number of different factors when [[Web crawler\|crawling]] a site. Not every page is indexed by the search engines.
~~can be created and submitted for free to ensure that all pages are found, especially pages that aren't discoverable by automatically following links.<ref>{{cite web\|url=http://www.google.com/support/webmasters/bin/answer.py?answer=40318&topic=8514\|title=What is a Sitemap file and why should I have one?\|publisher=google.com\|accessdate=2007-03-19}}</ref>~~ Distance of pages from the root directory of a site may also be a factor in whether or not pages get crawled.<ref name="cho">{{cite web\|url=http://dbpubs.stanford.edu:8090/pub/1998-51\|title=Efficient crawling through URL ordering\|author=Cho, J., Garcia-Molina, H.\|year=1998\|publisher=Proceedings of the seventh conference on World Wide Web, Brisbane, Australia\|accessdate=2007-05-09}}</ref>

[[Web search engine\|Search engine]] crawlers may look at a number of different factors when [[Web crawler\|crawling]] a site. Not every page is indexed by the search engines. Distance of pages from the root directory of a site may also be a factor in whether or not pages get crawled.<ref name="cho">{{cite web\|url=http://dbpubs.stanford.edu:8090/pub/1998-51\|title=Efficient crawling through URL ordering\|author=Cho, J., Garcia-Molina, H.\|year=1998\|publisher=Proceedings of the seventh conference on World Wide Web, Brisbane, Australia\|accessdate=2007-05-09}}</ref>

=== Preventing crawling ===

{{Main\|Robots Exclusion Standard}}

To avoid undesirable content in the search indexes, webmasters can instruct spiders not to crawl certain files or ~~directories through the standard~~ directorid [[robots.txt]] file in the root directory of the domain. Additionally, a page can be explicitly excluded from a search engine's database by using a [[meta tag]] specific to robots. When a search engine visits a site, the robots.txt located in the [[root directory]] is the first file crawled. The robots.txt file is then parsed, and will instruct the robot as to which pages are not to be crawled.
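The robots.txt check described above can be illustrated with a minimal sketch using Python's standard `urllib.robotparser` module. The robots.txt content, the `example.com` URLs, the `ExampleBot` user agent and the helper names `build_parser` and `may_crawl` are all hypothetical and chosen only for illustration; real crawlers apply their own parsing and caching policies.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content (assumption for illustration): it blocks a
# shopping cart area and internal search result pages for all robots.
ROBOTS_TXT = """\
User-agent: *
Disallow: /cart/
Disallow: /search
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text that has already been fetched (no network access)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def may_crawl(parser: RobotFileParser, url: str, user_agent: str = "ExampleBot") -> bool:
    """Return True if the parsed rules allow this user agent to fetch the URL."""
    return parser.can_fetch(user_agent, url)


parser = build_parser(ROBOTS_TXT)

# Hypothetical URLs: one ordinary page, one cart page, one internal search page.
for url in (
    "http://example.com/products/widget",
    "http://example.com/cart/checkout",
    "http://example.com/search?q=shoes",
):
    print(url, "allowed" if may_crawl(parser, url) else "blocked")
```

A crawler that reuses a previously built parser instead of re-fetching the file behaves like the cached copy mentioned in this section: rule changes made by the webmaster are not seen until the file is fetched and parsed again.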
As a search engine crawler may keep a cached copy of this file, it may on occasion crawl pages a webmaster does not wish crawled. Pages typically prevented from being crawled include login-specific pages such as shopping carts and user-specific content such as search results from internal searches. In March 2007, Google warned webmasters that they should prevent indexing of internal search results because those pages are considered search spam.<ref>{{cite web\|url=http://searchengineland.com/070508-165231.php\|title=Newspapers Amok! New York Times Spamming Google? LA Times Hijacking Cars.com?\|publisher=[[Search Engine Land]]\|date=May 8, 2007\|accessdate=2007-05-09}}</ref>

=== Increasing prominence ===

Search engine optimization: Difference between revisions