Revision as of 12:06, 16 December 2022: Materialscientist (Edit filter managers, Autopatrolled, Checkusers, Rollbackers, Administrators; 2,037,767 edits). m Reverted edits by 123.201.65.187 (talk) to last version by Curb Safe Charmer. Tag: Rollback
Revision as of 10:02, 29 January 2023: 虹易 (Extended confirmed users; 543 edits). - Petal (trivial market share, WP:SOAP by LTA sockpuppets)
Line 4:
{{Original research\|date=March 2021}}
}}
'''Search engine scraping''' is the process of harvesting [[URL]]s, descriptions, or other information from [[search engine]]s such as [[Google Search\|Google]], [[Microsoft Bing\|Bing]], [[Yahoo! Search\|Yahoo]], ~~[[Petal Search\|Petal]]~~ or [[Sogou]]. It is a specific form of [[screen scraping]] or [[web scraping]] dedicated to search engines only.

Most commonly, larger [[search engine optimization]] (SEO) providers depend on regularly scraping keywords from search engines, especially Google~~, [[Petal Search\|Petal]]~~ and [[Sogou]], to monitor the competitive position of their customers' websites for relevant keywords or their [[search engine indexing\|indexing]] status.

Search engines like Google have implemented various forms of human detection to block any sort of automated access to their service,<ref>{{Cite web\|url=https://support.google.com/webmasters/answer/66357?hl=en\|title=Automated queries – Search Console Help\|website=support.google.com\|language=en\|accessdate=2017-04-02}}</ref> with the intent of driving the users of scrapers towards buying their official [[API]]s instead.

The process of entering a website and extracting data in an automated fashion is also often called "[[Web crawler\|crawling]]". Search engines like Google, Bing, Yahoo~~, [[Petal Search\|Petal]]~~ or [[Sogou]] get almost all their data from automated crawling bots.

== Difficulties ==

Line 34:
All these forms of detection may also affect a normal user, especially users sharing the same IP address or network class (IPv4 ranges as well as IPv6 ranges).

== Methods of scraping Google, Bing, Yahoo~~, [[Petal Search\|Petal]]~~ or [[Sogou]] ==

To scrape a search engine successfully, the two major factors are time and amount.
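
The two factors named above can be illustrated with a minimal sketch, assuming that "time" means the spacing between requests and "amount" means the total number of requests sent. The names `RequestBudget`, `run`, and `fetch` are hypothetical and do not belong to any search engine API, and the delay values are arbitrary. The sketch performs no network access itself; the caller supplies `fetch`.

```python
import random
import time
from dataclasses import dataclass


@dataclass
class RequestBudget:
    """Hypothetical helper: caps the request count and spaces requests apart."""
    max_requests: int   # "amount": total number of requests allowed
    min_delay: float    # "time": shortest pause between requests, in seconds
    max_delay: float    # longest pause between requests, in seconds
    sent: int = 0       # requests accounted for so far

    def next_delay(self) -> float:
        """Return the pause before the next request, or raise once the budget is spent."""
        if self.sent >= self.max_requests:
            raise RuntimeError("request budget exhausted")
        self.sent += 1
        # Randomized spacing, so requests do not arrive at a fixed interval.
        return random.uniform(self.min_delay, self.max_delay)


def run(queries, budget, fetch, sleep=time.sleep):
    """Call fetch(query) per query, pausing between calls and stopping at the budget."""
    results = []
    for query in queries:
        try:
            delay = budget.next_delay()
        except RuntimeError:
            break               # amount limit reached: stop issuing requests
        if results:             # no pause is needed before the first request
            sleep(delay)
        results.append(fetch(query))
    return results
```

A caller would pass its own `fetch` function, for example one that queries an official API, together with a budget such as `RequestBudget(max_requests=100, min_delay=5.0, max_delay=15.0)`. The `sleep` parameter exists only so the pacing can be checked without actually waiting.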

Search engine scraping: Difference between revisions