Revision as of 10:03, 29 January 2023 by 虹易 (no edit summary)
Revision as of 02:01, 31 January 2023 by Citation bot (Alter: title, template type. Add: magazine, date. Removed parameters. Some additions/deletions were parameter name changes. Suggested by Abductive. #UCB_webform 1750/3850)
Line 65:
== Legal ==
When scraping websites and services, legality is often a major concern for companies. For web scraping, it depends greatly on the country the scraping user or company is from, as well as on which data or website is being scraped, and court rulings around the world have differed.<ref>{{cite web|url=http://blog.icreon.us/advise/web-scraping-legality|title=Is Web Scraping Legal?|publisher=Icreon (blog)}}</ref><ref>{{cite web|url=https://arstechnica.com/tech-policy/2014/04/appeals-court-reverses-hackertroll-weev-conviction-and-sentence/|title=Appeals court reverses hacker/troll "weev" conviction and sentence [Updated]|website=arstechnica.com|date=11 April 2014}}</ref><ref>{{cite web|url=https://www.techdirt.com/articles/20090605/2228205147.shtml|title=Can Scraping Non-Infringing Content Become Copyright Infringement... Because Of How Scrapers Work?|website=www.techdirt.com|date=10 June 2009}}</ref> However, when it comes to scraping search engines, the situation is different: search engines usually do not list intellectual property, as they just repeat or summarize information they scraped from other websites. The largest publicly known incident of a search engine being scraped happened in 2011, when Microsoft was caught scraping unknown keywords from Google for its own, rather new Bing service,<ref>{{cite magazine|url=https://www.wired.com/2011/02/bing-copies-google/|title=Google Catches Bing Copying; Microsoft Says 'So What?'|first=Ryan|last=Singel|magazine=Wired}}</ref> but even this incident did not result in a court case. One possible reason might be that search engines such as Google and [[Sogou]] get almost all their data by scraping millions of publicly reachable websites, also without reading and accepting those websites' terms.

Search engine scraping: Difference between revisions