Blog scraping: Difference between revisions

JPM007 (talk | contribs)
Dangers: blog scrapers and blog scraper added
Alphachimp (talk | contribs)
m Limited spellcheck + unicode + minor fixes using READ ME using AWB
Line 1:
[[Category:Category needed]]{{Cleanup-date|June 2006}}
{{cleanup}}
'''Blog scraping''' is the process in which automated software scans hundreds of thousands of blogs per day, searching for and copying content. The process is sometimes referred to by the name given to the software or individuals responsible for it, "''blog scrapers''".
 
"''Scraping''" essentially means copying or, in the case of copyrighted material, stealing content from a [[Blog|blog]] that is not owned by the individual initiating the scraping process. The scraped content is often used on [[Splog|spam blogs or splogs]].
 
 
== Dangers ==
Line 15 ⟶ 14:
The more 'advanced' blog scrapers do this for two reasons. First, by copying only the content that is relevant to their splog topic, they increase the keyword relevancy of their site(s). Second, by not scraping the entire post, they eliminate any outbound links, which would otherwise reduce their search engine ranking.
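
As a rough illustration of why a partially scraped copy can look quite different from the original post, the following minimal Python sketch mimics the two behaviours described above: keeping only topic-relevant paragraphs and dropping outbound links. It is not taken from any actual scraper; the function names (<code>strip_links</code>, <code>relevant_paragraphs</code>), the blank-line paragraph convention and the sample keywords are all assumptions made for illustration.

```python
from __future__ import annotations

import re

# Matches a simple HTML anchor and captures its visible text.
# Assumption: links appear as plain <a ...>text</a> markup.
LINK_RE = re.compile(r"<a\b[^>]*>(.*?)</a>", re.IGNORECASE | re.DOTALL)


def strip_links(html: str) -> str:
    """Replace each <a ...>text</a> with its text, dropping the outbound URL."""
    return LINK_RE.sub(r"\1", html)


def relevant_paragraphs(post: str, keywords: set[str]) -> list[str]:
    """Return only the paragraphs that mention at least one keyword.

    Paragraphs are assumed to be separated by blank lines, and
    keywords are assumed to be lowercase single words.
    """
    kept = []
    for para in post.split("\n\n"):
        text = strip_links(para).strip()
        words = set(re.findall(r"[a-z0-9']+", text.lower()))
        if words & keywords:
            kept.append(text)
    return kept


post = (
    'I love <a href="http://example.com">hiking</a> in spring.\n\n'
    "My cat slept all day."
)
print(relevant_paragraphs(post, {"hiking"}))
```

Because the copy keeps only some paragraphs and loses its links, an author searching for a scraped post may have more success looking for distinctive individual sentences than for the full text.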
 
Additionally, scraped content can appear on any type of splog or [[RSS (file format)|RSS]]-fed spam site. An unsuspecting individual could therefore find their creative or even copyrighted material showing up on a site promoting pornography or other content that would be offensive to the original author or their audience. This can damage the original author's reputation.
 
== Defense ==