Icherishyou (talk | contribs) m word choice |
No edit summary |
Line 21:
|author=Ron Lieber |date=May 7, 2016}}</ref>
Screen scraping is normally associated with the programmatic collection of visual data from a source, instead of parsing data as in
As a concrete example of a classic screen scraper, consider a hypothetical legacy system dating from the 1960s, the dawn of computerized [[data processing]]. Computer-to-user [[user interface|interfaces]] from that era were often simply text-based [[dumb terminal]]s, not much more than virtual [[teleprinter]]s (such systems are still in use {{As of|2007|alt=today}}, for various reasons). The desire to interface such a system to more modern systems is common. A [[Robustness (computer science)|robust]] solution will often require things that are no longer available, such as [[source code]], system [[documentation]], [[Application programming interface|API]]s, or [[programmers]] with experience in a 50-year-old computer system. In such cases, the only feasible solution may be to write a screen scraper that "pretends" to be a user at a terminal. The screen scraper might connect to the legacy system via [[Telnet]], [[emulator|emulate]] the keystrokes needed to navigate the old user interface, process the resulting display output, extract the desired data, and pass it on to the modern system. A sophisticated and resilient implementation of this kind, built on a platform providing the governance and control required by a major enterprise (e.g. change control, security, user management, data protection, operational audit, load balancing, and queue management), could be said to be an example of [[robotic process automation]] (RPA) software, or RPAAI in the case of self-guided RPA 2.0 based on [[artificial intelligence]].
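The "process the resulting display output, extract the desired data" step can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration: the sample screen, the field names, and the row and column positions in <code>FIELD_LAYOUT</code> are hypothetical, and the Telnet session and keystroke emulation are omitted so that only the extraction from a captured text screen is shown.

```python
from decimal import Decimal

# Hypothetical screen captured from a terminal session (illustrative only).
SAMPLE_SCREEN = (
    "CUSTOMER INQUIRY                 PAGE 01\n"
    "ACCT NO: 00123456   STATUS: ACTIVE\n"
    "NAME   : DOE, JANE\n"
    "BALANCE:     1,204.50\n"
    "ENTER=NEXT  PF3=EXIT\n"
)

# Assumed layout: field name -> (row, start column, end column).
# A fixed-position screen has no markup, so the scraper must know
# where on the display each value is drawn.
FIELD_LAYOUT = {
    "account": (1, 8, 20),
    "status": (1, 27, 40),
    "name": (2, 8, 40),
    "balance": (3, 8, 40),
}


def scrape_screen(screen: str) -> dict:
    """Cut each field out of the screen text by position and tidy it up."""
    rows = screen.splitlines()
    record = {}
    for field, (row, start, end) in FIELD_LAYOUT.items():
        line = rows[row] if row < len(rows) else ""
        record[field] = line[start:end].strip()
    # Convert the display-formatted amount into a number for the modern system.
    record["balance"] = Decimal(record["balance"].replace(",", ""))
    return record


if __name__ == "__main__":
    print(scrape_screen(SAMPLE_SCREEN))
```

Because the extraction depends entirely on screen positions, any change to the legacy display layout would break such a scraper, which is one reason the approach is usually treated as a last resort.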
Line 38:
===Web scraping===
{{main|Web scraping}}
[[Web page]]s are built using text-based mark-up languages ([[HTML]] and [[XHTML]]), and frequently contain a wealth of useful data in text form. However, most web pages are designed for human [[End-user (computer science)|end-users]] and not for ease of automated use. Because of this, toolkits that scrape web content were created. A [[Web scraping|web scraper]] is an [[API]] or tool that extracts data from a website. Companies like [[Amazon AWS]] and [[Google]] provide '''web scraping''' tools, services, and public data available free of cost to end-users.
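As a rough illustration of extracting data from mark-up meant for human readers, the following Python sketch uses the standard library's <code>html.parser</code> module to collect table cells from a page. The sample HTML, the class name <code>TableCellScraper</code>, and the helper <code>scrape_rows</code> are hypothetical; a real scraper would fetch the page over HTTP and would need to cope with far messier mark-up.

```python
from html.parser import HTMLParser

# Hypothetical page content (illustrative only).
SAMPLE_HTML = """
<html><body>
<h1>Price list</h1>
<table>
  <tr><th>Item</th><th>Price</th></tr>
  <tr><td>Widget</td><td><b>4.99</b></td></tr>
  <tr><td>Gadget</td><td>12.50</td></tr>
</table>
</body></html>
"""


class TableCellScraper(HTMLParser):
    """Collects the text of every <td> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None
        self._in_cell = False
        self._cell = []

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag == "td" and self._row is not None:
            self._in_cell = True
            self._cell = []

    def handle_data(self, data):
        # Keep text only while inside a data cell; nested tags such as <b>
        # do not interrupt collection.
        if self._in_cell:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag == "td" and self._in_cell:
            self._row.append("".join(self._cell).strip())
            self._in_cell = False
        elif tag == "tr" and self._row is not None:
            if self._row:  # header rows contain only <th>, so they stay empty
                self.rows.append(self._row)
            self._row = None


def scrape_rows(html: str) -> list:
    parser = TableCellScraper()
    parser.feed(html)
    parser.close()
    return parser.rows


if __name__ == "__main__":
    print(scrape_rows(SAMPLE_HTML))
```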
Newer forms of web scraping involve monitoring data feeds from web servers. For example, [[JSON]] is commonly used as a data transport format between the client and the web server.
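When the data arrives as a JSON feed, the scraper can read the structured payload directly and skip the mark-up entirely. The Python sketch below assumes a hypothetical payload shape (an <code>items</code> list with <code>name</code> and <code>price</code> keys); real feeds differ from site to site, and fetching the feed over the network is omitted.

```python
import json

# Hypothetical response body from a web server's data feed (illustrative only).
SAMPLE_FEED = (
    '{"items": ['
    '{"sku": "W-1", "name": "Widget", "price": 4.99}, '
    '{"sku": "G-2", "name": "Gadget", "price": 12.5}'
    '], "next_page": null}'
)


def extract_prices(payload: str) -> dict:
    """Map each item name to its price from the assumed feed structure."""
    data = json.loads(payload)
    return {item["name"]: item["price"] for item in data.get("items", [])}


if __name__ == "__main__":
    print(extract_prices(SAMPLE_FEED))
```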