Data-intensive computing

===HPCC===
{{Main|HPCC}}
[[LexisNexis|LexisNexis Risk Solutions]] independently developed and implemented a solution for data-intensive computing called the [[HPCC]] (High-Performance Computing Cluster). Development of this computing platform began in 1999 and applications were in production by late 2000. The LexisNexis approach also utilizes commodity clusters of hardware running the [[Linux]] operating system. Custom system software and middleware components were developed and layered on the base Linux operating system to provide the execution environment and distributed filesystem support required for data-intensive computing. LexisNexis also implemented a new high-level language for data-intensive computing called ECL.
 
The [[ECL, data-centric programming language for Big Data|ECL programming language]] is the primary distinguishing factor between HPCC and other data-intensive computing solutions. It is a high-level, declarative, data-centric, [[Implicit parallelism|implicitly parallel]] language that allows the programmer to define what the data processing result should be, together with the dataflows and transformations needed to achieve it. The ECL language includes extensive capabilities for data definition, filtering, data management, and data transformation, and provides a large set of built-in functions that operate on records in datasets and can be supplemented with user-defined transformation functions. [[ECL, data-centric programming language for Big Data|ECL]] programs are compiled into optimized [[C++]] source code, which is subsequently compiled into executable code and distributed to the nodes of a processing cluster.
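A minimal ECL sketch of this declarative, dataflow style is shown below; the record layout, field names, and sample values are hypothetical and chosen purely for illustration, not drawn from HPCC documentation.

<syntaxhighlight lang="ecl">
// Record layout describing each row of the dataset (hypothetical example)
PersonRec := RECORD
    STRING20  firstname;
    STRING20  lastname;
    UNSIGNED4 age;
END;

// Small inline dataset used purely for illustration
people := DATASET([{'Ann',  'Smith', 34},
                   {'Bob',  'Jones', 17},
                   {'Cara', 'Lee',   45}], PersonRec);

// Declarative filter: the programmer states what subset is wanted,
// not how the cluster should scan or partition the data
adults := people(age >= 18);

// A user-defined transformation applied to every record via PROJECT
PersonRec addYear(PersonRec l) := TRANSFORM
    SELF.age := l.age + 1;   // derive a new field value
    SELF     := l;           // copy the remaining fields unchanged
END;

result := PROJECT(adults, addYear(LEFT));

OUTPUT(result);
</syntaxhighlight>

Because these definitions are declarative, the compiler and runtime decide how the filter and the PROJECT are distributed across the nodes of the cluster; the programmer does not specify partitioning or execution order.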