m "ad-hoc" -> "ad hoc"
m Missed one.
The [[ECL, data-centric programming language for Big Data|ECL programming language]] is the primary distinguishing factor between HPCC and other data-intensive computing solutions. It is a high-level, declarative, data-centric, [[Implicit parallelism|implicitly parallel]] language that allows the programmer to define what the data processing result should be, along with the dataflows and transformations necessary to achieve that result. ECL includes extensive capabilities for data definition, filtering, data management, and data transformation, and provides a large set of built-in functions that operate on records in datasets; these operations can incorporate user-defined transformation functions. [[ECL, data-centric programming language for Big Data|ECL]] programs are compiled into optimized [[C++]] source code, which is subsequently compiled into executable code and distributed to the nodes of a processing cluster.
To address both the batch and online aspects of data-intensive computing applications, [[HPCC]] includes two distinct cluster environments, each of which can be optimized independently for its parallel data processing purpose. The Thor platform is a cluster that serves as a data refinery, processing massive volumes of raw data for applications such as data cleansing and hygiene, [[Extract, transform, load|ETL]] (extract, transform, load), record linking and entity resolution, large-scale ad
== See also ==