ECL (data-centric programming language)

{{Short description|None}}
{{Infobox programming language
| name = ECL
| developer = [[HPCC|HPCC Systems®]], LexisNexis Risk Solutions
| logo =
| paradigm = [[declarative programming|declarative]], [[structured]], [[Data-centric programming language|data-centric]]
| typing = [[type system#Static typing|static]], [[type system#Strong and weak typing|strong]], [[type system#Safely and unsafely typed systems|safe]]
| major implementations = [[Windows Cluster]], [[GNU]]/[[Linux kernel|Linux]] [[Cluster]]
| year = 2000
| designer =
| latest release version =
| latest release date =
| influenced_by = [[Prolog]], [[Pascal (programming language)|Pascal]], [[SQL]], [[Snobol4]], [[C++]], [[Clarion (programming language)|Clarion]]
| influenced = [[big data]]
| operating_system = [[Linux]]
| license =
| website = http://hpccsystems.com/
}}
 
'''ECL''' (Enterprise Control Language) is a declarative, data-centric programming language designed in 2000 to allow a team of programmers to process [[big data]] across a high-performance computing cluster without being involved in many of the lower-level, imperative decisions.<ref>[http://www.lexisnexis.com/risk/about/guides/program-guide.html A Guide to ECL], [[Lexis-Nexis]].</ref><ref>"Evaluating use of data flow systems for large graph analysis," by A. Yoo, and I. Kaplan. Proceedings of the 2nd Workshop on Many-Task Computing on Grids and Supercomputers, MTAGS, 2009</ref>
 
== History ==
ECL was initially designed and developed in 2000 by David Bayliss as an in-house productivity tool within [[Lexis-Nexis|Seisint Inc]] and was considered to be a "secret weapon" that allowed Seisint to gain market share in its data business. Equifax had an SQL-based process for predicting who would go bankrupt in the next 30 days, but it took 26 days to run the data. The first ECL implementation solved the same problem in 6 minutes. The technology was cited as a driving force behind the acquisition of Seisint by [[LexisNexis]] and then again as a major source of synergies when LexisNexis acquired ChoicePoint Inc.<ref>{{Cite web |url=http://www.reed-elsevier.com/mediacentre/pressreleases/2004/Pages/AcquisitionofSeisint.aspx |title=Acquisition of Seisint |access-date=2011-03-24 |archive-url=https://web.archive.org/web/20110621001838/http://www.reed-elsevier.com/mediacentre/pressreleases/2004/Pages/AcquisitionofSeisint.aspx |archive-date=2011-06-21 |url-status=dead }}</ref>
 
== Language constructs ==
ECL, at least in its purest form, is a declarative, data-centric language. Programs, in the strictest sense, do not exist. Rather, an ECL application specifies a number of core datasets (or data values) and then the operations to be performed on those values.
 
=== Hello world ===
ECL is designed to have succinct solutions to problems and sensible defaults. The "Hello World" program is characteristically short:
 'Hello World'
Perhaps a more flavorful example would take a list of strings, sort them into order, and then return the sorted list as the result.
 
<syntaxhighlight lang="ecl">
// First declare a dataset with one column containing a list of strings
// Datasets can also be binary, CSV, XML or externally defined structures

D := DATASET([{'ECL'},{'Declarative'},{'Data'},{'Centric'},{'Programming'},{'Language'}],{STRING Value;});
SD := SORT(D,Value);
OUTPUT(SD);
</syntaxhighlight>
 
The statements containing a <code>:=</code> are defined in ECL as attribute definitions. They do not denote an action but rather the definition of a term. Thus, logically, an ECL program can be read "bottom to top":
 
OUTPUT(SD)
 
What is an SD?
<syntaxhighlight lang="ecl">
 
SD := SORT(D,Value);
</syntaxhighlight>
 
SD is a D that has been sorted by 'Value'.
 
What is a D?
<syntaxhighlight lang="ecl">
 
D := DATASET([{'ECL'},{'Declarative'},{'Data'},{'Centric'},{'Programming'},{'Language'}],{STRING Value;});
</syntaxhighlight>
 
D is a dataset with one column labeled 'Value' that contains the list of strings given in the definition.
 
=== ECL primitives ===
ECL primitives that act upon datasets include: SORT, ROLLUP, DEDUP, ITERATE, PROJECT, JOIN, NORMALIZE, DENORMALIZE, PARSE, CHOOSEN, ENTH, TOPN, DISTRIBUTE.
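
As a minimal sketch of how a few of these primitives combine, the following example sorts a small inline dataset, removes duplicate records and keeps the two oldest entries. The record layout <code>PersonRec</code> and the sample data are hypothetical and serve only as an illustration.

<syntaxhighlight lang="ecl">
// Hypothetical record layout and inline data, for illustration only
PersonRec := RECORD
  STRING20  name;
  UNSIGNED1 age;
END;

People := DATASET([{'Ann', 34}, {'Bob', 27}, {'Ann', 34}, {'Cy', 51}], PersonRec);

// SORT orders the records so that identical records become adjacent
Sorted := SORT(People, name, age);

// DEDUP drops adjacent records that match on the listed fields
Unique := DEDUP(Sorted, name, age);

// TOPN returns the first 2 records in descending order of age
Oldest := TOPN(Unique, 2, -age);

OUTPUT(Oldest);
</syntaxhighlight>

As in the earlier example, each <code>:=</code> line only defines a term; nothing is computed until the final <code>OUTPUT</code> action asks for a result.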
 
=== ECL encapsulation ===
Whilst ECL is terse (LexisNexis claims that 1 line of ECL is roughly equivalent to 120 lines of C++), it still has significant support for large-scale programming, including data encapsulation and code reuse. The constructs available include: MODULE, FUNCTION, FUNCTIONMACRO, INTERFACE, MACRO, EXPORT, SHARED.
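
A minimal sketch of how some of these constructs fit together is shown below. The module name <code>StringUtil</code> and its members are hypothetical; the example assumes only that a SHARED definition is usable inside its module while an EXPORT definition is reachable from outside it.

<syntaxhighlight lang="ecl">
// Hypothetical module; all names are illustrative only
StringUtil := MODULE
  // SHARED: usable by later definitions in this module, not by its callers
  SHARED STRING Suffix := '!';

  // EXPORT: reachable from outside as StringUtil.Exclaim
  EXPORT STRING Exclaim(STRING s) := FUNCTION
    Trimmed := TRIM(s);        // local definition, private to the function
    RETURN Trimmed + Suffix;
  END;
END;

OUTPUT(StringUtil.Exclaim('Hello World'));
</syntaxhighlight>

Callers depend only on the exported <code>Exclaim</code> definition, so the shared <code>Suffix</code> value can change without affecting code outside the module.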
 
=== Support for Parallelism in ECL ===
 
=== Comparison to Map-Reduce ===
The Hadoop Map-Reduce paradigm actually consists of three phases which correlate to ECL primitives as follows.
{| class="wikitable sortable" style="font-size: 90%; text-align: center; width: auto;"
|-
! Hadoop Name/Term
! ECL Primitive
! Comments
|-
| MAPping within the MAPper
| PROJECT/TRANSFORM
| Takes a record and converts it to a different format; in the [[Hadoop]] case the conversion is into a key-value pair
|-
| SHUFFLE (Phase 1)
| DISTRIBUTE(,HASH(KeyValue))
| The records from the mapper are distributed depending upon the KEY value
|-
| SHUFFLE (Phase 2)
| SORT(,LOCAL)
| The records arriving at a particular reducer are sorted into KEY order
|-
| REDUCE
| ROLLUP(,Key,LOCAL)
| The records for a particular KEY value are now combined together
|}
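
The correspondence in the table can be sketched as a small word-count pipeline. This is an illustrative example under stated assumptions, not code taken from the ECL documentation: the record layouts <code>InRec</code> and <code>PairRec</code>, the transform names and the sample words are all hypothetical.

<syntaxhighlight lang="ecl">
// Hypothetical layouts: an input word, and a (key, value) pair
InRec := RECORD
  STRING word;
END;

PairRec := RECORD
  STRING    word;   // the key
  UNSIGNED4 cnt;    // the value
END;

Words := DATASET([{'data'}, {'ecl'}, {'data'}, {'cluster'}, {'ecl'}, {'data'}], InRec);

// MAP: PROJECT/TRANSFORM converts each record into a key-value pair
PairRec ToPair(InRec L) := TRANSFORM
  SELF.word := L.word;
  SELF.cnt  := 1;
END;
Pairs := PROJECT(Words, ToPair(LEFT));

// SHUFFLE (Phase 1): records with the same key are sent to the same node
Dist := DISTRIBUTE(Pairs, HASH(word));

// SHUFFLE (Phase 2): each node sorts only the records it holds
Sorted := SORT(Dist, word, LOCAL);

// REDUCE: adjacent records with the same key are combined on each node
PairRec Combine(PairRec L, PairRec R) := TRANSFORM
  SELF.word := L.word;
  SELF.cnt  := L.cnt + R.cnt;
END;
Counts := ROLLUP(Sorted, LEFT.word = RIGHT.word, Combine(LEFT, RIGHT), LOCAL);

OUTPUT(Counts);
</syntaxhighlight>

Because DISTRIBUTE places every record for a given key on the same node, the LOCAL sort and rollup can run on each node independently and still produce one combined record per key.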
 
 
== External links ==
* [http://rosettacode.org/wiki/ECL Rosetta Code ECL category]
* [https://hpccsystems.com/training/documentation/ecl-language-reference/html ECL Language Reference] {{Webarchive|url=https://web.archive.org/web/20210116135748/https://hpccsystems.com/training/documentation/ecl-language-reference/html |date=2021-01-16 }}
* [https://www.nytimes.com/2008/02/21/technology/21iht-reed.4.10279549.html Reed Elsevier to acquire ChoicePoint for $3.6 billion]
* [https://www.bloomberg.com/apps/news?pid=newsarchive&sid=aBuqYZDOSPL4&refer=uk Reed Elsevier's LexisNexis Buys Seisint for $775 Mln]
* [https://archive.today/20130201091208/http://www.reuters.com/finance/stocks/keyDevelopments?symbol=ENL&pn=15 Reed Elsevier]
 
[[Category:Declarative programming languages]]
[[Category:Data-centric programming languages]]
[[Category:Big data]]
[[Category:Statically typed programming languages]]