ECL (data-centric programming language)

{{Infobox programming language
| name = ECL
| developer = [[HPCC|HPCC Systems]], LexisNexis Risk Solutions
 
| logo =
| paradigm = [[declarative programming|declarative]], [[structured]], [[Data-centric programming language|data-centric]]
| latest release date =
| influenced_by = [[Prolog]], [[Pascal (programming language)|Pascal]], [[SQL]], [[Snobol4]], [[C++]], [[Clarion (programming language)|Clarion]]
| influenced = [[big data]]
| operating_system = [[Linux]]
| license =
| website = http://hpccsystems.com/
}}
 
 
=== Hello world ===
ECL is designed to yield succinct solutions to problems and sensible defaults. The "Hello World" program is characteristically short:

<pre>
'Hello World'
</pre>
A more flavorful example takes a list of strings, sorts them into order, and returns that as the result.
 
<pre>
// First declare a dataset with one column containing a list of strings
// Datasets can also be binary, CSV, XML or externally defined structures
 
D := DATASET([{'ECL'},{'Declarative'},{'Data'},{'Centric'},{'Programming'},{'Language'}],{STRING Value;});
SD := SORT(D,Value);
OUTPUT(SD);
</pre>
 
The statements containing a <code>:=</code> are defined in ECL as attribute definitions. They do not denote an action, but rather the definition of a term; thus, logically, an ECL program can be read "bottom to top".
 
D is a dataset with one column labeled 'Value' and containing the following list of data.
 
=== ECL primitives ===
ECL primitives that act upon datasets include: SORT, ROLLUP, DEDUP, ITERATE, PROJECT, JOIN, NORMALIZE, DENORMALIZE, PARSE, CHOSEN, ENTH, TOPN, DISTRIBUTE
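Several of these primitives compose directly. The following sketch (an illustration, not from the article; the dataset contents are hypothetical) removes duplicates from a sorted list and takes the top two entries:

<pre>
// Hypothetical data; DEDUP removes adjacent duplicates, so SORT first
D := DATASET([{'ECL'},{'ECL'},{'Data'},{'Centric'}],{STRING Value;});
Unique := DEDUP(SORT(D,Value),Value);
OUTPUT(TOPN(Unique,2,Value));
</pre>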
 
=== ECL encapsulation ===
Whilst ECL is terse, and LexisNexis claims that one line of ECL is roughly equivalent to 120 lines of C++, it still has significant support for large-scale programming, including data encapsulation and code re-use. The constructs available include: MODULE, FUNCTION, INTERFACE, MACRO, EXPORT and SHARED.
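As a sketch of how these constructs fit together (the module and attribute names here are hypothetical, not from the article), a MODULE can keep a dataset SHARED internally while EXPORTing only a derived attribute:

<pre>
// Hypothetical module: the raw dataset stays SHARED (visible only inside the module)
WordList := MODULE
  SHARED D := DATASET([{'ECL'},{'Data'}],{STRING Value;});
  EXPORT Sorted := SORT(D,Value);
END;
OUTPUT(WordList.Sorted);
</pre>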
 
=== Comparison to Map-Reduce ===
The Hadoop Map-Reduce paradigm consists of three phases, which correlate to ECL primitives as follows.
{| class="wikitable sortable" style="font-size: 90%; text-align: center; width: auto;"
|-
! Hadoop Name/Term
! ECL primitive
! Comments
|-
| MAPping within the MAPper
| PROJECT/TRANSFORM
| Takes a record and converts it to a different format; in the [[Hadoop]] case the conversion is into a key-value pair
|-
| SHUFFLE (Phase 1)
| DISTRIBUTE(,HASH(KeyValue))
| The records from the mapper are distributed depending upon the KEY value
|-
| SHUFFLE (Phase 2)
| SORT(,LOCAL)
| The records arriving at a particular reducer are sorted into KEY order
|-
| REDUCE
| ROLLUP(,Key,LOCAL)
| The records for a particular KEY value are now combined together
|}
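Read together, the table's rows suggest how a word-count-style Map-Reduce job might look as a single ECL pipeline. This is a hedged sketch: the record layout and the input dataset <code>Words</code> are assumptions, not from the article.

<pre>
// Hypothetical input: Words is a dataset with one STRING column, Word
R := RECORD
  STRING Key;
  INTEGER Cnt;
END;
Mapped := PROJECT(Words, TRANSFORM(R, SELF.Key := LEFT.Word; SELF.Cnt := 1)); // MAP
Dist   := DISTRIBUTE(Mapped, HASH(Key));                                      // SHUFFLE (Phase 1)
Srt    := SORT(Dist, Key, LOCAL);                                             // SHUFFLE (Phase 2)
Counts := ROLLUP(Srt, LEFT.Key = RIGHT.Key,
                 TRANSFORM(R, SELF.Key := LEFT.Key;
                              SELF.Cnt := LEFT.Cnt + RIGHT.Cnt), LOCAL);      // REDUCE
OUTPUT(Counts);
</pre>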