Revision as of 09:47, 7 March 2011 by Jratke (edit summary: Update ecl for big data definition correctly); revision as of 09:48, 7 March 2011 by Jratke (no edit summary)
}}
'''ECL''' is a declarative, data-centric programming language designed in 2000 to allow a team of programmers to process Big Data across a high-performance computing cluster without the programmer being involved in many of the lower-level, imperative decisions.<ref>[http://www.lexisnexis.com/risk/about/guides/program-guide.html A Guide to ECL, [[Lexis-Nexis]].]</ref>

== History ==
ECL was initially designed and developed in 2000 as an in-house productivity tool within Seisint Inc and was considered to be a ‘secret weapon’ that allowed [[Seisint]] to gain market share in its data business. The technology was cited as a driving force behind the acquisition of Seisint by [[LexisNexis]] and then again as a major source of synergies when LexisNexis acquired ChoicePoint Inc.

== Implementations ==
LexisNexis, and its partners, were chosen for a [[DARPA]] project to create a prototype of a new kind of data-centric supercomputer. DARPA is the research and development office for the [[U.S. Department of Defense (DoD)]]. DARPA’s mission is to maintain technological superiority of the U.S. military and prevent technological surprise from harming U.S. [[national security]].<ref>[http://www.darpa.mil/about.html [[DARPA]].]</ref>

DARPA has created a Ubiquitous High Performance Computing (UHPC) program to provide the revolutionary technology needed to meet the steadily increasing demands of DoD applications – from embedded to command center and expandable to high performance computing systems.<ref>[http://www.er.doe.gov/ascr/Research/CS/UHPC%20DARPA-SN-09-46_RFI.pdfl [[DARPA High Performance Computing]].]</ref>

Earlier, [[Sandia National Labs]] and LexisNexis were chosen by DARPA as one of four teams to design a new kind of data-centric supercomputer prototype.<ref>[https://share.sandia.gov/news/resources/news_releases/supercomputer-prototype/ttp://insidehpc.com/2010/08/19/uhpc-the-sandia-team/ Sandia team and [[High Performance Computing]].]</ref>

== Language Constructs ==
ECL, at least in its purest form, is a declarative, data-centric language. Programs, in the strictest sense, do not exist. Rather, an ECL application specifies a number of core datasets (or data values) and then the operations which are to be performed on those values.

=== Hello world ===
ECL is designed to have succinct solutions to problems and sensible defaults. The ‘Hello World’ program is characteristically short:

 'Hello World'

Perhaps a more flavorful example would take a list of strings, sort them into order, and then return that as a result instead.

 // First declare a dataset with one column containing a list of strings
 // Datasets can also be binary, csv, xml or externally defined structures
 D := DATASET([{'ECL'},{'Declarative'},{'Data'},{'Centric'},{'Programming'},{'Language'}],{STRING Value;});
 SD := SORT(D,Value);
 OUTPUT(SD);

The statements containing a := are defined in ECL as attribute definitions. They do not denote an action; rather, each one defines a term. Thus, logically, an ECL program can be read “bottom to top”:

 OUTPUT(SD)

What is a D?

 D := DATASET([{'ECL'},{'Declarative'},{'Data'},{'Centric'},{'Programming'},{'Language'}],{STRING Value;});

D is a dataset with one column labeled ‘Value’ and containing the following list of data.

=== ECL Primitives ===
ECL primitives that act upon datasets include: SORT, ROLLUP, DEDUP, ITERATE, PROJECT, JOIN, NORMALIZE, DENORMALIZE, PARSE, CHOOSEN, ENTH, TOPN, DISTRIBUTE

=== ECL Encapsulation ===
Whilst ECL is terse, and LexisNexis claims that 1 line of ECL is roughly equivalent to 120 lines of C++, it still has significant support for large-scale programming, including data encapsulation and code re-use. The constructs available include: MODULE, FUNCTION, INTERFACE, MACRO, EXPORT, SHARED
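
As an illustration of the encapsulation constructs listed above, the following is a minimal sketch rather than code taken from LexisNexis. All names in it (ValueRec, StringTools, Prefix, Tag, UniqueSorted, SortedDs) are hypothetical, and it assumes the usual scoping behaviour in which EXPORT definitions are visible to callers while SHARED definitions stay inside the module.

```ecl
// Hypothetical record layout: one string column, as in the sort example
ValueRec := RECORD
  STRING Value;
END;

// Hypothetical module grouping related definitions
StringTools := MODULE
  // SHARED: usable by other definitions inside this module,
  // assumed not to be visible to outside callers
  SHARED STRING Prefix := 'ECL: ';

  // EXPORT: part of the module's public surface
  EXPORT STRING Tag(STRING s) := Prefix + s;

  // FUNCTION structure: local definitions followed by a RETURN
  EXPORT UniqueSorted(DATASET(ValueRec) ds) := FUNCTION
    SortedDs := SORT(ds, Value);      // order the rows by Value
    RETURN DEDUP(SortedDs, Value);    // drop adjacent duplicate values
  END;
END;

D := DATASET([{'ECL'},{'Data'},{'ECL'},{'Centric'}], ValueRec);

OUTPUT(StringTools.Tag('Hello World'));
OUTPUT(StringTools.UniqueSorted(D));
```

Under these assumptions, calling code can reach only StringTools.Tag and StringTools.UniqueSorted; a reference to StringTools.Prefix from outside the module would be expected to be rejected, which is how the module hides its internal definitions while still allowing reuse.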

=== Support for Parallelism in ECL ===
In the HPCC implementation, by default, most ECL constructs will execute in parallel across the hardware being used. Many of the primitives also have a LOCAL option to specify that the operation is to occur locally on each node.

=== Comparison to Map-Reduce ===
The Hadoop Map-Reduce paradigm actually consists of three phases, which correlate to ECL primitives as follows:
{{clear}}
{| class="wikitable sortable" style="font-size: smaller; text-align: center; width: auto;"
|-
! Hadoop Name/Term
|-
! MAPing within the MAPper
! PROJECT/TRANSFORM
! Takes a record and converts it to a different format; in the [[Hadoop]] case the conversion is into a key-value pair
|-
! SHUFFLE (Phase 1)
|}

== References ==
<!--- See [[Wikipedia:Footnotes]] on how to create references using <ref></ref> tags which will then appear here automatically -->
{{Reflist}}

== External links ==
* [http://www.nytimes.com/2008/02/21/technology/21iht-reed.4.10279549.html Reed Elsevier to acquire ChoicePoint for $3.6 billion]
* [http://www.bloomberg.com/apps/news?pid=newsarchive&sid=aBuqYZDOSPL4&refer=uk Reed Elsevier's LexisNexis Buys Seisint for $775 Mln]
* [http://www.reuters.com/finance/stocks/keyDevelopments?symbol=ENL&pn=15 Reed Elsevier]

ECL (data-centric programming language): Difference between revisions