Parallelization contract: Difference between revisions

Content deleted Content added
Schubi87 (talk | contribs)
No edit summary
Schubi87 (talk | contribs)
No edit summary
Line 9:
 
The PACT programming model is based on the concept of Parallelization Contracts (PACTs). Similar to MapReduce, arbitrary user code is handed and executed by PACTs. However, PACT generalizes a couple of MapReduce's concepts:
-* Second-order Functions: PACT provides more second-order functions. Currently, five SOF called Input Contracts are supported. This set might be extended in the future.
-* Program structure: PACT allows the composition of arbitrary acyclic data flow graphs. In contract, MapReduce programs have a static structure (Map -> Reduce).
-* Data Model: PACT's data model are records of arbitrary many fields of arbitrary types. MapReduce's KeyValue-Pairs can be considered as records with two fields.
 
In the following, the concept of Parallelization Contracts is discussed and how they are composed to PACT programs.
Line 18:
 
Parallelization Contracts (PACTs) are data processing operators in a data flow. Therefore, a PACT has one or more data inputs and one or more outputs. A PACT consists of two components:
* Input Contract
* User function
* User code annotations
 
The figure below shows how those components work together. Input Contracts split the input data into independently processable subset. The user code is called for each of these independent subsets. All calls can be executed in parallel, because the subsets are independent.