Cuneiform (programming language): Difference between revisions

==Parallel execution==
 
Cuneiform is a purely functional language, i.e., it does not support mutable references. As a consequence, it can use subterm-independence to divide a program into parallelizable partitions. The Cuneiform scheduler distributes these partitions to worker nodes. In addition, Cuneiform uses a call-by-name evaluation strategy to compute values only if they contribute to the computation result. Finally, foreign function applications are memoized to speed up computations that contain previously derived results.
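
The memoization of foreign function applications mentioned above can be illustrated with a minimal Python sketch. It is not Cuneiform's actual implementation: the <code>MemoTable</code> class, its hashing scheme, and the <code>word_count</code> stand-in function are all hypothetical, and the cache lives only in memory. The approach is sound only if the applied function is deterministic and free of side effects.

```python
import hashlib
import json


class MemoTable:
    """Hypothetical in-memory cache of function applications (illustration only)."""

    def __init__(self):
        self._results = {}
        self.misses = 0  # number of times the function really had to run

    def apply(self, name, func, **args):
        # Identify an application by the function name plus its named arguments.
        key = hashlib.sha256(
            json.dumps([name, args], sort_keys=True).encode("utf-8")
        ).hexdigest()
        if key not in self._results:
            self.misses += 1
            self._results[key] = func(**args)
        return self._results[key]


def word_count(text):
    # Stand-in for a foreign function; assumed deterministic.
    return len(text.split())


memo = MemoTable()
first = memo.apply("word_count", word_count, text="a b c")
second = memo.apply("word_count", word_count, text="a b c")  # served from the cache
```

In this sketch the second application never calls <code>word_count</code>, because an application with the same name and arguments has already been recorded.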
The task applications in a Cuneiform script form a data dependency graph.
This dependency graph constrains the order in which tasks can be evaluated.
Apart from data dependencies, tasks can be evaluated in any order, assuming tasks are always [[Side effect (computer science)|side effect]]-free and deterministic.
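
How a dependency graph constrains evaluation order can be sketched in Python with the standard-library <code>graphlib.TopologicalSorter</code> and a thread pool. This is an illustrative analogy, not the Cuneiform scheduler: the <code>run_graph</code> helper and the task names <code>f</code>, <code>g</code>, and <code>h</code> are assumptions chosen for the example, and the tasks are assumed to be deterministic and side effect-free.

```python
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait
from graphlib import TopologicalSorter


def run_graph(tasks, deps, max_workers=4):
    """Run tasks as soon as all their dependencies have finished.

    tasks: task name -> callable taking a dict of dependency results
    deps:  task name -> set of task names it depends on
    """
    sorter = TopologicalSorter(deps)
    sorter.prepare()
    results = {}
    running = {}  # future -> task name
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        while sorter.is_active():
            # Every task returned here has all of its inputs available.
            for name in sorter.get_ready():
                inputs = {d: results[d] for d in deps.get(name, ())}
                running[pool.submit(tasks[name], inputs)] = name
            done, _ = wait(list(running), return_when=FIRST_COMPLETED)
            for future in done:
                name = running.pop(future)
                results[name] = future.result()
                sorter.done(name)  # may unlock dependent tasks
    return results


# f and g are independent; h needs the results of both.
tasks = {
    "f": lambda inputs: "output-of-f",
    "g": lambda inputs: "output-of-g",
    "h": lambda inputs: inputs["f"] + "+" + inputs["g"],
}
deps = {"f": set(), "g": set(), "h": {"f", "g"}}
results = run_graph(tasks, deps)
```

Here <code>f</code> and <code>g</code> are submitted together, while <code>h</code> becomes ready only after both have been marked as done.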
 
Cuneiform's skeletons expose different amounts of parallelism:
;[[Map (higher-order function)|Map]]: Applies a task to each element in a list. Each task application can run in parallel.
;[[Cartesian product]]: Takes the Cartesian product of several lists and applies a task to each combination. Each task application can run in parallel.
;[[Dot product]]: Given a pair of lists of equal sizes, each element of the first list is combined with its corresponding element in the second list. A task is applied to each combination. Each task application can run in parallel.
;[[Fold (higher-order function)|Aggregate]]: Applies a task to the list as a whole without decomposing it. Since the task is applied only once for the whole list, this skeleton leaves the parallelism potential unchanged.
;[[Conditional (computer programming)|Conditional]]: Evaluates a program branch, depending on a condition computed at runtime. This skeleton leaves the parallelism potential unchanged.
 
By partitioning input data and using parallelizable skeletons to process the partitions, the interpreter can exploit data parallelism even if the integrated tools are single-threaded. Workflows can also be executed in distributed compute environments.
 
For example, the following Cuneiform program allows the applications of <code>f</code> and <code>g</code> to run in parallel while <code>h</code> is dependent and can be started only when both <code>f</code> and <code>g</code> are finished.
 
<pre>
let output-of-f : File = f();
let output-of-g : File = g();
 
h( f = output-of-f, g = output-of-g );
</pre>
 
The following Cuneiform program creates three parallel applications of the function <code>f</code> by mapping <code>f</code> over a three-element list:
 
<pre>
let xs : [File] =
['a.txt', 'b.txt', 'c.txt' : File];
 
for x <- xs do
f( x = x )
: File
end;
</pre>
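
As a rough analogy outside Cuneiform, the map skeleton and the related dot product and Cartesian product skeletons can be sketched in Python with a thread pool. The helper <code>parallel_apply</code> and the stand-in tasks <code>f</code> and <code>pair</code> are hypothetical names introduced for this illustration. Note that Python's <code>zip</code> silently truncates lists of unequal length, whereas the dot product skeleton assumes equal sizes.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product


def parallel_apply(task, combinations):
    """Apply task to every argument tuple; applications may run concurrently."""
    with ThreadPoolExecutor() as pool:
        # pool.map returns results in input order, whatever the completion order.
        return list(pool.map(lambda args: task(*args), combinations))


def f(x):
    # Hypothetical stand-in for a single-input task.
    return f"f({x})"


def pair(x, y):
    # Hypothetical stand-in for a two-input task.
    return f"f({x},{y})"


xs = ["a.txt", "b.txt", "c.txt"]
ys = ["1.txt", "2.txt", "3.txt"]

mapped = parallel_apply(f, [(x,) for x in xs])    # map: one application per element
dotted = parallel_apply(pair, zip(xs, ys))        # dot product: pairwise combinations
crossed = parallel_apply(pair, product(xs, ys))   # Cartesian product: all combinations
```

Mapping over the three-element list yields three independent applications, the dot product also yields three, and the Cartesian product yields nine.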
 
Similarly, the applications of <code>f</code> and <code>g</code> are independent in the construction of the record <code>r</code> and can, thus, be run in parallel:
 
<pre>
let r : <a : File, b : File> =
<a = f(), b = g()>;
</pre>
==Examples==