Revision as of 09:16, 15 September 2011 edit 131.254.101.85 (talk) No edit summary ← Previous edit		Revision as of 13:25, 26 January 2012 edit undo 116.74.120.229 (talk) →Common Characteristics Next edit →
Line 27: There are several important common characteristics of data-intensive computing systems that distinguish them from other forms of computing: (1) The principle of ~~collocation~~collection of the data and programs or algorithms is used to perform the computation. To achieve high performance in data-intensive computing, it is important to minimize the movement of data.<ref>[http://queue.acm.org/detail.cfm?id=1394131 Distributed Computing Economics] by J. Gray, "Distributed Computing Economics," ACM Queue, Vol. 6, No. 3, 2008, pp. 63-68.</ref> This characteristic allows processing algorithms to execute on the nodes where the data resides reducing system overhead and increasing performance.<ref>[http://www.pnl.gov/science/images/highlights/computing/dic_special.pdfData-Intensive Computing in the 21st Century], by I. Gorton, P. Greenfield, A. Szalay, and R. Williams, IEEE Computer, Vol. 41, No. 4, 2008, pp. 30-32.</ref> Newer technologies such as [[InfiniBand]] allow data to be stored in a separate repository and provide performance comparable to collocated data. (2) The programming model utilized. Data-intensive computing systems utilize a machine-independent approach in which applications are expressed in terms of high-level operations on data, and the runtime system transparently controls the scheduling, execution, load balancing, communications, and movement of programs and data across the distributed computing cluster.<ref>[http://www.cs.cmu.edu/~bryant/presentations/DISC-concept.ppt Data Intensive Scalable Computing] by R.E. Bryant. "Data Intensive Scalable Computing," 2008</ref> The programming abstraction and language tools allow the processing to be expressed in terms of data flows and transformations incorporating new dataflow [[programming languages]] and shared libraries of common data manipulation algorithms such as sorting.

Data-intensive computing: Difference between revisions