* The pbdR, built on pbdMPI, uses [[SPMD|SPMD parallelism]], where every processor is treated as a worker and owns a part of the data. [[SPMD|SPMD parallelism]], introduced in the mid-1980s, is particularly efficient in homogeneous computing environments for large data, for example, performing [[singular value decomposition]] on a large matrix, or performing [[Mixture model|clustering analysis]] on high-dimensional large data (see the SPMD sketch after the paragraph below). On the other hand, there is no restriction against using [[Master/slave (technology)|manager/workers parallelism]] within an SPMD environment.
* Rmpi<ref name=rmpi/> uses [[Master/slave (technology)|manager/workers parallelism]], where one main processor (manager) serves as the controller of all other processors (workers). [[Master/slave (technology)|Manager/workers parallelism]], introduced around the early 2000s, is particularly efficient for large tasks on small [[Computer cluster|clusters]], for example, the [[Bootstrapping (statistics)|bootstrap method]] and [[Monte Carlo method|Monte Carlo simulation]] in applied statistics, since the [[Independent and identically distributed random variables|i.i.d.]] assumption is common in most [[Statistics|statistical analyses]]. In particular, task pull parallelism gives Rmpi better performance in heterogeneous computing environments (see the sketch just after this list).
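A minimal manager/workers sketch using Rmpi in the task-pull style is given below; the worker count, the bootstrap task, and the variable names are illustrative assumptions rather than Rmpi's documented examples.

<syntaxhighlight lang="r">
## Minimal manager/workers sketch with Rmpi (illustrative assumptions:
## 4 workers, a bootstrap task). The manager farms i.i.d. replicates out
## to the workers; mpi.applyLB hands tasks to workers as they free up,
## i.e. task pull with load balancing.
library(Rmpi)
mpi.spawn.Rslaves(nslaves = 4)

x <- rnorm(100)              # data held by the manager
mpi.bcast.Robj2slave(x)      # ship the data to every worker

## Each task i draws one bootstrap resample and returns its mean.
boot.means <- unlist(mpi.applyLB(1:1000,
                                 function(i) mean(sample(x, replace = TRUE))))
print(quantile(boot.means, c(0.025, 0.975)))  # 95% bootstrap interval

mpi.close.Rslaves()
mpi.quit()
</syntaxhighlight>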
The idea of [[SPMD|SPMD parallelism]] is to let every processor do the same work, but on different parts of a large data set. For example, a modern [[Graphics processing unit|GPU]] is a large collection of slower co-processors that each apply the same computation to different parts of the data, yet together reach the final solution efficiently.
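To make this concrete, below is a minimal SPMD sketch using pbdMPI; the script name, launch command, and simulated data <code>x.local</code> are illustrative assumptions, not pbdR's own demo code.

<syntaxhighlight lang="r">
## Minimal SPMD sketch with pbdMPI: every rank runs this same script on
## its own chunk of data (a hypothetical example, not pbdR's demo code).
## Launch with, e.g.:  mpiexec -np 4 Rscript spmd_mean.r
library(pbdMPI)
init()

## Each rank simulates its own local part of the data set.
set.seed(comm.rank())
x.local <- rnorm(1000)

## Reduce per-rank partial sums into a global mean; afterwards every
## rank owns the same result, in keeping with the SPMD model.
n.total   <- allreduce(length(x.local), op = "sum")
sum.total <- allreduce(sum(x.local), op = "sum")
comm.print(sum.total / n.total)

finalize()
</syntaxhighlight>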
* is '''not''' like the Rmpi, snow, snowfall, do-like, '''or''' parallel packages in R,
* does '''not''' focus on interactive computing '''or''' the manager/workers model,