| website = [http://www.r-pbd.org r-pbd.org]
}}
'''Programming with Big Data in R''' (pbdR)<ref>{{cite web|author=Ostrouchov, G., Chen, W.-C., Schmidt, D., Patel, P.|title=Programming with Big Data in R|year=2012|url=http://r-pbd.org/}}</ref><ref>{{cite web|title=XSEDE|url=https://portal.xsede.org/knowledge-base/-/kb/document/bcrw}}</ref> is a
* pbdR, built on [http://cran.r-project.org/package=pbdMPI pbdMPI], uses [[SPMD|SPMD Parallelism]], in which every processor is treated as a worker and owns part of the data. This form of parallelism is particularly suited to large data, for example performing [[Singular value decomposition|singular value decomposition]] on a large matrix or [[Mixture model|clustering analysis]] on large, high-dimensional data. Nothing, however, prevents the use of [[Master/slave (technology)|Manager/Workers Parallelism]] within an [[SPMD|SPMD Parallelism]] environment.
* [http://cran.r-project.org/package=Rmpi Rmpi]<ref name=rmpi/> uses [[Master/slave (technology)|Manager/Workers Parallelism]], in which one main processor (the manager) controls all the other processors (the workers). This form of parallelism is particularly efficient for large tasks on small [[Computer cluster|clusters]], for example the [[Bootstrapping (statistics)|bootstrap method]] and [[Monte Carlo method|Monte Carlo simulation]] in applied statistics, since the [[Independent and identically distributed random variables|i.i.d.]] assumption is commonly made in [[Statistics|statistical analysis]].
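The difference between the two styles can be illustrated with a small, language-neutral sketch. The Python code below is only a single-process simulation written for this comparison: the function names are hypothetical, no MPI library is involved, and it does not show the pbdMPI or Rmpi interfaces. In the SPMD version every simulated rank runs the same code on its own share of the data and ends up holding the combined result; in the manager/workers version one rank splits the work, the others compute, and only the manager holds the result.

```python
# Illustrative simulation only: ranks are looped over inside one process.
# No MPI is used, and these function names are hypothetical.

def spmd_sum_of_squares(data, n_ranks):
    """SPMD style: every rank runs the same code on its own share of the data."""
    partials = []
    for rank in range(n_ranks):
        local = data[rank::n_ranks]                  # this rank's share of the data
        partials.append(sum(x * x for x in local))   # identical work on every rank
    # Stand-in for an all-reduce: every rank ends up holding the same total.
    total = sum(partials)
    return [total] * n_ranks


def manager_workers_sum_of_squares(data, n_workers):
    """Manager/workers style: one manager splits tasks and combines results."""
    tasks = [data[i::n_workers] for i in range(n_workers)]    # manager splits the work
    results = [sum(x * x for x in task) for task in tasks]    # each worker computes
    return sum(results)                                       # only the manager combines


if __name__ == "__main__":
    data = list(range(10))
    print(spmd_sum_of_squares(data, 4))
    print(manager_workers_sum_of_squares(data, 3))
```

In a real SPMD program the loop over ranks disappears: each process executes the body once with its own rank, and the final combination is a collective operation across processes.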
* [http://cran.r-project.org/web/views/HighPerformanceComputing.html High-Performance and Parallel Computing with R].<ref>{{cite web|title=High-Performance and Parallel Computing with R|author=Dirk Eddelbuettel|url=http://cran.r-project.org/web/views/HighPerformanceComputing.html}}</ref>
* [http://userpages.umbc.edu/~gobbert/papers/pbdRtara2013.pdf UMBC HPCF Technical Report by Raim, A.M. (2013)].<ref>{{cite journal|author=Raim, A.M.|year=2013|title= Introduction to distributed computing with pbdR at the UMBC High Performance Computing Facility|journal= Technical Report HPCF-2013-2|url=http://userpages.umbc.edu/~gobbert/papers/pbdRtara2013.pdf}}</ref>
* [http://www.r-bloggers.com/r-at-12000-cores/ R at 12,000 Cores].<ref>{{cite news|title=R at 12,000 Cores|url=http://www.r-bloggers.com/r-at-12000-cores/}}</ref> This article was read 22,584 times in 2012 after it was posted on October 16, 2012, and ranked number 3 according to [http://www.r-bloggers.com/100-most-read-r-posts-for-2012-stats-from-r-bloggers-big-data-visualization-data-manipulation-and-other-languages/ Top 100 R posts of 2012]<ref>{{cite
== External links ==