{{Multiple issues|
{{notability|date=June 2013}}
{{COI|date=June 2013}}
}}
{{Infobox programming language
| name =
| logo =
| paradigm = [[SPMD]] and [[MPMD]]
|
| designer = Wei-Chen Chen, George Ostrouchov, Pragneshkumar Patel, and Drew Schmidt
| developer = pbdR Core Team
| latest_test_version = Through [[GitHub]] at [
| typing = [[dynamic typing|Dynamic]]
| influenced_by = [[R (programming language)|R]], [[C (programming language)|C]], [[
| operating_system = [[Cross-platform]]
| license = [[General Public License]] and [[Mozilla Public License]]
| website =
}}
'''Programming with Big Data in R''' (pbdR)<ref>{{cite web|author=Ostrouchov, G., Chen, W.-C., Schmidt, D., Patel, P.|title=Programming with Big Data in R|year=2012|url=http://r-pbd.org}}</ref> is a series of [[R (programming language)|R]] packages and an environment for [[statistical computing]] with [[
Two main implementations in [[R (programming language)|R]] using [[Message Passing Interface|MPI]] are Rmpi<ref name=rmpi>{{cite journal|author=Yu, H.|title=Rmpi: Parallel Statistical Computing in R|year=2002|url=
* The pbdR built on pbdMPI uses [[SPMD|SPMD parallelism]] where every
* The Rmpi<ref name=rmpi/> uses [[Master/slave (technology)|manager/workers parallelism]] where one main processor (manager)
The idea of [[SPMD|SPMD parallelism]] is to let every
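The SPMD idea can be illustrated without MPI by a serial simulation in base R. The sketch below is not pbdR code: the names <code>n_ranks</code>, <code>rank_chunks</code>, and <code>local_sum</code> are hypothetical, and the final <code>sum()</code> only stands in for the collective reduction that an MPI program would perform across processes.

```r
# Serial simulation of SPMD-style summation; no MPI is involved.
# n_ranks, rank_chunks, and local_sum are illustrative names, not pbdR API.
n_ranks <- 4
x <- as.numeric(1:100)

# Give each simulated rank its own contiguous block of the data.
rank_id <- rep(seq_len(n_ranks), each = length(x) %/% n_ranks)
rank_chunks <- split(x, rank_id)

# Every rank runs the same function on its own chunk (the "single program").
local_sum <- function(chunk) sum(chunk)
partial <- vapply(rank_chunks, local_sum, numeric(1))

# Stand-in for an all-reduce: combine the partial results into the global
# total that every rank would hold after the collective call.
global_total <- sum(partial)
print(global_total)
```

In a manager/workers design, by contrast, only the manager would hold <code>x</code> and hand the chunks out; in the SPMD style each process would already own its chunk.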
== Package design ==
{| class="wikitable"
|-
! General !! I/O !! Computation !! Application !! Profiling !! Client/Server
|-
| pbdDEMO || pbdNCDF4 || pbdDMAT || pmclust || pbdPROF || pbdZMQ
|-
| pbdMPI || pbdADIOS || pbdBASE ||
|-
| ||
|-
| || || kazaam || || || pbdRPC
|}
[[File:Pbd overview.png|thumb|The image shows how the various pbdR packages are related.]]
Among these packages, pbdMPI provides wrapper functions to the [[Message Passing Interface|MPI]] library, and it also produces a [[Library (computing)|shared library]] and a configuration file for MPI environments. All other packages rely on this configuration for installation and library loading that
* pbdMPI --- an efficient interface to MPI (either [[Open MPI|OpenMPI]] or [[MPICH2]]) with a focus on the Single Program/Multiple Data ([[SPMD]]) parallel programming style
* pbdSLAP --- bundles scalable dense linear algebra libraries in double precision for R, based on [[ScaLAPACK]] version 2.0.2 which includes several scalable linear algebra packages (namely [[BLACS]], [[PBLAS]], and [[ScaLAPACK]]).
* pbdNCDF4 ---
* pbdBASE --- low-level [[ScaLAPACK]] codes and wrappers
* pbdDMAT --- distributed matrix classes and computational methods, with a focus on linear algebra and statistics
* pbdDEMO --- set of package demonstrations and examples, and this unifying vignette
*
*
* pbdZMQ --- interface to [[ZeroMQ|ØMQ]]
* remoter --- R client with remote R servers
* pbdCS --- pbdR client with remote pbdR servers
* pbdRPC --- remote procedure call
* kazaam --- very tall and skinny distributed matrices
* pbdML --- machine learning toolbox
== Examples ==
=== Example 1 ===
Hello World! Save the following code in a file called "demo.r"
<syntaxhighlight lang="r">
### Initial MPI
library(pbdMPI, quiet = TRUE)
Line 68 ⟶ 72:
### Finish
finalize()
</syntaxhighlight>
and use the command
<syntaxhighlight lang="bash">
mpiexec -np 2 Rscript demo.r
</syntaxhighlight>
to execute the code, where [[R (programming language)|Rscript]] is a command-line executable program.
=== Example 2 ===
The following example, modified from pbdMPI, illustrates the basic [[programming language syntax|syntax]] of pbdR.
Because pbdR is designed around [[SPMD]], all R scripts are stored in files and executed from the command line via mpiexec, mpirun, etc. Save the following code in a file called "demo.r"
<syntaxhighlight lang="r">
### Initial MPI
library(pbdMPI, quiet = TRUE)
Line 97 ⟶ 101:
### Finish
finalize()
</syntaxhighlight>
and use the command
<syntaxhighlight lang="bash">
mpiexec -np 4 Rscript demo.r
</syntaxhighlight>
to execute the code, where [[R (programming language)|Rscript]] is a command-line executable program.
=== Example 3 ===
The following example, modified from pbdDEMO, illustrates basic ddmatrix computation in pbdR by performing a [[singular value decomposition]] on a given matrix.
Save the following code in a file called "demo.r"
<syntaxhighlight lang="r">
# Initialize process grid
library(pbdDMAT, quiet = TRUE)
Line 126 ⟶ 130:
# Finish
finalize()
</syntaxhighlight>
and use the command
<syntaxhighlight lang="bash">
mpiexec -np 2 Rscript demo.r
</syntaxhighlight>
to execute the code, where [[R (programming language)|Rscript]] is a command-line executable program.
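For comparison, the same decomposition can be sketched serially with base R's <code>svd()</code> on an ordinary matrix. This is only an illustration of what a singular value decomposition returns, not the pbdDMAT code from the example above; the matrix size and the seed are arbitrary assumptions.

```r
# Serial base-R sketch of a singular value decomposition; not pbdDMAT code.
set.seed(25)                                   # arbitrary seed for repeatability
x <- matrix(rnorm(6 * 4), nrow = 6, ncol = 4)  # hypothetical 6 x 4 test matrix

# svd() returns a list with the singular values d and the factors u and v.
s <- svd(x)

# Rebuild the matrix from its factors: x = U diag(d) t(V).
x_rebuilt <- s$u %*% diag(s$d) %*% t(s$v)
max_error <- max(abs(x - x_rebuilt))
print(max_error < 1e-10)
```

The reconstruction check is a convenient sanity test: the factors should reproduce the input up to floating-point rounding.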
== Further reading ==
* {{cite
* {{cite
* {{cite
* {{cite web|title=High-Performance and Parallel Computing with R|author=Dirk Eddelbuettel|date=13 November 2022 |url=
* {{cite news|title=R at 12,000 Cores|url=http://www.r-bloggers.com/r-at-12000-cores/}}<br />This article was read 22,584 times in 2012 after it was posted on October 16, 2012, and ranked number 3 among the most-read R posts of 2012 on R-bloggers.<ref>{{cite news|url=http://www.r-bloggers.com/100-most-read-r-posts-for-2012-stats-from-r-bloggers-big-data-visualization-data-manipulation-and-other-languages/|title=100 most read R posts in 2012 (stats from R-bloggers) – big data, visualization, data manipulation, and other languages}}</ref>
* {{cite web|url=http://rwiki.sciviews.org/doku.php?id=developers:projects:gsoc2013:mpiprofiler|archive-url=https://archive.today/20130629095333/http://rwiki.sciviews.org/doku.php?id=developers:projects:gsoc2013:mpiprofiler|url-status=dead|archive-date=2013-06-29|title=Profiling Tools for Parallel Computing with R|author=Google Summer of Code - R 2013}}
* {{cite web|url=http://rpubs.com/wush978/pbdMPI-linux-pilot|title=在雲端運算環境使用R和MPI|trans-title=Using R and MPI in a cloud computing environment|language=zh|author=Wush Wu|year=2014}}
* {{cite web|url=
== References ==
{{Reflist|30em}}
== External links ==
* {{Official website|www.r-pbd.org}}
{{DEFAULTSORT:PbdR}}
[[Category:Parallel computing]]
[[Category:Cross-platform free software]]
[[Category:
[[Category:Data-centric programming languages]]
[[Category:Free statistical software]]
[[Category:Numerical analysis software for Linux]]
[[Category:Numerical analysis software for
[[Category:Numerical analysis software for Windows]]
[[Category: