mNo edit summary |
Citation bot (talk | contribs) Alter: title, pages. Add: website. Formatted dashes. | Suggested by BrownHairedGirl | #UCB_webform 2157/3722 |
Line 111:
; Netlib CBLAS: Reference [[C (programming language)|C]] interface to the BLAS. It is also possible (and popular) to call the Fortran BLAS from C.<ref>{{Cite web|url=http://www.netlib.org/blas|title=BLAS (Basic Linear Algebra Subprograms)|website=www.netlib.org|access-date=2017-07-07}}</ref>
; [[OpenBLAS]]: Optimized BLAS based on GotoBLAS, supporting [[x86]], [[x86-64]], [[MIPS architecture|MIPS]] and [[ARM architecture family|ARM]] processors.<ref>{{Cite web|url=http://www.openblas.net/|title=OpenBLAS : An optimized BLAS library|website=www.openblas.net|access-date=2017-07-07}}</ref>
; PDLIB/SX: [[NEC Corporation|NEC]]'s Public Domain Mathematical Library for the NEC [[NEC SX architecture|SX-4]] system.<ref name=":0">{{cite web |url=http://www.nec.co.jp/hpc/mediator/sxm_e/software/61.html |title=
; rocBLAS: Implementation that runs on [[AMD]] GPUs via [[ROCm]].<ref>{{Cite web|url=https://rocmdocs.amd.com/en/latest/ROCm_Tools/rocblas.html|title=rocBLAS|website=rocmdocs.amd.com|access-date=2021-05-21}}</ref>
; SCSL
: [[Silicon Graphics|SGI]]'s Scientific Computing Software Library contains BLAS and LAPACK implementations for SGI's [[Irix]] workstations.<ref>{{cite web |url=http://www.sgi.com/products/software/scsl.html |title=
; Sun Performance Library: Optimized BLAS and LAPACK for [[SPARC]], [[Intel Core|Core]] and [[AMD64]] architectures under Solaris 8, 9, and 10 as well as Linux.<ref>{{Cite web|url=http://www.oracle.com/technetwork/server-storage/solarisstudio/overview/index.html|title=Oracle Developer Studio|website=www.oracle.com|access-date=2017-07-07}}</ref>
; uBLAS: A generic [[C++]] template class library providing BLAS functionality, part of the [[Boost library]]. It provides bindings to many hardware-accelerated libraries in a unified notation and emphasizes correctness of the algorithms using advanced C++ features.<ref>{{Cite web|url=http://www.boost.org/doc/libs/1_60_0/libs/numeric/ublas/doc/index.html|title=Boost Basic Linear Algebra - 1.60.0|website=www.boost.org|access-date=2017-07-07}}</ref>
Line 121:
; Armadillo: [[Armadillo (C++ library)|Armadillo]] is a C++ linear algebra library aiming towards a good balance between speed and ease of use. It employs template classes, and has optional links to BLAS/ATLAS and LAPACK. It is sponsored by [[NICTA]] (in Australia) and is licensed under a free license.<ref>{{Cite web|url=http://arma.sourceforge.net/|title=Armadillo: C++ linear algebra library|website=arma.sourceforge.net|access-date=2017-07-07}}</ref>
; [[LAPACK]]: LAPACK is a higher-level linear algebra library built upon BLAS. As with BLAS, a reference implementation exists, along with many alternatives such as libFlame and MKL.
; Mir: An [[LLVM]]-accelerated generic numerical library for science and machine learning written in [[D (programming language)|D]]. It provides generic linear algebra subprograms (GLAS). It can be built on a CBLAS implementation.<ref>{{Cite web|url=https://github.com/libmir|title= Dlang Numerical and System Libraries|website= [[GitHub]]}}</ref>
==Similar libraries (not compatible with BLAS)==
Line 134:
==Batched BLAS==
The traditional BLAS functions have also been ported to architectures that support large amounts of parallelism, such as [[GPUs]]. On these architectures, the traditional BLAS functions typically perform well for large matrices. However, when computing, for example, matrix-matrix products of many small matrices with the GEMM routine, those architectures show significant performance losses. To address this issue, a batched version of the BLAS functions was specified in 2017.<ref name="dongarra17">{{cite journal |last1=Dongarra |first1=Jack |last2=Hammarling |first2=Sven |last3=Higham |first3=Nicholas J. |last4=Relton |first4=Samuel D. |last5=Valero-Lara |first5=Pedro |last6=Zounon |first6=Mawussi |title=The Design and Performance of Batched BLAS on Modern High-Performance Computing Systems |journal=Procedia Computer Science |volume=108 |pages=
Taking the GEMM routine from above as an example, the batched version performs the following computation simultaneously for many matrices:
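The batched computation described above can be illustrated with a minimal sketch. It assumes the standard GEMM update <math>C \leftarrow \alpha A B + \beta C</math>, applied independently to every problem <math>k</math> in the batch, i.e. <math>C[k] \leftarrow \alpha A[k] B[k] + \beta C[k]</math>. The function names <code>gemm</code> and <code>gemm_batched</code> are hypothetical and are not the interface from the batched BLAS specification. The sketch also assumes a single <math>\alpha</math> and <math>\beta</math> shared by the whole batch, and it loops sequentially where a real implementation would dispatch the independent problems in parallel.

```python
def gemm(alpha, a, b, beta, c):
    """Return alpha * (a @ b) + beta * c for dense matrices stored as lists of rows."""
    rows, inner, cols = len(a), len(b), len(b[0])
    return [
        [
            alpha * sum(a[i][p] * b[p][j] for p in range(inner)) + beta * c[i][j]
            for j in range(cols)
        ]
        for i in range(rows)
    ]


def gemm_batched(alpha, a_batch, b_batch, beta, c_batch):
    """Apply the same GEMM update to every problem k in the batch.

    Each problem is independent of the others, which is what lets a
    parallel architecture process them at the same time. This
    illustrative version simply loops over them.
    """
    return [gemm(alpha, a, b, beta, c) for a, b, c in zip(a_batch, b_batch, c_batch)]


# Two independent 2x2 problems sharing alpha = 2 and beta = 1 (illustrative data).
a_batch = [[[1.0, 2.0], [3.0, 4.0]], [[1.0, 0.0], [0.0, 1.0]]]
b_batch = [[[5.0, 6.0], [7.0, 8.0]], [[2.0, 3.0], [4.0, 5.0]]]
c_batch = [[[1.0, 1.0], [1.0, 1.0]], [[0.0, 0.0], [0.0, 0.0]]]

result = gemm_batched(2.0, a_batch, b_batch, 1.0, c_batch)
print(result)
```

Because no problem in the batch depends on another, the outer loop can be distributed across threads or GPU blocks without synchronization. This independence is the property a batched interface is designed to exploit for many small matrices.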