Revision as of 23:39, 5 August 2022 by Comp.arch (no edit summary) ← Previous edit		Revision as of 02:03, 24 August 2022 by Citation bot (Alter: title, pages. Add: website. Formatted dashes. Suggested by BrownHairedGirl, #UCB_webform 2157/3722)
Line 111:
; Netlib CBLAS: Reference [[C (programming language)\|C]] interface to the BLAS. It is also possible (and popular) to call the Fortran BLAS from C.<ref>{{Cite web\|url=http://www.netlib.org/blas\|title=BLAS (Basic Linear Algebra Subprograms)\|website=www.netlib.org\|access-date=2017-07-07}}</ref>
; [[OpenBLAS]]: Optimized BLAS based on GotoBLAS, supporting [[x86]], [[x86-64]], [[MIPS architecture\|MIPS]] and [[ARM architecture family\|ARM]] processors.<ref>{{Cite web\|url=http://www.openblas.net/\|title=OpenBLAS : An optimized BLAS library\|website=www.openblas.net\|access-date=2017-07-07}}</ref>
; PDLIB/SX: [[NEC Corporation\|NEC]]'s Public Domain Mathematical Library for the NEC [[NEC SX architecture\|SX-4]] system.<ref name=":0">{{cite web \|url=http://www.nec.co.jp/hpc/mediator/sxm_e/software/61.html \|title=PDLIB/SX: Business Solution \| NEC \|access-date=2007-05-20 \|url-status=dead \|archive-url=https://web.archive.org/web/20070222154031/http://www.nec.co.jp/hpc/mediator/sxm_e/software/61.html \|archive-date=2007-02-22 }}</ref>
; rocBLAS: Implementation that runs on [[AMD]] GPUs via [[ROCm]].<ref>{{Cite web\|url=https://rocmdocs.amd.com/en/latest/ROCm_Tools/rocblas.html\|title=rocBLAS\|website=rocmdocs.amd.com\|access-date=2021-05-21}}</ref>
; SCSL: [[Silicon Graphics\|SGI]]'s Scientific Computing Software Library contains BLAS and LAPACK implementations for SGI's [[Irix]] workstations.<ref>{{cite web \|url=http://www.sgi.com/products/software/scsl.html \|title=SGI - SCSL Scientific Library: Home Page \|access-date=2007-05-20 \|url-status=dead \|archive-url=https://web.archive.org/web/20070513173030/http://www.sgi.com/products/software/scsl.html \|archive-date=2007-05-13 }}</ref>
; Sun Performance Library: Optimized BLAS and LAPACK for [[SPARC]], [[Intel Core\|Core]] and [[AMD64]] architectures under Solaris 8, 9, and 10 as well as Linux.<ref>{{Cite web\|url=http://www.oracle.com/technetwork/server-storage/solarisstudio/overview/index.html\|title=Oracle Developer Studio\|website=www.oracle.com\|access-date=2017-07-07}}</ref>
; uBLAS: A generic [[C++]] template class library providing BLAS functionality. Part of the [[Boost library]]. It provides bindings to many hardware-accelerated libraries in a unifying notation. Moreover, uBLAS focuses on correctness of the algorithms using advanced C++ features.<ref>{{Cite web\|url=http://www.boost.org/doc/libs/1_60_0/libs/numeric/ublas/doc/index.html\|title=Boost Basic Linear Algebra - 1.60.0\|website=www.boost.org\|access-date=2017-07-07}}</ref>
Line 121:
; Armadillo: [[Armadillo (C++ library)\|Armadillo]] is a C++ linear algebra library aiming towards a good balance between speed and ease of use. It employs template classes, and has optional links to BLAS/ATLAS and LAPACK. It is sponsored by [[NICTA]] (in Australia) and is licensed under a free license.<ref>{{Cite web\|url=http://arma.sourceforge.net/\|title=Armadillo: C++ linear algebra library\|website=arma.sourceforge.net\|access-date=2017-07-07}}</ref>
; [[LAPACK]]: LAPACK is a higher-level linear algebra library built upon BLAS. As with BLAS, a reference implementation exists, along with many alternatives such as libFlame and MKL.
; Mir: An [[LLVM]]-accelerated generic numerical library for science and machine learning written in [[D (programming language)\|D]]. It provides generic linear algebra subprograms (GLAS). It can be built on a CBLAS implementation.<ref>{{Cite web\|url=https://github.com/libmir\|title= Dlang Numerical and System Libraries\|website= [[GitHub]]}}</ref>
==Similar libraries (not compatible with BLAS)==
Line 134:
==Batched BLAS==
The traditional BLAS functions have also been ported to architectures that support large amounts of parallelism, such as [[GPUs]]. There, they typically provide good performance for large matrices.
However, when computing, e.g., matrix-matrix products of many small matrices with the GEMM routine, those architectures show significant performance losses. To address this issue, a batched version of the BLAS functions was specified in 2017.<ref name="dongarra17">{{cite journal \|last1=Dongarra \|first1=Jack \|last2=Hammarling \|first2=Sven \|last3=Higham \|first3=Nicholas J. \|last4=Relton \|first4=Samuel D. \|last5=Valero-Lara \|first5=Pedro \|last6=Zounon \|first6=Mawussi \|title=The Design and Performance of Batched BLAS on Modern High-Performance Computing Systems \|journal=Procedia Computer Science \|volume=108 \|pages=495–504 \|date=2017 \|doi=10.1016/j.procs.2017.05.138}}</ref> Taking the GEMM routine from above as an example, the batched version performs the following computation simultaneously for many matrices:
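Assuming the usual GEMM update <math>\mathbf{C} \leftarrow \alpha \mathbf{A} \mathbf{B} + \beta \mathbf{C}</math>, the batched form can be read as applying that update to every index <math>k</math> of the batch: <math>\mathbf{C}[k] \leftarrow \alpha \mathbf{A}[k] \mathbf{B}[k] + \beta \mathbf{C}[k]</math>. The following is a minimal pure-Python sketch of those semantics only. The names <code>gemm</code> and <code>batched_gemm</code> are hypothetical and do not correspond to any particular library's interface, and the loop over the batch runs sequentially, whereas a real batched BLAS would process the matrices in parallel.

```python
def gemm(alpha, a, b, beta, c):
    """Return alpha * (a @ b) + beta * c for matrices stored as lists of rows."""
    rows, inner, cols = len(a), len(b), len(b[0])
    return [
        [
            alpha * sum(a[i][p] * b[p][j] for p in range(inner)) + beta * c[i][j]
            for j in range(cols)
        ]
        for i in range(rows)
    ]


def batched_gemm(alpha, a_batch, b_batch, beta, c_batch):
    """Apply the GEMM update independently to each triple (A[k], B[k], C[k]).

    Hypothetical helper for illustration; the iterations do not depend on
    each other, which is what lets real implementations run them in parallel.
    """
    if not (len(a_batch) == len(b_batch) == len(c_batch)):
        raise ValueError("all batches must contain the same number of matrices")
    return [gemm(alpha, a, b, beta, c) for a, b, c in zip(a_batch, b_batch, c_batch)]


# Two independent 2x2 problems sharing the same alpha and beta.
a_batch = [[[1.0, 2.0], [3.0, 4.0]], [[0.0, 1.0], [1.0, 0.0]]]
b_batch = [[[1.0, 0.0], [0.0, 1.0]], [[5.0, 6.0], [7.0, 8.0]]]
c_batch = [[[1.0, 1.0], [1.0, 1.0]], [[0.0, 0.0], [0.0, 0.0]]]
result = batched_gemm(2.0, a_batch, b_batch, 0.5, c_batch)
```

Sharing one <code>alpha</code> and <code>beta</code> across the batch is a simplification made here for brevity; the sketch does not cover batches with differing matrix sizes or per-problem scalars.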

Basic Linear Algebra Subprograms: Difference between revisions