Basic Linear Algebra Subprograms: Difference between revisions

 
==Batched BLAS==
The traditional BLAS functions have also been ported to architectures that support large amounts of parallelism, such as [[GPUs]]. There, the traditional BLAS functions typically provide good performance for large matrices. However, when computing, for example, matrix-matrix products of many small matrices with the GEMM routine, those architectures show significant performance losses. To address this issue, a batched version of the BLAS functions was specified in 2017.<ref name="dongarra17">{{cite journal |last1=Dongarra |first1=Jack |last2=Hammarling |first2=Sven |last3=Higham |first3=Nicholas J. |last4=Relton |first4=Samuel D. |last5=Valero-Lara |first5=Pedro |last6=Zounon |first6=Mawussi |title=The Design and Performance of Batched BLAS on Modern High-Performance Computing Systems |journal=Procedia Computer Science |volume=108 |pages=495–504 |date=2017 |doi=10.1016/j.procs.2017.05.138|doi-access=free }}</ref>
 
Taking the GEMM routine from above as an example, the batched version performs the following computation simultaneously for many matrices:
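
The idea can be illustrated with a minimal sketch in plain Python. It assumes the usual GEMM update, C ← αAB + βC, applied independently to each problem in the batch, with the scalars α and β shared by all problems. The names <code>gemm</code> and <code>batched_gemm</code> are hypothetical and are not part of any BLAS interface, and the sequential loop only stands in for the simultaneous execution that a real batched implementation would provide.

```python
def gemm(alpha, a, b, beta, c):
    """Return alpha * (a @ b) + beta * c for matrices stored as lists of rows.

    Hypothetical helper for illustration; not an actual BLAS routine.
    """
    rows, inner, cols = len(a), len(b), len(b[0])
    return [
        [
            alpha * sum(a[i][p] * b[p][j] for p in range(inner)) + beta * c[i][j]
            for j in range(cols)
        ]
        for i in range(rows)
    ]


def batched_gemm(alpha, a_batch, b_batch, beta, c_batch):
    """Apply the GEMM update independently to every problem in the batch.

    Assumption: alpha and beta are shared by all problems. The problems do
    not depend on each other, which is what lets a real implementation run
    them in parallel; here they are simply processed one after another.
    """
    return [
        gemm(alpha, a, b, beta, c)
        for a, b, c in zip(a_batch, b_batch, c_batch)
    ]


# A batch of two independent 2x2 problems.
a_batch = [[[1, 2], [3, 4]], [[1, 0], [0, 1]]]
b_batch = [[[5, 6], [7, 8]], [[2, 3], [4, 5]]]
c_batch = [[[1, 1], [1, 1]], [[0, 0], [0, 0]]]

result = batched_gemm(2, a_batch, b_batch, 1, c_batch)
print(result)
```

Because each problem in the batch reads and writes only its own matrices, the loop in <code>batched_gemm</code> has no dependencies between iterations, which is the property that batched implementations exploit on highly parallel hardware.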
* {{Citation |author=BLAST Forum |title=Basic Linear Algebra Subprograms Technical (BLAST) Forum Standard |date=21 August 2001 |publisher=University of Tennessee |location=Knoxville, TN }}
* {{Citation |last1=Dodson |first1= D. S. |last2=Grimes |first2=R. G. |title=Remark on algorithm 539: Basic Linear Algebra Subprograms for Fortran usage |journal=ACM Trans. Math. Softw. |volume=8 |issue= 4 |pages=403&ndash;404 |year=1982 |doi= 10.1145/356012.356020|s2cid= 43081631 }}
* {{Citation |last=Dodson |first=D. S. |title=Corrigendum: Remark on "Algorithm 539: Basic Linear Algebra Subroutines for FORTRAN usage" |journal=ACM Trans. Math. Softw. |volume=9 |page=140 |year=1983 |doi= 10.1145/356022.356032|s2cid=22163977 |doi-access=free }}
* J. J. Dongarra, J. Du Croz, S. Hammarling, and R. J. Hanson, Algorithm 656: An extended set of FORTRAN Basic Linear Algebra Subprograms, ACM Trans. Math. Softw., 14 (1988), pp.&nbsp;18&ndash;32.
* J. J. Dongarra, J. Du Croz, I. S. Duff, and S. Hammarling, A set of Level 3 Basic Linear Algebra Subprograms, ACM Trans. Math. Softw., 16 (1990), pp.&nbsp;1&ndash;17.