The index <math>k</math> in square brackets indicates that the operation is performed for every matrix <math>k</math> in a stack. Often, this operation is implemented for a strided batched memory layout, in which the matrices of the stack are stored one after another at a fixed offset (the stride) in the arrays <math>A</math>, <math>B</math> and <math>C</math>.
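The strided batched layout can be illustrated with a minimal reference sketch. It assumes that the operation in question is the batched matrix multiplication <math>C[k] \leftarrow \alpha A[k] B[k] + \beta C[k]</math> and, for readability, that each matrix is stored row-major in a flat list (BLAS itself conventionally uses column-major storage). The function name <code>gemm_strided_batched</code> and its parameter order are hypothetical and do not correspond to any particular library's interface.

```python
def gemm_strided_batched(m, n, p, alpha, a, stride_a, b, stride_b,
                         beta, c, stride_c, batch_count):
    """Reference loop for C[k] <- alpha * A[k] @ B[k] + beta * C[k].

    A[k] is m x p, B[k] is p x n, C[k] is m x n. All matrices are stored
    row-major in the flat lists a, b, c (an assumption of this sketch);
    matrix k starts at offset k * stride in its list.
    """
    for k in range(batch_count):
        off_a, off_b, off_c = k * stride_a, k * stride_b, k * stride_c
        for i in range(m):
            for j in range(n):
                acc = 0.0
                for l in range(p):
                    acc += a[off_a + i * p + l] * b[off_b + l * n + j]
                idx = off_c + i * n + j
                c[idx] = alpha * acc + beta * c[idx]
    return c


# Two 2 x 2 problems stored back to back (stride 4 = 2 * 2 elements).
a = [1.0, 2.0, 3.0, 4.0,   0.0, 1.0, 1.0, 0.0]
b = [1.0, 0.0, 0.0, 1.0,   5.0, 6.0, 7.0, 8.0]
c = [0.0] * 8
gemm_strided_batched(2, 2, 2, 1.0, a, 4, b, 4, 0.0, c, 4, 2)
# c now holds A[0] @ B[0] followed by A[1] @ B[1].
```

An optimized implementation would process the independent problems <math>k</math> concurrently; the sequential outer loop here only makes the addressing explicit.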
Batched BLAS functions are a versatile tool and allow, for example, fast implementations of [[exponential integrators]] and [[Magnus integrators]] that handle long integration periods with many time steps.<ref name="herb21">{{cite journal |last1=Herb |first1=Konstantin |last2=Welter |first2=Pol |title=Parallel time integration using Batched BLAS (Basic Linear Algebra Subprograms) routines |journal=Computer Physics Communications |volume=270 |pages=108181 |date=2022 |doi=10.1016/j.cpc.2021.108181 |arxiv=2108.07126|bibcode=2022CoPhC.27008181H |s2cid=237091802 }}</ref> Here, the [[matrix exponentiation]], the computationally expensive part of the integration, can be carried out in parallel for all time steps by using Batched BLAS functions.
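As a rough illustration of the idea, the following sketch computes the matrix exponentials for several time steps at once, expressing every multiplication as one batched product over the whole stack. It uses a truncated Taylor series purely for simplicity, which is not necessarily the method of the cited work; the helper names <code>batched_matmul</code> and <code>batched_expm_taylor</code> are hypothetical, and the pure-Python batched product merely stands in for a single Batched BLAS call.

```python
import math


def batched_matmul(xs, ys):
    """Return [xs[k] @ ys[k] for all k]; stands in for one batched GEMM call."""
    return [
        [[sum(x[i][l] * y[l][j] for l in range(len(y)))
          for j in range(len(y[0]))]
         for i in range(len(x))]
        for x, y in zip(xs, ys)
    ]


def batched_expm_taylor(mats, terms=20):
    """Approximate exp(mats[k]) for every k with a truncated Taylor series.

    Suitable only for matrices of small norm (an assumption of this sketch).
    """
    n = len(mats[0])
    identity = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    results = [[row[:] for row in identity] for _ in mats]
    term = [[row[:] for row in identity] for _ in mats]  # M^0 / 0!
    for order in range(1, terms):
        # term <- term @ M / order, for the whole stack in one batched product
        term = batched_matmul(term, mats)
        term = [[[v / order for v in row] for row in t] for t in term]
        results = [
            [[r + t for r, t in zip(r_row, t_row)]
             for r_row, t_row in zip(res, t)]
            for res, t in zip(results, term)
        ]
    return results


# One generator H scaled by several time-step sizes: exp(dt * H) is a rotation.
steps = [0.1, 0.2, 0.3]
stack = [[[0.0, dt], [-dt, 0.0]] for dt in steps]
propagators = batched_expm_taylor(stack)
```

For this generator, each result should agree with the rotation matrix with entries <math>\cos(\Delta t)</math> and <math>\pm\sin(\Delta t)</math>, which gives a simple check of the sketch.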
==See also==