Multidimensional DSP with GPU acceleration: Difference between revisions

Sing0512 (talk | contribs)
Line 70:
\end{pmatrix},\quad C_{ij}=\sum_{k=1}^m A_{ik}B_{kj}</math>
 
Computing each element in {{math|'''C'''}} takes {{math|''m''}} multiplications and {{math|(''m'' – ''1'')}} additions. Therefore, a CPU implementation has time complexity ''Θ(n''<sup>''3''</sup>'')'', as in the following C example. However, the elements in {{math|'''C'''}} are independent of each other, so the computation can be fully parallelized by SIMD processors, such as GPGPU devices. With a GPGPU implementation, the time complexity is reduced to ''Θ(n)'' by unrolling the for-loop, as shown in the following OpenCL example.<syntaxhighlight lang="c" line="1">
// MxM matrix multiplication in C
void matrixMul(
Line 115:
 
=== Multidimensional Convolution (M-D Convolution) ===
Convolution is a frequently used operation in DSP. Computing one output element of the 2-D convolution of two ''m'' × ''m'' signals requires {{math|''m''<sup>2</sup>}} multiplications and {{math|''m''<sup>2</sup> – 1}} additions. That is, the overall time complexity is ''Θ(n''<sup>''4''</sup>'')'' for the entire output signal. As the following OpenCL example shows, with GPGPU acceleration the total computation time effectively decreases to ''Θ(n''<sup>''2''</sup>'')'', since all output elements are data independent.
 
2-D convolution equation:
 
<math>y(n_1, n_2)=x(n_1,n_2)**h(n_1,n_2)=\sum_{k_1=0}^{m-1}\sum_{k_2=0}^{m-1}x(k_1, k_2)h(n_1-k_1, n_2-k_2)</math><syntaxhighlight lang="c++" line="1">
Line 144 ⟶ 146:
 
Note that, although the example demonstrated above is a 2-D convolution, a similar approach can be adopted for higher-dimensional systems. Overall, for an ''s''-D convolution, a GPGPU implementation has time complexity ''Θ(n''<sup>''s''</sup>'')'', whereas a CPU implementation has time complexity ''Θ(n''<sup>''2s''</sup>'')''.
 
M-D convolution equation:
 
<math>y(n_1,n_2,...,n_s)=x(n_1,n_2,...,n_s)**h(n_1,n_2,...,n_s)=\sum_{k_1=0}^{m_1-1}\sum_{k_2=0}^{m_2-1}...\sum_{k_s=0}^{m_s-1}x(k_1, k_2,...,k_s)h(n_1-k_1,n_2-k_2,...,n_s-k_s)</math>
 
=== Multidimensional Discrete Time Fourier Transform (M-D DTFT) ===
In addition to convolution, the [[Fourier transform|discrete-time Fourier transform (DTFT)]] is another technique often used in system analysis.
 
<math>X(\Omega_1,\Omega_2,...,\Omega_s)=\sum_{n_1=0}^{m_1-1}\sum_{n_2=0}^{m_2-1}...\sum_{n_s=0}^{m_s-1}x(n_1, n_2,...,n_s)e^{-j(\Omega_1n_1+\Omega_2n_2+...+\Omega_sn_s)}</math>
 
Practically, to implement an M-D DTFT, assuming the system is separable, we can perform ''s'' 1-D DTFTs together with matrix transposes, one along each dimension. With a 1-D DTFT operation, GPGPU can conceptually reduce the complexity from ''Θ(n''<sup>''2''</sup>'')'' to ''Θ(n)'', as illustrated by the following OpenCL implementation. That is, an M-D DTFT can be computed on a GPU with a complexity of ''Θ(n''<sup>''2''</sup>'')''. While some GPGPUs are also equipped with internal hardware FFT accelerators, this implementation might also be optimized by invoking the FFT APIs or libraries provided by GPU manufacturers.<ref>{{cite web|title = OpenCL™ Optimization Case Study Fast Fourier Transform – Part II – AMD|url = http://developer.amd.com/resources/documentation-articles/articles-whitepapers/opencl-optimization-case-study-fast-fourier-transform-part-ii/|website = AMD|accessdate = 2015-11-05|language = en-US}}</ref><syntaxhighlight lang="c++" line="1">
// DTFT in OpenCL
__kernel void convolution(