Content deleted Content added
m →OpenAcc |
|||
Line 70:
\end{pmatrix},\quad C_{ij}=\sum_{k=1}^m A_{ik}B_{kj}</math>
To compute each element in {{math|'''C'''}} takes {{math|''m''}} multiplications and {{math|(''m'' – ''1'')}} additions. Therefore, with a CPU implementation, the time complexity to achieve this computation is ''Θ(n''<sup href="Category:GPGPU">''3''</sup>'')'' in the following C example''.'' However, we have known that elements in {{math|'''C'''}} are independent
// MxM matrix multiplication in C
void matrixMul(
Line 115:
=== Multidimensional Convolution (M-D Convolution) ===
Convolution is a frequently used operation in DSP. To compute the 2-D convolution of two ''m'' × ''m'' signals, it requires {{math|''m''<sup>''2''</sup>}} multiplications and {{math|''m'' × (''m'' – ''1'')}} additions for an output element. That is, the overall time complexity is ''Θ(n''<sup href="Category:GPGPU">''4''</sup>'')'' for the entire output signal''.'' As the following OpenCL example shows, with GPGPU acceleration, the total computation time effectively decreases to ''Θ(n''<sup href="Category:GPGPU">''2''</sup>'')'' since all output elements are data independent.
2-D convolution equation:
<math>y(n_1, n_2)=x(n_1,n_2)**h(n_1,n_2)=\sum_{k_1=0}^{m-1}\sum_{k_2=0}^{m-1}x(k_1, k_2)h(n_1-k_1, n_2-k_2)</math><syntaxhighlight lang="c++" line="1">
Line 144 ⟶ 146:
Note that, although the example demonstrated above is a 2-D convolution, a similar approach can be adopted for a higher dimension system. Overall, for a s-D convolution, a GPGPU implementation has time complexity ''Θ(n''<sup href="Category:GPGPU">''s''</sup>'')'', whereas a CPU implementation has time complexity ''Θ(n''<sup href="Category:GPGPU">''2s''</sup>'')''.
M-D convolution equation:
<math>y(n_1,n_2,...,n_s)=x(n_1,n_2,...,n_s)**h(n_1,n_2,...,n_s)=\sum_{k_1=0}^{m_1-1}\sum_{k_2=0}^{m_2-1}...\sum_{k_s=0}^{m_s-1}x(k_1, k_2,...,k_s)h(n_1-k_1,n_2-k_2,...,n_s-k_s)</math>
=== Multidimensional Discrete Time Fourier Transform (M-D DTFT) ===
In addition to convolution, the [[Fourier transform|discrete
<math>X(\Omega_1,\Omega_2,...,\Omega_s)=\sum_{n_1=0}^{m_1-1}\sum_{n_2=0}^{m_2-1}...\sum_{n_s=0}^{m_s-1}x(n_1, n_2,...,n_s)e^{-j(\Omega_1n_1+\Omega_1n_1+...+\Omega_sn_s)}</math>
Practically, to implement an M-D DTFT
// DTFT in OpenCL
__kernel void convolution(
|