{{Short description|Mathematical operation in signal processing}}
In signal processing, '''multidimensional discrete convolution''' refers to the mathematical operation between two functions ''f'' and ''g'' on an ''n''-dimensional lattice that produces a third function, also of ''n''-dimensions. Multidimensional discrete convolution is the discrete analog of the [[convolution#Domain of definition|multidimensional convolution]] of functions on [[Euclidean space]].
==Definition==
===Problem statement and basics===
Similar to the one-dimensional case, an asterisk is used to represent the convolution operation. The number of dimensions in the given operation is reflected in the number of asterisks. For example, an ''M''-dimensional convolution would be written with ''M'' asterisks. The following represents an ''M''-dimensional convolution of discrete signals:
The resulting output region of support of a discrete multidimensional convolution will be determined based on the size and regions of support of the two input signals.[[File:Picture1_wiki.png|thumb|475px|Visualization of Convolution between two Simple Two-Dimensional Signals|none]]
Listed are several properties of the two-dimensional convolution operator.
'''''Commutative Property:'''''
'''''Distributive Property:'''''
<math>x**(h+g) = (x**h)+(x**g)</math>
These properties are seen in use in the figure below.
<math>w = x ** h</math>
<math>h_{eq} = h**g</math>
[[File:Cascaded.png|none|thumb|272x272px|Both figures represent cascaded systems.]]
A similar analysis can be done on a set of parallel systems illustrated below.
In this case, it is clear that:
The equivalent impulse responses in both cascaded systems and parallel systems can be generalized to systems with <math>N</math>-number of filters.<ref name=":4" />
===Motivation and applications===
Convolution in one dimension was a powerful discovery that allowed the input and output of a linear shift-invariant (LSI) system (see [[LTI system theory]]) to be easily related so long as the impulse response of the system was known. This notion carries over to multidimensional convolution as well: knowing the impulse response of a multidimensional filter allows the output to be computed directly from any input.
For example, consider an image that is sent over some wireless network subject to noise. Noise introduced by the channel or by the image sensor typically appears as high-frequency content, visible as arbitrary light and dark spots on the image. To remove this content, the image can be convolved with the impulse response of a low-pass filter, which, by the convolution theorem, is equivalent to multiplying the image's spectrum by the filter's frequency response.
[[File:Screen Shot 2015-11-11 at 11.18.23 PM.png|none|thumb|311x311px|Impulse Responses of Typical Multidimensional Low Pass Filters]]
In addition to filtering out spectral content, the multidimensional convolution can implement [[edge detection]] and smoothing. This once again is wholly dependent on the values of the impulse response that is used to convolve with the input image.
[[File:Screen Shot 2015-11-11 at 11.21.00 PM.png|none|thumb|Typical Impulse Responses for Edge Detection]]
In addition to image processing, multidimensional convolution can be implemented to enable a variety of other applications. Since filters are widespread in digital communication systems, any system that must transmit multidimensional data is assisted by filtering techniques. It is used in real-time video processing, neural network analysis, digital geophysical data analysis, and much more.<ref>{{Cite web|title = Digital Geophysical Analysis Redesign|url = http://www-rohan.sdsu.edu/~jiracek/DAGSAW/4.1.html|website = www-rohan.sdsu.edu}}</ref>
One typical distortion that occurs during image and video capture or transmission applications is blur that is caused by a low-pass filtering process. The introduced blur can be modeled using Gaussian low-pass filtering.
[[File:Wiki image blur example.png|500px|thumb|Original image (left) and blurred image (right) performed using Gaussian convolution|none]]
==Row-column decomposition==
===Separable signals===
A signal is said to be '''separable''' if it can be written as the product of multiple one-dimensional signals.<ref name=":4">{{citation |last1 = Dudgeon|first1 = Dan|last2 = Mersereau|first2 = Russell|title = Multidimensional Digital Signal Processing|publisher = Prentice-Hall|year = 1983|pages = 21–22}}</ref> Mathematically, this is expressed as the following:
<math>x(n_1,n_2,...,n_M) = x(n_1)x(n_2)...x(n_M)</math>
Some readily recognizable separable signals include the unit step function and the Dirac delta impulse:
<math>u(n_1,n_2,...,n_M)=u(n_1)u(n_2)...u(n_M)</math> (unit step function)
<math>\delta(n_1,n_2,...,n_M)=\delta(n_1)\delta(n_2)...\delta(n_M)</math> (Dirac delta impulse)
Convolution is a linear operation. It then follows that the multidimensional convolution of separable signals can be expressed as the product of many one-dimensional convolutions. For example, consider the case where ''x'' and ''h'' are both separable functions.
By applying the properties of separability, this can then be rewritten as the following:
<math>x(n_1,n_2)**h(n_1,n_2)=\bigg(\sum_{k_1=-\infty}^{\infty} h(k_1)x(n_1-k_1)\bigg)\bigg(\sum_{k_2=-\infty}^{\infty}h(k_2)x(n_2-k_2)\bigg)</math>
It is readily seen then that this reduces to the product of one-dimensional convolutions:
This conclusion can then be extended to the convolution of two separable ''M''-dimensional signals as follows:
<math>x(n_1,n_2,...,n_M)* \overset{M}{\cdots} *h(n_1,n_2,...,n_M)=\bigg[x(n_1)*h(n_1)\bigg]\bigg[x(n_2)*h(n_2)\bigg]\cdots\bigg[x(n_M)*h(n_M)\bigg]</math>
So, when the two signals are separable, the multidimensional convolution can be computed by computing <math>M</math> one-dimensional convolutions.
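As a check on this reduction, the product of one-dimensional convolutions can be compared against a direct two-dimensional convolution. The helpers below are a minimal pure-Python sketch (not from any library), using tiny integer-valued signals so the comparison is exact:

```python
def conv1d(a, b):
    """Full linear convolution of two 1-D sequences."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, av in enumerate(a):
        for j, bv in enumerate(b):
            out[i + j] += av * bv
    return out

def conv2d(x, h):
    """Direct full 2-D linear convolution of two 2-D lists."""
    rows, cols = len(x) + len(h) - 1, len(x[0]) + len(h[0]) - 1
    y = [[0.0] * cols for _ in range(rows)]
    for i, xrow in enumerate(x):
        for j, xv in enumerate(xrow):
            for k, hrow in enumerate(h):
                for l, hv in enumerate(hrow):
                    y[i + k][j + l] += xv * hv
    return y

# Separable signals: x(n1, n2) = x1(n1) x2(n2), likewise for h.
x1, x2 = [1.0, 2.0], [3.0, 1.0]
h1, h2 = [1.0, -1.0], [2.0, 5.0]
x = [[a * b for b in x2] for a in x1]   # outer product
h = [[a * b for b in h2] for a in h1]

direct = conv2d(x, h)
g1, g2 = conv1d(x1, h1), conv1d(x2, h2)
separable = [[a * b for b in g2] for a in g1]
assert direct == separable
```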
===Row-column decomposition with separable signals===
The row-column method can be applied when one of the signals in the convolution is separable. The method exploits the properties of separability in order to compute the convolution of two multidimensional signals more efficiently than direct computation of each output sample.
With a separable filter <math>h(n_1,n_2)=h_1(n_1)h_2(n_2)</math>, the convolution sum can be regrouped as follows:
<math>
\begin{align}
y(n_1,n_2)&=\sum_{k_1=-\infty}^{\infty} \sum_{k_2=-\infty}^{\infty} h(k_1,k_2)x(n_1-k_1,n_2-k_2)\\
&=\sum_{k_1=-\infty}^{\infty}h_1(k_1)\Bigg[ \sum_{k_2=-\infty}^{\infty} h_2(k_2)x(n_1-k_1,n_2-k_2)\Bigg]
\end{align}
</math>
The value of <math>\sum_{k_2=-\infty}^{\infty} h_2(k_2)x(n_1-k_1,n_2-k_2)</math> can now be re-used when evaluating other <math>y</math> values with a shared value of <math>n_2</math>:
<math>
\begin{align}
y(n_1+\delta,n_2)&=\sum_{k_1=-\infty}^{\infty}h_1(k_1)\Bigg[ \sum_{k_2=-\infty}^{\infty} h_2(k_2)x(n_1-[k_1-\delta],n_2-k_2)\Bigg]\\
&=\sum_{k_1=-\infty}^{\infty}h_1(k_1+\delta)\Bigg[ \sum_{k_2=-\infty}^{\infty} h_2(k_2)x(n_1-k_1,n_2-k_2)\Bigg]
\end{align}
</math>
Thus, the resulting convolution can be effectively calculated by first performing the convolution operation on all of the rows of <math>x(n_1,n_2)</math>, and then on all of its columns. This approach can be further optimized by taking into account how memory is accessed within a computer processor.
A processor will load in the signal data needed for the given operation. For modern processors, data will be loaded from memory into the processor's cache, which has faster access times than memory. The cache itself is partitioned into lines. When a cache line is loaded from memory, multiple data operands are loaded at once. Consider the optimized case where a row of signal data can fit entirely within the processor's cache. This particular processor would be able to access the data row-wise efficiently, but not column-wise, since different data operands in the same column would lie on different cache lines.<ref>{{cite web|title=Introduction to Caches|url=http://www.cs.umd.edu/class/sum2003/cmsc311/Notes/Memory/introCache.html|website=Computer Science University of Maryland}}</ref> The column-wise pass can be made cache-friendly by transposing the data and performing a second row-wise pass:
# Separate the separable two-dimensional signal <math>h(n_1,n_2)</math> into two one-dimensional signals <math>h_1(n_1)</math> and <math>h_2(n_2)</math>
# Perform row-wise convolution on the horizontal components of the signal <math>x(n_1,n_2)</math> using <math>h_1(n_1)</math> to obtain <math>g(n_1,n_2)</math>
# Transpose the intermediate result <math>g(n_1,n_2)</math>
# Perform row-wise convolution on the transposed vertical components of <math>g(n_1,n_2)</math> to get the desired output <math>y(n_1,n_2)</math>
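The steps above can be sketched in pure Python (illustrative helper names; direct 1-D convolutions stand in for optimized routines), and checked against a direct 2-D convolution with the full separable filter:

```python
def conv1d(a, b):
    """Full linear convolution of two 1-D sequences."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, av in enumerate(a):
        for j, bv in enumerate(b):
            out[i + j] += av * bv
    return out

def conv2d(x, h):
    """Direct full 2-D linear convolution, for reference."""
    rows, cols = len(x) + len(h) - 1, len(x[0]) + len(h[0]) - 1
    y = [[0.0] * cols for _ in range(rows)]
    for i, xrow in enumerate(x):
        for j, xv in enumerate(xrow):
            for k, hrow in enumerate(h):
                for l, hv in enumerate(hrow):
                    y[i + k][j + l] += xv * hv
    return y

def row_convolve(x, h):
    """Convolve every row of x with the 1-D filter h."""
    return [conv1d(row, h) for row in x]

def transpose(x):
    return [list(col) for col in zip(*x)]

h1, h2 = [1.0, 2.0], [1.0, 0.0, -1.0]
h = [[a * b for b in h2] for a in h1]          # separable 2-D filter
x = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]         # arbitrary (non-separable) input

g = row_convolve(x, h2)                         # row-wise pass
y = transpose(row_convolve(transpose(g), h1))   # column-wise pass via transpose
assert y == conv2d(x, h)
```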
===Computational speedup from row-column decomposition===
Examine the case where an image of size <math>X\times Y</math> is being passed through a separable filter of size <math>J\times K</math>. The image itself is not separable. If the result is calculated using the direct convolution approach without exploiting the separability of the filter, this will require approximately <math>XYJK</math> multiplications and additions. If the separability of the filter is exploited via row-column decomposition, then only approximately <math>XY(J+K)</math> multiplications and additions are required, a substantial saving whenever the filter dimensions are large.
[[File:Picture2 wiki.png|thumb|400px|Number of computations passing a ''10 x 10'' Image through a filter of size ''J x K'' where ''J = K'' varies in size from ''1'' to ''10''|none]]
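The savings can be estimated numerically. The sizes below are hypothetical, and the approximate counts <math>XYJK</math> and <math>XY(J+K)</math> from above are used, ignoring boundary effects:

```python
# Rough operation-count comparison for direct vs. row-column convolution.
X, Y = 512, 512          # image size (hypothetical)
J, K = 9, 9              # separable filter size (hypothetical)

direct_ops = X * Y * J * K         # direct 2-D convolution
row_column_ops = X * Y * (J + K)   # row pass + column pass
speedup = direct_ops / row_column_ops
print(direct_ops, row_column_ops, round(speedup, 1))  # 21233664 4718592 4.5
```

The speedup ratio is simply <math>JK/(J+K)</math>, so it grows with the filter size, matching the figure's trend.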
==Circular convolution==
The premise behind the circular convolution approach on multidimensional signals is to develop a relation between the [[Convolution theorem]] and the [[Discrete Fourier transform]] (DFT) that can be used to calculate the convolution between two finite-extent, discrete-valued signals.<ref>{{citation | last1=Dudgeon | first1=Dan | last2=Mersereau | first2=Russell | title=Multidimensional Digital Signal Processing | publisher=Prentice-Hall | year=1983 | page=70}}</ref>
===Convolution theorem===
For one-dimensional signals, the [[Convolution theorem|Convolution Theorem]] states that the [[Fourier transform]] of the convolution between two signals is equal to the product of the Fourier Transforms of those two signals. Thus, convolution in the time ___domain is equal to multiplication in the frequency ___domain. Mathematically, this principle is expressed via the following:<math display="block">y(n)=h(n)*x(n)\longleftrightarrow Y(\omega)=H(\omega)X(\omega)</math>This principle is directly extendable to dealing with signals of multiple dimensions.<math display="block">y(n_1,n_2,...,n_M)=h(n_1,n_2,...,n_M)*\overset{M}{\cdots}*x(n_1,n_2,...,n_M) \longleftrightarrow Y(\omega_1,\omega_2,...,\omega_M)=H(\omega_1,\omega_2,...,\omega_M)X(\omega_1,\omega_2,...,\omega_M)</math> This property is readily extended to the usage with the [[Discrete Fourier transform]] (DFT) as follows (note that linear convolution is replaced with circular convolution where <math>\otimes</math> is used to denote the circular convolution operation of size <math>N</math>):
<math>y(n)=h(n)\otimes x(n)\longleftrightarrow Y(k)=H(k)X(k)</math>
When dealing with signals of multiple dimensions:<math display="block">y(n_1,n_2,...,n_M)=h(n_1,n_2,...,n_M)\otimes\overset{M}{\cdots}\otimes x(n_1,n_2,...,n_M) \longleftrightarrow Y(k_1,k_2,...,k_M)=H(k_1,k_2,...,k_M)X(k_1,k_2,...,k_M)</math>The circular convolutions here will be of size <math>N_1, N_2,...,N_M</math>.
===Circular convolution approach===
The motivation behind using the circular convolution approach is that it is based on the DFT. The premise behind circular convolution is to take the DFTs of the input signals, multiply them together, and then take the inverse DFT. Care must be taken such that a large enough DFT is used such that aliasing does not occur. The DFT is numerically computable when dealing with signals of finite-extent. One advantage this approach has is that since it requires taking the DFT and inverse DFT, it is possible to utilize efficient algorithms such as the [[Fast Fourier transform]] (FFT). Circular convolution can also be computed in the time/spatial ___domain and not only in the frequency ___domain.
[[File:Wiki circular conv.png|thumb|600px|Block diagram of circular convolution with 2 ''M''-dimensional signals|none]]
===Choosing DFT size to avoid aliasing===
Consider the following case where two finite-extent signals ''x'' and ''h'' are taken. For both signals, there is a corresponding DFT as follows:
The result will be that <math>y_{circular}(n_1,n_2)</math> will be a spatially aliased version of the linear convolution result <math>y_{linear}(n_1,n_2)</math>. This can be expressed as the following:
<math>y_{circular}(n_1,n_2)=\sum_{r_1} \sum_{r_2} y_{linear}(n_1-r_1N_1, n_2-r_2N_2){\mathrm{\,\,\,for\,\,\,}}(n_1,n_2) \in R_{N_1N_2}</math>
Then, in order to avoid aliasing between the spatially aliased replicas, <math>N_1</math> and <math>N_2</math> must be chosen to satisfy the following conditions:
If these conditions are satisfied, then the results of the circular convolution will equal that of the linear convolution (taking the main period of the circular convolution as the region of support). That is:
<math>y_{circular}(n_1,n_2)=y_{linear}(n_1,n_2)</math> for <math>(n_1,n_2) \in R_{N_1N_2}</math>
===Summary of procedure using DFTs===
The Convolution theorem and circular convolution can thus be used in the following manner to achieve a result that is equal to performing the linear convolution:<ref>{{citation | last1=Dudgeon | first1=Dan | last2=Mersereau | first2=Russell | title=Multidimensional Digital Signal Processing | publisher=Prentice-Hall | year=1983 | page=72}}</ref>
# Choose <math>N_1</math> and <math>N_2</math> to satisfy <math>N_1 \geq P_1+Q_1-1</math> and <math>N_2 \geq P_2+Q_2-1</math>
# Zero pad the signals <math>h(n_1,n_2)</math> and <math>x(n_1,n_2)</math> such that they are both <math>N_1\times N_2</math> in size
# Compute the DFTs of both <math>h(n_1,n_2)</math> and <math>x(n_1,n_2)</math>
# Multiply the DFTs to obtain <math>Y(k_1,k_2)=H(k_1,k_2)X(k_1,k_2)</math>
# Take the inverse DFT of <math>Y(k_1,k_2)</math>. The result of the inverse DFT, taken over the region <math>R_{N_1N_2}</math>, will equal the result of linearly convolving <math>h(n_1,n_2)</math> and <math>x(n_1,n_2)</math>
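The procedure above can be sketched with brute-force DFTs (pure Python, <math>O(N^4)</math> transforms, suitable only for tiny examples; helper names are illustrative) and checked against a direct linear convolution:

```python
import cmath

def dft2(x, inverse=False):
    """Brute-force 2-D DFT (or inverse DFT) of a 2-D list."""
    n1, n2 = len(x), len(x[0])
    sign = 1 if inverse else -1
    out = [[0j] * n2 for _ in range(n1)]
    for k1 in range(n1):
        for k2 in range(n2):
            acc = 0j
            for m1 in range(n1):
                for m2 in range(n2):
                    acc += x[m1][m2] * cmath.exp(
                        sign * 2j * cmath.pi * (k1 * m1 / n1 + k2 * m2 / n2))
            out[k1][k2] = acc / (n1 * n2) if inverse else acc
    return out

def conv2d(x, h):
    """Direct full 2-D linear convolution, for reference."""
    rows, cols = len(x) + len(h) - 1, len(x[0]) + len(h[0]) - 1
    y = [[0.0] * cols for _ in range(rows)]
    for i, xrow in enumerate(x):
        for j, xv in enumerate(xrow):
            for k, hrow in enumerate(h):
                for l, hv in enumerate(hrow):
                    y[i + k][j + l] += xv * hv
    return y

def pad(a, n1, n2):
    """Zero-pad a 2-D list to size n1 x n2."""
    return [[a[i][j] if i < len(a) and j < len(a[0]) else 0.0
             for j in range(n2)] for i in range(n1)]

x = [[1.0, 2.0], [3.0, 4.0]]       # P1 x P2 = 2 x 2
h = [[1.0, -1.0]]                  # Q1 x Q2 = 1 x 2
N1, N2 = 2 + 1 - 1, 2 + 2 - 1      # N1 >= P1+Q1-1, N2 >= P2+Q2-1

X, H = dft2(pad(x, N1, N2)), dft2(pad(h, N1, N2))
Y = [[X[i][j] * H[i][j] for j in range(N2)] for i in range(N1)]
y = [[round(v.real, 9) for v in row] for row in dft2(Y, inverse=True)]
assert y == conv2d(x, h)
```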
==Overlap and add==
Another method to perform multidimensional convolution is the '''overlap and add''' approach.
Consider a two-dimensional convolution using a direct computation:
<math>y(n_1, n_2) = \sum_{k_1=-\infty}^{\infty} \sum_{k_2=-\infty}^{\infty} x(n_1 - k_1, n_2 - k_2)h(k_1, k_2)</math>
Assuming that the output signal <math>y(n_1, n_2)</math> has ''N'' nonzero coefficients, and the impulse response has ''M'' nonzero samples, this direct computation would need ''MN'' multiplies and ''MN'' − 1 adds in order to compute.
===Decomposition into smaller convolution blocks===
Instead of performing convolution on the blocks of information in their entirety, the information can be broken up into smaller blocks of dimensions <math>L_1 \times L_2</math>, resulting in smaller FFTs, less computational complexity, and less storage needed.
<math>x(n_1, n_2) = \sum_{i=1}^{P_1} \sum_{j=1}^{P_2}x_{ij}(n_1, n_2)</math>
where <math>x(n_1, n_2)</math> represents the <math>N_1 \times N_2</math> input signal, which is a summation of <math>P_1P_2</math> block segments, with <math>P_1 = N_1/L_1</math> and <math>P_2 = N_2/L_2</math>.
This convolution adds more complexity than doing a direct convolution; however, since it is integrated with an FFT fast convolution, overlap-add performs faster and is a more memory-efficient method, making it practical for large sets of multidimensional data.
===Breakdown of procedure===
Let <math>h(n_1, n_2)</math> be of size <math>M_1 \times M_2</math>:
# Break input <math>x(n_1, n_2)</math> into non-overlapping blocks of dimensions <math>L_1 \times L_2</math>.
## Multiply to get <math>Y_{ij}(k_1, k_2) = X_{ij}(k_1, k_2)H(k_1,k_2)</math>.
## Take inverse discrete Fourier transform of <math>Y_{ij}(k_1, k_2)</math> to get <math>y_{ij}(n_1, n_2)</math>.
# Find <math>y(n_1, n_2)</math> by overlap and adding the last <math>(M_1 - 1)\times(M_2 - 1)</math> samples of <math>y_{ij}(n_1, n_2)</math> with the first <math>(M_1 - 1)\times(M_2 - 1)</math> samples of <math>y_{i+1,j+1}(n_1, n_2)</math> to get the result.<ref name=":3">{{Cite web|url = http://www.comm.utoronto.ca/~dkundur/course_info/real-time-DSP/notes/8_Kundur_Overlap_Save_Add.pdf|title = Overlap-Save and Overlap-Add}}</ref>
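A minimal sketch of the procedure, with direct small convolutions standing in for the per-block DFT multiply (pure Python, illustrative only), checked against a direct full convolution:

```python
def conv2d(x, h):
    """Direct full 2-D linear convolution, for reference."""
    rows, cols = len(x) + len(h) - 1, len(x[0]) + len(h[0]) - 1
    y = [[0.0] * cols for _ in range(rows)]
    for i, xrow in enumerate(x):
        for j, xv in enumerate(xrow):
            for k, hrow in enumerate(h):
                for l, hv in enumerate(hrow):
                    y[i + k][j + l] += xv * hv
    return y

def overlap_add_2d(x, h, L1, L2):
    """Convolve each L1 x L2 block of x with h, accumulating overlapping tails."""
    N1, N2 = len(x), len(x[0])
    M1, M2 = len(h), len(h[0])
    y = [[0.0] * (N2 + M2 - 1) for _ in range(N1 + M1 - 1)]
    for i0 in range(0, N1, L1):
        for j0 in range(0, N2, L2):
            block = [row[j0:j0 + L2] for row in x[i0:i0 + L1]]
            yb = conv2d(block, h)               # (L1+M1-1) x (L2+M2-1) piece
            for i, row in enumerate(yb):
                for j, v in enumerate(row):
                    y[i0 + i][j0 + j] += v      # overlapping samples accumulate
    return y

x = [[float(4 * i + j) for j in range(4)] for i in range(4)]
h = [[1.0, -1.0], [2.0, 0.5]]
assert overlap_add_2d(x, h, 2, 2) == conv2d(x, h)
```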
===Pictorial method of operation===
In order to visualize the overlap-add method more clearly, the following illustrations examine the method graphically. Assume that the input <math>x(n_1, n_2)</math> has a square region of support of length ''N'' in both vertical and horizontal directions as shown in the figure below.
In the figure below, the first graph on the left represents the convolution corresponding to the component of the input <math>x_{0,0}</math> with the corresponding impulse response <math>h(n_1,n_2)</math>. To the right of that, the input <math>x_{1,0}</math> is then convolved with the impulse response <math>h(n_1,n_2)</math>.
[[File:Conjoined blocks.jpeg|thumb|387x387px|Individual Component Convolution with Impulse Response|none]][[File:Combined convo.png|left|thumb|255x255px|Convolution of each Component with the Overlap Portions Highlighted]]The same process is done for the other two inputs respectively, and they are accumulated together in order to form the convolution.
Assume that the filter impulse response <math>h(n_1,n_2)</math> has a region of support of <math>(N/8)</math> in both dimensions. Each block convolution then extends beyond the <math>(N/2)</math> block boundaries in the <math>n_1</math> and <math>n_2</math> directions, which leads to overlap (highlighted in blue) since the length of each individual convolution is equivalent to:
<math>(N/2)+(N/8)-1 = (5/8)N-1</math>
in both directions.
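The overlap length above can be checked with a quick computation (assuming ''N'' is divisible by 8 so the block and filter sizes are integers):

```python
# Quick check of the per-block convolution length quoted above.
N = 16
block = N // 2          # each input block spans N/2 samples per dimension
filt = N // 8           # filter support per dimension
conv_len = block + filt - 1
assert conv_len == (5 * N) // 8 - 1   # (N/2) + (N/8) - 1 = (5/8)N - 1
print(conv_len)   # 9 when N = 16
```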
==Overlap and save==
The overlap and save method, just like the overlap and add method, is also used to reduce the computational complexity associated with discrete-time convolutions.
===Comparison to overlap and add===
The overlap and save method is very similar to the overlap and add methods with a few notable exceptions. The overlap-add method involves a linear convolution of discrete-time signals, whereas the overlap-save method involves the principle of circular convolution.
In one dimension, the performance and storage metric differences between the two methods are minimal.
===Breakdown of procedure===
Let <math>h(n_1, n_2)</math> be of size <math>M_1 \times M_2 </math>:
# Insert <math>(M_1 - 1)</math> columns and <math>(M_2 - 1)</math> rows of zeroes at the beginning of the input signal <math>x(n_1,n_2)</math> in both dimensions.
# Split the corresponding signal into overlapping segments of dimensions (<math>L_1 + M_1 - 1</math>)<math>\times</math>(<math>L_2 + M_2 - 1</math>) in which each two-dimensional block will overlap by <math>(M_1 - 1)</math> <math>\times</math> <math>(M_2 - 1)</math>.
# Zero pad <math>h(n_1, n_2)</math> such that it has dimensions (<math>L_1 + M_1 - 1</math>)<math>\times</math>(<math>L_2 + M_2 - 1</math>).
## Multiply to get <math>Y_{ij}(k_1, k_2) = X_{ij}(k_1, k_2)H(k_1,k_2)</math>.
## Take inverse discrete Fourier transform of <math>Y_{ij}(k_1, k_2)</math> to get <math>y_{ij}(n_1, n_2)</math>.
## Discard the first <math>(M_1 - 1)</math> rows and <math>(M_2 - 1)</math> columns of <math>y_{ij}(n_1, n_2)</math>, which contain the spatially aliased samples.
# Find <math>y(n_1, n_2)</math> by attaching the last <math>(L_1\times L_2)</math> samples for each output block <math>y_{ij}(n_1, n_2)</math>.<ref name=":3" />
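A minimal sketch of overlap-save, using direct circular convolutions in place of the per-block DFT multiply (pure Python, illustrative only), checked against a direct linear convolution over the input's region of support:

```python
def conv2d(x, h):
    """Direct full 2-D linear convolution, for reference."""
    rows, cols = len(x) + len(h) - 1, len(x[0]) + len(h[0]) - 1
    y = [[0.0] * cols for _ in range(rows)]
    for i, xrow in enumerate(x):
        for j, xv in enumerate(xrow):
            for k, hrow in enumerate(h):
                for l, hv in enumerate(hrow):
                    y[i + k][j + l] += xv * hv
    return y

def circconv2d(a, b):
    """Circular convolution of two same-size 2-D blocks."""
    n1, n2 = len(a), len(a[0])
    return [[sum(a[k][l] * b[(i - k) % n1][(j - l) % n2]
                 for k in range(n1) for l in range(n2))
             for j in range(n2)] for i in range(n1)]

def overlap_save_2d(x, h, L1, L2):
    N1, N2 = len(x), len(x[0])
    M1, M2 = len(h), len(h[0])
    B1, B2 = L1 + M1 - 1, L2 + M2 - 1
    # (M-1) leading zeroes plus enough trailing zeroes for a full final segment.
    xp = [[0.0] * (N2 + M2 - 1 + L2) for _ in range(N1 + M1 - 1 + L1)]
    for i in range(N1):
        for j in range(N2):
            xp[i + M1 - 1][j + M2 - 1] = x[i][j]
    hp = [[h[i][j] if i < M1 and j < M2 else 0.0
           for j in range(B2)] for i in range(B1)]
    y = [[0.0] * N2 for _ in range(N1)]
    for i0 in range(0, N1, L1):
        for j0 in range(0, N2, L2):
            seg = [row[j0:j0 + B2] for row in xp[i0:i0 + B1]]
            yb = circconv2d(seg, hp)
            # Keep only the alias-free L1 x L2 samples of each block.
            for i in range(L1):
                for j in range(L2):
                    if i0 + i < N1 and j0 + j < N2:
                        y[i0 + i][j0 + j] = yb[M1 - 1 + i][M2 - 1 + j]
    return y

x = [[float(4 * i + j) for j in range(4)] for i in range(4)]
h = [[1.0, -1.0], [2.0, 0.5]]
full = conv2d(x, h)
assert overlap_save_2d(x, h, 2, 2) == [row[:4] for row in full[:4]]
```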
==The helix transform==
Similar to row-column decomposition, the helix transform computes the multidimensional convolution by incorporating one-dimensional convolutional properties and operators.
===Multidimensional convolution with one-dimensional convolution methods===
To understand the helix transform, it is useful to first understand how a multidimensional convolution can be broken down into a one-dimensional convolution. Consider an <math>M \times N</math> matrix <math>X</math> and a <math>K \times L</math> matrix <math>Y</math>; their two-dimensional convolution is given by:
<math>Z(i,j) = \sum_{m=0}^{M-1}\sum_{n=0}^{N-1}X(m,n)Y(i-m, j-n)</math>
where each of the input matrices is now of dimensions <math>(M+K-1)\times(N+L-1)</math>.
<math>l_{X''} = (M+K-1)\times(N-1) + M</math>
<math>l_{Z''} = l_{Y''} + l_{X''} - 1 = (M+K-1)\times(N+L-1)</math>
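The unwinding described above can be sketched and verified against a direct two-dimensional convolution (pure Python; helper names are illustrative). Both signals are zero-padded to <math>(M+K-1)\times(N+L-1)</math>, flattened row by row into one-dimensional strips, convolved, and re-wrapped:

```python
def conv1d(a, b):
    """Full linear convolution of two 1-D sequences."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, av in enumerate(a):
        for j, bv in enumerate(b):
            out[i + j] += av * bv
    return out

def conv2d(x, h):
    """Direct full 2-D linear convolution, for reference."""
    rows, cols = len(x) + len(h) - 1, len(x[0]) + len(h[0]) - 1
    y = [[0.0] * cols for _ in range(rows)]
    for i, xrow in enumerate(x):
        for j, xv in enumerate(xrow):
            for k, hrow in enumerate(h):
                for l, hv in enumerate(hrow):
                    y[i + k][j + l] += xv * hv
    return y

def pad(a, n1, n2):
    """Zero-pad a 2-D list to size n1 x n2."""
    return [[a[i][j] if i < len(a) and j < len(a[0]) else 0.0
             for j in range(n2)] for i in range(n1)]

X = [[1.0, 2.0], [3.0, 4.0]]                   # M x N
Y = [[1.0, -1.0], [0.5, 2.0]]                  # K x L
M, N = len(X), len(X[0])
K, L = len(Y), len(Y[0])
R, C = M + K - 1, N + L - 1                    # padded dimensions

sx = [v for row in pad(X, R, C) for v in row]  # unwound 1-D strips
sy = [v for row in pad(Y, R, C) for v in row]
sz = conv1d(sx, sy)

# Re-wrap: sample (i, j) of the 2-D result lives at index i*C + j of the strip.
Z = [[sz[i * C + j] for j in range(C)] for i in range(R)]
assert Z == conv2d(X, Y)
```

The padding to width <math>N+L-1</math> is what prevents wraparound between rows of the strip, which is why the two results agree exactly.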
===Filtering on a helix===
When working on a two-dimensional Cartesian mesh, a Fourier transform along either axis results in the two-dimensional plane becoming a cylinder, as the end of each column or row attaches to the top of the next, tracing a helix around the cylinder.
[[File:Cartesian combined.jpeg|none|thumb|480x480px|Transformation from a 2D Cartesian Filtering Plane to a Helix Filter.]]If this helical structure is then sliced and unwound into a one-dimensional strip, the same filter coefficients on the 2-d Cartesian plane will match up with the same input data, resulting in an equivalent filtering scheme.
[[File:1d strip.png|none|thumb|189x189px|One-Dimensional Filtering Strip after being Unwound.]]
Assuming that some low-pass two-dimensional filter is used, such as:
{| class="wikitable"
|0
| -1
|0
|-
| -1
|4
| -1
|-
|0
| -1
|0
|}
Then, once the two-dimensional space was converted into a helix, the one-dimensional filter would look as follows:
<math> h(n) = -1, 0, ... , 0, -1, 4, -1, 0, ..., 0, -1, 0, ...</math>
Notice in the one-dimensional filter that there are no leading zeroes: the leading and trailing zeroes of the unwound strip are dropped, as they can be ignored during implementation.
===Applications===
Helix transformations to implement recursive filters via convolution are used in various areas of signal processing.
==Gaussian convolution==
One application of multidimensional convolution that is used within signal and image processing is Gaussian convolution. This refers to convolving an input signal with the Gaussian distribution function.
[[File:Wiki gauss.png|thumb|300px|2D Gaussian Visualization where <math>\mu_1=\mu_2=0</math> and <math>\sigma_1=\sigma_2=1</math>]]
The Gaussian distribution sampled at discrete values in one dimension is given by the following (assuming <math>\mu=0</math>):<math display="block">G(n)=\frac{1}{\sqrt{2\pi\sigma^2}}e^{-\frac{n^2}{2\sigma^2}}</math>This is readily extended to a signal of ''M'' dimensions (assuming <math>\sigma</math> stays constant for all dimensions and <math>\mu_1=\mu_2=...=\mu_M=0</math>):<math display="block">G(n_1,n_2,...,n_M)=\frac{1}{(2\pi\sigma^2)^{M/2}}e^{-\frac{{n_1}^2+{n_2}^2+...+{n_M}^2}{2\sigma^2}}</math>Gaussian convolution in one dimension is then expressed as:
<math>y(n)=x(n)*G(n)</math>
<math>y(n_1,n_2,...,n_M)=x(n_1,n_2,...,n_M)*...*G(n_1,n_2,...,n_M)</math>
===Approximation by FIR filter===
Gaussian convolution can be effectively approximated via implementation of a [[Finite impulse response]] (FIR) filter.
<math>H(z_1,z_2)=\frac{1}{s(r_1,r_2)} \sum_{n_1=-r_1}^{r_1}\sum_{n_2=-r_2}^{r_2}G(n_1,n_2){z_1}^{-n_1}{z_2}^{-n_2}</math>
Choosing lower values for <math>r_1</math> and <math>r_2</math> will result in performing less computations, but will yield a less accurate approximation while choosing higher values will yield a more accurate approximation, but will require a greater number of computations.
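Building the truncated, normalized Gaussian FIR kernel can be sketched as follows (pure Python; <math>\sigma</math>, <math>r_1</math>, and <math>r_2</math> are free parameters, and the separability of the Gaussian is used to form the kernel as an outer product):

```python
import math

def gaussian_taps(r, sigma):
    """Unnormalized 1-D Gaussian samples on n = -r..r."""
    return [math.exp(-n * n / (2.0 * sigma * sigma)) for n in range(-r, r + 1)]

def gaussian_fir_2d(r1, r2, sigma):
    """Truncated 2-D Gaussian kernel, normalized to unit DC gain."""
    g1, g2 = gaussian_taps(r1, sigma), gaussian_taps(r2, sigma)
    kernel = [[a * b for b in g2] for a in g1]   # separable outer product
    s = sum(sum(row) for row in kernel)          # s(r1, r2): the DC normalizer
    return [[v / s for v in row] for row in kernel]

k = gaussian_fir_2d(2, 2, 1.0)
assert abs(sum(sum(row) for row in k) - 1.0) < 1e-12  # unit DC gain
assert k[2][2] == max(max(row) for row in k)          # peak at the center
```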
===Approximation by box filter===
Another method for approximating Gaussian convolution is via recursive passes through a box filter. For approximating one-dimensional convolution, this filter is defined as the following:<ref name=":0" />
<math>H(z)=\frac{1}{2r+1} \frac{z^r-z^{-r-1}}{1-z^{-1}}</math>
Typically, 3, 4, or 5 recursive passes are performed in order to obtain an accurate approximation.<ref name=":0" /> A suggested method for computing ''r'' is then given as the following:<ref>{{Cite journal|last=Wells|first=William M.|title=Efficient Synthesis of Gaussian Filters by Cascaded Uniform Filters|journal=IEEE Transactions on Pattern Analysis and Machine Intelligence|volume=8|issue=2|pages=234–239|year=1986}}</ref>
<math>\sigma^2=\frac{1}{12}K((2r+1)^2-1)</math> where ''K'' is the number of recursive passes through the filter.
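Inverting this relation gives a way to choose ''r'' for a desired <math>\sigma</math> and ''K'' passes (assumption: the ideal value is rounded to the nearest integer, since box filters need integer radii):

```python
import math

def box_radius(sigma, K):
    """Radius r solving sigma^2 = (1/12) K ((2r+1)^2 - 1), rounded to an int."""
    ideal = (math.sqrt(12.0 * sigma * sigma / K + 1.0) - 1.0) / 2.0
    return max(0, round(ideal))

def effective_sigma(r, K):
    """Sigma produced by K passes of radius-r box filters, per the relation above."""
    return math.sqrt(K * ((2 * r + 1) ** 2 - 1) / 12.0)

r = box_radius(2.0, 3)
print(r, round(effective_sigma(r, 3), 3))  # 2 2.449
```

Because ''r'' is quantized, the achieved <math>\sigma</math> only approximates the target; here three passes of radius-2 box filters give an effective <math>\sigma=\sqrt{6}\approx 2.449</math> for a target of 2.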
===Applications===
Gaussian convolutions are used extensively in signal and image processing. For example, image-blurring can be accomplished with Gaussian convolution where the <math>\sigma</math> parameter will control the strength of the blurring. Higher values would thus correspond to a more blurry end result.<ref>{{Cite web|title = Gaussian Blur - Image processing for scientists and engineers, Part 4|url = http://patrick-fuller.com/gaussian-blur-image-processing-for-scientists-and-engineers-part-4/|website = patrick-fuller.com}}</ref>
==See also==
* [[Convolution]]
* [[Signal processing]]