Multidimensional discrete convolution
Convolution in one dimension was a powerful discovery that allowed the input and output of a linear shift-invariant (LSI) system (see [[LTI system theory]]) to be easily compared so long as the impulse response of the filter system was known. This notion carries over to multidimensional convolution as well, since knowing the impulse response of a multidimensional filter likewise allows a direct comparison to be made between the input and output of a system. This is significant because many of the signals transferred in the digital world today are multidimensional, including images and videos. Similar to the one-dimensional convolution, the multidimensional convolution allows the computation of the output of an LSI system for a given input signal.
 
For example, consider an image that is sent over some wireless network subject to electro-optical noise. Possible noise sources include errors in channel transmission, the analog-to-digital converter, and the image sensor. Usually noise caused by the channel or sensor creates spatially independent, high-frequency signal components that translate to arbitrary light and dark spots on the actual image. In order to rid the image data of the high-frequency spectral content, it can be multiplied by the frequency response of a low-pass filter, which, based on the convolution theorem, is equivalent to convolving the signal in the time/spatial ___domain with the impulse response of the low-pass filter. Several impulse responses that do so are shown below.<ref>{{Cite web|title = MARBLE: Interactive Vision|url = http://homepages.inf.ed.ac.uk/rbf/CVonline/LOCAL_COPIES/MARBLE/|website = homepages.inf.ed.ac.uk|access-date = 2015-11-12}}</ref>
 
[[File:Screen Shot 2015-11-11 at 11.18.23 PM.png|none|thumb|311x311px|Impulse Responses of Typical Multidimensional Low Pass Filters]]
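The smoothing described above can be sketched in Python with NumPy and SciPy (a minimal illustration; the ramp image, noise level, and 3×3 averaging kernel are hypothetical choices, not taken from the article):

```python
import numpy as np
from scipy.signal import convolve2d

# Hypothetical test image: a smooth ramp plus high-frequency noise
rng = np.random.default_rng(0)
clean = np.outer(np.linspace(0, 1, 64), np.linspace(0, 1, 64))
noisy = clean + 0.3 * rng.standard_normal((64, 64))

# 3x3 averaging (low-pass) impulse response
h = np.ones((3, 3)) / 9.0

# Spatial-___domain convolution, equivalent (by the convolution theorem)
# to multiplying by the filter's frequency response
smoothed = convolve2d(noisy, h, mode="same", boundary="symm")

# The low-pass output is closer to the clean image than the noisy input
assert np.mean((smoothed - clean) ** 2) < np.mean((noisy - clean) ** 2)
```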
[[File:edge_detection2.png|500px|thumb|Original image (left) and image after passing through edge-detecting filter (right)|none]]
 
In addition to image processing, multidimensional convolution can be implemented to enable a variety of other applications. Since filters are widespread in digital communication systems, any system that must transmit multidimensional data is assisted by filtering techniques. It is used in real-time video processing, neural network analysis, digital geophysical data analysis, and much more.<ref>{{Cite web|title = Digital Geophysical Analysis Redesign|url = http://www-rohan.sdsu.edu/~jiracek/DAGSAW/4.1.html|website = www-rohan.sdsu.edu|access-date = 2015-11-12}}</ref>
 
One typical distortion that occurs during image and video capture or transmission applications is blur that is caused by a low-pass filtering process. The introduced blur can be modeled using Gaussian low-pass filtering.
Thus, the resulting convolution can be effectively calculated by first performing the convolution operation on all of the rows of <math>x(n_1,n_2)</math>, and then on all of its columns. This approach can be further optimized by taking into account how memory is accessed within a computer processor.
 
A processor will load in the signal data needed for the given operation. For modern processors, data will be loaded from memory into the processor's cache, which has faster access times than memory. The cache itself is partitioned into lines. When a cache line is loaded from memory, multiple data operands are loaded at once. Consider the optimized case where a row of signal data can fit entirely within the processor's cache. This particular processor would be able to access the data row-wise efficiently, but not column-wise, since different data operands in the same column would lie on different cache lines.<ref>{{cite web|title=Introduction to Caches|url=http://www.cs.umd.edu/class/sum2003/cmsc311/Notes/Memory/introCache.html|website=Computer Science University of Maryland|access-date=10 November 2015}}</ref> In order to take advantage of the way in which memory is accessed, it is more efficient to transpose the data set and then access it row-wise rather than attempt to access it column-wise. The algorithm then becomes:
# Separate the separable two-dimensional signal <math>h(n_1,n_2)</math> into two one-dimensional signals <math>h_1(n_1)</math> and <math>h_2(n_2)</math>
# Perform row-wise convolution on the horizontal components of the signal <math>x(n_1,n_2)</math> using <math>h_1(n_1)</math> to obtain <math>g(n_1,n_2)</math>
 
===Computational Speedup from Row-Column Decomposition===
Examine the case where an image of size <math>X\times Y</math> is being passed through a separable filter of size <math>J\times K</math>. The image itself is not separable. If the result is calculated using the direct convolution approach without exploiting the separability of the filter, this will require approximately <math>XYJK</math> multiplications and additions. If the separability of the filter is taken into account, the filtering can be performed in two steps. The first step will have <math>XYJ</math> multiplications and additions and the second step will have <math>XYK</math>, resulting in a total of <math>XYJ+XYK</math> or <math>XY(J+K)</math> multiplications and additions.<ref>{{cite web|last1=Eddins|first1=Steve|title=Separable Convolution|url=http://blogs.mathworks.com/steve/2006/10/04/separable-convolution/|website=MathWorks|access-date=10 November 2015}}</ref> A comparison of the computational complexity between direct and separable convolution is given in the following image:
 
[[File:Picture2 wiki.png|thumb|400px|Number of computations passing a ''10 x 10'' Image through a filter of size ''J x K'' where ''J = K'' varies in size from ''1'' to ''10''|none]]
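The equivalence of the two approaches, along with the operation-count savings, can be illustrated with a short NumPy/SciPy sketch (the particular 3×3 separable filter is a hypothetical example):

```python
import numpy as np
from scipy.signal import convolve2d

# Separable filter: outer product of two 1-D filters (hypothetical choice)
h1 = np.array([1.0, 2.0, 1.0])    # column filter, length J = 3
h2 = np.array([-1.0, 0.0, 1.0])   # row filter, length K = 3
h = np.outer(h1, h2)              # J x K two-dimensional filter

rng = np.random.default_rng(0)
x = rng.standard_normal((10, 10))

# Direct 2-D convolution: roughly X*Y*J*K multiplications
direct = convolve2d(x, h)

# Row-column decomposition: roughly X*Y*(J+K) multiplications
rows = np.apply_along_axis(np.convolve, 1, x, h2)    # convolve each row with h2
sep = np.apply_along_axis(np.convolve, 0, rows, h1)  # then each column with h1
assert np.allclose(direct, sep)

# Operation-count comparison for a 10x10 image and a 3x3 filter
X = Y = 10; J = K = 3
assert X * Y * J * K == 900 and X * Y * (J + K) == 600
```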
==Overlap and Add==
 
Another method to perform multidimensional convolution is the '''overlap and add''' approach. This method helps reduce the computational complexity often associated with multidimensional convolutions due to the vast amounts of data inherent in modern-day digital systems.<ref>{{cite book|last1 = Fernandez|first1 = Joseph|last2 = Kumar|first2 = Vijaya|title = Multidimensional Overlap-Add and Overlap-Save for Correlation and Convolution|pages = 509–513|doi = 10.1109/ICIP.2013.6738105|year = 2013|isbn = 978-1-4799-2341-0}}</ref> For the sake of brevity, the two-dimensional case is used as an example, but the same concepts can be extended to multiple dimensions.
 
Consider a two-dimensional convolution using a direct computation:
<math>y(n_1, n_2) = \sum_{k_1=-\infty}^{\infty} \sum_{k_2=-\infty}^{\infty} x(n_1 - k_1, n_2 - k_2)h(k_1, k_2)</math>
 
Assuming that the output signal <math>y(n_1, n_2)</math> has <math>N</math> nonzero coefficients, and the impulse response has <math>M</math> nonzero samples, this direct computation would need <math>MN</math> multiplications and <math>MN - 1</math> additions. Using an FFT instead, the frequency response of the filter and the Fourier transform of the input would have to be stored in memory.<ref>{{Cite web|url = http://www.eeng.dcu.ie/~ee502/EE502s4.pdf|title = 2D Signal Processing|access-date = November 11, 2015|website = EE502: Digital Signal Processing|publisher = Dublin City University|page = 24}}</ref> Massive amounts of computations and excessive use of memory storage space pose a problematic issue as more dimensions are added. This is where the overlap and add convolution method comes in.
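The direct computation above can be sketched as a quadruple loop in Python (an illustrative implementation, checked against SciPy's <code>convolve2d</code>; the input sizes are hypothetical):

```python
import numpy as np
from scipy.signal import convolve2d

def direct_conv2d(x, h):
    """Directly evaluate y(n1,n2) = sum_k1 sum_k2 x(n1-k1, n2-k2) h(k1,k2)."""
    N1, N2 = x.shape
    M1, M2 = h.shape
    y = np.zeros((N1 + M1 - 1, N2 + M2 - 1))
    for n1 in range(y.shape[0]):
        for n2 in range(y.shape[1]):
            for k1 in range(M1):
                for k2 in range(M2):
                    i, j = n1 - k1, n2 - k2
                    if 0 <= i < N1 and 0 <= j < N2:  # skip samples outside x
                        y[n1, n2] += x[i, j] * h[k1, k2]
    return y

rng = np.random.default_rng(0)
x = rng.standard_normal((6, 5))
h = rng.standard_normal((3, 3))
assert np.allclose(direct_conv2d(x, h), convolve2d(x, h))
```

The cost of the nested loops, roughly one multiply and one add per (output sample, filter tap) pair, is what motivates the block-based methods described next.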
 
===Decomposition into Smaller Convolution Blocks===
## Multiply to get <math>Y_{ij}(k_1, k_2) = X_{ij}(k_1, k_2)H(k_1,k_2)</math>.
## Take inverse discrete Fourier transform of <math>Y_{ij}(k_1, k_2)</math> to get <math>y_{ij}(n_1, n_2)</math>.
# Find <math>y(n_1, n_2)</math> by overlapping and adding the last <math>(M_1 - 1)\times(M_2 - 1)</math> samples of <math>y_{ij}(n_1, n_2)</math> with the first <math>(M_1 - 1)\times(M_2 - 1)</math> samples of <math>y_{i+1,j+1}(n_1, n_2)</math> to get the result.<ref name=":3">{{Cite web|url = http://www.comm.utoronto.ca/~dkundur/course_info/real-time-DSP/notes/8_Kundur_Overlap_Save_Add.pdf|title = Overlap-Save and Overlap-Add|access-date = November 12, 2015|publisher = University of Toronto|last = Kundur|first = Deepa}}</ref>
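The steps above can be sketched with NumPy's FFT routines (a minimal illustration; the block size and signals are hypothetical, and each block's FFT is zero-padded to the linear-convolution size):

```python
import numpy as np
from scipy.signal import convolve2d

def overlap_add_2d(x, h, B1, B2):
    """2-D overlap-add: convolve each B1 x B2 block of x with h via the FFT,
    then add the overlapping tails of adjacent block outputs."""
    M1, M2 = h.shape
    L1, L2 = B1 + M1 - 1, B2 + M2 - 1          # per-block linear-conv size
    H = np.fft.rfft2(h, (L1, L2))              # filter DFT, computed once
    out = np.zeros((x.shape[0] + M1 - 1, x.shape[1] + M2 - 1))
    for i in range(0, x.shape[0], B1):
        for j in range(0, x.shape[1], B2):
            block = x[i:i + B1, j:j + B2]
            y = np.fft.irfft2(np.fft.rfft2(block, (L1, L2)) * H, (L1, L2))
            out[i:i + y.shape[0], j:j + y.shape[1]] += y  # overlapping add
    return out

rng = np.random.default_rng(0)
x = rng.standard_normal((16, 16))
h = rng.standard_normal((4, 4))
assert np.allclose(overlap_add_2d(x, h, 8, 8), convolve2d(x, h))
```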
 
===Pictorial Method of Operation===
<math>(N/2) + (N/8) - 1 = (5/8)N-1</math>
 
in both directions. The lighter blue portion corresponds to the overlap between two adjacent convolutions, whereas the darker blue portion corresponds to the overlap between all four convolutions. All of these overlap portions are added together in addition to the convolutions in order to form the combined convolution <math>y(n_1,n_2)</math>.<ref>{{Cite web|url = http://www.eeng.dcu.ie/~ee502/EE502s4.pdf|title = 2D Signal Processing|access-date = November 11, 2015|website = EE502: Digital Signal Processing|publisher = Dublin City University|page = 26}}</ref>
 
==Overlap and Save==
The overlap and save method is very similar to the overlap and add method, with a few notable exceptions. The overlap-add method involves a linear convolution of discrete-time signals, whereas the overlap-save method involves the principle of circular convolution. In addition, the overlap and save method only uses a one-time zero padding of the impulse response, while the overlap-add method involves a zero-padding for every convolution on each input component. Instead of using zero padding to prevent time-___domain aliasing like its overlap-add counterpart, overlap-save simply discards all points of aliasing and saves the previous data in one block to be copied into the convolution for the next block.
 
In one dimension, the performance and storage differences between the two methods are minimal. However, in the multidimensional convolution case, the overlap-save method is preferred over the overlap-add method in terms of speed and storage requirements.<ref>{{Cite journal|title = High-Speed Multidimensional Convolution|last1 = Kim|first1 = Chang|last2 = Strintzis|first2 = Michael|date = May 1980|journal = IEEE Transactions on Pattern Analysis and Machine Intelligence|doi = 10.1109/tpami.1980.4767017|volume = PAMI-2|issue = 3|pages = 269–273}}</ref> Just as in the overlap and add case, the procedure is described using the two-dimensional case but can easily be extended to all multidimensional procedures.
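The save-and-discard principle already appears in one dimension, and can be sketched as follows (an illustrative implementation, not the article's procedure; the block length <code>B</code> is a hypothetical choice):

```python
import numpy as np

def overlap_save(x, h, B):
    """1-D overlap-save with block length B and filter length M.
    Each FFT block reuses the last M-1 samples of the previous block;
    the first M-1 (circularly aliased) outputs of each block are discarded."""
    M = len(h)
    L = B + M - 1                                  # FFT size
    H = np.fft.fft(h, L)                           # one-time zero-padded filter
    x_pad = np.concatenate([np.zeros(M - 1), x, np.zeros(B - 1)])
    out = []
    for start in range(0, len(x), B):
        block = x_pad[start:start + L]
        if len(block) < L:
            block = np.pad(block, (0, L - len(block)))
        y = np.fft.ifft(np.fft.fft(block) * H).real
        out.append(y[M - 1:])                      # discard aliased samples
    return np.concatenate(out)[:len(x) + M - 1]

rng = np.random.default_rng(1)
x = rng.standard_normal(100)
h = rng.standard_normal(8)
assert np.allclose(overlap_save(x, h, B=32), np.convolve(x, h))
```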
 
===Breakdown of Procedure===
<math>l_{Z''} = l_{Y''} + l_{X''} = (M+K-1)\times(N+L-1)</math>
 
This vector length is equivalent to the dimensions of the original matrix output <math>Z</math>, making the conversion back to a matrix a direct transformation. Thus, the vector <math>Z''</math> is converted back to matrix form, which produces the output of the two-dimensional discrete convolution.<ref name=":1">{{Cite journal|url = https://www.researchgate.net/publication/274360447|title = Multidimensional convolution via a 1D convolution algorithm|last1 = Naghizadeh|first1 = Mostafa|last2 = Sacchi|first2 = Mauricio|date = November 2009|journal = The Leading Edge}}</ref>
 
===Filtering on a Helix===
<math> h(n) = -1, 0, ... , 0, -1, 4, -1, 0, ..., 0, -1, 0, ...</math>
 
Notice that the one-dimensional filter contains no leading zeroes, as illustrated by the one-dimensional filtering strip after being unwound. The convolution could have been performed with the entire one-dimensional strip; however, it is less computationally expensive to simply ignore the zero values. In addition, none of these trailing zero values need to be stored in memory, preserving memory resources.<ref name=":2">{{Cite journal|title = Multidimensional recursive filters via a helix|last = Claerbout|first = Jon|date = September 1998|journal = Geophysics|volume = 63|issue = 5|doi = 10.1190/1.1444449|page = 9|citeseerx = 10.1.1.76.1193|bibcode = 1998Geop...63.1532C}}</ref>
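The helix idea can be illustrated numerically: unwinding an image row by row turns the two-dimensional Laplacian filter into a one-dimensional filter whose taps are spaced one image-width apart, as in the strip shown above (a minimal sketch with a hypothetical 8×8 image; the one-dimensional result matches the two-dimensional convolution away from the row boundaries, where the helix wraps around):

```python
import numpy as np
from scipy.signal import convolve2d

rng = np.random.default_rng(0)
N = 8
img = rng.standard_normal((N, N))

# 2-D Laplacian filter and its 2-D convolution as a reference
h2 = np.array([[0, -1, 0], [-1, 4, -1], [0, -1, 0]], float)
ref = convolve2d(img, h2, mode="same")

# Helix: unwind the image row-major; the equivalent 1-D filter places
# the vertical taps one image-width (N samples) apart
h1 = np.zeros(2 * N + 1)
h1[0] = -1          # row above
h1[N - 1] = -1      # left neighbour
h1[N] = 4           # centre
h1[N + 1] = -1      # right neighbour
h1[2 * N] = -1      # row below
out = np.convolve(img.ravel(), h1, mode="full")
helix = out[N:N + N * N].reshape(N, N)  # centre tap sits at offset N

# Agreement everywhere except the first/last columns, where rows wrap
assert np.allclose(helix[1:-1, 1:-1], ref[1:-1, 1:-1])
```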
 
===Applications===
Helix transformations to implement recursive filters via convolution are used in various areas of signal processing. Although frequency-___domain Fourier analysis is effective when systems are stationary, with constant coefficients and periodically-sampled data, it becomes more difficult in unstable systems. The helix transform enables three-dimensional post-stack migration processes that can process data for three-dimensional variations in velocity.<ref name=":2" /> In addition, it can be applied to assist with the problem of implicit three-dimensional wavefield extrapolation.<ref>{{Cite journal|title = Exploring three-dimensional implicit wavefield extrapolation with the helix transform|last1 = Fomel|first1 = Sergey|last2 = Claerbout|first2 = Jon|date = 1997|journal = SEP Report|pages = 43–60|url = https://pdfs.semanticscholar.org/5993/a76fdb37d3b1a3dfba795d6ed596c608cec9.pdf#page=179}}</ref> Other applications include helpful algorithms in seismic data regularization, prediction error filters, and noise attenuation in geophysical digital systems.<ref name=":1" />
 
==Gaussian Convolution==
<math>H(z)=\frac{1}{2r+1} \frac{z^r-z^{-r-1}}{1-z^{-1}}</math>
 
Typically, three, four, or five recursive passes are performed in order to obtain an accurate approximation.<ref name=":0" /> A suggested method for computing ''r'' is then given as the following:<ref>{{Cite journal|title = Efficient synthesis of Gaussian filters by cascaded uniform filters|last = Wells|first = W.M.|date = 1986|journal = IEEE Transactions on Pattern Analysis and Machine Intelligence|doi = 10.1109/TPAMI.1986.4767776|volume = PAMI-8|issue = 2|pages = 234–239}}</ref>
 
<math>\sigma^2=\frac{1}{12}K((2r+1)^2-1)</math> where ''K'' is the number of recursive passes through the filter.
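This approximation can be checked numerically: repeated box filtering of an impulse converges toward a Gaussian whose variance follows the relation above (an illustrative sketch; the choices ''r'' = 2 and ''K'' = 4 are hypothetical):

```python
import numpy as np

def box_filter(x, r):
    """One pass of a uniform (box) filter of width 2r+1."""
    k = np.ones(2 * r + 1) / (2 * r + 1)
    return np.convolve(x, k, mode="same")

# K passes of a width-(2r+1) box filter approximate a Gaussian with
# sigma^2 = (1/12) * K * ((2r+1)^2 - 1)
r, K = 2, 4
sigma = np.sqrt(K * ((2 * r + 1) ** 2 - 1) / 12)

impulse = np.zeros(51)
impulse[25] = 1.0
y = impulse
for _ in range(K):
    y = box_filter(y, r)

# Compare against a sampled, normalized Gaussian of that sigma
n = np.arange(51)
gauss = np.exp(-((n - 25) ** 2) / (2 * sigma ** 2))
gauss /= gauss.sum()
assert np.max(np.abs(y - gauss)) < 0.01
```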
 
===Applications===
Gaussian convolutions are used extensively in signal and image processing. For example, image blurring can be accomplished with Gaussian convolution, where the <math>\sigma</math> parameter controls the strength of the blurring. Higher values thus correspond to a blurrier end result.<ref>{{Cite web|title = Gaussian Blur - Image processing for scientists and engineers, Part 4|url = http://patrick-fuller.com/gaussian-blur-image-processing-for-scientists-and-engineers-part-4/|website = patrick-fuller.com|access-date = 2015-11-12}}</ref> It is also commonly used in [[computer vision]] applications such as [[scale-invariant feature transform]] (SIFT) feature detection.<ref>{{cite journal|last1=Lowe|first1=D.G.|title=Object recognition from local scale-invariant features|journal=Proceedings of the International Conference on Computer Vision|date=1999|volume=2|pages=1150–1157|url=http://www.cs.ubc.ca/~lowe/papers/iccv99.pdf}}</ref>
 
==See also==