Nyquist–Shannon sampling theorem
{{Distinguish|Shannon–Hartley theorem}}
 
The '''Nyquist–Shannon sampling theorem''' is an essential principle for [[digital signal processing]] linking the [[frequency range]] of a signal and the [[sample rate]] required to avoid a type of [[distortion]] called [[aliasing]]. The theorem states that the sample rate must be at least twice the [[Bandwidth (signal processing)|bandwidth]] of the signal to avoid aliasing distortion. In practice, it is used to select [[band-limiting]] filters to keep aliasing distortion below an acceptable amount when an analog signal is sampled or when sample rates are changed within a digital signal processing function.
 
[[File:Bandlimited.svg|thumb|250px|Example of magnitude of the Fourier transform of a bandlimited function]]
Strictly speaking, the theorem only applies to a class of [[mathematical function]]s having a [[continuous Fourier transform|Fourier transform]] that is zero outside of a finite region of frequencies. Intuitively, we expect that when one reduces a continuous function to a discrete sequence and [[interpolation|interpolates]] back to a continuous function, the fidelity of the result depends on the density (or [[Sampling (signal processing)|sample rate]]) of the original samples. The sampling theorem introduces the concept of a sample rate that is sufficient for perfect fidelity for the class of functions that are [[bandlimiting|band-limited]] to a given bandwidth, such that no actual information is lost in the sampling process. It expresses the sufficient sample rate in terms of the bandwidth for the class of functions. The theorem also leads to a formula for perfectly reconstructing the original continuous-time function from the samples.
 
Perfect reconstruction may still be possible when the sample-rate criterion is not satisfied, provided other constraints on the signal are known (see {{section link||Sampling of non-baseband signals}} below and [[compressed sensing]]). In some cases (when the sample-rate criterion is not satisfied), utilizing additional constraints allows for approximate reconstructions. The fidelity of these reconstructions can be verified and quantified utilizing [[Bochner's theorem]].<ref>{{cite arXiv |last1=Nemirovsky |first1=Jonathan |last2=Shimron |first2=Efrat |title=Utilizing Bochners Theorem for Constrained Evaluation of Missing Fourier Data |eprint=1506.03300 |class=physics.med-ph |date=2015}}</ref>
 
The name ''Nyquist–Shannon sampling theorem'' honours [[Harry Nyquist]] and [[Claude Shannon]], but the theorem was also previously discovered by [[E. T. Whittaker]] (published in 1915), and Shannon cited Whittaker's paper in his work. The theorem is thus also known by the names ''Whittaker–Shannon sampling theorem'', ''Whittaker–Shannon'', and ''Whittaker–Nyquist–Shannon'', and may also be referred to as the ''cardinal theorem of interpolation''.
A sufficient sample rate is therefore anything larger than <math>2B</math> samples per second. Equivalently, for a given sample rate <math>f_s</math>, perfect reconstruction is guaranteed possible for a bandlimit <math>B < f_s/2</math>.
 
When the bandlimit is too high (or there is no bandlimit), the reconstruction exhibits imperfections known as [[aliasing]]. Modern statements of the theorem are sometimes careful to explicitly state that <math>x(t)</math> must contain no [[Sine wave|sinusoidal]] component at exactly frequency <math>B,</math> or that <math>B</math> must be strictly less than one half the sample rate. The threshold <math>2B</math> is called the [[Nyquist rate]] and is an attribute of the continuous-time input <math>x(t)</math> to be sampled. The sample rate must exceed the Nyquist rate for the samples to suffice to represent <math>x(t).</math> The threshold <math>f_s/2</math> is called the [[Nyquist frequency]] and is an attribute of the [[Analog-to-digital converter|sampling equipment]]. All meaningful frequency components of the properly sampled <math>x(t)</math> exist below the Nyquist frequency. The condition described by these inequalities is called the ''Nyquist criterion'', or sometimes the ''Raabe condition''. The theorem is also applicable to functions of other domains, such as space, in the case of a digitized image. The only change, in the case of other domains, is the units of measure attributed to <math>t,</math> <math>f_s,</math> and <math>B.</math>
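
In code, the distinction between the two thresholds can be made explicit. The following minimal sketch (illustrative only; the function name and the example rates are not from any standard) evaluates the Nyquist criterion for a given bandlimit and sample rate:

<syntaxhighlight lang="python">
# Minimal sketch of the Nyquist criterion; the function name and the
# example rates below are illustrative, not taken from any standard.
def satisfies_nyquist_criterion(bandlimit_hz: float, sample_rate_hz: float) -> bool:
    """True when the sample rate exceeds the Nyquist rate 2B, i.e. when
    the bandlimit B lies strictly below the Nyquist frequency fs/2."""
    nyquist_rate = 2 * bandlimit_hz         # attribute of the signal
    nyquist_frequency = sample_rate_hz / 2  # attribute of the sampler
    return bandlimit_hz < nyquist_frequency and sample_rate_hz > nyquist_rate

print(satisfies_nyquist_criterion(20_000, 44_100))  # True: 44.1 kHz sampling of 20 kHz audio
print(satisfies_nyquist_criterion(20_000, 40_000))  # False: equality is not sufficient
</syntaxhighlight>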
 
[[File:Sinc function (normalized).svg|thumb|right|250px|The normalized [[sinc function]]: {{nowrap|sin(π{{var|x}}) / (π{{var|x}})}} ... showing the central peak at {{nowrap|1={{var|x}} = 0}}, and zero-crossings at the other integer values of {{var|x}}.]]
 
The symbol <math>T \triangleq 1/f_s</math> is customarily used to represent the interval between adjacent samples and is called the ''sample period'' or ''sampling interval''. The samples of function <math>x(t)</math> are commonly denoted by <math>x[n] \triangleq T\cdot x(nT)</math><ref>{{cite book |last1=Ahmed |first1=N. |last2=Rao |first2=K.R. |title=Orthogonal Transforms for Digital Signal Processing |publisher=Springer-Verlag |edition=1 |date=July 10, 1975 |___location=Berlin Heidelberg New York |language=English |url=https://books.google.com/books?id=F-nvCAAAQBAJ |doi=10.1007/978-3-642-45450-9 |isbn=9783540065562}}</ref> (alternatively <math>x_n</math> in older signal processing literature), for all integer values of <math>n.</math> The multiplier <math>T</math> is a result of the transition from continuous time to discrete time (see [[Discrete-time Fourier transform#Relation to Fourier Transform]]), and it is needed to preserve the energy of the signal as <math>T</math> varies.
 
A mathematically ideal way to interpolate the sequence involves the use of [[sinc function]]s. Each sample in the sequence is replaced by a sinc function, centered on the time axis at the original ___location of the sample <math>nT,</math> with the amplitude of the sinc function scaled to the sample value, <math>x(nT).</math> Subsequently, the sinc functions are summed into a continuous function. A mathematically equivalent method uses the [[Dirac comb#Sampling and aliasing|Dirac comb]] and proceeds by [[Convolution|convolving]] one sinc function with a series of [[Dirac delta]] pulses, weighted by the sample values. Neither method is numerically practical. Instead, some type of approximation of the sinc functions, finite in length, is used. The imperfections attributable to the approximation are known as ''interpolation error''.
 
Practical [[digital-to-analog converter]]s produce neither scaled and delayed [[sinc function]]s, nor ideal [[Dirac pulse]]s. Instead they produce a [[Step function|piecewise-constant]] sequence of scaled and delayed [[rectangular function|rectangular pulses]] (the [[zero-order hold]]), usually followed by a [[lowpass filter]] (called an "anti-imaging filter") to remove spurious high-frequency replicas (images) of the original baseband signal.
 
==Aliasing==
When <math>x(t)</math> is a function with a [[Fourier transform]] <math>X(f)</math>''':'''
 
<math display="block">X(f)\ \triangleq\ \int_{-\infty}^{\infty} x(t) \ e^{- i 2 \pi f t} \ {\rm d}t,</math>
 
then the samples, <math>x[n]</math>, of <math>x(t)</math> are sufficient to create a [[periodic summation]] of <math>X(f)</math> (see [[Discrete-time Fourier transform#Relation to Fourier Transform]])''':'''
 
{{Equation box 1|title=
|indent=: |cellpadding= 0 |border= 0 |background colour=white
|equation = {{NumBlk||
<math display="block">X_{1/T}(f)\ \triangleq\ \sum_{k=-\infty}^{\infty} X\left(f - k/T\right) = \sum_{n=-\infty}^{\infty} x[n]\ e^{-i 2\pi f n T},</math>
|{{EquationRef|Eq.1}}}}
}}
 
[[File:AliasedSpectrum.png|thumb|upright=1.8|right|<math>X(f)</math> (top blue) and <math>X_A(f)</math> (bottom blue) are continuous Fourier transforms of two {{em|different}} functions, <math>x(t)</math> and <math>x_A(t)</math> (not shown). When the functions are sampled at rate <math>f_s</math>, the images (green) are added to the original transforms (blue) when one examines the discrete-time Fourier transforms (DTFT) of the sequences. In this hypothetical example, the DTFTs are identical, which means {{em|the sampled sequences are identical}}, even though the original continuous pre-sampled functions are not. If these were audio signals, <math>x(t)</math> and <math>x_A(t)</math> might not sound the same. But their samples (taken at rate <math>f_s</math>) are identical and would lead to identical reproduced sounds; thus <math>x_A(t)</math> is an alias of <math>x(t)</math> at this sample rate.]]
 
which is a periodic function and its equivalent representation as a [[Fourier series]], whose coefficients are <math>x[n]</math>. This function is also known as the [[discrete-time Fourier transform]] (DTFT) of the sample sequence.
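
{{EquationNote|Eq.1}} can also be checked numerically. The sketch below (an illustration, not part of the standard presentation) uses the Gaussian transform pair <math>x(t) = e^{-\pi t^2} \leftrightarrow X(f) = e^{-\pi f^2}</math>, whose tails decay fast enough that both infinite sums can be safely truncated; the sample interval and truncation limits are arbitrary choices:

<syntaxhighlight lang="python">
import numpy as np

# Numerical check of Eq.1 for the Gaussian pair x(t) = exp(-pi t^2),
# X(f) = exp(-pi f^2); T and the truncation limits are arbitrary.
T = 0.25
n = np.arange(-200, 201)   # time-___domain samples (tails are negligible)
k = np.arange(-20, 21)     # spectral images (tails are negligible)

for f in (-1.0, -0.5, 0.0, 0.7):
    periodic_sum = np.sum(np.exp(-np.pi * (f - k / T) ** 2))          # left side of Eq.1
    dtft = np.sum(T * np.exp(-np.pi * (n * T) ** 2)                   # x[n] = T*x(nT)
                  * np.exp(-2j * np.pi * f * n * T))                  # right side of Eq.1
    print(abs(periodic_sum - dtft))   # ~1e-16: both sides agree to machine precision
</syntaxhighlight>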
 
As depicted, copies of <math>X(f)</math> are shifted by multiples of the sampling rate <math>f_s = 1/T</math> and combined by addition. For a band-limited function <math>(X(f) = 0, \text{ for all } |f| \ge B)</math> and sufficiently large <math>f_s,</math> it is possible for the copies to remain distinct from each other. But if the Nyquist criterion is not satisfied, adjacent copies overlap, and it is not possible in general to discern an unambiguous <math>X(f).</math> Any frequency component above <math>f_s/2</math> is indistinguishable from a lower-frequency component, called an ''alias'', associated with one of the copies. In such cases, the customary interpolation techniques produce the alias, rather than the original component. When the sample rate is predetermined by other considerations (such as an industry standard), <math>x(t)</math> is usually filtered to reduce its high frequencies to acceptable levels before it is sampled. The type of filter required is a [[lowpass filter]], and in this application it is called an [[anti-aliasing filter]].
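
The ambiguity can be demonstrated directly: sampled at rate <math>f_s,</math> a sinusoid at frequency <math>f_s - f_0</math> produces exactly the same samples as one at <math>f_0.</math> A minimal sketch, with arbitrary example frequencies:

<syntaxhighlight lang="python">
import numpy as np

# Two cosines, one below and one above the Nyquist frequency fs/2,
# yield identical samples at rate fs; the frequencies are arbitrary.
fs = 1000.0          # sample rate, Hz
f0 = 100.0           # in-band tone
f_alias = fs - f0    # 900 Hz: above fs/2, an alias of 100 Hz
t = np.arange(32) / fs

print(np.allclose(np.cos(2 * np.pi * f0 * t),
                  np.cos(2 * np.pi * f_alias * t)))  # True: the samples cannot tell them apart
</syntaxhighlight>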
 
[[File:ReconstructFilter.svg|thumb|right|upright=1.8|Spectrum, <math>X_{1/T}(f)</math>, of a properly sampled bandlimited signal (blue) and the adjacent DTFT images (green) that do not overlap. A ''brick-wall'' low-pass filter, <math>H(f)</math>, removes the images, leaves the original spectrum, <math>X(f)</math>, and recovers the original signal from its samples.]]
[[File:Nyquist sampling.gif|upright=1.8|thumb|right|The figure on the left shows a function (in gray/black) being sampled and reconstructed (in gold) at steadily increasing sample-densities, while the figure on the right shows the frequency spectrum of the gray/black function, which does not change. The highest frequency in the spectrum is half the width of the entire spectrum. The width of the steadily-increasing pink shading is equal to the sample-rate. When it encompasses the entire frequency spectrum it is twice as large as the highest frequency, and that is when the reconstructed waveform matches the sampled one.]]
 
==Derivation as a special case of Poisson summation==
When there is no overlap of the copies (also known as "images") of <math>X(f)</math>, the <math>k=0</math> term of {{EquationNote|Eq.1}} can be recovered by the product:
 
<math display="block">X(f) = H(f) \cdot X_sX_{1/T}(f),</math>
 
where:
The sampling theorem is proved since <math>X(f)</math> uniquely determines <math>x(t)</math>.
 
All that remains is to derive the formula for reconstruction. <math>H(f)</math> need not be precisely defined in the region <math>[B,\ f_s-B]</math> because <math>X_{1/T}(f)</math> is zero in that region. However, the worst case is when <math>B=f_s/2,</math> the Nyquist frequency. A function that is sufficient for that and all less severe cases is''':'''
 
<math display="block">H(f) = \mathrm{rect} \left(\frac{f}{f_s} \right) = \begin{cases}1 & |f| < \frac{f_s}{2} \\ 0 & |f| > \frac{f_s}{2}, \end{cases}</math>
 
where <math>\mathrm{rect}</math> is the [[rectangular function]]. Therefore:{{efn-ua|group=bottom|The sinc function follows from rows 202 and 102 of the [[Table of Fourier transforms|transform tables]].}}

<math display="block">\begin{align}
X(f) &= \mathrm{rect} \left(\frac{f}{f_s} \right) \cdot X_{1/T}(f) \\
&= \mathrm{rect}(Tf)\cdot \sum_{n=-\infty}^{\infty} T\cdot x(nT)\ e^{-i 2\pi n T f} && \text{(from Eq.1, above)} \\
&= \sum_{n=-\infty}^{\infty} x(nT)\cdot \underbrace{T\cdot \mathrm{rect}(Tf) \cdot e^{-i 2\pi n T f}}_{
\mathcal{F}\left\{
\mathrm{sinc} \left( \frac{t - nT}{T} \right)
\right\}}.
\end{align}</math>
 
The inverse transform of both sides produces the [[Whittaker–Shannon interpolation formula]]:
 
:<math display="block">x(t) = \sum_{n=-\infty}^{\infty} x(nT)\cdot \mathrm{sinc} \left( \frac{t - nT}{T}\right),</math>
 
which shows how the samples, <math>x(nT)</math>, can be combined to reconstruct <math>x(t)</math>.
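
A direct, truncated implementation of this formula can serve as a sanity check. The following sketch (all parameter values are arbitrary; the finite sum introduces the ''interpolation error'' discussed above) reconstructs a band-limited test tone from its samples:

<syntaxhighlight lang="python">
import numpy as np

# Truncated Whittaker-Shannon interpolation; parameters are illustrative.
# np.sinc is the normalized sinc, sin(pi u)/(pi u).
def reconstruct(samples, T, t):
    """x(t) ~= sum_n x(nT) * sinc((t - nT)/T), truncated to len(samples) terms."""
    n = np.arange(len(samples))
    return np.array([np.sum(samples * np.sinc((ti - n * T) / T)) for ti in t])

fs, B = 100.0, 10.0              # sample rate well above the Nyquist rate 2B
T = 1 / fs
samples = np.cos(2 * np.pi * B * np.arange(512) * T)   # band-limited test signal

t = np.linspace(2.0, 3.0, 11)    # evaluation points away from the edges
error = reconstruct(samples, T, t) - np.cos(2 * np.pi * B * t)
print(np.max(np.abs(error)))     # small but nonzero: the truncation's interpolation error
</syntaxhighlight>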
==Shannon's original proof==
Poisson shows that the Fourier series in {{EquationNote|Eq.1}} produces the periodic summation of <math>X(f)</math>, regardless of <math>f_s</math> and <math>B</math>. Shannon, however, only derives the series coefficients for the case <math>f_s=2B</math>. Virtually quoting Shannon's original paper:
 
{{quote|
Let <math>X(\omega)</math> be the spectrum of <math>x(t)</math>. Then

<math display="block">x(t) = {1 \over 2\pi} \int_{-\infty}^{\infty} X(\omega) e^{i\omega t}\;{\rm d}\omega = {1 \over 2\pi} \int_{-2\pi B}^{2\pi B} X(\omega) e^{i\omega t}\;{\rm d}\omega,</math>

because <math>X(\omega)</math> is assumed to be zero outside the band <math>\left|\tfrac{\omega}{2\pi}\right| < B</math>. If we let <math>t = \tfrac{n}{2B}</math>, where <math>n</math> is any positive or negative integer, we obtain:

{{Equation box 1|title=
|indent=: |cellpadding= 0 |border= 0 |background colour=white
|equation = {{NumBlk|:|
{{anchor|math_Eq.2}}<math display="block">x \left(\tfrac{n}{2B} \right) = {1 \over 2\pi} \int_{-2\pi B}^{2\pi B} X(\omega) e^{i\omega {n \over {2B}}}\;{\rm d}\omega.</math>
|{{EquationRef|Eq.2}}}}
}}

On the left are values of <math>x(t)</math> at the sampling points. The integral on the right will be recognized as essentially{{efn|group=proof|Multiplying both sides of {{EquationNote|Eq.2}} by <math>T = 1/2B</math> produces, on the left, the scaled sample values <math>(T\cdot x(nT))</math> in Poisson's formula ({{EquationNote|Eq.1}}), and, on the right, the actual formula for Fourier expansion coefficients.}} the <math>n^{th}</math> coefficient in a Fourier-series expansion of the function <math>X(\omega)</math>, taking the interval <math>-B</math> to <math>B</math> as a fundamental period. This means that the values of the samples <math>x(n/2B)</math> determine the Fourier coefficients in the series expansion of <math>X(\omega)</math>. Thus they determine <math>X(\omega)</math>, since <math>X(\omega)</math> is zero for frequencies greater than <math>B</math>, and for lower frequencies <math>X(\omega)</math> is determined if its Fourier coefficients are determined. But <math>X(\omega)</math> determines the original function <math>x(t)</math> completely, since a function is determined if its spectrum is known. Therefore the original samples determine the function <math>x(t)</math> completely.
}}
 
Shannon's proof of the theorem is complete at that point, but he goes on to discuss reconstruction via [[sinc function]]s, what we now call the [[Whittaker–Shannon interpolation formula]] as discussed above. He does not derive or prove the properties of the sinc function, as the Fourier pair relationship between the [[rectangular function|rect]] (the rectangular function) and sinc functions was well known by that time.<ref>{{cite book |last1=Campbell |first1=George |last2=Foster |first2=Ronald |title=Fourier Integrals for Practical Applications |date=1942 |publisher=Bell Telephone System Laboratories |___location=New York}}</ref>
 
{{blockquote|
Let <math>x_n</math> be the <math>n^{th}</math> sample. Then the function <math>x(t)</math> is represented by:

<math display="block">x(t) = \sum_{n=-\infty}^{\infty}x_n{\sin(\pi(2Bt-n)) \over \pi(2Bt-n)}.</math>
}}
 
 
===Notes===
<!---Bug report: The group=proof tag attracts the intended footnote, but it also attracts one of the {{efn|group=bottom}} footnotes. The work-around is to use {{efn}} for one type and {{efn-ua}} for the other type.-->
{{notelist|group=proof}}
 
 
The sampling theorem also applies to the post-processing of digital images, such as upsampling or downsampling. Effects of aliasing, blurring, and sharpening can be adjusted with digital filtering implemented in software, which necessarily follows the theoretical principles.
 
[[File:CriticalFrequencyAliasing.svg|thumb|right|A family of sinusoids at the critical frequency, all having the same sample sequences of alternating +1 and –1. That is, they all are aliases of each other, even though their frequency is not above half the sample rate.]]
 
==Critical frequency==
To illustrate the necessity of <math>f_s>2B</math>, consider the family of sinusoids generated by different values of <math>\theta</math> in this formula:
 
:<math display="block">x(t) = \frac{\cos(2 \pi B t + \theta )}{\cos(\theta )}\ = \ \cos(2 \pi B t) - \sin(2 \pi B t)\tan(\theta ), \quad -\pi/2 < \theta < \pi/2.</math>
 
With <math>f_s=2B</math> or equivalently <math>T=1/2B</math>, the samples are given by:
 
:<math display="block">x(nT) = \cos(\pi n) - \underbrace{\sin(\pi n)}_{0}\tan(\theta ) = (-1)^n</math>
 
{{em|regardless of the value of <math>\theta</math>}}. That sort of ambiguity is the reason for the ''strict'' inequality of the sampling theorem's condition.
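
The ambiguity is easy to reproduce numerically; in the sketch below (illustrative values), every choice of <math>\theta</math> yields the same alternating sample sequence:

<syntaxhighlight lang="python">
import numpy as np

# Sampling x(t) = cos(2*pi*B*t + theta)/cos(theta) at exactly fs = 2B:
# the samples are (-1)^n for every theta, so theta is unrecoverable.
B = 1.0
T = 1 / (2 * B)
n = np.arange(8)

for theta in (0.0, 0.5, 1.0, -1.2):
    x = np.cos(2 * np.pi * B * n * T + theta) / np.cos(theta)
    print(np.round(x, 12))   # always [ 1. -1.  1. -1. ...]
</syntaxhighlight>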
 
==Sampling of non-baseband signals==
As discussed by Shannon:<ref name="Shannon49"/>
 
{{blockquote|A similar result is true if the band does not start at zero frequency but at some higher value, and can be proved by a linear translation (corresponding physically to [[single-sideband modulation]]) of the zero-frequency case. In this case the elementary pulse is obtained from <math>\sin(x)/x</math> by single-side-band modulation.}}
 
That is, a sufficient no-loss condition for sampling [[signal (information theory)|signal]]s that do not have [[baseband]] components exists that involves the ''width'' of the non-zero frequency interval as opposed to its highest frequency component. See ''[[Sampling (signal processing)|sampling]]'' for more details and examples.
 
For example, in order to sample [[FM broadcasting|FM radio]] signals in the frequency range of 100–102&nbsp;[[megahertz|MHz]], it is not necessary to sample at 204&nbsp;MHz (twice the upper frequency), but rather it is sufficient to sample at 4&nbsp;MHz (twice the width of the frequency interval). (Reconstruction is not usually the goal with sampled IF or RF signals. Rather, the sample sequence can be treated as ordinary samples of the signal frequency-shifted to near baseband, and digital demodulation can proceed on that basis.)
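
The effect can be illustrated with a single tone from that band. In the sketch below (the tone frequency is an arbitrary example), a 101.3&nbsp;MHz sinusoid sampled at 4&nbsp;MHz produces exactly the same samples as its frequency-shifted image at 1.3&nbsp;MHz:

<syntaxhighlight lang="python">
import numpy as np

# Undersampling a bandpass tone: 101.3 MHz sampled at 4 MHz gives the
# same samples as a 1.3 MHz tone (101.3 MHz = 25 * 4 MHz + 1.3 MHz).
fs = 4e6
f_rf = 101.3e6                        # arbitrary tone inside the 100-102 MHz band
f_if = f_rf - round(f_rf / fs) * fs   # image near baseband
t = np.arange(64) / fs

print(f_if)                           # 1300000.0 (1.3 MHz)
print(np.allclose(np.cos(2 * np.pi * f_rf * t),
                  np.cos(2 * np.pi * f_if * t)))   # True: identical samples
</syntaxhighlight>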
 
Using the bandpass condition, where <math>X(f) = 0</math> for all <math>|f|</math> outside the open band of frequencies
<math display="block"> \left(\frac{N}2 f_\mathrm{s}, \frac{N+1}2 f_\mathrm{s}\right), </math>
for some nonnegative integer <math>N</math> and some sampling frequency <math>f_\mathrm{s}</math>, it is possible to find an interpolation that reproduces the signal. There may be several combinations of <math>N</math> and <math>f_\mathrm{s}</math> that work, including the normal baseband condition as the case <math>N=0.</math>

The corresponding interpolation filter to be convolved with the samples is the impulse response of an ideal "brick-wall" [[bandpass filter]] (as opposed to the ideal [[brick-wall filter|brick-wall]] [[lowpass filter]] used above) with cutoffs at the upper and lower edges of the specified band, which is the difference between a pair of lowpass impulse responses:
 
<math display="block">(N+1)\,\operatorname{sinc} \left(\frac{(N+1)t}T\right) - N\,\operatorname{sinc}\left( \frac{Nt}T \right).</math>
 
This function is 1 at <math>t=0</math> and zero at any other multiple of <math>T</math> (as well as at other times if <math>N>0</math>).
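
That sample-point property can be confirmed numerically; a small sketch with arbitrary <math>N</math> and <math>T</math>:

<syntaxhighlight lang="python">
import numpy as np

# The bandpass interpolation kernel as a difference of two lowpass (sinc)
# impulse responses; N and T are arbitrary. np.sinc is the normalized sinc.
def bandpass_kernel(t, T, N):
    return (N + 1) * np.sinc((N + 1) * t / T) - N * np.sinc(N * t / T)

T, N = 1.0, 3
t = np.arange(-4, 5) * T                       # multiples of T
print(np.round(bandpass_kernel(t, T, N), 12))  # 1 at t = 0, 0 at the other multiples
</syntaxhighlight>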
 
Other generalizations, for example to signals occupying multiple non-contiguous bands, are possible as well. Even the most generalized form of the sampling theorem does not have a provably true converse. That is, one cannot conclude that information is necessarily lost just because the conditions of the sampling theorem are not satisfied; from an engineering perspective, however, it is generally safe to assume that if the sampling theorem is not satisfied then information will most likely be lost.
The general theory for non-baseband and nonuniform samples was developed in 1967 by [[Henry Landau]].<ref>{{cite journal |first=H. J. |last=Landau |title=Necessary density conditions for sampling and interpolation of certain entire functions |journal=Acta Mathematica |volume=117 |issue=1 |pages=37–52 |year=1967 |doi=10.1007/BF02395039 |doi-access=free }}</ref> He proved that the average sampling rate (uniform or otherwise) must be twice the ''occupied'' bandwidth of the signal, assuming it is ''a priori'' known what portion of the spectrum was occupied.
 
In the late 1990s, this work was partially extended to cover signals for which the amount of occupied bandwidth is known, but the actual occupied portion of the spectrum is unknown.<ref>For example, {{cite thesis |first=P. |last=Feng |title=Universal minimum-rate sampling and spectrum-blind reconstruction for multiband signals |degree=Ph.D. |institution=University of Illinois at Urbana-Champaign |year=1997}}</ref> In the 2000s, a complete theory was developed (see the section [[Nyquist–Shannon sampling theorem#Sampling below the Nyquist rate under additional restrictions|Sampling below the Nyquist rate under additional restrictions]] below) using [[compressed sensing]]. In particular, the theory, using signal processing language, is described in a 2009 paper by Mishali and Eldar.<ref>{{cite journal |citeseerx=10.1.1.154.4255 |title=Blind Multiband Signal Reconstruction: Compressed Sensing for Analog Signals |first1=Moshe |last1=Mishali |first2=Yonina C. |last2=Eldar |journal=IEEE Trans. Signal Process. |date=March 2009 |volume=57 |issue=3 |pages=993–1009 |doi=10.1109/TSP.2009.2012791 |bibcode=2009ITSP...57..993M |s2cid=2529543}}</ref> They show, among other things, that if the frequency locations are unknown, then it is necessary to sample at at least twice the rate required by the Nyquist criterion; in other words, not knowing the ___location of the [[spectrum]] costs at least a factor of 2 in sampling rate. Note that minimum sampling requirements do not necessarily guarantee [[Numerical stability|stability]].
 
==Sampling below the Nyquist rate under additional restrictions==
The Nyquist–Shannon sampling theorem provides a [[necessary and sufficient condition|sufficient condition]] for the sampling and reconstruction of a band-limited signal. When reconstruction is done via the [[Whittaker–Shannon interpolation formula]], the Nyquist criterion is also a necessary condition to avoid aliasing, in the sense that if samples are taken at a slower rate than twice the band limit, then there are some signals that will not be correctly reconstructed. However, if further restrictions are imposed on the signal, then the Nyquist criterion may no longer be a [[necessary and sufficient condition|necessary condition]].
 
A non-trivial example of exploiting extra assumptions about the signal is given by the recent field of [[compressed sensing]], which allows for full reconstruction with a sub-Nyquist sampling rate. Specifically, this applies to signals that are sparse (or compressible) in some ___domain. As an example, compressed sensing deals with signals that may have a low overall bandwidth (say, the ''effective'' bandwidth <math>EB</math>) but whose frequency locations are unknown, rather than all together in a single band, so that the [[Nyquist–Shannon sampling theorem#Sampling of non-baseband signals|passband technique]] does not apply. In other words, the frequency spectrum is sparse. Traditionally, the necessary sampling rate is thus <math>2B.</math> Using compressed sensing techniques, the signal could be perfectly reconstructed if it is sampled at a rate slightly lower than <math>2EB.</math> With this approach, reconstruction is no longer given by a formula, but instead by the solution to a [[Linear programming|linear optimization program]].
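
The flavor of such reconstructions can be conveyed by a toy experiment. In the sketch below (entirely illustrative: the dictionary, the problem sizes, and the greedy orthogonal matching pursuit used in place of the linear program are the author's choices, not the method of any cited paper), a signal that is sparse in frequency is recovered from far fewer time samples than its length:

<syntaxhighlight lang="python">
import numpy as np

# Toy spectrum-blind recovery: a K-sparse-in-frequency signal of length N
# is recovered from M << N random time samples by greedy orthogonal
# matching pursuit (standing in for the linear program mentioned above).
rng = np.random.default_rng(0)
N, M, K = 256, 64, 3

F = np.exp(2j * np.pi * np.outer(np.arange(N), np.arange(N)) / N)  # Fourier dictionary
support_true = rng.choice(N, size=K, replace=False)
x = F[:, support_true] @ (rng.standard_normal(K) + 1j * rng.standard_normal(K))

rows = rng.choice(N, size=M, replace=False)   # random sub-Nyquist sample times
A, y = F[rows], x[rows]

support, residual = [], y.copy()
for _ in range(K):
    # pick the atom most correlated with the residual, then re-fit by least squares
    support.append(int(np.argmax(np.abs(A.conj().T @ residual))))
    coef, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
    residual = y - A[:, support] @ coef

print(sorted(support) == sorted(int(i) for i in support_true))  # typically True
</syntaxhighlight>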
 
Another example where sub-Nyquist sampling is optimal arises under the additional constraint that the samples are quantized in an optimal manner, as in a combined system of sampling and optimal [[lossy compression]].<ref>{{cite journal|last1=Kipnis|first1=Alon|last2=Goldsmith|first2=Andrea J.|last3=Eldar|first3=Yonina C.|last4=Weissman|first4=Tsachy|title=Distortion rate function of sub-Nyquist sampled Gaussian sources|journal=IEEE Transactions on Information Theory|date=January 2016|volume=62|pages=401–429|doi=10.1109/tit.2015.2485271|arxiv=1405.5329|s2cid=47085927 }}</ref> This setting is relevant in cases where the joint effect of sampling and [[Quantization (signal processing)|quantization]] is to be considered, and can provide a lower bound for the minimal reconstruction error that can be attained in sampling and quantizing a [[random signal]]. For stationary Gaussian random signals, this lower bound is usually attained at a sub-Nyquist sampling rate, indicating that sub-Nyquist sampling is optimal for this signal model under optimal [[Quantization (signal processing)|quantization]].<ref>{{cite journal |last1=Kipnis |first1=Alon |last2=Eldar |first2=Yonina |last3=Goldsmith |first3=Andrea |title=Analog-to-Digital Compression: A New Paradigm for Converting Signals to Bits |journal=IEEE Signal Processing Magazine |date=26 April 2018 |volume=35 |issue=3 |pages=16–39 |doi=10.1109/MSP.2017.2774249 |arxiv=1801.06718 |bibcode=2018ISPM...35c..16K |s2cid=13693437 }}</ref>
 
==Historical background==
The sampling theorem was implied by the work of [[Harry Nyquist]] in 1928,<ref>{{cite journal |last=Nyquist |first=Harry |author-link=Harry Nyquist |title=Certain topics in telegraph transmission theory |journal=Transactions of the AIEE |volume=47 |issue=2 |pages=617–644 |date=April 1928 |doi=10.1109/t-aiee.1928.5055024 |bibcode=1928TAIEE..47..617N}} [https://web.archive.org/web/20130926031230/http://www.ieee.org/publications_standards/publications/proceedings/nyquist.pdf Reprint as classic paper] in: ''Proceedings of the IEEE'', Vol. 90, No. 2, February 2002.</ref> in which he showed that up to <math>2B</math> independent pulse samples could be sent through a system of bandwidth <math>B</math>; but he did not explicitly consider the problem of sampling and reconstruction of continuous signals. About the same time, [[Karl Küpfmüller]] showed a similar result<ref>{{cite journal |first=Karl |last=Küpfmüller |title=Über die Dynamik der selbsttätigen Verstärkungsregler |journal=Elektrische Nachrichtentechnik |volume=5 |issue=11 |pages=459–467 |year=1928 |language=de}} [http://ict.open.ac.uk/classics/2.pdf (English translation 2005)] {{Webarchive|url=https://web.archive.org/web/20190521021624/http://ict.open.ac.uk/classics/2.pdf |date=2019-05-21}}.</ref> and discussed the sinc-function impulse response of a band-limiting filter, via its integral, the step-response [[sine integral]]; this bandlimiting and reconstruction filter that is so central to the sampling theorem is sometimes referred to as a ''Küpfmüller filter'' (but seldom so in English).
 
The sampling theorem, essentially a [[duality (mathematics)|dual]] of Nyquist's result, was proved by [[Claude E. Shannon]].<ref name="Shannon49"/> The mathematician [[Edmund Taylor Whittaker]] published similar results in 1915,<ref>{{cite journal |last=Whittaker |first=E. T. |author-link=E. T. Whittaker |title=On the Functions Which are Represented by the Expansions of the Interpolation Theory |journal=Proceedings of the Royal Society of Edinburgh |volume=35 |pages=181–194 |date=1915 |doi=10.1017/s0370164600017806 |url=https://zenodo.org/record/1428702}} ({{lang|de|"Theorie der Kardinalfunktionen"}}).</ref> as did his son [[John Macnaghten Whittaker]] in 1935,<ref>{{cite book |last=Whittaker |first=J. M. |author-link=J. M. Whittaker |title=Interpolatory Function Theory |url=https://archive.org/details/in.ernet.dli.2015.219870 |publisher=Cambridge University Press |date=1935 |___location=Cambridge, England}}.</ref> and [[Dennis Gabor|Gabor]] in 1946 ("Theory of communication").
 
In 1948 and 1949, Claude E. Shannon published the two revolutionary articles in which he founded [[information theory]].<ref>{{cite journal |ref=refShannon48 |last=Shannon |first=Claude E. |author-link=Claude Shannon |title=A Mathematical Theory of Communication |journal=Bell System Technical Journal |volume=27 |issue=3 |pages=379–423 |date=July 1948 |doi=10.1002/j.1538-7305.1948.tb01338.x |hdl=11858/00-001M-0000-002C-4317-B |hdl-access=free}}.</ref><ref>{{cite journal |ref=refShannon48oct |last=Shannon |first=Claude E. |author-link=Claude Shannon |title=A Mathematical Theory of Communication |journal=Bell System Technical Journal |volume=27 |issue=4 |pages=623–666 |date=October 1948 |doi=10.1002/j.1538-7305.1948.tb00917.x |hdl=11858/00-001M-0000-002C-4314-2 |hdl-access=free}}</ref><ref name="Shannon49"/> In "[[#refShannon48oct|A Mathematical Theory of Communication]]", the sampling theorem is formulated as "Theorem 13": Let <math>f(t)</math> contain no frequencies over <math>W.</math> Then

<math display="block">f(t) = \sum_{n=-\infty}^\infty X_n \frac{\sin \pi(2Wt - n)}{\pi(2Wt - n)},</math>
where <math>X_n = f\left(\frac n {2W} \right).</math>

It was not until these articles were published that the theorem known as "Shannon's sampling theorem" became common property among communication engineers, although Shannon himself writes that this is a fact which is common knowledge in the communication art.{{efn-ua|group=bottom|[[#refShannon49|Shannon 1949]], p. 448.}} A few lines further on, however, he adds: "but in spite of its evident importance, [it] seems not to have appeared explicitly in the literature of [[communication theory]]". Despite his sampling theorem being published at the end of the 1940s, Shannon had derived his sampling theorem as early as 1940.<ref>{{Cite conference |last1=Stanković |first1=Radomir S. |last2=Astola |first2=Jaakko T. |last3=Karpovsky |first3=Mark G. |date=September 2006 |title=Some Historic Remarks On Sampling Theorem |url=https://sites.bu.edu/mark/files/2018/02/196.pdf |conference=Proceedings of the 2006 International TICSP Workshop on Spectral Methods and Multirate Signal Processing}}</ref>
 
===Other discoverers===
Others who have independently discovered or played roles in the development of the sampling theorem have been discussed in several historical articles, for example, by Jerri<ref>{{cite journal |last=Jerri |first=Abdul |author-link=Abdul Jerri |title=The Shannon Sampling Theorem—Its Various Extensions and Applications: A Tutorial Review |journal=Proceedings of the IEEE |volume=65 |issue=11 |pages=1565–1596 |date=November 1977 |doi=10.1109/proc.1977.10771 |bibcode=1977IEEEP..65.1565J |s2cid=37036141}} See also {{cite journal |last=Jerri |first=Abdul |title=Correction to "The Shannon sampling theorem—Its various extensions and applications: A tutorial review" |journal=Proceedings of the IEEE |volume=67 |issue=4 |page=695 |date=April 1979 |doi=10.1109/proc.1979.11307}}</ref> and by Lüke.<ref>{{cite journal |last=Lüke |first=Hans Dieter |title=The Origins of the Sampling Theorem |journal=IEEE Communications Magazine |pages=106–108 |date=April 1999 |issue=4 |doi=10.1109/35.755459 |volume=37 |url=http://www.hit.bme.hu/people/papay/edu/Conv/pdf/origins.pdf |citeseerx=10.1.1.163.2887}}</ref> For example, Lüke points out that [[Herbert Raabe]], an assistant to Küpfmüller, proved the theorem in his 1939 Ph.D. dissertation; the term ''Raabe condition'' came to be associated with the criterion for unambiguous representation (sampling rate greater than twice the bandwidth). Meijering<ref name="EM">{{cite journal |last=Meijering |first=Erik |title=A Chronology of Interpolation From Ancient Astronomy to Modern Signal and Image Processing |journal=Proceedings of the IEEE |volume=90 |issue=3 |pages=319–342 |date=March 2002 |doi=10.1109/5.993400 |url=http://bigwww.epfl.ch/publications/meijering0201.pdf}}</ref> mentions several other discoverers and names in a paragraph and pair of footnotes:
{{blockquote|
As pointed out by Higgins, the sampling theorem should really be considered in two parts, as done above: the first stating the fact that a bandlimited function is completely determined by its samples, the second describing how to reconstruct the function using its samples. Both parts of the sampling theorem were given in a somewhat different form by [[J. M. Whittaker]] and before him also by Ogura. They were probably not aware of the fact that the first part of the theorem had been stated as early as 1897 by [[Émile Borel|Borel]].{{refn|group= Meijering|Several authors, following Black, have claimed that this first part of the sampling theorem was stated even earlier by Cauchy, in a paper published in 1841. However, the paper of Cauchy does not contain such a statement, as has been pointed out by Higgins.}} As we have seen, Borel also used around that time what became known as the cardinal series. However, he appears not to have made the link. In later years it became known that the sampling theorem had been presented before Shannon to the Russian communication community by [[Vladimir Kotelnikov|Kotel'nikov]]. In more implicit, verbal form, it had also been described in the German literature by [[Herbert Raabe|Raabe]]. Several authors have mentioned that Someya introduced the theorem in the Japanese literature parallel to Shannon. In the English literature, Weston introduced it independently of Shannon around the same time.{{refn|group= Meijering|As a consequence of the discovery of the several independent introductions of the sampling theorem, people started to refer to the theorem by including the names of the aforementioned authors, resulting in such catchphrases as "the Whittaker–Kotel'nikov–Shannon (WKS) sampling theorem" or even "the Whittaker–Kotel'nikov–Raabe–Shannon–Someya sampling theorem". To avoid confusion, perhaps the best thing to do is to refer to it as the sampling theorem, "rather than trying to find a title that does justice to all claimants".}}

{{reflist|group= Meijering}}|Eric Meijering, "A Chronology of Interpolation From Ancient Astronomy to Modern Signal and Image Processing" (citations omitted)
}}
 
In the Russian literature it is known as Kotelnikov's theorem, named after [[Vladimir Kotelnikov]], who discovered it in 1933.<ref>Kotelnikov, V. A., ''On the transmission capacity of "ether" and wire in electrocommunications'', [http://ict.open.ac.uk/classics/1.pdf (English translation, PDF)] {{Webarchive|url=https://web.archive.org/web/20210301042517/http://ict.open.ac.uk/classics/1.pdf |date=2021-03-01}}, Izd. Red. Upr. Svyazi RKKA (1933). Reprint in ''[http://www.ieeta.pt/~pjf/MSTMA/ Modern Sampling Theory: Mathematics and Applications]'', Editors: J. J. Benedetto and P. J. S. G. Ferreira, Birkhäuser (Boston) 2000, {{ISBN|0-8176-4023-1}}.</ref>
 
===Why Nyquist?===
Exactly how, when, or why [[Harry Nyquist]] had his name attached to the sampling theorem remains obscure. The term ''Nyquist Sampling Theorem'' (capitalized thus) appeared as early as 1959 in a book from his former employer, [[Bell Labs]],<ref>{{cite book |title=Transmission Systems for Communications |author=Members of the Technical Staff of Bell Telephone Laboratories |year=1959 |publisher=AT&T |page=26-4 |volume=2}}</ref> and appeared again in 1963,<ref>{{cite book |title=Theory of Linear Physical Systems |publisher=Wiley |year=1963 |url=https://books.google.com/books?id=jtI-AAAAIAAJ |first=Ernst Adolph |last=Guillemin |isbn=9780471330707}}</ref> and not capitalized in 1965.<ref>{{cite book |first1=Richard A. |last1=Roberts |first2=Ben F. |last2=Barton |title=Theory of Signal Detectability: Composite Deferred Decision Theory |year=1965}}</ref> It had been called the ''Shannon Sampling Theorem'' as early as 1954,<ref>{{cite journal |first=Truman S. |last=Gray |title=Applied Electronics: A First Course in Electronics, Electron Tubes, and Associated Circuits |journal=Physics Today |year=1954 |volume=7 |issue=11 |page=17 |doi=10.1063/1.3061438 |bibcode=1954PhT.....7k..17G |hdl=2027/mdp.39015002049487 |hdl-access=free}}</ref> but also just ''the sampling theorem'' by several other books in the early 1950s.
 
In 1958, [[R. B. Blackman|Blackman]] and [[J. W. Tukey|Tukey]] cited Nyquist's 1928 article as a reference for ''the sampling theorem of information theory'',<ref>{{cite journal
| last1 = Blackman | first1 = R. B. | author1-link = R. B. Blackman
| last2 = Tukey | first2 = J. W. | author2-link = J. W. Tukey
| doi = 10.1002/j.1538-7305.1958.tb03874.x
| journal = [[The Bell System Technical Journal]]
| mr = 102897
| pages = 185–282
| title = The measurement of power spectra from the point of view of communications engineering. I
| volume = 37
| year = 1958}} See glossary, pp. 269–279. Cardinal theorem is on p. 270 and sampling theorem is on p. 277.</ref> even though that article does not treat sampling and reconstruction of continuous signals as others did. Their glossary of terms includes these entries:
 
{{blockquote|
{{glossary}}
{{term|Sampling theorem (of information theory)}}
{{defn|Nyquist's result that equi-spaced data, with two or more points per cycle of highest frequency, allows reconstruction of band-limited functions. (See ''Cardinal theorem''.)}}
{{term|Cardinal theorem (of interpolation theory)}}
{{defn|A precise statement of the conditions under which values given at a doubly infinite set of equally spaced points can be interpolated to yield a continuous band-limited function with the aid of the function <math display="block">\frac{\sin (x - x_i)}{x - x_i}.</math>}}
{{glossary end}}
}}
 
Exactly what "Nyquist's result" they are referring to remains mysterious.
 
When Shannon stated and proved the sampling theorem in his 1949 article, according to Meijering,<ref name="EM" /> "he referred to the critical sampling interval <math>T = \frac 1 {2W}</math> as the ''Nyquist interval'' corresponding to the band <math>W,</math> in recognition of Nyquist's discovery of the fundamental importance of this interval in connection with telegraphy". This explains Nyquist's name on the critical interval, but not on the theorem.
 
Similarly, Nyquist's name was attached to ''[[Nyquist rate]]'' in 1953 by [[Harold Stephen Black|Harold S. Black]]:
 
{{blockquote|"If the essential frequency range is limited to ''<math>B''</math> cycles per second, 2''B''<math>2B</math> was given by Nyquist as the maximum number of code elements per second that could be unambiguously resolved, assuming the peak interference is less than half a quantum step. This rate is generally referred to as '''signaling at the Nyquist rate''' and <math>\frac 1 {2B}</math> has been termed a ''Nyquist interval''."|Harold Black, ''Modulation Theory''<ref>{{cite book |first=Harold S. |last=Black |title=Modulation Theory |year=1953 }}</ref> (bold added for emphasis; italics as in the original)}}
 
According to the ''[[Oxford English Dictionary]]'', this may be the origin of the term ''Nyquist rate''. In Black's usage, it is not a sampling rate, but a signaling rate.
 
== See also ==
{{refbegin}}
*{{cite journal |first=J.R. |last=Higgins |title=Five short stories about the cardinal series |journal=Bulletin of the AMS |volume=12 |issue=1 |pages=45–89 |date=1985 |doi=10.1090/S0273-0979-1985-15293-0 |url=https://www.ams.org/bull/1985-12-01/S0273-0979-1985-15293-0/ |doi-access=free }}
*{{cite journal |author1-link=Karl Küpfmüller |first=Karl |last=Küpfmüller |title=Utjämningsförlopp inom Telegraf- och Telefontekniken |trans-title=Transients in telegraph and telephone engineering |journal=[[Teknisk Tidskrift]] |issue=9 |pages=153–160 |date=1931 |url=https://runeberg.org/tektid/1931e/0157.html}} and (10): pp.&nbsp;[https://runeberg.org/tektid/1931e/0182.html 178–182]
*{{cite book |first=R.J. |last=Marks, II |title=Introduction to Shannon Sampling and Interpolation Theory |series=Springer Texts in Electrical Engineering |publisher=Springer |date=1991 |url=http://marksmannet.com/RobertMarks/REPRINTS/1999_IntroductionToShannonSamplingAndInterpolationTheory.pdf |archive-url=https://web.archive.org/web/20110714042657/http://marksmannet.com/RobertMarks/REPRINTS/1999_IntroductionToShannonSamplingAndInterpolationTheory.pdf |archive-date=2011-07-14 |url-status=live |doi=10.1007/978-1-4613-9708-3 |isbn=978-1-4613-9708-3 }}
*{{cite book |editor-first=R.J. |editor-last=Marks, II |title=Advanced Topics in Shannon Sampling and Interpolation Theory |series=Springer Texts in Electrical Engineering |publisher=Springer |date=1993 |url=http://marksmannet.com/RobertMarks/REPRINTS/1993_AdvancedTopicsOnShannon.pdf |archive-url=https://web.archive.org/web/20111006022802/http://marksmannet.com/RobertMarks/REPRINTS/1993_AdvancedTopicsOnShannon.pdf |archive-date=2011-10-06 |url-status=live |isbn=978-1-4613-9757-1 |doi=10.1007/978-1-4613-9757-1 }}
[[Category:Claude Shannon]]
[[Category:Telecommunication theory]]
[[Category:Data compression]]