Talk:Nyquist–Shannon sampling theorem

This article is part of WikiProject Electronics, an attempt to provide a standard approach to writing articles about electronics on Wikipedia. If you would like to participate, you can choose to edit the article attached to this page, or visit the project page, where you can join the project and see a list of open tasks. Leave messages at the project talk page.

Disagreeing with Kupfmuller filters

The following passage rings suspicious; it is obviously impossible to reconstruct arbitrary signals once they are sampled. If you want to keep this paragraph in, explain here why, or I will delete it.

The theory of reconstruction of the analog signal from samples requires a so-called Kupfmüller filter. Such a filter is a theoretical filter with certain properties, which can only be approximated in reality. The filter is supposed to be fed directly with the sample impulses and generates the original analog signal.

Loisel 07:09, 16 Sep 2004 (UTC)

I think there's some form of filtering or interpolation that can be used to *perfectly* reproduce a bandlimited sampled signal, instead of the usual approximate filtering of all harmonics. Related to the sinc function? I'm not sure, but assumed this is what they were talking about. - Omegatron 19:03, Sep 16, 2004 (UTC)

That would be the Nyquist-Shannon interpolation formula, Omegatron - Omegatron

The theory of discrete Fourier transforms tells you that if U is the discrete Fourier transform of u, then the discrete inverse Fourier transform of U is u again, without any error whatsoever. If you have a bandlimited signal f, sample it to u, take its DFT U, then take an IDFT of U, you'll recover u perfectly. If u has sufficiently many samples for the bandwidth of f, then f is given by the unique interpolating trigonometric polynomial for u. The coefficients of the interpolating polynomial are given by U. As you can see, there's no filter. Applying a sinc filter of the correct frequency to a bandlimited signal won't change it at all, since it's already bandlimited. So this Kupfmuller stuff is nonsense. Loisel 07:05, 20 Sep 2004 (UTC)
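For anyone who wants to check the DFT argument numerically, here is a minimal sketch using NumPy. It assumes a 1-periodic signal and an odd number of samples; the test signal `f` and the helper `trig_interpolate` are made up for illustration and do not come from the article.

```python
import numpy as np

def trig_interpolate(u, t):
    """Evaluate the interpolating trigonometric polynomial of samples u at times t.

    Assumes u holds an odd number n of equally spaced samples of a 1-periodic signal.
    """
    n = len(u)
    U = np.fft.fft(u) / n                # DFT, scaled to Fourier coefficients
    k = np.fft.fftfreq(n, d=1.0 / n)     # integer frequencies 0, 1, ..., -1
    # Sum U_k * exp(2*pi*i*k*t) over all k, for every requested time t.
    return np.real(np.exp(2j * np.pi * np.outer(t, k)) @ U)

def f(t):
    # Hypothetical bandlimited test signal: highest frequency is 3 cycles per period.
    return 1.0 + np.sin(2 * np.pi * t) + 0.5 * np.cos(2 * np.pi * 3 * t)

n = 9                                    # odd, and more than 2 * 3 samples per period
u = f(np.arange(n) / n)                  # the sampled signal
round_trip = np.fft.ifft(np.fft.fft(u)).real   # IDFT of the DFT gives u back

t_fine = np.linspace(0.0, 1.0, 200, endpoint=False)
max_err = np.max(np.abs(trig_interpolate(u, t_fine) - f(t_fine)))
print("round trip exact:", np.allclose(round_trip, u))
print("max interpolation error:", max_err)
```

The interpolation error should be at the level of floating-point rounding, which matches the claim that the samples determine f when there are enough of them for its bandwidth.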

Oh I remember now. To convert the sampled signal back to analog you put it through a zero-order hold or the like, which creates the original signal in the baseband with harmonics every multiple of the sampling frequency, and you filter the harmonics. The sinc "filter" I was thinking of is actually what you convolve the signal with. Sinc convolution in the time domain = ideal rectangular filter in the frequency domain, right? And no real filter is ideal, so technically you are only attenuating the higher frequencies, not destroying them completely. So I thought maybe this Kupfmuller stuff had to do with the sinc convolution, but nope. That's just an ideal filter. - Omegatron 13:29, Sep 20, 2004 (UTC)

Actually, since I will forget about this, I am deleting now. Restore it if you can justify it.

Loisel 07:11, 16 Sep 2004 (UTC)

Apparently people missed the word "theoretical". It actually tells much about the education of people who claim this is nonsense but scrape the bottom of the barrel to fabulate about Nyquist and Shannon. If it isn't in their beginner's textbooks it is not supposed to exist?

Karl Küpfmüller (*1897, +1977), professor in Gdansk (http://www.pg.gda.pl/~mjasina/pehgo2000/hist2en.html) and in Darmstadt, contributed basic research to telecommunication system theory. One of his contributions is the Küpfmüller filter. It is a theoretical construct, with an ideal behavior in the frequency domain. It shows exactly the properties one needs to convert an ideal(!) (did you get this: IDEAL, we are talking system theory here) sampling sequence back into the original signal. In such a sequence each sample is an infinitely short impulse, and a sampled signal is a sequence of such impulses. The amplitude of each impulse represents the amplitude of the original signal at the moment it was sampled.

To reconstruct the original signal from such a series of impulses one needs a (mathematical, ideal) device "undoing the sampling". That device is called a Küpfmüller filter. It is a lowpass filter with an impulse response h(t) = A0 * omegac / pi * sin(omegac (t - t0)) / (omegac (t - t0)), for t = -infinity ... infinity. Since the impulse response starts at -infinity it is obvious that the filter is non-causal. It does not exist. It is an idealisation. It is system theory. You get it? System theory, like the sampling theorem, also system theory. However, processing (calculating) a sequence of sampling impulses, which have been sampled adhering to the sampling theorem, with a Küpfmüller filter (you treat the sequence of samples as some signal) results in the original signal. You need this theoretical filter in the system theory, because if you take the sequence of samples as some signal, and each sample is infinitely short (an impulse), one can easily see that it contains infinitely high frequencies due to the impulses. Frequencies which were never in the original signal.

The fact that one can in theory, under ideal conditions, reconstruct the original signal without loss of information is the whole key of the sampling theorem! If you stay within its limits (which one can't), if you manage to do an ideal sampling (infinitely short sampling impulses, with infinite precision), and if you apply a Küpfmüller filter to the sequence of samples, you get exactly the original signal out. Do the math! That's why the theorem was such a great breakthrough. It provides the theoretical foundation for all digital signal processing. It establishes that there is a 1:1 relation between an analog and a digital signal. In practice it has been shown that even without ideal sampling, without the ideal Küpfmüller filter, without infinite precision, etc. it works well enough to do serious things. This is what some people call the digital revolution. And what Loisel-Loser calls nonsense.

If you care, do the math. But I doubt that you care. I am too tired to change the article. The nonsense fanboys will delete it anyhow. I have just one question for you: Why the fuck do you write about sampling if you don't get the system theory behind it? 89.52.130.249 23:53, 14 March 2006 (UTC)

Please identify Yourself if making such accusations. At least use a wikipedia login. Next provide a reference for Your claim. Explain in which way the filter You want to introduce differs from the cardinal series introduced by E.T. Whittaker (prof. in Edinburgh) in 1915. The series itself dates back to E. Borel in 1898. I don't think that You want to claim that a one-year old has precedence on that result. This series is treated in this article as well as in Nyquist-Shannon interpolation formula.--LutzL 09:39, 16 March 2006 (UTC)

The following disputed psychobabble also deleted:

In practice, the sampling data is usually not available any more. It has been quantized ("squeezed" or broken into discrete values) and digitized (converted into symbols, such as numbers). Digital to analog converter circuits are used to decode the digital information into analog, which is then filtered. These circuits are usually based on op-amps. Each quantized value is associated with a certain voltage level, and the op-amp circuits generate that voltage level when seeing the particular digital input signal.

Loisel 07:14, 16 Sep 2004 (UTC)

Psychobabble? I think you are using the wrong word. Anyway, what's wrong with a description of digital sampling? - Omegatron 19:03, Sep 16, 2004 (UTC)

All right, technobabble then. The text above is handwavy, fails to explain the terminology (like "quantized") and, since it's trying to explain why Kupfmuller filters don't have the "sampling data", irrelevant (by virtue of Kupfmuller filters being nonsense.)

If a signal is quantized to floating point numbers, for many purposes the quantization does not matter and the IDFT gives u from U to extremely high precision. If a signal is quantized for the purpose of compression, u can obviously not be recovered from V (my notation for quantized U) but this has nothing to do with the sinc filter or anything. If v is a quantized u, then the DFT V of v will yield v again to high precision using the IDFT. Digital to analog converters do not affect the quantization, and are no more affected by it than the software IDFT. Injecting op-amps adds nothing to the discussion. If you want to have a reference to DACs, just put "see also Digital to analog converters." One possible statement would be, "Measurement devices like Charge-coupled devices, antennas, geiger counters, etc... and output devices such as Cathode ray tubes, loudspeakers, Light-emitting diodes, the brakes of an Electronic Stability Program, etc... will add some error to the input or output. Such error can make high frequency information irrecoverable even when we have very many samples. This effect can sometimes (but not always) be regarded as the Nyquist theorem applied to the Fourier transform of the signal." Such a comment would have to be made in a non-technical context or include the caveat that "high frequency" may actually not be correct. In some instances, the low frequency data is the noise, and the high frequency is good. In other cases, the noise does not correspond to a neat range of frequencies. See aliasing for details.

If you want to talk about hardware, make sure that what you're saying is true and has its place in an article with "theorem" in the title. Loisel 07:05, 20 Sep 2004 (UTC)

'just put "see also Digital to analog converters."'

Good enough for me. - Omegatron 13:29, Sep 20, 2004 (UTC)

Disagreeing with the theorem.

I just read this and the aliasing article, and I must say that I disagree with what's written in there.

Perhaps in engineering this is gospel and sacred, but many of the assertions and hypotheses are unstated, unmotivated and even perhaps incorrect. Why would signals with high frequencies be wrong? Why should we choose, of all the signals that alias to the same sampled signal, the one that has zeroes for the high frequencies? There are reasons for that, but they vary from case to case, and stating that using zero for the high frequency as a canon isn't scientific.

If anyone feels strongly about these articles, continue the discussion in here. If nobody pipes up, I'll change the articles around significantly. Loisel 04:41 Jan 24, 2003 (UTC)

Do you mean we can generalize to: If S(f) = 0 outside a given interval of width 2W, then s(x) can be recovered? Here S(f) is defined for positive and negative f, so if the interval is 3W to 5W, S should also be zero for -5W to -3W. We can add this, especially if you can describe an application for this it is interesting. - Patrick 10:38 Jan 24, 2003 (UTC)

Not quite. Here is the short version. A signal s(x) is just another word for a function. There are many functions. If you sample functions, there are a lot fewer sampled functions. The map you use to go from a signal s(x) to a sampled version (s_k) will necessarily collapse several different signals into the same sampled version by the pigeonhole principle; this is called aliasing. Two signals s(x) and t(x) which collapse to the same sampled signal are called aliases of one another. In general it is preferable that the map L:s(x)->(s_k) produce sampled versions which correspond as closely as possible to the original s(x) according to some metric, or perhaps according to the human brain. Experimentation has suggested that in many cases, the signal that most resembles the sampled version is the one which has the least high frequency components. However, this is simply a heuristic, and in some cases, it is patently false. For instance, if one is working on the real line (not with periodic functions) with signals that are compactly supported, it is completely absurd to hope that the spectrum of such a signal is also compactly supported. Let us assume that L is linear. Note that the space of sampled signals is of dimension n. Then we would like to choose, for each sampled signal (s_k), a pre-image L^{-1}(s_k)=s(x) which we believe gave rise to (s_k). With the assumption above in italics, then we can choose a linear space of dimension n as the pre-image L^{-1}(S) of the space of sampled signals S={(s_k)}. This is in fact the Nyquist-Shannon sampling theorem (and we could write it explicitly sort of the way it's written, although I suspect there's an off-by-one bug in the current statement.) With this presumption, the Nyquist-Shannon interpolation formula follows. In view of this, I'm not sure how to split the discussion between aliasing and this theorem, which is why my original comment was about both articles. Loisel 19:18 Jan 24, 2003 (UTC)
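A small numerical illustration of two aliases, as a sketch only: the 10 Hz sampling rate and the 1 Hz and 9 Hz cosines are arbitrary choices for the example, not values taken from the article.

```python
import numpy as np

fs = 10.0                        # hypothetical sampling rate in Hz
t = np.arange(20) / fs           # 20 sampling instants

s = np.cos(2 * np.pi * 1.0 * t)  # 1 Hz cosine
a = np.cos(2 * np.pi * 9.0 * t)  # 9 Hz cosine (9 = fs - 1)

# The sampling map sends both signals to the same sequence of samples.
same_samples = np.allclose(s, a)

# Halfway between the sampling instants the two signals are clearly different.
t_mid = t + 0.5 / fs
differ_between = not np.allclose(np.cos(2 * np.pi * 1.0 * t_mid),
                                 np.cos(2 * np.pi * 9.0 * t_mid))
print(same_samples, differ_between)
```

Choosing the 1 Hz signal as the pre-image is exactly the "least high frequency components" convention described above; the samples alone cannot tell the two apart.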

Also, I should add that I don't intend to massacre the article, just to present the information so that the Fourier approach isn't presented as some sort of God-given truth, as well as offer a linear algebra explanation of the theorem. Loisel 05:51 Jan 25, 2003 (UTC)

Maybe, in the interest of conciseness, this article can stay as is, and the article under aliasing can fill in the blanks I discussed above. Loisel 08:34 Jan 27, 2003 (UTC)

---

What does "the sampling frequency must be original perfectly from the sampled version." in the page mean? This change was made almost two months ago, and I have no idea what it means. Can we revert to what the older edits said in this place: "the sampling frequency must be greater than twice the highest frequency of the input signal in order to be able to reconstruct the original perfectly from the sampled version."

Probably a typo where a line got deleted. --Damian Yerrick

I suggest that we leave the psychoacoustic stuff out of this particular article and limit the discussion strictly to the objective, mathematical properties of sampling that Nyquist originally studied. He was actually studying the reverse problem to the one usually mentioned in connection with his theorem, which was how fast one could signal in a bandlimited channel like a telegraph cable without intersymbol interference. His result was equivalent to its modern application in digital sampling. To that end, I would merge the discussion of oversampling into the main body and restate the theorem as it is now stated in the oversampled section, i.e., the sampling rate must be at least twice as high as the bandwidth of the signal being sampled to avoid any loss of information. --Phil Karn, KA9Q

---

I liked the page very much but strongly suggest to remove the last paragraph (which I did) since the claim that the sampling problem is analogous to polynomial interpolation simply is wrong and misleading. -- Olivier Cappé, 11:23 Mar 8, 2004 (CET)

Also is known

It is also known as the Whittaker-Nyquist-Kotelnikov-Shannon sampling theorem (for example, in Russia it's known as the Nyquist-Kotelnikov theorem).

Nyquist - 1928
Kotelnikov - 1933
Whittaker - 1935
Gabor - 1946
Shannon - 1948

Links:

--.:Ajvol:. 09:50, 24 Oct 2004 (UTC)

What is "W"?

The article states that

" F[s(x)] = S(f) = 0 for |f| ≥ W"

What is meant with the W? I think this should be explained in the article!

Thanks

W means bandwidth. So for example, an audio signal would have a one-sided bandwidth of 20 kHz; 20,000 cycles per second. If you took the frequency transform of an audio signal, it would be zero for everything higher in frequency than 20 kHz (and for everything less than -20 kHz). Then, according to this section, you would need to measure that signal with a maximum time of 1/(2W) between each measurement to not lose any information. In this case 1/(2W) = 1/(2*20,000) = 1/40,000 = .000025 seconds. Usually this is specified by just saying you have to measure it at least 40,000 times per second, or twice the bandwidth.

Is this clear enough? I will try to fit it into the article. - Omegatron 16:37, Nov 15, 2004 (UTC)
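The arithmetic in the reply above can be written out as a tiny sketch. The function names are hypothetical and only restate the 1/(2W) rule for a one-sided bandwidth W.

```python
def max_sample_spacing(bandwidth_hz):
    """Largest allowed time between samples for one-sided bandwidth W: 1 / (2W)."""
    return 1.0 / (2.0 * bandwidth_hz)

def min_sample_rate(bandwidth_hz):
    """Smallest allowed sampling rate for one-sided bandwidth W: 2W."""
    return 2.0 * bandwidth_hz

W = 20_000.0                      # audio bandwidth from the example, in Hz
spacing = max_sample_spacing(W)   # seconds between samples
rate = min_sample_rate(W)         # samples per second
print(spacing, rate)
```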

Hi Omegatron, thanks for your quick reply. It helped a lot! There is only one thing I haven't understood clearly: let's imagine a soundcard that samples audio input at 44 kHz (roughly the same as in your explanation). What the soundcard does is take a record of the voltage about every .000025 seconds.

If you drew all those recorded values on a time-voltage diagram (time being the variable), you'd get a diagram full of dots. To reconstruct the original signal, you could draw lines between each dot and its next neighbor.

(*) That would be quite a close approximation, but you couldn't find out what happened between those .000025-second-snapshots. Maybe there was a high voltage burst (being a Dirac distribution for example) somewhere in the interval between 1.000025 s and 1.000030 s.

Maybe you understand the problem I see - but maybe I just made a mistake at some point in my thought.

Thank you for your help, --Abdull 18:57, 15 Nov 2004 (UTC)

Very correct. However, the dots with spaces between them actually DO have all the information in the original signal. As long as they are at most .000025 seconds apart. That's the whole essence of the theorem. You just have to get it back out in the correct way. I can't think of a simple explanation, but I will try. If you were to connect them as you described, with a straight line between (~~i think this is called a first-order hold~~), you would not reproduce the original signal. What you actually do in a real system is even cruder! You just make a horizontal line out from each dot like stairsteps (this is a zero-order hold~~, i think~~). The thing is, you are creating extra frequencies above 22 kHz when you do that, perfect multiples of the original frequencies. If you then filtered out those higher frequencies, you would smooth out the horizontal lines into exactly the original signal. As long as the original signal doesn't go above 22 kHz, you can reproduce it exactly with sampling at 44 kHz. It's hard to explain. (When you've figured it out, help me make the article easy for beginners to understand.) Here's some diagrams http://cnx.rice.edu/content/m10402/latest/ - Omegatron 19:23, Nov 15, 2004 (UTC)

In reply to (*): if the signal has no high frequencies (that is, if F, the Fourier transform of f is zero for any frequency w greater than W) then it is not possible to have a large spike between the samples. This is related to but not the same as the Nyquist sampling theorem. One manifestation of this truth is the Nyquist-Shannon interpolation formula.

If you don't have a very good understanding of the article as it is, please don't try to "make the article easy for beginners to understand." We'll just end up with something less good than what there is now.

Loisel 17:08, 23 Nov 2004 (UTC)
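To make the interpolation formula concrete, here is a rough numerical sketch. It truncates the infinite cardinal series to 401 terms, so it is only approximate, and the test signal `f` (a squared sinc with no content above 0.4 Hz) is a hypothetical choice for the example.

```python
import numpy as np

def sinc_interpolate(samples, k, T, t):
    """Truncated cardinal series: sum over k of f(kT) * sinc((t - kT) / T)."""
    # np.sinc(x) is the normalized sinc, sin(pi*x) / (pi*x).
    return np.sinc((t[:, None] - k[None, :] * T) / T) @ samples

def f(t):
    # Hypothetical test signal with no frequency content above 0.4 Hz.
    return np.sinc(0.4 * t) ** 2

T = 1.0                           # sampling interval, so the Nyquist frequency is 0.5 Hz
k = np.arange(-200, 201)          # finite window of sample indices (a truncation)
samples = f(k * T)

t = np.linspace(-5.0, 5.0, 1001)  # points between and on the sampling instants
max_err = np.max(np.abs(sinc_interpolate(samples, k, T, t) - f(t)))
print("max reconstruction error:", max_err)
```

The small residual error comes from cutting the series off, not from anything hiding between the samples.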

I meant "help us make the article easier to understand, by pointing out what needs to be added." It should be accessible to beginners while still being accurate. - Omegatron 20:33, Nov 23, 2004 (UTC)

I agree -- this is one of those things that I learned so long ago that I can't remember what it was like before I understood it. So, although it now seems obvious to me, it is difficult for me to explain to someone else.

Nevertheless, I'm going to try again, hoping something in what I say will inspire you to create a really good explanation. You, dear reader, whoever may be reading this.

OK, say that somewhere in the interval between 1.000025 s and 1.000030 s, there's this huge spike that goes way up then comes back down again.

Every properly-designed sound card has an analog anti-aliasing filter to limit the bandwidth. If a signal is already band-limited, it goes right through that filter unchanged.

However, this spike is heavily "distorted" by that filter. If it's "small", it may be entirely clipped out by the filter. If it's "large", it will be rounded off and smeared over a much larger time period -- such a long time period that it affects at least one sample, probably more.

One could paraphrase the Nyquist-Shannon sampling theorem like this:

If the signal is so smooth and rounded off that it goes right through the antialiasing filter unchanged (or it's the output from such a filter), then nothing important happens between samples -- the signal simply rises or falls in a very smooth way between samples.

The mathematical details specify *how* smooth it has to be.

I like the pigeonhole principle explanation that Loisel gives, and hope it makes its way into the main article.

I hate to confuse people by bringing in further complexities, but I want to point out that this is not the only situation where a few samples can give us all the information there is to know about an infinite number of points.

For example:

if you know some function is a straight line, you only need to know 2 points on it to know everything there is to know about that function, which contains an infinite number of points.
If you know some shape is a perfect square, you only need to know a couple of points on each side of the square to know everything there is to know about the perimeter of the square, which contains an infinite number of points. (You don't necessarily need to be given the exact corners -- you can figure them out from other points).
if you know that the signal is some kind of perfectly periodic triangle wave or sawtooth wave, and you know roughly what its repetition rate is, then it only takes around 6 (? or so) samples (a couple of samples on the falling line, a couple of samples on the next rising line, and a couple of samples on the next falling line) to measure everything there is to know about the sawtooth wave -- the slope of the rising segments, the slope of the falling segments, the maximum value, the minimum value, the repetition frequency, and the "phase". (You don't necessarily need to be given the exact corners -- you can figure them out from the other points).

--DavidCary 22:58, 10 November 2005 (UTC)

I like that line. Here's my version: If the signal has already been smoothed enough by the anti-aliasing filter, then nothing important happens between samples — the signal simply rises or falls in a smooth, predictable way between samples. The mathematical details specify *how* smooth it has to be for this to happen.

I also like the geometric analogy.

If you know that a function is a straight line, you only need to know two points and you can predict every other point along that line.
If you know that a shape is a perfect circle, you only need to know three points to fully define it.
If you know that a function is a sine wave, you only need to know three(?) points to fully define it.
Similarly, it can be shown that if you know the signal is made out of sine waves that change no faster than f_s/2, then you only need to know points every T seconds.

Something like that. - Omegatron 01:23, 11 November 2005 (UTC)

Sampling theory for beginners

As a "beginner", I think the article is fairly good and accessible. However, I am left with a couple of questions that I think should be clarified in some way or another.

Suppose the signal consists of a simple sine wave, e.g. sin(2*pi*f*t) with f = 5Hz. Suppose I sample that signal at a frequency f = 11Hz. When plotted on graph paper, the sampled signal appears to be the product of two sine functions: one with the original frequency, and one with a lower frequency at 0.5Hz. The article states that the original signal is fully recoverable from the collection of sampling points. How is that possible? Is it fair to say that if we want a fairly accurate description of a signal, just by linear or quadratic interpolation between sampling points, then we need to sample at, say, 8 or 10 times the highest frequency? --Ahnielsen 09:55, 11 Mar 2005 (UTC)

Hi, that would be possible if there were a Fourier series for the Dirac comb. Since that's an impossibility, You cannot use the sampling theorem on sine or cosine functions. What You can do is to approximate those infinite waves by sinc(2*pi*0.000Hz*t)*sin(2*pi*5Hz*t), which decays at infinity. For perfect reconstruction, You will have to sum from minus to plus infinity, or at least over a very large interval around the actual argument. In general, band limited functions are defined as

f(x)=\int _{-W/2}^{W/2}g(s)e^{isx}ds

, where g is square-integrable on the interval [-W/2,W/2]. So the above corresponds to two small boxes, each of area 1/2, around +-5Hz.

If You sample by averaging and reconstruct by step functions or linear interpolation, those oversampling factors of 8, 10 or even 20 are appropriate (to get error levels below 1%). Because only local operations are involved, You can as well apply this procedure to sine functions.--LutzL 12:13, 24 May 2005 (UTC)

You can't use the sampling theorem on sinusoids? This is the first time I've heard that... - Omegatron 13:48, May 24, 2005 (UTC)

I've never heard it before, either. --DavidCary 22:58, 10 November 2005 (UTC)

Yes, that fact is right. It raises scepticism in everyone I tell this to, but the minimal condition for the cardinal series to converge is that the sum

\sum _{k\neq 0}|f(kT)/k|

converges, the more common, but weaker condition is that

\sum _{k\in \mathbb {Z} }|f(kT)|^{2}<\infty

, which both are not satisfied by any sinusoid. See papers and books by Butzer, Higgins and Benedetto. A weaker argument is already present in the article: if one samples a sinusoid with frequency F at the Nyquist frequency 2F, then the result is an oscillating series with amplitude depending on the phase. For any sinusoid with frequency equal to a fraction of the Nyquist frequency, the sampled series will be periodic so the reconstruction formula will diverge since it contains the harmonic series. The proof which Shannon gave (and which was quite common before people thought that a Dirac comb were a simple thing) involves the identity of a function on an interval with its Fourier series in the L² sense. A Dirac delta distribution is not in L².--LutzL 15:00, 24 May 2005 (UTC)

I don't know L²-type math. Can you give a more laymen's explanation? You're saying that if I have a sinusoid at f and sample at 3f (>2f to fit the theorem but an exact multiple of f), there will be more than one possible waveform that fits those sample points using the Nyquist-Shannon interpolation formula? - Omegatron 15:36, May 24, 2005 (UTC)

In most cases, piecewise continuous is a good proxy for square integrable. At least the Fourier transform has to be a function, not a distribution. And no, there won't be any cardinal series, linked here under Shannon-Nyquist interpolation formula, that fits those samples since they don't satisfy the convergence criterion mentioned above. So there would be divergence everywhere except at the sampling points. Any partial sum of the cardinal series will of course exist, and divergence is very slow with the harmonic series, so numerical experiments will not at first glance show this behavior.--LutzL 16:10, 24 May 2005 (UTC)

Sorry, I don't understand. Don't know enough about distributions. Maybe someone else does. - Omegatron 17:43, May 24, 2005 (UTC)

Heh, well, if you ask me you're needing a formal proof to show that it, indeed, will work out correctly. The proof could use some plots to make it visually more understandable.

The frequency content of your signal can only have a sum of frequencies, but not products of frequencies. So even if it looks like there's a 0.5 Hz sine modulated on a 5 Hz wave, it's just your mind reading into it. :)

Plot a 5 Hz wave through those points and you'll see that it will match up. Keep in mind that you're nearing the Nyquist frequency of 5.5 Hz. If you sample a 5 Hz sine wave at 10 Hz and the samples land on the peaks, then you'll get alternating spikes. So as you approach the Nyquist frequency, your samples will look more and more like the alternating spikes plot.

Re your interpolation question. The ideal interpolator is the sinc function because the frequency domain equivalent is a rectangular function (a perfect low-pass filter). So any less perfect low-pass filter will require extra room between your max frequency and the Nyquist frequency. You really would not want to sample @ 11 Hz if you have a real signal of 5 Hz because you'll be hard pressed to find a low pass filter that goes from 1 to 0 between 5 Hz and 5.5 Hz.

Perhaps when you understand this, you can change the article to answer the questions/confusions you have now...or post 'em and someone can work them in. Cburnett 10:22, 11 Mar 2005 (UTC)
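For the 5 Hz sine sampled at 11 Hz discussed above, here is a short sketch of why the plotted dots seem to carry a 0.5 Hz modulation. It assumes samples taken at t = n/11 starting from t = 0. At those instants the 5 Hz sine equals an alternating sign times a 0.5 Hz sine, which is what the eye picks up, and the same dots also fit a sign-flipped 6 Hz sine, which the band limit rules out.

```python
import numpy as np

fs = 11.0                         # sampling rate from the question, in Hz
n = np.arange(44)                 # four seconds of sample indices
t = n / fs

samples = np.sin(2 * np.pi * 5.0 * t)                            # the sampled 5 Hz sine
envelope_form = (-1.0) ** (n + 1) * np.sin(2 * np.pi * 0.5 * t)  # sign flips times 0.5 Hz sine
alias = -np.sin(2 * np.pi * 6.0 * t)                             # 6 Hz alias (6 = fs - 5)

print(np.allclose(samples, envelope_form))  # same numbers, different way of reading them
print(np.allclose(samples, alias))          # samples alone cannot exclude the 6 Hz alias
```

So nothing other than the 5 Hz component is present; the slow "beat" is only a pattern in how the dots fall.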

Many thanks for your response, Cburnett. It was not quite what I was looking for, though. Many people have participated in the discussion on the Nyquist-Shannon Sampling Theorem. Most of these people are, I suspect, from an electrical engineering background. I am a civil engineer, and I am not concerned with electronic signals. I am concerned with the dynamic behaviour of structures. By far the most popular method of structural analysis is finite element analysis. In this method, a solid structure is described by a (high) number of elements that are connected at nodes. The response of the structure is entirely determined by the displacement of the nodes. The displacement at any other point of the structure is usually given by linear or quadratic interpolation between nodes (this is inherent in the mathematical formulation of the elements). When a structure is subjected to dynamic excitation, stress waves propagate through the structure at finite speeds. Stress is accompanied by deformation. As a rule of thumb there should be at least 8 or 10 elements per wavelength in order to capture the dynamic response of the structure. I was wondering, given the Nyquist-Shannon Sampling Theorem, why 3 or even 2.1 elements is not enough. But I think I am answering my own question. This is because the finite element method uses something as crude as linear interpolation to recover, as it were, the "true signal" (i.e. the deformed shape) from sampling points (nodes). Any opinion on this matter would be appreciated. --Ahnielsen 13:17, 15 Mar 2005 (UTC)
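A rough way to put numbers on the "8 or 10 per wavelength" rule of thumb is to measure how badly straight-line interpolation between nodes misses a pure sinusoid. This is a sketch under simplifying assumptions: it uses exact nodal values of a cosine and plain `np.interp`, which is not a finite element solution, and the helper name is made up.

```python
import numpy as np

def worst_linear_interp_error(points_per_wavelength):
    """Largest error of linear interpolation of cos(x) with n nodes per wavelength.

    The nodes are placed so that the peak at x = 0 falls midway between two nodes,
    which is the unfavourable case for a straight-line fit.
    """
    n = points_per_wavelength
    h = 2 * np.pi / n                        # node spacing
    nodes = (np.arange(-n, n) + 0.5) * h     # covers a bit more than [-pi, pi]
    x = np.linspace(-np.pi, np.pi, 20001)    # fine grid that includes the peak at 0
    approx = np.interp(x, nodes, np.cos(nodes))
    return np.max(np.abs(approx - np.cos(x)))

errors = {n: worst_linear_interp_error(n) for n in (8, 10, 20)}
for n, err in errors.items():
    # For these n the numerical result agrees with 1 - cos(pi / n).
    print(n, err, 1 - np.cos(np.pi / n))
```

With 10 nodes per wavelength the error is a little under 5% of the amplitude, and it falls to roughly 1% at 20, which is in line with the remark that linear interpolation needs far more than 2 points per wavelength.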

Yes, it definitely depends on the interpolation method used.

Actually, I think you can use linear interpolation, as long as you filter out the harmonics afterward. For electrical engineering you use a "stairstep" zero order hold. It creates harmonics which you filter to get the original signal. [1] ~~The~~ first order hold (these don't have articles???) ~~is linear interpolation, and has a different harmonic structure, but you can treat it the same.~~ [2]

Oops. Nope, that's wrong. The first order hold is not linear interpolation, as you can see in the image above. One of my professors told me that, once, though. Hmmph. Maybe it's somehow equivalent?

So I am not sure whether you can use linear interpolation and get the same results. I've heard of people using quadratic interpolation for quick processing of audio algorithms, but I believe they had a clause saying you should really oversample 4x or so if you want it to be accurate. - Omegatron 14:44, Mar 15, 2005 (UTC)

Here's some more info: http://cnx.rice.edu/content/m10402/latest/ http://cnx.rice.edu/content/m10788/latest/ - Omegatron 15:05, Mar 15, 2005 (UTC)

I have zero-order hold and first-order hold on my user page to be created along with Pass-band, stop-band, transition band. Surprised myself WP doesn't have them. :/ Cburnett 17:10, 15 Mar 2005 (UTC)

Some do. *creates redirects* :-) Always check under similar names... Should we just group all the holds into nth-order hold or something? - Omegatron 17:44, Mar 15, 2005 (UTC)

Ahnielsen, it's always helpful to tell what you're really asking about to get a better response. :)

The nth order holds correspond to fitting an nth-degree polynomial to the last n+1 values (assuming a causal signal, which is usually the case). I've not done the math but I suspect an infinite order hold would be the sinc function (perfect interpolator). Since you are using a low order hold/interpolator you'll need to have a higher sampling rate. Really, what you are doing is oversampling to account for your imperfect interpolator.

In the end, the point of the sampling theorem is to contain all the energy of the signal in the frequencies under the Nyquist frequency. Oversampling gives you more samples to manipulate and less error (think law of large numbers). Cburnett 17:04, 15 Mar 2005 (UTC)

The infinite-order hold is a Gaussian function according to http://cnx.rice.edu/content/m10788/latest/ Check under the "Related info" box for more on that site. - Omegatron 17:44, Mar 15, 2005 (UTC)

The http://cnx.rice.edu/content/m10788/latest/ page doesn't make sense to me. I also expected a sinc() function, not a Gaussian function.

There seem to be 2 different kinds of "first-order hold". Both agree on drawing straight lines (perhaps with some non-zero slope) between sampling instants.

At least one Wikipedian, and also the author of [3], place the straight line between sampling instants 3 and 4 on the line extrapolated from samples 2 and 3 (because it's "no fair" using point 4 -- that would be non-causal). That line intersects sample 3, but is in general nowhere near point 4. I presume these people prefer "causal" filters involving the n preceding points. (Perhaps this is the "predictive first order hold"?)

Other wikipedians, and also the author of [4] and [5] , place that straight line between sampling instants 3 and 4 so it intersects both sample 3 and sample 4, using linear interpolation. I presume these people prefer "symmetric" filters involving roughly n/2 preceding and n/2 following points.
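
The two readings of "first-order hold" described above can be written down side by side. This is only a sketch under stated assumptions: unit sample spacing, samples indexed from 0, and function names that are made up here rather than taken from any of the cited pages.

```python
import math


def predictive_foh(samples, t):
    """Causal variant: extend the line through the two most recent samples.

    Between instants k and k+1 it uses only samples k-1 and k, so it
    passes through sample k but generally misses sample k+1.
    """
    k = math.floor(t)
    if k < 1 or k >= len(samples):
        raise ValueError("need two past samples at time t")
    slope = samples[k] - samples[k - 1]
    return samples[k] + slope * (t - k)


def interpolating_foh(samples, t):
    """Non-causal variant: straight line joining samples k and k+1."""
    k = math.floor(t)
    if k < 0 or k + 1 >= len(samples):
        raise ValueError("need one sample on each side of time t")
    return samples[k] + (samples[k + 1] - samples[k]) * (t - k)


if __name__ == "__main__":
    samples = [0.0, 1.0, 4.0, 9.0, 16.0]  # arbitrary test data
    for t in (3.0, 3.5, 3.99):
        print(t, predictive_foh(samples, t), interpolating_foh(samples, t))
```

Both variants agree at the sampling instants themselves; they differ in between, and only the interpolating one reaches the next sample.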

I wonder which type of "first order hold" that Lozano used?

Disagreement with "Important Note"

I have two complaints regarding the paragraph labelled "important note" at the article's beginning. First, while I generally agree with what is said, I think the paragraph should mention that while it is a necessary condition for the sampling rate to be twice the bandwidth (as opposed to maximum frequency for the case of bandpass signals), it is not a sufficient condition. For example, a real signal with Fourier support over [1,10] Hz requires a 10 Hz sampling frequency (not 9 Hz!!) to prevent aliasing. As it is, the paragraph is somewhat misleading on this fact.

My second complaint is on the existence of this paragraph, even in a clarified form. Correct me if I'm wrong, but I don't believe basic Nyquist/WKS sampling ever considers such bandpass signals. It is certainly a straightforward extension from the basic theory, but I do not believe this generalization was covered in any of these 4 original papers. Since this paragraph actually deals with an extended topic which is related but not part of the article topic I am going to remove the paragraph. I suggest a new article on generalized sampling should be created and linked to. If someone does find it necessary to replace the paragraph, please at least rewrite it so it is not misleading. 8/18/2005

Hi, in some sense Your first example is wrong. That is, if You consider complex signals which can have negative frequencies. Because then the 9Hz sampling frequency is sufficient. But Your concern is right if You consider real signals with a positive frequency support of [1,10] Hz, as this is accompanied by a negative part of the frequency support of [-10,-1] Hz. In this case, the smallest period of a nonintersecting periodization of the support is 20 Hz, and this corresponds to the lowest sufficient sampling frequency.

As to the original paper, I've put in a link to a reprint some time ago into the German version, link section. So You can find, right below the formulation of Theorem 1 on page 2, the statement "This is a fact which is common knowledge in the communication art", and in the second column "A similar result is true if the band W does not start at zero frequency but a higher value..." and a mention of the possibility of modulation. Of course, bandwidth for "sampling=taking values" should be, as mentioned above, the smallest period for which a periodization of the frequency support is disjoint (disregarding boundaries). Because by then the Fourier transform of the signal coincides on its support with its periodization. Thus one can apply the identity of a sufficiently regular, periodic function to its Fourier series, which is at the heart of the sampling theorem, as You also will find in Shannon's paper.--LutzL 10:55, 19 August 2005 (UTC)

Are you sure you got those right? If a signal is real, the negative and positive frequencies will be identical, so it would seem you only have to sample one side. If a signal is complex, you'd need to sample both positive and negative frequencies, since they can be different. - Omegatron 12:49, August 19, 2005 (UTC)

But You don't sample the frequencies, You sample the signal. You can have complex signals with frequency support only in [1,10] Hz. If You sample it with 9 Hz, You get 18 real numbers (symbols) per second. If You take the real part from such a signal, it has symmetric support, but You have to sample with 20 Hz, which coincides with 20 numbers/symbols per second. The negative frequencies of a real signal have the complex conjugate amplitude of the positive ones. Are You sure You understand what sampling does?--LutzL 14:18, 19 August 2005 (UTC)Reply

Apparently not.

If You sample it with 9 Hz, You get 18 real numbers (symbols) per second.
You must mean a single complex number per sample? The article assumes a real-valued function. I'm certainly only used to real signals.
Complex signals with independent spectra from [-10,-1] and [1,10] only need to be sampled at 9 Hz, but a real signal with a mirrored spectrum has to be sampled at 20? I don't get that. — Omegatron 01:59, 11 November 2005 (UTC)Reply

Please read again my first example. It said "complex function" and "frequency support in [1,10]Hz". That means zero on [-10,-1]Hz. Then 9Hz sampling frequency is theoretically correct. And yes, the samples will be complex numbers, that is 9 complex numbers=18 real numbers per second (QAM has a related sample counting system). In reality signals are real, so one cannot have a spectrum restricted to [1,10]Hz. If there is a nonzero part of the spectrum in [1,10]Hz, there has to be a mirrored and complex conjugate part in [-10,-1]Hz. Thus they are not independent. Following the note in the undersampling section, there is no chance to choose a sampling frequency below 2*10Hz=20Hz (contrary to other situations: e.g., a function with spectrum in [-10,-9]Hz and [9,10]Hz can be sampled with 2*1Hz). The same holds for complex signals with independent spectra in [-10,-1] and [1,10]. They also have to be sampled at 20Hz, only that one gets 20 complex numbers=40 real numbers per second. All this is under the assumption that by sampling one means taking values of the signal. With frequency multiplex methods it is possible to sample real functions with spectrum restricted to [-10,-1]Hz and [1,10]Hz with one polysample per second, the polysample consisting of 18 real numbers. Those techniques compute scalar products of the signal with model functions, technically realized by oversampling (A/D) followed by convolution="digital" filtering followed by downsampling.--LutzL 09:42, 11 November 2005 (UTC)

It said "complex function" and "frequency support in [1,10]Hz". That means zero on [-10,-1]Hz.
Oh. Oops.
with spectrum in [-10,-9]Hz and [9,10]Hz can be sampled with 2*1Hz
That doesn't make sense. If [-10,-1] and [1,10] needs 20 Hz, then so would [-10,-9] and [9,10]. What's the difference between these examples?
If you know that the signal is real, then you know that the spectrum is mirrored and complex conjugate, so it seems that the extra samples are redundant. The real signal from [-10,-1] and [1,10] has the same number of unique frequency components as the complex signal from [1,10]. Given only the positive frequency components, you can calculate all the negative ones. Maybe this isn't sampling theory, though.
Speaking of which, which of these situations are covered in the original sampling theory? — Omegatron 15:21, 11 November 2005 (UTC)Reply

For the difference in the examples: See the undersampling paragraph. In the first, N=0 and only the Nyquist frequency and anything above is a sampling rate. For [9,10]Hz, N=9 and the lowest possible sampling frequency is 1Hz. This is covered in the "proof" section, although this is a real mess. And as I tried to clarify earlier, You don't have the spectrum, there is only a signal that can be sampled by measuring it. Only after that one can compute an approximate spectrum via FFT. But we are concerned with the sampling phase. To throw away the negative part means to perform a Hilbert transform, which is not very well behaved numerically. And You need the already sampled function.
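
The difference between the two examples can be checked with the usual bandpass (undersampling) condition for a real signal whose spectrum lies in [f_L, f_H] and its mirror image: a rate f_s avoids aliasing when 2 f_H / n <= f_s <= 2 f_L / (n - 1) for some integer n between 1 and floor(f_H / (f_H - f_L)). The sketch below assumes that textbook condition; the function names are made up here, and the index n may be offset by one from the N used in the article's undersampling paragraph.

```python
import math


def valid_sampling_ranges(f_low, f_high):
    """List (n, fs_min, fs_max) ranges of alias-free sampling rates for a
    real signal with spectrum in [f_low, f_high] and [-f_high, -f_low]."""
    bandwidth = f_high - f_low
    n_max = math.floor(f_high / bandwidth)
    ranges = []
    for n in range(1, n_max + 1):
        fs_min = 2.0 * f_high / n
        # n = 1 is ordinary (baseband-style) sampling: no upper limit.
        fs_max = math.inf if n == 1 else 2.0 * f_low / (n - 1)
        ranges.append((n, fs_min, fs_max))
    return ranges


def lowest_sampling_rate(f_low, f_high):
    """Smallest alias-free sampling rate under the condition above."""
    return min(fs_min for _, fs_min, _ in valid_sampling_ranges(f_low, f_high))


if __name__ == "__main__":
    print(lowest_sampling_rate(1, 10))   # band [1,10] Hz
    print(lowest_sampling_rate(9, 10))   # band [9,10] Hz
```

For [1,10] Hz only n = 1 is allowed, so nothing below 20 Hz works; for [9,10] Hz the band fits ten times into its upper edge, and the lowest admissible rate drops to 2 Hz, i.e. 2*1Hz as stated above.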

"Original sampling theory": I don't know what You understand by this term, but Shannon in the 1949 paper knew perfectly well that a general signal subspace, one that can serve as the model of a communication channel, needs only to be a function subspace that has an orthonormal basis generated by a finite number of functions per unit interval, that is a basis of the type $g_{k}(t-nT)$, T the time unit, k=1,...,M the different types of basis functions (QAM has M=2, DVB-T has M=4096) and n varying over all integers, orthonormal meaning

\langle g_{k}(t-nT),\,g_{l}(t-mT)\rangle =\delta _{k,l}\delta _{m,n},

using the Kronecker symbol twice.

Look it up, he explains the geometry of signal transmission directly after the citation; the link for the citation is a reprint of his 1949 paper. In the simplest cases, as in the baseband case, the only occurring basis function, here sinc with T=1 for its normalized version, is not only orthonormal to its nT-shifts, but also interpolating, so that the scalar products for a function in the signal subspace are actually values of this function. In other cases, sampling means orthogonal projection onto the signal space by means of computing the scalar products. However, in engineering books this is represented as analog filtering followed by "point-sampling". If You are interested in this topic, You should also read the paper of M. Unser (the guy that runs www.wavelet.org): Sampling: 50 years after Shannon.--LutzL 16:03, 11 November 2005 (UTC)

Shannon's original theorem

Cited from: Claude Elwood Shannon: Communication in the Presence of Noise 1949, probably circulated since 1941

Theorem 1: If a function f(t) contains no frequencies higher than W cps, it is completely determined by giving its ordinates at a series of points spaced 1/2W seconds apart.

This is a fact which is common knowledge in the communication art. The intuitive justification is that, if f(t) contains no frequencies higher than W, it cannot change to a substantially new value in a time less than one-half cycle of the highest frequency, that is, 1/2W. A mathematical proof showing that this is not only approximately, but exactly, true can be given as follows.

Let F(ω) be the spectrum of f(t). Then

{\begin{matrix}f(t)&=&{\frac {1}{2\pi }}\int _{-\infty }^{+\infty }F(\omega )e^{i\omega t}\,d\omega &\qquad &(2)\\&=&{\frac {1}{2\pi }}\int _{-2\pi W}^{+2\pi W}F(\omega )e^{i\omega t}\,d\omega &\qquad &(3)\end{matrix}}

since F(ω) is assumed zero outside the band W. If we let

t={\frac {n}{2W}}\qquad (4)

where n is any positive or negative integer, we obtain

f\left({\frac {n}{2W}}\right)={\frac {1}{2\pi }}\int _{-\infty }^{+\infty }F(\omega )e^{i\omega {\frac {n}{2W}}}\,d\omega .\qquad (5)

On the left are the values of f(t) at the sampling points. The integral on the right will be recognized as essentially the nth coefficient in a Fourier-series expansion of the function F(ω), taking the interval -W to +W as a fundamental period. This means that the values of the samples f(n/2W) determine the Fourier coefficients in the series expansion of F(ω). Thus they determine F(ω), since F(ω) is zero for frequencies greater than W, and for lower frequencies F(ω) is determined if its Fourier coefficients are determined. But F(ω) determines the original function f(t) completely, since a function is determined if its spectrum is known. Therefore the original samples determine the function f(t) completely. There is one and only one function whose spectrum is limited to a band W, and which passes through given values at sampling points separated 1/2W seconds apart. The function can be simply reconstructed from the samples by using a pulse of the type

{\frac {\sin 2\pi Wt}{2\pi Wt}}.\qquad (6)

This function is unity at t=0 and zero at t=n/2W, i.e., at all other sample points. Furthermore, its spectrum is constant in the band W and zero outside. At each sample point a pulse of this type is placed whose amplitude is adjusted to equal that of the sample. The sum of these pulses is the required function, since it satisfies the conditions on the spectrum and passes through the sampled values.

Mathematically, this process can be described as follows. Let x_n be the nth sample. Then the function f(t) is represented by

f(t)=\sum _{n=-\infty }^{\infty }x_{n}{\frac {\sin \pi (2Wt-n)}{\pi (2Wt-n)}}.\qquad (7)

A similar result is true if the band W does not start at zero frequency but at some higher value, and can be proved by a linear translation (corresponding physically to single-sideband modulation) of the zero-frequency case. In this case the elementary pulse is obtained from sinx/x by single-side-band modulation.

Remarks: "Band (of frequency) W" means support of the Fourier-transform F in [-2πW,2πW]. "cps" is "cycles per second".

Compare this to a paper with the same formula, but 30 years earlier (cited by Shannon): Edmund Taylor Whittaker: On the functions which are represented by the expansion of interpolation theory (1915)

Probably this could go into the article if it, despite being rather long, could count as scientific citation.--LutzL 08:59, 25 August 2005 (UTC)--minor corrections--LutzL 09:22, 26 September 2005 (UTC)
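
For readers who want to try Shannon's formula (7) numerically, here is a minimal sketch. It necessarily truncates the infinite sum to a finite block of samples, so away from the sample points the result is only approximate; the function names, the test signal and the truncation length are assumptions made for illustration.

```python
import math


def sinc(x):
    """sin(pi x) / (pi x), with the removable singularity at 0 filled in."""
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)


def reconstruct(samples, first_index, W, t):
    """Evaluate a truncated version of sum (7) at time t.

    samples[i] is x_n = f(n / (2 W)) for n = first_index + i, and W is
    the band limit in cycles per second.
    """
    return sum(x * sinc(2 * W * t - (first_index + i))
               for i, x in enumerate(samples))


if __name__ == "__main__":
    W = 1.0                                  # band limit: 1 cps
    f = lambda t: math.cos(2 * math.pi * 0.3 * t)   # 0.3 cps < W
    N = 200                                  # samples kept on each side
    xs = [f(n / (2 * W)) for n in range(-N, N + 1)]
    for t in (1.5, 0.26):
        print(t, f(t), reconstruct(xs, -N, W, t))
```

At a sample point such as t = 1.5 every pulse but one vanishes, so the sum returns the sample itself; between sample points the truncated sum approaches f(t) as more samples are included.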

Frequency 'Support'

What the he** is a frequency support when it's at home? Never heard of this one. Can someone explain??--Light current 03:06, 2 October 2005 (UTC)

Don't look too hard... Support (mathematics). Cburnett 18:04, 17 October 2005 (UTC)Reply

In fact, it is a little more interesting. A frequency support [A,B] means that the Fourier transform of the signal has its support (mathematics) in the interval $[2\pi A,2\pi B]$. If an engineer speaks about it, she is most likely speaking about a real-valued function with a Fourier transform that is zero outside the union of the intervals $[-2\pi B,-2\pi A]\cup [2\pi A,2\pi B]$. Now try to find out what "bandwidth" means in each of these cases.--LutzL 13:40, 18 October 2005 (UTC)

Yes, it is 2*pi*B; the Fourier transform is wrt. angular frequency.

Depends which transform you're using. f implies hertz; ω implies radians per second. This article uses f where you say it means radians per second?

Can we stick to only one of f or ω? Also can we stick to one of B or W? They are both used for the same thing here, right? I'm a little confused now. — Omegatron 20:05, 19 October 2005 (UTC)Reply

Hi, I did believe that the consensus on Wikipedia was to take the angular frequency Fourier transform. The majority of the articles here and textbooks use it. I too would prefer a transform wrt. normal frequency in Hz, because it is somewhat more "natural" and needs no constant factors. But by then all articles using the Fourier transform, or at least all articles related to signal processing, should be consistent in this.--LutzL 08:56, 20 October 2005 (UTC)

Well, we're allowed to use j instead of i for the imaginary number in electrical articles, so I don't know why we wouldn't be allowed to use f instead of ω in predominantly signal processing articles. I like consistency, too. But using the equations most commonly used by people who actually use them is good, too. As long as everything is tied together and explained I don't see why that would be a problem. Where was this consensus reached? — Omegatron 14:33, 20 October 2005 (UTC)Reply

Well, by changing i to j, only the symbol has changed. By changing ω to f, not only the symbol, but the underlying function changes. This is a difference. By the way, from a mathematical point of view, both f and ω are simply symbols for variables, without further qualification. Sometimes f is even a function. I found this Fortran-like allocation of names always a little confusing. Also, one quickly runs out of "virgin" symbols.--LutzL 15:52, 20 October 2005 (UTC)Reply

So we're using Ω now? I've never seen that used for frequency before except by mistake. — Omegatron 15:15, 24 October 2005 (UTC)Reply

Hi, please change it back to anything that seems appropriate. I only changed it because some IP had it changed to H_s, which makes no sense to me. Ω is used in many recent papers in harmonic analysis dealing with sampling and interpolation formulas and generalisations of the Poisson summation formula. The space of bandlimited functions is then called a Paley-Wiener space $PW_{\Omega }$. One could suspect that Shannon used the capital letter W only because it was at the time complicated to produce Greek letters with a typewriter and the small letters look identical. However, I never saw a manuscript from this time.--LutzL 20:38, 27 October 2005 (UTC)

The above quoted passage by Shannon says "frequencies higher than W cps", though. cps is the same as Hz. — Omegatron 05:53, 28 October 2005 (UTC)Reply

Yes, and then he goes on to use the Fourier integral wrt. the angular frequency, adjusting the support of the Fourier transform to $[-2\pi W,2\pi W]$. But You are right in some way, since the index in PW is usually taken as angular frequency, so sampling frequency 1 corresponds in those papers to $PW_{\pi }$. Have I already said that for me it makes no difference, as long as transforms and symbols are used consistently? --LutzL 12:57, 28 October 2005 (UTC)

Yes, but it makes a difference to our readers. — Omegatron 13:57, 28 October 2005 (UTC)Reply

Hi, I think I found a simple way for that. I also think $[f_{L},f_{H}]$ is easier to understand than $[\mathrm {A} ,\Omega ]$, which is an old joke (or pun?) dating back to biblical times meaning begin and end (of the Greek alphabet).--LutzL 09:45, 11 November 2005 (UTC)

Definitely agree about $[f_{L},f_{H}]$, though remember that f implies cyclic frequency to most people; not angular. — Omegatron 14:37, 11 November 2005 (UTC)

Notation

What's with the := notation? I've only seen that in computer algebra systems. — Omegatron 19:33, 8 December 2005 (UTC)Reply

Hi, that is also frequently used in math texts, meaning "left side is defined as right side". Obviously, in math texts one should avoid constructions as "n:=n+1" (of course, except in code examples), as math texts are considered as static.--LutzL 07:53, 9 December 2005 (UTC)Reply

Edits by User metacomet

This article is listed as a mathematical theorem. So I hope one agrees that it should be mathematically correct and using consistent mathematical notations. To the points metacomet is unwilling to accept:

The easiest to understand should be the variants of the Fourier transform. The article this links to defines it using the usual mathematical convention as ${\hat {s}}(w)={\mathcal {F}}(s)(w):={\frac {1}{\sqrt {2\pi }}}\int _{\mathbb {R} }e^{iwt}s(t)\,dt$ . So a "frequency component" of 1 Hz would show up at $w=2\pi /sec$ . So there is a need to explain that the Fourier-transform as used in signal analysis is $S(f)={\mathcal {F_{N}}}(s)(f):=\int _{\mathbb {R} }e^{i(2\pi f)t}s(t)\,dt$ .
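
The link between the two conventions is a single substitution. Taking both definitions exactly as written above (including their sign convention, which is assumed here rather than checked against the linked article), setting $w=2\pi f$ gives:

```latex
S(f) = \int_{\mathbb{R}} e^{i(2\pi f)t}\, s(t)\, dt
     = \sqrt{2\pi}\,\left[\frac{1}{\sqrt{2\pi}} \int_{\mathbb{R}} e^{iwt}\, s(t)\, dt\right]_{w = 2\pi f}
     = \sqrt{2\pi}\;\hat{s}(2\pi f).
```

So the two transforms differ only by the constant factor $\sqrt{2\pi}$ and a rescaling of the frequency axis: a band limit $|f|\leq W$ in hertz corresponds to $|w|\leq 2\pi W$ in radians per second.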

The more controversial point is function notation. In mathematics a function is introduced as " $f:D\to V$ with f(x)=...". f(x) denotes the value at the point x, nothing else. Sometimes in school mathematics or engineering the function notation is abused. To write f(x) for clarity is perhaps tolerable, but to write ${\mathcal {F}}\{s(t)\}$ translates into "the Fourier transform of the constant s(t)", which does not exist (as function). The Fourier transform of the function s at frequency f has the notation used above.

--LutzL 18:47, 21 December 2005 (UTC)Reply

By the way, using the notation ":=" is completely non-standard. The only place I have ever seen ":=" is as the assignment operator in the Pascal programming language. In that context, the notation ":=" means "is set equal to". You seem to be using ":=" to mean "is defined as". In my experience, it is acceptable to use the plain old "=" sign, or if you really want to be careful, then to use the "equivalent to" symbol, as in $A\equiv B$. — Preceding unsigned comment added by metacomet (talk • contribs)

Do you think it's possible that the reason the preceding comment was left unsigned is maybe just maybe that I forgot to sign it? So maybe instead of trying to make me look like a fool, you could just once cut me a break.... -- Metacomet 18:54, 21 December 2005 (UTC)

This is why I really think a translation page is necessary. The equiv sign is, for mathematically oriented people, only used in connection with remainder classes or to signify that a function is identically constant to a given value.--LutzL 18:46, 21 December 2005 (UTC)

Response from Metacomet

This article is about signal processing and telecommunications theory, not mathematics. Furthermore, the disagreement that we are having is about notation, not about meaning. The notation that you are using is extremely confusing and difficult to understand. I am not interested in mathematical orthodoxy, I am interested in improving this article so that it is accessible by people other than just mathematicians. As I am sure you are aware, the purpose of Wikipedia is to provide information for a general audience – including people other than advanced theoretical mathematicians.

I don't know where you have gotten the strange notion that s(t) is a constant, but just plain s is a function. That makes no sense whatsoever. When I see just plain s, it indicates to me that you are talking about a complex number s which is the argument of a complex valued function. On the other hand, when I see s(t), it is absolutely clear to me, with no ambiguity whatsoever, that we are talking about a function s with an argument t. And in fact, by a very common convention, the argument t, especially in the context of this particular article, is very often meant to represent a real-valued, continuous domain that engineers and scientists like to call time.

So please don't lecture me about what is correct notation and what is incorrect notation. There is no such thing as correct or incorrect notation. Notation is to a large degree a subjective matter of taste and an objective matter of effectiveness. The test is whether a given notation is easy for the reader to understand clearly and unambiguously, not whether it meets the author's view of mathematical orthodoxy.

One more thing: throughout this entire article, the frequency domain is discussed using the symbol f in units of hertz. Nowhere does the article refer to angular frequency ω in radians per second. Yet all of a sudden, you want to introduce the Fourier transform in terms of angular frequency ω instead of frequency f because you think that angular frequency is more correct than plain old frequency. I have news for you: go out in the real world and talk to people who design real signal processing systems. They deal in the world of hertz, not radians per second. Oh, and they have no trouble converting from one to the other when they need to. It isn't all that difficult to multiply or divide by 2π. On the other hand, why not simply use the frequency form of the FT instead of the angular frequency form? It is just as real, and just as valid. And in many ways, it is far easier and more intuitive than the other.

-- Metacomet 16:31, 21 December 2005 (UTC)Reply

Agree that the article needs more from a signal processing perspective. Mathematical rigor can be a good thing, but the article should be accessible to the people who actually use the theorem, and should use their conventions as well, at least at the beginning. The more concise definitions can be later in the article, perhaps.

The article is still confusing and cluttered.

Wikipedia:Make technical articles accessible applies to mathematics, as well. — Omegatron 16:57, 21 December 2005 (UTC)Reply

Well, I hope You could just cool down a bit. Although sampling is something that is more frequently done in signal processing than in mathematics, the sampling theorem is a purely mathematical theorem telling something about some very ideal situation involving sharply bandlimited functions that are nowhere to be found in practice. That's why it has a mathematical category in the bottom. In practical sampling You can forget about the factor-2-rule of this theorem, since it is a theoretical, ideal limit for situations where You can wait for an indefinite amount of time and perform an indefinite number of operations. I even very much doubt that factors <3 are meaningful if near perfect reconstruction is demanded.

For the Fourier transform: please check where the link leads to and how the transform is defined there. Your assumptions and foreknowledge don't matter, since this is intended to be also read by people who don't have them. If You want consistency then make an article ((Fourier transforms in signal processing)) where this other definition with its slightly different properties is to be found and link to this one.

I always thought that even in FORTRAN they marked predefined meanings for variable names beginning with certain letters as old style. So bad news for You: in any scientific or technical article the symbol s can stand for anything, an integer, a real or complex number, a matrix or a sequence or even a function, just anything. That's why well written articles explain at least informally what every symbol stands for. But Your attitude explains why so few students in computer science, even with signal processing as a specialty, properly understand the Fourier transform or even polynomials and convolution. They just aren't able to read common math textbooks. So I expect that " ${\mathcal {F}}:L^{2}(\mathbb {R} ,{\mathbb {C} })\to L^{2}(\mathbb {R} ,{\mathbb {C} })$ is a unitary linear function" is an incomprehensible statement for You, even though it is one of the fundamental properties that make the Fourier transform useful in signal processing.

--LutzL 17:28, 21 December 2005 (UTC)Reply

You are unable to argue your case on the merits, so you revert to attacking me personally. Instead of presenting a solid case in favor of your ideas, or even showing the flaws in my arguments, you resort to demonstrating your superior intellect by putting down my intellect. That is not a very effective approach to convincing anyone that your ideas have any merit whatsoever. -- Metacomet 17:34, 21 December 2005 (UTC)

Please skim through the discussions above this one, I don't want to repeat myself on the same page. I told You in the post directly above this section that notation such as s(t) is misleading, especially if there is no explanation of what s and t are, and that there are two different (by constants) versions of the Fourier transform in common use, so one has to state clearly which one one uses. I would agree that such distinctions are confusing in the introduction, but then one should not use the FT there at all and restrict the text to some diffuse but perhaps "intuitive" phrases like "frequency component". It was no insult to You but a general observation that very few people in signal processing know what an L^p-function space is. However, without this theory one can't understand in which sense the FT is invertible, which is a very important property in signal analysis. Please read everything above, omegatron explicitly stated that he doesn't know about those spaces.--LutzL 17:52, 21 December 2005 (UTC)

First, when I have some free time, I will read through all of the discussion above, as you suggested.

Second, I am well aware that there is more than one form of the Fourier transform. In fact, there are at least four different forms in common usage. I personally think that there is way too much time and energy expended arguing endlessly and mindlessly about which form is better, and which form we should use. Frankly, who cares? Everyone makes such a big f---ing deal about what really just amounts to simple scaling factors. Is it really all that difficult to understand the difference between frequency in hertz and angular frequency in radians per second (and sampling frequency in radians per sample for that matter)? I have a confession to make: I use all three of these conventions all the time, and I readily switch back and forth amongst them without getting confused (most of the time). Wow. Again, who cares?

Third, I do know what L^p function space is, although I only just recently learned about it in a formal setting, and I must admit that I do not understand it at any great level of depth, nor do I really see why it is all that important or relevant. But that is beside the point. I would guess that most electrical engineers and signal processing practitioners do not have a deep background in function spaces. So if you start throwing around a lot of obscure notation without clearly defining what you mean (in English, not in math jargon), then you are absolutely going to confuse people. Worse yet, you will actually lose them right from the start, because they will immediately stop reading the article.

Fourth, I agree with Omegatron that this article is confusing and cluttered. I would like to make it less confusing and less cluttered. If you want to help achieve this objective, I would welcome the help. But please, let's not bury the general reader with a bunch of esoteric notation and confusing jargon in the very first three paragraphs!

-- Metacomet 18:08, 21 December 2005 (UTC)Reply

I nowhere claimed that the "mathematical" version of the FT should be used. Quite to the contrary, even before You began editing the article, I've put in the conversion to the "real frequency" transform in the first occurrence of the FT, again please read above for earlier discussions on this topic. But, I repeat myself, if You put directly below the link to the "mathematical" FT without any further notice a formula using the "real frequency" FT, I call that confusing. Not that the state as it is now is better readable. As I said, leave the FT out of the introduction and use such equally obscure but better to imagine phrases as "frequency component".

If You want to note functions by s(t), fine, make a note at the top telling "In this article, mathematical notation as common in communication technology is used except in places marked "mathematical formulation"" and refer to a translation page, so that also a mathematician can understand what people in signal analysis are talking about. There it should also be explained how You want to refer to a function value, say at 10. Is it s(10), which is confusing, since the letter t is missing from the function symbol, or is it s(t=10) or ...?--LutzL 18:34, 21 December 2005 (UTC)

This is a waste of time. I am done. Have a nice day. -- Metacomet 18:42, 21 December 2005 (UTC)Reply

It seems to me that you (LutzL) do not really understand the purpose of Wikipedia. But I am really tired of arguing about it with you and several others. I have had enough. If you want to make Wikipedia into a Mathematics textbook, then go for it. I am not going to stop you. I will find a better way to use my time and energy. See 'ya. -- Metacomet 19:07, 21 December 2005 (UTC)Reply

Sorry, I had the same impression about You. You seem to regard anything published here as Your private notepad, so that notations and necessary conversions don't need to be explained. If I may tell Ya, there are people in the outside world that don't have an education in what signal processing people believe to be mathematics (As there are people that don't know math, but they wouldn't be much interested in the sampling thm). So they have perhaps problems understanding ordinary mathematics notations, and then they are confronted with a seemingly different kind of math. Goodbye, farewell and a happy new year.--LutzL 19:21, 21 December 2005 (UTC)Reply

Think about it this way. There are a few different types of people who will be reading this article:

Laymen who don't know anything about anything remotely related to sampling
Laymen who know a little about signal processing, but not much; probably audio-related
Engineers
Mathematicians

LutzL, what percentage of each do you think exist in our readership? — Omegatron 20:17, 21 December 2005 (UTC)Reply

The sampling theorem for beginners

I have added a new section which contains almost no math but which hopefully describes the core of the sampling theorem in a sufficiently simple manner to be useful to someone not familiar with the concept. My suggestion is that this, or something like it, can be put somewhere in the beginning and that its content can be developed in more mathematical details later on in the article --KYN 01:20, 2 January 2006 (UTC)Reply

Thank you for writing this section. I think you did an excellent job in explaining the concepts at an introductory level without oversimplifying things. It is also well written and presents the information in a logical manner. I hope you don't mind the edits that I made, but different people can bring additional ideas to the party, which often improves things. Again, well done. -- Metacomet 21:54, 2 January 2006 (UTC)Reply

BTW, I agree with your assertion that a similar section at or near the beginning of many math and technical WP articles would make a major improvement in terms of accessibility and clarity. -- Metacomet 21:56, 2 January 2006 (UTC)

Historical background

I moved this section to the end of the article since it didn't make sense to me to have it between two technical sections. --KYN 21:31, 2 January 2006 (UTC)Reply

Strict or non-strict inequality

There are now two versions of the condition for the Nyquist rate:

It should be strictly larger than the largest frequency component of the signal (intro)

It should be larger or equal to the largest frequency component of the signal (section "Formal statement of the theorem")

My understanding of the theorem is that the second condition is not correct since it implies that the largest frequency component of the signal will in fact be aliased (as described in the "Critical frequency" section).

I suggest that only the first version of the condition is used consistently in the article, and also that the "Critical frequency" section is removed since it only describes a special case of aliasing which is not really more interesting than any other. Also, "critical frequency" is here defined as synonymous with the Nyquist rate. Maybe move that to where the Nyquist rate is defined in the intro? --KYN 23:02, 2 January 2006 (UTC)

I agree with you on the issue of strict inequality. In practice, it really doesn't matter that much, because you will always tend to oversample at least a little bit to provide some margin of safety. Nevertheless, the article should state the theorem as a strict inequality, which is theoretically correct and avoids ambiguity. -- Metacomet 23:16, 2 January 2006 (UTC)Reply

The section entitled "Critical frequency" is not all that significant to the article, but I am inclined not to remove it because I think it makes an interesting point, and helps justify why the sampling condition is a strict inequality. It may make sense to incorporate it within another, existing section. I am not sure it merits its own section. -- Metacomet 23:20, 2 January 2006 (UTC)Reply
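The point about the critical frequency can be checked numerically. The following is a minimal sketch, not taken from the article: the sampling rate of 8 Hz and the helper name `sample_sinusoid` are illustrative assumptions. It samples a sinusoid whose frequency is exactly half the sampling rate and shows that the samples cannot determine its amplitude and phase.

```python
import math

def sample_sinusoid(amplitude, freq_hz, phase_rad, fs_hz, n_samples):
    """Samples of amplitude*sin(2*pi*freq*t + phase) taken at t = n/fs."""
    return [amplitude * math.sin(2 * math.pi * freq_hz * n / fs_hz + phase_rad)
            for n in range(n_samples)]

fs = 8.0          # hypothetical sampling rate in Hz
f_crit = fs / 2   # component exactly at half the sampling rate

# Zero phase: every sample falls on a zero crossing, so the component vanishes.
zero_phase = sample_sinusoid(1.0, f_crit, 0.0, fs, 8)

# Two different sinusoids at f_crit that yield the same samples,
# because sin(pi*n + phi) = (-1)**n * sin(phi).
sig_a = sample_sinusoid(1.0, f_crit, math.pi / 6, fs, 8)  # amplitude 1.0, phase 30 degrees
sig_b = sample_sinusoid(0.5, f_crit, math.pi / 2, fs, 8)  # amplitude 0.5, phase 90 degrees

print(max(abs(s) for s in zero_phase))                 # tiny (floating-point rounding only)
print(max(abs(a - b) for a, b in zip(sig_a, sig_b)))   # tiny (floating-point rounding only)
```

Since two distinct signals give identical samples when a component sits exactly at half the sampling rate, the non-strict condition does not guarantee unique reconstruction for such sinusoids.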

Maybe in the "Alias" section as an example? --KYN 23:38, 2 January 2006 (UTC)Reply

I would eliminate the use of the term "critical frequency" and use the term "Nyquist rate", which is standard and far more common. -- Metacomet 23:23, 2 January 2006 (UTC)Reply

Maybe not eliminate, since it has worked its way into wikipedia, see critical frequency. I don't use it myself, but I guess some do. Also: "Nyquist frequency" is a relatively common synonym (according to Google even more common than Nyquist rate). Maybe all three can be presented in the intro but use "Nyquist rate" throughout the rest of the article? --KYN 23:38, 2 January 2006 (UTC)

We need to be careful with regard to "Nyquist rate" and in particular "Nyquist frequency". In my experience, both terms are and should be used interchangeably. Some people, however, seem to define "Nyquist frequency" as one-half of the sampling rate whereas they define "Nyquist rate" as twice the maximum frequency component of the signal. Take a look at Nyquist rate and Nyquist frequency and you will see what I mean. Personally, I find this to be extremely confusing, and that is why I prefer to use only a single definition. Unfortunately, the genie is already out of the bottle, and as an encyclopedia, WP needs to report things as they are, not as they should be. So somehow we need to address these different meanings (which you have done already in the "for beginners" section) without confusing people. -- Metacomet 23:52, 2 January 2006 (UTC)

In fact, a frequency component at a frequency f has a Fourier transform that is different from zero in some neighborhood of f, so this component has a highest frequency that is strictly larger than f. To demand that the highest frequency $f_{H}$ does not exceed $f_{s}/2$ is the same as to demand that $f_{H}\leq f_{s}/2$ , which is the same as to demand that there are no frequency components above $f_{s}/2$ . The value at the critical frequency is not important since first, the support of a function is always closed, and second the Fourier transform is a measurable function, that is a class of classical functions that differ only on a set of measure zero. That is, $X(f_{s}/2)=0$ is not a well-defined condition, but

\int _{|f|>f_{s}/2}|X(f)|^{2}\,df=0

is well-defined and characterizes bandlimited functions with highest frequency $f_{s}/2$ . For example, $\operatorname {sinc} (f_{s}t)$ is bandlimited with highest frequency $f_{s}/2$ , its Fourier transform is a rectangular function, and its support is the closed interval $[-f_{s}/2,f_{s}/2]$ .--LutzL 11:43, 10 February 2006 (UTC)

Rearranging the order of the sections

I also think that the section entitled "Formal statement of theorem" should come much earlier in the article, perhaps following the new section called "The sampling theorem for beginners" and before the section on "Aliasing." Any thoughts? -- Metacomet 23:29, 2 January 2006 (UTC)Reply

You know, the more I think about it, maybe some of the material in the new "for beginners" section should actually move up to the introductory section of the article (above the table of contents). Then we could combine some of the material from the current introduction, which is a bit math-heavy, with the "Formal statement" section. As currently organized, I am afraid the article might scare away non-technical or non-mathematical people before they even get to the new "for beginners" section. We could eliminate this risk by rearranging the presentation a bit. Any thoughts? -- Metacomet 23:35, 2 January 2006 (UTC)Reply

Before starting to make too much work, I would like to have some clarifications about the relation between the Nyquist–Shannon sampling theorem and the Nyquist-Shannon interpolation formula. To me they appear to describe more or less the same thing (even more so after the new section). I guess that from a historical point of view they have developed along different paths by different people or been applied to different problems, but I don't really see the point in having two articles on the same subject. Or is it the case, that we should interpret the sampling theorem as only talking about the sampling frequency and not about how the sampling is being made or about how the signal is being reconstructed? Is it possible to realize that the signal should be reconstructed by means of "sinc-interpolation" but at the same time fail to realize that this is only possible if the sampling frequency is larger than twice the bandwidth of the signal? To me, the description of the sampling process and the reconstruction process together with the relation between the sampling frequency and the signal's bandwidth form one single concept which I like to think about as the "sampling theorem". It states conditions on all of these three issues in order to accomplish perfect reconstruction. On the other hand, this is only my own mental picture of what is going on. --KYN 00:24, 3 January 2006 (UTC)

I think your mental model is perfectly fine, but I don't agree with your conclusion that they should all be lumped together into a single article. One of the great things about a web-based encyclopedia is the ability to create hyper-links between related articles. I do see the need for separate articles on each of the three topics, with appropriate links between them. Of course it is always possible to merge related articles together, or to break up sub-sections of one article into several smaller articles. The issue is to find the right balance between the length of the article versus keeping logically-related concepts together. Often, that means creating a single high-level overview article, in this case maybe Sampling, and then several related and more detailed sub-articles, such as Nyquist-Shannon sampling theorem and Nyquist-Shannon interpolation formula and Bandlimited, etc. -- Metacomet 01:16, 3 January 2006 (UTC)Reply

BTW, if you are sure that you want to merge the articles, you should post the appropriate tag at the top of each article and allow people time to comment. There are three relevant templates: Merge, Mergeto, and Mergefrom. -- Metacomet 01:20, 3 January 2006 (UTC)Reply

In principle I agree, but still, before even thinking about merging or not, I would like to see a statement about what is the topic of this article. It could either be

"How do I sample a signal and then reconstruct it again" (maybe "Sampling and Reconstruction" is better) which presents the basic operations and, at suitable places, introduces concepts such as bandlimitation, sampling frequency, etc. In this case "the sampling theorem" is interpreted in a rather general sense.

"The Nyquist-Shannon sampling theorem" which, I guess, could focus only on the statement about the sampling frequency being larger than twice that of the signal's bandwidth. In this case "the sampling theorem" is given a more narrow interpretation.

One way to proceed is to start a new article, e.g. "Sampling and Reconstruction" which could be based on the "Beginners..." section and extended with some more formal statements. Then there are (already) separate articles on the different concepts which relate to the sampling and reconstruction processes which can go into much more technical/mathematical depth. However, these must then be

Consistent in notation

Consistent cross-linking between the basic article and the technical articles

As little duplication of information as possible, and instead use cross-links.

Is this a way forward? --KYN 09:01, 3 January 2006 (UTC)Reply

Consequences of the theorem

The topic of this section is the fact that a function (signal or not) cannot be truncated in both domains simultaneously. This is a very important fact, which sometimes has practical consequences, and it deserves to be mentioned in the wikipedia. However, I do not see how this result is a consequence of the sampling theorem. It is rather a general property of functions, even functions of arbitrarily many variables. A proof of the statement could be based on the sampling theorem, but it must be possible to prove it in other ways as well. I suggest moving this section to the bandlimited article, since it relates more to that stuff than specifically to the sampling theorem. --KYN 10:01, 4 January 2006 (UTC)

I have now moved the "Consequences of the theorem" section and incorporated it into the Bandlimited article. --KYN 22:20, 7 February 2006 (UTC)Reply

The following discussion is an archived debate of the proposal. Please do not modify it. Subsequent comments should be made in a new section on the talk page. No further edits should be made to this section.

The result of the debate was don't move. —Nightstallion (?) ^{Seen this already?} 18:50, 23 April 2006 (UTC)Reply

Requested move

Nyquist–Shannon sampling theorem → The sampling theorem … Rationale: Credit for the theorem appears to be controversial, and Wikipedia has a NPOV policy. Also for consistency; e.g. the Relativity article isn't named Einstein's Theory of relativity. --Bob K 11:42, 18 April 2006 (UTC) (copied from WP:RM#18_April_2006)Reply

Survey

Add *Support or *Oppose followed by an optional one-sentence explanation, then sign your opinion with ~~~~

Oppose. Shouldn't have "The" in any case. Need to distinguish Fourier sampling theorem,[6] perhaps others (or explain that this one is also known by that name). Use redirects to accommodate the various combinations of the four names associated with this theorem. Gene Nygaard 13:06, 18 April 2006 (UTC)Reply

Good point. Statistical sampling is another example. Wikipedia currently redirects sampling theory not to here, but to statistics. --Bob K 02:19, 23 April 2006 (UTC)Reply

Oppose per Gene. Also, my experience from music sampling is that it's known as "Nyquist sampling", which I've just created as a redirect. Regards, David Kernow 18:24, 18 April 2006 (UTC)Reply
On the fence, redirects cover a lot of the aliases and "Nyquist-Shannon sampling theorem" appears to be sufficiently disambiguous and most common. Cburnett 02:23, 19 April 2006 (UTC)Reply
Oppose. Agree with Gene; and add that I've actually used the theorem. — Arthur Rubin | (talk) 18:34, 22 April 2006 (UTC)Reply

Discussion

Add any additional comments

Using –s in article titles seems ill-advised to me. Anyone else agree? Regards, David Kernow 18:54, 18 April 2006 (UTC)Reply

I agree. It should be a hyphen. Gene Nygaard 15:19, 19 April 2006 (UTC)Reply
- It seems to be the convention in WikiProject Mathematics that theorems, conjectures, etc., named after more than one person are separated by –s rather than hyphens, to distinguish them from being named after a single hyphenated person. I don't like it, either, but it's done in a number of places. — Arthur Rubin | (talk) 16:21, 23 April 2006 (UTC)

The article mentions two other aliases. But actually there are many more than that, according to Google. That is what bothers me about the current name... its arbitrariness. And it appears to be one of the least popular forms. The one I proposed turns out to be the most popular form, but I agree that something more specific would be better. Digital sampling theory? Waveform sampling theory? Signal sampling theory? I realize it applies to other types of functions, but Function sampling theory would be too vague. --Bob K 02:19, 23 April 2006 (UTC)Reply

I'd say a solution is to ensure the article has a sufficiently accurate name, then have all less accurate, more everyday names redirect to it – or, if some of these less accurate names might refer to other sampling methods, have them redirect to a disambiguation page. Regards, David Kernow 11:45, 23 April 2006 (UTC)Reply

The above discussion is preserved as an archive of the debate. Please do not modify it. Subsequent comments should be made in a new section on this talk page. No further edits should be made to this section.

Aliasing in Undersampled Data

It would be useful to have a section explaining where aliased frequencies appear in undersampled data. This might be a curiosity for the main body, but in the undersampling section it would be useful to explain where the baseband signal goes within the undersampled dataset.

The general case would be that frequency F appears as an alias Fa = mod(abs(Fs - F), Fs).

To pick up with the FM radio band example, if you undersample at 43.2MHz (n=0) then a station at F=94.7MHz appears at 8.3MHz. The entire band covers 1.6 - 21.6MHz.
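The numbers in this example can be reproduced with a short script. This is only a sketch: `alias_quoted` implements the formula exactly as written above, while `alias_folded` is an assumed alternative (not from the comment) that folds any frequency into the first Nyquist zone [0, Fs/2]. The FM band edges of 88 and 108 MHz are also an assumption about what "the entire band" refers to.

```python
def alias_quoted(f, fs):
    """Formula as quoted in the comment above: mod(abs(Fs - F), Fs)."""
    return abs(fs - f) % fs

def alias_folded(f, fs):
    """Fold f into the first Nyquist zone [0, fs/2] (assumed alternative)."""
    r = f % fs
    return min(r, fs - r)

fs = 43.2  # MHz, sampling rate from the example

station = alias_folded(94.7, fs)      # the station at 94.7 MHz
band_low = alias_folded(88.0, fs)     # assumed lower FM band edge
band_high = alias_folded(108.0, fs)   # assumed upper FM band edge

print(round(station, 3), round(band_low, 3), round(band_high, 3))

# Both formulas agree for the station in the example ...
print(round(alias_quoted(94.7, fs), 3))
# ... but differ for a signal already in baseband, e.g. F = 10 MHz.
print(round(alias_quoted(10.0, fs), 3), round(alias_folded(10.0, fs), 3))
```

The two functions agree on the 94.7 MHz station, but the quoted formula maps a 10 MHz input to 33.2 MHz, which lies above Fs/2, so it may need a folding step before it is stated as the general case.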

Mathematical basis of sampling and reconstruction theorem.

BobK, i dunno who did the previous "proof", so if it was you, i apologize in advance if this isn't flattering (i'm hoping it was someone else's contribution). but, that proof did not even come close to it. it made assertions without support and it really didn't connect its assertions to proof of the sampling and reconstruction theorem. in other words, some of its assertions were non sequitur. (an example that i can recall is that the DTFT mentioned has nothing to do with it. the DTFT or the Z transform comes after the issue of sampling is settled.) Rbj 03:22, 4 May 2006 (UTC)Reply

It is my attempt to simplify the 18-Jan version, shown below. Don't worry about flattery. It's more important to know what you really think. I fully expect to find disagreements and irreconcilable differences here. And I don't expect to get my way, because I have limited patience for argument. It's all just temporary anyway, because whatever we do will inevitably erode under the constant bombardment from other contributors. Anyhow, I won't attempt to defend the technical correctness of "proof 1". I think this is more correctly framed as a labelling issue, rather than an issue of inclusion or exclusion. Call it "Intuitive explanation of the sampling theorem and aliasing", if you like. I think there is a need for it, because in my experience it is the most common way the theorem is introduced to newbies. It is the verbal equivalent of the frequency-domain pictorial explanation. --Bob K 12:42, 4 May 2006 (UTC)

The introduction of the proof below [labelled 18-Jan version] is not entirely correct. It takes only some minor modifications in Shannon's original proof, the one that only uses measurable, square-integrable functions, to also include aliasing etc. pp. Probably I will never understand why communication engineers, computer scientists or, beware, systems engineers feel more comfortable with distributions than with "normal" functions. To be constructive: One could make use of the Poisson summation formula to unify all proofs. And someone should explain what the difference between a sine wave of some frequency and a frequency component at that frequency is. This is quite a big error in the article where it explains the Nyquist frequency. But well, I made those points above and nothing happened.--LutzL 13:45, 4 May 2006 (UTC)

Just to make sure we are on the same page, that introduction has been gone since 18-Jan. My point was to show that although I created the version Rbj replaced on 30-Apr, it wasn't from scratch. I was just simplifying an existing article. The simplification is now copied below the 18-Jan version, for easier reference. --Bob K 22:50, 4 May 2006 (UTC)Reply

Yes, and what is written in the recent proof in great length is plain wrong, especially the claimed identity

T\sum _{n\in \mathbb {Z} }\delta _{t-nT}=\sum _{k\in \mathbb {Z} }e^{i(2\pi k/T)t}.

That chain of symbols makes no sense, at least if one wants it as a mathematical proof. There is no Fourier series for the Dirac comb. One can construct approximating sequences for this distribution and those have a Fourier series and their coefficients converge to 1 and this construction is used to prove the convergence of the Fourier series (without ever mentioning distributions). But the series as written in the article does not converge in any meaningful sense. Its use does not constitute a proof. This formula is just a rule of thumb for mathematically challenged students or profs. One may perform calculations with this rule that have a correct result. But to get there one has to obey strict rules that amount to the different statements of the Poisson summation formula or the identity of the Fourier development of a periodic function to that function. Also in the latter case there are different instances of this identity for different classes of functions and different types of convergence of the Fourier series. But if one has to include all those precautions in the proof one arrives at Shannon's original proof, only in a highly bloated version. I suggest putting Shannon's proof in the first place and shortening those calculations using the rule of thumb into something readable.--LutzL 12:04, 5 May 2006 (UTC)

well, i'm not in agreement with you and am utterly unimpressed with any appeal to "the dirac delta function is not a function therefore you cannot treat it as one" sort of argument. this is a dispute of concept that many electrical engineering scholars (some that publish in IEEE as well as write textbooks) have with mathematicians. i've had courses in Real Analysis, too. i know that, from the POV of mathematicians, that when f(x) = g(x) almost everywhere (that is everywhere but a countable number of discrete values of x in the domain) that the integrals of f(x) and g(x) must be equal over the same region or set of x. this of course means that if one treats the Dirac delta function as a "function" and let f(x) = δ(x) and g(x) = 0, you have one integral that is equal to 1, yet the other integral is 0 and Lebesgue integration tells us that this cannot be so.

but it's what is done in countless textbooks on Linear System theory, all the time. and if you were to take a bunch of "mathematically challenged students" and try to use a nuanced argument involving distributions to explain what a driving impulse is (and the impulse response) of a linear, time-invariant system, you will accomplish nothing, pedagogically. you may as well make them learn it by rote. in addition, such differentiation of the Dirac delta from the "nascent" delta functions is not useful in the context of physics/engineering/signal_processing, and is numerically unimportant. if you don't allow making a proof from the nascent definition because "the dirac delta function is not a function", then i'll pick a nascent impulse function that has an area of 1 and a width of 1 Planck time (which is a legitimate function) and there will be no measurable difference in any conceivable physical application, of which sampling a reasonably well-behaved function of time is one.

even if one were to stick to the distribution/functional definition, the Dirac delta is perfectly meaningful delayed by an arbitrary finite time as long as it eventually finds itself in an integral where the strict definition of the Dirac delta then has meaning. given that, it is most certainly meaningful and true that

T\sum _{k}\delta (t-kT)=\sum _{n}e^{i(2\pi n/T)t}.

If there is such a thing as a Dirac comb, it is certainly the case that it is periodic, has a Fourier series, and the Fourier coefficients are easy to get (which is where the Dirac deltas get integrated). you can count me as one of your "mathematically challenged" (i've had more math courses, including Probability, Complex Variables, Real Analysis as well as Metric Spaces and Functional Analysis, Approximation theory, etc. than the average electrical engineer), but your appeal to "the Dirac delta isn't even a function" argument gets nowhere, fast. BTW, feel free to de-bloat it. Rbj 05:47, 6 May 2006 (UTC)

Another reason the proof is bloated is that it includes the derivation of sinc() as the inverse transform of rect(). And it's deja vu all over again in "Shannon's original proof". Why not reference the table of Fourier transforms? It would be good to have the derivation somewhere in Wikipedia, but it only needs to appear once, and this is not the logical place for it. --Bob K 01:48, 9 May 2006 (UTC)Reply

Getting back to Rbj's objection to use of the DTFT, I have to disagree. The issue of sampling is "settled" once you state that your model is:

x_{s}(t)=\delta _{T}(t)x(t)

You can quibble with the model, but the proof only claims to apply to the model, not to the actual samples. It deals with the question of recovering $x(t)\,$ from $x_{s}(t)\,$ ... or equivalently the question of recovering ${\mathcal {F}}\{x(t)\}\,$ from ${\mathcal {F}}\{x_{s}(t)\}\,$ . And ${\mathcal {F}}\{x_{s}(t)\}\,$ is a DTFT. --Bob K 01:24, 5 May 2006 (UTC)
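The last step can be written out explicitly. This is a worked sketch under two assumptions that the comment above does not state: the forward transform uses the kernel $e^{-i2\pi ft}$ , and $\delta _{T}(t)$ denotes $\sum _{n}\delta (t-nT)$ .

```latex
x_{s}(t) = \sum_{n=-\infty}^{\infty} x(nT)\,\delta(t-nT)
\quad\Longrightarrow\quad
\mathcal{F}\{x_{s}(t)\}(f) = \sum_{n=-\infty}^{\infty} x(nT)\,e^{-i 2\pi f n T}
```

The right-hand side depends only on the sample values $x(nT)$ and is periodic in $f$ with period $1/T$ , which is the form usually called the DTFT of the sequence.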

i agree that the FT of $x_{s}(t)=\delta _{T}(t)x(t)$ is equivalent to and motivates the definition of the DTFT (with the proper normalization of ω, there is no $f_{s}$ or $T$ in the DTFT or Z transform). it's just that pedagogically (and that is my primary motivation) the DTFT is further on. the Sampling Theorem is the interface between the continuous-time domain and the discrete-time domain and it needs to be presented (and proven) without appealing to future concepts, even the not-so-distant future concepts. This remains a concept purely in the continuous-time domain that, when conquered, allows one to go to the discrete-time domain with full confidence of how that person got there (with no ostensibly circular references). Rbj 05:47, 6 May 2006 (UTC)

Your objection to the DTFT is understood. But it is superficial, because the DTFT is just a special case of the Fourier transform. It is a notational convenience. I can make the same argument with the continuous Fourier transform. --Bob K 01:48, 9 May 2006 (UTC)Reply

18-Jan version

A more modern approach to proving the theorem uses a distribution known as an infinite impulse train, or Dirac comb. Although this approach seems more complicated than Shannon's, it has a couple of important advantages: it provides additional insight into the not-exactly-bandlimited case and it explains the undersampling case.

Consider the signal $x(t)\,$ , which represents any continuous-time signal with fast decay at infinity.

Further, consider a second continuous-time signal, $q(t)\,$ , defined as a Dirac comb:

q(t)=\Delta _{T}(t)=\sum _{n=-\infty }^{\infty }\delta (t-nT)

with Fourier transform

Q(f)={\mathcal {F}}\{q(t)\}=\sum _{k=-\infty }^{\infty }\delta (f-kf_{s})

where

f_{\mathrm {s} }={\frac {1}{T}}

is the sampling rate (in hertz = cycles per second).

The result of their multiplication in the time domain is

v(t)=x(t)\cdot q(t)=x(t)\cdot \sum _{n=-\infty }^{\infty }\delta (t-nT)=\sum _{n=-\infty }^{\infty }x(nT)\,\delta (t-nT)

.

Since multiplication in the time domain corresponds to convolution in the frequency domain, we have

V(f)=X(f)*Q(f)=X(f)*\sum _{k=-\infty }^{\infty }\delta (f-kf_{s})=\sum _{k=-\infty }^{\infty }X(f-kf_{\mathrm {s} })

which follows from the shifting property of the Dirac delta under convolution.

In order to reconstruct the original signal $x(t)\,$ from its samples $v(t)\,$ , we need to find a function $h(t)\,$ that we can use as an interpolation kernel. The reconstructed signal will then take the form:

{\tilde {x}}(t)=v(t)*h(t)=\sum _{n=-\infty }^{\infty }x(nT)\,h(t-nT)

.

The goal is to find a minimum set of conditions on x and h that will result in perfect and complete reconstruction. In other words, we want to have

{\tilde {x}}(t)=x(t)

Taking the Fourier transform of both sides, and applying the multiplication/convolution property of the Fourier transform, we have

{\tilde {X}}(f)={\mathcal {F}}\{v(t)*h(t)\}=V(f)\cdot H(f)=\left[\sum _{k=-\infty }^{\infty }X(f-kf_{\mathrm {s} })\right]\cdot H(f)

Thus ${\tilde {X}}(f)\,$ is the product of the periodized Fourier transform X of x and a "windowing" function H. For the sampling theorem, one wants to obtain the identity of $X\,$ and ${\tilde {X}}\,$ . To simplify matters, one assumes that in the periodization sum at most one term is different from zero. That is, the replicated $X(f)\,$ shifted by integer multiples of $f_{\mathrm {s} }$ do not overlap.
This is true, among other possibilities,
- in the standard case where X is zero outside the interval $\left[-{\frac {1}{2}}f_{\mathrm {s} },{\frac {1}{2}}f_{\mathrm {s} }\right]$ or
- in the undersampling case where X is zero outside the union

\left[-{\frac {N+1}{2}}f_{\mathrm {s} },-{\frac {N}{2}}f_{\mathrm {s} }\right]\cup \left[{\frac {N}{2}}f_{\mathrm {s} },{\frac {N+1}{2}}f_{\mathrm {s} }\right]

for some $N\in \mathbb {N}$ .

The standard case is contained in the undersampling case for N=0.

Since the periodization of X is now identical to X, the "windowing" factor H has to be chosen as $H(f)=T$ wherever $X(f)\neq 0$ .
- In the standard case, the cardinal sine function $h(t)=\operatorname {sinc} _{f}\left({\frac {t}{T}}\right)$ satisfies this condition. Since then ${\tilde {X}}(f)=X(f)$ , we get the Nyquist–Shannon interpolation formula for the reconstruction of x(t):

x(t)={\tilde {x}}(t)=\sum _{n=-\infty }^{\infty }x(nT)\operatorname {sinc} _{f}\left({\frac {t}{T}}-n\right)

.

In the undersampling case a similar formula holds with

h(t)=(N+1)\operatorname {sinc} _{f}\left({\frac {(N+1)t}{T}}\right)-N\operatorname {sinc} _{f}\left({\frac {Nt}{T}}\right)

.

If the spectrum X(f) of x(t) has its support inside some interval $[-f_{\mathrm {H} },f_{\mathrm {H} }]$ , and if $f_{\mathrm {s} }$ is not sufficiently large to satisfy $f_{\mathrm {s} }\geq 2f_{\mathrm {H} }$ , then the terms of the periodizing summation will overlap and aliasing will be introduced.

simplified version (from 18-Jan to 30-Apr)

The Fourier transform of the discrete-time sequence is called the discrete-time Fourier transform (DTFT), which is periodic, with period $f_{s}\,$ . As shown in the DTFT article, it can be constructed by placing copies (aka aliases) of $X(f)\,$ at intervals of $f_{s}\,$ and summing them all together:

X_{dtft}(f)={1 \over T}\sum _{k=-\infty }^{\infty }X(f-kf_{\mathrm {s} })

Clearly, if the original and the copies are bandlimited, and $f_{s}\,$ is sufficiently large to prevent overlap, the original signal can be recovered by simply filtering out the aliases. That is the essence of the sampling theorem. There are also situations where overlap is allowed to occur, and due to the particular shape of $X(f)\,$ the aliases coincide with null regions of $X(f)\,$ . Then neither the alias nor the original spectrum is irreparably affected. See Undersampling.

$X(f)\,$ is proportional to the k=0 term of $X_{dtft}(f)\,$ . So when $f_{s}>2f_{H}\,$ as previously discussed, we may conclude:

X(f)=X_{dtft}(f)\cdot T\cdot \operatorname {rect} \left({f \over f_{s}}\right)\quad =X_{dtft}(f)\cdot T\cdot \operatorname {rect} \left(Tf\right)\,

where

T\cdot \operatorname {rect} \left(Tf\right)\,

is a rectangle function, whose inverse transform is

\operatorname {sinc} \left(\pi t/T\right)\equiv {\frac {\sin(\pi t/T)}{\pi t/T}}\,

And that essentially proves the theorem, because $x(t)\,$ can be recovered from $X(f)\,$ . E.g., the inverse transform of the product is the convolution of the inverse transforms:

$x(t)\,$	$=\left(\sum _{n=-\infty }^{\infty }x(nT)\,\delta (t-nT)\right)*\operatorname {sinc} \left(\pi t/T\right)\,$
	$=\sum _{n=-\infty }^{\infty }x(nT)\cdot \operatorname {sinc} \left[\pi (t-nT)/T\right]$

which is known as the Nyquist–Shannon interpolation formula.
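The interpolation formula can be tried numerically. The following is only a sketch under stated assumptions: the test signal sinc²(0.4t), the truncation to 4001 samples, and the helper name `sinc_interpolate` are all illustrative choices, and `np.sinc(u)` is NumPy's normalized sinc, sin(πu)/(πu), which corresponds to sinc(πt/T) in the notation used here.

```python
import numpy as np

def sinc_interpolate(samples, n, T, t):
    """Truncated interpolation series: sum_n x(nT) * sinc((t - nT)/T)."""
    # Rows index the evaluation times t, columns index the sample number n.
    kernel = np.sinc((t[:, None] - n[None, :] * T) / T)
    return kernel @ samples

T = 1.0

def x(t):
    # Assumed test signal: bandlimited to |f| <= 0.4, which is below 1/(2T) = 0.5
    return np.sinc(0.4 * t) ** 2

n = np.arange(-2000, 2001)          # finite stand-in for the infinite sum
t = np.linspace(-5.0, 5.0, 101)     # evaluation times, mostly off the sample grid
x_hat = sinc_interpolate(x(n * T), n, T, t)
max_err = np.max(np.abs(x_hat - x(t)))
print(max_err)
```

The test signal decays like 1/t², so the truncated series converges quickly; a pure sinusoid would converge far more slowly.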

further simplified proof

Using the Fourier-transform in the normalization $X(f)={\mathcal {F}}(x)=\int _{\mathbb {R} }x(t)e^{-i(2\pi f)t}\,dt$ , the Poisson summation formula (PSF) can be stated for any $T>0$ as

T\sum _{n\in \mathbb {Z} }x(nT)e^{-i(2\pi f)nT}=\sum _{k\in \mathbb {Z} }X(f-k/T)

.

Further, the pi-normalized cardinal sine function is the Fourier-transform of the box-function of height 1 and support [-1/2,1/2].

After multiplication of the PSF with $e^{i(2\pi f)t}$ and integration for f over the interval $\left[-{\frac {1}{2T}},{\frac {1}{2T}}\right]$ one gets on the left side

T\sum _{n\in \mathbb {Z} }x(nT)\int _{-1/(2T)}^{1/(2T)}e^{i(2\pi f)(t-nT)}\,df=\sum _{n\in \mathbb {Z} }x(nT)\operatorname {sinc} (t/T-n)

.

By partitioning the inverse Fourier-transform one obtains

x(t)=\int _{\mathbb {R} }X(f)e^{i(2\pi f)t}\,df=\sum _{k\in \mathbb {Z} }\int _{-1/(2T)}^{1/(2T)}X(f-k/T)e^{i2\pi (f-k/T)t}\,df

.

Thus the right side can be transformed into

\sum _{k\in \mathbb {Z} }\int _{-1/(2T)}^{1/(2T)}X(f-k/T)e^{i(2\pi f)t}\,df=x(t)+\sum _{k\in \mathbb {Z} ,\,k\neq 0}\int _{-1/(2T)}^{1/(2T)}X(f-k/T)e^{i(2\pi f)t}\left(1-e^{-i(2\pi k/T)t}\right)\,df

.

If the Fourier-transform X(f) of x(t) is zero outside the interval [-1/(2T),1/(2T)], the identity of x(t) to its cardinal interpolation series or Nyquist–Shannon interpolation formula follows. If the support of X(f) is not contained inside this interval, then the application of the cardinal interpolation formula results in non-vanishing alias terms.

--LutzL 11:42, 5 May 2006 (UTC)
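The Poisson summation formula used in this proof can be checked numerically. The sketch below assumes a Gaussian x(t) = exp(−πt²), whose transform is X(f) = exp(−πf²); the Gaussian, the value of T, and the truncation limits are illustrative choices. Because the Gaussian is even, the check does not depend on the sign convention of the transform.

```python
import numpy as np

T = 0.7                              # assumed sampling interval (illustrative)
f = np.linspace(-2.0, 2.0, 41)
n = np.arange(-50, 51)               # truncation of the sum over samples
k = np.arange(-50, 51)               # truncation of the sum over spectral copies

# Left side: T * sum_n x(nT) * exp(-i 2 pi f n T)
lhs = T * np.sum(
    np.exp(-np.pi * (n[None, :] * T) ** 2)
    * np.exp(-2j * np.pi * f[:, None] * n[None, :] * T),
    axis=1,
)
# Right side: sum_k X(f - k/T)
rhs = np.sum(np.exp(-np.pi * (f[:, None] - k[None, :] / T) ** 2), axis=1)

psf_gap = np.max(np.abs(lhs - rhs))
# The Gaussian is not bandlimited, so the k != 0 copies do not vanish
# inside [-1/(2T), 1/(2T)]: these are the alias terms.
alias_at_zero = rhs[f.size // 2].real - 1.0
print(psf_gap, alias_at_zero)
```

Both sides agree to rounding error, while `alias_at_zero` is small but nonzero, matching the remark about non-vanishing alias terms when the support of X(f) is not contained in the interval.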

Short proof as proposed by BobK

Using the Fourier-transform in the normalization $X(f)={\mathcal {F}}(x)=\int _{\mathbb {R} }x(t)e^{-i(2\pi f)t}\,dt$ , the Poisson summation formula (PSF) can be stated for any bandlimited function x(t) and any $T>0$ as

T\sum _{n\in \mathbb {Z} }x(nT)e^{-i(2\pi f)nT}=\sum _{k\in \mathbb {Z} }X(f-k/T)

.

The left hand side is a Fourier series, the right hand side is the periodisation of the Fourier-transform X(f) of x(t). Both sides are well-defined for bandlimited functions. Now suppose that the highest frequency $f_{H}$ of x(t) is $f_{N}={\frac {1}{2T}}$ or smaller, that is, the support of X(f) is bounded to the interval $[-f_{N},\,f_{N}]$ . After multiplication of both sides with the rectangular function $\operatorname {rect} (Tf)=\operatorname {rect} (f/(2f_{N}))$ , one obtains

T\sum _{n\in \mathbb {Z} }x(nT)e^{-i(\pi f/f_{N})n}\cdot \operatorname {rect} (f/(2f_{N}))={\begin{cases}{\frac {1}{2}}(X(-f_{N})+X(f_{N}))&{\mbox{for }}f=\pm f_{N},\\\\X(f)&{\mbox{for }}-f_{N}<f<f_{N},\\\\0=X(f)&{\mbox{for }}|f|>f_{N}.\end{cases}}

Therefore, the right hand side coincides with X(f), except possibly at the boundary points of the interval $[-f_{N},\,f_{N}]$ . For the integration of a function, values at a finite number of points (or at a set of measure zero) can be changed without influencing the integral. For the purpose of the inverse Fourier-transform this means that both sides represent X(f). Applying the inverse Fourier transform, and using the rules of the Fourier transform, one thus arrives at the Nyquist–Shannon interpolation formula.

--LutzL 10:26, 9 May 2006 (UTC)

discussion

I had to revert the restriction of the generality by Rbj.

okay fine. i'll post it as an alternative

Please note that $f_{H}=f_{N}$ is mathematically possible via this proof.

no. it's in error. your proof requires that there be no sinusoids at Nyquist (when there is no problem with sinusoids at any other frequency below Nyquist). f_H is the highest possible frequency component in x(t). no sinusoidal component of x(t) may be as large or larger than 1/(2T) or aliasing potentially will occur.

And please tell me any textbook that makes this restriction $f_{H}<f_{N}$ .

$\Omega _{s}-\Omega _{N}>\Omega _{N}$ , or $\Omega _{s}>2\Omega _{N}$ ,

then there is Shanmugam Digital and Analog Communication Systems (c) 1979; Carlson Communication Systems, 3rd edition, (c) 1986 and earlier; Haykin, Communication Systems, (c) 1986. sorry about all of the old dates (i'm 5 decades old and one of those "mathematically challenged profs" or former prof), but since this is a mathematical truism, any more current text that claims that you can have any non-zero frequency components at Nyquist is simply wrong.

The argument against it in the article is somewhat beating a self-constructed strawman. The WKS-sampling theorem only ensures the perfect reconstruction of bandlimited $L^{2}(\mathbb {R} )$ -functions, the function $\sin(2\pi f_{N}t)=\sin(\pi t/T)$ does not belong to that space. See my comments high above on this talk page on this topic.--LutzL 06:59, 10 May 2006 (UTC)

that's also in error. the sampling theorem has no problem with sinusoids (that result in dirac impulses in the frequency domain and certainly violate L²_R) as long as their frequency is strictly below Nyquist. you also have a problem simultaneously implying time-limitedness (which is often, but not always how people get finite integrals of the time-domain signal) and frequency band-limitedness. what is necessary is frequency bandlimitedness (for sampling in the time domain). r b-j 18:16, 10 May 2006 (UTC)

OK, I'll look the books up, if they are available to me. However, in my experience, the math in books written by engineers for engineers is often, well, questionable. There is a difference in explaining the formal handling of e.g. the Fourier transform and providing the fundamentals of Fourier analysis. The former, practically oriented part is often acceptable to well done, the latter part is missing or contains critical omissions and factual errors. Needless to say, the WKS-sampling theorem belongs to the fundamentals part. The general sampling theory is a different story. But since it deals with approximations, never with perfect reconstructions, there is no critical frequency. I.e., applying a practical sampling method at the Nyquist frequency will always have grave errors.
You still have to point out where the argument I'm using goes wrong. This is math, not Dogma, so everything is provable. And no, it is not possible to sample sinusoids in the way this sampling theorem does. The WKS-sampling theorem only works for "finite energy" functions, that is measurable functions that are square integrable. At least the reconstruction part fails to work for tempered distributions like a sine function (which has only as a tempered distribution a Fourier transform). Thus we are back to the topic of Fourier series expansions for tempered distributions. But I don't want to discuss distributions with You, since You still have problems dealing with measurable functions and Lp-spaces.
And no, I'm fully aware that one has to use the full real axis to reconstruct a sampled measurable function.
Please consult Higgins: "Five short stories on the sampling theorem" and Benedetto/Zimmermann: "Sampling operators and the Poisson summation formula". The latter is available on the net, exact references for both are in Unser: "Sampling--50 years after Shannon", also available on the net. Those I consider state-of-the-art treatments of the sampling theorem/theory.--LutzL 08:38, 11 May 2006 (UTC)

And by the way, You too confuse a sine wave with a frequency component. A frequency component has a Fourier transform with an interval around the given frequency inside its support. So a frequency component at some frequency f has a highest frequency that is larger than f. To say that there are no frequency components above f is equivalent to saying that the highest occurring frequency is f. (In fact, that there is a modification of the Fourier-transform of the signal on a set of measure zero such that the support of the modified function is bounded by f. Modify X(f) to zero if You are more comfortable with it).--LutzL 11:43, 11 May 2006 (UTC)

Well, I went to our library and had a look at Oppenheim/Schafer, the second German edition from 1995 and the Prentice Hall reprint of the first 1975 edition. Also at some books nearby. First, You are right that O/S is among the best of those books. Second, there is no proof of any kind, only a justification which omits all mathematical subtleties. Third: Please check, the equality is nowhere discussed. It is only stated that $f_{H}<f_{N}$ is sufficient and that $f_{H}>f_{N}$ leads to aliasing, which is of course correct. In the text but not in the formulas it is further stated that one needs "at least" the Nyquist frequency, for me this is a hint that O/S believe that equality is also admissible. As to the other books: By now You know that I close a book if it starts about the Fourier series of a Dirac pulse (or worse: Dirac function). There was also at least one book that correctly stated and used that the Fourier transform of a Dirac comb (as a tempered distribution) is again a Dirac comb. But it failed to mention the Schwartz space of the corresponding test functions. Infinite series and integrals were commuted without justification, from the equality of the Fourier coefficients of two functions it was concluded that the functions must be equal, without mentioning possible differences on a set of measure zero, and so on. You see, this is not mathematics, this is justification with hand waving at best and voodoo in the general case. This is bad, from a certain angle. Since students tend to forget the cautious formulation in the text (if present at all) and only remember the bold and simple statements/formulas, we end up with neverending confusion.--LutzL 12:54, 11 May 2006 (UTC)

LutzL, does your O&S have a chapter called (or translated) Sampling of Continuous-Time Signals (it's chapter 3 in my book) and a section named Frequency-Domain Representation of Sampling (section 3.2 in my book)? it's there.

as indicated before, i am well aware of mathematicians objections to the usage of the dirac impulse function as done by engineers and engineering texts. i am, for the most part, unimpressed with the objections. some of it is semantics. i don't care whether or not you allow me to call a dirac delta a "function" or not. i realize that it is not a regular, normal function, but if you do not allow us to define the dirac delta to be a function of virtually zero width and integral of 1, i say "what's the point?" we want to use these concepts and functions to get something done and we know that, in the limit, the width of the nascent dirac delta (the engineering kind of definition) never quite makes it all the way to zero. as i said before, a nascent delta function with width of 1 Planck time is a legitimate function for your purposes, and, for any physical or engineering purpose is indistinguishable in effect from the dirac delta.

if you consider that the Fourier transform of the Dirac comb is a legitimate expression (another Dirac comb), that is functionally equivalent to my "bloated proof" where i use the Fourier series equivalent of the dirac comb and use the shifting property of the F.T.

lastly, there is no need for the function being sampled to be L²(R). we have the Fourier transform of a few "functions" that aren't L²(R). it needs to be sufficiently bandlimited so that no frequency component (and i am using the terminology correctly) of one image occupies the same frequencies of any component of another image (or the baseband). if that happens, you do not know what the original frequency and phase was. even in the case of a frequency component right at Nyquist, there is aliasing:

x(t)={\frac {A}{\cos(\theta )}}\cos(2\pi {\frac {1}{2T}}t+\theta )

if you sample that at times that are multiples of T, you will get x(nT) = +A, -A, +A, -A, +A, -A, +A, -A, +A, -A no matter what θ is. but the amplitude is not A unless θ is a multiple of π. you cannot determine both the amplitude and phase of the original sinusoid. that information is lost forever and you cannot reconstruct x(t) from the samples x(nT). this is why, any decent textbook on the subject is careful to state that the sampling frequency must exceed twice the highest frequency. equality is not good enough. f_H is a property of the signal and 1/(2T) is a property of the sampler. they are not the same thing.

r b-j 17:01, 11 May 2006 (UTC)
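The sinusoid-at-Nyquist example above is easy to check numerically. This is a minimal sketch; the amplitude A = 1.5, T = 1, and the list of phases are assumed values for illustration only.

```python
import numpy as np

A, T = 1.5, 1.0          # assumed amplitude and sampling interval
n = np.arange(10)

def nyquist_samples(theta):
    # x(t) = A/cos(theta) * cos(2*pi*t/(2T) + theta), evaluated at t = n*T
    return A / np.cos(theta) * np.cos(np.pi * n + theta)

expected = A * (-1.0) ** n           # +A, -A, +A, ...
thetas = [0.0, 0.3, 1.0, -1.2]       # phases away from odd multiples of pi/2

# Every phase gives the same samples, although the amplitudes differ.
all_same = all(np.allclose(nyquist_samples(th), expected) for th in thetas)
amplitudes = [A / np.cos(th) for th in thetas]
print(all_same, amplitudes)
```

The samples coincide because cos(πn + θ) = (−1)ⁿ cos(θ), so the factor 1/cos(θ) cancels and neither the amplitude nor the phase can be read back from x(nT).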

To reiterate things I already said: Of course You can sample a sinusoid. You just can't reconstruct it from the samples. But this is also not claimed by the sampling theorem. You can also sample a sinusoid at the double Nyquist-frequency. Those values are equally well open to interpretation and don't characterize the sinusoid uniquely. You can perhaps make a lucky guess if You know that it is a single sinusoid. But also in this case, the reconstruction formula diverges almost everywhere.
To make things worse, the (corrected) proof as You understand it, that is with equality of functions in the pointwise sense, only works for functions x(t) in $L^{2}(\mathbb {R} )$ that are bandlimited and thus continuous and have additionally a continuous and piecewise differentiable Fourier transform. For this it is necessary that $(1+|t|)|x(t)|$ is a bounded function, $(1+t^{2})(|x(t)|+|x''(t)|)$ bounded is largely sufficient. Of course no nonzero sinusoid satisfies those conditions, since periodic.
Next reiteration: That the Dirac pulse and the Dirac comb have a Fourier-transform as tempered distributions is a triviality resp. a reformulation of the PSF for tempered test functions. The use is a question of style and time, since You would have to define the Schwartz space and its topology and to show that those linear functions on the Schwartz space are bounded. It is entirely a different beast to say that the Dirac comb equals (pointwise I expect) a highly oscillating series. Let's have a look at the partial sums:

\sum _{k=-M}^{N}e^{i(2\pi f)k}={\frac {e^{i(2\pi f)(N+1)}-e^{-i(2\pi f)M}}{e^{i(2\pi f)}-1}}=e^{i(\pi f)(N-M)}\cdot {\frac {\sin(\pi (N+M+1)f)}{\sin(\pi f)}}

And now please tell me in which sense this converges as M and N go (independently) to infinity!
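The closed form for the partial sums can be checked numerically, and the same sketch shows that at a fixed f the partial sums keep oscillating instead of settling. The frequency f = 0.3 and the (M, N) pairs are assumed values for illustration.

```python
import numpy as np

def partial_sum(f, M, N):
    # Direct evaluation of sum_{k=-M}^{N} exp(i 2 pi f k)
    k = np.arange(-M, N + 1)
    return np.sum(np.exp(2j * np.pi * f * k))

def closed_form(f, M, N):
    # exp(i pi f (N - M)) * sin(pi (N + M + 1) f) / sin(pi f), for non-integer f
    return (np.exp(1j * np.pi * f * (N - M))
            * np.sin(np.pi * (N + M + 1) * f) / np.sin(np.pi * f))

f = 0.3
pairs = [(3, 5), (10, 10), (40, 7), (100, 250)]
max_gap = max(abs(partial_sum(f, M, N) - closed_form(f, M, N)) for M, N in pairs)

# Symmetric partial sums (M = N) at fixed f: bounded, but they do not converge.
symmetric = [partial_sum(f, N, N).real for N in range(1, 9)]
print(max_gap, symmetric)
```

For M = N the values are sin(π(2N+1)f)/sin(πf), which stay bounded by 1/|sin(πf)| and oscillate as N grows, so there is no pointwise limit at this f.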

A hint: Don't use only the box pulse as approximation of the Dirac pulse. Especially if You want to discuss Fourier series. At least, You should know the Gibbs oscillations at jumps. And the Fourier coefficients are not nice. There are two parametrized sequences of Fourier coefficients that have easily computable Fourier series and that are also very convincing approximations to the Dirac pulse. Look for "Poisson kernel" and "Fejer kernel" or read the best available net source for it: Carl Offner: A little harmonic analysis (Postscript).
But the point is: You first have to apply the approximation of the distribution to a test function. From this value, which depends on the approximation parameter, You then take the limit in the approximation parameter. This is by the topology of the Schwartz space.
Semantics: Any distribution is also a function. But the point space in question is the space of test functions. But that is not the point of confusion. The point of confusion is this: In engineering they exchange limit processes too carelessly. It is not always possible to exchange infinite sums with integrals or integrals with limits. Especially when dealing with distributions, because most of the exchange theorems for limit processes fail systematically for them.

--LutzL 07:41, 12 May 2006 (UTC)

i am not going to argue this endlessly by use of proof by verbosity. i will say this, the sampling theorem does not require L²(R) unless you are also going to require that for the continuous Fourier Transform. but we engineers apply (in an indirect manner by indirect use of dirac delta functions) the F.T. to DC and sinusoids, none of which are L²(R). we recognize that

{\mathcal {F}}\{\delta (t-\tau )\}=e^{-i2\pi \tau f}

and by use of the duality property

{\mathcal {F}}\{e^{i2\pi f_{0}t}\}=\delta (f-f_{0})

similarly, for a real sinusoid

{\mathcal {F}}\{{\frac {A}{\cos(\theta )}}\cos(2\pi f_{0}t+\theta )\}={\frac {A}{2\cos(\theta )}}\left(e^{-i\theta }\delta (f+f_{0})+e^{+i\theta }\delta (f-f_{0})\right)

no matter how you (legitimately) describe the sampling theorem, that sinusoid can be completely reconstructed from the samples

{\frac {A}{\cos(\theta )}}\cos(2\pi f_{0}nT+\theta )\

if and only if |f₀| is strictly less than 1/(2T). if |f₀| is equal to 1/(2T), it cannot be reconstructed from the samples (which is the point of contention) because the samples are indistinguishable from another set with a different θ. i think we both agree that if |f₀| is greater than 1/(2T), it cannot be reconstructed. but i have shown clearly a counter example to two claims of yours: 1. sinusoids can be sampled and reconstructed from the samples if their frequency is strictly less than Nyquist. 2. a sinusoid with frequency of exactly Nyquist cannot have its phase and amplitude both uniquely determined from the samples (that information is lost) and thus cannot be reconstructed from the samples.

just to be clear about strawmen, red herrings, or other distractions i am not conceding either point and i am not addressing any other point. i don't have the time. r b-j 15:47, 12 May 2006 (UTC)

Hi, I'm very confident that every calculation You do for the practical part of Your job will be correct. Unconsciously, You are doing the right thing, and You describe it very clearly. You use an approximation of the Dirac delta, and thus place Yourself inside $L^{2}(\mathbb {R} )$ . There You can do anything you have to do, Fourier transforms, Fourier series, sampling formulas, etc. and after You got a result from this approximate computation, You take the limit in the approximation parameter. Since this amounts in most cases to just setting this parameter to zero, You even need not be aware of performing a limit computation.

But pity for Your students, at least for those that understand the theoretical part of the Fourier transform and its connection to distributions. Because to be correct in front of You, they will have to tell things that are wrong to them.

Also, You seem to have problems in evaluating an argument. Because You insist on things that are not disputed and fail to address things where You risk to admit being wrong. As for the Fourier series of the Dirac comb and the question of the kind of convergence of this oscillating series.

Please notice that a small rectangular spike around some frequency f₀ and with width e is in $L^{2}(\mathbb {R} )$ , even if it is of Planck width. But its inverse Fourier transform $\sin(2\pi f_{0}t)\mathrm {sinc} (et)$ (edit: in fact the inverse transform of ${\frac {1}{i2e}}(\mathrm {rect} ((f-f_{0})/e)-\mathrm {rect} ((f+f_{0})/e))$ ) has a maximum frequency f₀ +e slightly greater than f₀. To sample it, You will need at least the double of this maximum frequency which is greater than 2f₀, which is what You insist on. For any $f_{s}>2f_{0}$ and $0<e<f_{s}-2f_{0}$ , the interpolation formula

\sin(2\pi f_{0}t)\mathrm {sinc} (et)=\ \sum _{k\in \mathbb {Z} }\sin \left(2\pi {\frac {f_{0}k}{f_{s}}}\right)\mathrm {sinc} \left({\frac {ek}{f_{s}}}\right)\mathrm {sinc} (f_{s}t-k)

holds. The convergence is very slow, so any finite part of that series will, for e small enough, be terribly wrong. But for e=0 it fails in the infinite series too. There is no problem in taking the values of the sine function, but no way to apply the reconstruction formula to them. (By the way, this is another topic where You didn't provide an answer. How are You going to reconstruct the sine function.)
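The interpolation formula above, and its slow convergence, can be looked at numerically. This is a sketch under assumed values f₀ = 1, f_s = 3, e = 0.5 (so that f_s > 2f₀ and 0 < e < f_s − 2f₀); the truncation limits K are also illustrative. `np.sinc` is the pi-normalized cardinal sine.

```python
import numpy as np

f0, fs, e = 1.0, 3.0, 0.5     # assumed values with fs > 2*f0 and 0 < e < fs - 2*f0

def x(t):
    return np.sin(2 * np.pi * f0 * t) * np.sinc(e * t)

def truncated_series(t, K):
    # sum_{k=-K}^{K} x(k/fs) * sinc(fs*t - k)
    k = np.arange(-K, K + 1)
    return np.sinc(fs * t[:, None] - k[None, :]) @ x(k / fs)

t = np.linspace(0.0, 2.0, 41)
errors = {K: np.max(np.abs(truncated_series(t, K) - x(t))) for K in (50, 500, 20000)}
print(errors)
```

The terms decay only like 1/k², with a constant proportional to 1/e, so the number of terms needed for a given accuracy grows as e shrinks, which is the behaviour described above.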

One last remark to distributions. Since You are happy to call any function bandlimited that has a compactly supported Fourier transform in the sense of tempered distributions, You should also have no problem calling a polynomial a bandlimited function. Its Fourier transform is a linear combination of derivatives of the Dirac distribution, so its support is the point 0.

--LutzL 09:46, 16 May 2006 (UTC)--some further notational and error corrections as demanded below--LutzL 16:57, 30 May 2006 (UTC)

what's with the capital "Y"? will you stop being silly?

did you mean to say $\sin(2\pi ft)\mathrm {sinc} (\pi et)$ and

\sin(2\pi ft)\mathrm {sinc} (\pi et)=\sum _{k=-\infty }^{+\infty }\sin(2\pi fkT)\mathrm {sinc} (\pi ekT)\mathrm {sinc} \left(\pi {\frac {t-kT}{T}}\right)

?

why not use $f_{0}$ instead of $f$ so to keep consistent with convention and so we might leave the latter for the fourier integral or argument of a F.T. function of f if needed?

also $f_{H}$ is a property of the signal being sampled, not the Nyquist frequency which is $f_{s}/2={\frac {1}{2T}}$ . we need to keep our notation correct and consistent with the topic of discussion.

if the rectangular spikes are centered at ±f, it's $\sin(2\pi ft)\mathrm {sinc} (\pi et)$ not $\sin(\pi ft)\mathrm {sinc} (\pi et)$ . this also means that the width of the spike is e, so i think you mean to say $f+e/2=1/(2T)$ . no? if you're trying to make a point by saying $f_{H}=f_{s}/2={\frac {1}{2T}}$ , that will work since there is no dirac delta precisely on frequency $f_{s}/2$ (or above) so it doesn't matter much. my point is that $\delta (f-f_{0})$ is fine for any $|f_{0}|<f_{s}/2$ (even though the inverse F.T. is not $L^{2}(\mathbb {R} )$ ) but is not okay for $|f_{0}|=f_{s}/2$ and the simple proof was made above (all i need is a counter example).

i know there are pathological functions that do not work with the sampling theorem. another is what would get reconstructed from this infinite sequence: ... -1, +1, -1, +1, -1, +1, -1, +1, -1, +1, +1, -1, +1, -1, +1, -1, +1, -1, +1 ... it doesn't converge either. the inverse F.T. of the doublet (or higher "derivatives" of the dirac delta) are among those pathological functions. it is still not the issue. i'm not the one clouding the subject. i'm also not the one appealing to pathological functions to try to make a point. i am saying that DC and sinusoids can be sampled and reconstructed in the sampling theorem (and we both know that they are not $L^{2}(\mathbb {R} )$ ) as long as their frequency is strictly less than Nyquist. and i have shown why, if the frequency is exactly Nyquist, that there is information lost (aliased) that prevents reconstruction. again, those are the only two points i am making and i refuse to be distracted from it.

a long while ago, on the pertinent USENET newsgroup comp.dsp i had a drawn out debate [7] [8] [9] with a person named "Jay Rabkin" about the nature of the sampling and reconstruction and of the dirac impulse and he kept appealing to the "distribution" definition of the dirac delta as grounds to make his point which was hard to know what point he was trying to make other than that i could not sensibly claim that the dimension of quantity of the dirac delta function, such as it is, is the reciprocal dimension of the argument. i fear that this sorta appeal to esoterism to obscure the simpler reality is what you're doing. you "pity [My] students" (i'm not teaching at the moment), for fear that some esoteric nature of the dirac distribution will mess them up with the sampling theorem. this is telling. you have absolutely no concept of pedagogy in an electrical engineering curriculum. we have trouble getting these kids to care about region of convergence of the Laplace transform or Z transform. they rarely will take a formal Real Analysis course in the math department (you know, the "given an epsilon, find a delta" so that some thing is proved) or will ever learn of Lebesgue integration or the difference between countably infinite and uncountably infinite. you try to describe the Dirac delta in terms of "distribution" (instead of as a thin limit of the nascent delta functions with unit area) and their eyes will glaze over immediately and you will end up actually conveying no knowledge to them. this is why the engineering texts (in Linear System Theory) do it the way they do. even if it makes the mathematicians blanch.

so please make your point concisely and use the conventions for notation common to the subject (as in the article). again, i spent far more time on this silly argument than i had expected to. r b-j 05:21, 17 May 2006 (UTC)

OK, my last comment should be better for your conventions now. So I have fortunately not to take pity on your actual students, but only past event pity on Jay Rabkin. The point both sides were missing is, if t has a dimension, this should be factored out in the argument of the "dirac function". More precisely, you should always define what you mean by a specific dirac process. Something like

\int _{\mathbb {R} }f(t)\delta (t/T-k)\,dt/T=f(kT)

would be appropriate. -- If You can't teach your students what a continuous function is (e.g., something for which a constant function is a good approximation) then they won't understand the concept of a nascent dirac pulse. And you don't even need it. Sampling gets You a sequence of pairs of time and value, reconstruction restores a function, optimization theory tells you how to best combine both. -- BTW., it was not Dirac that came up with formalized distributions but Laurent Schwartz. And he wrote extensive textbooks about it that are still readable today. Not that I have any hope that an old silly as you will read them. -- I don't expect you to do fundamental research or teach your students more than they need for their practical calculations. But I miss your professional interest in the fundamentals of the things you teach. The doubt on the common sense and fishy formulations in textbooks. A real analysis course in the student days of yours is not enough for that. -- PS: You still need to indicate how you wish to reconstruct a sum of sine functions from their samples, not approximately but in a very theoretical sense with infinite precision.--LutzL 16:57, 30 May 2006 (UTC)

i hadn't seen this addition when it was made (possibly because i was out due to holiday and many other edits to this page were made since). there are still other format mistakes that i will try to fix.

Proof using PSF

Defining

X(f)\equiv {\mathcal {F}}\{x(t)\}=\int _{-\infty }^{+\infty }x(t)e^{-i2\pi ft}\,dt

.

The Poisson summation formula (PSF) says

\sum _{n=-\infty }^{+\infty }X(nf_{0})e^{i2\pi nf_{0}t}={\frac {1}{f_{0}}}\sum _{k=-\infty }^{+\infty }x\left(t-{\frac {k}{f_{0}}}\right)

and using the duality property of the Fourier transform, the PSF can be restated as

T\sum _{n=-\infty }^{+\infty }x(nT)e^{-i2\pi nTf}=\sum _{k=-\infty }^{+\infty }X\left(f-{\frac {k}{T}}\right)

.

Given the condition that the highest frequency of x(t) is $B<{\frac {1}{2T}}$ , that is, the support of X(f) is bounded to the interval $[-B,\,B]$ , and multiplying both sides by the rectangular function $\operatorname {rect} (Tf)$ , one obtains

T\sum _{n=-\infty }^{+\infty }x(nT)e^{-i2\pi nTf}\cdot \operatorname {rect} (Tf)=\sum _{k=-\infty }^{+\infty }X\left(f-{\frac {k}{T}}\right)\cdot \operatorname {rect} (Tf)=X\left(f-{\frac {0}{T}}\right)

because only the term for k=0 survives the rectangular windowing operation. This means

T\sum _{n=-\infty }^{+\infty }x(nT)e^{-i2\pi nTf}\cdot \operatorname {rect} (Tf)=X(f)\quad \forall f\

and that, given the bandlimit and sampling condition above $B<{\frac {1}{2T}}$ , the samples of x(t), or x(nT), are sufficient to describe fully the spectrum of x(t). knowing that the continuous Fourier transform is a fully invertible operator, the samples of x(t), or x(nT), are therefore sufficient to fully describe x(t).

Reconstructing x(t) requires the inverse Fourier Transform:

x(t)={\mathcal {F}}^{-1}\{X(f)\}=\int _{-\infty }^{+\infty }X(f)e^{+i2\pi ft}\,df

.

x(t)={\mathcal {F}}^{-1}\left\{\sum _{n=-\infty }^{+\infty }x(nT)e^{-i2\pi nTf}\cdot T\operatorname {rect} (Tf)\right\}

=\sum _{n=-\infty }^{+\infty }x(nT)\cdot {\mathcal {F}}^{-1}\{T\operatorname {rect} (Tf)e^{-i2\pi nTf}\}

=\sum _{n=-\infty }^{+\infty }x(nT)\cdot \operatorname {sinc} \left(\pi {\frac {t-nT}{T}}\right)

.

This is the Nyquist–Shannon interpolation formula.

this is how to use the PSF to (quickly) prove the sampling and reconstruction theorem. but no matter how you do it, the sampling frequency must be strictly greater than twice the highest frequency of the baseband signal being sampled. that is fundamental. r b-j 18:43, 10 May 2006 (UTC)
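The key step of the proof, that the windowed Fourier series of the samples equals X(f) inside the band, can be checked numerically for a known transform pair. The sketch below assumes x(t) = sinc²(Bt) with the pi-normalized sinc, whose transform is the triangle X(f) = (1/B)·max(0, 1 − |f|/B); the pair, the values T = 1 and B = 0.4, and the truncation of the sum are illustrative choices.

```python
import numpy as np

T, B = 1.0, 0.4               # assumed values with B < 1/(2T)

def x(t):
    return np.sinc(B * t) ** 2

def X(f):
    # Transform of sinc^2(B t): a triangle of height 1/B supported on [-B, B]
    return np.maximum(0.0, 1.0 - np.abs(f) / B) / B

n = np.arange(-10000, 10001)                 # finite stand-in for the infinite sum
f = np.linspace(-0.49, 0.49, 49)             # strictly inside the rect(Tf) window
# T * sum_n x(nT) * exp(-i 2 pi n T f), evaluated where rect(Tf) = 1
series = T * (np.exp(-2j * np.pi * f[:, None] * n[None, :] * T) @ x(n * T))
spectrum_gap = np.max(np.abs(series - X(f)))
print(spectrum_gap)
```

The remaining gap comes from truncating the sum; the samples decay like 1/n², so it shrinks roughly in proportion to 1/N as more terms are included.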

BobK, can you demonstrate how nearly any of the edits made in the past week or two...


...either reduce bloat, or make the mathematical justification of the sampling theorem more straight-forward? the article is getting bigger and less concise. it is getting less clear. these edits, overall, are not helping. r b-j 04:58, 26 May 2006 (UTC)

An example of bloat is deriving the inverse transform of rect(), which can be looked up in a table. Each proof that I did is concise. It is not bloat when there are several interesting perspectives that arrive at the same place. Nobody is forced to read them all. And it avoids the hassle of trying to agree on just one of them. But if you would like me to pick just one, I can do that too. --Bob K 03:17, 30 May 2006 (UTC)

Since the Dirac comb is not actually what sampling is (not even close), I think it is blatantly not straightforward to use it as the definition of sampling. You can start with the concept of a periodic extension of $X(f)\,$ without claiming anything about its relationship to sampling. The Dirac comb then emerges in a more mathematical (i.e. "straightforward") way. --Bob K 17:30, 30 May 2006 (UTC)

i am not sure exactly what to do about this "mathematical basis ..." section. do we include Shannon's original proof? why not Whittaker's? what was Nyquist's mathematical treatment? that's historical stuff. the proof by use of PSF is sufficient mathematically (and brings no objections from the mathematicians from our bonehead engineering usage of the dirac delta or dirac comb), but might seem a little indirect. i think there needs to be a reasonably rigorous (from an engineer's POV) and straight forward proof that applies the concepts pretty much in the pedagogical chronological order:

1. we have a continuous-time signal.

2. we uniformly sample that signal. how do we represent that sampling process mathematically in a simple and effective manner? is there another means or a better means other than multiplying by a dirac comb?

Certainly. Shannon's proof for example. The actual samples are the finite-valued, discrete-time sequence, $x(nT)\,$ , not the infinite-valued $x(nT)\cdot \delta (t-nT)\,$ . --Bob K 14:17, 30 May 2006 (UTC)

3. using whatever means of sampling, we need to show that this sampling process actually samples the continuous-time signal - that it actually discards the information in the c.t. signal between the sampling instances.

Done. $x(nT)\,$ is just one [real or complex] number. Everything nearby is necessarily discarded. --Bob K 14:17, 30 May 2006 (UTC)

4. then we need to show that, somehow from that sampled signal representation or just the discrete sequence of samples, that somehow the original c.t. function can be reconstructed from only those samples and under what condition(s) that reconstruction is guaranteed to be accurate. now, i think the only way to do that from the dirac sampled function, is with a frequency-domain approach. that is to show that multiplying the original c.t. function by the dirac comb causes this repeating and overlapping of the spectrum and that, if there is no overlapping, that LPFing the result gets your original spectrum back.

The discrete-time sequence is not a Dirac comb. Unlike a comb, it is physically realizable. The comb is a questionable attempt to model the discrete-time sequence as a continuous-time function so that we can use the continuous-time transform, only because it is comfortable territory for most readers. As I showed, $x(nT)\cdot \delta (t-nT)\,$ can also be viewed as a mathematical result, rather than a [flawed] premise. But Shannon showed that the $x(nT)\,$ samples are also the coefficients of a Fourier series expansion that, under the stated conditions, fully represent $X(f)\,$ and therefore fully represent $x(t)\,$ . No delta functions are needed. --Bob K 14:17, 30 May 2006 (UTC)

that's my take on this issue. r b-j 06:51, 30 May 2006 (UTC)
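Step 4 above (getting the c.t. function back from the samples alone) can be checked numerically without any delta functions. The following is only a minimal sketch under stated assumptions: the test signal sinc²(t) (finite energy, bandlimited to |f| ≤ 1), the sample interval T = 0.25, the evaluation point and the truncation length are all illustrative choices, and the function names are hypothetical.

```python
import math

def sinc(u):
    """Normalized sinc: sin(pi*u)/(pi*u), with sinc(0) = 1."""
    return 1.0 if u == 0 else math.sin(math.pi * u) / (math.pi * u)

def x(t):
    # Assumed test signal: sinc^2 is finite-energy and bandlimited to |f| <= 1.
    return sinc(t) ** 2

def reconstruct(t, T, n_max):
    """Truncated interpolation sum over the samples x(nT), |n| <= n_max."""
    return sum(x(n * T) * sinc((t - n * T) / T)
               for n in range(-n_max, n_max + 1))

T = 0.25    # f_s = 4, above the critical rate of 2 for this signal
t0 = 0.1    # a point between sampling instants
err = abs(reconstruct(t0, T, 1000) - x(t0))
print(err)  # small; limited only by truncating the infinite sum
```

Only the numbers x(nT) enter the sum, which is the point Bob K makes above. The residual error here comes from cutting the sum off at |n| ≤ 1000, not from the sampling itself.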

Sounds good.

ad 1.) full qualification: continuous, finite-energy, continuous-time signal.

ad 2.) more emphasis on general shift invariance, not only in sampling but also in reconstructing. As for the sampling:

S_{T}:L^{2}(\mathbb {R} ,\mathbb {C} )\cap C(\mathbb {R} ,\mathbb {C} )\to \ell ^{2}(\mathbb {Z} ,\mathbb {C} )

defined by

S_{T}(x)=\{x(nT)\}

does the trick as well and requires no play with tempered distributions. On the other hand, no objection to using a Dirac comb, only the notation is a little misleading. (In mathematics, one usually writes δ_x for the Dirac distribution centered at x, δ_x(φ):=φ(x) for all test functions (continuous at least).)

ad 3.) unclear. The sampling theorem is there to show that no information is lost between the samples. Or the remark is too simple for me to understand.

ad 4.) resp. ad PSF): While not reasonably clear from the PSF article here (I corrected this in the German version), the PSF contains the statement that the Dirac comb is its own Fourier transform. But as a tempered distribution, the Dirac comb approach is only applicable to Schwartz test functions. As long as you are comfortable with the situation where the spectrum has compact support and has bounded derivatives of any order, you can freely compute with this formalism. Spectrum repetition is ok for Schwartz test functions since all series converge in a locally uniform manner. Problems arise if the spectrum is less than a Schwartz test function.

ad 4.) resp. ad overlap): for a continuous spectrum function, the values at the highest frequency are zero, so spectrum repetition is pointwise exact inside the frequency band even for the critical frequency. But this is theoretical. Sticking to the best eng. books, one can formulate that sampling at strictly higher than critical is ok, strictly lower than critical fails (visible overlap), and equality is for the experts. But I hope that you can sample the sinc-function at sampling frequency 1.

ad Nyquist: perhaps one should include a word or two about the reconstruction of periodic, frequency bounded functions. This amounts to some kind of polynomial interpolation.

--LutzL 08:36, 30 May 2006 (UTC)

Concise summary of sampling and the Fourier transform


Rbj, let's talk about the changes you made to this paragraph:

- That special case of the continuous-time Fourier transform is called discrete-time Fourier transform (DTFT), which is a periodic function. But it is also a continuous-frequency function, which means that a computer cannot evaluate it at every frequency (because it is a continuum).

Instead, you wrote:

- The continuous-time Fourier transform of $x_{s}(t)\,$ (or $X_{s}(f)\,$ ) is essentially the discrete-time Fourier transform (DTFT) of $x[n]\,$ , which is a periodic function. But, because $x[n]\,$ is infinitely long, it is also a continuous-frequency function, which means that a computer cannot evaluate it at every possible frequency (because it is a continuum).

The main problem is that the continuous-frequency nature of:

X_{s}(f)={\frac {1}{T}}\sum _{n=-\infty }^{\infty }X(f-nf_{s})\

has nothing to do with the duration of $x[n]\,$ . It only depends on the continuous-frequency nature of $X(f)\,$ , which depends on the periodicity (or lack thereof) of $x(t)\,$ .

perhaps it should be reworded to be if x[n] is completely general and infinitely long. that condition means that, in general, X_s(f) is continuous. if x[n] is finite in length, then there are issues about how to define or extend x[n] outside of its original length. in and of itself, a finite and discrete set of frequency-domain values (those returned by the DFT) suffices to completely describe x[n], but that essentially makes one of a couple of assumptions. one is that the DFT is sampling the DTFT, a continuous but repeating function, at equally spaced values and then the remainder of x[n] must be defined (usually zero). the other is that the DFT is only a mapping from one discrete vector to another discrete vector (with no mention of the DTFT or anything continuous). the latter invites debate about the periodic nature (or lack of, for those on the other side of the debate) inherent to the DFT but says nothing to the issue you bring here. the former says that, even if nearly all of the values of x[n] are zero, they are still defined for the countably infinite set.

I don't get the point of all this. My original statement references the DTFT article, and extracts a couple of concise, non-controversial points (periodicity and continuous-frequency). --Bob K 12:20, 20 June 2006 (UTC)

it really is the case that because x[n] is infinitely long (in definition) and assumed general means that the DTFT is continuous in the frequency domain. even if x[n] was periodic, the DTFT would be composed of dirac impulses with zero in between, so it's still a continuous-frequency function. because x[n] is discrete is the reason that the DTFT is periodic with period 2π.

I understand all of that. Let's not get drawn into a quibble over how to describe a Dirac comb. No doubt we agree that for mathematical purposes it is a "function of a continuous variable". And for equivalence purposes, it is like a function of a discrete variable. I.e., it can be "fully-represented" by a discrete sequence (no loss of information) and vice-versa. --Bob K 12:20, 20 June 2006 (UTC)

then does this mean that you agree with me that "... because $x[n]\,$ is infinitely long, [$X_{s}(f)\,$] is also a continuous-frequency function..."? r b-j 15:57, 20 June 2006 (UTC)

Let's make sure we don't have a semantics issue. $X_{s}(f)\,$ is clearly a function of a continuous variable (by definition). So I assume you are referring to whether that function is a Dirac comb or a "normal" function. Also, $x(t)\,$ and $x[n]\,$ are "infinitely long" (counting zero-valued samples), by definition. So I assume you are implying that the non-zero values extend to infinity. Between you and me, I suggest we refer to that concept as non-windowed, for lack of a better term. Anyhow, my answer is the same as before. "$x[n]\,$ is infinitely long" is neither a necessary nor sufficient condition to make $X_{s}(f)\,$ "continuous-frequency". Rather it is a necessary (but not sufficient) condition to make $X_{s}(f)\,$ a Dirac comb (the very antithesis of continuous frequency). --Bob K 17:23, 20 June 2006 (UTC)

I would also add that the word essentially is unnecessary (at best), perhaps misleading (at worst). The DTFT is exactly what the continuous-time Fourier transform of a modulated Dirac comb reduces to mathematically. It is a special case of the more general transform.

the word "essentially" was put in because X_s(f) is dependent on T and there is no T (or x(nT)) in the DTFT, only x[n] which is what is left when any sense of physical time t is removed from the problem. the output of the DTFT looks like X(e^iω) and is closely related to X_s(f), but is not the same function. when a digital filter is executing its algorithm, there is no t or T, only x[n], x[n-1], etc. likewise, with the DTFT there is only x[n] in the "time" domain and only X(e^iω) in the frequency domain. r b-j 02:10, 20 June 2006 (UTC)

But all we're really talking about is frequency normalization (by $f_{s}\,$) or not. Other than that superficial difference, $X_{s}(f)\,$, or $X_{s}({\omega \over 2\pi })\,$ if you wish, is a DTFT (by mathematical equivalence). And that point is clearly made in the DTFT article. Since this is intended as a "concise summary", my opinion is that it's potentially more confusing than helpful. --Bob K 12:20, 20 June 2006 (UTC)

DTFT:

X(e^{i\omega })=\sum _{k=-\infty }^{\infty }x[k]\,e^{-i\omega k}\,

(same as X(z) in double-sided Z transform which is nice.)

FT of sampled function x_s(t) (using the conventional scaling that leaves out T):

X_{s}(f)=\sum _{k=-\infty }^{\infty }x(kT)\,e^{-i2\pi fkT}\

=\sum _{k=-\infty }^{\infty }x[k]\,e^{-i2\pi fkT}\

={\frac {1}{T}}\sum _{n=-\infty }^{\infty }X(f-nf_{s})\

(but not the same function, X(.) as above)

The relationship between the two is:

X_{s}(f)=X(e^{i2\pi fT})\,

essentially the same function, but not "exactly" the same function. r b-j 15:57, 20 June 2006 (UTC)

Same function; i.e. plot them on the same f-axis. The notational difference is superficial and serves no purpose here [my opinion]. But now I know (for the first time) that we fundamentally agree. Yay! --Bob K 18:46, 20 June 2006 (UTC)

actually, Bob, i don't think we agreed specifically about the correct word. you said "exactly", which i said is not strictly true, and i said "essentially", which you said is "misleading". i continue to stand by my adverb as being more correct. r b-j 02:44, 22 June 2006 (UTC)

By fundamentally agree, I mean: $X_{s}(f)=X(e^{i2\pi fT})\,$. The equals sign means exactly equal, so it's just two different notations for the same function of independent variable, $f\,$. I wasn't sure if you were getting that or not. Thus I was confused/misled by the word essentially. --Bob K 11:55, 22 June 2006 (UTC)
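For anyone who wants to see the two expressions for $X_{s}(f)\,$ agree numerically, here is a rough sketch. It assumes the self-dual Gaussian pair x(t) = exp(-πt²), X(f) = exp(-πf²) as a test case (not something used elsewhere in this thread), and the function names, T = 0.5 and f = 0.3 are illustrative only.

```python
import cmath
import math

def x(t):
    return math.exp(-math.pi * t * t)   # assumed test signal

def X(f):
    return math.exp(-math.pi * f * f)   # its continuous-time Fourier transform

def Xs_from_samples(f, T, k_max=60):
    """Sum over k of x(kT) exp(-i 2 pi f k T): the DTFT of x[k] at omega = 2 pi f T."""
    return sum(x(k * T) * cmath.exp(-2j * math.pi * f * k * T)
               for k in range(-k_max, k_max + 1))

def Xs_from_images(f, T, n_max=60):
    """(1/T) times the sum over n of X(f - n f_s): the periodic summation of the spectrum."""
    fs = 1.0 / T
    return sum(X(f - n * fs) for n in range(-n_max, n_max + 1)) / T

T, f = 0.5, 0.3
lhs = Xs_from_samples(f, T)
rhs = Xs_from_images(f, T)
print(abs(lhs - rhs))                            # agreement to rounding error
print(abs(Xs_from_samples(f + 1.0 / T, T) - lhs))  # periodic in f with period f_s
```

The first function never mentions T except through the product fT, which is the sense in which the difference between $X_{s}(f)\,$ and $X(e^{i\omega })\,$ is only a frequency normalization.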

You also replaced this paragraph:

- The discrete [frequency] Fourier transform (DFT) is a formula for computing regularly-spaced values (i.e. at discrete frequencies) of the DTFT function. Because of DTFT periodicity, a finite number of values is sufficient to fully characterize the DFT.

Instead, you wrote:

- The discrete [frequency] Fourier transform (DFT) is equivalent to evaluating (or sampling) the DTFT at regularly-spaced discrete frequencies. Because of DTFT periodicity, a finite number of values is sufficient to fully characterize the DFT. Because of this sampling in the frequency domain, this has the effect of periodically extending the data $x[n]\,$ in the time domain (in the same way that sampling in the time domain periodically extends the spectrum in the frequency domain).

Instead of that last sentence (which I would prefer to leave out), I would say that when $x(t)\,$ happens to be a periodic function, then $X(f)\,$ and $X_{s}(f)\,$ are effectively discrete-frequency functions. In that special case, it is possible for the DFT to fully represent the DTFT. And the inverse DFT reproduces the [periodic] samples of $x(t)\,$ . Conversely, when $x(t)\,$ is not periodic, e.g. $\mathrm {rect} (t)\,$ , the DTFT is a continuous-frequency function, and the DFT cannot fully represent it. The inverse DFT still produces a periodic sequence, but it is not the [aperiodic] x(nT) sequence. When a function of a continuous variable is approximated by a function of a discrete variable in one domain, the manifestation of the approximation error in the other domain is periodicity. --Bob K 12:10, 16 June 2006 (UTC)

i'm still decoding what you're saying as an alternate, but i continue to stand by that sentence for both accuracy and conciseness. essentially, sampling in one domain always causes a periodic extension in the other domain. and, since the Fourier transform is invertible, the converse is true.
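The disputed sentence can be illustrated with a small numerical sketch: sample the DTFT of a length-6 sequence at only N = 4 frequencies, then apply an inverse DFT. The sequence, N and the helper names below are illustrative assumptions, not anything from the article.

```python
import cmath
import math

def dtft(x, omega):
    """DTFT of a finite sequence x[0..L-1] evaluated at angular frequency omega."""
    return sum(xk * cmath.exp(-1j * omega * k) for k, xk in enumerate(x))

def idft(Xm):
    """Inverse DFT of N frequency samples."""
    N = len(Xm)
    return [sum(Xm[m] * cmath.exp(2j * math.pi * m * n / N) for m in range(N)) / N
            for n in range(N)]

x = [1, 2, 3, 4, 5, 6]   # assumed example data, longer than N
N = 4
freq_samples = [dtft(x, 2 * math.pi * m / N) for m in range(N)]
y = [round(v.real, 9) for v in idft(freq_samples)]
print(y)                 # [6.0, 8.0, 3.0, 4.0]
```

The result is one period of the periodic extension, y[n] = x[n] + x[n+4], so the samples x[4] and x[5] have wrapped onto x[0] and x[1]. With N ≥ 6 the inverse DFT would return the original six values unchanged.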

BTW, i am not sure whether or not i agree with Dicklyon's perspective on this, but i do think it should be discussed. probably this best belongs in DTFT or similar articles. the sampling theorem does not need the DTFT to support its thesis. it just so happens that there is a clean and linear relationship between the output of the DTFT, X(e^iω), and X_s(f). mentioning this as a note (with link to DTFT) makes sense, but i wouldn't build much about the sampling theorem on it.

actually, i think we should include the proof using the PSF (way above), to satisfy mathematicians who don't like our use of the dirac delta function in the mathematical treatment. we could include it as an alternative proof. r b-j 02:10, 20 June 2006 (UTC)

Simple Solution

None of this rambling discussion of properties of various Fourier transforms has anything to do with a concise statement of the sampling theorem. So I took it out. Dicklyon 14:51, 16 June 2006 (UTC)

one thing, Dick. it is customary to include a statement or rationale when deleting a large section of content (particularly content added recently by someone else). to delete a section of content added recently by someone else, along with other "innocuous" edits and then cite only the innocuous edits in the edit summary, might be considered to be a little shady. like either 1. you're so dismissive of the added section you think it deserves no mention to delete it or 2. you're trying to sneak it out and see if no one notices. maybe you just forgot to mention it (a legit oversight). r b-j 02:10, 20 June 2006 (UTC)

You better remind me what section you're referring to. I'm not seeing it. Dicklyon 05:28, 20 June 2006 (UTC)

diff: [10] r b-j 15:57, 20 June 2006 (UTC)

OK, I confess. I don't recall if I deleted the "Note about scaling" section on purpose, or whether it might have been accidental. :-/

In truth, I don't remember ever reading that section. I can't say that I particularly care for its cryptic contents, so it's possible I just took it out to see if anyone cared about it. It seems to be quite irrelevant to the topic at hand. Dicklyon 20:01, 20 June 2006 (UTC)

perhaps you can begin by asking yourself how you would design and construct an approximation to a brickwall filter (that title deserves an article) with cutoff f_s/2 and with passband gain of T= 1/f_s. T is not dimensionless.

or you can ask yourself how to model a practical conventional DAC, or why, in a simple DSP system, (you know those DSKs that TI and ADI sell with ADC and DAC built in) where the ADC and (conventional) DAC have identical V_ref (so their F.S. voltage is the same). program the DSP to simply pass the number from the ADC to the DAC ("talkthru.asm" is a common filename) and then use a scope and signal generator to measure the frequency response. besides some constant delay, if the DAC is conventional (not sigma-delta), you will notice a nearly 4 dB drop in gain as your frequency gets close to Nyquist.

this is treated in zero-order hold, but you will notice a difference in scaling factor for x_s(t) compared to this article (and 95% of the communications/DSP texts out there). why? that is what the cryptic contents of that Note on scaling are about. r b-j 20:50, 20 June 2006 (UTC)
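The "nearly 4 dB" figure quoted above follows from the magnitude response of an ideal zero-order hold, |sinc(f/f_s)|, once it is scaled for unity gain at DC. A minimal sketch, assuming that scaling and a hypothetical helper name:

```python
import math

def zoh_gain_db(f_over_fs):
    """Zero-order-hold magnitude |sinc(f/fs)| in dB, normalized to 0 dB at DC."""
    if f_over_fs == 0:
        return 0.0
    u = math.pi * f_over_fs
    return 20.0 * math.log10(abs(math.sin(u) / u))

droop_at_nyquist = zoh_gain_db(0.5)
print(round(droop_at_nyquist, 2))   # -3.92
```

At f = f_s/2 the value is 20·log10(2/π), about -3.92 dB, which is the roll-off one would expect to measure on the scope in the talkthru experiment if the DAC behaves as an ideal hold.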

Yes, I can see what it is talking about. But crude reconstruction filters like zero-order hold have very little to do with the sampling theorem. This whole section is a narrow aside on approximate reconstruction techniques, I think. I don't see how it helps one understand the sampling theorem or its applications.

no one is building "crude reconstruction filters like zero-order hold". the behavior of the ZOH is what you get stuck with when you "reconstruct" with a conventional DAC. whether you like it or not or think it is crude or not, any practical reconstruction (and this article is about reconstruction but presents only a hypothetical and impractical method of reconstruction) will not be done with a string of dirac impulses. the discussion of the operation of the ZOH really belongs in an article about ZOH, but the article (as well as the FOH article) begins with a slightly modified premise regarding the sampled signal, x_s(t). where is this difference in premise to be bridged? r b-j 22:37, 20 June 2006 (UTC)

Interesting viewpoint. I thought this article was about the sampling theorem, which totally requires sinc functions for reconstruction, rather than about approximate reconstruction methods, about which the sampling theorem has little to say. Dicklyon 23:01, 20 June 2006 (UTC)

POLL: Does anyone here think the "Note about scaling" subsection is useful? Dicklyon 21:41, 20 June 2006 (UTC)

at least he's asking questions first before shooting. r b-j 22:37, 20 June 2006 (UTC)

No. Just define: $\Delta _{T}(t)\equiv T\cdot \sum _{n=-\infty }^{\infty }\delta (t-nT)\,$ to begin with and be done with it. --Bob K 23:16, 20 June 2006 (UTC)

if i were king of the world, that's what i would do. but the problem is that except for Pohlmann, Principles of Digital Audio (a book with some of my influence in it BTW), no textbook that i know of scales the sampling operator that way (that, along with the conventional DFT scaling, has always disappointed me). they always put the T factor either in the passband gain of the brickwall reconstruction filter or put it as a separate gain block. if you put it in Δ_T, then when you use that model to describe the DTFT (that needs work there, but i don't want to think about it now) or the Z transform, you'll have this extraneous T factor to deal with there and i don't think there should be two different definitions for the sampling operator. i really think we need to stick with the prevalent convention in the books, even if a better convention can be thunk of. r b-j 23:34, 20 June 2006 (UTC)

The separate gain block is fine with me too. So I still vote "no" to the "Note about scaling" section. --Bob K 15:12, 21 June 2006 (UTC)

please illustrate exactly how (with the "separate gain block") you will explain the zero-order hold effect of a conventional DAC without a resulting scaling error of 1/T in the result. (the frequency response has to be dimensionless and the gain at DC is unity or 0 dB.) r b-j 15:37, 21 June 2006 (UTC)

It's you who said that a separate gain block is a "prevalent convention". If "the books" aren't concerned with the scaling error you're worried about, why should I be? --Bob K 22:29, 21 June 2006 (UTC)

no, Bob. i said two things. that the

x_{s}(t)=\Delta _{T}(t)x(t)=\sum _{n=-\infty }^{\infty }x(nT)\delta (t-nT)\

is the "prevalent convention" and that there are two ways (that i know of) of dealing with that, either "put the T factor in the passband gain of the brickwall reconstruction filter or put it as a separate gain block." of the two, i believe the former is much more common. but let's assume the latter for the purpose of imagining a way to do this:

i came up with a means of representing the sampling theorem that is comparable to EE texts (except they mostly convolve with the Dirac comb in the frequency domain to get the repeated images where i use a simpler and more familiar property of the cont. Fourier transform) and added a note about scaling to bridge between the placement of the T factor that is common in texts and represented in the main section to where it is placed in zero-order hold and first-order hold (and i noticed where you do it in DTFT, which, BTW has a definition of Δ(t) inconsistent with this article and Dirac comb). also, i didn't put in this Note on scaling until there was a conceivable need to because of the difference in scaling between the common convention and that in the ZOH and FOH articles.

now, i did that and it satisfies the common convention (at first) and it explains an alternate convention that really is necessary (until someone comes up with alternative that works just as well) to explain how to begin to approach this other practical reconstruction (than brickwall filtering a sequence of dirac impulses). that is the best way i can think of to do it. if you want to use the "separate gain block" but otherwise common convention, Bob, to explain why the conventional ideal DAC has a ZOH frequency and impulse response, the impetus is on you to do that. not me. r b-j 02:37, 22 June 2006 (UTC)

Yes, if that was something I cared about, and if I felt that this article was the appropriate place for it, then I guess the impetus would be mine. But neither condition is true. Anyhow, since the separate gain block is not a prevalent convention, then I am back to my first stance: Just sample with $T\cdot \sum _{n=-\infty }^{\infty }\delta (t-nT)\,$. If you don't want to call it $\Delta _{T}(t)\,$, then give it another name. --Bob K 05:34, 22 June 2006 (UTC)

okay. someone can always revert it if they hate it. if you want to change it that way, it is fine with me, but i was conceding to convention. someone who knows me from comp.dsp is gonna think i did it because i stood on a soapbox advocating changing the lit. to that way. dunno if some people are gonna think that the common convention is the Wikipedia way. (i wouldn't change the definition of Δ_T(t) but just put a T beside it in "x_s(t) = ...") r b-j 06:26, 22 June 2006 (UTC)

As I told you before, rbj, and there was no reply from you, if you want the continuous variable to carry a dimension you also have to take care that the input of the "delta function" is dimensionless. That's because this "function" is purely theoretical and in its theory defined to be dimensionless on both sides, input and output. In the theoretical arguments of this article, it is assumed that all occurring variables are dimensionless, so there is no problem. If one is to apply the theory to practical data with dimensions, then those practical data have to be divided by a reference scale, thus removing the dimension. In most cases this reference scale is 1s or some other unit scale, so this removal of dimension does not show up in the decimals of the data, which seems to cause some of your confusion.--LutzL 07:03, 21 June 2006 (UTC)

i am not confused at all. in engineering practice (as well as physics) we deal with quantities and functions that are dimensionful all of the time. certainly when an argument is applied to a mathematical elementary function like sin(), cos(), exp(), the argument must be dimensionless for the function to make sense. but not so for power functions (including roots) or for multiplication and division. when terms are added or subtracted they must be of identical dimension.

using the "engineering" form of the dirac delta function (as a limit of "nascent" delta functions, what i'll call a dirac impulse), it is clear that the "height" or dependent variable (the quantity that comes out of the operator after receiving an argument) must have the reciprocal dimension of the argument. this is because the integral (or area) of the dirac impulse is the dimensionless unity and if the width is measured by some dimensionful quantity (say, seconds), the height must be measured in the reciprocal dimension.

a simple degenerate example: if we have a filter or LTI system in which the output signal is the same species of animal as the signal going in, the transfer function must be dimensionless. in that case:

H(s)=\int _{-\infty }^{+\infty }h(t)e^{-st}dt\

the dt is of dimension time, H(s) is dimensionless, e^st is dimensionless, then that means that h(t) must be of dimension 1/time. indeed for a simple RC low-pass filter (voltage in - voltage out):

H(s)={\frac {1}{1+RCs}}\

h(t)={\frac {1}{RC}}e^{-{\frac {t}{RC}}}u(t)

where u(t) is the (Heaviside) unit step function.

since RC is time, it is clear that the dimension of h(t) is 1/time.
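As a rough numerical illustration of the RC example (a sketch only; the two time constants, the integration span and the step count are assumed values, and the helper names are hypothetical): the area under h(t) stays at the dimensionless value H(0) = 1 whatever RC is, while the height h(0) scales as 1/RC.

```python
import math

def h(t, rc):
    """Impulse response of the RC low-pass, (1/RC) exp(-t/RC) for t >= 0."""
    return math.exp(-t / rc) / rc

def area(rc, steps=200_000, span=40.0):
    """Midpoint-rule integral of h over [0, span*RC]; approximates H(0)."""
    dt = span * rc / steps
    return sum(h((i + 0.5) * dt, rc) for i in range(steps)) * dt

for rc in (1e-3, 2.0):   # illustrative time constants in seconds (assumed)
    print(rc, h(0.0, rc), area(rc))
```

Changing RC by a factor of 2000 changes the peak height by the same factor in the opposite direction and leaves the area alone, which is the behaviour expected of a quantity measured in 1/time.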

now, regarding the dirac impulse, what if the filter is instead simply a pair of wires: v_out = v_in . now what is the impulse response? and then why should the dimension of the impulse response be any different?

LutzL, i've been over this so many times with so many different people. those USENET citings i listed above are but one (but one that was quite public and had a record kept by Deja News which was bought by Google). you can appeal to the distribution definition of the delta function all you want and it does not affect this argument at all, because i will just pick my favorite nascent definition and let a be whatever positive number i want, as small as i want, and in any physical system, the difference of effect from that finite width nascent and any other nascent delta of even smaller width, will be unmeasurable. if the delta function has argument of time, i'll pick a to be a Planck time and, at that point, the argument is over. it's not a true dirac delta "function", but it doesn't matter from the point-of-view of a physical system. r b-j 15:24, 21 June 2006 (UTC)

Hell and heaven, yes, you can do what you like, you can invent your own flavor of mathematics/calculus if you want. But don't claim that other people are wrong and should be corrected if they stick to the usual conventions. And please consider that, if I understand that policy correctly, original homemade research is not welcome on wikipedia. And you yourself admitted that your point of view is not shared even by the gross majority of the engineering literature. You can go on your own crusade on usenet, you have a sufficiently correct intuition with nascent deltas to be helpful sometimes, but wikipedia is not the place for it.--LutzL 17:12, 21 June 2006 (UTC)

sorry to fluster you (i did recognize that the use of the dirac impulse common in electrical engineering signal-processing or communications texts makes mathematicians blanch). but that usage is there in very respected and legitimate texts. this is a dispute of usage between two professional groups: mathematicians and electrical engineers. i fully disagree that Wikipedia is no place for the EE usage of δ(t) or that the EE community has no claim to how the Nyquist–Shannon sampling theorem should be represented. in fact, i think the EE community has the primary claim (but not the only claim) to how the sampling theorem should look. if you want to respond, please take it to the bottom of the talk page since the same dispute of δ(t) usage is also heating up there. r b-j 21:21, 21 June 2006 (UTC)

Fourier transform exists for Dirac comb?

I can't think of any sense in which this statement can be considered true: "To use that analysis tool, a continuous-time function is contrived conceptually (not actually nor numerically) by using the samples to modulate the 'teeth' of a Dirac comb function, which does have a continuous-time Fourier transform." Can someone explain what is intended here? Even if the 'teeth' are replaced by finite pulses, the comb extending infinitely in time is not square integrable and does not have a Fourier transform.

It means whatever row 25 and Dirac_comb#Fourier_transform mean. --Bob K 22:29, 21 June 2006 (UTC)

Or was the "which does have..." meant to refer to the modulated comb, under the assumption that the original signal was square integrable and had a Fourier transform, and the teeth were nascent delta functions? Too bad if so, since it means the analysis no longer applies to those functions, such as outputs of stationary random processes, that are NOT square integrable but DO have a well defined spectral density.

This section needs to be rewritten, or abandoned, I think. Dicklyon 17:36, 21 June 2006 (UTC)

Dick or LutzL, does (non-zero) DC have a fourier transform? is it square-integrable? if you say it does not have a F.T., do the standard engineering communications texts agree? r b-j 17:52, 21 June 2006 (UTC)

No, a constant function does not have a Fourier transform, unless you extend the concept to allow delta-function-type singularities. It's not an unreasonable thing to do, and Khinchin did it, if I recall correctly, but it's not really a Fourier transform. It does have a spectral density, however, which is the Fourier transform of its autocorrelation function (defined in terms of expected value of a product, which is finite everywhere). Most texts agree when they are at all rigorous, but some may not be. See Wiener-Khinchin theorem. Dicklyon 18:24, 21 June 2006 (UTC)

There is this thing called tempered distributions, which is the dual space to the Schwartz space of test functions. As tempered distributions, polynomials and polynomially bounded, locally integrable functions have Fourier transforms. The Fourier transform of a polynomial is a sum of derivatives of the delta distribution at the point zero. Tempered distributions are a common theme of functional analysis. In this regard, electrical engineering is applied functional analysis.--LutzL 14:59, 22 June 2006 (UTC)

well, not every EE curriculum oriented signal processing book can be Papoulis, "Signal Processing". if polling is your method of choice (in deciding what Wikipedia should do regarding this article), i wonder how many EE signal processing and communications texts will say somewhere in the book:

{\mathcal {F}}\left\{x(t)=c\right\}=X(f)=c\delta (f)\

{\mathcal {F}}\left\{s(t)=\sum _{n=-\infty }^{+\infty }\delta (t-nT)\right\}=S(f)={\frac {1}{T}}\sum _{k=-\infty }^{+\infty }\delta (f-k/T)\

or their equivalent with angular frequency. ??

are you saying that more of your above qualified books don't say that than those that do? how 'bout if we add "undergraduate level" to that list of qualifications?

Why is it that all these authors get away with saying it? r b-j 20:56, 21 June 2006 (UTC)

OK, I think I now see the sense in which the Fourier transform of a Dirac comb can be said to exist: it leads to a treatment in terms of Dirac functions that "works OK", so in that sense it exists, and is often found in books, even in Fourier transform pair tables. I'd still feel a lot better to have an explanation, derivation, and proof that doesn't require that artifice. On the other hand, what most do, like Shannon did, is to implicitly assume the signal to be sampled is square integrable, and therefore not address the applicability of the sampling theorem to stationary random processes. Can we fix both problems? Has someone done so rigorously? I'll be looking for it.

I just found where the key inconsistency in the wikipedia is: the Fourier transform article section [11] says "the unqualified term 'Fourier transform' refers to the continuous Fourier transform, representing any square-integrable function...", but the distribution (mathematics) article [12] says "all tempered distributions have a Fourier transform, but not all distributions have one." So is the Dirac comb a "tempered distribution"? I see it's a "Schwartz distribution", but that term is nowhere defined.

Dicklyon 04:03, 22 June 2006 (UTC)

It's meant to be the same. Tempered distribution is the more common term. And yes, it is a tempered distribution, as any text on harmonic or functional analysis will tell you. Even using the Dirac comb, one can only prove the sampling theorem for square integrable functions. Of course there is the Nyquist sampling theorem for periodic functions, which is a special version of polynomial interpolation: a polynomial of degree N is determined by N+1 values at different points, a trigonometric polynomial with at most N voices is related to a polynomial of degree 2N, evaluated on the unit circle, and thus determined by 2N+1 samples at different phases of the period.--LutzL 14:59, 22 June 2006 (UTC)
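The periodic case mentioned above (2N+1 samples fixing a trigonometric polynomial with N harmonics) is easy to try out. The following is only a sketch: the example polynomial with N = 2, its coefficients, the equally spaced sample phases and the names p, q and c are all illustrative assumptions.

```python
import cmath
import math

N = 2            # highest harmonic
M = 2 * N + 1    # number of samples per period

def p(t):
    # Assumed example: period-1 trigonometric polynomial with arbitrary coefficients.
    return (0.5 + 2.0 * math.cos(2 * math.pi * t) - math.sin(2 * math.pi * t)
            + 0.7 * math.cos(4 * math.pi * t) + 0.3 * math.sin(4 * math.pi * t))

samples = [p(j / M) for j in range(M)]

# Fourier coefficients c_k for k = -N..N, computed from the M samples only
# (a length-M DFT with the frequency index taken symmetrically about zero).
c = {k: sum(s * cmath.exp(-2j * math.pi * k * j / M)
            for j, s in enumerate(samples)) / M
     for k in range(-N, N + 1)}

def q(t):
    """Trigonometric polynomial rebuilt from the samples alone."""
    return sum(ck * cmath.exp(2j * math.pi * k * t) for k, ck in c.items()).real

t0 = 0.37        # an arbitrary phase between the sample points
print(abs(q(t0) - p(t0)))   # agreement to rounding error
```

With fewer than 2N+1 samples per period, the harmonics k and k ± M would no longer be distinguishable on the sample grid, which is the periodic counterpart of spectral overlap.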

i think the proof based on Poisson summation formula (PSF) does not use dirac impulses at all. a couple are listed above. we can argue about whatever concise version of the proof is clearer. r b-j 05:02, 22 June 2006 (UTC)

The PSF is a generalization of the Dirac comb. Or the Dirac comb with its Fourier transform, the Dirac comb, is an interpretation of the PSF in terms of tempered distributions. While the dirac comb is only applicable to Schwartz test functions, the PSF is applicable to fast falling (4th degree) twice differentiable functions. To prove the theorem, convolution of the Dirac comb with a fast falling locally integrable function is the shortest way to go using distributions. Just splitting up the inverse Fourier integral into a sum of integrals over periods is the fastest way to go without using distributions.--LutzL 14:59, 22 June 2006 (UTC)

also, i think stationary random processes that are the sum of equally-spaced sinc functions (like normal reconstruction) where the coefficients of the sinc functions are random numbers coming out of some ideal random number generator, i think that stationary process can be sampled. i would think that some IEEE or Bell-system Journal would have someone describing this. i dunno. r b-j 05:12, 22 June 2006 (UTC)Reply

Yes, of course random processes can be sampled. Their Fourier transforms need not exist. And I think they can be unambiguously reconstructed, too, if they are bandlimited. I thought this was well known and proven, but I see that Shannon's original proof seems to only apply to square integrable functions. The only real issue is the convergence of the infinite sum of sincs, which as someone pointed out has the potential to diverge due to the 1/x nature; I think that for any frequency strictly less than half the sample rate, however, convergence can be shown. Dicklyon 05:42, 22 June 2006 (UTC)Reply

The convergence condition for sinc-series is given in the reconstruction formula article. An infinite series of random events violates this condition with probability one. Thus there is no function to speak of, not even in a generalized sense. One can't take samples from a non-function. - From a non-function, one cannot determine the Fourier transform, there is no highest frequency to speak of.--LutzL 14:59, 22 June 2006 (UTC)

i don't think any random process can be sampled and accurately reconstructed. About the 1/x like convergence of the sinc, we already know we have crossed a line with the strict mathematicians when we engineers treat the dirac delta precisely like a function that violates one of the properties of Lebesgue integration: If f, g are functions such that f = g almost everywhere, then f is integrable if and only if g is integrable and the integrals of f and g are the same. If we're past that, then i think we can use the invertible or one-to-one mapping of the Fourier Transform to support the point (indirectly) of what the sum of sincs must converge to if it converges to anything. they criticized Fourier for lacking proof of convergence, too. even though the 1/x envelope is insufficient to guarantee convergence (but it doesn't preclude it, we know that the sum of ((-1)ⁿ)/n converges) of the reconstructed value in between samples, i am comfortable with the lack of direct proof. r b-j 06:06, 22 June 2006 (UTC)

You are again building up strawmen and beating them to death. Of course, the Fourier transform is also one to one on the space of tempered distributions. And the periodization of a tempered distribution with compact support is again a tempered distribution. This allows to write things like

\sum _{n\in \mathbb {Z} }\delta '_{n}=i2\pi \sum _{k\in \mathbb {Z} }k{\hat {\delta }}_{k}

where

\delta _{n}[x]=x(n),\;\delta '_{n}[x]=x'(n)\;and\;{\hat {\delta }}_{k}[x]=X(k)

for any Schwartz test function x=x(t). For any partial sum of the right hand side one can give a regular distribution, that is a functional consisting of integration over a product with a normal function. In the limit, this does not hold. It would serve you well, rbj, if you would look up the details of this dispute over the convergence of the Fourier series. Carl Offner's script "A little harmonic analysis" is a good starting point, if read back to front. Solving the problems of this dispute required several revolutions in mathematics from the mid-19th to the first half of the 20th century; the most severe revolution was Cantor's set theory and the definition of a function as a relation resp. a subset of the set of pairs.--LutzL 14:59, 22 June 2006 (UTC)

Consider the reading of a thermometer, measuring the temperature of the room you are sitting in. It is a bandlimited random process, which we could sample $10^{6}$ times per second, if we wished. I realize, of course, that the samples must have finite precision and that we cannot reconstruct the future. But the past is not random. It is a specific instance of a process that we can only describe probabilistically, until it happens. That specific instance can of course be reconstructed, like any other B/L function. [OK. I'm in my flame-resistant suit... flame away!] --Bob K 12:53, 22 June 2006 (UTC)

Yes, and you need a sampling model and a sampling theory for this specific instance of sampling that is only marginally related to this, Shannon's, sampling theorem. And I doubt that you can reconstruct the process to any precision, since that would require being able to reconstruct every single collision of an air molecule with the thermometer. And since the collisions lead to jumps in the inner energy of the thermometer, this process is bandlimited only in a generalized sense, where the Fourier transform is bounded below 3 dB (of what?) outside the considered frequency band.--LutzL 14:59, 22 June 2006 (UTC)

I'm not talking about the instantaneous temperature at a molecular level. I am talking about the filtered temperature, time-averaged by the lowpass response of the measurement device. That is a random process too. And I already acknowledged that realistic sampling cannot represent it to infinite precision (quantization noise). But that has nothing to do with whether the underlying process is random or not. If that's your best shot, I probably won't be needing the flame-suit after all. :-) --Bob K 15:24, 22 June 2006 (UTC)

Tune ups


I hope my "tune ups" and such meet with approval. I found lots of things unclear or not quite right. The main part that now remains bugging me is the "Mathematical basis..." section that starts out claiming that "To prove this, a mathematical representation of the uniform sampled signal that effectively discards the information between samples must be constructed." Is this really so? Aren't there other ways to prove it? Can we at least agree that this is just an approach, not something that MUST be done? I can't begin to get my head around what is right and what is unclear until I clear this up.

I hope I'm not being too picky or rigorous; I think it's worth converging on something that's really right and consistent.

Dicklyon 05:37, 22 June 2006 (UTC)

Frequencies versus sinusoidal components


Rbj, I used "frequencies" just like Shannon did (see box above). This means only and exactly what the math says (the Fourier transform being zero). To say "sinusoid components" muddies the waters, because a component is usually understood to be discrete in frequency; a pure tone component. Any square integrable signal, i.e. any signal that can actually be Fourier transformed without delta functions, has NO sinusoidal components. That is, if you look at the energy in any band, in the limit as the bandwidth goes to zero (toward sinusoidal), that energy is zero in the limit. So "no frequencies above..." is a much stronger condition than "no sinusoid components above...".

Fix it back? Dicklyon 05:50, 22 June 2006 (UTC)Reply

i did it to clear the waters, but do with it what you want. i like to think that my body contains matter that has the property of warmth rather than my body contains warmth but i might just be anal. r b-j 06:10, 22 June 2006 (UTC)Reply

Random processes


New section because the complexity of topics got out of hand above. Here's the issue: is there a sampling theorem for (sample functions of) random processes, which are not in L2?

LutzL says above: "From a non-function, one cannot determine the Fourier transform, there is no highest frequency to speak of." But a sample function of a random process is not a "non function", and the random process itself (if stationary in a wide sense) can have a spectral density. So what's the problem?

He also points out that the reconstruction formula does not converge; with probability 1, the conditions for convergence are not met. Does that really mean it does not converge? Or just that we can't prove some technicality?

So two questions, really: Is there no mathematical sampling theorem provable for wide-sense stationary random processes? Is there any problem in an engineering sense with using the Shannon sampling theorem with such processes?

Dicklyon 15:28, 22 June 2006 (UTC)

So, I looked at what was said about convergence at Nyquist–Shannon interpolation formula, thought about it, and wrote up what I believe to be true for random processes. Which is that it sounds like a non-problem and that convergence is practically guaranteed. Any comments? Can I copy some of that to here, or is someone more mathematical than I going to quibble with it? Dicklyon 00:35, 23 June 2006 (UTC)Reply

Who, us?... quibble? --Bob K 04:01, 23 June 2006 (UTC)Reply

I have a question. WSS processes are all fine and good for some things, but they have no beginning and no end. And isn't that what leads to their exclusion from L2? I.e., what about a windowed WSS process (which of course is no longer WSS)? In the real world it is always windowed processes that we are dealing with. I have done a lot of DFTs on them, and it seems to work just fine. So what is all the fuss about? --Bob K 04:01, 23 June 2006 (UTC)Reply

Yes, you've got it right. The fuss is just that some kinds of analytic statistical techniques work best with WSS processes, rather than with finite signals. So it's handy to have some applicable theorems, such as the Wiener–Khinchin theorem to help out.

I scanned some books, and found what I was looking for in Wozencraft and Jacobs, Principles of Communication Engineering, 1965, pp.598–603, after not finding it in Gallager nor Sage and Melsa. They prove the sampling theorem in an appendix, but in the main text explicitly discuss extending it to random processes, so they can analyze random waveform sources. They conclude that the process doesn't even have to be stationary to be bandlimited and subject to the sampling theorem, but I'm not sure how that result can be formalized. However, they also admit that "The assumption that a process is ideally bandlimited is not completely realistic."

Their page 2 also has a nice historical summary of Nyquist's and Hartley's contributions to information theory. I think I'll say something more in this article about what Nyquist actually showed.

Dicklyon 04:55, 23 June 2006 (UTC)

"They conclude that the process doesn't even have to be stationary to be bandlimited and subject to the sampling theorem..." i didn't think that stationary was necessary or really an issue (as long as the changing parameters or moments of the process didn't make the bandwidth of the random process equal or exceed Nyquist). i still think that given virtually any discrete random process, x[n], that is these numbers come out of some kind of random number generator, even the kind that samples diode noise with an ADC or a numerical pseudo-random number generator that

x(t)=\sum _{n=-\infty }^{+\infty }x[n]\mathrm {sinc} \left({\frac {t-nT}{T}}\right)\

is both necessary and sufficient for sampling at virtually f_s = 1/T (you might have to be a hair higher). if x[n] is stationary then x(t) is. now not all possible sequences of x[n] are legit (but i think they're also bloody unlikely). if such a sequence came out of an RNG:

x[n]={\begin{cases}(-1)^{n},&{\mbox{if }}n\geq 0\\-(-1)^{n},&{\mbox{if }}n<0\end{cases}}

it could not be reconstructed so i am not sure how one could speak of it being sampled. but it's also not bloody likely to come out of any RNG. r b-j 20:13, 23 June 2006 (UTC)Reply
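The failure for this particular sequence can be seen numerically. The sketch below assumes T = 1, the normalized sinc, and symmetric truncation of the sum at |n| <= n_max; the function names are illustrative. At the midpoint t = 0.5 every term except n = 0 has the same sign and decays like 1/n, so the partial sums drift off logarithmically rather than settling.

```python
import math

def sinc(x):
    """Normalized sinc: sin(pi*x)/(pi*x), with sinc(0) = 1."""
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)

def x_seq(n):
    """The sequence from the comment: (-1)^n for n >= 0, -(-1)^n for n < 0."""
    sign = 1 if n % 2 == 0 else -1
    return sign if n >= 0 else -sign

def partial_reconstruction(t, n_max):
    """Symmetric partial sum of x[n] * sinc(t - n) over |n| <= n_max (T = 1)."""
    return sum(x_seq(n) * sinc(t - n) for n in range(-n_max, n_max + 1))

# Partial sums at the midpoint between two samples, for growing truncation lengths.
partial_sums = {n_max: partial_reconstruction(0.5, n_max)
                for n_max in (10, 100, 1000, 10000)}
```

Each tenfold increase in n_max moves the partial sum by roughly (2/pi) ln 10, which is the harmonic-series behaviour that the 1/x envelope of the sinc permits but does not force.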

As long as the probability distribution of the random generator has a positive deviation (2. moment), then the square sum of almost any infinite sequence of random samples will have infinity as square sum. The same goes for the weaker criterion. What is legal and possible is to exchange the sinc function for a compactly supported function. One can find such functions that are almost bandlimited, or bandlimited in the 3dB sense. The corresponding series will still not converge in L²(R), but it converges locally to a continuous function, since the infinite sum has at each point only a finite number of nonzero terms. One can apply Fourier analysis to windowed portions of this function, make a generalized approximative sampling theory about it,...--LutzL 08:51, 24 June 2006 (UTC)

I disagree with what you seem to be implying from your opening sentence: "As long as the probability distribution of the random generator has a positive deviation (2. moment), then the square sum of almost any infinite sequence of random samples will have infinity as square sum." True, but that's not the issue. You need to look at those terms weighted by the sinc. If the infinite sum of the squares of the infinite weight sequence (the sinc) converges (which it does, since the magnitudes are bounded by the square of 1/n), then the infinite sum of the random variables will have a finite variance, and the variance of the terms truncated in the tail will converge to zero. Or so it seems to me. What's wrong with my reasoning? Dicklyon 14:51, 24 June 2006 (UTC)
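The variance argument can be checked with a few lines of arithmetic. This is a sketch under the assumption that the x[n] are uncorrelated, zero-mean and unit-variance, so that the variance of the truncated sum at time t is just the sum of sinc(t - n)² over the retained terms (T = 1, normalized sinc). It speaks only to mean-square convergence at a fixed t, not to almost-sure convergence or convergence in L²(R).

```python
import math

def sinc(x):
    """Normalized sinc: sin(pi*x)/(pi*x), with sinc(0) = 1."""
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)

def weight_energy(t, n_max):
    """Sum of sinc(t - n)**2 over |n| <= n_max (variance of the truncated sum)."""
    return sum(sinc(t - n) ** 2 for n in range(-n_max, n_max + 1))

t = 0.3   # an arbitrary instant between samples
# The full sum of sinc(t - n)**2 over all integers n equals 1, so what is left
# after truncation is the variance contributed by the discarded tail.
tail_variance = {n_max: 1.0 - weight_energy(t, n_max) for n_max in (10, 100, 1000)}
```

The tail variance shrinks roughly like 1/n_max, consistent with the weights being bounded by 1/n in magnitude.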

Yes, there will always be pathological sequences for which the reconstruction does not converge. That's why you need a characterization of the process, so you can conclude that those have probability 0.

The reason you need stationary, or so I thought, was so that the notion of bandlimited could be defined. What do you use as a spectrum characterization for a non-stationary process? While it may be true that any random sequence of samples can be converted back to a provably bandlimited signal, that doesn't help you make the theorem unless you can show some class of signals to be bandlimited to start with. Dicklyon 22:03, 23 June 2006 (UTC)Reply

x(t)=\sum _{n=-\infty }^{+\infty }x[n]\mathrm {sinc} \left({\frac {t-nT}{T}}\right)\

is not guaranteed to be bandlimited? r b-j 02:06, 24 June 2006 (UTC)Reply

yes, it is. But what class of original signals does that correspond to if you don't have stationarity to have a theorem about frequency spectrum content? 24.6.152.39 05:33, 24 June 2006 (UTC)Reply

Aliasing


at an earlier time, i remember bringing up the subject with BobK. i think we should just be clear that aliased frequency components (of frequency f) are those that could be mistaken for the legitimately sampled component at mf_s - f where m = floor(f/f_s + 1/2). it is a distortion but a very specific kind where some frequencies masquerade for others. r b-j 20:13, 23 June 2006 (UTC)
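A small numerical illustration of this kind of masquerading follows. It is a sketch that assumes m is meant to be the integer nearest to f/f_s, i.e. floor(f/f_s + 1/2), and it reports the magnitude |m f_s - f|; the function name and the 6 kHz / 8 kHz figures are made up for the example.

```python
import math

def alias_frequency(f, fs):
    """Baseband frequency in [0, fs/2] that a tone at f is mistaken for at rate fs."""
    m = math.floor(f / fs + 0.5)      # integer multiple of fs nearest to f
    return abs(m * fs - f)

fs = 8000.0
f = 6000.0                            # above fs/2, so it aliases
f_alias = alias_frequency(f, fs)      # |1 * 8000 - 6000| = 2000.0

# The two cosines agree at every sampling instant n / fs.
max_sample_difference = max(
    abs(math.cos(2 * math.pi * f * n / fs) - math.cos(2 * math.pi * f_alias * n / fs))
    for n in range(16)
)
```

For cosines the sign of m f_s - f does not matter; for sines the alias can also come out with inverted sign, which is one reason the effect on a general signal is messier than this single-tone case.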

Well, I liked how they put it: "the resulting distortion is called aliasing". This distortion is very simple only in cases where the signal consists of sinusoid components. In general, a spectral inversion via aliasing is pretty messy, though conceptually simple in terms of what it does to frequency components as you point out. That's why moire patterns and jaggies in images can look so weird. Distortion is a good all purpose term for the difference between the ideal or original and what your system actually does. Dicklyon 22:07, 23 June 2006 (UTC)Reply

The Introduction section


Just made some modifications for the following reasons:

A band-limited signal can change arbitrarily fast, just use a sufficiently high amplitude since the rate of change is the product of amplitude and frequency.
The Fourier transform used here is not unitary, and there is no requirement that this particular version of the FT is used to prove the point.
I have converted to the notation ${\mathcal {F}}\{x\}$ for the Fourier transform of the signal x, as this is the notation used in the Fourier transform article.

--KYN 19:22, 2 August 2006 (UTC)Reply
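The first point in the list above can be illustrated for a single sinusoid, whose peak slope is 2πfA. The sketch below estimates the slope by finite differences; the function name and the chosen amplitudes and frequency are illustrative only.

```python
import math

def max_slope(amplitude, freq_hz, n_points=10000):
    """Estimate max |dx/dt| of x(t) = A*sin(2*pi*f*t) over one period."""
    dt = (1.0 / freq_hz) / n_points

    def x(t):
        return amplitude * math.sin(2 * math.pi * freq_hz * t)

    return max(abs(x((k + 1) * dt) - x(k * dt)) / dt for k in range(n_points))

# Same 10 Hz band limit, amplitudes differing by a factor of 100.
slope_small = max_slope(1.0, 10.0)     # close to 2*pi*10
slope_large = max_slope(100.0, 10.0)   # close to 2*pi*10*100
```

So the band limit alone does not bound the derivative; it bounds the derivative only relative to the signal's amplitude.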

And after your changes got reverted for unknown reasons, I changed it differently. In particular, there's no need to introduce Fourier transform at all in stating or understanding the theorem or the concept of bandlimited. The existence of the FT is a big stumbling block that limits the domain of applicability of the theorem, so let's do without it. Dicklyon 02:04, 3 August 2006 (UTC)

And then I changed the symbol for bandwidth to B, including in the image of the hypothetical spectrum. Dicklyon 02:24, 3 August 2006 (UTC)Reply

although nearly anything is possible, it is pedagogically ridiculous to try to teach anyone about the sampling theorem without the concept of frequency spectrum. both the straight-forward proof ("let's sample the SOB with a dirac comb and see what happens") and the more indirect proof of the Poisson summation formula require the use of the continuous Fourier transform (the latter doesn't use a dirac comb). there are different notations and even philosophies between electrical engineers (and the scholars among them) and mathematicians. this very talk page shows evidence about that (regarding the nature of the dirac delta function). this article should not be relegated to the impenetrable language of the pure mathematicians because i can guarantee that 99% of those who will profit from it will profit only from the simpler and more straight-forward EE perspective of it. r b-j 16:30, 3 August 2006 (UTC)

The concept of a frequency spectrum is a good one, but it is very easy to use it incorrectly while pretending to be rigorous. I'm no mathematician, however, so have no fear about me adding that sort of impenetrable language. I'm on your side in trying to avoid it. As to pedagogically ridiculous, however, I think that goes too far. It may be possible to explain the sampling theorem quite precisely, without proof, using the concept of frequency without the concept of spectrum. For example, Harold S. Black's discussion of sampling, including many corollary theorems, is very clear without mentioning the concept of a spectrum. But it's without proof. Dicklyon 17:59, 3 August 2006 (UTC)

Mathematical basis


What's up with this long mathy section? Is it really required to go into all this gory math with Dirac distributions to understand or prove the sampling theorem? Can we do it more concisely? Has anyone actually read the whole thing and can verify that it's even right? Dicklyon 02:24, 3 August 2006 (UTC)Reply

for someone who says you don't need the FT, that's an odd statement. what do you want to use? Poisson summation formula? (that proof is on this page.) r b-j 05:00, 3 August 2006 (UTC)Reply

I'm not sure which statement you think is odd; I wrote nothing but questions. The FT is not needed to state or understand the sampling theorem. It may well be needed to prove it. Do you know for sure? Does the Poisson technique avoid FT or Dirac? Dicklyon 05:06, 3 August 2006 (UTC)Reply

there were (presumptive) statements imbedded in the questions that should be supported. as for your last questions in each of your two posts here, can you answer that yourself? (i should think you should be able to if you're editing this article.) you might also check out some of the back and forth between User:LutzL and myself about this issue. r b-j 05:15, 3 August 2006 (UTC)Reply

R, OK, but I'm still not sure what presumptive statements you find odd. My point is that the mathy section is extremely long-winded, and something that long and mathy is not going to help anyone understand the subject. I'm perfectly capable of verifying and understanding every line of it, but it seems to me that there's got to be a better, or at least more concise way to do this. I'll try to find one. Or two. Do we really need to use different math for the different cases of finite-energy signals versus stationary processes? By taking the FT out of the intro, I believe I used math and language that is simple and correct for both. Do you agree? Dicklyon 17:48, 3 August 2006 (UTC)Reply

Talk:Nyquist–Shannon_sampling_theorem#Proof_using_PSF above is an alternative proof using the PSF that is about as concise as it can get. but it is not as straight-forward to understand as the present "long winded" proof. and it uses the FT (but does not use the Dirac Comb). i personally do not see how it can be proved without the FT. some proof that is well accessible to undergrad EE students (and others with a similar level of mathematical sophistication) belongs in the article. User:BobK put in this "concise overview" which i am not as certain is useful, but i didn't want to fight that fight.

i will fight any fight that tries to remove all proofs of the theorem that are rigorous enough to not have big holes in it. and i will fight any fight (with the pure mathematics crowd) to obfuscate this with objections to the use of the "nascent" dirac delta function that engineers and engineering texts commonly use (EE's treat the dirac less rigorously - like a "regular" function - whereas mathematically it is not strictly a function like that). that is the basis of my dispute with User:LutzL above (and you can see there were other objections from other EE-background editors). i am willing to let the pure math guys take over the content of the continuous Fourier transform and some others, but not this article. r b-j 18:15, 3 August 2006 (UTC)

also, i do basically disagree with the removal of FT references. the most fundamental facts of the sampling theorem that should be put in the introduction is that: if the continuous-time signal is bandlimited (how do we define that without the use of FT or the concept of spectrum?) and the sampling rate exceeds twice the bandlimit, that there is sufficient information in those discrete samples to reconstruct the original. i believe that you are crapping up the article and making the concept less accessible by removing FT from it. r b-j 18:20, 3 August 2006 (UTC)

R, you sound like you're trying to pick a fight. Nobody's trying to remove proof or banish FT. My point is in answer to your parenthetical question "(how do we define that without the use of FT or the concept of spectrum?)". Did you see how I wrote it? It looks just like an FT, but it doesn't require the FT to exist to be meaningful in the stated region. Is that too close to pure math? Dicklyon 18:27, 3 August 2006 (UTC)Reply

I re-did this section a bit. Just tightened up a few things, fixed some informalities, made it more clear what's true and what's required, I hope, and just a bit shorter. I still think it can be a lot more concise, especially the first part that gets about 10 equations in before showing a sequence of dirac impulses weighted by sample values. I may work on that later. Dicklyon 08:41, 4 August 2006 (UTC)Reply

Hi. Dicklyon, if you say that you understand every line of the mathematical basis article and find it correct then you don't understand enough basic calculus. See the discussions above on the Fourier series of a Dirac comb. The formula given is only correct in a very weak sense that would require a lengthy comment or at least a link to the Dirichlet kernel. As written it claims the identity of a tempered distribution to a mathematical impossibility. In the introduction, to avoid the Fourier transform, you give a Fourier integral as the more general transform. Unfortunately, the converse is true. There is a Fourier transform for functions, for which the Fourier integral does not exist. A commonly known example is the sinc function. And to end my comment a general remark: the sampling theorem is also a fundamental theorem in the mathematical discipline of harmonic analysis, it was known and heavily used there even before communication engineers discovered the importance of the concept of the bandwidth of a signal (around 1920).--LutzL 10:08, 4 August 2006 (UTC)Reply

I never said it was correct. I understand your point exactly, which is that the concept of delta functions does not integrate trivially with transforms and other mathematical operations; yet it is conventional (in engineering at least) to use them as if they do. I was just trying to make the writeup consistent and a little more concise within this framework, e.g. by adding the condition that x be square integrable so that at least its baseband transform exists.

As to your remark about the history, yes, we've tried to capture that in the article already. Please add to it if we missed some critical points. Dicklyon 17:20, 4 August 2006 (UTC)Reply

Dicklyon at least has a point in that a formally correct proof appears rather lengthy. Is it really necessary to present a formally correct proof in the article? Can the theorem be stated without a formally correct proof? I believe that it is sufficient to let the article include the standard informal proof in the form of a graphical illustration of what happens in the signal and frequency domains when the signal is sampled and then reconstructed, with and without aliasing. Examples of such informal proofs can be found in most textbooks on the subject. Apparently, the theorem can be proven in different ways, why not consider the option of writing separate articles for these proofs? --KYN 11:03, 4 August 2006 (UTC)

Yes, given the turbulent history of this article. A lot of time has been wasted fighting over proofs, as if there is only one "right" way to do it. Now the article doesn't even have Shannon's original proof, which requires neither the controversial "distributions" nor Dirac delta functions. It just requires a simple Fourier series expansion. For proof lovers, there should be a [different] place for anyone's and everyone's favorites. --Bob K 02:44, 9 August 2006 (UTC)Reply

Bob, I think you should pull that from an old version and put it back in a section of its own on "Shannon's proof". But if I recall correctly, he did need to use a representation of the complex spectrum, which means his proof probably is not applicable to stationary processes. I still haven't found one that works for that. Dicklyon 02:54, 9 August 2006 (UTC)Reply

Don't misinterpret my point. I did not say it needs to be lengthy to be formally correct. I'm not sure that's true. I also don't think it needs to be so lengthy to be "engineering correct". I just haven't had time to make it much shorter. I've tried hard already to make the statement of the theorem concise and understandable, but with so many contributors adding what they think helps, it is likely to be an ongoing process. The statement I added about the integral being zero, as a way to define bandlimited without introducing two types of signals and transforms, is one such attempt, and I'm still waiting for a mathematician to tell me why it's really not quite correct; we'll see. The idea of a separate article for detailed proofs and discussions of the engineer/math perspectives is something I've considered. Anyone else like that? Dicklyon 17:20, 4 August 2006 (UTC)Reply

Introduction section again


The Introduction section still contains some statements which I get stuck on:

1. A signal that is bandlimited is constrained in terms of how rapidly it changes in time ... I take it that the author means that "rapidly" refers to frequency. However, I believe that most readers think of "rapidly" in terms of rate of change per time unit, that is, derivative, and this is not bounded for a bandlimited signal. Also, I don't see that the point of the sentence is lost by simply writing, shorter and less confusing:

A signal that is bandlimited is constrained in terms of how much detail it can convey in between discrete instants of time.

It doesn't matter to me which way this informal statement is expressed. If you interpret "how rapidly it changes in time" relative to the sample values nearby, then reading it as a derivative limit is about right. I'm not sure the idea of "detail" is any more correct. Dicklyon 17:27, 4 August 2006 (UTC)Reply

OK, this may be a problem related to how I parse the entire sentence. I am guessing now that "and therefore how much detail it can convey" is a subordinate clause. If yes, it makes sense but I would prefer to explicitly mark it as such, with commas, to make my parser put thing together in the right order. If not, I am still lost. --KYN 19:08, 5 August 2006 (UTC)Reply

2. DickLyon has introduced the idea that the sampling theorem does not have to be restricted to only signals for which the Fourier transform is well-defined. I have no idea if this is correct or not, but the statement is confusing from the point of view that the theorem is based on bandlimited signals. Is it even possible to talk about a bandlimited signal which does not have Fourier transform? Even if it is, should such generalization be considered already in the introduction?

--KYN 11:25, 4 August 2006 (UTC)

In my experience, engineers in information and communication theory deal mostly with statistical signals; "stationary" signals have a well-defined concept of power spectral density and band limit, but do not have Fourier transforms. Yet the sampling theorem, including reconstruction formula, certainly does apply to them. I just want to make sure they don't get left out, but they need not be mentioned unless the way a particular passage is written would otherwise exclude them by requiring the existence of the Fourier transform. Dicklyon 17:27, 4 August 2006 (UTC)Reply

Can you provide an example of a stationary signal which is bandlimited but does not have a Fourier transform? --KYN 19:48, 4 August 2006 (UTC)

Yes, since ALL stationary processes don't have Fourier transforms (unless you count the identically zero process), and SOME of these are bandlimited. A stationary process has a power that doesn't depend on when you observe it, so has infinite energy. A Fourier transform requires that the signal to be transformed have finite energy (can be extended with tricks to work for infinite-energy signals like dirac functions and combs, but that will not apply to typical random processes). A specific example: the first-order autoregressive process with autocorrelation function equal to exp(-|tau|), which has unit power and a spectrum that you'd get by putting white noise through a simple first-order RC filter with R*C = tau. If you take a sample function from this process and try to do an FT on it, the expected variance of the transformed value will be infinite at every frequency. But that one's not bandlimited. To get one that's bandlimited, you need an autocorrelation function whose Fourier transform is zero above the frequency B. So take such a spectral density, say a rect function, inverse transform it to get the autocorrelation function, and find a filter such that that is then the convolution of the impulse response of the filter with itself. The easiest such filter is the sinc filter. I.e. put white noise through a sinc filter and you have a bandlimited stationary process with no Fourier transform. In practice, just as with finite-energy signals, getting a signal to be exactly bandlimited is a mathematical near impossibility; this just means that the sampling theorem seldom holds exactly. Dicklyon 20:13, 4 August 2006 (UTC)
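The rect-spectrum example can be written out explicitly. The following is a sketch assuming the normalized sinc, a unit-power normalization, and the symbols S_x (power spectral density), R_x (autocorrelation) and a window length T_w, all introduced here for illustration.

```latex
S_x(f) = \frac{1}{2B}\,\mathrm{rect}\!\left(\frac{f}{2B}\right)
\quad\Longrightarrow\quad
R_x(\tau) = \int_{-B}^{B} \frac{1}{2B}\, e^{i 2\pi f \tau}\, df
          = \frac{\sin(2\pi B \tau)}{2\pi B \tau}
          = \mathrm{sinc}(2B\tau)

R_x(0) = 1,
\qquad
\mathrm{E}\!\left[\int_{-T_w/2}^{T_w/2} |x(t)|^2\, dt\right] = T_w\, R_x(0) = T_w
\;\longrightarrow\; \infty \quad (T_w \to \infty)
```

The spectral density is zero for |f| > B, so the process is bandlimited, while the expected energy grows without bound as the window widens, which is why a sample function is not square integrable.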

I am still not certain that when you say "don't have Fourier transforms" you mean that the transforms do not exist as proper functions but can be formally defined in terms of distributions, or if you mean that they cannot be defined at all.

No, they don't exist at all, in the sense that they are unbounded at essentially all frequencies, not just a discrete set of dirac functions. Signals that are stationary have infinite energy at any frequency where they have any power. Dicklyon 20:27, 5 August 2006 (UTC)Reply

In either case, I don't believe that it is necessary, already in the introduction, to introduce the reader to some of the generalizations which are possible. It is sufficient to say that x is a signal (which we have to assume that the reader has some familiarity with) but not to specify that it can be complex-valued (which is correct, but why not vector valued also?) or that it can be a stochastic process instead of a just a realization of such a process. I would prefer to see a section later on which lists these and other generalizations which are possible for the theorem, once all the important concepts are in place. --KYN 19:08, 5 August 2006 (UTC)Reply

Yes, good point. We shouldn't need to mention complex- or vector-valued alternatives until we need to say real-valued, which should not be in the intro. They should be an aside later when the domain of the proof is specified. Dicklyon 20:27, 5 August 2006 (UTC)

Some further points on the Introduction section:

3. I see that there are overlaps between sections "Introduction" and "The sampling process": statements about relations between sampling frequency and bandwidth occur in both as does the relation between sampling interval and sampling frequency. Also, "Introduction" makes use of the concept "alias-free sampling" without explaining what it means, while it is reasonably intuitively described in "The sampling process". Here is my point:

Can we delete the "Introduction" section entirely, possibly after moving a few non-redundant parts of it to "The sampling process" section? The latter section was written with the intent of being an introduction, saying "Hi and Welcome" to anyone who has never heard of the sampling theorem and wants to know what the fuss is about. I believe that it can serve as a good introduction, providing all the necessary pieces and statements with a minimum of math.

All the juicy math and other technicalities which appear to be so interesting to all you good people who are editing the article can then be developed in depth in the following sections.

I am not averse to combining those first two sections into one, but it needs to be done carefully to avoid making a long section with too much extraneous info. I think I agree that aliasing is an extraneous concept when first stating the sampling theorem precisely, since it's an effect that happens only in situations outside the conditions of the theorem; so leave it for later. Instead of alias-free sampling, talk about sampling with perfect reconstruction. Dicklyon 21:53, 5 August 2006 (UTC)

Mathematical basis changes

As I mentioned above, I made some changes to tighten up the text of the mathematical basis section a bit; that is, to make it a bit more concise, a bit more clear, and a bit more correct. Please comment if you see any changes that make it less clear or less correct.

Rbj has reverted all the changes, saying they don't help, and I've re-reverted them back. On looking at the history, I see that it was Rbj around April 30 who made this section so long and verbose, so he has perhaps too much personal attachment to the details. Personally, I find the much shorter versions before April 30 to be more appropriate to the article, in terms of length and level of coverage. Perhaps we should go back to one of those versions and tune it up, instead of trying to work back down from the verbose version?

Dicklyon 18:17, 4 August 2006 (UTC)

you've actually made it more verbose, except where you deleted real information that BobK put in, in lieu of that Note about scaling section you didn't like. we worked out a change that dealt with his concern and mine and you deleted that whole thing. it's like you're driving a huge dumptruck down the middle of the road, completely unaware of what other people are doing, to make this article exactly what you think it should be without doing the politics necessary to get consensus. you need to read Wikipedia:Why stable versions. what you are doing is "edit creep" which moves this article from something approximating a good article to something unrecognizable. r b-j 04:57, 5 August 2006 (UTC)

Would you be so kind as to point out the edits you are referring to? Dicklyon 05:01, 5 August 2006 (UTC)

Anybody? Opinions have been solicited here, and nobody will engage, not even the guy who calls me a dumptruck. Dicklyon 05:28, 5 August 2006 (UTC)

Well now Rbj has reported me for violating the three reversions rule, so maybe I'll have to let him drive it off in a new direction for a while. Still seeking opinions on whether the long mathy section can't be made more clear, correct, and even eventually concise. Dicklyon 05:57, 5 August 2006 (UTC)

Since nobody's talking at this hour, let me review the compromise proposal I made a while back, splitting the difference with two of my four sequential edits. Here's the latest diff relative to where Rbj left the page: [13]. Note that the final paragraph has been deleted, because it had little or nothing to do with the "mathematical basis" section, and the other edits are all minor: clarifications, grammar, etc. For example, using /fs instead of *T in a subsection that had no locally apparent T. And I pointed out that it all works for complex-valued signals, too, since it's all written in complex math that applies equally there (if I'm not right on this, please say so). Anyone think this corresponds to driving a dumptruck over the work of others? Please comment. Dicklyon 06:23, 5 August 2006 (UTC)

Seeing no objection, I will proceed to resurrect the next step of my changes. Please comment. There's one more after that. Dicklyon 18:17, 5 August 2006 (UTC)

Now I've done the last of the four that Rbj didn't like. I'm still waiting for any reactions, one way or the other, to the content of these changes. This one is about conditions on the H(f) reconstruction filter, and that leads to the constraint that relates sample rate to bandwidth. The way it was, the notion of "overlap" was used, without an explanation of why it might matter; and it concluded or stated "the reconstruction filter H(f) must be:..." followed by a function that is not actually required. By stating the actual constraint on what H(f) needs to do, we get a looser constraint on the form of H(f), and a logical way to conclude what relationship must be satisfied about the sample rate and bandwidth, without appeal to the notion of "overlap". The next step should be to prune the long-winded start of this section, and maybe appeal to an external derivation of the sinc result, as was suggested by others above. Dicklyon 21:42, 5 August 2006 (UTC)
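For readers following the argument, the constraint on H(f) might be written out roughly as below. This is a sketch under assumed conventions that are not spelled out in the comment itself: ideal impulse sampling at rate f_s, a signal spectrum X(f) confined to |f| ≤ B, and the 1/f_s scaling mentioned earlier in this thread. The symbol X_s(f) for the spectrum of the sampled signal is introduced here only for illustration.

```latex
% Assumed convention: ideal impulse sampling at rate f_s of a signal bandlimited to B.
\[
  X_s(f) = f_s \sum_{k=-\infty}^{\infty} X(f - k f_s),
  \qquad X(f) = 0 \ \text{for } |f| > B .
\]

% Requirement: H(f) X_s(f) = X(f) for every such X. It is enough that
\[
  H(f) =
  \begin{cases}
    1/f_s, & |f| \le B, \\
    0, & |f - k f_s| \le B \ \text{for some integer } k \ne 0, \\
    \text{arbitrary}, & \text{otherwise.}
  \end{cases}
\]

% The first two cases can both hold only if the bands do not intersect:
\[
  B < f_s - B \iff f_s > 2B .
\]
```

Under these assumptions the ideal rectangular (sinc) filter is one admissible H(f), not the only one, and the sample-rate condition falls out of requiring the two constrained bands to be disjoint.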

OK, one more major shortening by not belaboring the derivation of the reconstruction part, and not repeating and summarizing a bunch of stuff that's well stated elsewhere in the article. This is change set 5. I took out the distinction between a sampling half and a reconstruction half, because I didn't think the theorem really had two such halves. I kept the parts, but characterized more explicitly what they are. This saves 1 KB or so, according to the too-long-article complaint. Dicklyon 22:36, 5 August 2006 (UTC)

Concise summary section

I finally figured out what the "concise summary" section was about. It wasn't clear that it was a summary of the preceding section on mathematical proof, since it didn't say so. Having worked on that section now, the relationship became clear, so I changed the title to refer to it (or close, anyway). I changed some wording too, as it erroneously said the comb could have an FT; that may have been my own mistake, but now it's fixed. Dicklyon 23:15, 5 August 2006 (UTC)