Artoria2e5 (talk | contribs) →The spectrogram as a time-frequency representation: ref improving mentions all of the above |
Artoria2e5 (talk | contribs) Milked most of the footnotes to their worth; now it's refimprove again. |
{{
[[Image:Reassigned spectrogral surface of bass pluck.png|thumb|400px|
Reassigned spectral surface for the onset of an acoustic bass tone having a sharp pluck and a fundamental frequency of approximately 73.4 Hz. Sharp spectral ridges representing the harmonics are evident, as is the abrupt onset of the tone. The spectrogram was computed using a 65.7 ms Kaiser window with a shaping parameter of 12.]]
The '''method of reassignment''' is a technique for sharpening a [[time-frequency representation]] (e.g. [[spectrogram]] or the [[short-time Fourier transform]]) by mapping the data to time-frequency coordinates that are nearer to the true [[Support (mathematics)|region of support]] of the analyzed signal. The method has been independently introduced by several parties under various names, including ''method of reassignment'', ''remapping'', ''time-frequency reassignment'', and ''modified moving-window method''.<ref name="hainsworth">{{Cite thesis |type=PhD |chapter=Chapter 3: Reassignment methods |title=Techniques for the Automated Analysis of Musical Audio |last=Hainsworth |first=Stephen |year=2003 |publisher=University of Cambridge |citeseerx=10.1.1.5.9579 }}</ref> The method of reassignment sharpens blurry time-frequency data by relocating the data according to local estimates of instantaneous frequency and group delay. This mapping to reassigned time-frequency coordinates is very precise for signals that are separable in time and frequency with respect to the analysis window.
== Introduction ==
Many signals of interest have a distribution of energy that varies in time and frequency. For example, any sound signal having a beginning or an end has an energy distribution that varies in time, and most sounds exhibit considerable variation in both time and frequency over their duration. Time-frequency representations are commonly used to analyze or characterize such signals. They map the one-dimensional time-domain signal into a two-dimensional function of time and frequency. A time-frequency representation describes the variation of spectral energy distribution over time, much as a musical score describes the variation of musical pitch over time.
In audio signal analysis, the spectrogram is the most commonly used time-frequency representation, probably because it is well understood and immune to the so-called "cross-terms" that sometimes make other time-frequency representations difficult to interpret. However, the windowing operation required in spectrogram computation introduces an inherent tradeoff between time resolution and frequency resolution: a short window localizes events in time but smears them in frequency, and a long window does the opposite. Spectrograms therefore provide a time-frequency representation that is blurred in time, in frequency, or in both dimensions. The method of time-frequency reassignment is a technique for refocusing time-frequency data in a blurred representation like the spectrogram by mapping the data to time-frequency coordinates that are nearer to the true region of support of the analyzed signal.<ref name="improving" />
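The resolution tradeoff can be illustrated numerically. The following Python sketch is only an illustration: it assumes a Hann window (chosen for simplicity) and measures the half-power width of the window's main lobe by brute-force evaluation of its discrete-time Fourier transform. The function names and the scan step are hypothetical choices made for this example.

```python
import cmath
import math


def hann_window(length):
    """Symmetric Hann window with strictly positive samples."""
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * (n + 0.5) / length)
            for n in range(length)]


def dtft_magnitude(window, omega):
    """Magnitude of the window's discrete-time Fourier transform at omega (rad/sample)."""
    return abs(sum(h * cmath.exp(-1j * omega * n) for n, h in enumerate(window)))


def half_power_width(window, step=1e-4):
    """Full width (rad/sample) of the main lobe at half of the peak power."""
    peak = dtft_magnitude(window, 0.0)
    omega = 0.0
    # Scan upward from the peak until the magnitude falls below peak / sqrt(2).
    while dtft_magnitude(window, omega) > peak / math.sqrt(2.0):
        omega += step
    return 2.0 * omega


short_width = half_power_width(hann_window(64))
long_width = half_power_width(hann_window(128))
print(f"64-sample window:  {short_width:.4f} rad/sample")
print(f"128-sample window: {long_width:.4f} rad/sample")
```

Doubling the window length roughly halves the main-lobe width, so finer frequency resolution is obtained only by analyzing a longer, less time-localized stretch of the signal.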
== The spectrogram as a time-frequency representation ==
Line 36 ⟶ 35:
== The method of reassignment ==
Pioneering work on the method of reassignment was published by Kodera, Gendrin, and de Villedary under the name of ''Modified Moving Window Method''.<ref name=Kodera>{{cite journal |author1=K. Kodera |author2=R. Gendrin |author3=C. de Villedary |name-list-style=amp |date=Feb 1978 |title=Analysis of time-varying signals with small BT values |journal=IEEE Transactions on Acoustics, Speech, and Signal Processing |volume=26 |issue=1 |pages=64–76 |doi=10.1109/TASSP.1978.1163047 }}</ref> Their technique enhances the resolution in time and frequency of the classical Moving Window Method (equivalent to the spectrogram) by assigning to each data point a new time-frequency coordinate that better reflects the distribution of energy in the analyzed signal.<ref name=Kodera/>{{rp|67}}
In the classical moving window method, a time-domain signal, <math>x(t)</math>, is decomposed into a set of coefficients, <math>\epsilon( t, \omega )</math>, based on a set of elementary signals, <math>h_{\omega}(t)</math>, defined as<ref name=Kodera/>{{rp|73}}<!-- far from the same notation as Kodera p73, but the same thing. -->
:<math>h_{\omega}(t) = h(t) e^{j \omega t} </math>
Line 63 ⟶ 62:
\end{align}</math>
For signals having magnitude spectra, <math>M(t,\omega)</math>, whose time variation is slow relative to the phase variation, the maximum contribution to the reconstruction integral comes from the vicinity of the point <math>t,\omega</math> satisfying the phase stationarity condition<ref name=Kodera/>{{rp|74}}
:<math>\begin{align}
Line 70 ⟶ 69:
\end{align}</math>
or equivalently, around the point <math>\hat{t}, \hat{\omega}</math> defined by<ref name=Kodera/>{{rp|74}}
:<math>\begin{align}
Line 77 ⟶ 76:
\end{align}</math>
This phenomenon is known in such fields as optics as the [[stationary phase approximation|principle of stationary phase]], which states that for periodic or quasi-periodic signals, the variation of the Fourier phase spectrum not attributable to periodic oscillation is slow with respect to time in the vicinity of the frequency of oscillation, and in surrounding regions the variation is relatively rapid. Analogously, for impulsive signals, which are concentrated in time, the variation of the phase spectrum is slow with respect to frequency near the time of the impulse, and in surrounding regions the variation is relatively rapid.<ref name=Kodera/>{{rp|73}}
In reconstruction, positive and negative contributions to the synthesized waveform cancel, due to destructive interference, in frequency regions of rapid phase variation. Only regions of slow phase variation (stationary phase) will contribute significantly to the reconstruction, and the maximum contribution (center of gravity) occurs at the point where the phase is changing most slowly with respect to time and frequency.<ref name=Kodera/>{{rp|71}}
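The cancellation argument can be checked numerically. The following Python sketch is a toy illustration, not the reconstruction integral itself: it assumes a purely quadratic phase with a hypothetical curvature of 200, so that the phase is stationary at zero, and integrates a unit-magnitude phasor over two intervals of equal width, one centered on the stationary point and one far from it.

```python
import cmath


def phasor_integral(phase, lo, hi, steps=20000):
    """Midpoint-rule approximation of the integral of exp(j * phase(w)) over [lo, hi]."""
    dw = (hi - lo) / steps
    return sum(cmath.exp(1j * phase(lo + (i + 0.5) * dw)) for i in range(steps)) * dw


CURVATURE = 200.0  # hypothetical value, chosen only to make the effect visible


def quadratic_phase(w):
    """Toy phase function whose derivative vanishes (phase is stationary) at w = 0."""
    return CURVATURE * w * w


# Same interval width in both cases; only the rate of phase variation differs.
near = abs(phasor_integral(quadratic_phase, -0.1, 0.1))
far = abs(phasor_integral(quadratic_phase, 0.9, 1.1))
print(f"near stationary point: {near:.4f}")
print(f"far from it:           {far:.4f}")
```

The interval around the stationary point contributes far more than the interval where the phase rotates rapidly, because in the latter the phasors point in all directions and largely cancel.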
The time-frequency coordinates thus computed are equal to the local group delay, <math>\hat{t}_{g}(t,\omega),</math> and local instantaneous frequency, <math>\hat{\omega}_{i}(t,\omega),</math> and are computed from the phase of the short-time Fourier transform, which is normally ignored when constructing the spectrogram. These quantities are ''local'' in the sense that they represent a windowed and filtered signal that is localized in time and frequency, and are not global properties of the signal under analysis.<ref name=Kodera/>{{rp|70}}
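As a minimal sketch of how these local quantities can be read from the phase, the following Python code estimates both by finite differences of the short-time Fourier transform phase. It assumes a particular convention that may differ from the notation used in this article: the transform is taken as the sum over <math>\tau</math> of <math>x(\tau)\,h(\tau - t)\,e^{-j\omega\tau}</math>, with phase referenced to absolute time, so that the local instantaneous frequency is <math>\omega</math> plus the time derivative of the phase and the local group delay is the negative frequency derivative of the phase. All function names, the Hann window, and the step sizes are illustrative assumptions.

```python
import cmath
import math


def hann(half_width):
    """Symmetric Hann window as a dict mapping offset u -> h(u)."""
    return {u: 0.5 + 0.5 * math.cos(math.pi * u / (half_width + 1))
            for u in range(-half_width, half_width + 1)}


def stft_point(x, t, omega, window):
    """Assumed convention: X(t, omega) = sum_tau x[tau] h(tau - t) exp(-j omega tau)."""
    total = 0j
    for u, h in window.items():
        tau = t + u
        if 0 <= tau < len(x):
            total += x[tau] * h * cmath.exp(-1j * omega * tau)
    return total


def local_instantaneous_frequency(x, t, omega, window):
    """omega + d(phase)/dt, with the derivative taken as a one-sample phase difference."""
    a = stft_point(x, t, omega, window)
    b = stft_point(x, t + 1, omega, window)
    return omega + cmath.phase(b * a.conjugate())


def local_group_delay(x, t, omega, window, d_omega=1e-3):
    """-d(phase)/d(omega), with the derivative taken as a centered phase difference."""
    a = stft_point(x, t, omega - 0.5 * d_omega, window)
    b = stft_point(x, t, omega + 0.5 * d_omega, window)
    return -cmath.phase(b * a.conjugate()) / d_omega


window = hann(32)

# Complex tone at 0.9 rad/sample, analyzed at a mistuned frequency of 0.95.
tone = [cmath.exp(1j * 0.9 * n) for n in range(256)]
omega_hat = local_instantaneous_frequency(tone, 128, 0.95, window)

# Unit impulse at sample 100, analyzed with the window centered at sample 90.
click = [0.0] * 256
click[100] = 1.0
t_hat = local_group_delay(click, 90, 0.95, window)

print(omega_hat, t_hat)
```

For these idealized inputs the estimates recover the tone frequency (0.9 rad/sample) and the impulse time (sample 100) to within rounding error, even though the analysis point is offset from both. Phase differences are taken through a product with the complex conjugate so that the result is wrapped into the principal interval without explicit unwrapping.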
The modified moving window method, or method of reassignment, changes (reassigns) the point of attribution of <math>\epsilon(t,\omega)</math> to this point of maximum contribution <math>\hat{t}(t,\omega), \hat{\omega}(t,\omega)</math>, rather than to the point <math>t,\omega</math> at which it is computed. This point is sometimes called the ''center of gravity'' of the distribution, by way of analogy to a mass distribution. This analogy is a useful reminder that the attribution of spectral energy to the center of gravity of its distribution only makes sense when there is energy to attribute, so the method of reassignment has no meaning at points where the spectrogram is zero-valued.<ref name="improving" />
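The reattribution step can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not a reference implementation: energy is accumulated at the grid cell nearest to each reassigned coordinate (other display strategies are possible), the function names and the energy floor are hypothetical, and the reassigned coordinates are made-up values rather than the output of a real analysis.

```python
import math


def nearest_index(grid, value):
    """Index of the grid point closest to value."""
    return min(range(len(grid)), key=lambda i: abs(grid[i] - value))


def reassign(energy, t_hat, omega_hat, times, freqs, floor=1e-12):
    """Accumulate each cell's energy at the grid cell nearest its reassigned coordinates."""
    out = [[0.0] * len(freqs) for _ in times]
    for i in range(len(times)):
        for k in range(len(freqs)):
            e = energy[i][k]
            if e <= floor:
                # No energy to attribute: the reassigned coordinates are meaningless here.
                continue
            ti = nearest_index(times, t_hat[i][k])
            wi = nearest_index(freqs, omega_hat[i][k])
            out[ti][wi] += e
    return out


times = [0.0, 1.0, 2.0]
freqs = [0.0, 0.5, 1.0]
nan = math.nan
energy = [[0.0, 1.0, 0.0],
          [0.0, 2.0, 0.5],
          [0.0, 0.0, 0.0]]
# Hypothetical reassigned coordinates; NaN marks cells with no energy.
t_hat = [[nan, 1.1, nan],
         [nan, 0.9, 1.2],
         [nan, nan, nan]]
omega_hat = [[nan, 0.52, nan],
             [nan, 0.49, 0.55],
             [nan, nan, nan]]

sharpened = reassign(energy, t_hat, omega_hat, times, freqs)
for row in sharpened:
    print(row)
```

In this toy grid the three energetic cells all point to the same neighborhood, so their energy collapses into a single cell while the total energy is unchanged. Cells at or below the floor are skipped before their coordinates are read, which reflects the fact that reassignment is undefined where the spectrogram is zero.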
== Efficient computation of reassigned times and frequencies ==