Content deleted Content added
add a figure and more discussion of efficiency |
mention rotating the h[n] buffer |
||
Line 1:
{{hatnote|This article uses common abstract notations, such as <math display="inline">y(t) = x(t) * h(t),</math> or <math display="inline">y(t) = \mathcal{H}\{x(t)\},</math> in which it is understood that the functions should be thought of in their totality, rather than at specific instants <math display="inline">t.</math> (see [[Convolution#Notation]])}}
'''Overlap–save''' is the traditional name for an efficient way to evaluate the [[Convolution#Discrete convolution|discrete convolution]] between a very long signal <math>x[n]</math> and a [[finite impulse response]] (FIR) filter <math>h[n]</math>''':'''▼
where h[m]=0 for m outside the region [1, ''M''].▼
▲In [[signal processing]], '''''Overlap–save''''' is the traditional name for an efficient way to evaluate the [[Convolution#Discrete convolution|discrete convolution]] between a very long signal <math>x[n]</math> and a [[finite impulse response]] (FIR) filter <math>h[n]</math>''':'''
[[Image:Overlap-save algorithm.svg|thumb|500px|Fig 1: A sequence of 4 plots depicts one cycle of the overlap–save convolution algorithm. The 1st plot is a long sequence of data to be processed with a lowpass FIR filter. The 2nd plot is one segment of the data to be processed in piecewise fashion. The 3rd plot is the filtered segment, with the usable portion colored red. The 4th plot shows the filtered segment appended to the output stream.{{efn-ua
|Rabiner and Gold, Fig 2.35, fourth trace
}} The FIR filter is a boxcar lowpass with M=16 samples, the length of the segments is L=100 samples and the overlap is 15 samples.]]
{{NumBlk|:|<math>
y[n] = x[n] * h[n]
\ \triangleq\ \sum_{m=-\infty}^{\infty} h[m] \cdot x[n - m]
= \sum_{m=1}^{M} h[m] \cdot x[n - m],
</math>|{{EquationRef|Eq.1}}}}
▲where {{nowrap|''h''[''m''] {{=}} 0}} for ''m'' outside the region {{nowrap|[1, ''M'']}}.
The concept is to compute short segments of ''y''[''n''] of an arbitrary length ''L'', and concatenate the segments together. Consider a segment that begins at ''n'' = ''kL'' + ''M'', for any integer ''k'', and define''':'''
:<math>x_k[n] \ \triangleq
\begin{cases}
x[n+kL], & 1 \le n \le L+M-1\\
0, & \textrm{otherwise}.
\end{cases}
</math>
:<math>y_k[n] \ \triangleq \ x_k[n]*h[n]
Then, for ''kL'' + ''M'' ≤ ''n'' ≤ ''kL'' + ''L'' + ''M'' − 1, and equivalently '''''M''''' ≤ ''n'' − ''kL'' ≤ '''''L'' + ''M'' − 1''', we can write''':'''
Line 23 ⟶ 29:
:<math>y[n] = \sum_{m=1}^{M} h[m] \cdot x_k[n-kL-m] \ \ \triangleq \ \ y_k[n-kL].</math>
With the substitution {{math|j ≜ n-kL}}, the task is reduced to computing {{math|y{{sub|k}}(j)}}, for '''''M''''' ≤ {{mvar|j}} ≤ '''''L'' +'' M'' − 1'''.
:<math>x_{k,N}[n] \ \triangleq \ \sum_{\ell=-\infty}^{\infty} x_k[n - \ell N],</math>
the convolutions <math>(x_{k,N})*h\,</math> and <math>x_k*h\,</math> are equivalent in the region ''M'' ≤ ''n'' ≤ ''L'' + ''M'' − 1.
{{NumBlk|:|<math>y_k[n]\ =\ \scriptstyle \text{
where''':'''
*DFT<sub>N</sub> and
*{{math|L}} is customarily chosen such that {{math|N {{=}} L+M-1}} is an integer power-of-2, and the transforms are implemented with the [[Fast Fourier transform|FFT]] algorithm, for efficiency.
*As shown in
|Not to be confused with the [[Overlap-add method]], which preserves separate leading and trailing edge-effects.
}} (and subsequently discarded). In other words, the first output value is a weighted average of the <u>last</u> M-1 samples of the input segment (and the first sample of the segment). The next M-2 outputs are weighted averages of both the beginning and the end of the segment. The M<sup>th</sup> output value is the first one that combines only samples from the beginning of the segment.
**The overlap region can also be shifted to the last M-1 outputs, which is a potential run-time convenience, because the IDFT can be computed in the <math>y[n]</math> buffer, instead of being computed and copied. Then the overlap can be overwritten by the next IDFT. The shift is accomplished by replacing <math>\scriptstyle \text{DFT}_N \displaystyle (h[n])</math> with <math>\scriptstyle \text{DFT}_N \displaystyle (h[n+M-1]) =\ \scriptstyle \text{DFT}_N \displaystyle (h[n+M-1-N]),</math> meaning that the N-length buffer is ''circularly-shifted'' (rotated) by M-1 samples.
==Pseudocode==
Line 59 ⟶ 66:
[[Image:FFT_size_vs_filter_length_for_Overlap-add_convolution.svg|thumb|400px|Fig 2: A graph of the values of N (an integer power of 2) that minimize the cost function <math>\tfrac{N\left(\log_2 N + 1\right)}{N - M + 1}</math>]]
When the DFT and
|1=Cooley–Tukey FFT algorithm for N=2<sup>k</sup> needs (N/2) log<sub>2</sub>(N) – see [[Fast Fourier transform#Definition and speed|FFT – Definition and speed]]
}} Each iteration produces {{nowrap|'''N-M+1'''}} output samples, so the number of complex multiplications per output sample is about''':'''
|