Bruun's FFT algorithm: Difference between revisions

Content deleted Content added
No edit summary
References: | Altered template type. Add: isbn, chapter, title. | Use this tool. Report bugs. | #UCB_Gadget
 
(6 intermediate revisions by 3 users not shown)
Line 22:
== Recursive factorizations and FFTs ==
 
In order to compute the DFT, we need to evaluate the remainder of <math>x(z)</math> modulo ''N'' degree-1 polynomials as described above. Evaluating these remainders one by one is equivalent to the evaluating the usual DFT formula directly, and requires O(''N''<sup>2</sup>) operations. However, one can ''combine'' these remainders recursively to reduce the cost, using the following trick: if we want to evaluate <math>x(z)</math> modulo two polynomials <math>U(z)</math> and <math>V(z)</math>, we can first take the remainder modulo their product <math>U(z)</math> <math>V(z)</math>, which reduces the [[Degree of a polynomial|degree]] of the polynomial <math>x(z)</math> and makes subsequent modulo operations less computationally expensive.
 
The product of all of the monomials <math>(z - \omega_N^k)</math> for ''k''=0..''N''-1 is simply <math>z^N-1</math> (whose roots are clearly the ''N'' roots of unity). One then wishes to find a recursive factorization of <math>z^N-1</math> into polynomials of few terms and smaller and smaller degree. To compute the DFT, one takes <math>x(z)</math> modulo each level of this factorization in turn, recursively, until one arrives at the monomials and the final result. If each level of the factorization splits every polynomial into an O(1) (constant-bounded) number of smaller polynomials, each with an O(1) number of nonzero coefficients, then the modulo operations for that level take O(''N'') time; since there will be a logarithmic number of levels, the overall complexity is O (''N'' log ''N'').
 
More explicitly, suppose for example that <math>z^N-1 = F_1(z) F_2(z) F_3(z)</math>, and that <math>F_k(z) = F_{k,1}(z) F_{k,2}(z)</math>, and so on. The corresponding FFT algorithm would consist of first computing ''x''<sub>''k''</sub>(''z'') = ''x''(''z'') mod
''F''<sub>''k''</sub>(''z''), then computing ''x''<sub>''k'',''j''</sub>(''z'') = ''x''<sub>''k''</sub>(''z'') mod
''F''<sub>''k'',''j''</sub>(''z''), and so on, recursively creating more and more remainder polynomials of smaller and smaller degree until one arrives at the final degree-0 results.
Line 34:
===Cooley–Tukey as polynomial factorization===
 
The standard decimation-in-frequency (DIF) radix-''r'' Cooley–Tukey algorithm corresponds closely to a recursive factorization. For example, radix-2 DIF Cooley–Tukey factors <math>z^N-1</math> into <math>F_1 = (z^{N/2}-1)</math> and <math>F_2 = (z^{N/2}+1)</math>. These modulo operations reduce the degree of <math>x(z)</math> by 2, which corresponds to dividing the problem size by 2. Instead of recursively factorizing <math>F_2</math> directly, though, Cooley–Tukey instead first computes ''x''<sub>2</sub>(''z'' ω<sub>''N''</sub>), shifting all the roots (by a ''twiddle factor'') so that it can apply the recursive factorization of <math>F_1</math> to both subproblems. That is, Cooley–Tukey ensures that all subproblems are also DFTs, whereas this is not generally true for an arbitrary recursive factorization (such as Bruun's, below).
 
== The Bruun factorization ==
Line 51:
\end{align}</math>
 
By the construction of the factorization of ''z''<sup>''2''<sup>''n''</sup></sup>-''1'', the polynomials ''p''<sub>''s'',''m''</sub>(''z'') each encode 2<sup>''n''-''s''</sup> values
<math display="block">X_k=p(e^{2\pi i\tfrac{k}{2^n}})</math>
of the Fourier transform, for ''m''=0, the covered indices are ''k''=''0'', 2<sup>''k''</sup>, 2∙2<sup>''s''</sup>, 3∙2<sup>''s''</sup>,..., (2<sup>''n''-''s''</sup>-1)∙2<sup>''s''</sup>, for ''m''>''0'' the covered indices are ''k''=''m'', 2<sup>''s''+1</sup>-''m'', 2<sup>''s''+1</sup>+''m'', 2∙2<sup>''s''+1</sup>-''m'', 2∙2<sup>''s''+1</sup>+''m'', ..., 2<sup>''n''</sup>-''m''.
 
During the transition to the next stage, the polynomial <math>p_{s,\ell}(z)</math> is reduced to the polynomials <math>p_{s+1,\ell}(z)</math> and <math>p_{s+1,2^s-\ell}(z)</math> via polynomial division. If one wants to keep the polynomials in increasing index order, this pattern requires an implementation with two arrays. An implementation in place produces a predictable, but highly unordered sequence of indices, for example for ''N''=16 the final order of the 8 linear remainders is (0, 4, 2, 6, 1, 7, 3, 5).
Line 61:
At each recursive stage, all of the polynomials of the common degree {{math|4''M''-1}} are reduced to two parts of half the degree {{math|2''M''-1}}. The divisor of this polynomial remainder computation is a quadratic polynomial ''z''<sup>''m''</sup>, so that all reductions can be reduced to polynomial divisions of cubic by quadratic polynomials. There are {{math|1=''N''/2 = 2<sup>''n''−1</sup>}} of these small divisions at each stage, leading to an {{math|''O''(''N'' log ''N'')}} algorithm for the FFT.
 
Moreover, since all of these polynomials have purely real coefficients (until the very last stage), they automatically exploit the special case where the inputs ''x''<sub>''n''</sub> are purely real to save roughly a factor of two in computation and storage. One can also take straightforward advantage of the case of real-symmetric data for computing the [[discrete cosine transform]] {{harv|Chen|Sorensen|1992}}.
 
=== Generalization to arbitrary radices ===
 
The Bruun factorization, and thus the Bruun FFT algorithm, was generalized to handle arbitrary ''even'' composite lengths, i.e. dividing the polynomial degree by an arbitrary ''radix'' (factor), as follows. First, we define a set of polynomials {{math|''φ''<sub>''N'',''α''</sub>(''z'')}} for positive integers {{mvar|N}} and for {{mvar|α}} in {{closed-open|0, 1}} by:
<math display="block">\phi_{N, \alpha}(z) = \begin{cases}
z^{2N} - 2 \cos (2 \pi \alpha) z^N + 1 & \text{if } 0 < \alpha < 1 \\ \\
z^{2N} - 1 & \text{if } \alpha = 0
\end{cases} </math>
 
Line 74:
 
<math display="block">\phi_{rM, \alpha}(z) = \begin{cases}
\prod_{\ell=0}^{r-1} \phi_{M,(\alpha+\ell)/r} & \text{if } 0 < \alpha \leq 0.5 \\ \\
\prod_{\ell=0}^{r-1} \phi_{M,(1-\alpha+\ell)/r} & \text{if } 0.5 < \alpha < 1 \\ \\
\prod_{\ell=0}^{r-1} \phi_{M,\ell/(2r)} & \text{if } \alpha = 0
\end{cases} </math>
 
==References==
{{refbegin}}
* {{cite journal | first = Georg | last = Bruun | title = ''z''-Transform DFT filters and FFTs | journal = [[IEEE]] Trans.Transactions on Acoustics, Speech, and Signal Processing (ASSP) | volume = 26 | issue = 1 | pages = 56-6356–63 | year = 1978 | doi = 10.1109/TASSP.1978.1163036 | url = https://backend.orbit.dtu.dk/ws/files/4658740/Bruun.pdf }}
* {{cite book | first = H. J. | last = Nussbaumer | title = Fast Fourier Transform and Convolution Algorithms | series = Springer Series in Information Sciences | publisher = Springer-Verlag | ___location = Berlin | year = 1990 | volume = 2 |doi=10.1007/978-3-642-81897-4| isbn = 978-3-540-11825-1 }}
* {{cite journal | first = Yuhang | last = Wu | title = New FFT structures based on the Bruun algorithm | journal = IEEE Trans.Transactions on Acoustics, Speech, and ASSPSignal Processing| volume = 38 | issue = 1 | pages = 188-191188–191 | year = 1990 | doi = 10.1109/29.45572 | url = https://backend.orbit.dtu.dk/ws/files/4601557/Wu.pdf }}
* {{cite journalbook | firstfirst1 = Jianping | last1 = Chen | first2 = Henrik | last2 = Sorensen | title = &#91;Proceedings&#93; ICASSP-92: 1992 IEEE International Conference on Acoustics, Speech, and Signal Processing | chapter = An efficient FFT algorithm for real-symmetric data | journalvolume = Proc. ICASSP5 | volumepages = 517–20 | pagesyear = 17-201992 |doi=10.1109/ICASSP.1992.226669| yearisbn = 19920-7803-0532-9 }}
* {{cite journal | first = Rainer | last = Storn | title = Some results in fixed point error analysis of the Bruun-FTT {{sic}} algorithm | journal = IEEE Trans.Transactions on Signal Process. Processing| volume = 41 | issue = 7 | pages = 2371-23752371–2375 | year = 1993 | doi = 10.1109/78.224246 | bibcode = 1993ITSP...41.2371S }}
* {{cite journal | first = Hideo | last = Murakami | title = Real-valued decimation-in-time and decimation-in-frequency algorithms | journal = IEEE Trans.Transactions on Circuits Syst.and Systems II: Analog and Digital Sig. Proc.Signal Processing| volume = 41 | issue = 12 | pages = 808-816808–816 | year = 1994| doi = 10.1109/82.338622 }}
* {{cite journalbook | first = Hideo | last = Murakami | title = 1996 IEEE International Conference on Acoustics, Speech, and Signal Processing Conference Proceedings | chapter = Real-valued fast discrete Fourier transform and cyclic convolution algorithms of highly composite even length | journalvolume = Proc. [[ICASSP]]3 | volumepages = 31311–1314 | pagesyear = 1311-13141996 |doi=10.1109/ICASSP.1996.543667| yearisbn = 19960-7803-3192-3 }}
* Shashank{{cite Mittal,book Md|doi=10.1007/978-3-540-73625-7_39 Zafar Ali Khan, M. B. Srinivas, "|chapter=A Comparative Study of Different FFT Architectures for Software Defined Radio", ''Lecture Notes in Computer Science'' '''4599''' (''|title=Embedded Computer Systems: Architectures, Modeling, and Simulation''), 375-384|series=Lecture Notes in Computer (Science |date=2007). |last1=Mittal Proc.|first1=Shashank 7th|last2=Khan Intl|first2=Md. Workshop,Zafar SAMOSAli 2007|last3=Srinivas (Samos,|first3=M. Greece,B. July|volume=4599 16–19,|pages=375–384 2007).|isbn=978-3-540-73622-6 }}
{{refend}}