Rader's FFT algorithm

Rader's FFT algorithm is a Fast Fourier Transform (FFT) algorithm that computes the discrete Fourier transform (DFT) of prime sizes by re-expressing the DFT as a cyclic convolution. (The other algorithm for FFTs of prime sizes, Bluestein's algorithm, also works by rewriting the DFT as a convolution.)

Recall that the DFT is defined by the formula

f_{j}={\frac {1}{n}}\sum _{k=0}^{n-1}x_{k}e^{-{\frac {2\pi i}{n}}jk}\qquad j=0,\dots ,n-1.

If n is a prime number, then the set of non-zero indices k = 1,...,n-1 forms a group under multiplication modulo n. One consequence of this is that there exists a generator of the group, an integer g such that k = g^q (mod n) for all non-zero k and for some q in 0,...,n-2. Similarly j = g^-p (mod n) for all non-zero j and for some p in 0,...,n-2, where the negative exponent denotes the multiplicative inverse of g^p modulo n. That means that we can rewrite the DFT using these new indices p and q as:

f_{0}={\frac {1}{n}}\sum _{k=0}^{n-1}x_{k},

f_{g^{-p}}={\frac {x_{0}}{n}}+{\frac {1}{n}}\sum _{q=0}^{n-2}x_{g^{q}}e^{-{\frac {2\pi i}{n}}g^{-(p-q)}}\qquad p=0,\dots ,n-2.

(Recall that x_k and f_j are implicitly periodic in n, and also that e^2πi=1. Thus, all indices and exponents are taken modulo n as required by the group arithmetic.)

The final summation, above, is precisely a cyclic convolution of the two sequences a_q and b_q of length n-1 (q = 0,...,n-2) defined by:

a_{q}=x_{g^{q}}

b_{q}=e^{-{\frac {2\pi i}{n}}g^{-q}}.

Since n-1 is composite, this convolution can be performed directly via the convolution theorem and more conventional FFT algorithms. However, this may not be efficient if n-1 itself has large prime factors, requiring recursive use of Rader's algorithm. Instead, one can compute a cyclic convolution exactly by zero-padding it into a linear convolution of at least twice the length, say to a power of two, which can then be evaluated in O(n log n) time without the recursive application of Rader's algorithm.

This algorithm, then, requires O(n) additions plus O(n log n) time for the convolution. In practice, the O(n) additions can often be performed in O(1) additions by absorbing the additions into the convolution: if the convolution is performed by a pair of FFTs, then the sum of x_k is given by the DC (0th) output of the FFT of a_q, and x₀ can be added to all the outputs by adding it to the DC term of the convolution prior to to the inverse FFT. Still, this algorithm requires intrinsically more operations than FFTs of nearby composite sizes, and typically takes 3-10 times as long in practice.

If Rader's algorithm is performed by using FFTs of size n-1 to compute the convolution, rather than by zero padding as mentioned above, the efficiency depends strongly upon n and the number of times that Rader's algorithm must be applied recursively. The worst case would be if n-1 were 2n₂ where n₂ is prime, with n₂-1 = 2n₃ where n₃ is prime, and so on. In such a case, the application of Rader's algorithm would actually require O(n²) time. Such n_j are called Sophie Germain primes, and the sequence of them is called a Cunningham chain. The length of Cunningham chains, however, grows more slowly than log₂(n), so Rader's algorithm applied in this way is not O(n²), but it is probably worse than O(n log n) for the worst cases. Fortunately, a guarantee of O(n log n) complexity can be achieved by using zero padding.

References:

C. M. Rader, "Discrete Fourier transforms when the number of data samples is prime," Proc. IEEE 56, 1107-1108 (1968).