FFTW FAQ - Section 4
Internals of FFTW

Q4.1. How does FFTW work?
Q4.2. Why is FFTW so fast?

Question 4.1. How does FFTW work?

The innovation (if it can be so called) in FFTW is a variety of composable solvers, each representing a different FFT algorithm or implementation strategy. These solvers are combined into a particular plan for a given transform size, and the combination is determined at runtime according to the characteristics of your machine and compiler. This unusual software architecture allows FFTW to adapt itself to almost any machine.

For more details (albeit somewhat outdated), see the paper "FFTW: An Adaptive Software Architecture for the FFT", by M. Frigo and S. G. Johnson, Proc. ICASSP 3, 1381 (1998), also available at the FFTW web page.

Question 4.2. Why is FFTW so fast?

This is a complex question, and there is no simple answer. In fact, the authors do not fully know the answer, either. In addition to many small performance hacks throughout FFTW, there are three general reasons for FFTW's speed.

* FFTW uses a variety of FFT algorithms and implementation styles that can be arbitrarily composed to adapt itself to a machine. See Q4.1 `How does FFTW work?'.
* FFTW uses a code generator to produce highly-optimized routines for computing small transforms.
* FFTW uses explicit divide-and-conquer to take advantage of the memory hierarchy.

For more details (albeit somewhat outdated), see the paper "FFTW: An Adaptive Software Architecture for the FFT", by M. Frigo and S. G. Johnson, Proc. ICASSP 3, 1381 (1998), available along with other references at the FFTW web page.

Matteo Frigo and Steven G. Johnson / fftw@fftw.org - 27 January 2017
