What type of FFT algorithm does Mlucas use?
Mlucas uses a custom FFT implementation written by me (EWM). I started on this algorithmic journey in the late summer of 1996, and since I was a complete novice at transform-based arithmetic at the time, the first FFT routines I used were those from Numerical Recipes (NR). Since then the code has evolved greatly, and the FFT I currently use looks nothing like the original one, although it does basically the same thing (except for the non-power-of-2 vector length routines; NR has nothing along those lines). In the past 2 years I have also augmented the original generic high-level C-code FFT implementation with inline assembly code to take advantage of the SSE2 vector-processing capabilities of more recent x86 processors. This more than doubles the program speed on the newer AMD64 and Intel Core2 CPUs.
• 5) How does the Mlucas FFT compare to other high-performance FFT implementations, such as the FFTW package?
I have not had time or desire to package the FFT core of Mlucas into a form suitabl