Really fast x86 floating point sin/cos

References : Posted by rerdavies[AT]msn[DOT]com
Linked file :
Notes :
Frightful code from the Intel Performance optimization front. Not for the squeamish.

The following code calculates sin and cos of a floating point value on x86 platforms to 20 bits precision with 2 multiplies and two adds. The basic principle is to use sin(x+y) and cos(x+y) identities to generate the result from lookup tables. Each lookup table takes care of 10 bits of precision in the input. The same principle can be used to generate sin/cos to full (! Really. Full!) 24-bit float precision using two 8-bit tables, and one 10 bit table (to provide guard bits), for a net speed gain of about 4x over fsin/fcos, and 8x if you want both sin and cos. Note that microsoft compilers have trouble keeping doubles aligned properly on the stack (they must be 8-byte aligned in order not to incur a massive alignment penalty). As a result, this class should NOT be allocated on the stack. Add it as a member variable to any class that uses it.

class CSomeClass {
CQuickTrig m_QuickTrig;

Code :
(see attached file)