Main Archive Specials IRC Wiki | FAQ Links Submit Forum
Search

Analysis

Effects

Filters

Other

Synthesis

All

Entries in this category

VST SDK GUI Switch without
(Allmost) Ready-to-use oscillators
1 pole LPF for smooth parameter changes
1-RC and C filter
16-Point Fast Integer Sinc Interpolator.
16-to-8-bit first-order dither
18dB/oct resonant 3 pole LPF with tanh() dist
1st and 2nd order pink noise filters
2 Wave shaping things
3 Band Equaliser
303 type filter with saturation
3rd order Spline interpollation
4th order Linkwitz-Riley filters
5-point spline interpollation
Alias-free waveform generation with analog filtering
Alien Wah
All-Pass Filters, a good explanation
Allocating aligned memory
AM Formantic Synthesis
Another 4-pole lowpass...
Another cheap sinusoidal LFO
another LFO class
Antialiased Lines
antialiased square generator
Arbitary shaped band-limited waveform generation (using oversampling and low-pass filtering)
Audiable alias free waveform gen using width sine
Automatic PDC system
Band Limited PWM Generator
Band Limited waveforms my way
Bandlimited sawtooth synthesis
Bandlimited waveform generation
Bandlimited waveform generation with hard sync
Bandlimited waveforms synopsis.
Bandlimited waveforms...
Base-2 exp
Bass Booster
Beat Detector Class
Biquad C code
Biquad, Butterworth, Chebyshev N-order, M-channel optimized filters
Bit quantization/reduction effect
Bit-Reversed Counting
Block/Loop Benchmarking
Branchless Clipping
Butterworth
Butterworth Optimized C++ Class
C# Oscilator class
C++ class implementation of RBJ Filters
C++ gaussian noise generation
C-Weighed Filter
Calculate notes (java)
Cascaded resonant lp/hp filter
Center separation in a stereo mixdown
Center separation in a stereo mixdown
Cheap pseudo-sinusoidal lfo
chebyshev waveshaper (using their recursive definition)
Class for waveguide/delay effects
Clipping without branching
Coefficients for Daubechies wavelets 1-38
Compressor
Constant-time exponent of 2 detector
Conversion and normalization of 16-bit sample to a floating point number
Conversions on a PowerPC
Cool Sounding Lowpass With Decibel Measured Resonance
Copy-protection schemes
Cubic interpollation
Cubic polynomial envelopes
Cure for malicious samples
DC filter
Decimator
Delay time calculation for reverberation
Delphi Class implementation of the RBJ filters
Denormal DOUBLE variables, macro
Denormal numbers
Denormal numbers, the meta-text
Denormalization preventer
Denormalization preventer
DFT
Digital RIAA equalization filter coefficients
DIRAC - Free C/C++ Library for Time and Pitch Manipulation of Audio Based on Time-Frequency Transforms
Direct form II
Direct Form II biquad
Direct pink noise synthesis with auto-correlated generator
Discrete Summation Formula (DSF)
Dither code
Dithering
Double to Int
Drift generator
DSF (super-set of BLIT)
dynamic convolution
Early echo's with image-mirror technique
Easy noise generation
ECE320 project: Reverberation w/ parameter control from PC
Envelope detector
Envelope Detector class (C++)
Envelope Follower
Envelope follower with different attack and release
Exponential curve for
Exponential parameter mapping
Fast & small sine generation tutorial
fast abs/neg/sign for 32bit floats
Fast binary log approximations
Fast cube root, square root, and reciprocal for x86/SSE CPUs.
Fast Downsampling With Antialiasing
fast exp() approximations
Fast exp2 approximation
Fast Exponential Envelope Generator
Fast Float Random Numbers
Fast in-place Walsh-Hadamard Transform
Fast LFO in Delphi...
Fast log2
fast power and root estimates for 32bit floats
Fast rounding functions in pascal
Fast sign for 32 bit floats
Fast SIN approximation for usage in e.g. additive synthesizers
Fast sine and cosine calculation
Fast sine wave calculation
Fast square wave generator
Fast Whitenoise Generator
FFT
FFT classes in C++ and Object Pascal
Float to int
Float to int (more intel asm)
Float to integer conversion
Float-to-int, coverting an array of floats
fold back distortion
Formant filter
Frequency response from biquad coefficients
frequency warped FIR lattice
Gaussian dithering
Gaussian random numbers
Gaussian White noise
Gaussian White Noise
Generator
Guitar feedback
Hermite Interpolator (x86 ASM)
Hermite interpollation
Hilbert Filter Coefficient Calculation
Hiqh quality /2 decimators
Inverted parabolic envelope
Java FFT
Karlsen
Karlsen Fast Ladder
Linear interpolation
Lo-Fi Crusher
Lock free fifo
Look ahead limiting
Lookahead Limiter
Lowpass filter for parameter edge filtering
LP and HP filter
LPC analysis (autocorrelation + Levinson-Durbin recursion)
LPF 24dB/Oct
Magnitude and phase plot of arbitrary IIR function, up to 5th order
Matlab Time Domain Impulse Response Inverter/Divider
MATLAB-Tools for SNDAN
matlab/octave code for minblep table generation
MDCT and IMDCT based on FFTW3
Measuring interpollation noise
MIDI note/frequency conversion
Millimeter to DB (faders...)
Moog Filter
Moog VCF
Moog VCF, variation 1
Moog VCF, variation 2
Most simple and smooth feedback delay
Most simple static delay
Motorola 56300 Disassembler
Noise Shaping Class
Nonblocking multiprocessor/multithread algorithms in C++
Notch filter
One pole filter, LP and HP
One pole LP and HP
One pole, one zero LP/HP
One zero, LP/HP
One-Liner IIR Filters (1st order)
Output Limiter using Envelope Follower in C++
PADsynth synthesys method
Parabolic shaper
Parallel combs delay calculation
Peak/Notch filter
Perfect LP4 filter
Phase equalization
Phase modulation Vs. Frequency modulation
Phase modulation Vs. Frequency modulation II
Phaser code
Piecewise quadratic approximate exponential function
Pink noise filter
please add it as a comment to the Denormalization preventer entry (no comments are allowed now) thanks
Plot Filter (Analyze filter characteristics)
Plotting R B-J Equalisers in Excel
Polynominal Waveshaper
Polyphase Filters
Polyphase Filters (Delphi)
Poor Man's FIWIZ
pow(x,4) approximation
Prewarping
PRNG for non-uniformly distributed values from trigonometric identity
Pseudo-Random generator
PulseQuad
Pulsewidth modulation
QFT and DQFT (double precision) classes
Quick & Dirty Sine
quick and dirty sine generator
randja compressor
rational tanh approximation
RBJ Audio-EQ-Cookbook
RBJ Audio-EQ-Cookbook
RBJ Wavetable 101
Reading the compressed WA! parts in gigasampler files
Real basic DSP with Matlab (+ GUI) ...
real value vs display value
Really fast x86 floating point sin/cos
Reasonably accurate/fastish tanh approximation
Remez Exchange Algorithm (Parks/McClellan)
Remez Remez (Parks/McClellan)
resampling
Resonant filter
Resonant IIR lowpass (12dB/oct)
Resonant low pass filter
Reverb Filter Generator
Reverberation Algorithms in Matlab
Reverberation techniques
Rossler and Lorenz Oscillators
Saturation
SawSin
Simple biquad filter from apple.com
Simple Compressor class (C++)
Simple peak follower
Simple Tilt equalizer
Simple Time Stretching-Granular Synthesizer
Sin(x) Aproximation (with SSE code)
Sin, Cos, Tan approximation
Sine calculation
Smooth random LFO Generator
smsPitchScale Source Code
Soft saturation
Spuc's open source filters
Square Waves
State variable
State Variable Filter (Chamberlin version)
State Variable Filter (Double Sampled, Stable)
Stereo Enhancer
Stereo Field Rotation Via Transformation Matrix
Stereo Width Control (Obtained Via Transfromation Matrix)
Stilson's Moog filter code
Time compression-expansion using standard phase vocoder
Time domain convolution with O(n^log2(3))
Time domain convolution with O(n^log2(3))
tone detection with Goertzel
Tone detection with Goertzel (x86 ASM)
Trammell Pink Noise (C++ class)
transistor differential amplifier simulation
Type : LPF 24dB/Oct
UniVox Univibe Emulator
Variable-hardness clipping function
Various Biquad filters
Vintage VU meters tutorial
Waveform generator using MinBLEPS
WaveShaper
Waveshaper
Waveshaper
Waveshaper (simple description)
Waveshaper :: Gloubi-boulga
Wavetable Synthesis
Weird synthesis
Windowed Sinc FIR Generator
Zoelzer biquad filters

VST SDK GUI Switch without

References : Posted by quintosardo[AT]yahoo[DOT]it

Notes :
In VST GUI an on-vaue is represented by 1.0 and off by 0.0.

Code :
Say you have two signals you want to switch between when the user changes a switch.
You could do:

if(fSwitch == 0.f) //fSwitch is either 0.0 or 1.0
    output = input1
else
    output = input2

However, you can avoid the branch by doing:

output = input1 * (1.f - fSwitch) + input2 * fSwitch

Which would be like a quick mix. You could make the change clickless by adding a simple one-pole filter:

smooth = filter(fSwitch)
output = input1 * (1.f - smooth) + input2 * smooth


4 comment(s) | add a comment | nofrills version


(Allmost) Ready-to-use oscillators

Type : waveform generation
References : Ross Bencina, Olli Niemitalo, ...

Notes :
Ross Bencina: original source code poster
Olli Niemitalo: UpdateWithCubicInterpolation


Code :
//this code is meant as an EXAMPLE

//uncomment if you need an FM oscillator
//define FM_OSCILLATOR

/*
members are:

float phase;
int TableSize;
float sampleRate;

float *table, dtable0, dtable1, dtable2, dtable3;

->these should be filled as folows... (remember to wrap around!!!)
table[i] = the wave-shape
dtable0[i] = table[i+1] - table[i];
dtable1[i] = (3.f*(table[i]-table[i+1])-table[i-1]+table[i+2])/2.f
dtable2[i] = 2.f*table[i+1]+table[i-1]-(5.f*table[i]+table[i+2])/2.f
dtable3[i] = (table[i+1]-table[i-1])/2.f
*/

float Oscillator::UpdateWithoutInterpolation(float frequency)
{
        int i = (int) phase;

        phase += (sampleRate/(float TableSize)/frequency;

        if(phase >= (float)TableSize)
                phase -= (float)TableSize;

#ifdef FM_OSCILLATOR
        if(phase < 0.f)
                phase += (float)TableSize;
#endif

        return table[i] ;
}

float Oscillator::UpdateWithLinearInterpolation(float frequency)
{
        int i = (int) phase;
        float alpha = phase - (float) i;

        phase += (sampleRate/(float)TableSize)/frequency;

        if(phase >= (float)TableSize)
                phase -= (float)TableSize;

#ifdef FM_OSCILLATOR
        if(phase < 0.f)
                phase += (float)TableSize;
#endif

        /*
        dtable0[i] = table[i+1] - table[i]; //remember to wrap around!!!
        */

        return table[i] + dtable0[i]*alpha;
}

float Oscillator::UpdateWithCubicInterpolation( float frequency )
{
        int i = (int) phase;
        float alpha = phase - (float) i;

        phase += (sampleRate/(float)TableSize)/frequency;

        if(phase >= (float)TableSize)
                phase -= (float)TableSize;

#ifdef FM_OSCILLATOR
        if(phase < 0.f)
                phase += (float)TableSize;
#endif

        /* //remember to wrap around!!!
        dtable1[i] = (3.f*(table[i]-table[i+1])-table[i-1]+table[i+2])/2.f
        dtable2[i] = 2.f*table[i+1]+table[i-1]-(5.f*table[i]+table[i+2])/2.f
        dtable3[i] = (table[i+1]-table[i-1])/2.f
        */

        return ((dtable1[i]*alpha + dtable2[i])*alpha + dtable3[i])*alpha+table[i];
}


2 comment(s) | add a comment | nofrills version


1 pole LPF for smooth parameter changes

Type : 1-pole LPF class
References : Posted by zioguido@gmail.com

Notes :
This is a very simple class that I'm using in my plugins for smoothing parameter changes that directly affect audio stream.
It's a 1-pole LPF, very easy on CPU.

Change the value of variable "a" (0~1) for slower or a faster response.

Of course you can also use it as a lowpass filter for audio signals.


Code :
class CParamSmooth
{
public:
    CParamSmooth() { a = 0.99f; b = 1.f - a; z = 0; };
    ~CParamSmooth();
    inline float Process(float in) { z = (in * b) + (z * a); return z; }
private:
    float a, b, z;
};


5 comment(s) | add a comment | nofrills version


1-RC and C filter

Type : Simple 2-pole LP
References : Posted by madbrain[AT]videotron[DOT]ca

Notes :
This filter is called 1-RC and C since it uses these two parameters. C and R correspond to raw cutoff and inverted resonance, and have a range from 0 to 1.


Code :
//Parameter calculation
//cutoff and resonance are from 0 to 127

  c = pow(0.5, (128-cutoff)   / 16.0);
  r = pow(0.5, (resonance+24) / 16.0);

//Loop:

  v0 =  (1-r*c)*v0  -  (c)*v1  + (c)*input;
  v1 =  (1-r*c)*v1  +  (c)*v0;

  output = v1;


3 comment(s) | add a comment | nofrills version


16-Point Fast Integer Sinc Interpolator.

References : Posted by mumart[at]gmail[dot]com

Notes :
This is designed for fast upsampling with good quality using only a 32-bit accumulator. Sound quality is very good. Conceptually it resamples the input signal 32768x and performs nearest-neighbour to get the requested sample rate. As a result downsampling will result in aliasing.

The provided Sinc table is Blackman-Harris windowed with a slight lowpass. The table entries are 16-bit and are 16x linear-oversampled. It should be pretty easy to figure out how to make your own table for it.

Code provided is in Java. Converting to C/MMX etc. should be pretty trivial.

Remember the interpolator requires a number of samples before and after the sample to be interpolated, so you can't resample the whole of a passed input buffer in one go.

Have fun,
Martin


Code :

public class SincResampler {
    private final int FP_SHIFT = 15;
    private final int FP_ONE = 1 << FP_SHIFT;
    private final int FP_MASK = FP_ONE - 1;


    private final int POINT_SHIFT = 4; // 16 points

    private final int OVER_SHIFT = 4; // 16x oversampling

    private final short[] table = {

         0, -7,  27, -71,  142, -227,  299,  32439,   299,  -227,  142,  -71,  27,  -7,  0,  0,

         0,  0,  -5,  36, -142,  450, -1439, 32224,  2302,  -974,  455, -190,  64, -15,  2,  0,

         0,  6, -33, 128, -391, 1042, -2894, 31584,  4540, -1765,  786, -318, 105, -25,  3,  0,

         0, 10, -55, 204, -597, 1533, -4056, 30535,  6977, -2573, 1121, -449, 148, -36,  5,  0,

        -1, 13, -71, 261, -757, 1916, -4922, 29105,  9568, -3366, 1448, -578, 191, -47,  7,  0,

        -1, 15, -81, 300, -870, 2185, -5498, 27328, 12263, -4109, 1749, -698, 232, -58,  9,  0,

        -1, 15, -86, 322, -936, 2343, -5800, 25249, 15006, -4765, 2011, -802, 269, -68, 10,  0,

        -1, 15, -87, 328, -957, 2394, -5849, 22920, 17738, -5298, 2215, -885, 299, -77, 12,  0,

         0, 14, -83, 319, -938, 2347, -5671, 20396, 20396, -5671, 2347, -938, 319, -83, 14,  0,

         0, 12, -77, 299, -885, 2215, -5298, 17738, 22920, -5849, 2394, -957, 328, -87, 15, -1,

         0, 10, -68, 269, -802, 2011, -4765, 15006, 25249, -5800, 2343, -936, 322, -86, 15, -1,

         0,  9, -58, 232, -698, 1749, -4109, 12263, 27328, -5498, 2185, -870, 300, -81, 15, -1,

         0,  7, -47, 191, -578, 1448, -3366,  9568, 29105, -4922, 1916, -757, 261, -71, 13, -1,

         0,  5, -36, 148, -449, 1121, -2573,  6977, 30535, -4056, 1533, -597, 204, -55, 10,  0,

         0,  3, -25, 105, -318,  786, -1765,  4540, 31584, -2894, 1042, -391, 128, -33,  6,  0,

         0,  2, -15,  64, -190,  455,  -974,  2302, 32224, -1439,  450, -142,  36,  -5,  0,  0,

         0,  0,  -7,  27,  -71,  142,  -227,   299, 32439,   299, -227,  142, -71,  27, -7,  0

    };



    /*

    private final int POINT_SHIFT = 1; // 2 points

    private final int OVER_SHIFT = 0; // 1x oversampling

    private final short[] table = {    

        32767,     0,

        0    , 32767

    };

    */



    private final int POINTS = 1 << POINT_SHIFT;

    private final int INTERP_SHIFT = FP_SHIFT - OVER_SHIFT;

    private final int INTERP_BITMASK = ( 1 << INTERP_SHIFT ) - 1;


    /*
        input - array of input samples
        inputPos - sample position ( must be at least POINTS/2 + 1, ie. 7 )
        inputFrac - fractional sample position ( 0 <= inputFrac < FP_ONE )
        step - number of input samples per output sample * FP_ONE
        lAmp - left output amplitude ( 1.0 = FP_ONE )
        lBuf - left output buffer
        rAmp - right output amplitude ( 1.0 = FP_ONE )
        rBuf - right output buffer    
        pos - offset into output buffers
        count - number of output samples to produce
    */

    public void resample( short[] input, int inputPos, int inputFrac, int step,

            int lAmp, int[] lBuf, int rAmp, int[] rBuf, int pos, int count ) {

        for( int p = 0; p < count; p++ ) {

            int tabidx1 = ( inputFrac >> INTERP_SHIFT ) << POINT_SHIFT;

            int tabidx2 = tabidx1 + POINTS;

            int bufidx = inputPos - POINTS/2 + 1;

            int a1 = 0, a2 = 0;

            for( int t = 0; t < POINTS; t++ ) {

                a1 += table[ tabidx1++ ] * input[ bufidx ] >> 15;

                a2 += table[ tabidx2++ ] * input[ bufidx ] >> 15;

                bufidx++;

            }

            int out = a1 + ( ( a2 - a1 ) * ( inputFrac & INTERP_BITMASK ) >> INTERP_SHIFT );

            lBuf[ pos ] += out * lAmp >> FP_SHIFT;

            rBuf[ pos ] += out * rAmp >> FP_SHIFT;

            pos++;

            inputFrac += step;


            inputPos += inputFrac >> FP_SHIFT;

            inputFrac &= FP_MASK;

        }

    }

}



1 comment(s) | add a comment | nofrills version


16-to-8-bit first-order dither

Type : First order error feedforward dithering code
References : Posted by Jon Watte

Notes :
This is about as simple a dithering algorithm as you can implement, but it's likely to sound better than just truncating to N bits.

Note that you might not want to carry forward the full difference for infinity. It's probably likely that the worst performance hit comes from the saturation conditionals, which can be avoided with appropriate instructions on many DSPs and integer SIMD type instructions, or CMOV.

Last, if sound quality is paramount (such as when going from > 16 bits to 16 bits) you probably want to use a higher-order dither function found elsewhere on this site.


Code :
// This code will down-convert and dither a 16-bit signed short
// mono signal into an 8-bit unsigned char signal, using a first
// order forward-feeding error term dither.

#define uchar unsigned char

void dither_one_channel_16_to_8( short * input, uchar * output, int count, int * memory )
{
  int m = *memory;
  while( count-- > 0 ) {
    int i = *input++;
    i += m;
    int j = i + 32768 - 128;
    uchar o;
    if( j < 0 ) {
      o = 0;
    }
    else if( j > 65535 ) {
      o = 255;
    }
    else {
      o = (uchar)((j>>8)&0xff);
    }
    m = ((j-32768+128)-i);
    *output++ = o;
  }
  *memory = m;
}


no comments on this item | add a comment | nofrills version


18dB/oct resonant 3 pole LPF with tanh() dist

References : Posted by Josep M Comajuncosas
Linked file : lpf18.zip
Linked file : lpf18.sme

Notes :
Implementation in CSound and Sync Modular...



no comments on this item | add a comment | nofrills version


1st and 2nd order pink noise filters

Type : Pink noise
References : Posted by umminger[AT]umminger[DOT]com

Notes :
Here are some new lower-order pink noise filter coefficients.

These have approximately equiripple error in decibels from 20hz to 20khz at a 44.1khz sampling rate.

1st order, ~ +/- 3 dB error (not recommended!)
num = [0.05338071119116 -0.03752455712906]
den = [1.00000000000000 -0.97712493947102]

2nd order, ~ +/- 0.9 dB error
num = [ 0.04957526213389 -0.06305581334498 0.01483220320740 ]
den = [ 1.00000000000000 -1.80116083982126 0.80257737639225 ]




no comments on this item | add a comment | nofrills version


2 Wave shaping things

References : Posted by Frederic Petrot

Notes :
Makes nice saturations effects that can be easilly computed using cordic
First using a atan function:
y1 using k=16
max is the max value you can reach (32767 would be a good guess)
Harmonics scale down linealy and not that fast

Second using the hyperbolic tangent function:
y2 using k=2
Harmonics scale down linealy very fast


Code :
y1 = (max>>1) * atan(k * x/max)

y2 = max * th(x/max)


14 comment(s) | add a comment | nofrills version


3 Band Equaliser

References : Posted by Neil C

Notes :
Simple 3 band equaliser with adjustable low and high frequencies ...

Fairly fast algo, good quality output (seems to be accoustically transparent with all gains set to 1.0)

How to use ...

1. First you need to declare a state for your eq

EQSTATE eq;

2. Now initialise the state (we'll assume your output frequency is 48Khz)

set_3band_state(eq,880,5000,480000);

Your EQ bands are now as follows (approximatley!)

low band = 0Hz to 880Hz
mid band = 880Hz to 5000Hz
high band = 5000Hz to 24000Hz

3. Set the gains to some values ...

eq.lg = 1.5; // Boost bass by 50%
eq.mg = 0.75; // Cut mid by 25%
eq.hg = 1.0; // Leave high band alone

4. You can now EQ some samples

out_sample = do_3band(eq,in_sample)


Have fun and mail me if any problems ... etanza at lycos dot co dot uk


Neil C / Etanza Systems, 2006 :)



Code :
First the header file ....
//---------------------------------------------------------------------------
//
//                                3 Band EQ :)
//
// EQ.H - Header file for 3 band EQ
//
// (c) Neil C / Etanza Systems / 2K6
//
// Shouts / Loves / Moans = etanza at lycos dot co dot uk
//
// This work is hereby placed in the public domain for all purposes, including
// use in commercial applications.
//
// The author assumes NO RESPONSIBILITY for any problems caused by the use of
// this software.
//
//----------------------------------------------------------------------------

#ifndef __EQ3BAND__
#define __EQ3BAND__


// ------------
//| Structures |
// ------------

typedef struct
{
  // Filter #1 (Low band)

  double  lf;       // Frequency
  double  f1p0;     // Poles ...
  double  f1p1;    
  double  f1p2;
  double  f1p3;

  // Filter #2 (High band)

  double  hf;       // Frequency
  double  f2p0;     // Poles ...
  double  f2p1;
  double  f2p2;
  double  f2p3;

  // Sample history buffer

  double  sdm1;     // Sample data minus 1
  double  sdm2;     //                   2
  double  sdm3;     //                   3

  // Gain Controls

  double  lg;       // low  gain
  double  mg;       // mid  gain
  double  hg;       // high gain
  
} EQSTATE;  


// ---------
//| Exports |
// ---------

extern void   init_3band_state(EQSTATE* es, int lowfreq, int highfreq, int mixfreq);
extern double do_3band(EQSTATE* es, double sample);


#endif // #ifndef __EQ3BAND__
//---------------------------------------------------------------------------

Now the source ...
//----------------------------------------------------------------------------
//
//                                3 Band EQ :)
//
// EQ.C - Main Source file for 3 band EQ
//
// (c) Neil C / Etanza Systems / 2K6
//
// Shouts / Loves / Moans = etanza at lycos dot co dot uk
//
// This work is hereby placed in the public domain for all purposes, including
// use in commercial applications.
//
// The author assumes NO RESPONSIBILITY for any problems caused by the use of
// this software.
//
//----------------------------------------------------------------------------

// NOTES :
//
// - Original filter code by Paul Kellet (musicdsp.pdf)
//
// - Uses 4 first order filters in series, should give 24dB per octave
//
// - Now with P4 Denormal fix :)


//----------------------------------------------------------------------------

// ----------
//| Includes |
// ----------

#include <math.h>
#include "eq.h"


// -----------
//| Constants |
// -----------

static double vsa = (1.0 / 4294967295.0);   // Very small amount (Denormal Fix)


// ---------------
//| Initialise EQ |
// ---------------

// Recommended frequencies are ...
//
//  lowfreq  = 880  Hz
//  highfreq = 5000 Hz
//
// Set mixfreq to whatever rate your system is using (eg 48Khz)

void init_3band_state(EQSTATE* es, int lowfreq, int highfreq, int mixfreq)
{
  // Clear state

  memset(es,0,sizeof(EQSTATE));

  // Set Low/Mid/High gains to unity

  es->lg = 1.0;
  es->mg = 1.0;
  es->hg = 1.0;

  // Calculate filter cutoff frequencies

  es->lf = 2 * sin(M_PI * ((double)lowfreq / (double)mixfreq));
  es->hf = 2 * sin(M_PI * ((double)highfreq / (double)mixfreq));
}


// ---------------
//| EQ one sample |
// ---------------

// - sample can be any range you like :)
//
// Note that the output will depend on the gain settings for each band
// (especially the bass) so may require clipping before output, but you
// knew that anyway :)

double do_3band(EQSTATE* es, double sample)
{
  // Locals

  double  l,m,h;      // Low / Mid / High - Sample Values

  // Filter #1 (lowpass)

  es->f1p0  += (es->lf * (sample   - es->f1p0)) + vsa;
  es->f1p1  += (es->lf * (es->f1p0 - es->f1p1));
  es->f1p2  += (es->lf * (es->f1p1 - es->f1p2));
  es->f1p3  += (es->lf * (es->f1p2 - es->f1p3));

  l          = es->f1p3;

  // Filter #2 (highpass)
  
  es->f2p0  += (es->hf * (sample   - es->f2p0)) + vsa;
  es->f2p1  += (es->hf * (es->f2p0 - es->f2p1));
  es->f2p2  += (es->hf * (es->f2p1 - es->f2p2));
  es->f2p3  += (es->hf * (es->f2p2 - es->f2p3));

  h          = es->sdm3 - es->f2p3;

  // Calculate midrange (signal - (low + high))

  m          = es->sdm3 - (h + l);

  // Scale, Combine and store

  l         *= es->lg;
  m         *= es->mg;
  h         *= es->hg;

  // Shuffle history buffer

  es->sdm3   = es->sdm2;
  es->sdm2   = es->sdm1;
  es->sdm1   = sample;                

  // Return result

  return(l + m + h);
}


//----------------------------------------------------------------------------


78 comment(s) | add a comment | nofrills version


303 type filter with saturation

Type : Runge-Kutta Filters
References : Posted by Hans Mikelson
Linked file : filters001.txt

Notes :
I posted a filter to the Csound mailing list a couple of weeks ago that has a 303 flavor to it. It basically does wacky distortions to the sound. I used Runge-Kutta for the diff eq. simulation though which makes it somewhat sluggish.

This is a CSound score!!




1 comment(s) | add a comment | nofrills version


3rd order Spline interpollation

References : Posted by Dave from Muon Software, originally from Josh Scholar

Notes :
(from Joshua Scholar about Spline interpollation in general...)
According to sampling theory, a perfect interpolation could be found by replacing each sample with a sinc function centered on that sample, ringing at your target nyquest frequency, and at each target point you just sum all of contributions from the sinc functions of every single point in source.
The sinc function has ringing that dies away very slowly, so each target sample will have to have contributions from a large neighborhood of source samples. Luckily, by definition the sinc function is bandwidth limited, so once we have a source that is prefilitered for our target nyquest frequency and reasonably oversampled relative to our nyquest frequency, ordinary interpolation techniques are quite fruitful even though they would be pretty useless if we hadn't oversampled.

We want an interpolation routine that at very least has the following characteristics:

1. Obviously it's continuous. But since finite differencing a signal (I don't really know about true differentiation) is equivalent to a low frequency attenuator that drops only about 6 dB per octave, continuity at the higher derivatives is important too.

2. It has to be stiff enough to find peaks when our oversampling missed them. This is where what I said about the combination the sinc function's limited bandwidth and oversampling making interpolation possible comes into play.

I've read some papers on splines, but most stuff on splines relates to graphics and uses a control point descriptions that is completely irrelevant to our sort of interpolation. In reading this stuff I quickly came to the conclusion that splines:

1. Are just piecewise functions made of polynomials designed to have some higher order continuity at the transition points.

2. Splines are highly arbitrary, because you can choose arbitrary derivatives (to any order) at each transition. Of course the more you specify the higher order the polynomials will be.

3. I already know enough about polynomials to construct any sort of spline. A polynomial through 'n' points with a derivative specified at 'm[1]' points and second derivatives specified at 'm[2]' points etc. will be a polynomial of the order n-1+m[1]+m[2]...

A way to construct third order splines (that admittedly doesn't help you construct higher order splines), is to linear interpolate between two parabolas. At each point (they are called knots) you have a parabola going through that point, the previous and the next point. Between each point you linearly interpolate between the polynomials for each point. This may help you imagine splines.

As a starting point I used a polynomial through 5 points for each knot and used MuPad (a free Mathematica like program) to derive a polynomial going through two points (knots) where at each point it has the same first two derivatives as a 4th order polynomial through the surrounding 5 points. My intuition was that basing it on polynomials through 3 points wouldn't be enough of a neighborhood to get good continuity. When I tested it, I found that not only did basing it on 5 point polynomials do much better than basing it on 3 point ones, but that 7 point ones did nearly as badly as 3 point ones. 5 points seems to be a sweet spot.

However, I could have set the derivatives to a nearly arbitrary values - basing the values on those of polynomials through the surrounding points was just a guess.

I've read that the math of sampling theory has different interpretation to the sinc function one where you could upsample by making a polynomial through every point at the same order as the number of points and this would give you the same answer as sinc function interpolation (but this only converges perfectly when there are an infinite number of points). Your head is probably spinning right now - the only point of mentioning that is to point out that perfect interpolation is exactly as stiff as a polynomial through the target points of the same order as the number of target points.


Code :
//interpolates between L0 and H0 taking the previous (L1) and next (H1)
points into account
inline float ThirdInterp(const float x,const float L1,const float L0,const
float H0,const float H1)
{
    return
    L0 +
    .5f*
    x*(H0-L1 +
       x*(H0 + L0*(-2) + L1 +
          x*( (H0 - L0)*9 + (L1 - H1)*3 +
             x*((L0 - H0)*15 + (H1 -  L1)*5 +
                x*((H0 - L0)*6 + (L1 - H1)*2 )))));
}


7 comment(s) | add a comment | nofrills version


4th order Linkwitz-Riley filters

Type : LP/HP - LR4
References : Posted by neolit123 at gmail dot com

Notes :
Original from T. Lossius - ttblue project

Optimized version in pseudo-code.

[! The filter is unstable for fast automation changes in the lower frequency range. Parameter interpolation and/or oversampling should fix this. !]

The sum of the Linkwitz-Riley (Butterworth squared) HP and LP outputs, will result an all-pass filter at Fc and flat magnitude response - close to ideal for crossovers.

Lubomir I. Ivanov


Code :
//-----------------------------------------
// [code]
//-----------------------------------------

//fc -> cutoff frequency
//pi -> 3.14285714285714
//srate -> sample rate

//================================================
// shared for both lp, hp; optimizations here
//================================================
wc=2*pi*fc;
wc2=wc*wc;
wc3=wc2*wc;
wc4=wc2*wc2;
k=wc/tan(pi*fc/srate);
k2=k*k;
k3=k2*k;
k4=k2*k2;
sqrt2=sqrt(2);
sq_tmp1=sqrt2*wc3*k;
sq_tmp2=sqrt2*wc*k3;
a_tmp=4*wc2*k2+2*sq_tmp1+k4+2*sq_tmp2+wc4;

b1=(4*(wc4+sq_tmp1-k4-sq_tmp2))/a_tmp;
b2=(6*wc4-8*wc2*k2+6*k4)/a_tmp;
b3=(4*(wc4-sq_tmp1+sq_tmp2-k4))/a_tmp;
b4=(k4-2*sq_tmp1+wc4-2*sq_tmp2+4*wc2*k2)/a_tmp;

//================================================
// low-pass
//================================================
a0=wc4/a_tmp;
a1=4*wc4/a_tmp;
a2=6*wc4/a_tmp;
a3=a1;
a4=a0;

//=====================================================
// high-pass
//=====================================================
a0=k4/a_tmp;
a1=-4*k4/a_tmp;
a2=6*k4/a_tmp;
a3=a1;
a4=a0;

//=====================================================
// sample loop - same for lp, hp
//=====================================================
tempx=input;

tempy=a0*tempx+a1*xm1+a2*xm2+a3*xm3+a4*xm4-b1*ym1-b2*ym2-b3*ym3-b4*ym4;
xm4=xm3;
xm3=xm2;
xm2=xm1;
xm1=tempx;
ym4=ym3;
ym3=ym2;
ym2=ym1;
ym1=tempy;

output=tempy;


8 comment(s) | add a comment | nofrills version


5-point spline interpollation

Type : interpollation
References : Joshua Scholar, posted by David Waugh
Code :
//nMask = sizeofwavetable-1 where sizeofwavetable is a power of two.
double interpolate(double* wavetable, int nMask, double location)
{
    /* 5-point spline*/

    int nearest_sample = (int) location;
    double x = location - (double) nearest_sample;

    double p0=wavetable[(nearest_sample-2)&nMask];
    double p1=wavetable[(nearest_sample-1)&nMask];
    double p2=wavetable[nearest_sample];
    double p3=wavetable[(nearest_sample+1)&nMask];
    double p4=wavetable[(nearest_sample+2)&nMask];
    double p5=wavetable[(nearest_sample+3)&nMask];

    return p2 + 0.04166666666*x*((p3-p1)*16.0+(p0-p4)*2.0
    + x *((p3+p1)*16.0-p0-p2*30.0- p4
    + x *(p3*66.0-p2*70.0-p4*33.0+p1*39.0+ p5*7.0- p0*9.0
    + x *( p2*126.0-p3*124.0+p4*61.0-p1*64.0- p5*12.0+p0*13.0
    + x *((p3-p2)*50.0+(p1-p4)*25.0+(p5-p0)*5.0)))));
};


6 comment(s) | add a comment | nofrills version


Alias-free waveform generation with analog filtering

Type : waveform generation
References : Posted by Magnus Jonsson
Linked file : synthesis001.txt

Notes :
(see linkfile)



no comments on this item | add a comment | nofrills version


Alien Wah

References : Nasca Octavian Paul ( paulnasca[AT]email.ro )
Linked file : alienwah.c

Notes :
"I found this algoritm by "playing around" with complex numbers. Please email me your opinions about it.

Paul."




3 comment(s) | add a comment | nofrills version


All-Pass Filters, a good explanation

Type : information
References : Posted by Olli Niemitalo
Linked file : filters002.txt


no comments on this item | add a comment | nofrills version


Allocating aligned memory

Type : memory allocation
References : Posted by Benno Senoner

Notes :
we waste up to align_size + sizeof(int) bytes when we alloc a memory area.
We store the aligned_ptr - unaligned_ptr delta in an int located before the aligned area.
This is needed for the free() routine since we need to free all the memory not only the aligned area.
You have to use aligned_free() to free the memory allocated with aligned_malloc() !


Code :
/* align_size has to be a power of two !! */
void *aligned_malloc(size_t size, size_t align_size) {

  char *ptr,*ptr2,*aligned_ptr;
  int align_mask = align_size - 1;

  ptr=(char *)malloc(size + align_size + sizeof(int));
  if(ptr==NULL) return(NULL);

  ptr2 = ptr + sizeof(int);
  aligned_ptr = ptr2 + (align_size - ((size_t)ptr2 & align_mask));


  ptr2 = aligned_ptr - sizeof(int);
  *((int *)ptr2)=(int)(aligned_ptr - ptr);

  return(aligned_ptr);
}

void aligned_free(void *ptr) {

  int *ptr2=(int *)ptr - 1;
  ptr -= *ptr2;
  free(ptr);
}


no comments on this item | add a comment | nofrills version


AM Formantic Synthesis

References : Posted by Paul Sernine

Notes :
Here is another tutorial from Doc Rochebois.
It performs formantic synthesis without filters and without grains. Instead, it uses "double carrier amplitude modulation" to pitch shift formantic waveforms. Just beware the phase relationships to avoid interferences. Some patches of the DX7 used the same trick but phase interferences were a problem. Here, Thierry Rochebois avoids them by using cosine-phased waveforms.

Various formantic waveforms are precalculated and put in tables, they correspond to different formant widths.
The runtime uses many intances (here 4) of these and pitch shifts them with double carriers (to preserve the harmonicity of the signal).

This is a tutorial code, it can be optimized in many ways.
Have Fun

Paul


Code :
// FormantsAM.cpp

// Thierry Rochebois' "Formantic Synthesis by Double Amplitude Modulation"

// Based on a tutorial by Thierry Rochebois.
// Comments by Paul Sernine.

// The spectral content of the signal is obtained by adding amplitude modulated formantic
// waveforms. The amplitude modulations spectraly shift the formantic waveforms.
// Continuous spectral shift, without losing the harmonic structure, is obtained
// by using crossfaded double carriers (multiple of the base frequency).
// To avoid  unwanted interference artifacts, phase relationships must be of the
// "cosine type".

// The output is a 44100Hz 16bit stereo PCM file.

#include <math.h>
#include <stdio.h>
#include <stdlib.h>

//Approximates cos(pi*x) for x in [-1,1].
inline float fast_cos(const float x)
{
  float x2=x*x;
  return 1+x2*(-4+2*x2);
}

//Length of the table
#define L_TABLE (256+1) //The last entry of the table equals the first (to avoid a modulo)
//Maximal formant width
#define I_MAX 64
//Table of formants
float TF[L_TABLE*I_MAX];

//Formantic function of width I (used to fill the table of formants)
float fonc_formant(float p,const float I)
{
  float a=0.5f;
  int hmax=int(10*I)>L_TABLE/2?L_TABLE/2:int(10*I);
  float phi=0.0f;
  for(int h=1;h<hmax;h++)
  {
    phi+=3.14159265359f*p;
    float hann=0.5f+0.5f*fast_cos(h*(1.0f/hmax));
    float gaussienne=0.85f*exp(-h*h/(I*I));
    float jupe=0.15f;
    float harmonique=cosf(phi);
    a+=hann*(gaussienne+jupe)*harmonique;
   }
  return a;
}

//Initialisation of the table TF with the fonction fonc_formant.
void init_formant(void)
{ float coef=2.0f/(L_TABLE-1);
  for(int I=0;I<I_MAX;I++)
    for(int P=0;P<L_TABLE;P++)
      TF[P+I*L_TABLE]=fonc_formant(-1+P*coef,float(I));
}

//This function emulates the function fonc_formant
// thanks to the table TF. A bilinear interpolation is
// performed
float formant(float p,float i)
{
  i=i<0?0:i>I_MAX-2?I_MAX-2:i;    // width limitation
    float P=(L_TABLE-1)*(p+1)*0.5f; // phase normalisation
    int P0=(int)P;     float fP=P-P0;  // Integer and fractional
    int I0=(int)i;     float fI=i-I0;  // parts of the phase (p) and width (i).
    int i00=P0+L_TABLE*I0;     int i10=i00+L_TABLE;
    //bilinear interpolation.
    return (1-fI)*(TF[i00] + fP*(TF[i00+1]-TF[i00]))
        +    fI*(TF[i10] + fP*(TF[i10+1]-TF[i10]));
}

// Double carrier.
// h : position (float harmonic number)
// p : phase
float porteuse(const float h,const float p)
{
  float h0=floor(h);  //integer and
  float hf=h-h0;      //decimal part of harmonic number.
  // modulos pour ramener p*h0 et p*(h0+1) dans [-1,1]
  float phi0=fmodf(p* h0   +1+1000,2.0f)-1.0f;
  float phi1=fmodf(p*(h0+1)+1+1000,2.0f)-1.0f;
  // two carriers.
  float Porteuse0=fast_cos(phi0);  float Porteuse1=fast_cos(phi1);
  // crossfade between the two carriers.
  return Porteuse0+hf*(Porteuse1-Porteuse0);
}
int main()
{
  //Formant table for various french vowels (you can add your own)
  float F1[]={  730,  200,  400,  250,  190,  350,  550,  550,  450};
  float A1[]={ 1.0f, 0.5f, 1.0f, 1.0f, 0.7f, 1.0f, 1.0f, 0.3f, 1.0f};
  float F2[]={ 1090, 2100,  900, 1700,  800, 1900, 1600,  850, 1100};
  float A2[]={ 2.0f, 0.5f, 0.7f, 0.7f,0.35f, 0.3f, 0.5f, 1.0f, 0.7f};
  float F3[]={ 2440, 3100, 2300, 2100, 2000, 2500, 2250, 1900, 1500};
  float A3[]={ 0.3f,0.15f, 0.2f, 0.4f, 0.1f, 0.3f, 0.7f, 0.2f, 0.2f};
  float F4[]={ 3400, 4700, 3000, 3300, 3400, 3700, 3200, 3000, 3000};
  float A4[]={ 0.2f, 0.1f, 0.2f, 0.3f, 0.1f, 0.1f, 0.3f, 0.2f, 0.3f};

  float f0,dp0,p0=0.0f;
  int F=7; //number of the current formant preset
  float f1,f2,f3,f4,a1,a2,a3,a4;
  f1=f2=f3=f4=100.0f;a1=a2=a3=a4=0.0f;

  init_formant();
  FILE *f=fopen("sortie.pcm","wb");
  for(int ns=0;ns<10*44100;ns++)
  {
    if(0==(ns%11025)){F++;F%=8;} //formant change
    f0=12*powf(2.0f,4-4*ns/(10*44100.0f)); //sweep
    f0*=(1.0f+0.01f*sinf(ns*0.0015f));        //vibrato
    dp0=f0*(1/22050.0f);
    float un_f0=1.0f/f0;
    p0+=dp0;  //phase increment
    p0-=2*(p0>1);
    { //smoothing of the commands.
      float r=0.001f;
      f1+=r*(F1[F]-f1);f2+=r*(F2[F]-f2);f3+=r*(F3[F]-f3);f4+=r*(F4[F]-f4);
      a1+=r*(A1[F]-a1);a2+=r*(A2[F]-a2);a3+=r*(A3[F]-a3);a4+=r*(A4[F]-a4);
    }

    //The f0/fn coefficients stand for a -3dB/oct spectral enveloppe
    float out=
           a1*(f0/f1)*formant(p0,100*un_f0)*porteuse(f1*un_f0,p0)
     +0.7f*a2*(f0/f2)*formant(p0,120*un_f0)*porteuse(f2*un_f0,p0)
     +     a3*(f0/f3)*formant(p0,150*un_f0)*porteuse(f3*un_f0,p0)
     +     a4*(f0/f4)*formant(p0,300*un_f0)*porteuse(f4*un_f0,p0);

    short s=short(15000.0f*out);
    fwrite(&s,2,1,f);fwrite(&s,2,1,f); //fichier raw pcm stereo
  }
  fclose(f);
  return 0;
}


5 comment(s) | add a comment | nofrills version


Another 4-pole lowpass...

Type : 4-pole LP/HP
References : Posted by fuzzpilz [AT] gmx [DOT] net

Notes :
Vaguely based on the Stilson/Smith Moog paper, but going in a rather different direction from others I've seen here.

The parameters are peak frequency and peak magnitude (g below); both are reasonably accurate for magnitudes above 1. DC gain is 1.

The filter has some undesirable properties - e.g. it's unstable for low peak freqs if implemented in single precision (haven't been able to cleanly separate it into biquads or onepoles to see if that helps), and it responds so strongly to parameter changes that it's not advisable to update the coefficients much more rarely than, say, every eight samples during sweeps, which makes it somewhat expensive.

I like the sound, however, and the accuracy is nice to have, since many filters are not very strong in that respect.

I haven't looked at the HP again for a while, but IIRC it had approximately the same good and bad sides.


Code :
double coef[9];
double d[4];
double omega; //peak freq
double g;     //peak mag

// calculating coefficients:

double k,p,q,a;
double a0,a1,a2,a3,a4;

k=(4.0*g-3.0)/(g+1.0);
p=1.0-0.25*k;p*=p;

// LP:
a=1.0/(tan(0.5*omega)*(1.0+p));
p=1.0+a;
q=1.0-a;
        
a0=1.0/(k+p*p*p*p);
a1=4.0*(k+p*p*p*q);
a2=6.0*(k+p*p*q*q);
a3=4.0*(k+p*q*q*q);
a4=    (k+q*q*q*q);
p=a0*(k+1.0);
        
coef[0]=p;
coef[1]=4.0*p;
coef[2]=6.0*p;
coef[3]=4.0*p;
coef[4]=p;
coef[5]=-a1*a0;
coef[6]=-a2*a0;
coef[7]=-a3*a0;
coef[8]=-a4*a0;

// or HP:
a=tan(0.5*omega)/(1.0+p);
p=a+1.0;
q=a-1.0;
        
a0=1.0/(p*p*p*p+k);
a1=4.0*(p*p*p*q-k);
a2=6.0*(p*p*q*q+k);
a3=4.0*(p*q*q*q-k);
a4=    (q*q*q*q+k);
p=a0*(k+1.0);
        
coef[0]=p;
coef[1]=-4.0*p;
coef[2]=6.0*p;
coef[3]=-4.0*p;
coef[4]=p;
coef[5]=-a1*a0;
coef[6]=-a2*a0;
coef[7]=-a3*a0;
coef[8]=-a4*a0;

// per sample:

out=coef[0]*in+d[0];
d[0]=coef[1]*in+coef[5]*out+d[1];
d[1]=coef[2]*in+coef[6]*out+d[2];
d[2]=coef[3]*in+coef[7]*out+d[3];
d[3]=coef[4]*in+coef[8]*out;


2 comment(s) | add a comment | nofrills version


Another cheap sinusoidal LFO

References : Posted by info[at]e-phonic[dot]com

Notes :
Some pseudo code for a easy to calculate LFO.

You can even make a rough triangle wave out of this by substracting the output of 2 of these with different phases.

PJ



Code :
r = the rate 0..1

--------------
p += r
if(p > 1) p -= 2;
out = p*(1-abs(p));
--------------


5 comment(s) | add a comment | nofrills version


another LFO class

References : Posted by mdsp
Linked file : LFO.zip

Notes :
This LFO uses an unsigned 32-bit phase and increment whose 8 Most Significant Bits adress a Look-up table while the 24 Least Significant Bits are used as the fractionnal part.
Note: As the phase overflow automatically, the index is always in the range 0-255.

It performs linear interpolation, but it is easy to add other types of interpolation.

Don't know how good it could be as an oscillator, but I found it good enough for a LFO.
BTW there is also different kind of waveforms.

Modifications:
We could use phase on 64-bit or change the proportion of bits used by the index and the fractionnal part.




3 comment(s) | add a comment | nofrills version


Antialiased Lines

Type : A slow, ugly, and unoptimized but short method to perform antialiased lines in a framebuffer
References : Posted by arguru[AT]smartelectronix[DOT]com

Notes :
Simple code to perform antialiased lines in a 32-bit RGBA (1 byte/component) framebuffer.

pframebuffer <- unsigned char* to framebuffer bytes (important: Y flipped line order! [like in the way Win32 CreateDIBSection works...])

client_height=framebuffer height in lines
client_width=framebuffer width in pixels (not in bytes)

This doesnt perform any clip checl so it fails if coordinates are set out of bounds.

sorry for the engrish



Code :
//
// By Arguru
//
void PutTransPixel(int const x,int const y,UCHAR const r,UCHAR const g,UCHAR const b,UCHAR const a)
{
    unsigned char* ppix=pframebuffer+(x+(client_height-(y+1))*client_width)*4;
    ppix[0]=((a*b)+(255-a)*ppix[0])/256;
    ppix[1]=((a*g)+(255-a)*ppix[1])/256;
    ppix[2]=((a*r)+(255-a)*ppix[2])/256;
}

void LineAntialiased(int const x1,int const y1,int const x2,int const y2,UCHAR const r,UCHAR const g,UCHAR const b)
{
    // some useful constants first
    double const dw=x2-x1;
    double const dh=y2-y1;
    double const slx=dh/dw;
    double const sly=dw/dh;

    // determine wichever raster scanning behaviour to use
    if(fabs(slx)<1.0)
    {
        // x scan
        int tx1=x1;
        int tx2=x2;
        double raster=y1;

        if(x1>x2)
        {
            tx1=x2;
            tx2=x1;
            raster=y2;
        }

        for(int x=tx1;x<=tx2;x++)
        {
            int const ri=int(raster);

            double const in_y0=1.0-(raster-ri);
            double const in_y1=1.0-(ri+1-raster);

            PutTransPixel(x,ri+0,r,g,b,in_y0*255.0);
            PutTransPixel(x,ri+1,r,g,b,in_y1*255.0);

            raster+=slx;
        }
    }
    else
    {
        // y scan
        int ty1=y1;
        int ty2=y2;
        double raster=x1;

        if(y1>y2)
        {
            ty1=y2;
            ty2=y1;
            raster=x2;
        }

        for(int y=ty1;y<=ty2;y++)
        {
            int const ri=int(raster);

            double const in_x0=1.0-(raster-ri);
            double const in_x1=1.0-(ri+1-raster);

            PutTransPixel(ri+0,y,r,g,b,in_x0*255.0);
            PutTransPixel(ri+1,y,r,g,b,in_x1*255.0);

            raster+=sly;
        }
    }
}


7 comment(s) | add a comment | nofrills version


antialiased square generator

Type : 1st April edition
References : Posted by Paul Sernine

Notes :
It is based on a code by Thierry Rochebois, obfuscated by me.
It generates a 16bit MONO raw pcm file. Have Fun.


Code :
//sqrfish.cpp
                       #include <math.h>
                        #include <stdio.h>
                        //obfuscation P.Sernine
int main()      {float ccc,cccc=0,CC=0,cc=0,CCCC,
     CCC,C,c;    FILE *CCCCCCC=fopen("sqrfish.pcm",
      "wb"  );  int ccccc= 0; float CCCCC=6.89e-6f;
      for(int CCCCCC=0;CCCCCC<1764000;CCCCCC++   ){
      if(!(CCCCCC%7350)){if(++ccccc>=30){ ccccc =0;
      CCCCC*=2;}CCC=1;}ccc=CCCCC*expf(0.057762265f*
      "aiakahiafahadfaiakahiahafahadf"[ccccc]);CCCC
      =0.75f-1.5f*ccc;cccc+=ccc;CCC*=0.9999f;cccc-=
      2*(cccc>1);C=cccc+CCCC*CC;  c=cccc+CCCC*cc; C
      -=2*(C>1);c-=2*(c>1);C+=2*(C<-1);      c+=1+2
      *(c<-1);c-=2*(c>1);C=C*C*(2 *C*C-4);
      c=c*c*(2*c*c-4); short cccccc=short(15000.0f*
      CCC*(C-c  )*CCC);CC=0.5f*(1+C+CC);cc=0.5f*(1+
     c+cc);      fwrite(&cccccc,2,1,CCCCCCC);}
//algo by              Thierry  Rochebois
                        fclose(CCCCCCC);
                       return 0000000;}


8 comment(s) | add a comment | nofrills version


Arbitary shaped band-limited waveform generation (using oversampling and low-pass filtering)

References : Posted by remage[AT]kac[DOT]poliod[DOT]hu
Code :
Arbitary shaped band-limited waveform generation
(using oversampling and low-pass filtering)

There are many articles about band-limited waveform synthesis techniques, that provide correct and fast methods for generating classic analogue waveforms, such as saw, pulse, and triangle wave.  However, generating arbitary shaped band-limited waveforms, such as the "sawsin" shape (found in this source-code archive), seems to be quite hard using these techniques.

My analogue waveforms are generated in a _very_ high sampling rate (actually it's 1.4112 GHz for 44.1 kHz waveforms, using 32x oversampling).  Using this sample-rate, the amplitude of the aliasing harmonics are negligible (the base analogue waveforms has exponentially decreasing harmonics amplitudes).

Using a 511-tap windowed sync FIR filter (with Blackman-Harris window, and 12 kHz cutoff frequency) the harmonics above 20 kHz are killed, the higher harmonics (that cause the sharp overshoot at step response) are dampened.

The filtered signal downsampled to 44.1 kHz contains the audible (non-aliased) harmonics only.

This waveform synthesis is performed for wavetables of 4096, 2048, 1024, ... 8, 4, 2 samples.  The real-time signal is interpolated from these waveform-tables, using Hermite-(cubic-)interpolation for the waveforms, and linear interpolation between the two wavetables near the required note.

This procedure is quite time-consuming, but the whole waveform (or, in my implementation, the whole waveform-set) can be precalculated (or saved at first launch of the synth) and reloaded at synth initialization.

I don't know if this is a theoretically correct solution, but the waveforms sound good (no audible aliasing).  Please let me know if I'm wrong...


5 comment(s) | add a comment | nofrills version


Audiable alias free waveform gen using width sine

Type : Very simple
References : Posted by joakim[DOT]dahlstrom[AT]ongame[DOT]com

Notes :
Warning, my english abilities is terribly limited.

How ever, the other day when finally understanding what bandlimited wave creation is (i am a noobie, been doing DSP stuf on and off for a half/year) it hit me i can implement one little part in my synths. It's all about the freq (that i knew), very simple you can reduce alias (the alias that you can hear that is) extremely by keeping track of your frequence, the way i solved it is using a factor, afact = 1 - sin(f*2PI). This means you can do audiable alias free synthesis without very complex algorithms or very huge tables, even though the sound becomes kind of low-filtered.
Propably something like this is mentioned b4, but incase it hasn't this is worth looking up

The psuedo code describes it more.

// Druttis


Code :
f := freq factor, 0 - 0.5 (0 to half samplingrate)

afact(f) = 1 - sin(f*2PI)

t := time (0 to ...)
ph := phase shift (0 to 1)
fm := freq mod (0 to 1)

sine(t,f,ph,fm) = sin((t*f+ph)*2PI + 0.5PI*fm*afact(f))

fb := feedback (0 to 1) (1 max saw)

saw(t,f,ph,fm,fb) = sine(t,f,ph,fb*sine(t-1,f,ph,fm))

pm := pulse mod (0 to 1) (1 max pulse)
pw := pulse width (0 to 1) (1 square)

pulse(t,f,ph,fm,fb,pm,pw) = saw(t,f,ph,fm,fb) - (t,f,ph+0.5*pw,fm,fb) * pm

I am not completely sure about fm for saw & pulse since i cant test that atm. but it should work :) otherwise just make sure fm are 0 for saw & pulse.

As you can see the saw & pulse wave are very variable.

// Druttis


5 comment(s) | add a comment | nofrills version


Automatic PDC system

Type : the type that actually works, completely
References : Posted by Tebello Thejane
Linked file : pdc.pdf

Notes :
No, people, implementing PDC is actually not as difficult as you might think it is.

This paper presents a solution to the problem of latency inherent in audio effects processors, and the two appendices give examples of the method being applied on Cubase SX (with an example which its native half-baked PDC fails to solve properly) as well as a convoluted example in FL Studio (taking advantage of the flexible routing capabilities introduced in version 6 of the software). All that's necessary to understand it is a grasp of basic algebra and an intermediate understanding of how music production software works (no need to understand the Laplace transform, linear processes, sigma and integral notation... YAY!).

Please do send me any feedback (kudos, errata, flames, job offers, questions, comments) you might have - my email address is included in the paper - or simply use musicdsp.org's own commenting system.

Tebello Thejane.


Code :
(I have sent the PDF to Bram as he suggested)

3 comment(s) | add a comment | nofrills version


Band Limited PWM Generator

Type : PWM generator
References : Posted by paul_sernine75 AT hotmail DOT fr

Notes :
This is a commented and deobfuscated version of my 1st April fish. It is based on a tutorial code by Thierry Rochebois. I just translated and added comments.

Regards,

Paul Sernine.


Code :
// SelfPMpwm.cpp

// Antialised PWM oscillator

// Based on a tutorial code by Thierry Rochebois (98).
// Itself inspired by US patent 4249447 by Norio Tomisawa (81).
// Comments added/translated by P.Sernine (06).

// This program generates a 44100Hz-raw-PCM-mono-wavefile.
// It is based on Tomisawa's self-phase-modulated sinewave generators.
// Rochebois uses a common phase accumulator to feed two half-Tomisawa-
// oscillators. Each half-Tomisawa-oscillator generates a bandlimited
// sawtooth (band limitation depending on the feedback coeff B).
// These half oscillators are phase offseted according to the desired
// pulse width. They are finally combined to obtain the PW signal.
// Note: the anti-"hunting" filter is a critical feature of a good
// implementation of Tomisawa's method.
#include <math.h>
#include <stdio.h>
const float pi=3.14159265359f;
int main()
{
  float freq,dphi; //!< frequency (Hz) and phase increment(rad/sample)
  float dphif=0;   //!< filtered (anti click) phase increment
  float phi=-pi;   //!< phase
  float Y0=0,Y1=0; //!< feedback memories
  float PW=pi;     //!< pulse width ]0,2pi[
  float B=2.3f;    //!< feedback coef
  FILE *f=fopen("SelfPMpwm.pcm","wb");
  // séquence ('a'=mi=E)
  // you can edit this if you prefer another melody.
  static char seq[]="aiakahiafahadfaiakahiahafahadf"; //!< sequence
  int note=sizeof(seq)-2;  //!< note number in the sequence
  int octave=0;     //!< octave number
  float env,envf=0; //!< envelopped and filtered envelopped
  for(int ns=0;ns<8*(sizeof(seq)-1)*44100/6;ns++)
  {
//waveform control --------------------------------------------------
    //Frequency
    //freq=27.5f*powf(2.0f,8*ns/(8*30*44100.0f/6)); //sweep
    freq=27.5f*powf(2.0f,octave+(seq[note]-'a'-5)/12.0f);
    //freq*=(1.0f+0.01f*sinf(ns*0.0015f));        //vibrato
    dphi=freq*(pi/22050.0f);   // phase increment
    dphif+=0.1f*(dphi-dphif);
    //notes and enveloppe trigger
    if((ns%(44100/6))==0)
    {
      note++;
      if(note>=(sizeof(seq)-1))// sequence loop
      {
        note=0;
        octave++;
      }
      env=1; //env set
      //PW=pi*(0.4+0.5f*(rand()%1000)/1000.0f); //random PW
    }
    env*=0.9998f;              // exp enveloppe
    envf+=0.1f*(env-envf);     // de-clicked enveloppe
    B=1.0f;                    // feedback coefficient
    //try this for a nice bass sound:
    //B*=envf*envf;              // feedback controlled by enveloppe
    B*=2.3f*(1-0.0001f*freq);  // feedback limitation
    if(B<0)
      B=0;

//waveform generation -----------------------------------------------
    //Common phase
    phi+=dphif;                 // phase increment
    if(phi>=pi)
      phi-=2*pi;               // phase wrapping

    // "phase"    half Tomisawa generator 0
    // B*Y0 -> self phase modulation
    float out0=cosf(phi+B*Y0); // half-output 0
    Y0=0.5f*(out0+Y0);         // anti "hunting" filter

    // "phase+PW" half Tomisawa generator 1
    // B*Y1 -> self phase modulation
    // PW   -> phase offset
    float out1=cosf(phi+B*Y1+PW); // half-output 1
    Y1=0.5f*(out1+Y1);            // anti "hunting" filter

    // combination, enveloppe and output
    short s=short(15000.0f*(out0-out1)*envf);
    fwrite(&s,2,1,f);          // file output
  }
  fclose(f);
  return 0;
}


9 comment(s) | add a comment | nofrills version


Band Limited waveforms my way

Type : classic Sawtooth example
References : Posted by Anton Savov (antto mail bg)

Notes :
This is my <ugly> C++ code for generating a single cycle of a Sawtooth in a table
normaly i create my "fundamental" table big enough to hold on around 20-40Hz in the current Sampling rate
also, i create the table twice as big, i do "mip-maps" then
so the size should be a power of two, say 1024 for 44100Hz = 44100/1024 = ~43.066Hz
then the mip-maps are with decreasing sizes (twice) 512, 256, 128, 64, 32, 16, 8, 4, and 2

if the "gibbs" effect is what i think it is - then i have a simple solution
here is my crappy code:


Code :
int sz = 1024; // the size of the table
int i = 0;
float *table; // pointer to the table
double scale = 1.0;
double pd; // phase
double omega = 1.0 / (double)(sz);

while (i < sz)
{
    double amp = scale;
    double x = 0.0; // the sample
    double h = 1; // harmonic number (starts from 1)
    double dd; // fix high frequency "ring"
    pd = (double)(i) / (double)(sz); // calc phase
    double hpd = pd; // phase of the harmonic
    while (true) // start looping for this sample
    {
        if ((omega * h) < 0.5) // harmonic frequency is in range?
        {
            dd = cos(omega * h * 2 * pi);
            x = x + (amp * dd * sin(hpd * 2 * pi));
            h = h + 1;
            hpd = pd * h;
            amp = 1.0 / h;
        }
        else { break; }
    }
    table[i] = x;
    ++i;
}

the peaks are around +/- 0.8
a square can be generated by just changing h = h+2; the peaks would be +/- 0.4

any bugs/improvements?


3 comment(s) | add a comment | nofrills version


Bandlimited sawtooth synthesis

Type : DSF BLIT
References : Posted by emanuel.landeholm [AT] telia.com
Linked file : synthesis002.txt

Notes :
This is working code for synthesizing a bandlimited sawtooth waveform. The algorithm is DSF BLIT + leaky integrator. Includes driver code.

There are two parameters you may tweak:

1) Desired attenuation at nyquist. A low value yields a duller sawtooth but gets rid of those annoying CLICKS when sweeping the frequency up real high. Must be strictly less than 1.0!

2) Integrator leakiness/cut off. Affects the shape of the waveform to some extent, esp. at the low end. Ideally you would want to set this low, but too low a setting will give you problems with DC.

Have fun!
/Emanuel Landeholm

(see linked file)




2 comment(s) | add a comment | nofrills version


Bandlimited waveform generation

Type : waveform generation
References : Posted by Joe Wright
Linked file : bandlimited.cpp
Linked file : bandlimited.pdf

Notes :
(see linkfile)



1 comment(s) | add a comment | nofrills version


Bandlimited waveform generation with hard sync

References : Posted by Emanuel Landeholm
Linked file : http://www.algonet.se/~e-san/hardsync.tar.gz


4 comment(s) | add a comment | nofrills version


Bandlimited waveforms synopsis.

References : Joe Wright
Linked file : waveforms.txt

Notes :
(see linkfile)



2 comment(s) | add a comment | nofrills version


Bandlimited waveforms...

References : Posted by Paul Kellet

Notes :
(Quoted from Paul's mail)
Below is another waveform generation method based on a train of sinc functions (actually an alternating loop along a sinc between t=0 and t=period/2).

The code integrates the pulse train with a dc offset to get a sawtooth, but other shapes can be made in the usual ways... Note that 'dc' and 'leak' may need to be adjusted for very high or low frequencies.

I don't know how original it is (I ought to read more) but it is of usable quality, particularly at low frequencies. There's some scope for optimisation by using a table for sinc, or maybe a a truncated/windowed sinc?

I think it should be possible to minimise the aliasing by fine tuning 'dp' to slightly less than 1 so the sincs join together neatly, but I haven't found the best way to do it. Any comments gratefully received.


Code :
float p=0.0f;      //current position
float dp=1.0f;     //change in postion per sample
float pmax;        //maximum position
float x;           //position in sinc function
float leak=0.995f; //leaky integrator
float dc;          //dc offset
float saw;         //output


//set frequency...

  pmax = 0.5f * getSampleRate() / freqHz;
  dc = -0.498f/pmax;


//for each sample...

  p += dp;
  if(p < 0.0f)
  {
    p = -p;
    dp = -dp;
  }
  else if(p > pmax)
  {
    p = pmax + pmax - p;
    dp = -dp;
  }

  x= pi * p;
  if(x < 0.00001f)
     x=0.00001f; //don't divide by 0

  saw = leak*saw + dc + (float)sin(x)/(x);


3 comment(s) | add a comment | nofrills version


Base-2 exp

References : Posted by Laurent de Soras

Notes :
Linear approx. between 2 integer values of val. Uses 32-bit integers. Not very efficient but fastest than exp()
This code was designed for x86 (little endian), but could be adapted for big endian processors.
Laurent thinks you just have to change the (*(1 + (int *) &ret)) expressions and replace it by (*(int *) &ret). However, He didn't test it.


Code :
inline double fast_exp2 (const double val)
{
   int    e;
   double ret;

   if (val >= 0)
   {
      e = int (val);
      ret = val - (e - 1);
      ((*(1 + (int *) &ret)) &= ~(2047 << 20)) += (e + 1023) << 20;
   }
   else
   {
      e = int (val + 1023);
      ret = val - (e - 1024);
      ((*(1 + (int *) &ret)) &= ~(2047 << 20)) += e << 20;
   }
   return (ret);
}


1 comment(s) | add a comment | nofrills version


Bass Booster

Type : LP and SUM
References : Posted by Johny Dupej

Notes :
This function adds a low-passed signal to the original signal. The low-pass has a quite wide response.

Params:
selectivity - frequency response of the LP (higher value gives a steeper one) [70.0 to 140.0 sounds good]
ratio - how much of the filtered signal is mixed to the original
gain2 - adjusts the final volume to handle cut-offs (might be good to set dynamically)


Code :
#define saturate(x) __min(__max(-1.0,x),1.0)

float BassBoosta(float sample)
{
static float selectivity, gain1, gain2, ratio, cap;
gain1 = 1.0/(selectivity + 1.0);

cap= (sample + cap*selectivity )*gain1;
sample = saturate((sample + cap*ratio)*gain2);

return sample;
}


2 comment(s) | add a comment | nofrills version


Beat Detector Class

References : Posted by DSPMaster[at]free[dot]fr

Notes :
This class was designed for a VST plugin. Basically, it's just a 2nd order LP filter, followed by an enveloppe detector (thanks Bram), feeding a Schmitt trigger. The rising edge detector provides a 1-sample pulse each time a beat is detected. Code is self documented...
Note : The class uses a fixed comparison level, you may need to change it.


Code :
// ***** BEATDETECTOR.H *****
#ifndef BeatDetectorH
#define BeatDetectorH

class TBeatDetector
{
private:
  float KBeatFilter;        // Filter coefficient
  float Filter1Out, Filter2Out;
  float BeatRelease;        // Release time coefficient
  float PeakEnv;            // Peak enveloppe follower
  bool BeatTrigger;         // Schmitt trigger output
  bool PrevBeatPulse;       // Rising edge memory
public:
  bool BeatPulse;           // Beat detector output

  TBeatDetector();
  ~TBeatDetector();
  virtual void setSampleRate(float SampleRate);
  virtual void AudioProcess (float input);
};
#endif


// ***** BEATDETECTOR.CPP *****
#include "BeatDetector.h"
#include "math.h"

#define FREQ_LP_BEAT 150.0f    // Low Pass filter frequency
#define T_FILTER 1.0f/(2.0f*M_PI*FREQ_LP_BEAT)  // Low Pass filter time constant
#define BEAT_RTIME 0.02f  // Release time of enveloppe detector in second

TBeatDetector::TBeatDetector()
// Beat detector constructor
{
  Filter1Out=0.0;
  Filter2Out=0.0;
  PeakEnv=0.0;
  BeatTrigger=false;
  PrevBeatPulse=false;
  setSampleRate(44100);
}

TBeatDetector::~TBeatDetector()
{
  // Nothing specific to do...
}

void TBeatDetector::setSampleRate (float sampleRate)
// Compute all sample frequency related coeffs
{
  KBeatFilter=1.0/(sampleRate*T_FILTER);
  BeatRelease=(float)exp(-1.0f/(sampleRate*BEAT_RTIME));
}

void TBeatDetector::AudioProcess (float input)
// Process incoming signal
{
  float EnvIn;

  // Step 1 : 2nd order low pass filter (made of two 1st order RC filter)
  Filter1Out=Filter1Out+(KBeatFilter*(input-Filter1Out));
  Filter2Out=Filter2Out+(KBeatFilter*(Filter1Out-Filter2Out));

  // Step 2 : peak detector
  EnvIn=fabs(Filter2Out);
  if (EnvIn>PeakEnv) PeakEnv=EnvIn;  // Attack time = 0
  else
  {
    PeakEnv*=BeatRelease;
    PeakEnv+=(1.0f-BeatRelease)*EnvIn;
  }

  // Step 3 : Schmitt trigger
  if (!BeatTrigger)
  {
    if (PeakEnv>0.3) BeatTrigger=true;
  }
  else
  {
    if (PeakEnv<0.15) BeatTrigger=false;
  }

  // Step 4 : rising edge detector
  BeatPulse=false;
  if ((BeatTrigger)&&(!PrevBeatPulse))
    BeatPulse=true;
  PrevBeatPulse=BeatTrigger;
}


11 comment(s) | add a comment | nofrills version


Biquad C code

References : Posted by Tom St Denis
Linked file : biquad.c

Notes :
Implementation of the RBJ cookbook, in C.



1 comment(s) | add a comment | nofrills version


Biquad, Butterworth, Chebyshev N-order, M-channel optimized filters

Type : LP, HP, BP, BS, Shelf, Notch, Boost
References : Posted by thevinn at yahoo dot com
Code :
/*

"A Collection of Useful C++ Classes for Digital Signal Processing"
By Vincent Falco

Please direct all comments to either the music-dsp mailing list or
the DSP and Plug-in Development forum:

    http://music.columbia.edu/cmc/music-dsp/
    
    http://www.kvraudio.com/forum/viewforum.php?f=33
    http://www.kvraudio.com/forum/

Support is provided for performing N-order Dsp floating point filter
operations on M-channel data with a caller specified floating point type.
The implementation breaks a high order IIR filter down into a series of
cascaded second order stages. Tests conclude that numerical stability is
maintained even at higher orders. For example the Butterworth low pass
filter is stable at up to 53 poles.

Processing functions are provided to use either Direct Form I or Direct
Form II of the filter transfer function. Direct Form II is slightly faster
but can cause discontinuities in the output if filter parameters are changed
during processing. Direct Form I is slightly slower, but maintains fidelity
even when parameters are changed during processing.

To support fast parameter changes, filters provide two functions for
adjusting parameters. A high accuracy Setup() function, and a faster
form called SetupFast() that uses approximations for trigonometric
functions. The approximations work quite well and should be suitable for
most applications.

Channels are stored in an interleaved format with M samples per frame
arranged contiguously. A single class instance can process all M channels
simultaneously in an efficient manner. A 'skip' parameter causes the
processing function to advance by skip additional samples in the destination
buffer in between every frame. Through manipulation of the skip paramter it
is possible to exclude channels from processing (for example, only processing
the left half of stereo interleaved data). For multichannel data which is
not interleaved, it will be necessary to instantiate multiple instance of
the filter and set skip=0.

There are a few other utility classes and functions included that may prove useful.

Classes:

Complex
CascadeStages
    Biquad
        BiquadLowPass
        BiquadHighPass
        BiquadBandPass1
        BiquadBandPass2
        BiquadBandStop
        BiquadAllPass
        BiquadPeakEq
        BiquadLowShelf
        BiquadHighShelf
    PoleFilter
        Butterworth
            ButterLowPass
            ButterHighPass
            ButterBandPass
            ButterBandStop
        Chebyshev1
            Cheby1LowPass
            Cheby1HighPass
            Cheby1BandPass
            Cheby1BandStop
        Chebyshev2
            Cheby2LowPass
            Cheby2HighPass
            Cheby2BandPass
            Cheby2BandStop
EnvelopeFollower
AutoLimiter

Functions:

    zero()
    copy()
    mix()
    scale()

    interleave()
    deinterleave()

Order for PoleFilter derived classes is specified in the number of poles,
except for band pass and band stop filters, for which the number of pole pairs
is specified.

For some filters there are two versions of Setup(), the one called
SetupFast() uses approximations to trigonometric functions for speed.
This is an option if you are doing frequent parameter changes to the filter.

There is an example function at the bottom that shows how to use the classes.

Filter ideas are based on a java applet (http://www.falstad.com/dfilter/)
developed by Paul Falstad.

All of this code was written by the author Vincent Falco except where marked.

--------------------------------------------------------------------------------

License: MIT License (http://www.opensource.org/licenses/mit-license.php)
Copyright (c) 2009 by Vincent Falco

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in
all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
THE SOFTWARE.
*
*/
/*
    To Do:

    - Shelving, peak, all-pass for Butterworth, Chebyshev, and Elliptic.

        The Biquads have these versions and I would like the others to have them
        as well. It would also be super awesome if higher order filters could
        have a "Q" parameter for resonance but I'm not expecting miracles, it
        would require redistributing the poles and zeroes. But if there's
        a research paper or code out there...I could incorporate it.

    - Denormal testing and fixing

        I'd like to know if denormals are a problem. And if so, it would be nice
        to have a small function that can reproduce the denormal problem. This
        way I can test the fix under all conditions. I will include the function
        as a "unit test" object in the header file so anyone can verify its
        correctness. But I'm a little lost.

    - Optimized template specializations for stages=1, channels={1,2}

        There are some pretty obvious optimizations I am saving for "the end".
        I don't want to do them until the code is finalized.

    - Optimized template specializations for SSE, other special instructions

    - Optimized trigonometric functions for fast parameter changes

    - Elliptic curve based filter coefficients

    - More utility functions for manipulating sample buffers

    - Need fast version of pow( 10, x )
*/

#ifndef __DSP_FILTER__
#define __DSP_FILTER__

#include <cmath>
#include <cfloat>
#include <assert.h>
#include <memory.h>
#include <stdlib.h>

//#define DSP_USE_STD_COMPLEX

#ifdef DSP_USE_STD_COMPLEX
#include <complex>
#endif

#define DSP_SSE3_OPTIMIZED

#ifdef DSP_SSE3_OPTIMIZED
    //#include <xmmintrin.h>
    //#include <emmintrin.h>
    #include <pmmintrin.h>
#endif

namespace Dsp
{
    //--------------------------------------------------------------------------
    // WARNING: Here there be templates
    //--------------------------------------------------------------------------

    //--------------------------------------------------------------------------
    //
    //    Configuration
    //
    //--------------------------------------------------------------------------

    // Regardless of the type of sample that the filter operates on (e.g.
    // float or double), all calculations are performed using double (or
    // better) for stability and accuracy. This controls the underlying
    // type used for calculations:
    typedef double CalcT;

    typedef int        Int32;    // Must be 32 bits
    typedef __int64 Int64;    // Must be 64 bits

    // This is used to prevent denormalization.
    const CalcT vsa=1.0 / 4294967295.0; // for CalcT as float

    // These constants are so important, I made my own copy. If you improve
    // the resolution of CalcT be sure to add more significant digits to these.
    const CalcT kPi        =3.1415926535897932384626433832795;
    const CalcT kPi_2    =1.57079632679489661923;
    const CalcT kLn2    =0.693147180559945309417;
    const CalcT kLn10    =2.30258509299404568402;

    //--------------------------------------------------------------------------

    // Some functions missing from <math.h>
    template<typename T>
    inline T acosh( T x )
    {
        return log( x+::sqrt(x*x-1) );
    }

    //--------------------------------------------------------------------------
    //
    //    Fast Trigonometric Functions
    //
    //--------------------------------------------------------------------------

    // Three approximations for both sine and cosine at a given angle.
    // The faster the routine, the larger the error.
    // From http://lab.polygonal.de/2007/07/18/fast-and-accurate-sinecosine-approximation/

    // Tuned for maximum pole stability. r must be in the range 0..pi
    // This one breaks down considerably at the higher angles. It is
    // included only for educational purposes.
    inline void fastestsincos( CalcT r, CalcT *sn, CalcT *cs )
    {
        const CalcT c=0.70710678118654752440; // sqrt(2)/2
        CalcT v=(2-4*c)*r*r+c;
        if(r<kPi_2)
        {
            *sn=v+r; *cs=v-r;
        }
        else
        {
            *sn=r+v; *cs=r-v;
        }
    }

    // Lower precision than ::fastsincos() but still decent
    inline void fastersincos( CalcT x, CalcT *sn, CalcT *cs )
    {
        //always wrap input angle to -PI..PI
        if        (x < -kPi) x += 2*kPi;
        else if (x >  kPi) x -= 2*kPi;
        //compute sine
        if (x < 0)    *sn = 1.27323954 * x + 0.405284735 * x * x;
        else        *sn = 1.27323954 * x - 0.405284735 * x * x;
        //compute cosine: sin(x + PI/2) = cos(x)
        x += kPi_2;
        if (x > kPi ) x -= 2*kPi;
        if (x < 0)    *cs = 1.27323954 * x + 0.405284735 * x * x;
        else        *cs = 1.27323954 * x - 0.405284735 * x * x;
    }

    // Slower than ::fastersincos() but still faster than
    // sin(), and with the best accuracy of these routines.
    inline void fastsincos( CalcT x, CalcT *sn, CalcT *cs )
    {
        CalcT s, c;
        //always wrap input angle to -PI..PI
             if (x < -kPi) x += 2*kPi;
        else if (x >  kPi) x -= 2*kPi;
        //compute sine
        if (x < 0)
        {
            s = 1.27323954 * x + .405284735 * x * x;
            if (s < 0)    s = .225 * (s * -s - s) + s;
            else        s = .225 * (s *  s - s) + s;
        }
        else
        {
            s = 1.27323954 * x - 0.405284735 * x * x;
            if (s < 0)    s = .225 * (s * -s - s) + s;
            else        s = .225 * (s *  s - s) + s;
        }
        *sn=s;
        //compute cosine: sin(x + PI/2) = cos(x)
        x += kPi_2;
        if (x > kPi ) x -= 2*kPi;
        if (x < 0)
        {
            c = 1.27323954 * x + 0.405284735 * x * x;
            if (c < 0)    c = .225 * (c * -c - c) + c;
            else        c = .225 * (c *  c - c) + c;
        }
        else
        {
            c = 1.27323954 * x - 0.405284735 * x * x;
            if (c < 0)    c = .225 * (c * -c - c) + c;
            else        c = .225 * (c *  c - c) + c;
        }
        *cs=c;
    }

    // Faster approximations to sqrt()
    //    From http://ilab.usc.edu/wiki/index.php/Fast_Square_Root
    //    The faster the routine, the more error in the approximation.

    // Log Base 2 Approximation
    // 5 times faster than sqrt()

    inline float fastsqrt1( float x )  
    {
        union { Int32 i; float x; } u;
        u.x = x;
        u.i = (Int32(1)<<29) + (u.i >> 1) - (Int32(1)<<22);
        return u.x;
    }

    inline double fastsqrt1( double x )  
    {
        union { Int64 i; double x; } u;
        u.x = x;
        u.i = (Int64(1)<<61) + (u.i >> 1) - (Int64(1)<<51);
        return u.x;
    }

    // Log Base 2 Approximation with one extra Babylonian Step
    // 2 times faster than sqrt()

    inline float fastsqrt2( float x )  
    {
        float v=fastsqrt1( x );
        v = 0.5f * (v + x/v); // One Babylonian step
        return v;
    }

    inline double fastsqrt2(const double x)  
    {
        double v=fastsqrt1( x );
        v = 0.5f * (v + x/v); // One Babylonian step
        return v;
    }

    // Log Base 2 Approximation with two extra Babylonian Steps
    // 50% faster than sqrt()

    inline float fastsqrt3( float x )  
    {
        float v=fastsqrt1( x );
        v =           v + x/v;
        v = 0.25f* v + x/v; // Two Babylonian steps
        return v;
    }

    inline double fastsqrt3(const double x)  
    {
        double v=fastsqrt1( x );
        v =           v + x/v;
        v = 0.25 * v + x/v; // Two Babylonian steps
        return v;
    }

    //--------------------------------------------------------------------------
    //
    //    Complex
    //
    //--------------------------------------------------------------------------

#ifdef DSP_USE_STD_COMPLEX

    template<typename T>
    inline std::complex<T> polar( const T &m, const T &a )
    {
        return std::polar( m, a );
    }

    template<typename T>
    inline T norm( const std::complex<T> &c )
    {
        return std::norm( c );
    }

    template<typename T>
    inline T abs( const std::complex<T> &c )
    {
        return std::abs(c);
    }

    template<typename T, typename To>
    inline std::complex<T> addmul( const std::complex<T> &c, T v, const std::complex<To> &c1 )
    {
        return std::complex<T>( c.real()+v*c1.real(), c.imag()+v*c1.imag() );
    }

    template<typename T>
    inline T arg( const std::complex<T> &c )
    {
        return std::arg( c );
    }

    template<typename T>
    inline std::complex<T> recip( const std::complex<T> &c )
    {
        T n=1.0/Dsp::norm(c);
        return std::complex<T>( n*c.real(), n*c.imag() );
    }
    template<typename T>
    inline std::complex<T> sqrt( const std::complex<T> &c )
    {
        return std::sqrt( c );
    }

    typedef std::complex<CalcT> Complex;

#else

    //--------------------------------------------------------------------------

    // "Its always good to have a few extra wheels in case one goes flat."

    template<typename T>
    struct ComplexT
    {
        ComplexT();
        ComplexT( T r_, T i_=0 );

        template<typename To>
        ComplexT( const ComplexT<To> &c );

        T            imag        ( void ) const;
        T            real        ( void ) const;

        ComplexT &    neg            ( void );
        ComplexT &    conj        ( void );

        template<typename To>
        ComplexT &    add            ( const ComplexT<To> &c );
        template<typename To>
        ComplexT &    sub            ( const ComplexT<To> &c );
        template<typename To>
        ComplexT &    mul            ( const ComplexT<To> &c );
        template<typename To>
        ComplexT &    div            ( const ComplexT<To> &c );
        template<typename To>
        ComplexT &    addmul        ( T v, const ComplexT<To> &c );

        ComplexT    operator-    ( void ) const;

        ComplexT    operator+    ( T v ) const;
        ComplexT    operator-    ( T v ) const;
        ComplexT    operator*    ( T v ) const;
        ComplexT    operator/    ( T v ) const;

        ComplexT &    operator+=    ( T v );
        ComplexT &    operator-=    ( T v );
        ComplexT &    operator*=    ( T v );
        ComplexT &    operator/=    ( T v );

        template<typename To>
        ComplexT    operator+    ( const ComplexT<To> &c ) const;
        template<typename To>
        ComplexT    operator-    ( const ComplexT<To> &c ) const;
        template<typename To>
        ComplexT    operator*    ( const ComplexT<To> &c ) const;
        template<typename To>
        ComplexT    operator/    ( const ComplexT<To> &c ) const;

        template<typename To>
        ComplexT &    operator+=    ( const ComplexT<To> &c );
        template<typename To>
        ComplexT &    operator-=    ( const ComplexT<To> &c );
        template<typename To>
        ComplexT &    operator*=    ( const ComplexT<To> &c );
        template<typename To>
        ComplexT &    operator/=    ( const ComplexT<To> &c );

    private:
        ComplexT &    add            ( T v );
        ComplexT &    sub            ( T v );
        ComplexT &    mul            ( T c, T d );
        ComplexT &    mul            ( T v );
        ComplexT &    div            ( T v );

        T r;
        T i;
    };

    //--------------------------------------------------------------------------

    template<typename T>
    inline ComplexT<T>::ComplexT()
    {
    }

    template<typename T>
    inline ComplexT<T>::ComplexT( T r_, T i_ )
    {
        r=r_;
        i=i_;
    }

    template<typename T>
    template<typename To>
    inline ComplexT<T>::ComplexT( const ComplexT<To> &c )
    {
        r=c.r;
        i=c.i;
    }

    template<typename T>
    inline T ComplexT<T>::imag( void ) const
    {
        return i;
    }

    template<typename T>
    inline T ComplexT<T>::real( void ) const
    {
        return r;
    }

    template<typename T>
    inline ComplexT<T> &ComplexT<T>::neg( void )
    {
        r=-r;
        i=-i;
        return *this;
    }

    template<typename T>
    inline ComplexT<T> &ComplexT<T>::conj( void )
    {
        i=-i;
        return *this;
    }

    template<typename T>
    inline ComplexT<T> &ComplexT<T>::add( T v )
    {
        r+=v;
        return *this;
    }

    template<typename T>
    inline ComplexT<T> &ComplexT<T>::sub( T v )
    {
        r-=v;
        return *this;
    }

    template<typename T>
    inline ComplexT<T> &ComplexT<T>::mul( T c, T d )
    {
        T ac=r*c;
        T bd=i*d;
        // must do i first
        i=(r+i)*(c+d)-(ac+bd);
        r=ac-bd;
        return *this;
    }

    template<typename T>
    inline ComplexT<T> &ComplexT<T>::mul( T v )
    {
        r*=v;
        i*=v;
        return *this;
    }

    template<typename T>
    inline ComplexT<T> &ComplexT<T>::div( T v )
    {
        r/=v;
        i/=v;
        return *this;
    }

    template<typename T>
    template<typename To>
    inline ComplexT<T> &ComplexT<T>::add( const ComplexT<To> &c )
    {
        r+=c.r;
        i+=c.i;
        return *this;
    }

    template<typename T>
    template<typename To>
    inline ComplexT<T> &ComplexT<T>::sub( const ComplexT<To> &c )
    {
        r-=c.r;
        i-=c.i;
        return *this;
    }

    template<typename T>
    template<typename To>
    inline ComplexT<T> &ComplexT<T>::mul( const ComplexT<To> &c )
    {
        return mul( c.r, c.i );
    }

    template<typename T>
    template<typename To>
    inline ComplexT<T> &ComplexT<T>::div( const ComplexT<To> &c )
    {
        T s=1.0/norm(c);
        return mul( c.r*s, -c.i*s );
    }

    template<typename T>
    inline ComplexT<T> ComplexT<T>::operator-( void ) const
    {
        return ComplexT<T>(*this).neg();
    }

    template<typename T>
    inline ComplexT<T> ComplexT<T>::operator+( T v ) const
    {
        return ComplexT<T>(*this).add( v );
    }

    template<typename T>
    inline ComplexT<T> ComplexT<T>::operator-( T v ) const
    {
        return ComplexT<T>(*this).sub( v );
    }

    template<typename T>
    inline ComplexT<T> ComplexT<T>::operator*( T v ) const
    {
        return ComplexT<T>(*this).mul( v );
    }

    template<typename T>
    inline ComplexT<T> ComplexT<T>::operator/( T v ) const
    {
        return ComplexT<T>(*this).div( v );
    }

    template<typename T>
    inline ComplexT<T> &ComplexT<T>::operator+=( T v )
    {
        return add( v );
    }

    template<typename T>
    inline ComplexT<T> &ComplexT<T>::operator-=( T v )
    {
        return sub( v );
    }

    template<typename T>
    inline ComplexT<T> &ComplexT<T>::operator*=( T v )
    {
        return mul( v );
    }

    template<typename T>
    inline ComplexT<T> &ComplexT<T>::operator/=( T v )
    {
        return div( v );
    }

    template<typename T>
    template<typename To>
    inline ComplexT<T> ComplexT<T>::operator+( const ComplexT<To> &c ) const
    {
        return ComplexT<T>(*this).add(c);
    }

    template<typename T>
    template<typename To>
    inline ComplexT<T> ComplexT<T>::operator-( const ComplexT<To> &c ) const
    {
        return ComplexT<T>(*this).sub(c);
    }

    template<typename T>
    template<typename To>
    inline ComplexT<T> ComplexT<T>::operator*( const ComplexT<To> &c ) const
    {
        return ComplexT<T>(*this).mul(c);
    }

    template<typename T>
    template<typename To>
    inline ComplexT<T> ComplexT<T>::operator/( const ComplexT<To> &c ) const
    {
        return ComplexT<T>(*this).div(c);
    }

    template<typename T>
    template<typename To>
    inline ComplexT<T> &ComplexT<T>::operator+=( const ComplexT<To> &c )
    {
        return add( c );
    }

    template<typename T>
    template<typename To>
    inline ComplexT<T> &ComplexT<T>::operator-=( const ComplexT<To> &c )
    {
        return sub( c );
    }

    template<typename T>
    template<typename To>
    inline ComplexT<T> &ComplexT<T>::operator*=( const ComplexT<To> &c )
    {
        return mul( c );
    }

    template<typename T>
    template<typename To>
    inline ComplexT<T> &ComplexT<T>::operator/=( const ComplexT<To> &c )
    {
        return div( c );
    }

    //--------------------------------------------------------------------------

    template<typename T>
    inline ComplexT<T> polar( const T &m, const T &a )
    {
        return ComplexT<T>( m*cos(a), m*sin(a) );
    }

    template<typename T>
    inline T norm( const ComplexT<T> &c )
    {
        return c.real()*c.real()+c.imag()*c.imag();
    }

    template<typename T>
    inline T abs( const ComplexT<T> &c )
    {
        return ::sqrt( c.real()*c.real()+c.imag()*c.imag() );
    }

    template<typename T, typename To>
    inline ComplexT<T> addmul( const ComplexT<T> &c, T v, const ComplexT<To> &c1 )
    {
        return ComplexT<T>( c.real()+v*c1.real(), c.imag()+v*c1.imag() );
    }

    template<typename T>
    inline T arg( const ComplexT<T> &c )
    {
        return atan2( c.imag(), c.real() );
    }

    template<typename T>
    inline ComplexT<T> recip( const ComplexT<T> &c )
    {
        T n=1.0/norm(c);
        return ComplexT<T>( n*c.real(), -n*c.imag() );
    }

    template<typename T>
    inline ComplexT<T> sqrt( const ComplexT<T> &c )
    {
        return polar( ::sqrt(abs(c)), arg(c)*0.5 );
    }

    //--------------------------------------------------------------------------

    typedef ComplexT<CalcT> Complex;

#endif

    //--------------------------------------------------------------------------
    //
    //    Numerical Analysis
    //
    //--------------------------------------------------------------------------

    // Implementation of Brent's Method provided by
    // John D. Cook (http://www.johndcook.com/)

    // The return value of Minimize is the minimum of the function f.
    // The location where f takes its minimum is returned in the variable minLoc.
    // Notation and implementation based on Chapter 5 of Richard Brent's book
    // "Algorithms for Minimization Without Derivatives".

    template<class TFunction>
    CalcT BrentMinimize
    (
        TFunction& f,    // [in] objective function to minimize
        CalcT leftEnd,    // [in] smaller value of bracketing interval
        CalcT rightEnd,    // [in] larger value of bracketing interval
        CalcT epsilon,    // [in] stopping tolerance
        CalcT& minLoc    // [out] location of minimum
    )
    {
        CalcT d, e, m, p, q, r, tol, t2, u, v, w, fu, fv, fw, fx;
        static const CalcT c = 0.5*(3.0 - ::sqrt(5.0));
        static const CalcT SQRT_DBL_EPSILON = ::sqrt(DBL_EPSILON);

        CalcT& a = leftEnd; CalcT& b = rightEnd; CalcT& x = minLoc;

        v = w = x = a + c*(b - a); d = e = 0.0;
        fv = fw = fx = f(x);
        int counter = 0;
    loop:
        counter++;
        m = 0.5*(a + b);
        tol = SQRT_DBL_EPSILON*::fabs(x) + epsilon; t2 = 2.0*tol;
        // Check stopping criteria
        if (::fabs(x - m) > t2 - 0.5*(b - a))
        {
            p = q = r = 0.0;
            if (::fabs(e) > tol)
            {
                // fit parabola
                r = (x - w)*(fx - fv);
                q = (x - v)*(fx - fw);
                p = (x - v)*q - (x - w)*r;
                q = 2.0*(q - r);
                (q > 0.0) ? p = -p : q = -q;
                r = e; e = d;
            }
            if (::fabs(p) < ::fabs(0.5*q*r) && p < q*(a - x) && p < q*(b - x))
            {
                // A parabolic interpolation step
                d = p/q;
                u = x + d;
                // f must not be evaluated too close to a or b
                if (u - a < t2 || b - u < t2)
                    d = (x < m) ? tol : -tol;
            }
            else
            {
                // A golden section step
                e = (x < m) ? b : a;
                e -= x;
                d = c*e;
            }
            // f must not be evaluated too close to x
            if (::fabs(d) >= tol)
                u = x + d;
            else if (d > 0.0)
                u = x + tol;
            else
                u = x - tol;
            fu = f(u);
            // Update a, b, v, w, and x
            if (fu <= fx)
            {
                (u < x) ? b = x : a = x;
                v = w; fv = fw;
                w = x; fw = fx;
                x = u; fx = fu;
            }
            else
            {
                (u < x) ? a = u : b = u;
                if (fu <= fw || w == x)
                {
                    v = w; fv = fw;
                    w = u; fw = fu;
                }
                else if (fu <= fv || v == x || v == w)
                {
                    v = u; fv = fu;
                }
            }
            goto loop;  // Yes, the dreaded goto statement. But the code
                        // here is faithful to Brent's orginal pseudocode.
        }
        return  fx;
    }

    //--------------------------------------------------------------------------
    //
    //    Infinite Impulse Response Filters
    //
    //--------------------------------------------------------------------------

    // IIR filter implementation using multiple second-order stages.

    class CascadeFilter
    {
    public:
        // Process data in place using Direct Form I
        // skip is added after each frame.
        // Direct Form I is more suitable when the filter parameters
        // are changed often. However, it is slightly slower.
        template<typename T>
        void ProcessI( size_t frames, T *dest, int skip=0 );

        // Process data in place using Direct Form II
        // skip is added after each frame.
        // Direct Form II is slightly faster than Direct Form I,
        // but changing filter parameters on stream can result
        // in discontinuities in the output. It is best suited
        // for a filter whose parameters are set only once.
        template<typename T>
        void ProcessII( size_t frames, T *dest, int skip=0 );

        // Convenience function that just calls ProcessI.
        // Feel free to change the implementation.
        template<typename T>
        void Process( size_t frames, T *dest, int skip=0 );

        // Determine response at angular frequency (0<=w<=kPi)
        Complex    Response( CalcT w );

        // Clear the history buffer.
        void Clear( void );

    protected:
        struct Hist;
        struct Stage;

        // for m_nchan==2
#ifdef DSP_SSE3_OPTIMIZED
        template<typename T>
        void ProcessISSEStageStereo( size_t frames, T *dest, Stage *stage, Hist *h, int skip );

        template<typename T>
        void ProcessISSEStereo( size_t frames, T *dest, int skip );
#endif

    protected:
        void Reset        ( void );
        void Normalize    ( CalcT scale );
        void SetAStage    ( CalcT x1, CalcT x2 );
        void SetBStage    ( CalcT x0, CalcT x1, CalcT x2 );
        void SetStage    ( CalcT a1, CalcT a2, CalcT b0, CalcT b1, CalcT b2 );

    protected:
        struct Hist
        {
            CalcT v[4];
        };

        struct Stage
        {
            CalcT a[3];    // a[0] unused
            CalcT b[3];
            void Reset( void );
        };

        struct ResponseFunctor
        {
            CascadeFilter *m_filter;
            CalcT operator()( CalcT w );
            ResponseFunctor( CascadeFilter *filter );
        };

        int        m_nchan;
        int        m_nstage;
        Stage *    m_stagep;
        Hist *    m_histp;
    };

    //--------------------------------------------------------------------------

    template<typename T>
    void CascadeFilter::ProcessI( size_t frames, T *dest, int skip )
    {
#ifdef DSP_SSE3_OPTIMIZED
        if( m_nchan==2 )
            ProcessISSEStereo( frames, dest, skip );
        else
#endif
        while( frames-- )
        {
            Hist *h=m_histp;
            for( int j=m_nchan;j;j-- )
            {
                CalcT in=CalcT(*dest);
                Stage *s=m_stagep;
                for( int i=m_nstage;i;i--,h++,s++ )
                {
                    CalcT out;
                    out=s->b[0]*in        + s->b[1]*h->v[0] + s->b[2]*h->v[1] +
                        s->a[1]*h->v[2] + s->a[2]*h->v[3];
                    h->v[1]=h->v[0]; h->v[0]=in;
                    h->v[3]=h->v[2]; h->v[2]=out;
                    in=out;
                }
                *dest++=T(in);
            }
            dest+=skip;
        }
    }

    // A good compiler already produces code that is optimized even for
    // the general case. The only way to make it go faster is to
    // to implement it in assembler or special instructions. Like this:

#ifdef DSP_SSE3_OPTIMIZED
    // ALL SSE OPTIMIZATIONS ASSUME CalcT as double
    template<typename T>
    inline void CascadeFilter::ProcessISSEStageStereo(
        size_t frames, T *dest, Stage *s, Hist *h, int skip )
    {
        assert( m_nchan==2 );

#if 1
        CalcT b0=s->b[0];

        __m128d m0=_mm_loadu_pd( &s->a[1] );    // a1 , a2
        __m128d m1=_mm_loadu_pd( &s->b[1] );    // b1 , b2
        __m128d m2=_mm_loadu_pd( &h[0].v[0] );    // h->v[0] , h->v[1]
        __m128d m3=_mm_loadu_pd( &h[0].v[2] );    // h->v[2] , h->v[3]
        __m128d m4=_mm_loadu_pd( &h[1].v[0] );    // h->v[0] , h->v[1]
        __m128d m5=_mm_loadu_pd( &h[1].v[2] );    // h->v[2] , h->v[3]

        while( frames-- )
        {
            CalcT in, b0in, out;

            __m128d m6;
            __m128d m7;

            in=CalcT(*dest);
            b0in=b0*in;

            m6=_mm_mul_pd ( m1, m2 );    // b1*h->v[0] , b2*h->v[1]
            m7=_mm_mul_pd ( m0, m3 );    // a1*h->v[2] , a2*h->v[3]
            m6=_mm_add_pd ( m6, m7 );    // b1*h->v[0] + a1*h->v[2], b2*h->v[1] + a2*h->v[3]
            m7=_mm_load_sd( &b0in );    // b0*in , 0
            m6=_mm_add_sd ( m6, m7 );    // b1*h->v[0] + a1*h->v[2] + in*b0 , b2*h->v[1] + a2*h->v[3] + 0
            m6=_mm_hadd_pd( m6, m7 );    // b1*h->v[0] + a1*h->v[2] + in*b0 + b2*h->v[1] + a2*h->v[3], in*b0
               _mm_store_sd( &out, m6 );
            m6=_mm_loadh_pd( m6, &in );    // out , in
            m2=_mm_shuffle_pd( m6, m2, _MM_SHUFFLE2( 0, 1 ) ); // h->v[0]=in , h->v[1]=h->v[0]
            m3=_mm_shuffle_pd( m6, m3, _MM_SHUFFLE2( 0, 0 ) ); // h->v[2]=out, h->v[3]=h->v[2]

            *dest++=T(out);

            in=CalcT(*dest);
            b0in=b0*in;

            m6=_mm_mul_pd ( m1, m4 );    // b1*h->v[0] , b2*h->v[1]
            m7=_mm_mul_pd ( m0, m5 );    // a1*h->v[2] , a2*h->v[3]
            m6=_mm_add_pd ( m6, m7 );    // b1*h->v[0] + a1*h->v[2], b2*h->v[1] + a2*h->v[3]
            m7=_mm_load_sd( &b0in );    // b0*in , 0
            m6=_mm_add_sd ( m6, m7 );    // b1*h->v[0] + a1*h->v[2] + in*b0 , b2*h->v[1] + a2*h->v[3] + 0
            m6=_mm_hadd_pd( m6, m7 );    // b1*h->v[0] + a1*h->v[2] + in*b0 + b2*h->v[1] + a2*h->v[3], in*b0
               _mm_store_sd( &out, m6 );
            m6=_mm_loadh_pd( m6, &in );    // out , in
            m4=_mm_shuffle_pd( m6, m4, _MM_SHUFFLE2( 0, 1 ) ); // h->v[0]=in , h->v[1]=h->v[0]
            m5=_mm_shuffle_pd( m6, m5, _MM_SHUFFLE2( 0, 0 ) ); // h->v[2]=out, h->v[3]=h->v[2]

            *dest++=T(out);

            dest+=skip;
        }

        // move history from registers back to state
        _mm_storeu_pd( &h[0].v[0], m2 );
        _mm_storeu_pd( &h[0].v[2], m3 );
        _mm_storeu_pd( &h[1].v[0], m4 );
        _mm_storeu_pd( &h[1].v[2], m5 );

#else
        // Template-specialized version from which the assembly was modeled
        CalcT a1=s->a[1];
        CalcT a2=s->a[2];
        CalcT b0=s->b[0];
        CalcT b1=s->b[1];
        CalcT b2=s->b[2];
        while( frames-- )
        {
            CalcT in, out;

            in=CalcT(*dest);
            out=b0*in+b1*h[0].v[0]+b2*h[0].v[1] +a1*h[0].v[2]+a2*h[0].v[3];
            h[0].v[1]=h[0].v[0]; h[0].v[0]=in;
            h[0].v[3]=h[0].v[2]; h[0].v[2]=out;
            in=out;
            *dest++=T(in);

            in=CalcT(*dest);
            out=b0*in+b1*h[1].v[0]+b2*h[1].v[1] +a1*h[1].v[2]+a2*h[1].v[3];
            h[1].v[1]=h[1].v[0]; h[1].v[0]=in;
            h[1].v[3]=h[1].v[2]; h[1].v[2]=out;
            in=out;
            *dest++=T(in);

            dest+=skip;
        }
#endif
    }

    // Note there could be a loss of accuracy here. Unlike the original version
    // of Process...() we are applying each stage to all of the input data.
    // Since the underlying type T could be float, the results from this function
    // may be different than the unoptimized version. However, it is much faster.
    template<typename T>
    void CascadeFilter::ProcessISSEStereo( size_t frames, T *dest, int skip )
    {
        assert( m_nchan==2 );
        Stage *s=m_stagep;
        Hist *h=m_histp;
        for( int i=m_nstage;i;i--,h+=2,s++ )
        {
            ProcessISSEStageStereo( frames, dest, s, h, skip );
        }
    }

#endif

    template<typename T>
    void CascadeFilter::ProcessII( size_t frames, T *dest, int skip )
    {
        while( frames-- )
        {
            Hist *h=m_histp;
            for( int j=m_nchan;j;j-- )
            {
                CalcT in=CalcT(*dest);
                Stage *s=m_stagep;
                for( int i=m_nstage;i;i--,h++,s++ )
                {
                    CalcT d2=h->v[2]=h->v[1];
                    CalcT d1=h->v[1]=h->v[0];
                    CalcT d0=h->v[0]=
                        in+s->a[1]*d1 + s->a[2]*d2;
                        in=s->b[0]*d0 + s->b[1]*d1 + s->b[2]*d2;
                }
                *dest++=T(in);
            }
            dest+=skip;
        }
    }

    template<typename T>
    inline void CascadeFilter::Process( size_t frames, T *dest, int skip )
    {
        ProcessI( frames, dest, skip );
    }

    inline Complex CascadeFilter::Response( CalcT w )
    {
        Complex ch( 1 );
        Complex cbot( 1 );
        Complex czn1=polar( 1., -w );
        Complex czn2=polar( 1., -2*w );

        Stage *s=m_stagep;
        for( int i=m_nstage;i;i-- )
        {
            Complex ct( s->b[0] );
            Complex cb( 1 );
            ct=addmul( ct,  s->b[1], czn1 );
            cb=addmul( cb, -s->a[1], czn1 );
            ct=addmul( ct,  s->b[2], czn2 );
            cb=addmul( cb, -s->a[2], czn2 );
            ch*=ct;
            cbot*=cb;
            s++;
        }

        return ch/cbot;
    }

    inline void CascadeFilter::Clear( void )
    {
        ::memset( m_histp, 0, m_nstage*m_nchan*sizeof(m_histp[0]) );
    }

    inline void CascadeFilter::Stage::Reset( void )
    {
                a[1]=0; a[2]=0;
        b[0]=1; b[1]=0; b[2]=0;
    }

    inline void CascadeFilter::Reset( void )
    {
        Stage *s=m_stagep;
        for( int i=m_nstage;i;i--,s++ )
            s->Reset();
    }

    // Apply scale factor to stage coefficients.
    inline void CascadeFilter::Normalize( CalcT scale )
    {
        // We are throwing the normalization into the first
        // stage. In theory it might be nice to spread it around
        // to preserve numerical accuracy.
        Stage *s=m_stagep;
        s->b[0]*=scale; s->b[1]*=scale; s->b[2]*=scale;
    }

    inline void CascadeFilter::SetAStage( CalcT x1, CalcT x2 )
    {
        Stage *s=m_stagep;
        for( int i=m_nstage;i;i-- )
        {
            if( s->a[1]==0 && s->a[2]==0 )
            {
                s->a[1]=x1;
                s->a[2]=x2;
                s=0;
                break;
            }
            if( s->a[2]==0 && x2==0 )
            {
                s->a[2]=-s->a[1]*x1;
                s->a[1]+=x1;
                s=0;
                break;
            }
            s++;
        }
        assert( s==0 );
    }

    inline void CascadeFilter::SetBStage( CalcT x0, CalcT x1, CalcT x2 )
    {
        Stage *s=m_stagep;
        for( int i=m_nstage;i;i-- )
        {
            if( s->b[1]==0 && s->b[2]==0 )
            {
                s->b[0]=x0;
                s->b[1]=x1;
                s->b[2]=x2;
                s=0;
                break;
            }
            if( s->b[2]==0 && x2==0 )
            {
                // (b0 + z b1)(x0 + z x1) = (b0 x0 + (b1 x0+b0 x1) z + b1 x1 z^2)
                s->b[2]=s->b[1]*x1;
                s->b[1]=s->b[1]*x0+s->b[0]*x1;
                s->b[0]*=x0;
                s=0;
                break;
            }
            s++;
        }
        assert( s==0 );
    }
    
    // Optimized version for Biquads
    inline void CascadeFilter::SetStage(
        CalcT a1, CalcT a2, CalcT b0, CalcT b1, CalcT b2 )
    {
        assert( m_nstage==1 );
        Stage *s=&m_stagep[0];
                    s->a[1]=a1; s->a[2]=a2;
        s->b[0]=b0; s->b[1]=b1; s->b[2]=b2;
    }

    inline CalcT CascadeFilter::ResponseFunctor::operator()( CalcT w )
    {
        return -Dsp::abs(m_filter->Response( w ));
    }

    inline CascadeFilter::ResponseFunctor::ResponseFunctor( CascadeFilter *filter )
    {
        m_filter=filter;
    }

    //--------------------------------------------------------------------------

    template<int stages, int channels>
    class CascadeStages : public CascadeFilter
    {
    public:
        CascadeStages();

    private:
        Hist    m_hist    [stages*channels];
        Stage    m_stage    [stages];
    };

    //--------------------------------------------------------------------------

    template<int stages, int channels>
    CascadeStages<stages, channels>::CascadeStages( void )
    {
        m_nchan=channels;
        m_nstage=stages;
        m_stagep=m_stage;
        m_histp=m_hist;
        Clear();
    }

    //--------------------------------------------------------------------------
    //
    //    Biquad Second Order IIR Filters
    //
    //--------------------------------------------------------------------------

    // Formulas from http://www.musicdsp.org/files/Audio-EQ-Cookbook.txt
    template<int channels>
    class Biquad : public CascadeStages<1, channels>
    {
    protected:
        void Setup( const CalcT a[3], const CalcT b[3] );
    };

    //--------------------------------------------------------------------------

    template<int channels>
    inline void Biquad<channels>::Setup( const CalcT a[3], const CalcT b[3] )
    {
        Reset();
        // transform Biquad coefficients
        CalcT ra0=1/a[0];
        SetAStage( -a[1]*ra0, -a[2]*ra0 );
        SetBStage(  b[0]*ra0,  b[1]*ra0, b[2]*ra0 );
    }

    //--------------------------------------------------------------------------

    template<int channels>
    class BiquadLowPass : public Biquad<channels>
    {
    public:
        void Setup            ( CalcT normFreq, CalcT q );
        void SetupFast        ( CalcT normFreq, CalcT q );
    protected:
        void SetupCommon    ( CalcT sn, CalcT cs, CalcT q );
    };

    //--------------------------------------------------------------------------

    template<int channels>
    inline void BiquadLowPass<channels>::SetupCommon( CalcT sn, CalcT cs, CalcT q  )
    {
        CalcT alph = sn / ( 2 * q );
        CalcT a0 =  1 / ( 1 + alph );
        CalcT b1 =  1 - cs;
        CalcT b0 = a0 * b1 * 0.5;
        CalcT a1 =  2 * cs;
        CalcT a2 = alph - 1;
        SetStage( a1*a0, a2*a0, b0, b1*a0, b0 );
    }

    template<int channels>
    void BiquadLowPass<channels>::Setup( CalcT normFreq, CalcT q  )
    {
        CalcT w0 = 2 * kPi * normFreq;
        CalcT cs = cos(w0);
        CalcT sn = sin(w0);
        SetupCommon( sn, cs, q );
    }

    template<int channels>
    void BiquadLowPass<channels>::SetupFast( CalcT normFreq, CalcT q  )
    {
        CalcT w0 = 2 * kPi * normFreq;
        CalcT sn, cs;
        fastsincos( w0, &sn, &cs );
        SetupCommon( sn, cs, q );
    }

    //--------------------------------------------------------------------------

    template<int channels>
    class BiquadHighPass : public Biquad<channels>
    {
    public:
        void Setup            ( CalcT normFreq, CalcT q );
        void SetupFast        ( CalcT normFreq, CalcT q );
    protected:
        void SetupCommon    ( CalcT sn, CalcT cs, CalcT q );
    };

    //--------------------------------------------------------------------------

    template<int channels>
    inline void BiquadHighPass<channels>::SetupCommon( CalcT sn, CalcT cs, CalcT q  )
    {
        CalcT alph = sn / ( 2 * q );
        CalcT a0 = -1 / ( 1 + alph );
        CalcT b1 = -( 1 + cs );
        CalcT b0 = a0 * b1 * -0.5;
        CalcT a1 = -2 * cs;
        CalcT a2 =  1 - alph;
        SetStage( a1*a0, a2*a0, b0, b1*a0, b0 );
    }

    template<int channels>
    void BiquadHighPass<channels>::Setup( CalcT normFreq, CalcT q )
    {
        CalcT w0 = 2 * kPi * normFreq;
        CalcT cs = cos(w0);
        CalcT sn = sin(w0);
        SetupCommon( sn, cs, q );
    }

    template<int channels>
    void BiquadHighPass<channels>::SetupFast( CalcT normFreq, CalcT q )
    {
        CalcT w0 = 2 * kPi * normFreq;
        CalcT sn, cs;
        fastsincos( w0, &sn, &cs );
        SetupCommon( sn, cs, q );
    }

    //--------------------------------------------------------------------------

    // Constant skirt gain, peak gain=Q
    template<int channels>
    class BiquadBandPass1 : public Biquad<channels>
    {
    public:
        void Setup            ( CalcT normFreq, CalcT q );
        void SetupFast        ( CalcT normFreq, CalcT q );
    protected:
        void SetupCommon    ( CalcT sn, CalcT cs, CalcT q );
    };

    //--------------------------------------------------------------------------

    template<int channels>
    inline void BiquadBandPass1<channels>::SetupCommon( CalcT sn, CalcT cs, CalcT q  )
    {
        CalcT alph = sn / ( 2 * q );
        CalcT a0 = -1 / ( 1 + alph );
        CalcT b0 = a0 * ( sn * -0.5 );
        CalcT a1 = -2 * cs;
        CalcT a2 =  1 - alph;
        SetStage( a1*a0, a2*a0, b0, 0, -b0 );
    }

    template<int channels>
    void BiquadBandPass1<channels>::Setup( CalcT normFreq, CalcT q )
    {
        CalcT w0 = 2 * kPi * normFreq;
        CalcT cs = cos(w0);
        CalcT sn = sin(w0);
        SetupCommon( sn, cs, q );
    }

    template<int channels>
    void BiquadBandPass1<channels>::SetupFast( CalcT normFreq, CalcT q )
    {
        CalcT w0 = 2 * kPi * normFreq;
        CalcT sn, cs;
        fastsincos( w0, &sn, &cs );
        SetupCommon( sn, cs, q );
    }

    //--------------------------------------------------------------------------

    // Constant 0dB peak gain
    template<int channels>
    class BiquadBandPass2 : public Biquad<channels>
    {
    public:
        void Setup            ( CalcT normFreq, CalcT q );
        void SetupFast        ( CalcT normFreq, CalcT q );
    protected:
        void SetupCommon    ( CalcT sn, CalcT cs, CalcT q );
    };

    //--------------------------------------------------------------------------

    template<int channels>
    inline void BiquadBandPass2<channels>::SetupCommon( CalcT sn, CalcT cs, CalcT q  )
    {
        CalcT alph = sn / ( 2 * q );
        CalcT b0 = -alph;
        CalcT b2 =  alph;
        CalcT a0 = -1 / ( 1 + alph );
        CalcT a1 = -2 * cs;
        CalcT a2 =  1 - alph;
        SetStage( a1*a0, a2*a0, b0*a0, 0, b2*a0 );
    }

    template<int channels>
    void BiquadBandPass2<channels>::Setup( CalcT normFreq, CalcT q )
    {
        CalcT w0 = 2 * kPi * normFreq;
        CalcT cs = cos(w0);
        CalcT sn = sin(w0);
        SetupCommon( sn, cs, q );
    }

    template<int channels>
    void BiquadBandPass2<channels>::SetupFast( CalcT normFreq, CalcT q )
    {
        CalcT w0 = 2 * kPi * normFreq;
        CalcT sn, cs;
        fastsincos( w0, &sn, &cs );
        SetupCommon( sn, cs, q );
    }

    //--------------------------------------------------------------------------

    template<int channels>
    class BiquadBandStop : public Biquad<channels>
    {
    public:
        void Setup            ( CalcT normFreq, CalcT q );
        void SetupFast        ( CalcT normFreq, CalcT q );
    protected:
        void SetupCommon    ( CalcT sn, CalcT cs, CalcT q );
    };

    //--------------------------------------------------------------------------

    template<int channels>
    inline void BiquadBandStop<channels>::SetupCommon( CalcT sn, CalcT cs, CalcT q  )
    {
        CalcT alph = sn / ( 2 * q );
        CalcT a0 = 1 / ( 1 + alph );
        CalcT b1 = a0 * ( -2 * cs );
        CalcT a2 = alph - 1;
        SetStage( -b1, a2*a0, a0, b1, a0 );
    }

    template<int channels>
    void BiquadBandStop<channels>::Setup( CalcT normFreq, CalcT q )
    {
        CalcT w0 = 2 * kPi * normFreq;
        CalcT cs = cos(w0);
        CalcT sn = sin(w0);
        SetupCommon( sn, cs, q );
    }

    template<int channels>
    void BiquadBandStop<channels>::SetupFast( CalcT normFreq, CalcT q )
    {
        CalcT w0 = 2 * kPi * normFreq;
        CalcT sn, cs;
        fastsincos( w0, &sn, &cs );
        SetupCommon( sn, cs, q );
    }

    //--------------------------------------------------------------------------

    template<int channels>
    class BiquadAllPass: public Biquad<channels>
    {
    public:
        void Setup            ( CalcT normFreq, CalcT q );
        void SetupFast        ( CalcT normFreq, CalcT q );
    protected:
        void SetupCommon    ( CalcT sn, CalcT cs, CalcT q );
    };

    //--------------------------------------------------------------------------

    template<int channels>
    void BiquadAllPass<channels>::SetupCommon( CalcT sn, CalcT cs, CalcT q  )
    {
        CalcT alph = sn / ( 2 * q );
        CalcT b2 =  1 + alph;
        CalcT a0 =  1 / b2;
        CalcT b0 =( 1 - alph ) * a0;
        CalcT b1 = -2 * cs * a0;
        SetStage( -b1, -b0, b0, b1, b2*a0 );
    }

    template<int channels>
    void BiquadAllPass<channels>::Setup( CalcT normFreq, CalcT q )
    {
        CalcT w0 = 2 * kPi * normFreq;
        CalcT cs = cos(w0);
        CalcT sn = sin(w0);
        SetupCommon( sn, cs, q );
    }

    template<int channels>
    void BiquadAllPass<channels>::SetupFast( CalcT normFreq, CalcT q )
    {
        CalcT w0 = 2 * kPi * normFreq;
        CalcT sn, cs;
        fastsincos( w0, &sn, &cs );
        SetupCommon( sn, cs, q );
    }

    //--------------------------------------------------------------------------

    template<int channels>
    class BiquadPeakEq: public Biquad<channels>
    {
    public:
        void Setup            ( CalcT normFreq, CalcT dB, CalcT bandWidth );
        void SetupFast        ( CalcT normFreq, CalcT dB, CalcT bandWidth );
    protected:
        void SetupCommon    ( CalcT sn, CalcT cs, CalcT alph, CalcT A );
    };

    //--------------------------------------------------------------------------

    template<int channels>
    inline void BiquadPeakEq<channels>::SetupCommon(
        CalcT sn, CalcT cs, CalcT alph, CalcT A )
    {
        CalcT t=alph*A;
        CalcT b0 =  1 - t;
        CalcT b2 =  1 + t;
        t=alph/A;
        CalcT a0 =  1 / ( 1 + t );
        CalcT a2 =  t - 1;
        CalcT b1 = a0 * ( -2 * cs );
        CalcT a1 = -b1;

        SetStage( a1, a2*a0, b0*a0, b1, b2*a0 );
    }

    template<int channels>
    void BiquadPeakEq<channels>::Setup( CalcT normFreq, CalcT dB, CalcT bandWidth )
    {
        CalcT A  = pow( 10, dB/40 );
        CalcT w0 = 2 * kPi * normFreq;
        CalcT cs = cos(w0);
        CalcT sn = sin(w0);
        CalcT alph = sn * sinh( kLn2/2 * bandWidth * w0/sn );
        SetupCommon( sn, cs, alph, A );
    }

    template<int channels>
    void BiquadPeakEq<channels>::SetupFast( CalcT normFreq, CalcT dB, CalcT bandWidth )
    {
        CalcT A  = pow( 10, dB/40 );
        CalcT w0 = 2 * kPi * normFreq;
        CalcT sn, cs;
        fastsincos( w0, &sn, &cs );
        CalcT alph = sn * sinh( kLn2/2 * bandWidth * w0/sn );
        SetupCommon( sn, cs, alph, A );
    }

    //--------------------------------------------------------------------------

    template<int channels>
    class BiquadLowShelf : public Biquad<channels>
    {
    public:
        void Setup            ( CalcT normFreq, CalcT dB, CalcT shelfSlope=1.0 );
        void SetupFast        ( CalcT normFreq, CalcT dB, CalcT shelfSlope=1.0 );
    protected:
        void SetupCommon    ( CalcT cs, CalcT A, CalcT sa );
    };

    //--------------------------------------------------------------------------

    template<int channels>
    inline void BiquadLowShelf<channels>::SetupCommon(
        CalcT cs, CalcT A, CalcT sa )
    {
        CalcT An    = A-1;
        CalcT Ap    = A+1;
        CalcT Ancs    = An*cs;
        CalcT Apcs    = Ap*cs;
        CalcT b0 =       A * (Ap - Ancs + sa );
        CalcT b2 =       A * (Ap - Ancs - sa );
        CalcT b1 = 2 * A * (An - Apcs);
        CalcT a2 =    sa - (Ap + Ancs);
        CalcT a0 =       1 / (Ap + Ancs + sa );
        CalcT a1 = 2 *       (An + Apcs);
        SetStage( a1*a0, a2*a0, b0*a0, b1*a0, b2*a0 );
    }

    template<int channels>
    void BiquadLowShelf<channels>::Setup( CalcT normFreq, CalcT dB, CalcT shelfSlope )
    {
        CalcT A  = pow( 10, dB/40 );
        CalcT w0 = 2 * kPi * normFreq;
        CalcT cs = cos(w0);
        CalcT sn = sin(w0);
        CalcT al = sn / 2 * ::sqrt( (A + 1/A) * (1/shelfSlope - 1) + 2 );
        CalcT sa =        2 * ::sqrt( A ) * al;
        SetupCommon( cs, A, sa );
    }

    // This could be optimized further
    template<int channels>
    void BiquadLowShelf<channels>::SetupFast( CalcT normFreq, CalcT dB, CalcT shelfSlope )
    {
        CalcT A  = pow( 10, dB/40 );
        CalcT w0 = 2 * kPi * normFreq;
        CalcT sn, cs;
        fastsincos( w0, &sn, &cs );
        CalcT al = sn / 2 * fastsqrt1( (A + 1/A) * (1/shelfSlope - 1) + 2 );
        CalcT sa =        2 * fastsqrt1( A ) * al;
        SetupCommon( cs, A, sa );
    }

    //--------------------------------------------------------------------------

    template<int channels>
    class BiquadHighShelf : public Biquad<channels>
    {
    public:
        void Setup            ( CalcT normFreq, CalcT dB, CalcT shelfSlope=1.0 );
        void SetupFast        ( CalcT normFreq, CalcT dB, CalcT shelfSlope=1.0 );
    protected:
        void SetupCommon    ( CalcT cs, CalcT A, CalcT sa );
    };

    //--------------------------------------------------------------------------

    template<int channels>
    void BiquadHighShelf<channels>::SetupCommon(
        CalcT cs, CalcT A, CalcT sa )
    {
        CalcT An    = A-1;
        CalcT Ap    = A+1;
        CalcT Ancs    = An*cs;
        CalcT Apcs    = Ap*cs;
        CalcT b0 =        A * (Ap + Ancs + sa );
        CalcT b1 = -2 * A * (An + Apcs);
        CalcT b2 =        A * (Ap + Ancs - sa );
        CalcT a0 =            (Ap - Ancs + sa );
        CalcT a2 =             Ancs + sa - Ap;
        CalcT a1 = -2    *    (An - Apcs);
        SetStage( a1/a0, a2/a0, b0/a0, b1/a0, b2/a0 );
    }

    template<int channels>
    void BiquadHighShelf<channels>::Setup( CalcT normFreq, CalcT dB, CalcT shelfSlope )
    {
        CalcT A  = pow( 10, dB/40 );
        CalcT w0 = 2 * kPi * normFreq;
        CalcT cs = cos(w0);
        CalcT sn = sin(w0);

        CalcT alph = sn / 2 * ::sqrt( (A + 1/A) * (1/shelfSlope - 1) + 2 );
        CalcT sa   =      2 * ::sqrt( A ) * alph;
        SetupCommon( cs, A, sa );
    }

    template<int channels>
    void BiquadHighShelf<channels>::SetupFast( CalcT normFreq, CalcT dB, CalcT shelfSlope )
    {
        CalcT A  = pow( 10, dB/40 );
        CalcT w0 = 2 * kPi * normFreq;
        CalcT sn, cs;
        fastsincos( w0, &sn, &cs );

        CalcT alph = sn / 2 * fastsqrt1( (A + 1/A) * (1/shelfSlope - 1) + 2 );
        CalcT sa   =      2 * fastsqrt1( A ) * alph;
        SetupCommon( cs, A, sa );
    }

    //--------------------------------------------------------------------------
    //
    //    General N-Pole IIR Filter
    //
    //--------------------------------------------------------------------------

    template<int stages, int channels>
    class PoleFilter : public CascadeStages<stages, channels>
    {
    public:
        PoleFilter();

        virtual int        CountPoles            ( void )=0;
        virtual int        CountZeroes            ( void )=0;

        virtual Complex    GetPole                ( int i )=0;
        virtual Complex    GetZero                ( int i )=0;

    protected:
        virtual Complex GetSPole            ( int i, CalcT wc )=0;

    protected:
        // Determines the method of obtaining
        // unity gain coefficients in the passband.
        enum Hint
        {
            // No normalizating
            hintNone,
            // Use Brent's method to find the maximum
            hintBrent,
            // Use the response at a given frequency
            hintPassband
        };

        Complex    BilinearTransform    ( const Complex &c );
        Complex    BandStopTransform    ( int i, const Complex &c );
        Complex    BandPassTransform    ( int i, const Complex &c );
        Complex    GetBandStopPole        ( int i );
        Complex    GetBandStopZero        ( int i );
        Complex    GetBandPassPole        ( int i );
        Complex    GetBandPassZero        ( int i );
        void    Normalize            ( void );
        void    Prepare                ( void );

        virtual void    BrentHint    ( CalcT *w0, CalcT *w1 );
        virtual CalcT    PassbandHint( void );

    protected:
        Hint    m_hint;
        int        m_n;
        CalcT    m_wc;
        CalcT    m_wc2;
    };

    //--------------------------------------------------------------------------

    template<int stages, int channels>
    inline PoleFilter<stages, channels>::PoleFilter( void )
    {
        m_hint=hintNone;
    }

    template<int stages, int channels>
    inline Complex PoleFilter<stages, channels>::BilinearTransform( const Complex &c )
    {
        return (c+1.)/(-c+1.);
    }

    template<int stages, int channels>
    inline Complex PoleFilter<stages, channels>::BandStopTransform( int i, const Complex &c )
    {
        CalcT a=cos((m_wc+m_wc2)*.5) /
                cos((m_wc-m_wc2)*.5);
        CalcT b=tan((m_wc-m_wc2)*.5);
        Complex c2(0);
        c2=addmul( c2, 4*(b*b+a*a-1), c );
        c2+=8*(b*b-a*a+1);
        c2*=c;
        c2+=4*(a*a+b*b-1);
        c2=Dsp::sqrt( c2 );
        c2*=((i&1)==0)?.5:-.5;
        c2+=a;
        c2=addmul( c2, -a, c );
        Complex c3( b+1 );
        c3=addmul( c3, b-1, c );
        return c2/c3;
    }

    template<int stages, int channels>
    inline Complex PoleFilter<stages, channels>::BandPassTransform( int i, const Complex &c )
    {
        CalcT a=  cos((m_wc+m_wc2)*0.5)/
                  cos((m_wc-m_wc2)*0.5);
        CalcT b=1/tan((m_wc-m_wc2)*0.5);
        Complex c2(0);
        c2=addmul( c2, 4*(b*b*(a*a-1)+1), c );
        c2+=8*(b*b*(a*a-1)-1);
        c2*=c;
        c2+=4*(b*b*(a*a-1)+1);
        c2=Dsp::sqrt( c2 );
        if ((i & 1) == 0)
            c2=-c2;
        c2=addmul( c2, 2*a*b, c );
        c2+=2*a*b;
        Complex c3(0);
        c3=addmul( c3, 2*(b-1), c );
        c3+=2*(1+b);
        return c2/c3;
    }

    template<int stages, int channels>
    Complex PoleFilter<stages, channels>::GetBandStopPole( int i )
    {
        Complex c=GetSPole( i/2, kPi_2 );
        c=BilinearTransform( c );
        c=BandStopTransform( i, c );
        return c;
    }

    template<int stages, int channels>
    Complex PoleFilter<stages, channels>::GetBandStopZero( int i )
    {
        return BandStopTransform( i, Complex( -1 ) );
    }

    template<int stages, int channels>
    Complex PoleFilter<stages, channels>::GetBandPassPole( int i )
    {
        Complex c=GetSPole( i/2, kPi_2 );
        c=BilinearTransform( c );
        c=BandPassTransform( i, c );
        return c;
    }

    template<int stages, int channels>
    Complex PoleFilter<stages, channels>::GetBandPassZero( int i )
    {
        return Complex( (i>=m_n)?1:-1 );
    }

    template<int stages, int channels>
    void PoleFilter<stages, channels>::Normalize( void )
    {
        switch( m_hint )
        {
        default:
        case hintNone:
            break;

        case hintPassband:
            {
                CalcT w=PassbandHint();
                ResponseFunctor f(this);
                CalcT mag=-f(w);
                CascadeStages::Normalize( 1/mag );
            }
            break;

        case hintBrent:
            {
                ResponseFunctor f(this);
                CalcT w0, w1, wmin, mag;
                BrentHint( &w0, &w1 );
                mag=-BrentMinimize( f, w0, w1, 1e-4, wmin );
                CascadeStages::Normalize( 1/mag );
            }
            break;
        }
    }

    template<int stages, int channels>
    void PoleFilter<stages, channels>::Prepare( void )
    {
        if( m_wc2<1e-8 )
            m_wc2=1e-8;
        if( m_wc >kPi-1e-8 )
            m_wc =kPi-1e-8;

        Reset();

        Complex c;
        int poles=CountPoles();
        for( int i=0;i<poles;i++ )
        {
            c=GetPole( i );
            if( ::abs(c.imag())<1e-6 )
                c=Complex( c.real(), 0 );
            if( c.imag()==0 )
                SetAStage( c.real(), 0 );
            else if( c.imag()>0 )
                SetAStage( 2*c.real(), -Dsp::norm(c) );
        }

        int zeroes=CountZeroes();
        for( int i=0;i<zeroes;i++ )
        {
            c=GetZero( i );
            if( ::abs(c.imag())<1e-6 )
                c=Complex( c.real(), 0 );
            if( c.imag()==0 )
                SetBStage( -c.real(), 1, 0 );
            else if( c.imag()>0 )
                SetBStage( Dsp::norm(c), -2*c.real(), 1 );
        }

        Normalize();
    }

    template<int stages, int channels>
    void PoleFilter<stages, channels>::BrentHint( CalcT *w0, CalcT *w1 )
    {
        // best that this never executes
        *w0=1e-4;
        *w1=kPi-1e-4;
    }

    template<int stages, int channels>
    CalcT PoleFilter<stages, channels>::PassbandHint( void )
    {
        // should never get here
        assert( 0 );
        return kPi_2;
    }

    //--------------------------------------------------------------------------
    //
    //    Butterworth Response IIR Filter
    //
    //--------------------------------------------------------------------------

    // Butterworth filter response characteristic.
    // Maximally flat magnitude response in the passband at the
    // expense of a more shallow rolloff in comparison to other types.
    template<int poles, int channels>
    class Butterworth : public PoleFilter<int((poles+1)/2), channels>
    {
    public:
        Butterworth();

        // cutoffFreq = freq / sampleRate
        void            Setup            ( CalcT cutoffFreq );

        virtual int        CountPoles        ( void );
        virtual int        CountZeroes        ( void );
        virtual Complex    GetPole            ( int i );

    protected:
                Complex GetSPole        ( int i, CalcT wc );
    };

    //--------------------------------------------------------------------------

    template<int poles, int channels>
    Butterworth<poles, channels>::Butterworth( void )
    {
        m_hint=hintPassband;
    }

    template<int poles, int channels>
    void Butterworth<poles, channels>::Setup( CalcT cutoffFreq )
    {
        m_n=poles;
        m_wc=2*kPi*cutoffFreq;
        Prepare();
    }

    template<int poles, int channels>
    int Butterworth<poles, channels>::CountPoles( void )
    {
        return poles;
    }

    template<int poles, int channels>
    int Butterworth<poles, channels>::CountZeroes( void )
    {
        return poles;
    }

    template<int poles, int channels>
    Complex Butterworth<poles, channels>::GetPole( int i )
    {
        return BilinearTransform( GetSPole( i, m_wc ) );
    }

    template<int poles, int channels>
    Complex Butterworth<poles, channels>::GetSPole( int i, CalcT wc )
    {
        return polar( tan(wc*0.5), kPi_2+(2*i+1)*kPi/(2*m_n) );
    }

    //--------------------------------------------------------------------------

    // Low Pass Butterworth filter
    // Stable up to 53 poles (frequency min=0.13% of Nyquist)
    template<int poles, int channels>
    class ButterLowPass : public Butterworth<poles, channels>
    {
    public:
        Complex GetZero            ( int i );

    protected:
        CalcT    PassbandHint    ( void );
    };

    //--------------------------------------------------------------------------

    template<int poles, int channels>
    Complex ButterLowPass<poles, channels>::GetZero( int i )
    {
        return Complex( -1 );
    }

    template<int poles, int channels>
    CalcT ButterLowPass<poles, channels>::PassbandHint( void )
    {
        return 0;
    }

    //--------------------------------------------------------------------------

    // High Pass Butterworth filter
    // Maximally flat magnitude response in the passband.
    // Stable up to 110 poles (frequency max=97% of Nyquist)
    template<int poles, int channels>
    class ButterHighPass : public Butterworth<poles, channels>
    {
    public:
        Complex GetZero( int i );

    protected:
        CalcT    PassbandHint    ( void );
    };

    //--------------------------------------------------------------------------

    template<int poles, int channels>
    Complex ButterHighPass<poles, channels>::GetZero( int i )
    {
        return Complex( 1 );
    }

    template<int poles, int channels>
    CalcT ButterHighPass<poles, channels>::PassbandHint( void )
    {
        return kPi;
    }

    //--------------------------------------------------------------------------

    // Band Pass Butterworth filter
    // Stable up to 80 pairs
    template<int pairs, int channels>
    class ButterBandPass : public Butterworth<pairs*2, channels>
    {
    public:
        // centerFreq = freq / sampleRate
        // normWidth  = freqWidth / sampleRate
        void            Setup            ( CalcT centerFreq, CalcT normWidth );

        virtual int        CountPoles        ( void );
        virtual int        CountZeroes        ( void );
        virtual Complex GetPole            ( int i );
        virtual Complex GetZero            ( int i );

    protected:
        CalcT    PassbandHint    ( void );
    };

    //--------------------------------------------------------------------------

    template<int pairs, int channels>
    void ButterBandPass<pairs, channels>::Setup( CalcT centerFreq, CalcT normWidth )
    {
        m_n=pairs;
        CalcT angularWidth=2*kPi*normWidth;
        m_wc2=2*kPi*centerFreq-(angularWidth/2);
        m_wc =m_wc2+angularWidth;
        Prepare();
    }

    template<int pairs, int channels>
    int ButterBandPass<pairs, channels>::CountPoles( void )
    {
        return pairs*2;
    }

    template<int pairs, int channels>
    int ButterBandPass<pairs, channels>::CountZeroes( void )
    {
        return pairs*2;
    }

    template<int pairs, int channels>
    Complex ButterBandPass<pairs, channels>::GetPole( int i )
    {
        return GetBandPassPole( i );
    }

    template<int pairs, int channels>
    Complex ButterBandPass<pairs, channels>::GetZero( int i )
    {
        return GetBandPassZero( i );
    }

    template<int poles, int channels>
    CalcT ButterBandPass<poles, channels>::PassbandHint( void )
    {
        return (m_wc+m_wc2)/2;
    }

    //--------------------------------------------------------------------------

    // Band Stop Butterworth filter
    // Stable up to 109 pairs
    template<int pairs, int channels>
    class ButterBandStop : public Butterworth<pairs*2, channels>
    {
    public:
        // centerFreq = freq / sampleRate
        // normWidth  = freqWidth / sampleRate
        void    Setup            ( CalcT centerFreq, CalcT normWidth );

        virtual int        CountPoles        ( void );
        virtual int        CountZeroes        ( void );
        virtual Complex GetPole            ( int i );
        virtual Complex GetZero            ( int i );

    protected:
        CalcT    PassbandHint    ( void );
    };

    //--------------------------------------------------------------------------

    template<int pairs, int channels>
    void ButterBandStop<pairs, channels>::Setup( CalcT centerFreq, CalcT normWidth )
    {
        m_n=pairs;
        CalcT angularWidth=2*kPi*normWidth;
        m_wc2=2*kPi*centerFreq-(angularWidth/2);
        m_wc =m_wc2+angularWidth;
        Prepare();
    }

    template<int pairs, int channels>
    int ButterBandStop<pairs, channels>::CountPoles( void )
    {
        return pairs*2;
    }

    template<int pairs, int channels>
    int ButterBandStop<pairs, channels>::CountZeroes( void )
    {
        return pairs*2;
    }

    template<int pairs, int channels>
    Complex ButterBandStop<pairs, channels>::GetPole( int i )
    {
        return GetBandStopPole( i );
    }

    template<int pairs, int channels>
    Complex ButterBandStop<pairs, channels>::GetZero( int i )
    {
        return GetBandStopZero( i );
    }

    template<int poles, int channels>
    CalcT ButterBandStop<poles, channels>::PassbandHint( void )
    {
        if( (m_wc+m_wc2)/2<kPi_2 )
            return kPi;
        else
            return 0;
    }

    //--------------------------------------------------------------------------
    //
    //    Chebyshev Response IIR Filter
    //
    //--------------------------------------------------------------------------

    // Type I Chebyshev filter characteristic.
    // Minimum error between actual and ideal response at the expense of
    // a user-definable amount of ripple in the passband.
    template<int poles, int channels>
    class Chebyshev1 : public PoleFilter<int((poles+1)/2), channels>
    {
    public:
                        Chebyshev1();

        // cutoffFreq = freq / sampleRate
        virtual    void    Setup            ( CalcT cutoffFreq, CalcT rippleDb );

        virtual int        CountPoles        ( void );
        virtual int        CountZeroes        ( void );
        virtual Complex    GetPole            ( int i );
        virtual Complex    GetZero            ( int i );

    protected:
        void            SetupCommon        ( CalcT rippleDb );
        virtual    Complex GetSPole        ( int i, CalcT wc );

    protected:
        CalcT    m_sgn;
        CalcT    m_eps;
    };

    //--------------------------------------------------------------------------

    template<int poles, int channels>
    Chebyshev1<poles, channels>::Chebyshev1()
    {
        m_hint=hintBrent;
    }

    template<int poles, int channels>
    void Chebyshev1<poles, channels>::Setup( CalcT cutoffFreq, CalcT rippleDb )
    {
        m_n=poles;
        m_wc=2*kPi*cutoffFreq;
        SetupCommon( rippleDb );
    }

    template<int poles, int channels>
    void Chebyshev1<poles, channels>::SetupCommon( CalcT rippleDb )
    {
        m_eps=::sqrt( 1/::exp( -rippleDb*0.1*kLn10 )-1 );
        Prepare();
        // This moves the bottom of the ripples to 0dB gain
        //CascadeStages::Normalize( pow( 10, rippleDb/20.0 ) );
    }

    template<int poles, int channels>
    int    Chebyshev1<poles, channels>::CountPoles( void )
    {
        return poles;
    }

    template<int poles, int channels>
    int Chebyshev1<poles, channels>::CountZeroes( void )
    {
        return poles;
    }

    template<int poles, int channels>
    Complex Chebyshev1<poles, channels>::GetPole( int i )
    {
        return BilinearTransform( GetSPole( i, m_wc ) )*m_sgn;
    }

    template<int poles, int channels>
    Complex Chebyshev1<poles, channels>::GetZero( int i )
    {
        return Complex( -m_sgn );
    }

    template<int poles, int channels>
    Complex Chebyshev1<poles, channels>::GetSPole( int i, CalcT wc )
    {
        int n        = m_n;
        CalcT ni    = 1.0/n;
        CalcT alpha    = 1/m_eps+::sqrt(1+1/(m_eps*m_eps));
        CalcT pn    = pow( alpha,  ni );
        CalcT nn    = pow( alpha, -ni );
        CalcT a        = 0.5*( pn - nn );
        CalcT b        = 0.5*( pn + nn );
        CalcT theta = kPi_2 + (2*i+1) * kPi/(2*n);
        Complex c    = polar( tan( 0.5*(m_sgn==-1?(kPi-wc):wc) ), theta );
        return Complex( a*c.real(), b*c.imag() );
    }

    //--------------------------------------------------------------------------

    // Low Pass Chebyshev Type I filter
    template<int poles, int channels>
    class Cheby1LowPass : public Chebyshev1<poles, channels>
    {
    public:
                Cheby1LowPass();

        void    Setup            ( CalcT cutoffFreq, CalcT rippleDb );

    protected:
        CalcT    PassbandHint    ( void );
    };

    //--------------------------------------------------------------------------

    template<int poles, int channels>
    Cheby1LowPass<poles, channels>::Cheby1LowPass()
    {
        m_sgn=1;
        m_hint=hintPassband;
    }

    template<int poles, int channels>
    void Cheby1LowPass<poles, channels>::Setup( CalcT cutoffFreq, CalcT rippleDb )
    {
        Chebyshev1::Setup( cutoffFreq, rippleDb );
        // move peak of ripple down to 0dB
        if( !(poles&1) )
            CascadeStages::Normalize( pow( 10, -rippleDb/20.0 ) );
    }

    template<int poles, int channels>
    CalcT Cheby1LowPass<poles, channels>::PassbandHint( void )
    {
        return 0;
    }

    //--------------------------------------------------------------------------

    // High Pass Chebyshev Type I filter
    template<int poles, int channels>
    class Cheby1HighPass : public Chebyshev1<poles, channels>
    {
    public:
        Cheby1HighPass();

        void    Setup            ( CalcT cutoffFreq, CalcT rippleDb );

    protected:
        CalcT    PassbandHint    ( void );
    };

    //--------------------------------------------------------------------------

    template<int poles, int channels>
    Cheby1HighPass<poles, channels>::Cheby1HighPass()
    {
        m_sgn=-1;
        m_hint=hintPassband;
    }

    template<int poles, int channels>
    void Cheby1HighPass<poles, channels>::Setup( CalcT cutoffFreq, CalcT rippleDb )
    {
        Chebyshev1::Setup( cutoffFreq, rippleDb );
        // move peak of ripple down to 0dB
        if( !(poles&1) )
            CascadeStages::Normalize( pow( 10, -rippleDb/20.0 ) );
    }

    template<int poles, int channels>
    CalcT Cheby1HighPass<poles, channels>::PassbandHint( void )
    {
        return kPi;
    }

    //--------------------------------------------------------------------------

    // Band Pass Chebyshev Type I filter
    template<int pairs, int channels>
    class Cheby1BandPass : public Chebyshev1<pairs*2, channels>
    {
    public:
        Cheby1BandPass();

        void    Setup            ( CalcT centerFreq, CalcT normWidth, CalcT rippleDb );

        int        CountPoles        ( void );
        int        CountZeroes        ( void );
        Complex    GetPole            ( int i );
        Complex    GetZero            ( int i );

    protected:
        void    BrentHint        ( CalcT *w0, CalcT *w1 );
        //CalcT    PassbandHint    ( void );
    };

    //--------------------------------------------------------------------------

    template<int pairs, int channels>
    Cheby1BandPass<pairs, channels>::Cheby1BandPass()
    {
        m_sgn=1;
        m_hint=hintBrent;
    }

    template<int pairs, int channels>
    void Cheby1BandPass<pairs, channels>::Setup( CalcT centerFreq, CalcT normWidth, CalcT rippleDb )
    {
        m_n=pairs;
        CalcT angularWidth=2*kPi*normWidth;
        m_wc2=2*kPi*centerFreq-(angularWidth/2);
        m_wc =m_wc2+angularWidth;
        SetupCommon( rippleDb );
    }

    template<int pairs, int channels>
    int Cheby1BandPass<pairs, channels>::CountPoles( void )
    {
        return pairs*2;
    }

    template<int pairs, int channels>
    int Cheby1BandPass<pairs, channels>::CountZeroes( void )
    {
        return pairs*2;
    }

    template<int pairs, int channels>
    Complex    Cheby1BandPass<pairs, channels>::GetPole( int i )
    {
        return GetBandPassPole( i );
    }

    template<int pairs, int channels>
    Complex    Cheby1BandPass<pairs, channels>::GetZero( int i )
    {
        return GetBandPassZero( i );
    }

    template<int poles, int channels>
    void Cheby1BandPass<poles, channels>::BrentHint( CalcT *w0, CalcT *w1 )
    {
        CalcT d=1e-4*(m_wc-m_wc2)/2;
        *w0=m_wc2+d;
        *w1=m_wc-d;
    }

    /*
    // Unfortunately, this doesn't work at the frequency extremes
    // Maybe we can inverse pre-warp the center point to make sure
    // it stays put after bilinear and bandpass transformation.
    template<int poles, int channels>
    CalcT Cheby1BandPass<poles, channels>::PassbandHint( void )
    {
        return (m_wc+m_wc2)/2;
    }
    */

    //--------------------------------------------------------------------------

    // Band Stop Chebyshev Type I filter
    template<int pairs, int channels>
    class Cheby1BandStop : public Chebyshev1<pairs*2, channels>
    {
    public:
        Cheby1BandStop();

        void    Setup            ( CalcT centerFreq, CalcT normWidth, CalcT rippleDb );

        int        CountPoles        ( void );
        int        CountZeroes        ( void );
        Complex    GetPole            ( int i );
        Complex    GetZero            ( int i );

    protected:
        void    BrentHint        ( CalcT *w0, CalcT *w1 );
        CalcT    PassbandHint    ( void );
    };

    //--------------------------------------------------------------------------

    template<int pairs, int channels>
    Cheby1BandStop<pairs, channels>::Cheby1BandStop()
    {
        m_sgn=1;
        m_hint=hintPassband;
    }

    template<int pairs, int channels>


3 comment(s) | add a comment | nofrills version


Bit quantization/reduction effect

Type : Bit-level noise-generating effect
References : Posted by Jon Watte

Notes :
This function, run on each sample, will emulate half the effect of running your signal through a Speak-N-Spell or similar low-bit-depth circuitry.

The other half would come from downsampling with no aliasing control, i e replicating every N-th sample N times in the output signal.


Code :
short keep_bits_from_16( short input, int keepBits ) {
  return (input & (-1 << (16-keepBits)));
}



3 comment(s) | add a comment | nofrills version


Bit-Reversed Counting

References : Posted by mailbjl[AT]yahoo[DOT]com

Notes :
Bit-reversed ordering comes up frequently in FFT implementations. Here is a non-branching algorithm (given in C) that increments the variable "s" bit-reversedly from 0 to N-1, where N is a power of 2.

Code :
int r = 0;        // counter
int s = 0;        // bit-reversal of r/2
int N = 256;      // N can be any power of 2
int N2 = N << 1;  // N<<1 == N*2

do {
  printf("%u ", s);
  r += 2;
  s ^= N - (N / (r&-r));
}
while (r < N2);


3 comment(s) | add a comment | nofrills version


Block/Loop Benchmarking

Type : Benchmarking Tool
References : Posted by arguru[AT]smartelectronix[DOT]com

Notes :
Requires CPU with RDTSC support

Code :
// Block-Process Benchmarking Code using rdtsc
// useful for measure DSP block stuff
// (based on Intel papers)
// 64-bit precission
// VeryUglyCode(tm) by Arguru

// globals
UINT time,time_low,time_high;

// call this just before enter your loop or whatever
void bpb_start()
{
    // read time stamp to EAX
    __asm rdtsc;
    __asm mov time_low,eax;
    __asm mov time_high,edx;
}

// call the following function just after your loop
// returns average cycles wasted per sample
UINT bpb_finish(UINT const num_samples)
{
    __asm rdtsc
    __asm sub eax,time_low;
    __asm sub edx,time_high;
    __asm div num_samples;
    __asm mov time,eax;
    return time;
}


2 comment(s) | add a comment | nofrills version


Branchless Clipping

Type : Clipping at 0dB, with none of the usual 'if..then..'
References : Posted by musicdsp[AT]dsparsons[DOT]co[DOT]uk

Notes :
I was working on something that I wanted to ensure that the signal never went above 0dB, and a branchless solution occurred to me.

It works by playing with the structure of a single type, shifting the sign bit down to make a new mulitplicand.

calling MaxZerodB(mydBSample) will ensure that it will never stray over 0dB.
By playing with signs or adding/removing offsets, this offers a complete branchless limiting solution, no matter whether dB or not (after all, they're all just numbers...).

eg:
Limit to <=0 : sample:=MaxZerodB(sample);
Limit to <=3 : sample:=MaxZerodB(sample-3)+3;
Limit to <=-4 : sample:=MaxZerodB(sample+4)-4;

Limit to >=0 : sample:=-MaxZerodB(-sample);
Limit to >=2 : sample:=-MaxZerodB(-sample+2)+2;
Limit to >=-1.5: sample:=-MaxZerodB(-sample-1.5)-1.5;

Whether it actually saves any CPU cycles remains to be seen, but it was an interesting diversion for half an hour :)

[Translating from pascal to other languages shouldn't be too hard, and for doubles, you'll need to fiddle it abit :)]


Code :
function MaxZerodB(dBin:single):single;
var tmp:longint;
begin
     //given that leftmost bit of a longint indicates the negative,
     //  if we shift that down to bit0, and multiply dBin by that
     //  it will return dBin, or zero :)
     tmp:=(longint((@dBin)^) and $80000000) shr 31;
     result:=dBin*tmp;
end;


6 comment(s) | add a comment | nofrills version


Butterworth

Type : LPF 24dB/Oct
References : Posted by Christian[at]savioursofsoul[dot]de
Code :
First calculate the prewarped digital frequency:

K = tan(Pi * Frequency / Samplerate);

Now calc some intermediate variables: (see 'Factors of Polynoms' at http://en.wikipedia.org/wiki/Butterworth_filter, especially if you want a higher order like 48dB/Oct)
a = 0.76536686473 * Q * K;
b = 1.84775906502 * Q * K;

K = K*K; (to optimize it a little bit)

Calculate the first biquad:

A0 = (K+a+1);
A1 = 2*(1-K);
A2 =(a-K-1);
B0 = K;
B1 = 2*B0;
B2 = B0;

Calculate the second biquad:

A3 = (K+b+1);
A4 = 2*(1-K);
A5 = (b-K-1);
B3 = K;
B4 = 2*B3;
B5 = B3;

Then calculate the output as follows:

Stage1 = B0*Input + State0;
State0 = B1*Input + A1/A0*Stage1 + State1;
State1 = B2*Input + A2/A0*Stage1;

Output = B3*Stage1 + State2;
State2 = B4*Stage1 + A4/A3*Output + State2;
State3 = B5*Stage1 + A5/A3*Output;


9 comment(s) | add a comment | nofrills version


Butterworth Optimized C++ Class

Type : 24db Resonant Lowpass
References : Posted by neotec

Notes :
This ist exactly the same as posted by "Zxform" (filters004.txt). The only difference is, that this version is an optimized one.

Parameters:
Cutoff [0.f -> Nyquist.f]
Resonance [0.f -> 1.f]

There are some minima and maxima defined, to make ist sound nice in all situations. This class is part of some of my VST Plugins, and works well and executes fast.


Code :
// FilterButterworth24db.h

#pragma once

class CFilterButterworth24db
{
public:
    CFilterButterworth24db(void);
    ~CFilterButterworth24db(void);
    void SetSampleRate(float fs);
    void Set(float cutoff, float q);
    float Run(float input);

private:
    float t0, t1, t2, t3;
    float coef0, coef1, coef2, coef3;
    float history1, history2, history3, history4;
    float gain;
    float min_cutoff, max_cutoff;
};

// FilterButterworth24db.cpp

#include <math.h>

#define BUDDA_Q_SCALE 6.f

#include "FilterButterworth24db.h"

CFilterButterworth24db::CFilterButterworth24db(void)
{
    this->history1 = 0.f;
    this->history2 = 0.f;
    this->history3 = 0.f;
    this->history4 = 0.f;

    this->SetSampleRate(44100.f);
    this->Set(22050.f, 0.0);
}

CFilterButterworth24db::~CFilterButterworth24db(void)
{
}

void CFilterButterworth24db::SetSampleRate(float fs)
{
    float pi = 4.f * atanf(1.f);

    this->t0 = 4.f * fs * fs;
    this->t1 = 8.f * fs * fs;
    this->t2 = 2.f * fs;
    this->t3 = pi / fs;

    this->min_cutoff = fs * 0.01f;
    this->max_cutoff = fs * 0.45f;
}

void CFilterButterworth24db::Set(float cutoff, float q)
{
    if (cutoff < this->min_cutoff)
        cutoff = this->min_cutoff;
    else if(cutoff > this->max_cutoff)
        cutoff = this->max_cutoff;

    if(q < 0.f)
        q = 0.f;
    else if(q > 1.f)
        q = 1.f;

    float wp = this->t2 * tanf(this->t3 * cutoff);
    float bd, bd_tmp, b1, b2;

    q *= BUDDA_Q_SCALE;
    q += 1.f;

    b1 = (0.765367f / q) / wp;
    b2 = 1.f / (wp * wp);

    bd_tmp = this->t0 * b2 + 1.f;

    bd = 1.f / (bd_tmp + this->t2 * b1);

    this->gain = bd * 0.5f;

    this->coef2 = (2.f - this->t1 * b2);

    this->coef0 = this->coef2 * bd;
    this->coef1 = (bd_tmp - this->t2 * b1) * bd;

    b1 = (1.847759f / q) / wp;

    bd = 1.f / (bd_tmp + this->t2 * b1);

    this->gain *= bd;
    this->coef2 *= bd;
    this->coef3 = (bd_tmp - this->t2 * b1) * bd;
}

float CFilterButterworth24db::Run(float input)
{
    float output = input * this->gain;
    float new_hist;

    output -= this->history1 * this->coef0;
    new_hist = output - this->history2 * this->coef1;

    output = new_hist + this->history1 * 2.f;
    output += this->history2;

    this->history2 = this->history1;
    this->history1 = new_hist;

    output -= this->history3 * this->coef2;
    new_hist = output - this->history4 * this->coef3;

    output = new_hist + this->history3 * 2.f;
    output += this->history4;

    this->history4 = this->history3;
    this->history3 = new_hist;

    return output;
}


15 comment(s) | add a comment | nofrills version


C# Oscilator class

Type : Sine, Saw, Variable Pulse, Triangle, C64 Noise
References : Posted by neotec

Notes :
Parameters:

Pitch: The Osc's pitch in Cents [0 - 14399] startig at A -> 6.875Hz
Pulsewidth: [0 - 65535] -> 0% to 99.99%
Value: The last Output value, a set to this property 'syncs' the Oscilator


Code :
public class SynthOscilator
{
    public enum OscWaveformType
    {
        SAW, PULSE, TRI, NOISE, SINE
    }

    public int Pitch
    {
        get
        {
            return this._Pitch;
        }
        set
        {
            this._Pitch = this.MinMax(0, value, 14399);
            this.OscStep = WaveSteps[this._Pitch];
        }
    }

    public int PulseWidth
    {
        get
        {
            return this._PulseWidth;
        }
        set
        {
            this._PulseWidth = this.MinMax(0, value, 65535);
        }
    }

    public OscWaveformType Waveform
    {
        get
        {
            return this._WaveForm;
        }
        set
        {
            this._WaveForm = value;
        }
    }

    public int Value
    {
        get
        {
            return this._Value;
        }
        set
        {
            this._Value = 0;
            this.OscNow = 0;
        }
    }

    private int _Pitch;
    private int _PulseWidth;
    private int _Value;
    private OscWaveformType _WaveForm;

    private int OscNow;
    private int OscStep;
    private int ShiftRegister;

    public const double BaseFrequence = 6.875;
    public const int SampleRate = 44100;
    public static int[] WaveSteps = new int[0];
    public static int[] SineTable = new int[0];
    
    public SynthOscilator()
    {
        if (WaveSteps.Length == 0)
            this.CalcSteps();

        if (SineTable.Length == 0)
            this.CalcSine();

        this._Pitch = 7200;
        this._PulseWidth = 32768;
        this._WaveForm = OscWaveformType.SAW;

        this.ShiftRegister = 0x7ffff8;

        this.OscNow = 0;
        this.OscStep = WaveSteps[this._Pitch];
        this._Value = 0;
    }

    private void CalcSteps()
    {
        WaveSteps = new int[14400];

        for (int i = 0; i < 14400; i++)
        {
            double t0, t1, t2;

            t0 = Math.Pow(2.0, (double)i / 1200.0);
            t1 = BaseFrequence * t0;
            t2 = (t1 * 65536.0) / (double)this.SampleRate;

            WaveSteps[i] = (int)Math.Round(t2 * 4096.0);
        }
    }

    private void CalcSine()
    {
        SineTable = new int[65536];

        double s = Math.PI / 32768.0;

        for (int i = 0; i < 65536; i++)
        {
            double v = Math.Sin((double)i * s) * 32768.0;

            int t = (int)Math.Round(v) + 32768;

            if (t < 0)
                t = 0;
            else if (t > 65535)
                t = 65535;

            SineTable[i] = t;
        }
    }
    
    public override int Run()
    {
        int ret = 32768;
        int osc = this.OscNow >> 12;

        switch (this._WaveForm)
        {
            case OscWaveformType.SAW:
                ret = osc;
                break;
            case OscWaveformType.PULSE:
                if (osc < this.PulseWidth)
                    ret = 65535;
                else
                    ret = 0;
                break;
            case OscWaveformType.TRI:
                if (osc < 32768)
                    ret = osc << 1;
                else
                    ret = 131071 - (osc << 1);
                break;
            case OscWaveformType.NOISE:
                ret = ((this.ShiftRegister & 0x400000) >> 11) |
                  ((this.ShiftRegister & 0x100000) >> 10) |
                  ((this.ShiftRegister & 0x010000) >> 7) |
                  ((this.ShiftRegister & 0x002000) >> 5) |
                  ((this.ShiftRegister & 0x000800) >> 4) |
                  ((this.ShiftRegister & 0x000080) >> 1) |
                  ((this.ShiftRegister & 0x000010) << 1) |
                  ((this.ShiftRegister & 0x000004) << 2);
                ret <<= 4;
                break;
            case OscWaveformType.SINE:
                ret = SynthTools.SineTable[osc];
                break;
            default:
                break;
        }

        this.OscNow += this.OscStep;

        if (this.OscNow > 0xfffffff)
        {
            int bit0 = ((this.ShiftRegister >> 22) ^ (this.ShiftRegister >> 17)) & 0x1;
            this.ShiftRegister <<= 1;
            this.ShiftRegister &= 0x7fffff;
            this.ShiftRegister |= bit0;
        }

        this.OscNow &= 0xfffffff;

        this._Value = ret - 32768;

        return this._Value;
    }

    public int MinMax(int a, int b, int c)
    {
        if (b < a)
            return a;
        else if (b > c)
            return c;
        else
            return b;
    }
}


no comments on this item | add a comment | nofrills version


C++ class implementation of RBJ Filters

References : Posted by arguru[AT]smartelectronix[DOT]com
Linked file : CFxRbjFilter.h

Notes :
[WARNING: This code is not FPU undernormalization safe!]



no comments on this item | add a comment | nofrills version


C++ gaussian noise generation

Type : gaussian noise generation
References : Posted by paul[at]expdigital[dot]co[dot]uk

Notes :
References :
Tobybears delphi noise generator was the basis. Simply converted it to C++.
Link for original is:
http://www.musicdsp.org/archive.php?classid=0#129
The output is in noise.


Code :
/* Include requisits */
#include <cstdlib>
#include <ctime>

/* Generate a new random seed from system time - do this once in your constructor */
srand(time(0));

/* Setup constants */
const static int q = 15;
const static float c1 = (1 << q) - 1;
const static float c2 = ((int)(c1 / 3)) + 1;
const static float c3 = 1.f / c1;

/* random number in range 0 - 1 not including 1 */
float random = 0.f;

/* the white noise */
float noise = 0.f;

for (int i = 0; i < numSamples; i++)
{
    random = ((float)rand() / (float)(RAND_MAX + 1));
    noise = (2.f * ((random * c2) + (random * c2) + (random * c2)) - 3.f * (c2 - 1.f)) * c3;
}


3 comment(s) | add a comment | nofrills version


C-Weighed Filter

Type : digital implementation (after bilinear transform)
References : Posted by Christian@savioursofsoul.de

Notes :
unoptimized version!

Code :
First prewarp the frequency of both poles:

K1 = tan(0.5*Pi*20.6 / SampleRate) // for 20.6Hz
K2 = tan(0.5*Pi*12200 / SampleRate) // for 12200Hz

Then calculate the both biquads:

b0 = 1
b1 = 0
b2 =-1
a0 = ((K1+1)*(K1+1)*(K2+1)*(K2+1));
a1 =-4*(K1*K1*K2*K2+K1*K1*K2+K1*K2*K2-K1-K2-1)*t;
a2 =-  ((K1-1)*(K1-1)*(K2-1)*(K2-1))*t;

and:

b3 = 1
b4 = 0
b5 =-1
a3 = ((K1+1)*(K1+1)*(K2+1)*(K2+1));
a4 =-4*(K1*K1*K2*K2+K1*K1*K2+K1*K2*K2-K1-K2-1)*t;
a5 =-  ((K1-1)*(K1-1)*(K2-1)*(K2-1))*t;

Now use an equation for calculating the biquads like this:

Stage1 = b0*Input                 + State0;
State0 =           + a1/a0*Stage1 + State1;
State1 = b2*Input  + a2/a0*Stage1;

Output = b3*Stage1                + State2;
State2 =           + a4/a3*Output + State2;
State3 = b5*Stage1 + a5/a3*Output;


2 comment(s) | add a comment | nofrills version


Calculate notes (java)

Type : Java class for calculating notes with different in params
References : Posted by larsby[AT]elak[DOT]org
Linked file : Frequency.java

Notes :
Converts between string notes and frequencies and back. I vaguely remember writing bits of it, and I got it off the net somwhere so dont ask me

- Larsby




no comments on this item | add a comment | nofrills version


Cascaded resonant lp/hp filter

Type : lp+hp
References : Posted by tobybear[AT]web[DOT]de

Notes :
// Cascaded resonant lowpass/hipass combi-filter
// The original source for this filter is from Paul Kellet from
// the archive. This is a cascaded version in Delphi where the
// output of the lowpass is fed into the highpass filter.
// Cutoff frequencies are in the range of 0<=x<1 which maps to
// 0..nyquist frequency

// input variables are:
// cut_lp: cutoff frequency of the lowpass (0..1)
// cut_hp: cutoff frequency of the hipass (0..1)
// res_lp: resonance of the lowpass (0..1)
// res_hp: resonance of the hipass (0..1)


Code :
var n1,n2,n3,n4:single; // filter delay, init these with 0!
    fb_lp,fb_hp:single; // storage for calculated feedback
const p4=1.0e-24; // Pentium 4 denormal problem elimination

function dofilter(inp,cut_lp,res_lp,cut_hp,res_hp:single):single;
begin
fb_lp:=res_lp+res_lp/(1-cut_lp);
fb_hp:=res_hp+res_hp/(1-cut_lp);
n1:=n1+cut_lp*(inp-n1+fb_lp*(n1-n2))+p4;
n2:=n2+cut_lp*(n1-n2);
n3:=n3+cut_hp*(n2-n3+fb_hp*(n3-n4))+p4;
n4:=n4+cut_hp*(n3-n4);
result:=i-n4;
end;


5 comment(s) | add a comment | nofrills version


Center separation in a stereo mixdown

References : Posted by Thiburce BELAVENTURE

Notes :
One year ago, i found a little trick to isolate or remove the center in a stereo mixdown.

My method use the time-frequency representation (FFT). I use a min fuction between left and right channels (for each bin) to create the pseudo center. I apply a phase correction, and i substract this signal to the left and right signals.

Then, we can remix them after treatments (or without) to produce a stereo signal in output.

This algorithm (I called it "TBIsolator") is not perfect, but the result is very nice, better than the phase technic (L substract R...). I know that it is not mathematically correct, but as an estimation of the center, the exact match is very hard to obtain. So, it is not so bad (just listen the result and see).

My implementation use a 4096 FFT size, with overlap-add method (factor 2). With a lower FFT size, the sound will be more dirty, and with a 16384 FFT size, the center will have too much high frequency (I don't explore why this thing appears).

I just post the TBIsolator code (see FFTReal in this site for implement the FFT engine).

pIns and pOuts buffers use the representation of the FFTReal class (0 to N/2-1: real parts, N/2 to N-1: imaginary parts).

Have fun with the TBIsolator algorithm ! I hope you enjoy it and if you enhance it, contact me (it's my baby...).

P.S.: the following function is not optimized.


Code :
/* ============================================================= */
/* nFFTSize must be a power of 2                                 */
/* ============================================================= */
/* Usage examples:                                               */
/* - suppress the center:  fAmpL = 1.f, fAmpC = 0.f, fAmpR = 1.f */
/* - keep only the center: fAmpL = 0.f, fAmpC = 1.f, fAmpR = 0.f */
/* ============================================================= */

void processTBIsolator(float *pIns[2], float *pOuts[2], long nFFTSize, float fAmpL, float fAmpC, float fAmpR)
{
    float fModL, fModR;
    float fRealL, fRealC, fRealR;
    float fImagL, fImagC, fImagR;
    double u;

    for ( long i = 0, j = nFFTSize / 2; i < nFFTSize / 2; i++ )
    {
        fModL = pIns[0][i] * pIns[0][i] + pIns[0][j] * pIns[0][j];
        fModR = pIns[1][i] * pIns[1][i] + pIns[1][j] * pIns[1][j];

        // min on complex numbers
        if ( fModL > fModR )
        {
            fRealC = fRealR;
            fImagC = fImagR;
        }
        else
        {
            fRealC = fRealL;
            fImagC = fImagL;
        }

        // phase correction...
        u = fabs(atan2(pIns[0][j], pIns[0][i]) - atan2(pIns[1][j], pIns[1][i])) / 3.141592653589;

        if ( u >= 1 ) u -= 1.;

        u = pow(1 - u*u*u, 24);

        fRealC *= (float) u;
        fImagC *= (float) u;

        // center extraction...
        fRealL = pIns[0][i] - fRealC;
        fImagL = pIns[0][j] - fImagC;

        fRealR = pIns[1][i] - fRealC;
        fImagR = pIns[1][j] - fImagC;

        // You can do some treatments here...

        pOuts[0][i] = fRealL * fAmpL + fRealC * fAmpC;
        pOuts[0][j] = fImagL * fAmpL + fImagC * fAmpC;

        pOuts[1][i] = fRealR * fAmpR + fRealC * fAmpC;
        pOuts[1][j] = fImagR * fAmpR + fImagC * fAmpC;
    }
}


1 comment(s) | add a comment | nofrills version


Center separation in a stereo mixdown

References : Posted by Thiburce BELAVENTURE

Notes :
One year ago, i found a little trick to isolate or remove the center in a stereo mixdown.

My method use the time-frequency representation (FFT). I use a min fuction between left and right channels (for each bin) to create the pseudo center. I apply a phase correction, and i substract this signal to the left and right signals.

Then, we can remix them after treatments (or without) to produce a stereo signal in output.

This algorithm (I called it "TBIsolator") is not perfect, but the result is very nice, better than the phase technic (L substract R...). I know that it is not mathematically correct, but as an estimation of the center, the exact match is very hard to obtain. So, it is not so bad (just listen the result and see).

My implementation use a 4096 FFT size, with overlap-add method (factor 2). With a lower FFT size, the sound will be more dirty, and with a 16384 FFT size, the center will have too much high frequency (I don't explore why this thing appears).

I just post the TBIsolator code (see FFTReal in this site for implement the FFT engine).

pIns and pOuts buffers use the representation of the FFTReal class (0 to N/2-1: real parts, N/2 to N-1: imaginary parts).

Have fun with the TBIsolator algorithm ! I hope you enjoy it and if you enhance it, contact me (it's my baby...).

P.S.: the following function is not optimized.


Code :
/* ============================================================= */
/* nFFTSize must be a power of 2                                 */
/* ============================================================= */
/* Usage examples:                                               */
/* - suppress the center:  fAmpL = 1.f, fAmpC = 0.f, fAmpR = 1.f */
/* - keep only the center: fAmpL = 0.f, fAmpC = 1.f, fAmpR = 0.f */
/* ============================================================= */

void processTBIsolator(float *pIns[2], float *pOuts[2], long nFFTSize, float fAmpL, float fAmpC, float fAmpR)
{
    float fModL, fModR;
    float fRealL, fRealC, fRealR;
    float fImagL, fImagC, fImagR;
    double u;

    for ( long i = 0, j = nFFTSize / 2; i < nFFTSize / 2; i++ )
    {
        fModL = pIns[0][i] * pIns[0][i] + pIns[0][j] * pIns[0][j];
        fModR = pIns[1][i] * pIns[1][i] + pIns[1][j] * pIns[1][j];

        // min on complex numbers
        if ( fModL > fModR )
        {
            fRealC = fRealR;
            fImagC = fImagR;
        }
        else
        {
            fRealC = fRealL;
            fImagC = fImagL;
        }
    
        // phase correction...
        u = fabs(atan2(pIns[0][j], pIns[0][i]) - atan2(pIns[1][j], pIns[1][i])) / 3.141592653589;
    
        if ( u >= 1 ) u -= 1.;
    
        u = pow(1 - u*u*u, 24);
    
        fRealC *= (float) u;
        fImagC *= (float) u;
    
        // center extraction...
        fRealL = pIns[0][i] - fRealC;
        fImagL = pIns[0][j] - fImagC;
    
        fRealR = pIns[1][i] - fRealC;
        fImagR = pIns[1][j] - fImagC;
    
        // You can do some treatments here...
    
        pOuts[0][i] = fRealL * fAmpL + fRealC * fAmpC;
        pOuts[0][j] = fImagL * fAmpL + fImagC * fAmpC;
    
        pOuts[1][i] = fRealR * fAmpR + fRealC * fAmpC;
        pOuts[1][j] = fImagR * fAmpR + fImagC * fAmpC;
    }
}


no comments on this item | add a comment | nofrills version


Cheap pseudo-sinusoidal lfo

References : Posted by fumminger[AT]umminger[DOT]com

Notes :
Although the code is written in standard C++, this algorithm is really better suited for dsps where one can take advantage of multiply-accumulate instructions and where the required phase accumulator can be easily implemented by masking a counter.

It provides a pretty cheap roughly sinusoidal waveform that is good enough for an lfo.


Code :
// x should be between -1.0 and 1.0
inline
double pseudo_sine(double x)
{
    // Compute 2*(x^2-1.0)^2-1.0
    x *= x;
    x -= 1.0;
    x *= x;
    // The following lines modify the range to lie between -1.0 and 1.0.
   // If a range of between 0.0 and 1.0 is acceptable or preferable
   // (as in a modulated delay line) then you can save some cycles.
    x *= 2.0;
    x -= 1.0;
}


2 comment(s) | add a comment | nofrills version


chebyshev waveshaper (using their recursive definition)

Type : chebyshev
References : Posted by mdsp

Notes :
someone asked for it on kvr-audio.

I use it in an unreleased additive synth.
There's no oversampling needed in my case since I feed it with a pure sinusoid and I control the order to not have frequencies above Fs/2. Otherwise you should oversample by the order you'll use in the function or bandlimit the signal before the waveshaper. unless you really want that aliasing effect... :)

I hope the code is self-explaining, otherwise there's plenty of sites explaining chebyshev polynoms and their applications.


Code :
float chebyshev(float x, float A[], int order)
{
   // To = 1
   // T1 = x
   // Tn = 2.x.Tn-1 - Tn-2
   // out = sum(Ai*Ti(x)) , i C {1,..,order}
   float Tn_2 = 1.0f;
   float Tn_1 = x;
   float Tn;
   float out = A[0]*Tn_1;

   for(int n=2;n<=order;n++)
   {
      Tn    =   2.0f*x*Tn_1 - Tn_2;
      out    +=   A[n-1]*Tn;      
      Tn_2 =   Tn_1;
      Tn_1 =  Tn;
   }
   return out;
}


4 comment(s) | add a comment | nofrills version


Class for waveguide/delay effects

Type : IIR filter
References : Posted by arguru[AT]smartelectronix.com

Notes :
Flexible-time, non-sample quantized delay , can be used for stuff like waveguide synthesis or time-based (chorus/flanger) fx.

MAX_WG_DELAY is a constant determining MAX buffer size (in samples)


Code :
class cwaveguide  
{
public:
    cwaveguide(){clear();}
    virtual ~cwaveguide(){};
    
    void clear()
    {
        counter=0;
        for(int s=0;s<MAX_WG_DELAY;s++)
            buffer[s]=0;
    }
    
    inline float feed(float const in,float const feedback,double const delay)
    {
        // calculate delay offset
        double back=(double)counter-delay;
        
        // clip lookback buffer-bound
        if(back<0.0)
            back=MAX_WG_DELAY+back;
        
        // compute interpolation left-floor
        int const index0=floor_int(back);
        
        // compute interpolation right-floor
        int index_1=index0-1;
        int index1=index0+1;
        int index2=index0+2;
        
        // clip interp. buffer-bound
        if(index_1<0)index_1=MAX_WG_DELAY-1;
        if(index1>=MAX_WG_DELAY)index1=0;
        if(index2>=MAX_WG_DELAY)index2=0;
        
        // get neighbourgh samples
        float const y_1= buffer [index_1];
        float const y0 = buffer [index0];
        float const y1 = buffer [index1];
        float const y2 = buffer [index2];
        
        // compute interpolation x
        float const x=(float)back-(float)index0;
        
        // calculate
        float const c0 = y0;
        float const c1 = 0.5f*(y1-y_1);
        float const c2 = y_1 - 2.5f*y0 + 2.0f*y1 - 0.5f*y2;
        float const c3 = 0.5f*(y2-y_1) + 1.5f*(y0-y1);
        
        float const output=((c3*x+c2)*x+c1)*x+c0;
        
        // add to delay buffer
        buffer[counter]=in+output*feedback;
        
        // increment delay counter
        counter++;
        
        // clip delay counter
        if(counter>=MAX_WG_DELAY)
            counter=0;
        
        // return output
        return output;
    }
    
    float    buffer[MAX_WG_DELAY];
    int        counter;
};


no comments on this item | add a comment | nofrills version


Clipping without branching

Type : Min, max and clip
References : Posted by Laurent de Soras

Notes :
It may reduce accuracy for small numbers. I.e. if you clip to [-1; 1], fractional part of the result will be quantized to 23 bits (or more, depending on the bit depth of the temporary results). Thus, 1e-20 will be rounded to 0. The other (positive) side effect is the denormal number elimination.

Code :
float max (float x, float a)
{
   x -= a;
   x += fabs (x);
   x *= 0.5;
   x += a;
   return (x);
}

float min (float x, float b)
{
   x = b - x;
   x += fabs (x)
   x *= 0.5;
   x = b - x;
   return (x);
}

float clip (float x, float a, float b)
{
   x1 = fabs (x-a);
   x2 = fabs (x-b);
   x = x1 + (a+b);
   x -= x2;
   x *= 0.5;
   return (x);
}


6 comment(s) | add a comment | nofrills version


Coefficients for Daubechies wavelets 1-38

Type : wavelet transform
References : Computed by Kazuo Hatano, Compiled and verified by Olli Niemitalo
Linked file : daub.h


no comments on this item | add a comment | nofrills version


Compressor

Type : Hardknee compressor with RMS look-ahead envelope calculation and adjustable attack/decay
References : Posted by flashinc[AT]mail[DOT]ru

Notes :
RMS is a true way to estimate _musical_ signal energy,
our ears behaves in a same way.

to making all it work,
try this values (as is, routine accepts percents and milliseconds) for first time:

threshold = 50%
slope = 50%
RMS window width = 1 ms
lookahead = 3 ms
attack time = 0.1 ms
release time = 300 ms

This code can be significantly improved in speed by
changing RMS calculation loop to 'running summ'
(keeping the summ in 'window' -
adding next newest sample and subtracting oldest on each step)


Code :
void compress
    (
        float*  wav_in,     // signal
        int     n,          // N samples
        double  threshold,  // threshold (percents)
        double  slope,      // slope angle (percents)
        int     sr,         // sample rate (smp/sec)
        double  tla,        // lookahead  (ms)
        double  twnd,       // window time (ms)
        double  tatt,       // attack time  (ms)
        double  trel        // release time (ms)
    )
{
    typedef float   stereodata[2];
    stereodata*     wav = (stereodata*) wav_in; // our stereo signal
    threshold *= 0.01;          // threshold to unity (0...1)
    slope *= 0.01;              // slope to unity
    tla *= 1e-3;                // lookahead time to seconds
    twnd *= 1e-3;               // window time to seconds
    tatt *= 1e-3;               // attack time to seconds
    trel *= 1e-3;               // release time to seconds

    // attack and release "per sample decay"
    double  att = (tatt == 0.0) ? (0.0) : exp (-1.0 / (sr * tatt));
    double  rel = (trel == 0.0) ? (0.0) : exp (-1.0 / (sr * trel));

    // envelope
    double  env = 0.0;

    // sample offset to lookahead wnd start
    int     lhsmp = (int) (sr * tla);

    // samples count in lookahead window
    int     nrms = (int) (sr * twnd);

    // for each sample...
    for (int i = 0; i < n; ++i)
    {
        // now compute RMS
        double  summ = 0;

        // for each sample in window
        for (int j = 0; j < nrms; ++j)
        {
            int     lki = i + j + lhsmp;
            double  smp;

            // if we in bounds of signal?
            // if so, convert to mono
            if (lki < n)
                smp = 0.5 * wav[lki][0] + 0.5 * wav[lki][1];
            else
                smp = 0.0;      // if we out of bounds we just get zero in smp

            summ += smp * smp;  // square em..
        }

        double  rms = sqrt (summ / nrms);   // root-mean-square

        // dynamic selection: attack or release?
        double  theta = rms > env ? att : rel;

        // smoothing with capacitor, envelope extraction...
        // here be aware of pIV denormal numbers glitch
        env = (1.0 - theta) * rms + theta * env;

        // the very easy hard knee 1:N compressor
        double  gain = 1.0;
        if (env > threshold)
            gain = gain - (env - threshold) * slope;

        // result - two hard kneed compressed channels...
        float  leftchannel = wav[i][0] * gain;
        float  rightchannel = wav[i][1] * gain;
    }
}


6 comment(s) | add a comment | nofrills version


Constant-time exponent of 2 detector

References : Posted by Brent Lehman (mailbjl[AT]yahoo.com)

Notes :
In your common FFT program, you want to make sure that the frame you're working with has a size that is a power of 2. This tells you in just a few operations. Granted, you won't be using this algorithm inside a loop, so the savings aren't that great, but every little hack helps ;)

Code :
// Quit if size isn't a power of 2
if ((-size ^ size) & size) return;

// If size is an unsigned int, the above might not compile.
// You'd want to use this instead:
if (((~size + 1) ^ size) & size) return;


1 comment(s) | add a comment | nofrills version


Conversion and normalization of 16-bit sample to a floating point number

References : Posted by George Yohng
Code :
float out;
signed short in;

// This code does the same as
//   out = ((float)in)*(1.0f/32768.0f);
//
// Depending on the architecture and conversion settings,
// it might be more optimal, though it is always
// advisable to check its speed against genuine
// algorithms.

((unsigned &)out)=0x43818000^in;
out-=259.0f;


2 comment(s) | add a comment | nofrills version


Conversions on a PowerPC

Type : motorola ASM conversions
References : Posted by James McCartney
Code :
double ftod(float x) { return (double)x;
00000000: 4E800020  blr
    // blr == return from subroutine, i.e. this function is a noop

float dtof(double x) { return (float)x;
00000000: FC200818  frsp       fp1,fp1
00000004: 4E800020  blr

int ftoi(float x) { return (int)x;
00000000: FC00081E  fctiwz     fp0,fp1
00000004: D801FFF0  stfd       fp0,-16(SP)
00000008: 8061FFF4  lwz        r3,-12(SP)
0000000C: 4E800020  blr

int dtoi(double x) { return (int)x;
00000000: FC00081E  fctiwz     fp0,fp1
00000004: D801FFF0  stfd       fp0,-16(SP)
00000008: 8061FFF4  lwz        r3,-12(SP)
0000000C: 4E800020  blr

double itod(int x) { return (double)x;
00000000: C8220000  lfd        fp1,@1558(RTOC)
00000004: 6C608000  xoris      r0,r3,$8000
00000008: 9001FFF4  stw        r0,-12(SP)
0000000C: 3C004330  lis        r0,17200
00000010: 9001FFF0  stw        r0,-16(SP)
00000014: C801FFF0  lfd        fp0,-16(SP)
00000018: FC200828  fsub       fp1,fp0,fp1
0000001C: 4E800020  blr

float itof(int x) { return (float)x;
00000000: C8220000  lfd        fp1,@1558(RTOC)
00000004: 6C608000  xoris      r0,r3,$8000
00000008: 9001FFF4  stw        r0,-12(SP)
0000000C: 3C004330  lis        r0,17200
00000010: 9001FFF0  stw        r0,-16(SP)
00000014: C801FFF0  lfd        fp0,-16(SP)
00000018: EC200828  fsubs      fp1,fp0,fp1
0000001C: 4E800020  blr


no comments on this item | add a comment | nofrills version


Cool Sounding Lowpass With Decibel Measured Resonance

Type : LP 2-pole resonant tweaked butterworth
References : Posted by daniel_jacob_werner [AT] yaho [DOT] com [DOT] au

Notes :
This algorithm is a modified version of the tweaked butterworth lowpass filter by Patrice Tarrabia posted on musicdsp.org's archives. It calculates the coefficients for a second order IIR filter. The resonance is specified in decibels above the DC gain. It can be made suitable to use as a SoundFont 2.0 filter by scaling the output so the overall gain matches the specification (i.e. if resonance is 6dB then you should scale the output by -3dB). Note that you can replace the sqrt(2) values in the standard butterworth highpass algorithm with my "q =" line of code to get a highpass also. How it works: normally q is the constant sqrt(2), and this value controls resonance. At sqrt(2) resonance is 0dB, smaller values increase resonance. By multiplying sqrt(2) by a power ratio we can specify the resonant gain at the cutoff frequency. The resonance power ratio is calculated with a standard formula to convert between decibels and power ratios (the powf statement...).

Good Luck,
Daniel Werner
http://experimentalscene.com/


Code :
float c, csq, resonance, q, a0, a1, a2, b1, b2;

c = 1.0f / (tanf(pi * (cutoff / samplerate)));
csq = c * c;
resonance = powf(10.0f, -(resonancedB * 0.1f));
q = sqrt(2.0f) * resonance;
a0 = 1.0f / (1.0f + (q * c) + (csq));
a1 = 2.0f * a0;
a2 = a0;
b1 = (2.0f * a0) * (1.0f - csq);
b2 = a0 * (1.0f - (q * c) + csq);


7 comment(s) | add a comment | nofrills version


Copy-protection schemes

References : Posted by Moyer, Andy

Notes :
This post of Andy sums up everything there is to know about copy-protection schemes:

"Build a great product and release improvements regularly so that people will
be willing to spend the money on it, thus causing anything that is cracked
to be outdated quickly. Build a strong relationship with your customers,
because if they've already paid for one of your products, and were
satisfied, chances are, they will be more likely to buy another one of your
products. Make your copy protection good enough so that somebody can't just
do a search in Google and enter in a published serial number, but don't make
registered users jump through flaming hoops to be able to use the product.
Also use various approaches to copy protection within a release, and vary
those approaches over multiple releases so that a hacker that cracked your
app's version 1.0 can't just run a recorded macro in a text editor to crack
your version 2.0 software [this being simplified]."




1 comment(s) | add a comment | nofrills version


Cubic interpollation

Type : interpollation
References : Posted by Olli Niemitalo
Linked file : other001.gif

Notes :
(see linkfile)
finpos is the fractional, inpos the integer part.


Code :
xm1 = x [inpos - 1];
x0  = x [inpos + 0];
x1  = x [inpos + 1];
x2  = x [inpos + 2];
a = (3 * (x0-x1) - xm1 + x2) / 2;
b = 2*x1 + xm1 - (5*x0 + x2) / 2;
c = (x1 - xm1) / 2;
y [outpos] = (((a * finpos) + b) * finpos + c) * finpos + x0;


no comments on this item | add a comment | nofrills version


Cubic polynomial envelopes

Type : envellope generation
References : Posted by Andy Mucho

Notes :
This function runs from:
startlevel at Time=0
midlevel at Time/2
endlevel at Time
At moments of extreme change over small time, the function can generate out
of range (of the 3 input level) numbers, but isn't really a problem in
actual use with real numbers, and sensible/real times..


Code :
time = 32
startlevel = 0
midlevel = 100
endlevel = 120
k = startlevel + endlevel - (midlevel * 2)
r = startlevel
s = (endlevel - startlevel - (2 * k)) / time
t = (2 * k) / (time * time)
bigr = r
bigs = s + t
bigt = 2 * t

for(int i=0;i<time;i++)
{
bigr = bigr + bigs
bigs = bigs + bigt
}


2 comment(s) | add a comment | nofrills version


Cure for malicious samples

Type : Filters Denormals, NaNs, Infinities
References : Posted by urs[AT]u-he[DOT]com

Notes :
A lot of incidents can happen during processing samples. A nasty one is denormalization, which makes cpus consume insanely many cycles for easiest instructions.

But even worse, if you have NaNs or Infinities inside recursive structures, maybe due to division by zero, all subsequent samples that are multiplied with these values will get "infected" and become NaN or Infinity. Your sound makes BLIPPP and that was it, silence from the speakers.

Thus I've written a small function that sets all of these cases to 0.0f.

You'll notice that I treat a buffer of floats as unsigned integers. And I avaoid branches by using comparison results as 0 or 1.

When compiled with GCC, this function should not create any "hidden" branches, but you should verify the assembly code anyway. Sometimes some parenthesis do the trick...

;) Urs


Code :
#ifndef UInt32
#define UInt32 unsigned int
#endif

void erase_All_NaNs_Infinities_And_Denormals( float* inSamples, int& inNumberOfSamples )
{
    UInt32* inArrayOfFloats = (UInt32*) inSamples;

    for ( int i = 0; i < inNumberOfSamples; i++ )
    {
        UInt32 sample = *inArrayOfFloats;
        UInt32 exponent = sample & 0x7F800000;
        
            // exponent < 0x7F800000 is 0 if NaN or Infinity, otherwise 1
            // exponent > 0 is 0 if denormalized, otherwise 1
            
        int aNaN = exponent < 0x7F800000;
        int aDen = exponent > 0;
            
        *inArrayOfFloats++ = sample * ( aNaN & aDen );
    
    }
}


5 comment(s) | add a comment | nofrills version


DC filter

Type : 1-pole/1-zero DC filter
References : Posted by andy[DOT]rossol[AT]bluewin[DOT]ch

Notes :
This is based on code found in the document:
"Introduction to Digital Filters (DRAFT)"
Julius O. Smith III (jos@ccrma.stanford.edu)
(http://www-ccrma.stanford.edu/~jos/filters/)
---

Some audio algorithms (asymmetric waveshaping, cascaded filters, ...) can produce DC offset. This offset can accumulate and reduce the signal/noise ratio.

So, how to fix it? The example code from Julius O. Smith's document is:
...
y(n) = x(n) - x(n-1) + R * y(n-1)
// "R" between 0.9 .. 1
// n=current (n-1)=previous in/out value
...
"R" depends on sampling rate and the low frequency point. Do not set "R" to a fixed value (e.g. 0.99) if you don't know the sample rate. Instead set R to:
(-3dB @ 40Hz): R = 1-(250/samplerate)
(-3dB @ 30Hz): R = 1-(190/samplerate)
(-3dB @ 20Hz): R = 1-(126/samplerate)




2 comment(s) | add a comment | nofrills version


Decimator

Type : Bit-reducer and sample&hold unit
References : Posted by tobyear[AT]web[DOT]de

Notes :
This is a simple bit and sample rate reduction code, maybe some of you can use it. The parameters are bits (1..32) and rate (0..1, 1 is the original samplerate).
Call the function like this:
y=decimate(x);

A VST plugin implementing this algorithm (with full Delphi source code included) can be downloaded from here:
http://tobybear.phreque.com/decimator.zip

Comments/suggestions/improvements are welcome, send them to: tobybear@web.de


Code :
// bits: 1..32
// rate: 0..1 (1 is original samplerate)

********** Pascal source **********
var m:longint;
    y,cnt,rate:single;

// call this at least once before calling
// decimate() the first time
procedure setparams(bits:integer;shrate:single);
begin
m:=1 shl (bits-1);
cnt:=1;
rate:=shrate;
end;

function decimate(i:single):single;
begin
cnt:=cnt+rate;
if (cnt>1) then
begin
  cnt:=cnt-1;
  y:=round(i*m)/m;
end;
result:=y;
end;

********** C source **********
int bits=16;
float rate=0.5;

long int m=1<<(bits-1);
float y=0,cnt=0;

float decimate(float i)
{
cnt+=rate;
if (cnt>=1)
{
  cnt-=1;
  y=(long int)(i*m)/(float)m;
}
return y;
}


9 comment(s) | add a comment | nofrills version


Delay time calculation for reverberation

References : Posted by Andy Mucho

Notes :
This is from some notes I had scribbled down from a while back on
automatically calculating diffuse delays. Given an intial delay line gain
and time, calculate the times and feedback gain for numlines delay lines..


Code :
int   numlines = 8;
float t1 = 50.0;        // d0 time
float g1 = 0.75;        // d0 gain
float rev = -3*t1 / log10 (g1);

for (int n = 0; n < numlines; ++n)
{
  float dt = t1 / pow (2, (float (n) / numlines));
  float g = pow (10, -((3*dt) / rev));
  printf ("d%d t=%.3f g=%.3f\n", n, dt, g);
}

The above with t1=50.0 and g1=0.75 yields:

d0 t=50.000 g=0.750
d1 t=45.850 g=0.768
d2 t=42.045 g=0.785
d3 t=38.555 g=0.801
d4 t=35.355 g=0.816
d5 t=32.421 g=0.830
d6 t=29.730 g=0.843
d7 t=27.263 g=0.855

To go more diffuse, chuck in dual feedback paths with a one cycle delay
effectively creating a phase-shifter in the feedback path, then things get
more exciting.. Though what the optimum phase shifts would be I couldn't
tell you right now..


1 comment(s) | add a comment | nofrills version


Delphi Class implementation of the RBJ filters

Type : Delphi class implementation of the RBJ filters
References : Posted by veryangrymobster@hotmail.com

Notes :
I haven't tested this code thoroughly as it's pretty much a straight conversion from Arguru c++ implementation.

Code :
  {
  RBJ Audio EQ Cookbook Filters
  A pascal conversion of arguru[AT]smartelectronix[DOT]com's
  c++ implementation.

  WARNING:This code is not FPU undernormalization safe.

  Filter Types
  0-LowPass
  1-HiPass
  2-BandPass CSG
  3-BandPass CZPG
  4-Notch
  5-AllPass
  6-Peaking
  7-LowShelf
  8-HiShelf
  }
unit uRbjEqFilters;

interface

uses math;

type
  TRbjEqFilter=class
  private
    b0a0,b1a0,b2a0,a1a0,a2a0:single;
    in1,in2,ou1,ou2:single;
    fSampleRate:single;
    fMaxBlockSize:integer;
    fFilterType:integer;
    fFreq,fQ,fDBGain:single;
    fQIsBandWidth:boolean;
    procedure SetQ(NewQ:single);
  public
    out1:array of single;
    constructor create(SampleRate:single;MaxBlockSize:integer);
    procedure CalcFilterCoeffs(pFilterType:integer;pFreq,pQ,pDBGain:single;pQIsBandWidth:boolean);overload;
    procedure CalcFilterCoeffs;overload;
    function Process(input:single):single; overload;
    procedure Process(Input:psingle;sampleframes:integer); overload;
    property FilterType:integer read fFilterType write fFilterType;
    property Freq:single read fFreq write fFreq;
    property q:single read fQ write SetQ;
    property DBGain:single read fDBGain write fDBGain;
    property QIsBandWidth:boolean read fQIsBandWidth write fQIsBandWidth;
  end;

implementation

constructor TRbjEqFilter.create(SampleRate:single;MaxBlockSize:integer);
begin
  fMaxBlockSize:=MaxBlockSize;
  setLength(out1,fMaxBlockSize);
  fSampleRate:=SampleRate;

  fFilterType:=0;
  fFreq:=500;
  fQ:=0.3;
  fDBGain:=0;
  fQIsBandWidth:=true;

  in1:=0;
  in2:=0;
  ou1:=0;
  ou2:=0;
end;

procedure TRbjEqFilter.SetQ(NewQ:single);
begin
  fQ:=(1-NewQ)*0.98;
end;

procedure TRbjEqFilter.CalcFilterCoeffs(pFilterType:integer;pFreq,pQ,pDBGain:single;pQIsBandWidth:boolean);
begin
  FilterType:=pFilterType;
  Freq:=pFreq;
  Q:=pQ;
  DBGain:=pDBGain;
  QIsBandWidth:=pQIsBandWidth;

  CalcFilterCoeffs;
end;

procedure TRbjEqFilter.CalcFilterCoeffs;
var
  alpha,a0,a1,a2,b0,b1,b2:single;
  A,beta,omega,tsin,tcos:single;
begin
  //peaking, LowShelf or HiShelf
  if fFilterType>=6 then
  begin
    A:=power(10.0,(DBGain/40.0));
    omega:=2*pi*fFreq/fSampleRate;
    tsin:=sin(omega);
    tcos:=cos(omega);

    if fQIsBandWidth then
      alpha:=tsin*sinh(log2(2.0)/2.0*fQ*omega/tsin)
    else
      alpha:=tsin/(2.0*fQ);

    beta:=sqrt(A)/fQ;

    // peaking
    if fFilterType=6 then
    begin
      b0:=1.0+alpha*A;
      b1:=-2.0*tcos;
      b2:=1.0-alpha*A;
      a0:=1.0+alpha/A;
      a1:=-2.0*tcos;
      a2:=1.0-alpha/A;
    end else
    // lowshelf
    if fFilterType=7 then
    begin
      b0:=(A*((A+1.0)-(A-1.0)*tcos+beta*tsin));
      b1:=(2.0*A*((A-1.0)-(A+1.0)*tcos));
      b2:=(A*((A+1.0)-(A-1.0)*tcos-beta*tsin));
      a0:=((A+1.0)+(A-1.0)*tcos+beta*tsin);
      a1:=(-2.0*((A-1.0)+(A+1.0)*tcos));
      a2:=((A+1.0)+(A-1.0)*tcos-beta*tsin);
    end;
    // hishelf
    if fFilterType=8 then
    begin
      b0:=(A*((A+1.0)+(A-1.0)*tcos+beta*tsin));
      b1:=(-2.0*A*((A-1.0)+(A+1.0)*tcos));
      b2:=(A*((A+1.0)+(A-1.0)*tcos-beta*tsin));
      a0:=((A+1.0)-(A-1.0)*tcos+beta*tsin);
      a1:=(2.0*((A-1.0)-(A+1.0)*tcos));
      a2:=((A+1.0)-(A-1.0)*tcos-beta*tsin);
    end;
  end else  //other filter types
  begin
    omega:=2*pi*fFreq/fSampleRate;
    tsin:=sin(omega);
    tcos:=cos(omega);
    if fQIsBandWidth then
      alpha:=tsin*sinh(log2(2)/2*fQ*omega/tsin)
    else
      alpha:=tsin/(2*fQ);
    //lowpass
    if fFilterType=0 then
    begin
      b0:=(1-tcos)/2;
      b1:=1-tcos;
      b2:=(1-tcos)/2;
      a0:=1+alpha;
      a1:=-2*tcos;
      a2:=1-alpha;
    end else //hipass
    if fFilterType=1 then
    begin
      b0:=(1+tcos)/2;
      b1:=-(1+tcos);
      b2:=(1+tcos)/2;
      a0:=1+alpha;
      a1:=-2*tcos;
      a2:=1-alpha;
    end else //bandpass CSG
    if fFilterType=2 then
    begin
      b0:=tsin/2;
      b1:=0;
      b2:=-tsin/2;
      a0:=1+alpha;
      a1:=-1*tcos;
      a2:=1-alpha;
    end else //bandpass CZPG
    if fFilterType=3 then
    begin
      b0:=alpha;
      b1:=0.0;
      b2:=-alpha;
      a0:=1.0+alpha;
      a1:=-2.0*tcos;
      a2:=1.0-alpha;
    end else  //notch
    if fFilterType=4 then
    begin
      b0:=1.0;
      b1:=-2.0*tcos;
      b2:=1.0;
      a0:=1.0+alpha;
      a1:=-2.0*tcos;
      a2:=1.0-alpha;
    end else   //allpass
    if fFilterType=5 then
    begin
      b0:=1.0-alpha;
      b1:=-2.0*tcos;
      b2:=1.0+alpha;
      a0:=1.0+alpha;
      a1:=-2.0*tcos;
      a2:=1.0-alpha;
    end;
  end;

  b0a0:=b0/a0;
  b1a0:=b1/a0;
  b2a0:=b2/a0;
  a1a0:=a1/a0;
  a2a0:=a2/a0;
end;


function TRbjEqFilter.Process(input:single):single;
var
  LastOut:single;
begin
  // filter
  LastOut:= b0a0*input + b1a0*in1 + b2a0*in2 - a1a0*ou1 - a2a0*ou2;

  // push in/out buffers
  in2:=in1;
  in1:=input;
  ou2:=ou1;
  ou1:=LastOut;

  // return output
  result:=LastOut;
end;

{
the process method is overloaded.
use Process(input:single):single;
for per sample processing
use Process(Input:psingle;sampleframes:integer);
for block processing. The input is a pointer to
the start of an array of single which contains
the audio data.
i.e.
RBJFilter.Process(@WaveData[0],256);
}

procedure TRbjEqFilter.Process(Input:psingle;sampleframes:integer);
var
  i:integer;
  LastOut:single;
begin
  for i:=0 to SampleFrames-1 do
  begin
    // filter
    LastOut:= b0a0*(input^)+ b1a0*in1 + b2a0*in2 - a1a0*ou1 - a2a0*ou2;
    //LastOut:=input^;
    // push in/out buffers
    in2:=in1;
    in1:=input^;
    ou2:=ou1;
    ou1:=LastOut;

    Out1[i]:=LastOut;

    inc(input);
  end;
end;

end.


2 comment(s) | add a comment | nofrills version


Denormal DOUBLE variables, macro

References : Posted by Jon Watte

Notes :
Use this macro if you want to find denormal numbers and you're using doubles...

Code :
#if PLATFORM_IS_BIG_ENDIAN
#define INDEX 0
#else
#define INDEX 1
#endif
inline bool is_denormal( double const & d ) {
  assert( sizeof( d ) == 2*sizeof( int ) );
  int l = ((int *)&d)[INDEX];
  return (l&0x7fe00000) != 0;
}


1 comment(s) | add a comment | nofrills version


Denormal numbers

References : Compiled by Merlijn Blaauw
Linked file : other001.txt

Notes :
this text describes some ways to avoid denormalisation. Denormalisation happens when FPU's go mad processing very small numbers



1 comment(s) | add a comment | nofrills version


Denormal numbers, the meta-text

References : Laurent de Soras
Linked file : denormal.pdf

Notes :
This very interesting paper, written by Laurent de Soras (www.ohmforce.com) has everything you ever wanted to know about denormal numbers! And it obviously descibes how you can get rid of them too!

(see linked file)




no comments on this item | add a comment | nofrills version


Denormalization preventer

References : Posted by gol

Notes :
A fast tiny numbers sweeper using integer math. Only for 32bit floats. Den_Thres is your 32bit (normalized) float threshold, something small enough but big enough to prevent future denormalizations.

EAX=input buffer
EDX=length
(adapt to your compiler)


Code :
    MOV   ECX,EDX
    LEA   EDI,[EAX+4*ECX]
    NEG   ECX
    MOV   EDX,Den_Thres
    SHL   EDX,1
    XOR   ESI,ESI

    @Loop:
    MOV   EAX,[EDI+4*ECX]
    LEA   EBX,[EAX*2]
    CMP   EBX,EDX
    CMOVB EAX,ESI
    MOV   [EDI+4*ECX],EAX

    INC   ECX
    JNZ   @Loop


no comments on this item | add a comment | nofrills version


Denormalization preventer

References : Posted by didid[AT]skynet[DOT]be

Notes :
Because I still see people adding noise or offset to their signal to avoid slow denormalization, here's a piece of code to zero out (near) tiny numbers instead.

Why zeroing out is better? Because a fully silent signal is better than a little offset, or noise. A host or effect can detect silent signals and choose not to process them in a safe way.
Plus, adding an offset or noise reduces huge packets of denormalization, but still leaves some behind.
Also, truncating is what the DAZ (Denormals Are Zero) SSE flag does.

This code uses integer comparison, and a CMOV, so you need a Pentium Pro or higher.
There's no need for an SSE version, as if you have SSE code you're probably already using the DAZ flag instead (but I advise plugins not to mess with the SSE flags, as the host is likely to have DAZ switched on already). This is for FPU code. Should work much faster than crap FPU comparison.

Den_Thres is your threshold, it cannot be denormalized (would be pointless). The function is Delphi, if you want to adapt, just make sure EAX is the buffer and EDX is length (Delphi register calling convention - it's not the same in C++).


Code :
const  Den_Thres:Single=1/$1000000;

procedure PrevFPUDen_Buffer(Buffer:Pointer;Length:Integer);
asm
    PUSH  ESI
    PUSH  EDI
    PUSH  EBX

    MOV   ECX,EDX
    LEA   EDI,[EAX+4*ECX]
    NEG   ECX
    MOV   EDX,Den_Thres
    SHL   EDX,1
    XOR   ESI,ESI

    @Loop:
    MOV   EAX,[EDI+4*ECX]
    LEA   EBX,[EAX*2]
    CMP   EBX,EDX
    CMOVB EAX,ESI
    MOV   [EDI+4*ECX],EAX

    INC   ECX
    JNZ   @Loop

    POP   EBX
    POP   EDI
    POP   ESI
End;


4 comment(s) | add a comment | nofrills version


DFT

Type : fourier transform
References : Posted by Andy Mucho
Code :
AnalyseWaveform(float *waveform, int framesize)
{
   float aa[MaxPartials];
   float bb[MaxPartials];
   for(int i=0;i<partials;i++)
   {
     aa[i]=0;
     bb[i]=0;
   }

   int hfs=framesize/2;
   float pd=pi/hfs;
   for (i=0;i<framesize;i++)
   {
     float w=waveform[i];
     int im = i-hfs;
     for(int h=0;h<partials;h++)
     {
        float th=(pd*(h+1))*im;
        aa[h]+=w*cos(th);
        bb[h]+=w*sin(th);
     }
   }
   for (int h=0;h<partials;h++)
       amp[h]= sqrt(aa[h]*aa[h]+bb[h]*bb[h])/hfs;
}


2 comment(s) | add a comment | nofrills version


Digital RIAA equalization filter coefficients

Type : RIAA
References : Posted by Frederick Umminger

Notes :
Use at your own risk. Confirm correctness before using. Don't assume I didn't goof something up.

-Frederick Umminger


Code :
The "turntable-input software" thread inspired me to generate some coefficients for a digital RIAA equalization filter. These coefficients were found by matching the magnitude response of the s-domain transfer function using some proprietary Matlab scripts. The phase response may or may not be totally whacked.

The s-domain transfer function is

R3(1+R1*C1*s)(1+R2*C2*s)/(R1(1+R2*C2*s) + R2(1+R1*C1*s) + R3(1+R1*C1*s)(1+R2*C2*s))

where

R1 = 883.3k
R2 = 75k
R3 = 604
C1 = 3.6n
C2 = 1n

This is based on the reference circuit found in http://www.hagtech.com/pdf/riaa.pdf

The coefficients of the digital transfer function b(z^-1)/a(z^-1) in descending powers of z, are:

44.1kHz
b = [ 0.02675918611906  -0.04592084787595   0.01921229297239]
a = [ 1.00000000000000  -0.73845850035973  -0.17951755477430]
error +/- 0.25dB

48kHz
b = [  0.02675918611906  -0.04592084787595   0.01921229297239]
a = [  1.00000000000000  -0.73845850035973  -0.17951755477430]
error +/- 0.15dB

88.2kHz
b = [  0.04872204977233  -0.09076930609195   0.04202280710877]
a = [ 1.00000000000000  -0.85197860443215  -0.10921171201431]
error +/- 0.01dB


96kHz
b = [ 0.05265477122714  -0.09864197097385   0.04596474352090  ]
a = [  1.00000000000000  -0.85835597216218  -0.10600020417219 ]
error +/- 0.006dB



10 comment(s) | add a comment | nofrills version


DIRAC - Free C/C++ Library for Time and Pitch Manipulation of Audio Based on Time-Frequency Transforms

Type : Time Stretch / Pitch Shift
References : Posted by Stephan M. Bernsee

Notes :
This is an availability notification for a free object library, no source code.

Code :
Past research has shown time domain [pitch] synchronized overlap-add ([P]SOLA) algorithms for independent time and pitch manipulation of audio ("time stretching" and "pitch shifting") to be the method of choice for single-pitched sounds such as voice and musically monophonic instrument recordings due to the prominent periodicity at the fundamental period. On the other hand, frequency domain methods have recently evolved around the concept of the phase vocoder that have proven to be vastly superior for multi-pitched sounds and entire musical pieces.

"Dirac" is a free cross-platform C/C++ object library that exploits the good localization of time-frequency transforms in both domains to build an algorithm for time and pitch manipulation that uses an arbitrary time-frequency tiling depending on the underlying signal. Additionally, the time and frequency localization parameter of the basis can be user-defined, making the algorithm smoothly scalable to provide either the phase coherence properties of a time domain process or the good frequency resolution of the phase vocoder.

The basic "Dirac" library comes as a free download off the DSPdimension web site and is currently available for Microsoft Visual C6+, CodeWarrior 8.x on Windows and MacOS, and for Xcode 2.x on MacOS X. Optional "Studio" and "Pro" versions with increased feature set are available commercially from the author.


More information and download at http://www.dspdimension.com


9 comment(s) | add a comment | nofrills version


Direct form II

Type : generic
References : Posted by Fuzzpilz

Notes :
I've noticed there's no code for direct form II filters in general here, though probably many of the filter examples use it. I haven't looked at them all to verify that, but there certainly doesn't seem to be a snippet describing this.

This is a simple direct form II implementation of a k-pole, k-zero filter. It's a little faster than (a naive, real-time implementation of) direct form I, as well as more numerically accurate.


Code :
Direct form I pseudocode:

y[n] = a[0]*x[n] + a[1]*x[n-1] + .. + a[k]*x[n-k]
                 - b[1]*y[n-1] - .. - b[k]*y[n-k];

Simple equivalent direct form II pseudocode:

y[n] = a[0]*x[n] + d[0];
d[0] = a[1]*x[n] - b[1]*y[n] + d[1];
d[1] = a[2]*x[n] - b[2]*y[n] + d[2];
.
.
d[k-2] = a[k-1]*x[n] - b[k-1]*y[n] + d[k-1];
d[k-1] = a[k]*x[n] - b[k]*y[n];

For example, a biquad:

out = a0*in + a1*h0 + a2*h1 - b1*h2 - b2*h3;
h1 = h0;
h0 = in;
h3 = h2;
h2 = out;

becomes

out = a0*in + d0;
d0 = a1*in - b1*out + d1;
d1 = a2*in - b2*out;


5 comment(s) | add a comment | nofrills version


Direct Form II biquad

References : Posted by robert.bielik at xponaut dot se

Notes :
The nominal implementation for biquads is the Direct Form I variant. But the Direct Form II is actually more suited for calculating the biquad since it needs only 2 memory locations, and the multiplications can be pipelined better by the compiler. In release build, I've noted a considerable speedup when compared to DF I. When processing stereo, the code below was ~ 2X faster. Until I develop a SIMD biquad that is faster, this will do.

Code :
// b0, b1, b2, a1, a2 calculated via r.b-j's cookbook
// formulae.
// m1, m2 are the memory locations
// dn is the de-denormal coeff (=1.0e-20f)

void processBiquad(const float* in, float* out, unsigned length)
{
    for(unsigned i = 0; i < length; ++i)
    {
        register float w = in[i] - a1*m1 - a2*m2 + dn;
        out[i] = b1*m1 + b2*m2 + b0*w;
        m2 = m1; m1 = w;
    }
    dn = -dn;
}

void processBiquadStereo(const float* inL,
   const float* inR,
   float* outL,
   float* outR,
   unsigned length)
{
    for(unsigned i = 0; i < length; ++i)
    {
        register float wL = inL[i] - a1*m1L - a2*m2L + dn;
        register float wR = inR[i] - a1*m1R - a2*m2R + dn;
        outL[i] = b1*m1L + b2*m2L + b0*wL;
        m2L = m1L; m1L = wL;
        outR[i] = b1*m1R + b2*m2R + b0*wR;
        m2R = m1R; m1R = wR;
    }
    dn = -dn;
}


2 comment(s) | add a comment | nofrills version


Direct pink noise synthesis with auto-correlated generator

Type : 16-bit fixed-point
References : Posted by RidgeRat

Notes :
Canonical C++ class with minimum system dependencies, BUT you must provide your own uniform random number generator. Accurate range is a little over 9 octaves, degrading gracefully beyond this. Estimated deviations +-0.25 dB from ideal 1/f curve in range. Scaled to fit signed 16-bit range.

Code :
// Pink noise class using the autocorrelated generator method.
// Method proposed and described by Larry Trammell "the RidgeRat" --
// see http://home.earthlink.net/~ltrammell/tech/newpink.htm
// There are no restrictions.
//
// ------------------------------------------------------------------
//
// This is a canonical, 16-bit fixed-point implementation of the
// generator in 32-bit arithmetic. There are only a few system
// dependencies.
//
//   -- access to an allocator 'malloc' for operator new
//   -- access to definition of 'size_t'
//   -- assumes 32-bit two's complement arithmetic
//   -- assumes long int is 32 bits, short int is 16 bits
//   -- assumes that signed right shift propagates the sign bit
//
// It needs a separate URand class to provide uniform 16-bit random
// numbers on interval [1,65535]. The assumed class must provide
// methods to query and set the current seed value, establish a
// scrambled initial seed value, and evaluate uniform random values.
//
//
// ----------- header -----------------------------------------------
// pinkgen.h

#ifndef  _pinkgen_h_
#define  _pinkgen_h_  1

#include  <stddef.h>
#include  <alloc.h>

// You must provide the uniform random generator class.
#ifndef  _URand_h_
#include  "URand.h"
#endif

class PinkNoise {
  private:
    // Coefficients (fixed)
    static long int const pA[5];
    static short int const pPSUM[5];

    // Internal pink generator state
    long int   contrib[5];   // stage contributions
    long int   accum;        // combined generators
    void       internal_clear( );          

    // Include a UNoise component
    URand     ugen;

  public:
    PinkNoise( );
    PinkNoise( PinkNoise & );
    ~PinkNoise( );
    void *  operator new( size_t );
    void  pinkclear( );
    short int  pinkrand( );
} ;
#endif

// ----------- implementation ---------------------------------------
// pinkgen.cpp

#include  "pinkgen.h"

// Static class data
long int const PinkNoise::pA[5] =
    { 14055, 12759, 10733, 12273, 15716 };
short int const PinkNoise::pPSUM[5] =
    { 22347, 27917, 29523, 29942, 30007 };

// Clear generator to a zero state.
void   PinkNoise::pinkclear( )
{
    int  i;
    for  (i=0; i<5; ++i)  { contrib[i]=0L; }
    accum = 0L;
}

// PRIVATE, clear generator and also scramble the internal
// uniform generator seed.
void   PinkNoise::internal_clear( )
{
    pinkclear();
    ugen.seed(0);    // Randomizes the seed!
}

// Constructor. Guarantee that initial state is cleared
// and uniform generator scrambled.
PinkNoise::PinkNoise( )
{
    internal_clear();
}

// Copy constructor. Preserve generator state from the source
// object, including the uniform generator seed.
PinkNoise::PinkNoise( PinkNoise & Source )
{
    int  i;
    for (i=0; i<5; ++i)  contrib[i]=Source.contrib[i];
    accum = Source.accum;
    ugen.seed( Source.ugen.seed( ) );
}

// Operator new. Just fetch required object storage.
void *   PinkNoise::operator new( size_t size )
{
    return   malloc(size);
}

// Destructor. No special action required.
PinkNoise::~PinkNoise( )  { /* NIL */ }

// Coding artifact for convenience
#define   UPDATE_CONTRIB(n)  \
     {                                   \
       accum -= contrib[n];              \
       contrib[n] = (long)randv * pA[n]; \
       accum += contrib[n];              \
       break;                            \
     }                                  

// Evaluate next randomized 'pink' number with uniform CPU loading.
short int   PinkNoise::pinkrand( )
{
    short int  randu = ugen.urand() & 0x7fff;     // U[0,32767]
    short int  randv = (short int) ugen.urand();  // U[-32768,32767]

    // Structured block, at most one update is performed
    while (1)
    {
      if (randu < pPSUM[0]) UPDATE_CONTRIB(0);
      if (randu < pPSUM[1]) UPDATE_CONTRIB(1);
      if (randu < pPSUM[2]) UPDATE_CONTRIB(2);
      if (randu < pPSUM[3]) UPDATE_CONTRIB(3);
      if (randu < pPSUM[4]) UPDATE_CONTRIB(4);
      break;
    }
    return (short int) (accum >> 16);
}

// ----------- application   -------------------------------

short int    pink_signal[1024];

void   example(void)
{
    PinkNoise    pinkgen;
    int  i;
    for  (i=0; i<1024; ++i)   pink_signal[i] = pinkgen.pinkrand();
}


no comments on this item | add a comment | nofrills version


Discrete Summation Formula (DSF)

References : Stylson, Smith and others... (posted by Alexander Kritov)

Notes :
Buzz uses this type of synth.
For cool sounds try to use variable,
for example a=exp(-x/12000)*0.8 // x- num.samples


Code :
double DSF (double x,  // input
            double a,  // a<1.0
            double N,  // N<SmplFQ/2,
            double fi) // phase
{
  double s1 = pow(a,N-1.0)*sin((N-1.0)*x+fi);
  double s2 = pow(a,N)*sin(N*x+fi);
  double s3 = a*sin(x+fi);
  double s4 =1.0 - (2*a*cos(x)) +(a*a);
  if (s4==0)
     return 0;
  else
     return (sin(fi) - s3 - s2 +s1)/s4;
}


6 comment(s) | add a comment | nofrills version


Dither code

Type : Dither with noise-shaping
References : Posted by Paul Kellett

Notes :
This is a simple implementation of highpass triangular-PDF dither (a good general-purpose dither) with optional 2nd-order noise shaping (which lowers the noise floor by 11dB below 0.1 Fs).
The code assumes input data is in the range +1 to -1 and doesn't check for overloads!

To save time when generating dither for multiple channels you can re-use lower bits of a previous random number instead of calling rand() again. e.g. r3=(r1 & 0x7F)<<8;


Code :
  int   r1, r2;                //rectangular-PDF random numbers
  float s1, s2;                //error feedback buffers
  float s = 0.5f;              //set to 0.0f for no noise shaping
  float w = pow(2.0,bits-1);   //word length (usually bits=16)
  float wi= 1.0f/w;
  float d = wi / RAND_MAX;     //dither amplitude (2 lsb)
  float o = wi * 0.5f;         //remove dc offset
  float in, tmp;
  int   out;

//for each sample...

  r2=r1;                               //can make HP-TRI dither by
  r1=rand();                           //subtracting previous rand()

  in += s * (s1 + s1 - s2);            //error feedback
  tmp = in + o + d * (float)(r1 - r2); //dc offset and dither

  out = (int)(w * tmp);                //truncate downwards
  if(tmp<0.0f) out--;                  //this is faster than floor()

  s2 = s1;
  s1 = in - wi * (float)out;           //error


no comments on this item | add a comment | nofrills version


Dithering

References : Paul Kellett
Linked file : nsdither.txt

Notes :
(see linked file)



no comments on this item | add a comment | nofrills version


Double to Int

Type : pointer cast (round to zero, or 'trunctate')
References : Posted by many people, implementation by Andy M00cho

Notes :
-Platform independant, literally. You have IEEE FP numbers, this will work, as long as your not expecting a signed integer back larger than 16bits :)
-Will only work correctly for FP numbers within the range of [-32768.0,32767.0]
-The FPU must be in Double-Precision mode


Code :
typedef double lreal;
typedef float  real;
typedef unsigned long uint32;
typedef long int32;

   //2^36 * 1.5, (52-_shiftamt=36) uses limited precision to floor
   //16.16 fixed point representation

const lreal _double2fixmagic = 68719476736.0*1.5;
const int32 _shiftamt        = 16;

#if BigEndian_
        #define iexp_                           0
        #define iman_                           1
#else
        #define iexp_                           1
        #define iman_                           0
#endif //BigEndian_

// Real2Int
inline int32 Real2Int(lreal val)
{
   val= val + _double2fixmagic;
   return ((int32*)&val)[iman_] >> _shiftamt;
}

// Real2Int
inline int32 Real2Int(real val)
{
   return Real2Int ((lreal)val);
}

For the x86 assembler freaks here's the assembler equivalent:

__double2fixmagic    dd 000000000h,042380000h

fld    AFloatingPoint Number
fadd   QWORD PTR __double2fixmagic
fstp   TEMP
movsx  eax,TEMP+2


2 comment(s) | add a comment | nofrills version


Drift generator

Type : Random
References : Posted by quintosardo[AT]yahoo[DOT]it

Notes :
I use this drift to modulate any sound parameter of my synth.
It is very effective if it slightly modulates amplitude or frequency of an FM modulator.
It is based on an incremental random variable, sine-warped.
I like it because it is "continuous" (as opposite to "sample and hold"), and I can set variation rate and max variation.
It can go to upper or lower constraint (+/- max drift) but it gradually decreases rate of variation when approaching to the limit.
I use it exactly as an LFO (-1.f .. +1.f)
I use a table for sin instead of sin() function because this way I can change random distribution, by selecting a different curve (different table) from sine...

I hope that it is clear ... (sigh... :-)
Bye!!!
P.S. Thank you for help in previous submission ;-)


Code :
const int kSamples //Number of samples in fSinTable below
float fSinTable[kSamples] // Tabulated sin() [0 - 2pi[ amplitude [-1.f .. 1.f]
float fWhere// Index
float fRate // Max rate of variation
float fLimit //max or min value
float fDrift // Output

//I assume that random() is a number from 0.f to 1.f, otherwise scale it

fWhere += fRate * random()
//I update this drift in a long-term cycle, so I don't care of branches
if (fWhere >= 1.f) fWhere -= 1.f
else if (fWhere < 0.f) sWhere += 1.f

fDrift = fLimit * fSinTable[(long) (fWhere * kSamples)]


1 comment(s) | add a comment | nofrills version


DSF (super-set of BLIT)

Type : matlab code
References : Posted by David Lowenfels

Notes :
Discrete Summation Formula ala Moorer

computes equivalent to sum{k=0:N-1}(a^k * sin(beta + k*theta))
modified from Emanuel Landeholm's C code
output should never clip past [-1,1]

If using for BLIT synthesis for virtual analog:
N = maxN;
a = attn_at_Nyquist ^ (1/maxN); %hide top harmonic popping in and out when sweeping frequency
beta = pi/2;
num = 1 - a^N * cos(N*theta) - a*( cos(theta) - a^N * cos(N*theta - theta) ); %don't waste time on beta

You can also get growing harmonics if a > 1, but the min statement in the code must be removed, and the scaling will be weird.


Code :
function output = dsf( freq, a, H, samples, beta)
%a = rolloff coeffecient
%H = number of harmonic overtones (fundamental not included)
%beta = harmonic phase shift

samplerate = 44.1e3;
freq = freq/samplerate; %normalize frequency

bandlimit = samplerate / 2; %Nyquist
maxN = 1 + floor( bandlimit / freq ); %prevent aliasing
N = min(H+2,maxN);

theta = 2*pi * phasor(freq, samples);

epsilon = 1e-6;
a = min(a, 1-epsilon); %prevent divide by zero

num = sin(beta) - a*sin(beta-theta) - a^N*sin(beta + N*theta) + a^(N+1)*sin(beta+(N-1)*theta);
den = (1 + a * ( a - 2*cos(theta) ));

output = 2*(num ./ den - 1) * freq; %subtract by one to remove DC, scale by freq to normalize
output = output * maxN/N;           %OPTIONAL: rescale to give louder output as rolloff increases

function out = phasor(normfreq, samples);
out = mod( (0:samples-1)*normfreq , 1);
out = out * 2 - 1;                  %make bipolar


1 comment(s) | add a comment | nofrills version


dynamic convolution

Type : a naive implementation in C++
References : Posted by Risto Holopainen

Notes :
This class illustrates the use of dynamic convolution with a set of IR:s consisting of exponentially damped sinusoids with glissando. There's lots of things to improve for efficiency.

Code :
#include <cmath>

class dynaconv
{
  public:
// sr=sample rate, cf=resonance frequency,
// dp=frq sweep or nonlinearity amount
        dynaconv(const int sr, float cf, float dp);
        double operator()(double);
    
    private:    
// steps: number of amplitude regions, L: length of impulse response
        enum {steps=258, dv=steps-2, L=200};
        double x[L];
        double h[steps][L];
        int S[L];
        double conv(double *x, int d);
};



dynaconv::dynaconv(const int sr, float cfr, float dp)
{
    for(int i=0; i<L; i++)
        x[i] = S[i] = 0;
    
    double sc = 6.0/L;
    double frq = twopi*cfr/sr;
    
// IR's initialised here.
// h[0] holds the IR for samples with lowest amplitude.
    for(int k=0; k<steps; k++)
        {
        double sum = 0;
        double theta=0;
        double w;
        for(int i=0; i<L; i++)
            {
    // IR of exp. decaying sinusoid with glissando
            h[k][i] = sin(theta)*exp(-sc*i);
            w = (double)i/L;
            theta += frq*(1 + dp*w*(k - 0.4*steps)/steps);
            sum += fabs(h[k][i]);
            }
            
        double norm = 1.0/sum;
        for(int i=0; i<L; i++)
            h[k][i] *= norm;
        }
}

double dynaconv::operator()(double in)
{
    double A = fabs(in);
    double a, b, w, y;
    int sel = int(dv*A);
    
    for(int j=L-1; j>0; j--)
        {
        x[j] = x[j-1];
        S[j] = S[j-1];
        }
    x[0] = in;
    S[0] = sel;        
    
    if(sel == 0)
        y = conv(x, 0);

    else if(sel > 0)
        {
        a = conv(x, 0);
        b = conv(x, 1);
        w = dv*A - sel;
        y = w*a + (1-w)*b;
        }

    return y;
}

double dynaconv::conv(double *x, int d)
{
    double y=0;
    for(int i=0; i<L; i++)
        y += x[i] * h[ S[i]+d ][i];

    return y;    
}



2 comment(s) | add a comment | nofrills version


Early echo's with image-mirror technique

References : Donald Schulz
Linked file : early_echo.c
Linked file : early_echo_eng.c

Notes :
(see linked files)
Donald Schulz's code for computing early echoes using the image-mirror method. There's an english and a german version.




no comments on this item | add a comment | nofrills version


Easy noise generation

Type : White Noise
References : Posted by mail[AT]ihsan-dsp[DOT]com

Notes :
Easy noise generation,
in .hpp,
b_noise = 19.1919191919191919191919191919191919191919;

alternatively, the number 19 below can be replaced with a number of your choice, to get that particular flavour of noise.

Regards,
Ove Karlsen.


Code :
    b_noise = b_noise * b_noise;
    int i_noise = b_noise;
    b_noise = b_noise - i_noise;

    double b_noiseout;
    b_noiseout = b_noise - 0.5;

    b_noise = b_noise + 19;


7 comment(s) | add a comment | nofrills version


ECE320 project: Reverberation w/ parameter control from PC

References : Posted by Brahim Hamadicharef (project by Hua Zheng and Shobhit Jain)
Linked file : rev.txt

Notes :
rev.asm
ECE320 project: Reverberation w/ parameter control from PC
Hua Zheng and Shobhit Jain
12/02/98 ~ 12/11/98
(se linked file)




no comments on this item | add a comment | nofrills version


Envelope detector

References : Posted by Bram

Notes :
Basicaly a one-pole LP filter with different coefficients for attack and release fed by the abs() of the signal. If you don't need different attack and decay settings, just use in->abs()->LP

Code :
//attack and release in milliseconds
float ga = (float) exp(-1/(SampleRate*attack));
float gr = (float) exp(-1/(SampleRate*release));

float envelope=0;

for(...)
{
  //get your data into 'input'
  EnvIn = abs(input);

  if(envelope < EnvIn)
  {
     envelope *= ga;
     envelope += (1-ga)*EnvIn;
  }
  else
  {
     envelope *= gr;
     envelope += (1-gr)*EnvIn;
  }
  //envelope now contains.........the envelope ;)
}


5 comment(s) | add a comment | nofrills version


Envelope Detector class (C++)

Type : envelope detector
References : Posted by Citizen Chunk
Linked file : http://www.chunkware.com/opensource/EnvelopeDetector.zip

Notes :
This is a C++ implementation of a simple envelope detector. The time constant (ms) represents the time it takes for the envelope to charge/discharge 63% (RC time constant).

(see linked files)




6 comment(s) | add a comment | nofrills version


Envelope Follower

References : Posted by ers
Code :
#define V_ENVELOPE_FOLLOWER_NUM_POINTS    2000
class vEnvelopeFollower :
{
      public:
        vEnvelopeFollower();
        virtual ~vEnvelopeFollower();
        inline void Calculate(float *b)
        {
            envelopeVal -= *buff;
            if (*b < 0)
                envelopeVal += *buff = -*b;
            else
                envelopeVal += *buff = *b;
            if (buff++ == bufferEnd)
                buff = buffer;
        }
        void SetBufferSize(float value);
        void GetControlValue(){return envelopeVal / (float)bufferSize;}

    private:
        float    buffer[V_ENVELOPE_FOLLOWER_NUM_POINTS];
        float    *bufferEnd, *buff, envelopeVal;
        int    bufferSize;
      float    val;
};

vEnvelopeFollower::vEnvelopeFollower()
{
    bufferEnd = buffer + V_ENVELOPE_FOLLOWER_NUM_POINTS-1;
    buff = buffer;
    val = 0;
    float *b = buffer;
    do
    {
        *b++ = 0;
    }while (b <= bufferEnd);
    bufferSize = V_ENVELOPE_FOLLOWER_NUM_POINTS;
    envelopeVal= 0;
}

vEnvelopeFollower::~vEnvelopeFollower()
{

}

void vEnvelopeFollower::SetBufferSize(float value)
{

    bufferEnd = buffer + (bufferSize = 100 + (int)(value * ((float)V_ENVELOPE_FOLLOWER_NUM_POINTS-102)));
    buff = buffer;
    float val =  envelopeVal / bufferSize;
    do
    {
        *buff++ = val;
    }while (buff <= bufferEnd);
    buff = buffer;
}


1 comment(s) | add a comment | nofrills version


Envelope follower with different attack and release

References : Posted by Bram

Notes :
xxxx_in_ms is xxxx in milliseconds ;-)

Code :
init::

attack_coef = exp(log(0.01)/( attack_in_ms * samplerate * 0.001));
release_coef = exp(log(0.01)/( release_in_ms * samplerate * 0.001));
envelope = 0.0;

loop::

tmp = fabs(in);
if(tmp > envelope)
    envelope = attack_coef * (envelope - tmp) + tmp;
else
    envelope = release_coef * (envelope - tmp) + tmp;


5 comment(s) | add a comment | nofrills version


Exponential curve for

Type : Exponential curve
References : Posted by neolit123 [at] gmail [dot] com

Notes :
When you design a frequency control for your filters you may need an exponential curve to give the lower frequencies more resolution.

F=20-20000hz
x=0-100%

Case (control middle point):
x=50%
F=1135hz

Ploted diagram with 5 points:
http://img151.imageshack.us/img151/9938/expplotur3.jpg


Code :
//tweak - 14.15005 to change middle point and F(max)
F = 19+floor(pow(4,x/14.15005))+x*20;



3 comment(s) | add a comment | nofrills version


Exponential parameter mapping

References : Posted by Russell Borogove

Notes :
Use this if you want to do an exponential map of a parameter (mParam) to a range (mMin - mMax).
Output is in mData...


Code :
float logmax = log10f( mMax );
float logmin = log10f( mMin );
float logdata = (mParam * (logmax-logmin)) + logmin;

mData = powf( 10.0f, logdata );
if (mData < mMin)
{
  mData = mMin;
}
if (mData > mMax)
{
  mData = mMax;
}


4 comment(s) | add a comment | nofrills version


Fast & small sine generation tutorial

Type : original document link: www.active-web.cc/html/research/sine/sin-cos.txt
References : Posted by office[AT]develotec[DOT]com
Linked file : http://www.active-web.cc/html/research/sine/sin-cos.txt

Notes :
original document link: www.active-web.cc/html/research/sine/sin-cos.txt



3 comment(s) | add a comment | nofrills version


fast abs/neg/sign for 32bit floats

Type : floating point functions
References : Posted by tobybear[AT]web[DOT]de

Notes :
Haven't seen this elsewhere, probably because it is too obvious? Anyway, these functions are intended for 32-bit floating point numbers only and should work a bit faster than the regular ones.

fastabs() gives you the absolute value of a float
fastneg() gives you the negative number (faster than multiplying with -1)
fastsgn() gives back +1 for 0 or positive numbers, -1 for negative numbers

Comments are welcome (tobybear[AT]web[DOT]de)

Cheers

Toby (www.tobybear.de)


Code :
// C/C++ code:
float fastabs(float f)
{int i=((*(int*)&f)&0x7fffffff);return (*(float*)&i);}

float fastneg(float f)
{int i=((*(int*)&f)^0x80000000);return (*(float*)&i);}

int fastsgn(float f)
{return 1+(((*(int*)&f)>>31)<<1);}

//Delphi/Pascal code:
function fastabs(f:single):single;
begin i:=longint((@f)^) and $7FFFFFFF;result:=single((@i)^) end;

function fastneg(f:single):single;
begin i:=longint((@f)^) xor $80000000;result:=single((@i)^) end;

function fastsgn(f:single):longint;
begin result:=1+((longint((@f)^) shr 31)shl 1) end;



6 comment(s) | add a comment | nofrills version


Fast binary log approximations

Type : C code
References : Posted by musicdsp.org[AT]mindcontrol.org

Notes :
This code uses IEEE 32-bit floating point representation knowledge to quickly compute approximations to the log2 of a value. Both functions return under-estimates of the actual value, although the second flavour is less of an under-estimate than the first (and might be sufficient for using in, say, a dBV/FS level meter).

Running the test program, here's the output:

0.1: -4 -3.400000
1: 0 0.000000
2: 1 1.000000
5: 2 2.250000
100: 6 6.562500


Code :
// Fast logarithm (2-based) approximation
// by Jon Watte

#include <assert.h>

int floorOfLn2( float f ) {
  assert( f > 0. );
  assert( sizeof(f) == sizeof(int) );
  assert( sizeof(f) == 4 );
  return (((*(int *)&f)&0x7f800000)>>23)-0x7f;
}

float approxLn2( float f ) {
  assert( f > 0. );
  assert( sizeof(f) == sizeof(int) );
  assert( sizeof(f) == 4 );
  int i = (*(int *)&f);
  return (((i&0x7f800000)>>23)-0x7f)+(i&0x007fffff)/(float)0x800000;
}

// Here's a test program:

#include <stdio.h>

// insert code from above here

int
main()
{
  printf( "0.1: %d  %f\n", floorOfLn2( 0.1 ), approxLn2( 0.1 ) );
  printf( "1:   %d  %f\n", floorOfLn2( 1. ), approxLn2( 1. ) );
  printf( "2:   %d  %f\n", floorOfLn2( 2. ), approxLn2( 2. ) );
  printf( "5:   %d  %f\n", floorOfLn2( 5. ), approxLn2( 5. ) );
  printf( "100: %d  %f\n", floorOfLn2( 100. ), approxLn2( 100. ) );
  return 0;
}


1 comment(s) | add a comment | nofrills version


Fast cube root, square root, and reciprocal for x86/SSE CPUs.

References : Posted by kaleja[AT]estarcion[DOT]com

Notes :
All of these methods use SSE instructions or bit twiddling tricks to get a rough approximation to cube root, square root, or reciprocal, which is then refined with one or more Newton-Raphson approximation steps.

Each is named to indicate its approximate level of accuracy and a comment describes its performance relative to the conventional versions.



Code :
// These functions approximate reciprocal, square root, and
// cube root to varying degrees of precision substantially
// faster than the straightforward methods 1/x, sqrtf(x),
// and powf( x, 1.0f/3.0f ). All require SSE-enabled CPU & OS.
// Unlike the powf() solution, the cube roots also correctly
// handle negative inputs.  

#define REALLY_INLINE __forceinline

// ~34 clocks on Pentium M vs. ~360 for powf
REALLY_INLINE float cuberoot_sse_8bits( float x )
{
    float z;
    static const float three = 3.0f;
    _asm
    {
        mov        eax, x                // x as bits
        movss    xmm2, x                // x2: x
        movss    xmm1, three            // x1: 3
        // Magic floating point cube root done with integer math.
        // The exponent is divided by three in such a way that
        // remainder bits get shoved into the top of the normalized
        // mantissa.
        mov        ecx, eax            // copy of x
        and        eax, 0x7FFFFFFF        // exponent & mantissa of x in biased-127
        sub     eax, 0x3F800000        // exponent & mantissa of x in 2's comp
        sar     eax, 10                //
        imul    eax, 341            // 341/1024 ~= .333
        add        eax, 0x3F800000        // back to biased-127
        and     eax, 0x7FFFFFFF        // remask
        and        ecx, 0x80000000        // original sign and mantissa
        or      eax, ecx            // masked new exponent, old sign and mantissa
        mov        z, eax                //
        
        // use SSE to refine the first approximation
        movss    xmm0, z                ;// x0: z
        movss    xmm3, xmm0            ;// x3: z
        mulss    xmm3, xmm0            ;// x3: z*z
        movss    xmm4, xmm3            ;// x4: z*z
        mulss   xmm3, xmm1            ;// x3: 3*z*z
        rcpss    xmm3, xmm3            ;// x3: ~ 1/(3*z*z)
        mulss   xmm4, xmm0            ;// x4: z*z*z
        subss    xmm4, xmm2            ;// x4: z*z*z-x
        mulss    xmm4, xmm3            ;// x4: (z*z*z-x) / (3*z*z)
        subss    xmm0, xmm4            ;// x0: z' accurate to within about 0.3%
        movss    z, xmm0
    }

    return z;
}

// ~60 clocks on Pentium M vs. ~360 for powf
REALLY_INLINE float cuberoot_sse_16bits( float x )
{
    float z;
    static const float three = 3.0f;
    _asm
    {
        mov        eax, x                // x as bits
        movss    xmm2, x                // x2: x
        movss    xmm1, three            // x1: 3
        // Magic floating point cube root done with integer math.
        // The exponent is divided by three in such a way that
        // remainder bits get shoved into the top of the normalized
        // mantissa.
        mov        ecx, eax            // copy of x
        and        eax, 0x7FFFFFFF        // exponent & mantissa of x in biased-127
        sub     eax, 0x3F800000        // exponent & mantissa of x in 2's comp
        sar     eax, 10                //
        imul    eax, 341            // 341/1024 ~= .333
        add        eax, 0x3F800000        // back to biased-127
        and     eax, 0x7FFFFFFF        // remask
        and        ecx, 0x80000000        // original sign and mantissa
        or      eax, ecx            // masked new exponent, old sign and mantissa
        mov        z, eax                //
        
        // use SSE to refine the first approximation
        movss    xmm0, z                ;// x0: z
        movss    xmm3, xmm0            ;// x3: z
        mulss    xmm3, xmm0            ;// x3: z*z
        movss    xmm4, xmm3            ;// x4: z*z
        mulss   xmm3, xmm1            ;// x3: 3*z*z
        rcpss    xmm3, xmm3            ;// x3: ~ 1/(3*z*z)
        mulss   xmm4, xmm0            ;// x4: z*z*z
        subss    xmm4, xmm2            ;// x4: z*z*z-x
        mulss    xmm4, xmm3            ;// x4: (z*z*z-x) / (3*z*z)
        subss    xmm0, xmm4            ;// x0: z' accurate to within about 0.3%

        movss    xmm3, xmm0            ;// x3: z
        mulss    xmm3, xmm0            ;// x3: z*z
        movss    xmm4, xmm3            ;// x4: z*z
        mulss   xmm3, xmm1            ;// x3: 3*z*z
        rcpss    xmm3, xmm3            ;// x3: ~ 1/(3*z*z)
        mulss   xmm4, xmm0            ;// x4: z*z*z
        subss    xmm4, xmm2            ;// x4: z*z*z-x
        mulss    xmm4, xmm3            ;// x4: (z*z*z-x) / (3*z*z)
        subss    xmm0, xmm4            ;// x0: z'' accurate to within about 0.001%

        movss    z, xmm0
    }
    
    return z;
}

// ~77 clocks on Pentium M vs. ~360 for powf
REALLY_INLINE float cuberoot_sse_22bits( float x )
{
    float z;
    static const float three = 3.0f;
    _asm
    {
        mov        eax, x                // x as bits
        movss    xmm2, x                // x2: x
        movss    xmm1, three            // x1: 3
        // Magic floating point cube root done with integer math.
        // The exponent is divided by three in such a way that
        // remainder bits get shoved into the top of the normalized
        // mantissa.
        mov        ecx, eax            // copy of x
        and        eax, 0x7FFFFFFF        // exponent & mantissa of x in biased-127
        sub     eax, 0x3F800000        // exponent & mantissa of x in 2's comp
        sar     eax, 10                //
        imul    eax, 341            // 341/1024 ~= .333
        add        eax, 0x3F800000        // back to biased-127
        and     eax, 0x7FFFFFFF        // remask
        and        ecx, 0x80000000        // original sign and mantissa
        or      eax, ecx            // masked new exponent, old sign and mantissa
        mov        z, eax                //
        
        // use SSE to refine the first approximation
        movss    xmm0, z                // x0: z
        movss    xmm3, xmm0            // x3: z
        mulss    xmm3, xmm0            // x3: z*z
        movss    xmm4, xmm3            // x4: z*z
        mulss   xmm3, xmm1            // x3: 3*z*z
        rcpss    xmm3, xmm3            // x3: ~ 1/(3*z*z)
        mulss   xmm4, xmm0            // x4: z*z*z
        subss    xmm4, xmm2            // x4: z*z*z-x
        mulss    xmm4, xmm3            // x4: (z*z*z-x) / (3*z*z)
        subss    xmm0, xmm4            // x0: z' accurate to within about 0.3%
        
        movss    xmm3, xmm0            // x3: z
        mulss    xmm3, xmm0            // x3: z*z
        movss    xmm4, xmm3            // x4: z*z
        mulss   xmm3, xmm1            // x3: 3*z*z
        rcpss    xmm3, xmm3            // x3: ~ 1/(3*z*z)
        mulss   xmm4, xmm0            // x4: z*z*z
        subss    xmm4, xmm2            // x4: z*z*z-x
        mulss    xmm4, xmm3            // x4: (z*z*z-x) / (3*z*z)
        subss    xmm0, xmm4            // x0: z'' accurate to within about 0.001%

        movss    xmm3, xmm0            // x3: z
        mulss    xmm3, xmm0            // x3: z*z
        movss    xmm4, xmm3            // x4: z*z
        mulss   xmm3, xmm1            // x3: 3*z*z
        rcpss    xmm3, xmm3            // x3: ~ 1/(3*z*z)
        mulss   xmm4, xmm0            // x4: z*z*z
        subss    xmm4, xmm2            // x4: z*z*z-x
        mulss    xmm4, xmm3            // x4: (z*z*z-x) / (3*z*z)
        subss    xmm0, xmm4            // x0: z''' accurate to within about 0.000012%
        
        movss    z, xmm0
    }
    
    return z;
}

// ~6 clocks on Pentium M vs. ~24 for single precision sqrtf
REALLY_INLINE float squareroot_sse_11bits( float x )
{
    float z;
    _asm
    {
        rsqrtss xmm0, x
        rcpss    xmm0, xmm0
        movss    z, xmm0            // z ~= sqrt(x) to 0.038%
    }
    return z;
}

// ~19 clocks on Pentium M vs. ~24 for single precision sqrtf
REALLY_INLINE float squareroot_sse_22bits( float x )
{
    static float half = 0.5f;
    float z;
    _asm
    {
        movss    xmm1, x            // x1: x
        rsqrtss xmm2, xmm1        // x2: ~1/sqrt(x) = 1/z
        rcpss    xmm0, xmm2        // x0: z == ~sqrt(x) to 0.05%
        
        movss    xmm4, xmm0        // x4: z
        movss    xmm3, half
        mulss    xmm4, xmm4        // x4: z*z
        mulss    xmm2, xmm3        // x2: 1 / 2z
        subss    xmm4, xmm1        // x4: z*z-x
        mulss    xmm4, xmm2        // x4: (z*z-x)/2z
        subss    xmm0, xmm4        // x0: z' to 0.000015%
        
        movss    z, xmm0          
    }
    return z;
}

// ~12 clocks on Pentium M vs. ~16 for single precision divide
REALLY_INLINE float reciprocal_sse_22bits( float x )
{
    float z;
    _asm
    {
        rcpss    xmm0, x            // x0: z ~= 1/x
        movss    xmm2, x            // x2: x
        movss    xmm1, xmm0        // x1: z ~= 1/x
        addss    xmm0, xmm0        // x0: 2z
        mulss    xmm1, xmm1        // x1: z^2
        mulss    xmm1, xmm2        // x1: xz^2
        subss    xmm0, xmm1        // x0: z' ~= 1/x to 0.000012%

        movss    z, xmm0          
    }
    return z;
}


5 comment(s) | add a comment | nofrills version


Fast Downsampling With Antialiasing

References : Posted by mumart[AT]gmail[DOT]com

Notes :
A quick and simple method of downsampling a signal by a factor of two with a useful amount of antialiasing. Each source sample is convolved with { 0.25, 0.5, 0.25 } before downsampling.

Code :
int filter_state;

/* input_buf can be equal to output_buf */
void downsample( int *input_buf, int *output_buf, int output_count ) {
    int input_idx, input_end, output_idx, output_sam;
    input_idx = output_idx = 0;
    input_end = output_count * 2;
    while( input_idx < input_end ) {
        output_sam = filter_state + ( input_buf[ input_idx++ ] >> 1 );
        filter_state = input_buf[ input_idx++ ] >> 2;
        output_buf[ output_idx++ ] = output_sam + filter_state;
    }
}


7 comment(s) | add a comment | nofrills version


fast exp() approximations

Type : Taylor series approximation
References : Posted by scoofy[AT]inf[DOT]elte[DOT]hu

Notes :
I needed a fast exp() approximation in the -3.14..3.14 range, so I made some approximations based on the tanh() code posted in the archive by Fuzzpilz. Should be pretty straightforward, but someone may find this useful.

The increasing numbers in the name of the function means increasing precision. Maximum error in the -1..1 range:
fastexp3: 0.05 (1.8%)
fastexp4: 0.01 (0.36%)
fastexp5: 0.0016152 (0.59%)
fastexp6: 0.0002263 (0.0083%)
fastexp7: 0.0000279 (0.001%)
fastexp8: 0.0000031 (0.00011%)
fastexp9: 0.0000003 (0.000011%)

Maximum error in the -3.14..3.14 range:
fastexp3: 8.8742 (38.4%)
fastexp4: 4.8237 (20.8%)
fastexp5: 2.28 (9.8%)
fastexp6: 0.9488 (4.1%)
fastexp7: 0.3516 (1.5%)
fastexp8: 0.1172 (0.5%)
fastexp9: 0.0355 (0.15%)

These were done using the Taylor series, for example I got fastexp4 by using:
exp(x) = 1 + x + x^2/2 + x^3/6 + x^4/24 + ...
= (24 + 24x + x^2*12 + x^3*4 + x^4) / 24
(using Horner-scheme:)
= (24 + x * (24 + x * (12 + x * (4 + x)))) * 0.041666666f



Code :
inline float fastexp3(float x) {
    return (6+x*(6+x*(3+x)))*0.16666666f;
}

inline float fastexp4(float x) {
    return (24+x*(24+x*(12+x*(4+x))))*0.041666666f;
}

inline float fastexp5(float x) {
    return (120+x*(120+x*(60+x*(20+x*(5+x)))))*0.0083333333f;
}

inline float fastexp6(float x) {
    return 720+x*(720+x*(360+x*(120+x*(30+x*(6+x))))))*0.0013888888f;
}

inline float fastexp7(float x) {
    return (5040+x*(5040+x*(2520+x*(840+x*(210+x*(42+x*(7+x)))))))*0.00019841269f;
}

inline float fastexp8(float x) {
    return (40320+x*(40320+x*(20160+x*(6720+x*(1680+x*(336+x*(56+x*(8+x))))))))*2.4801587301e-5;
}

inline float fastexp9(float x) {
  return (362880+x*(362880+x*(181440+x*(60480+x*(15120+x*(3024+x*(504+x*(72+x*(9+x)))))))))*2.75573192e-6;
}


17 comment(s) | add a comment | nofrills version


Fast exp2 approximation

References : Posted by Laurent de Soras

Notes :
Partial approximation of exp2 in fixed-point arithmetic. It is exactly :
[0 ; 1[ -> [0.5 ; 1[
f : x |-> 2^(x-1)
To get the full exp2 function, you have to separate the integer and fractionnal part of the number. The integer part may be processed by bitshifting. Process the fractionnal part with the function, and multiply the two results.
Maximum error is only 0.3 % which is pretty good for two mul ! You get also the continuity of the first derivate.

-- Laurent


Code :
// val is a 16-bit fixed-point value in 0x0 - 0xFFFF ([0 ; 1[)
// Returns a 32-bit fixed-point value in 0x80000000 - 0xFFFFFFFF ([0.5 ; 1[)
unsigned int fast_partial_exp2 (int val)
{
   unsigned int   result;

   __asm
   {
      mov eax, val
      shl eax, 15           ; eax = input [31/31 bits]
      or  eax, 080000000h   ; eax = input + 1  [32/31 bits]
      mul eax
      mov eax, edx          ; eax = (input + 1) ^ 2 [32/30 bits]
      mov edx, 2863311531   ; 2/3 [32/32 bits], rounded to +oo
      mul edx               ; eax = 2/3 (input + 1) ^ 2 [32/30 bits]
      add edx, 1431655766   ; + 4/3 [32/30 bits] + 1
      mov result, edx
   }

   return (result);
}


no comments on this item | add a comment | nofrills version


Fast Exponential Envelope Generator

References : Posted by Christian Schoenebeck

Notes :
The naive way to implement this would be to use a exp() call for each point
of the envelope. Unfortunately exp() is quite a heavy function for most
CPUs, so here is a numerical, much faster way to compute an exponential
envelope (performance gain measured in benchmark: about factor 100 with a
Intel P4, gcc -O3 --fast-math -march=i686 -mcpu=i686).

Note: you can't use a value of 0.0 for levelEnd. Instead you have to use an
appropriate, very small value (e.g. 0.001 should be sufficiently small
enough).


Code :
const float sampleRate = 44100;
float coeff;
float currentLevel;

void init(float levelBegin, float levelEnd, float releaseTime) {
    currentLevel = levelBegin;
    coeff = (log(levelEnd) - log(levelBegin)) /
            (releaseTime * sampleRate);
}

inline void calculateEnvelope(int samplePoints) {
    for (int i = 0; i < samplePoints; i++) {
        currentLevel += coeff * currentLevel;
        // do something with 'currentLevel' here
        ...
    }
}


14 comment(s) | add a comment | nofrills version


Fast Float Random Numbers

References : Posted by dominik.ries[AT]gmail[DOT]com

Notes :
a small and fast implementation for random float numbers in the range [-1,1], usable as white noise oscillator.

compared to the naive usage of the rand() function it gives a speedup factor of 9-10.

compared to a direct implementation of the rand() function (visual studio implementation) it still gives a speedup by a factor of 2-3.

apart from beeing faster it also provides more precision for the resulting floats since its base values use full 32bit precision.



Code :
// set the initial seed to whatever you like
static int RandSeed = 1;

// using rand() (16bit precision)
// takes about 110 seconds for 2 billion calls
float RandFloat1()
{  
    return ((float)rand()/RAND_MAX) * 2.0f - 1.0f;  
}

// direct implementation of rand() (16 bit precision)
// takes about 32 seconds for 2 billion calls
float RandFloat2()
{  
    return ((float)(((RandSeed = RandSeed * 214013L + 2531011L) >> 16) & 0x7fff)/RAND_MAX) * 2.0f - 1.0f;  
}  

// fast rand float, using full 32bit precision
// takes about 12 seconds for 2 billion calls
float Fast_RandFloat()
{
    RandSeed *= 16807;
    return (float)RandSeed * 4.6566129e-010f;
}


5 comment(s) | add a comment | nofrills version


Fast in-place Walsh-Hadamard Transform

Type : wavelet transform
References : Posted by Timo H Tossavainen

Notes :
IIRC, They're also called walsh-hadamard transforms.
Basically like Fourier, but the basis functions are squarewaves with different sequencies.
I did this for a transform data compression study a while back.
Here's some code to do a walsh hadamard transform on long ints in-place (you need to divide by n to get transform) the order is bit-reversed at output, IIRC.
The inverse transform is the same as the forward transform (expects bit-reversed input). i.e. x = 1/n * FWHT(FWHT(x)) (x is a vector)


Code :
void inline wht_bfly (long& a, long& b)
{
        long tmp = a;
        a += b;
        b = tmp - b;
}

// just a integer log2
int inline l2 (long x)
{
        int l2;
        for (l2 = 0; x > 0; x >>=1)
        {
                ++ l2;
        }

        return (l2);
}

////////////////////////////////////////////
// Fast in-place Walsh-Hadamard Transform //
////////////////////////////////////////////

void FWHT (std::vector& data)
{
  const int log2 = l2 (data.size()) - 1;
  for (int i = 0; i < log2; ++i)
  {
    for (int j = 0; j < (1 << log2); j += 1 << (i+1))
    {
       for (int k = 0; k < (1<<i); ++k)
       {
           wht_bfly (data [j + k], data [j + k + (1<<i)]);
       }
    }
  }
}


17 comment(s) | add a comment | nofrills version


Fast LFO in Delphi...

References : Posted by Dambrin Didier ( gol [AT] e-officedirect [DOT] com )
Linked file : LFOGenerator.zip

Notes :
[from Didier's mail...]
[see attached zip file too!]

I was working on a flanger, & needed an LFO for it. I first used a Sin(), but it was too slow, then tried a big wavetable, but it wasn't accurate enough.

I then checked the alternate sine generators from your web site, & while they're good, they all can drift, so you're also wasting too much CPU in branching for the drift checks.

So I made a quick & easy linear LFO, then a sine-like version of it. Can be useful for LFO's, not to output as sound.
If has no branching & is rather simple. 2 Abs() but apparently they're fast. In all cases faster than a Sin()


It's in delphi, but if you understand it you can translate it if you want.
It uses a 32bit integer counter that overflows, & a power for the sine output.
If you don't know delphi, $ is for hex (h at the end in c++?), Single is 32bit float, integer is 32bit integer (signed, normally).


Code :
unit Unit1;

interface

uses
  Windows, Messages, SysUtils, Classes, Graphics, Controls, Forms, Dialogs,
  StdCtrls, ExtCtrls, ComCtrls;

type
  TForm1 = class(TForm)
    PaintBox1: TPaintBox;
    Bevel1: TBevel;
    procedure PaintBox1Paint(Sender: TObject);
  private
    { Private declarations }
  public
    { Public declarations }
  end;

var
  Form1: TForm1;

implementation

{$R *.DFM}

procedure TForm1.PaintBox1Paint(Sender: TObject);
var n,Pos,Speed:Integer;
    Output,Scale,HalfScale,PosMul:Single;
    OurSpeed,OurScale:Single;
begin
OurSpeed:=100;  // 100 samples per cycle
OurScale:=100;  // output in -100..100

Pos:=0;  // position in our linear LFO
Speed:=Round($100000000/OurSpeed);


// --- triangle LFO ---
Scale:=OurScale*2;
PosMul:=Scale/$80000000;

// loop
for n:=0 to 299 do
  Begin
  // inc our 32bit integer LFO pos & let it overflow. It will be seen as signed when read by the math unit
  Pos:=Pos+Speed;

  Output:=Abs(Pos*PosMul)-OurScale;

  // visual
  Paintbox1.Canvas.Pixels[n,Round(100+Output)]:=clRed;
  End;


// --- sine-like LFO ---
Scale:=Sqrt(OurScale*4);
PosMul:=Scale/$80000000;
HalfScale:=Scale/2;

// loop
for n:=0 to 299 do
  Begin
  // inc our 32bit integer LFO pos & let it overflow. It will be seen as signed when read by the math unit
  Pos:=Pos+Speed;

  Output:=Abs(Pos*PosMul)-HalfScale;
  Output:=Output*(Scale-Abs(Output));

  // visual
  Paintbox1.Canvas.Pixels[n,Round(100+Output)]:=clBlue;
  End;
end;

end.


2 comment(s) | add a comment | nofrills version


Fast log2

References : Posted by Laurent de Soras
Code :
inline float fast_log2 (float val)
{
   assert (val > 0);

   int * const  exp_ptr = reinterpret_cast <int *> (&val);
   int          x = *exp_ptr;
   const int    log_2 = ((x >> 23) & 255) - 128;
   x &= ~(255 << 23);
   x += 127 << 23;
   *exp_ptr = x;

   return (val + log_2);
}


6 comment(s) | add a comment | nofrills version


fast power and root estimates for 32bit floats

Type : floating point functions
References : Posted by tobybear[AT]web[DOT]de

Notes :
Original code by Stefan Stenzel (also in this archive, see "pow(x,4) approximation") - extended for more flexibility.

fastpow(f,n) gives a rather *rough* estimate of a float number f to the power of an integer number n (y=f^n). It is fast but result can be quite a bit off, since we directly mess with the floating point exponent.-> use it only for getting rough estimates of the values and where precision is not that important.

fastroot(f,n) gives the n-th root of f. Same thing concerning precision applies here.

Cheers

Toby (www.tobybear.de)


Code :
//C/C++ source code:
float fastpower(float f,int n)
{
long *lp,l;
lp=(long*)(&f);
l=*lp;l-=0x3F800000l;l<<=(n-1);l+=0x3F800000l;
*lp=l;
return f;
}

float fastroot(float f,int n)
{
long *lp,l;
lp=(long*)(&f);
l=*lp;l-=0x3F800000l;l>>=(n-1);l+=0x3F800000l;
*lp=l;
return f;
}

//Delphi/Pascal source code:
function fastpower(i:single;n:integer):single;
var l:longint;
begin
l:=longint((@i)^);
l:=l-$3F800000;l:=l shl (n-1);l:=l+$3F800000;
result:=single((@l)^);
end;

function fastroot(i:single;n:integer):single;
var l:longint;
begin
l:=longint((@i)^);
l:=l-$3F800000;l:=l shr (n-1);l:=l+$3F800000;
result:=single((@l)^);
end;


no comments on this item | add a comment | nofrills version


Fast rounding functions in pascal

Type : round/ceil/floor/trunc
References : Posted by bouba@hotmail.com

Notes :
Original documentation with cpp samples:
http://ldesoras.free.fr/prod.html#doc_rounding


Code :
Pascal translation of the functions (accurates ones) :

function ffloor(f:double):integer;
var
  i:integer;
  t : double;
begin
t := -0.5 ;
  asm
    fld   f
    fadd  st,st(0)
    fadd  t
    fistp i
    sar   i, 1
end;
result:= i
end;

function fceil(f:double):integer;
var
  i:integer;
  t : double;
begin
t := -0.5 ;
  asm
    fld   f
    fadd  st,st(0)
    fsubr t
    fistp i
    sar   i, 1
end;
result:= -i
end;

function ftrunc(f:double):integer;
var
  i:integer;
  t : double;
begin
t := -0.5 ;
  asm
    fld   f
    fadd  st,st(0)
    fabs
    fadd t
    fistp i
    sar   i, 1
end;
if f<0 then i := -i;
result:= i
end;

function fround(f:double):integer;
var
  i:integer;
  t : double;
begin
t := 0.5 ;
  asm
    fld   f
    fadd  st,st(0)
    fadd t
    fistp i
    sar   i, 1
end;
result:= i
end;


1 comment(s) | add a comment | nofrills version


Fast sign for 32 bit floats

References : Posted by Peter Schoffhauzer

Notes :
Fast functions which give the sign of a 32 bit floating point number by checking the sign bit. There are two versions, one which gives the value as a float, and the other gives an int.

The _nozero versions are faster, but they give incorrect 1 or -1 for zero (depending on the sign bit set in the number). The int version should be faster than the Tobybear one in the archive, since this one doesn't have an addition, just bit operations.


Code :
/*------------------------------------------------------------------------------------
   fast sign, returns float
------------------------------------------------------------------------------------*/

// returns 1.0f for positive floats, -1.0f for negative floats
// for zero it does not work (may gives 1.0f or -1.0f), but it's faster
inline float fast_sign_nozero(float f) {
    float r = 1.0f;
    (int&)r |= ((int&)f & 0x80000000); // mask sign bit in f, set it in r if necessary
    return r;
}

// returns 1.0f for positive floats, -1.0f for negative floats, 0.0f for zero
inline float fast_sign(float f) {
    if (((int&)f & 0x7FFFFFFF)==0) return 0.f; // test exponent & mantissa bits: is input zero?
    else {
        float r = 1.0f;
        (int&)r |= ((int&)f & 0x80000000); // mask sign bit in f, set it in r if necessary
        return r;
    }
}

/*------------------------------------------------------------------------------------
   fast sign, returns int
------------------------------------------------------------------------------------*/

// returns 1 for positive floats, -1 for negative floats
// for 0.0f input it does not work (may give 1 or -1), but it's faster
inline int fast_sign_int_nozero(float f) {
      return (signed((int&)f & 0x80000000) >> 31 ) | 1;
}

// returns 1 for positive floats, -1 for negative floats, 0 for 0.0f
inline int fast_sign_int(float f) {
    if (((int&)f & 0x7FFFFFFF)==0) return 0; // test exponent & mantissa bits: is input zero?
    return (signed((int&)f & 0x80000000) >> 31 ) | 1;
}


2 comment(s) | add a comment | nofrills version


Fast SIN approximation for usage in e.g. additive synthesizers

References : Posted by neotec

Notes :
This code presents 2 'fastsin' functions. fastsin2 is less accurate than fastsin. In fact it's a simple taylor series, but optimized for integer phase.

phase is in 0 -> (2^32)-1 range and maps to 0 -> ~2PI

I get about 55000000 fastsin's per second on my P4,3.2GHz which would give a nice Kawai K5 emulation using 64 harmonics and 8->16 voices.


Code :
float fastsin(UINT32 phase)
{
    const float frf3 = -1.0f / 6.0f;
    const float frf5 = 1.0f / 120.0f;
    const float frf7 = -1.0f / 5040.0f;
    const float frf9 = 1.0f / 362880.0f;
    const float f0pi5 = 1.570796327f;
    float x, x2, asin;
    UINT32 tmp = 0x3f800000 | (phase >> 7);
    if (phase & 0x40000000)
        tmp ^= 0x007fffff;
    x = (*((float*)&tmp) - 1.0f) * f0pi5;
    x2 = x * x;
    asin = ((((frf9 * x2 + frf7) * x2 + frf5) * x2 + frf3) * x2 + 1.0f) * x;
    return (phase & 0x80000000) ? -asin : asin;
}

float fastsin2(UINT32 phase)
{
    const float frf3 = -1.0f / 6.0f;
    const float frf5 = 1.0f / 120.0f;
    const float frf7 = -1.0f / 5040.0f;
    const float f0pi5 = 1.570796327f;
    float x, x2, asin;
    UINT32 tmp = 0x3f800000 | (phase >> 7);
    if (phase & 0x40000000)
        tmp ^= 0x007fffff;
    x = (*((float*)&tmp) - 1.0f) * f0pi5;
    x2 = x * x;
    asin = (((frf7 * x2 + frf5) * x2 + frf3) * x2 + 1.0f) * x;
    return (phase & 0x80000000) ? -asin : asin;
}


2 comment(s) | add a comment | nofrills version


Fast sine and cosine calculation

Type : waveform generation
References : Lot's or references... Check Julius O. SMith mainly
Code :
init:
float a = 2.f*(float)sin(Pi*frequency/samplerate);

float s[2];

s[0] = 0.5f;
s[1] = 0.f;

loop:
s[0] = s[0] - a*s[1];
s[1] = s[1] + a*s[0];
output_sine = s[0];
output_cosine = s[1]


9 comment(s) | add a comment | nofrills version


Fast sine wave calculation

Type : waveform generation
References : James McCartney in Computer Music Journal, also the Julius O. Smith paper

Notes :
(posted by Niels Gorisse)
If you change the frequency, the amplitude rises (pitch lower) or lowers (pitch rise) a LOT I fixed the first problem by thinking about what actually goes wrong. The answer was to recalculate the phase for that frequency and the last value, and then continue normally.


Code :
Variables:
ip = phase of the first output sample in radians
w = freq*pi / samplerate
b1 = 2.0 * cos(w)

Init:
y1=sin(ip-w)
y2=sin(ip-2*w)

Loop:
y0 = b1*y1 - y2
y2 = y1
y1 = y0

output is in y0 (y0 = sin(ip + n*freq*pi / samplerate), n= 0, 1, 2, ... I *think*)

Later note by James McCartney:
if you unroll such a loop by 3 you can even eliminate the assigns!!

y0 = b1*y1 - y2
y2 = b1*y0 - y1
y1 = b1*y2 - y0


93 comment(s) | add a comment | nofrills version


Fast square wave generator

Type : NON-bandlimited osc...
References : Posted by Wolfgang (wschneider[AT]nexoft.de)

Notes :
Produces a square wave -1.0f .. +1.0f.
The resulting waveform is NOT band-limited, so it's propably of not much use for syntheis. It's rather useful for LFOs and the like, though.


Code :
Idea: use integer overflow to avoid conditional jumps.

// init:
typedef unsigned long ui32;

float sampleRate = 44100.0f; // whatever
float freq = 440.0f; // 440 Hz
float one = 1.0f;
ui32 intOver = 0L;
ui32 intIncr = (ui32)(4294967296.0 / hostSampleRate / freq));

// loop:
(*((ui32 *)&one)) &= 0x7FFFFFFF; // mask out sign bit
(*((ui32 *)&one)) |= (intOver & 0x80000000);
intOver += intIncr;


2 comment(s) | add a comment | nofrills version


Fast Whitenoise Generator

Type : Whitenoise
References : Posted by gerd[DOT]feldkirch[AT]web[DOT]de

Notes :
This is Whitenoise... :o)

Code :
float g_fScale = 2.0f / 0xffffffff;
int g_x1 = 0x67452301;
int g_x2 = 0xefcdab89;

void whitenoise(
  float* _fpDstBuffer, // Pointer to buffer
  unsigned int _uiBufferSize, // Size of buffer
  float _fLevel ) // Noiselevel (0.0 ... 1.0)
{
  _fLevel *= g_fScale;

  while( _uiBufferSize-- )
  {
    g_x1 ^= g_x2;
    *_fpDstBuffer++ = g_x2 * _fLevel;
    g_x2 += g_x1;
  }
}


9 comment(s) | add a comment | nofrills version


FFT

References : Toth Laszlo
Linked file : rvfft.ps
Linked file : rvfft.cpp

Notes :
A paper (postscript) and some C++ source for 4 different fft algorithms, compiled by Toth Laszlo from the Hungarian Academy of Sciences Research Group on Artificial Intelligence.
Toth says: "I've found that Sorensen's split-radix algorithm was the fastest, so I use this since then (this means that you may as well delete the other routines in my source - if you believe my results)."




1 comment(s) | add a comment | nofrills version


FFT classes in C++ and Object Pascal

Type : Real-to-Complex FFT and Complex-to-Real IFFT
References : Laurent de Soras (Object Pascal translation by Frederic Vanmol)
Linked file : FFTReal.zip

Notes :
(see linkfile)



2 comment(s) | add a comment | nofrills version


Float to int

References : Posted by Ross Bencina

Notes :
intel only

Code :
int truncate(float flt)
{
  int i;
  static const double half = 0.5f;
  _asm
  {
     fld flt
     fsub half
     fistp i
  }
  return i
}


3 comment(s) | add a comment | nofrills version


Float to int (more intel asm)

References : Posted by Laurent de Soras (via flipcode)

Notes :
[Found this on flipcode, seemed worth posting here too, hopefully Laurent will approve :) -- Ross]

Here is the code I often use. It is not a _ftol replacement, but it provides useful functions. The current processor rounding mode should be "to nearest" ; it is the default setting for most of the compilers.

The [fadd st, st (0) / sar i,1] trick fixes the "round to nearest even number" behaviour. Thus, round_int (N+0.5) always returns N+1 and floor_int function is appropriate to convert floating point numbers into fixed point numbers.

---------------------

Laurent de Soras
Audio DSP engineer & software designer
Ohm Force - Digital audio software
http://www.ohmforce.com


Code :
inline int round_int (double x)
{
   int      i;
   static const float round_to_nearest = 0.5f;
   __asm
   {
      fld      x
      fadd     st, st (0)
      fadd     round_to_nearest
      fistp    i
      sar      i, 1
   }

   return (i);
}

inline int floor_int (double x)
{
   int      i;
   static const float round_toward_m_i = -0.5f;
   __asm
    {
      fld      x
      fadd     st, st (0)
      fadd     round_toward_m_i
      fistp    i
      sar      i, 1
   }

   return (i);
}

inline int ceil_int (double x)
{
   int      i;
   static const float round_toward_p_i = -0.5f;
   __asm
   {
      fld      x
      fadd     st, st (0)
      fsubr    round_toward_p_i
      fistp    i
      sar      i, 1
   }

   return (-i);
}


5 comment(s) | add a comment | nofrills version


Float to integer conversion

References : Posted by Peter Schoffhauzer

Notes :
The fld x/fistp i instructions can be used for fast conversion of floating point numbers to integers. By default, the number is rounded to the nearest integer. However, sometimes that's not what we need.

Bits 12 and 11 if the FPU control word determine the way the FPU rounds numbers. The four rounding states are:

00 = round to nearest (this is the default)
01 = round down (towards -infinity)
10 = round up(towards +infinity)
11 = truncate up(towards zero)

The status word is loaded/stored using the fldcw/fstcw instructions. After setting the rounding mode as desired, the float2int() function will use that rounding mode until the control mode is reset. The ceil() and floor() functions set the rounding mode for every instruction, which is very slow. Therefore, it is a lot faster to set the rounding mode (down or up) before processing a block, and use float2int() instead.

References: SIMPLY FPU by Raymond Filiatreault
http://www.website.masmforum.com/tutorials/fptute/index.html

MSVC (and probably other compilers too) has functions defined in <float.h> for changing the FPU control word: _control87/_controlfp. The equivalent instructions are:

_controlfp(_RC_CHOP, _MCW_RC); // set rounding mode to truncate
_controlfp(_RC_UP _MCW_RC); // set rounding mode to up (ceil)
_controlfp(_RC_DOWN, _MCW_RC); // set rounding mode to down (floor)
_controlfp(_RC_NEAR, _MCW_RC); // set rounding mode to near (default)

Note that the FPU rounding mode affects other calculations too, so the same code may give different results.

Alternatively, single precision floating point numbers can be truncated or rounded to integers by using SSE instructions cvtss2si, cvttss2si, cvtps2pi or cvttps2pi, where SSE instructions are available (which means probably on all modern CPUs). These are not discussed here in detail, but I provided the function truncateSSE which always truncates a single precision floating point number to integer, regardless of the current rounding mode.

(Also I think the SSE rounding mode can differ from the rounding mode set in the FPU control word, but I haven't tested it so far.)


Code :
#ifndef __FPU_ROUNDING_H_
#define __FPU_ROUNDING_H_

static short control_word;
static short control_word2;

// round a float to int using the actual rounding mode
inline int float2int(float x) {
    int i;
    __asm {
        fld x
        fistp i
    }
    return i;
}

// set rounding mode to nearest
inline void set_round2near() {
    __asm      {
        fstcw    control_word        // store fpu control word
        mov     dx, word ptr [control_word]
        and        dx,0xf9ff            // round towards nearest (default)
        mov        control_word2, dx
        fldcw    control_word2        // load modfied control word
    }
}

// set rounding mode to round down
inline void set_round_down() {
    __asm  {
        fstcw    control_word        // store fpu control word
        mov     dx, word ptr [control_word]
        or        dx,0x0400            // round towards -inf
        and        dx,0xf7ff            
        mov        control_word2, dx
        fldcw    control_word2        // load modfied control word
    }
}

// set rounding mode to round up
inline void set_round_up() {
    __asm  {
        fstcw    control_word        // store fpu control word
        mov     dx, word ptr [control_word]
        or        dx,0x0800            // round towards +inf
        and        dx,0xfbff            
        mov        control_word2, dx
        fldcw    control_word2        // load modfied control word
    }
}

// set rounding mode to truncate
inline void set_truncate() {
    __asm  {
        fstcw    control_word        // store fpu control word
        mov     dx, word ptr [control_word]
        or        dx,0x0c00            // rounding: truncate
        mov        control_word2, dx
        fldcw    control_word2        // load modfied control word
    }
}

// restore original rounding mode
inline void restore_cw() {
    __asm fldcw    control_word
}

// truncate to integer using SSE
inline long truncateSSE(float x) {
    __asm cvttss2si eax,x
}

#endif


1 comment(s) | add a comment | nofrills version


Float-to-int, coverting an array of floats

References : Posted by Stefan Stenzel

Notes :
intel only

Code :
void f2short(float *fptr,short *iptr,int n)
{
_asm {
    mov     ebx,n
    mov     esi,fptr
    mov     edi,iptr
    lea     ebx,[esi+ebx*4]   ; ptr after last
    mov     edx,0x80008000    ; damn endianess confuses...
    mov     ecx,0x4b004b00    ; damn endianess confuses...
    mov     eax,[ebx]         ; get last value
    push    eax
    mov     eax,0x4b014B01
    mov     [ebx],eax              ; mark end
    mov     ax,[esi+2]
    jmp     startf2slp

;   Pad with nops to make loop start at address divisible
;   by 16 + 2, e.g. 0x01408062, don't ask why, but this
;   gives best performance. Unfortumately "align 16" does
;   not seem to work with my VC.
;   below I noted the measured execution times for different
;   nop-paddings on my Pentium Pro, 100 conversions.
;   saturation:  off pos neg


   nop         ;355 546 563 <- seems to be best
;   nop         ;951 547 544
;   nop         ;444 646 643
;   nop         ;444 646 643
;   nop         ;944 951 950
;   nop         ;358 447 644
;   nop         ;358 447 643
;   nop         ;358 544 643
;   nop         ;543 447 643
;   nop         ;643 447 643
;   nop         ;1047 546 746
;   nop         ;545 954 1253
;   nop         ;545 547 661
;   nop         ;544 547 746
;   nop         ;444 947 1147
;   nop         ;444 548 545
in_range:
    mov     eax,[esi]
    xor     eax,edx
saturate:
    lea     esi,[esi+4]
    mov     [edi],ax
    mov     ax,[esi+2]
    add     edi,2
startf2slp:
    cmp     ax,cx
    je      in_range
    mov     eax,edx
    js      saturate     ; saturate neg -> 0x8000
    dec     eax          ; saturate pos -> 0x7FFF
    cmp     esi,ebx      ; end reached ?
    jb      saturate
    pop     eax
    mov     [ebx],eax    ; restore end flag
    }
}


1 comment(s) | add a comment | nofrills version


fold back distortion

Type : distortion
References : Posted by hellfire[AT]upb[DOT]de

Notes :
a simple fold-back distortion filter.
if the signal exceeds the given threshold-level, it mirrors at the positive/negative threshold-border as long as the singal lies in the legal range (-threshold..+threshold).
there is no range limit, so inputs doesn't need to be in -1..+1 scale.
threshold should be >0
depending on use (low thresholds) it makes sense to rescale the input to full amplitude

performs approximately the following code
(just without the loop)

while (in>threshold || in<-threshold)
{
// mirror at positive threshold
if (in>threshold) in= threshold - (in-threshold);
// mirror at negative threshold
if (in<-threshold) in= -threshold + (-threshold-in);
}


Code :
float foldback(float in, float threshold)
{
  if (in>threshold || in<-threshold)
  {
    in= fabs(fabs(fmod(in - threshold, threshold*4)) - threshold*2) - threshold;
  }
  return in;
}


1 comment(s) | add a comment | nofrills version


Formant filter

References : Posted by Alex
Code :
/*
Public source code by alex@smartelectronix.com
Simple example of implementation of formant filter
Vowelnum can be 0,1,2,3,4 <=> A,E,I,O,U
Good for spectral rich input like saw or square
*/
//-------------------------------------------------------------VOWEL COEFFICIENTS
const double coeff[5][11]= {
{ 8.11044e-06,
8.943665402,    -36.83889529,    92.01697887,    -154.337906,    181.6233289,
-151.8651235,   89.09614114,    -35.10298511,    8.388101016,    -0.923313471  ///A
},
{4.36215e-06,
8.90438318,    -36.55179099,    91.05750846,    -152.422234,    179.1170248,  ///E
-149.6496211,87.78352223,    -34.60687431,    8.282228154,    -0.914150747
},
{ 3.33819e-06,
8.893102966,    -36.49532826,    90.96543286,    -152.4545478,    179.4835618,
-150.315433,    88.43409371,    -34.98612086,    8.407803364,    -0.932568035  ///I
},
{1.13572e-06,
8.994734087,    -37.2084849,    93.22900521,    -156.6929844,    184.596544,   ///O
-154.3755513,    90.49663749,    -35.58964535,    8.478996281,    -0.929252233
},
{4.09431e-07,
8.997322763,    -37.20218544,    93.11385476,    -156.2530937,    183.7080141,  ///U
-153.2631681,    89.59539726,    -35.12454591,    8.338655623,    -0.910251753
}
};
//---------------------------------------------------------------------------------
static double memory[10]={0,0,0,0,0,0,0,0,0,0};
//---------------------------------------------------------------------------------
float formant_filter(float *in, int vowelnum)
{
            res= (float) ( coeff[vowelnum][0]  *in +
                     coeff[vowelnum][1]  *memory[0] +  
                     coeff[vowelnum][2]  *memory[1] +
                     coeff[vowelnum][3]  *memory[2] +
                     coeff[vowelnum][4]  *memory[3] +
                     coeff[vowelnum][5]  *memory[4] +
                     coeff[vowelnum][6]  *memory[5] +
                     coeff[vowelnum][7]  *memory[6] +
                     coeff[vowelnum][8]  *memory[7] +
                     coeff[vowelnum][9]  *memory[8] +
                     coeff[vowelnum][10] *memory[9] );

memory[9]= memory[8];
memory[8]= memory[7];
memory[7]= memory[6];
memory[6]= memory[5];
memory[5]= memory[4];
memory[4]= memory[3];
memory[3]= memory[2];
memory[2]= memory[1];                    
memory[1]= memory[0];
memory[0]=(double) res;
return res;
}


18 comment(s) | add a comment | nofrills version


Frequency response from biquad coefficients

Type : biquad
References : Posted by peter[AT]sonicreef[DOT]com

Notes :
Here is a formula for plotting the frequency response of a biquad filter. Depending on the coefficients that you have, you might have to use negative values for the b- coefficients.



Code :
//w = frequency (0 < w < PI)
//square(x) = x*x

y = 20*log((sqrt(square(a0*square(cos(w))-a0*square(sin(w))+a1*cos(w)+a2)+square(2*a0*cos(w)*sin(w)+a1*(sin(w))))/
                  sqrt(square(square(cos(w))-   square(sin(w))+b1*cos(w)+b2)+square(2*   cos(w)*sin(w)+b1*(sin(w))))));


2 comment(s) | add a comment | nofrills version


frequency warped FIR lattice

Type : FIR using allpass chain
References : Posted by mail[AT]mutagene[DOT]net

Notes :
Not at all optimized and pretty hungry in terms of arrays and overhead (function requires two arrays containing lattice filter's internal state and ouputs to another two arrays with their next states). In this implementation I think you'll have to juggle taps1/newtaps in your processing loop, alternating between one set of arrays and the other for which to send to wfirlattice).

A frequency-warped lattice filter is just a lattice filter where every delay has been replaced with an allpass filter. By adjusting the allpass filters, the frequency response of the filter can be adjusted (e.g., design an FIR that approximates some filter. Play with with warping coefficient to "sweep" the FIR up and down without changing any other coefficients). Much more on warped filters can be found on Aki Harma's website ( http://www.acoustics.hut.fi/~aqi/ )


Code :
float wfirlattice(float input, float *taps1, float *taps2, float *reflcof, float lambda, float *newtaps1, float *newtaps2, int P)
// input is filter input
// taps1,taps2 are previous filter states (init to 0)
// reflcof are reflection coefficients.  abs(reflcof) < 1 for stable filter
// lamba is warping (0 = no warping, 0.75 is close to bark scale at 44.1 kHz)
// newtaps1, newtaps2 are new filter states
// P is the order of the filter
{
    float forward;
    float topline;

    forward = input;
    topline = forward;

    for (int i=0;i<P;i++)
    {
        newtaps2[i] = topline;
        newtaps1[i] = float(lambda)*(-topline + taps1[i]) + taps2[i];
        topline = newtaps1[i]+forward*(reflcof[i]);
        forward += newtaps1[i]*(reflcof[i]);
        taps1[i]=newtaps1[i];
        taps2[i]=newtaps2[i];
    }
    return forward;
}


3 comment(s) | add a comment | nofrills version


Gaussian dithering

Type : Dithering
References : Posted by Aleksey Vaneev (picoder[AT]mail[DOT]ru)

Notes :
It is a more sophisticated dithering than simple RND. It gives the most low noise floor for the whole spectrum even without noise-shaping. You can use as big N as you can afford (it will not hurt), but 4 or 5 is generally enough.

Code :
Basically, next value is calculated this way (for RND going from -0.5 to 0.5):

dither = (RND+RND+...+RND) / N.
           \           /
            \_________/
              N times

If your RND goes from 0 to 1, then this code is applicable:

dither = (RND+RND+...+RND - 0.5*N) / N.


no comments on this item | add a comment | nofrills version


Gaussian random numbers

Type : random number generation
References : Posted by tobybear[AT]web[DOT]de

Notes :
// Gaussian random numbers
// This algorithm (adapted from "Natur als fraktale Grafik" by
// Reinhard Scholl) implements a generation method for gaussian
// distributed random numbers with mean=0 and variance=1
// (standard gaussian distribution) mapped to the range of
// -1 to +1 with the maximum at 0.
// For only positive results you might abs() the return value.
// The q variable defines the precision, with q=15 the smallest
// distance between two numbers will be 1/(2^q div 3)=1/10922
// which usually gives good results.

// Note: the random() function used is the standard random
// function from Delphi/Pascal that produces *linear*
// distributed numbers from 0 to parameter-1, the equivalent
// C function is probably rand().


Code :
const q=15;
      c1=(1 shl q)-1;
      c2=(c1 div 3)+1;
      c3=1/c1;

function GRandom:single;
begin
result:=(2*(random(c2)+random(c2)+random(c2))-3*(c2-1))*c3;
end;


no comments on this item | add a comment | nofrills version


Gaussian White noise

References : Posted by Alexey Menshikov

Notes :
Code I use sometimes, but don't remember where I ripped it from.

- Alexey Menshikov


Code :
#define ranf() ((float) rand() / (float) RAND_MAX)

float ranfGauss (int m, float s)
{
   static int pass = 0;
   static float y2;
   float x1, x2, w, y1;

   if (pass)
   {
      y1 = y2;
   } else  {
      do {
         x1 = 2.0f * ranf () - 1.0f;
         x2 = 2.0f * ranf () - 1.0f;
         w = x1 * x1 + x2 * x2;
      } while (w >= 1.0f);

      w = (float)sqrt (-2.0 * log (w) / w);
      y1 = x1 * w;
      y2 = x2 * w;
   }
   pass = !pass;

   return ( (y1 * s + (float) m));
}


6 comment(s) | add a comment | nofrills version


Gaussian White Noise

References : Posted by remage[AT]netposta.hu

Notes :
SOURCE:

Steven W. Smith:
The Scientist and Engineer's Guide to Digital Signal Processing
http://www.dspguide.com


Code :
#define PI 3.1415926536f

float R1 = (float) rand() / (float) RAND_MAX;
float R2 = (float) rand() / (float) RAND_MAX;

float X = (float) sqrt( -2.0f * log( R1 )) * cos( 2.0f * PI * R2 );


1 comment(s) | add a comment | nofrills version


Generator

Type : antialiased sawtooth
References : Posted by Paul Sernine

Notes :
This code generates a swept antialiasing sawtooth in a raw 16bit pcm file.
It is based on the quad differentiation of a 5th order polynomial. The polynomial harmonics (and aliased harmonics) decay at 5*6 dB per oct. The differenciators correct the spectrum and waveform, while aliased harmonics are still attenuated.


Code :
/* clair.c         Examen Partiel 2b
   T.Rochebois
   02/03/98
*/
#include <stdio.h>
#include <math.h>
main()
{
  double phase=0,dphase,freq,compensation;
  double aw0=0,aw1=0,ax0=0,ax1=0,ay0=0,ay1=0,az0=0,az1=0,sortie;
  short aout;
  int sr=44100;       //sample rate (Hz)
  double f_debut=55.0;//start freq (Hz)
  double f_fin=sr/6.0;//end freq (Hz)
  double octaves=log(f_fin/f_debut)/log(2.0);
  double duree=50.0;  //duration (s)
  int i;
  FILE* f;
  f=fopen("saw.pcm","wb");
  for(i=0;i<duree*sr;i++)
  {
    //exponential frequency sweep
    //Can be replaced by anything you like.
    freq=f_debut*pow(2.0,octaves*i/(duree*sr));
    dphase=freq*(2.0/sr);     //normalised phase increment
    phase+=dphase;            //phase incrementation
    if(phase>1.0) phase-=2.0; //phase wrapping (-1,+1)

    //polynomial calculation (extensive continuity at -1 +1)
    //        7       1  3    1   5
    //P(x) = --- x - -- x  + --- x
    //       360     36      120
    aw0=phase*(7.0/360.0 + phase*phase*(-1/36.0 + phase*phase*(1/120.0)));
    // quad differentiation (first order high pass filters)
    ax0=aw1-aw0; ay0=ax1-ax0; az0=ay1-ay0; sortie=az1-az0;
    //compensation of the attenuation of the quad differentiator
    //this can be calculated at "control rate" and linearly
    //interpolated at sample rate.
    compensation=1.0/(dphase*dphase*dphase*dphase);
    // compensation and output
    aout=(short)(15000.0*compensation*sortie);
    fwrite(&aout,1,2,f);
    //old memories of differentiators
    aw1=aw0; ax1=ax0; ay1=ay0; az1=az0;
  }
  fclose(f);
}


4 comment(s) | add a comment | nofrills version


Guitar feedback

References : Posted by Sean Costello

Notes :
It is fairly simple to simulate guitar feedback with a simple Karplus-Strong algorithm (this was described in a CMJ article in the early 90's):



Code :
Run the output of the Karplus-Strong delay lines into a nonlinear shaping function for distortion (i.e. 6 parallel delay lines for 6 strings, going into 1 nonlinear shaping function that simulates an overdriven amplifier, fuzzbox, etc.);

Run part of the output into a delay line, to simulate the distance from the amplifier to the "strings";

The delay line feeds back into the Karplus-Strong delay lines. By controlling the amount of the output fed into the delay line, and the length of the delay line, you can control the intensity and pitch of the feedback note.


13 comment(s) | add a comment | nofrills version


Hermite Interpolator (x86 ASM)

Type : Hermite interpolator in x86 assembly (for MS VC++)
References : Posted by robert[DOT]bielik[AT]rbcaudio[DOT]com

Notes :
An "assemblified" variant of Laurent de Soras hermite interpolator. I tried to do calculations as parallell as I could muster, but there is almost certainly room for improvements. Right now, it works about 5.3 times (!) faster, not bad to start with...

Parameter explanation:
frac_pos: fractional value [0.0f - 1.0f] to interpolator
pntr: pointer to float array where:
pntr[0] = previous sample (idx = -1)
pntr[1] = current sample (idx = 0)
pntr[2] = next sample (idx = +1)
pntr[3] = after next sample (idx = +2)

The interpolation takes place between pntr[1] and pntr[2].

Regards,
/Robert Bielik
RBC Audio



Code :
const float c_half = 0.5f;

__declspec(naked) float __hermite(float frac_pos, const float* pntr)
{
    __asm
    {
        push    ecx;        
        mov        ecx, dword ptr[esp + 12]; //////////////////////////////////////////////////////////////////////////////////////////////////
        add        ecx, 0x04;            //    ST(0)        ST(1)        ST(2)        ST(3)        ST(4)        ST(5)        ST(6)        ST(7)        
        fld        dword ptr [ecx+4];    //    x1
        fsub    dword ptr [ecx-4];    //    x1-xm1
        fld        dword ptr [ecx];    //    x0            x1-xm1
        fsub    dword ptr [ecx+4];    //    v            x1-xm1
        fld        dword ptr [ecx+8];    //    x2            v            x1-xm1
        fsub    dword ptr [ecx];    //    x2-x0        v            x1-xm1
        fxch    st(2);                //    x1-m1        v            x2-x0
        fmul    c_half;                //    c            v            x2-x0
        fxch    st(2);                //    x2-x0        v            c
        fmul    c_half;                //    0.5*(x2-x0)    v            c
        fxch    st(2);                //    c            v            0.5*(x2-x0)
        fst        st(3);                //    c            v            0.5*(x2-x0)    c
        fadd    st(0), st(1);        //    w            v            0.5*(x2-x0)    c
        fxch    st(2);                //    0.5*(x2-x0)    v            w            c    
        faddp    st(1), st(0);        //  v+.5(x2-x0)    w            c    
        fadd    st(0), st(1);        //    a            w            c
        fadd    st(1), st(0);        //    a            b_neg        c
        fmul    dword ptr [esp+8];    //    a*frac        b_neg        c
        fsubp    st(1), st(0);        //    a*f-b        c
        fmul    dword ptr [esp+8];    //    (a*f-b)*f    c
        faddp    st(1), st(0);        //    res-x0/f
        fmul    dword ptr [esp+8];    //    res-x0
        fadd    dword ptr [ecx];    //    res
        pop        ecx;
        ret;
    }
}


4 comment(s) | add a comment | nofrills version


Hermite interpollation

References : Posted by various

Notes :
These are all different ways to do the same thing : hermite interpollation. Try'm all and benchmark.

Code :
// original
inline float hermite1(float x, float y0, float y1, float y2, float y3)
{
    // 4-point, 3rd-order Hermite (x-form)
    float c0 = y1;
    float c1 = 0.5f * (y2 - y0);
    float c2 = y0 - 2.5f * y1 + 2.f * y2 - 0.5f * y3;
    float c3 = 1.5f * (y1 - y2) + 0.5f * (y3 - y0);

    return ((c3 * x + c2) * x + c1) * x + c0;
}

// james mccartney
inline float hermite2(float x, float y0, float y1, float y2, float y3)
{
    // 4-point, 3rd-order Hermite (x-form)
    float c0 = y1;
    float c1 = 0.5f * (y2 - y0);
    float c3 = 1.5f * (y1 - y2) + 0.5f * (y3 - y0);
    float c2 = y0 - y1 + c1 - c3;

    return ((c3 * x + c2) * x + c1) * x + c0;
}

// james mccartney
inline float hermite3(float x, float y0, float y1, float y2, float y3)
{
        // 4-point, 3rd-order Hermite (x-form)
        float c0 = y1;
        float c1 = 0.5f * (y2 - y0);
        float y0my1 = y0 - y1;
        float c3 = (y1 - y2) + 0.5f * (y3 - y0my1 - y2);
        float c2 = y0my1 + c1 - c3;

        return ((c3 * x + c2) * x + c1) * x + c0;
}

// laurent de soras
inline float hermite4(float frac_pos, float xm1, float x0, float x1, float x2)
{
   const float    c     = (x1 - xm1) * 0.5f;
   const float    v     = x0 - x1;
   const float    w     = c + v;
   const float    a     = w + v + (x2 - x0) * 0.5f;
   const float    b_neg = w + a;

   return ((((a * frac_pos) - b_neg) * frac_pos + c) * frac_pos + x0);
}


9 comment(s) | add a comment | nofrills version


Hilbert Filter Coefficient Calculation

Type : Uncle Hilbert
References : Posted by Christian[at]savioursofsoul[dot]de

Notes :
This is the delphi code to create the filter coefficients, which are needed to phaseshift a signal by 90°
This may be useful for an evelope detector...

By windowing the filter coefficients you can trade phase response flatness with magnitude response flatness.

I had problems checking its response by using a dirac impulse. White noise works fine.

Also this introduces a latency of N/2!


Code :
type TSingleArray = Array of Single;

procedure UncleHilbert(var FilterCoefficients: TSingleArray; N : Integer);
var i,j : Integer;
begin
SetLength(FilterCoefficients,N);
for i:=0 to (N div 4) do
  begin
   FilterCoefficients[(N div 2)+(2*i-1)]:=+2/(PI*(2*i-1));
   FilterCoefficients[(N div 2)-(2*i-1)]:=-2/(PI*(2*i-1));
  end;
end;


no comments on this item | add a comment | nofrills version


Hiqh quality /2 decimators

Type : Decimators
References : Posted by Paul Sernine

Notes :
These are /2 decimators,
Just instanciate one of them and use the Calc method to obtain one sample while inputing two. There is 5,7 and 9 tap versions.
They are extracted/adapted from a tutorial code by Thierry Rochebois. The optimal coefficients are excerpts of Traitement numérique du signal, 5eme edition, M Bellanger, Masson pp. 339-346.


Code :
//Filtres décimateurs
// T.Rochebois
// Based on
//Traitement numérique du signal, 5eme edition, M Bellanger, Masson pp. 339-346
class Decimateur5
{
  private:
  float R1,R2,R3,R4,R5;
  const float h0;
  const float h1;
  const float h3;
  const float h5;
  public:
  Decimateur5::Decimateur5():h0(346/692.0f),h1(208/692.0f),h3(-44/692.0f),h5(9/692.0f)
  {
    R1=R2=R3=R4=R5=0.0f;
  }
  float Calc(const float x0,const float x1)
  {
    float h5x0=h5*x0;
    float h3x0=h3*x0;
    float h1x0=h1*x0;
    float R6=R5+h5x0;
    R5=R4+h3x0;
    R4=R3+h1x0;
    R3=R2+h1x0+h0*x1;
    R2=R1+h3x0;
    R1=h5x0;
    return R6;
  }
};
class Decimateur7
{
  private:
  float R1,R2,R3,R4,R5,R6,R7;
  const float h0,h1,h3,h5,h7;
  public:
  Decimateur7::Decimateur7():h0(802/1604.0f),h1(490/1604.0f),h3(-116/1604.0f),h5(33/1604.0f),h7(-6/1604.0f)
  {
    R1=R2=R3=R4=R5=R6=R7=0.0f;
  }
  float Calc(const float x0,const float x1)
  {
    float h7x0=h7*x0;
    float h5x0=h5*x0;
    float h3x0=h3*x0;
    float h1x0=h1*x0;
    float R8=R7+h7x0;
    R7=R6+h5x0;
    R6=R5+h3x0;
    R5=R4+h1x0;
    R4=R3+h1x0+h0*x1;
    R3=R2+h3x0;
    R2=R1+h5x0;
    R1=h7x0;
    return R8;
  }
};
class Decimateur9
{
  private:
  float R1,R2,R3,R4,R5,R6,R7,R8,R9;
  const float h0,h1,h3,h5,h7,h9;
  public:
  Decimateur9::Decimateur9():h0(8192/16384.0f),h1(5042/16384.0f),h3(-1277/16384.0f),h5(429/16384.0f),h7(-116/16384.0f),h9(18/16384.0f)
  {
    R1=R2=R3=R4=R5=R6=R7=R8=R9=0.0f;
  }
  float Calc(const float x0,const float x1)
  {
    float h9x0=h9*x0;
    float h7x0=h7*x0;
    float h5x0=h5*x0;
    float h3x0=h3*x0;
    float h1x0=h1*x0;
    float R10=R9+h9x0;
    R9=R8+h7x0;
    R8=R7+h5x0;
    R7=R6+h3x0;
    R6=R5+h1x0;
    R5=R4+h1x0+h0*x1;
    R4=R3+h3x0;
    R3=R2+h5x0;
    R2=R1+h7x0;
    R1=h9x0;
    return R10;
  }
};


3 comment(s) | add a comment | nofrills version


Inverted parabolic envelope

Type : envellope generation
References : Posted by James McCartney
Code :
dur = duration in samples
midlevel = amplitude at midpoint
beglevel = beginning and ending level (typically zero)

amp = midlevel - beglevel;

rdur = 1.0 / dur;
rdur2 = rdur * rdur;

level = beglevel;
slope = 4.0 * amp * (rdur - rdur2);
curve = -8.0 * amp * rdur2;

...

for (i=0; i<dur; ++i) {
        level += slope;
        slope += curve;
}


3 comment(s) | add a comment | nofrills version


Java FFT

Type : FFT Analysis
References : Posted by Loreno Heer

Notes :
May not work correctly ;-)

Code :
// WTest.java
/*
    Copyright (C) 2003 Loreno Heer, (helohe at bluewin dot ch)

    This program is free software; you can redistribute it and/or modify
    it under the terms of the GNU General Public License as published by
    the Free Software Foundation; either version 2 of the License, or
    (at your option) any later version.

    This program is distributed in the hope that it will be useful,
    but WITHOUT ANY WARRANTY; without even the implied warranty of
    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
    GNU General Public License for more details.

    You should have received a copy of the GNU General Public License
    along with this program; if not, write to the Free Software
    Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA

*/
public class WTest{

    private static double[] sin(double step, int size){
        double f = 0;
        double[] ret = new double[size];
        for(int i = 0; i < size; i++){
            ret[i] = Math.sin(f);
            f += step;
        }
        return ret;
    }

    private static double[] add(double[] a, double[] b){
        double[] c = new double[a.length];
        for(int i = 0; i < a.length; i++){
            c[i] = a[i] + b[i];
        }
        return c;
    }

    private static double[] sub(double[] a, double[] b){
        double[] c = new double[a.length];
        for(int i = 0; i < a.length; i++){
            c[i] = a[i] - b[i];
        }
        return c;
    }

    private static double[] add(double[] a, double b){
        double[] c = new double[a.length];
        for(int i = 0; i < a.length; i++){
            c[i] = a[i] + b;
        }
        return c;
    }

    private static double[] cp(double[] a, int size){
        double[] c = new double[size];
        for(int i = 0; i < size; i++){
            c[i] = a[i];
        }
        return c;
    }

    private static double[] mul(double[] a, double b){
        double[] c = new double[a.length];
        for(int i = 0; i < a.length; i++){
            c[i] = a[i] * b;
        }
        return c;
    }

    private static void print(double[] value){
        for(int i = 0; i < value.length; i++){
            System.out.print(i + "," + value[i] + "\n");
        }
        System.out.println();
    }

    private static double abs(double[] a){
        double c = 0;
        for(int i = 0; i < a.length; i++){
            c = ((c * i) + Math.abs(a[i])) / (i + 1);
        }
        return c;
    }

    private static double[] fft(double[] a, int min, int max, int step){
        double[] ret = new double[(max - min) / step];
        int i = 0;
        for(int d = min; d < max; d = d + step){
            double[] f = sin(fc(d), a.length);
            double[] dif = sub(a, f);
            ret[i] = 1 - abs(dif);
            i++;
        }
        return ret;
    }

    private static double[] fft_log(double[] a){
        double[] ret = new double[1551];
        int i = 0;
        for(double d = 0; d < 15.5; d = d + 0.01){
            double[] f = sin(fc(Math.pow(2,d)), a.length);
            double[] dif = sub(a, f);
            ret[i] = Math.abs(1 - abs(dif));
            i++;
        }
        return ret;
    }

    private static double fc(double d){
        return d * Math.PI / res;
    }

    private static void print_log(double[] value){
        for(int i = 0; i < value.length; i++){
            System.out.print(Math.pow(2,((double)i/100d)) + "," + value[i] + "\n");
        }
        System.out.println();
    }

    public static void main(String[] args){
        double[] f_0 = sin(fc(440), sample_length); // res / pi =>14005
        //double[] f_1 = sin(.02, sample_length);
        double[] f_2 = sin(fc(520), sample_length);
        //double[] f_3 = sin(.25, sample_length);

        //double[] f = add( add( add(f_0, f_1), f_2), f_3);

        double[] f = add(f_0, f_2);

        //print(f);

        double[] d = cp(f,1000);
        print_log(fft_log(d));
    }

    static double length = .2; // sec
    static int res = 44000; // resoultion (pro sec)
    static int sample_length = res; // resoultion

}


2 comment(s) | add a comment | nofrills version


Karlsen

Type : 24-dB (4-pole) lowpass
References : Posted by Best Regards,Ove Karlsen

Notes :
There's really not much voodoo going on in the filter itself, it's a simple as possible:

pole1 = (in * frequency) + (pole1 * (1 - frequency));

Most of you can probably understand that math, it's very similar to how an analog condenser works.
Although, I did have to do some JuJu to add resonance to it.
While studing the other filters, I found that the feedback phase is very important to how the overall
resonance level will be, and so I made a dynamic feedback path, and constant Q approximation by manipulation
of the feedback phase.
A bonus with this filter, is that you can "overdrive" it... Try high input levels..



Code :
// Karlsen 24dB Filter by Ove Karlsen / Synergy-7 in the year 2003.
// b_f = frequency 0..1
// b_q = resonance 0..50
// b_in = input
// to do bandpass, subtract poles from eachother, highpass subtract with input.


   float b_inSH = b_in // before the while statement.

    while (b_oversample < 2) {                        //2x oversampling (@44.1khz)
        float prevfp;
        prevfp = b_fp;
        if (prevfp > 1) {prevfp = 1;}                    // Q-limiter

        b_fp = (b_fp * 0.418) + ((b_q * pole4) * 0.582);        // dynamic feedback
        float intfp;
        intfp = (b_fp * 0.36) + (prevfp * 0.64);            // feedback phase
        b_in =    b_inSH - intfp;                        // inverted feedback        
                    
        pole1 = (b_in   * b_f) + (pole1 * (1 - b_f));            // pole 1
        if (pole1 > 1) {pole1 = 1;} else if (pole1 < -1) {pole1 = -1;}  // pole 1 clipping
        pole2 = (pole1   * b_f) + (pole2 * (1 - b_f));            // pole 2
        pole3 = (pole2   * b_f) + (pole3 * (1 - b_f));            // pole 3
        pole4 = (pole3   * b_f) + (pole4 * (1 - b_f));            // pole 4

        b_oversample++;
        }
    lowpassout = b_in;




23 comment(s) | add a comment | nofrills version


Karlsen Fast Ladder

Type : 4 pole ladder emulation
References : Posted by arifovekarlsen[AT]hotmail[DOT]com

Notes :
ATTN Admin: You should remove the old version named "Karlsen" on your website, and rather include this one instead.

Code :
// An updated version of "Karlsen 24dB Filter"
// This time, the fastest incarnation possible.
// The very best greetings, Arif Ove Karlsen.
// arifovekarlsen->hotmail.com

b_rscl = b_buf4; if (b_rscl > 1) {b_rscl = 1;}
b_in = (-b_rscl * b_rez) + b_in;
b_buf1 = ((-b_buf1 + b_in1) * b_cut) + b_buf1;
b_buf2 = ((-b_buf2 + b_buf1) * b_cut) + b_buf2;
b_buf3 = ((-b_buf3 + b_buf2) * b_cut) + b_buf3;
b_buf4 = ((-b_buf4 + b_buf3) * b_cut) + b_buf4;
b_lpout = b_buf4;


9 comment(s) | add a comment | nofrills version


Linear interpolation

Type : Linear interpolators for oversampled audio
References : Posted by scoofy[AT]inf[DOT]elte[DOT]hu

Notes :
Simple, fast linear interpolators for upsampling a signal by a factor of 2,4,8,16 or 32. Not very usable on their own since they introduce aliasing (but still better than zero order hold). These are best used with already oversampled signals.

-- Peter Schoffhauzer


Code :
#ifndef __LIN_INTERPOLATOR_H_
#define __LIN_INTERPOLATOR_H_

/************************************************************************
*    Linear interpolator class                                            *
************************************************************************/

class interpolator_linear
{
public:
    interpolator_linear() {
        reset_hist();
    }
    
    // reset history
    void reset_hist() {
        d1 = 0.f;
    }

    // 2x interpolator
    // out: pointer to float[2]
    inline void process2x(float const in, float *out) {
        out[0] = d1 + 0.5f*(in-d1);    // interpolate
        out[1] = in;
        d1 = in; // store delay
    }

    // 4x interpolator
    // out: pointer to float[4]
    inline void process4x(float const in, float *out) {
        float y = in-d1;
        out[0] = d1 + 0.25f*y;    // interpolate
        out[1] = d1 + 0.5f*y;
        out[2] = d1 + 0.75f*y;
        out[3] = in;
        d1 = in; // store delay
    }

    // 8x interpolator
    // out: pointer to float[8]
    inline void process8x(float const in, float *out) {
        float y = in-d1;
        out[0] = d1 + 0.125f*y;    // interpolate
        out[1] = d1 + 0.25f*y;
        out[2] = d1 + 0.375f*y;
        out[3] = d1 + 0.5f*y;
        out[4] = d1 + 0.625f*y;
        out[5] = d1 + 0.75f*y;
        out[6] = d1 + 0.875f*y;
        out[7] = in;
        d1 = in; // store delay
    }

    // 16x interpolator
    // out: pointer to float[16]
    inline void process16x(float const in, float *out) {
        float y = in-d1;
        out[0] = d1 + (1.0f/16.0f)*y;    // interpolate
        out[1] = d1 + (2.0f/16.0f)*y;    
        out[2] = d1 + (3.0f/16.0f)*y;    
        out[3] = d1 + (4.0f/16.0f)*y;    
        out[4] = d1 + (5.0f/16.0f)*y;    
        out[5] = d1 + (6.0f/16.0f)*y;    
        out[6] = d1 + (7.0f/16.0f)*y;    
        out[7] = d1 + (8.0f/16.0f)*y;    
        out[8] = d1 + (9.0f/16.0f)*y;    
        out[9] = d1 + (10.0f/16.0f)*y;    
        out[10] = d1 + (11.0f/16.0f)*y;    
        out[11] = d1 + (12.0f/16.0f)*y;    
        out[12] = d1 + (13.0f/16.0f)*y;    
        out[13] = d1 + (14.0f/16.0f)*y;    
        out[14] = d1 + (15.0f/16.0f)*y;    
        out[15] = in;        
        d1 = in; // store delay
    }

    // 32x interpolator
    // out: pointer to float[32]
    inline void process32x(float const in, float *out) {
        float y = in-d1;
        out[0] = d1 + (1.0f/32.0f)*y;    // interpolate
        out[1] = d1 + (2.0f/32.0f)*y;    
        out[2] = d1 + (3.0f/32.0f)*y;    
        out[3] = d1 + (4.0f/32.0f)*y;    
        out[4] = d1 + (5.0f/32.0f)*y;    
        out[5] = d1 + (6.0f/32.0f)*y;    
        out[6] = d1 + (7.0f/32.0f)*y;    
        out[7] = d1 + (8.0f/32.0f)*y;    
        out[8] = d1 + (9.0f/32.0f)*y;    
        out[9] = d1 + (10.0f/32.0f)*y;    
        out[10] = d1 + (11.0f/32.0f)*y;    
        out[11] = d1 + (12.0f/32.0f)*y;    
        out[12] = d1 + (13.0f/32.0f)*y;    
        out[13] = d1 + (14.0f/32.0f)*y;    
        out[14] = d1 + (15.0f/32.0f)*y;    
        out[15] = d1 + (16.0f/32.0f)*y;    
        out[16] = d1 + (17.0f/32.0f)*y;    
        out[17] = d1 + (18.0f/32.0f)*y;    
        out[18] = d1 + (19.0f/32.0f)*y;    
        out[19] = d1 + (20.0f/32.0f)*y;    
        out[20] = d1 + (21.0f/32.0f)*y;    
        out[21] = d1 + (22.0f/32.0f)*y;    
        out[22] = d1 + (23.0f/32.0f)*y;    
        out[23] = d1 + (24.0f/32.0f)*y;    
        out[24] = d1 + (25.0f/32.0f)*y;    
        out[25] = d1 + (26.0f/32.0f)*y;    
        out[26] = d1 + (27.0f/32.0f)*y;    
        out[27] = d1 + (28.0f/32.0f)*y;    
        out[28] = d1 + (29.0f/32.0f)*y;    
        out[29] = d1 + (30.0f/32.0f)*y;    
        out[30] = d1 + (31.0f/32.0f)*y;    
        out[31] = in;        
        d1 = in; // store delay
    }

private:
    float d1; // previous input
};

#endif


2 comment(s) | add a comment | nofrills version


Lo-Fi Crusher

Type : Quantizer / Decimator with smooth control
References : Posted by David Lowenfels

Notes :
Yet another bitcrusher algorithm. But this one has smooth parameter control.

Normfreq goes from 0 to 1.0; (freq/samplerate)
Input is assumed to be between 0 and 1.
Output gain is greater than unity when bits < 1.0;


Code :
function output = crusher( input, normfreq, bits );
    step = 1/2^(bits);
    phasor = 0;
    last = 0;

    for i = 1:length(input)
       phasor = phasor + normfreq;
       if (phasor >= 1.0)
          phasor = phasor - 1.0;
          last = step * floor( input(i)/step + 0.5 ); %quantize
       end
       output(i) = last; %sample and hold
    end
end


4 comment(s) | add a comment | nofrills version


Lock free fifo

References : Posted by Timo
Linked file : LockFreeFifo.h

Notes :
Simple implementation of a lock free FIFO.



2 comment(s) | add a comment | nofrills version


Look ahead limiting

References : Posted by Wilfried Welti

Notes :
use add_value with all values which enter the look-ahead area,
and remove_value with all value which leave this area. to get
the maximum value in the look-ahead area, use get_max_value.
in the very beginning initialize the table with zeroes.

If you always want to know the maximum amplitude in
your look-ahead area, the thing becomes a sorting
problem. very primitive approach using a look-up table


Code :
void lookup_add(unsigned section, unsigned size, unsigned value)
{
  if (section==value)
    lookup[section]++;
  else
  {
    size >>= 1;
    if (value>section)
    {
      lookup[section]++;
      lookup_add(section+size,size,value);
    }
    else
      lookup_add(section-size,size,value);
  }
}

void lookup_remove(unsigned section, unsigned size, unsigned value)
{
  if (section==value)
    lookup[section]--;
  else
  {
    size >>= 1;
    if (value>section)
    {
      lookup[section]--;
      lookup_remove(section+size,size,value);
    }
    else
      lookup_remove(section-size,size,value);
  }
}

unsigned lookup_getmax(unsigned section, unsigned size)
{
  unsigned max = lookup[section] ? section : 0;
  size >>= 1;
  if (size)
    if (max)
    {
      max = lookup_getmax((section+size),size);
      if (!max) max=section;
    }
    else
      max = lookup_getmax((section-size),size);
  return max;
}

void add_value(unsigned value)
{
  lookup_add(LOOKUP_VALUES>>1, LOOKUP_VALUES>>1, value);
}

void remove_value(unsigned value)
{
  lookup_remove(LOOKUP_VALUES>>1, LOOKUP_VALUES>>1, value);
}

unsigned get_max_value()
{
  return lookup_getmax(LOOKUP_VALUES>>1, LOOKUP_VALUES>>1);
}


no comments on this item | add a comment | nofrills version


Lookahead Limiter

Type : Limiter
References : Posted by Christian at savioursofsoul dot de

Notes :
I've been thinking about this for a long time and this is the best I came up with so far. I might be all wrong, but according to some simulations this looks quite nice (as I want it to be).

The below algorithm is written in prosa. It's up to you to transfer it into code.


Code :
Ingredients:
------------

1 circular buffers (size of the look ahead time)
2 circular buffers (half the size of the look ahead time)
4 parameters ('Lookahead Time [s]', 'Input Gain [dB]', 'Output Gain [dB]' and 'Release Time [s])
a handfull state variables

Recipe:
-------

0. Make sure all buffers are properly initialized and do not contain any dirt (pure zeros are what we need).

For each sample do the following procedure:

1. Store current sample in the lookahead time circular buffer, for later use (and retrieve the value that falls out as the preliminary 'Output')

2. Find maximum within this circular buffer. This can also be implemented efficiently with an hold algorithm.

3. Gain this maximum by the 'Input Gain [dB]' parameter

4. Calculate necessary gain reduction factor (=1, if no gain reduction takes place and <1 for any signal above 0 dBFS)

5. Eventually subtract this value from 1 for a better numerical stability. (MUST BE UNDONE LATER!)

6. Add this gain reduction value to the first of the smaller circular buffers to calculate the short time sum (add this value to a sum and subtract the value that felt out of the circular buffer).

7. normalize the sum by dividing it by the length of the circular buffer (-> / ('Lookahead Time' [samples] / 2))

8. repeat step 6 & 7 with this sum in the second circular buffer. The reason for these steps is to transform dirac impulses to a triangle (dirac -> rect -> triangle)

9. apply the release time (release time -> release slew rate 'factor' -> multiply by that factor) to the 'Maximum Gain Reduction' state variable

10. check whether the currently calculated gain reduction is higher than the 'Maximum Gain Reduction'. If so, replace!

11. eventually remove (1 - x) from step 5 here

12. calculate effective gain reduction by the above value gained by input and output gain.

13. Apply this gain reduction to the preliminary 'Output' from step 1

Repeat the above procedure (step 1-13) for all samples!


2 comment(s) | add a comment | nofrills version


Lowpass filter for parameter edge filtering

References : Olli Niemitalo
Linked file : filter001.gif

Notes :
use this filter to smooth sudden parameter changes
(see linkfile!)


Code :
/* - Three one-poles combined in parallel
* - Output stays within input limits
* - 18 dB/oct (approx) frequency response rolloff
* - Quite fast, 2x3 parallel multiplications/sample, no internal buffers
* - Time-scalable, allowing use with different samplerates
* - Impulse and edge responses have continuous differential
* - Requires high internal numerical precision
*/
{
        /* Parameters */
        // Number of samples from start of edge to halfway to new value
        const double        scale = 100;
        // 0 < Smoothness < 1. High is better, but may cause precision problems
        const double        smoothness = 0.999;

        /* Precalc variables */
        double                a = 1.0-(2.4/scale); // Could also be set directly
        double                b = smoothness;      //         -"-
        double                acoef = a;
        double                bcoef = a*b;
        double                ccoef = a*b*b;
        double                mastergain = 1.0 / (-1.0/(log(a)+2.0*log(b))+2.0/
                        (log(a)+log(b))-1.0/log(a));
        double                again = mastergain;
        double                bgain = mastergain * (log(a*b*b)*(log(a)-log(a*b)) /
                            ((log(a*b*b)-log(a*b))*log(a*b))
                            - log(a)/log(a*b));
        double                cgain = mastergain * (-(log(a)-log(a*b)) /
                        (log(a*b*b)-log(a*b)));

        /* Runtime variables */
        long                streamofs;
        double                areg = 0;
        double                breg = 0;
        double                creg = 0;

        /* Main loop */
        for (streamofs = 0; streamofs < streamsize; streamofs++)
        {
                /* Update filters */
                areg = acoef * areg + fromstream [streamofs];
                breg = bcoef * breg + fromstream [streamofs];
                creg = ccoef * creg + fromstream [streamofs];

                /* Combine filters in parallel */
                long                temp =   again * areg
                                       + bgain * breg
                                       + cgain * creg;

                /* Check clipping */
                if (temp > 32767)
                {
                        temp = 32767;
                }
                else if (temp < -32768)
                {
                        temp = -32768;
                }

                /* Store new value */
                tostream [streamofs] = temp;
        }
}


2 comment(s) | add a comment | nofrills version


LP and HP filter

Type : biquad, tweaked butterworth
References : Posted by Patrice Tarrabia
Code :
r  = rez amount, from sqrt(2) to ~ 0.1
f  = cutoff frequency
(from ~0 Hz to SampleRate/2 - though many
synths seem to filter only  up to SampleRate/4)

The filter algo:
out(n) = a1 * in + a2 * in(n-1) + a3 * in(n-2) - b1*out(n-1) - b2*out(n-2)

Lowpass:
      c = 1.0 / tan(pi * f / sample_rate);

      a1 = 1.0 / ( 1.0 + r * c + c * c);
      a2 = 2* a1;
      a3 = a1;
      b1 = 2.0 * ( 1.0 - c*c) * a1;
      b2 = ( 1.0 - r * c + c * c) * a1;

Hipass:
      c = tan(pi * f / sample_rate);

      a1 = 1.0 / ( 1.0 + r * c + c * c);
      a2 = -2*a1;
      a3 = a1;
      b1 = 2.0 * ( c*c - 1.0) * a1;
      b2 = ( 1.0 - r * c + c * c) * a1;


17 comment(s) | add a comment | nofrills version


LPC analysis (autocorrelation + Levinson-Durbin recursion)

References : Posted by mail[AT]mutagene[DOT]net

Notes :
The autocorrelation function implements a warped autocorrelation, so that frequency resolution can be specified by the variable 'lambda'. Levinson-Durbin recursion calculates autoregression coefficients a and reflection coefficients (for lattice filter implementation) K. Comments for Levinson-Durbin function implement matlab version of the same function.

No optimizations.


Code :
//find the order-P autocorrelation array, R, for the sequence x of length L and warping of lambda
//wAutocorrelate(&pfSrc[stIndex],siglen,R,P,0);
wAutocorrelate(float * x, unsigned int L, float * R, unsigned int P, float lambda)
{
    double    * dl = new double [L];
    double    * Rt = new double [L];
    double r1,r2,r1t;
    R[0]=0;
    Rt[0]=0;
    r1=0;
    r2=0;
    r1t=0;
    for(unsigned int k=0; k<L;k++)
    {
            Rt[0]+=double(x[k])*double(x[k]);

            dl[k]=r1-double(lambda)*double(x[k]-r2);
            r1 = x[k];
            r2 = dl[k];
    }
    for(unsigned int i=1; i<=P; i++)
    {
        Rt[i]=0;
        r1=0;
        r2=0;
        for(unsigned int k=0; k<L;k++)
        {
            Rt[i]+=double(dl[k])*double(x[k]);
            
            r1t = dl[k];
            dl[k]=r1-double(lambda)*double(r1t-r2);
            r1 = r1t;
            r2 = dl[k];
        }
    }
    for(i=0; i<=P; i++)
        R[i]=float(Rt[i]);
    delete[] dl;
    delete[] Rt;
}

// Calculate the Levinson-Durbin recursion for the autocorrelation sequence R of length P+1 and return the autocorrelation coefficients a and reflection coefficients K
LevinsonRecursion(unsigned int P, float *R, float *A, float *K)
{
    double Am1[62];

    if(R[0]==0.0) {
        for(unsigned int i=1; i<=P; i++)
        {
            K[i]=0.0;
            A[i]=0.0;
        }}
    else {
        double km,Em1,Em;
        unsigned int k,s,m;
        for (k=0;k<=P;k++){
            A[0]=0;
            Am1[0]=0; }
        A[0]=1;
        Am1[0]=1;
        km=0;
        Em1=R[0];
        for (m=1;m<=P;m++)                    //m=2:N+1
        {
            double err=0.0f;                    //err = 0;
            for (k=1;k<=m-1;k++)            //for k=2:m-1
                err += Am1[k]*R[m-k];        // err = err + am1(k)*R(m-k+1);
            km = (R[m]-err)/Em1;            //km=(R(m)-err)/Em1;
            K[m-1] = -float(km);
            A[m]=(float)km;                        //am(m)=km;
            for (k=1;k<=m-1;k++)            //for k=2:m-1
                A[k]=float(Am1[k]-km*Am1[m-k]);  // am(k)=am1(k)-km*am1(m-k+1);
            Em=(1-km*km)*Em1;                //Em=(1-km*km)*Em1;
            for(s=0;s<=P;s++)                //for s=1:N+1
                Am1[s] = A[s];                // am1(s) = am(s)
            Em1 = Em;                        //Em1 = Em;
        }
    }
    return 0;
}


3 comment(s) | add a comment | nofrills version


LPF 24dB/Oct

Type : Chebyshev
References : Posted by Christian[AT]savioursofsoul[DOT]de
Code :
First calculate the prewarped digital frequency:

K = tan(Pi * Frequency / Samplerate);

Now we calc some Coefficients:

sg = Sinh(PassbandRipple);
cg = Cosh(PassbandRipple);
cg *= cg;

Coeff[0] = 1 / (cg-0.85355339059327376220042218105097);
Coeff[1] = K * Coeff[0]*sg*1.847759065022573512256366378792;
Coeff[2] = 1 / (cg-0.14644660940672623779957781894758);
Coeff[3] = K * Coeff[2]*sg*0.76536686473017954345691996806;

K *= K; // (just to optimize it a little bit)

Calculate the first biquad:

A0 =   (Coeff[1]+K+Coeff[0]);
A1 = 2*(Coeff[0]-K)*t;
A2 =   (Coeff[1]-K-Coeff[0])*t;
B0 = t*K;
B1 = 2*B0;
B2 = B0;

Calculate the second biquad:

A3 =   (Coeff[3]+K+Coeff[2]);
A4 = 2*(Coeff[2]-K)*t;
A5 =   (Coeff[3]-K-Coeff[2])*t;
B3 = t*K;
B4 = 2*B3;
B5 = B3;

Then calculate the output as follows:

Stage1 = B0*Input + State0;
State0 = B1*Input + A1/A0*Stage1 + State1;
State1 = B2*Input + A2/A0*Stage1;

Output = B3*Stage1 + State2;
State2 = B4*Stage1 + A4/A3*Output + State2;
State3 = B5*Stage1 + A5/A3*Output;


3 comment(s) | add a comment | nofrills version


Magnitude and phase plot of arbitrary IIR function, up to 5th order

Type : magnitude and phase at any frequency
References : Posted by George Yohng

Notes :
Amplitude and phase calculation of IIR equation
run at sample rate "sampleRate" at frequency "F".

AMPLITUDE
-----------
cf_mag(F,sampleRate,
a0,a1,a2,a3,a4,a5,
b0,b1,b2,b3,b4,b5)
-----------

PHASE
-----------
cf_phi(F,sampleRate,
a0,a1,a2,a3,a4,a5,
b0,b1,b2,b3,b4,b5)
-----------

If you need a frequency diagram, draw a plot for
F=0...sampleRate/2

If you need amplitude in dB, use cf_lin2db(cf_mag(.......))

Set b0=-1 if you have such function:

y[n] = a0*x[n] + a1*x[n-1] + a2*x[n-2] + a3*x[n-3] + a4*x[n-4] + a5*x[n-5] +
+ b1*y[n-1] + b2*y[n-2] + b3*y[n-3] + b4*y[n-4] + b5*y[n-5];

Set b0=1 if you have such function:

y[n] = a0*x[n] + a1*x[n-1] + a2*x[n-2] + a3*x[n-3] + a4*x[n-4] + a5*x[n-5] +
- b1*y[n-1] - b2*y[n-2] - b3*y[n-3] - b4*y[n-4] - b5*y[n-5];

Do not try to reverse engineer these formulae - they don't give any sense
other than they are derived from transfer function, and they work. :)


Code :
/*
C file can be downloaded from
http://www.yohng.com/dsp/cfsmp.c
*/


#define C_PI 3.14159265358979323846264

double cf_mag(double f,double rate,
              double a0,double a1,double a2,double a3,double a4,double a5,
              double b0,double b1,double b2,double b3,double b4,double b5)
{
    return
        sqrt((a0*a0 + a1*a1 + a2*a2 + a3*a3 + a4*a4 + a5*a5 +
              2*(a0*a1 + a1*a2 + a2*a3 + a3*a4 + a4*a5)*cos((2*f*C_PI)/rate) +
              2*(a0*a2 + a1*a3 + a2*a4 + a3*a5)*cos((4*f*C_PI)/rate) +
              2*a0*a3*cos((6*f*C_PI)/rate) + 2*a1*a4*cos((6*f*C_PI)/rate) +
              2*a2*a5*cos((6*f*C_PI)/rate) + 2*a0*a4*cos((8*f*C_PI)/rate) +
              2*a1*a5*cos((8*f*C_PI)/rate) + 2*a0*a5*cos((10*f*C_PI)/rate))/
              (b0*b0 + b1*b1 + b2*b2 + b3*b3 + b4*b4 + b5*b5 +
              2*(b0*b1 + b1*b2 + b2*b3 + b3*b4 + b4*b5)*cos((2*f*C_PI)/rate) +
              2*(b0*b2 + b1*b3 + b2*b4 + b3*b5)*cos((4*f*C_PI)/rate) +
              2*b0*b3*cos((6*f*C_PI)/rate) + 2*b1*b4*cos((6*f*C_PI)/rate) +
              2*b2*b5*cos((6*f*C_PI)/rate) + 2*b0*b4*cos((8*f*C_PI)/rate) +
              2*b1*b5*cos((8*f*C_PI)/rate) + 2*b0*b5*cos((10*f*C_PI)/rate)));
}


double cf_phi(double f,double rate,
              double a0,double a1,double a2,double a3,double a4,double a5,
              double b0,double b1,double b2,double b3,double b4,double b5)
{
        atan2((a0*b0 + a1*b1 + a2*b2 + a3*b3 + a4*b4 + a5*b5 +
              (a0*b1 + a1*(b0 + b2) + a2*(b1 + b3) + a5*b4 + a3*(b2 + b4) +
              a4*(b3 + b5))*cos((2*f*C_PI)/rate) +
              ((a0 + a4)*b2 + (a1 + a5)*b3 + a2*(b0 + b4) +
              a3*(b1 + b5))*cos((4*f*C_PI)/rate) + a3*b0*cos((6*f*C_PI)/rate) +
              a4*b1*cos((6*f*C_PI)/rate) + a5*b2*cos((6*f*C_PI)/rate) +
              a0*b3*cos((6*f*C_PI)/rate) + a1*b4*cos((6*f*C_PI)/rate) +
              a2*b5*cos((6*f*C_PI)/rate) + a4*b0*cos((8*f*C_PI)/rate) +
              a5*b1*cos((8*f*C_PI)/rate) + a0*b4*cos((8*f*C_PI)/rate) +
              a1*b5*cos((8*f*C_PI)/rate) +
              (a5*b0 + a0*b5)*cos((10*f*C_PI)/rate))/
              (b0*b0 + b1*b1 + b2*b2 + b3*b3 + b4*b4 + b5*b5 +
              2*((b0*b1 + b1*b2 + b3*(b2 + b4) + b4*b5)*cos((2*f*C_PI)/rate) +
              (b2*(b0 + b4) + b3*(b1 + b5))*cos((4*f*C_PI)/rate) +
              (b0*b3 + b1*b4 + b2*b5)*cos((6*f*C_PI)/rate) +
              (b0*b4 + b1*b5)*cos((8*f*C_PI)/rate) +
              b0*b5*cos((10*f*C_PI)/rate))),

             ((a1*b0 + a3*b0 + a5*b0 - a0*b1 + a2*b1 + a4*b1 - a1*b2 +
              a3*b2 + a5*b2 - a0*b3 - a2*b3 + a4*b3 -
              a1*b4 - a3*b4 + a5*b4 - a0*b5 - a2*b5 - a4*b5 +
              2*(a3*b1 + a5*b1 - a0*b2 + a4*(b0 + b2) - a1*b3 + a5*b3 +
              a2*(b0 - b4) - a0*b4 - a1*b5 - a3*b5)*cos((2*f*C_PI)/rate) +
              2*(a3*b0 + a4*b1 + a5*(b0 + b2) - a0*b3 - a1*b4 - a0*b5 - a2*b5)*
              cos((4*f*C_PI)/rate) + 2*a4*b0*cos((6*f*C_PI)/rate) +
              2*a5*b1*cos((6*f*C_PI)/rate) - 2*a0*b4*cos((6*f*C_PI)/rate) -
              2*a1*b5*cos((6*f*C_PI)/rate) + 2*a5*b0*cos((8*f*C_PI)/rate) -
              2*a0*b5*cos((8*f*C_PI)/rate))*sin((2*f*C_PI)/rate))/
              (b0*b0 + b1*b1 + b2*b2 + b3*b3 + b4*b4 + b5*b5 +
              2*(b0*b1 + b1*b2 + b2*b3 + b3*b4 + b4*b5)*cos((2*f*C_PI)/rate) +
              2*(b0*b2 + b1*b3 + b2*b4 + b3*b5)*cos((4*f*C_PI)/rate) +
              2*b0*b3*cos((6*f*C_PI)/rate) + 2*b1*b4*cos((6*f*C_PI)/rate) +
              2*b2*b5*cos((6*f*C_PI)/rate) + 2*b0*b4*cos((8*f*C_PI)/rate) +
              2*b1*b5*cos((8*f*C_PI)/rate) + 2*b0*b5*cos((10*f*C_PI)/rate)));
}

double cf_lin2db(double lin)
{
    if (lin<9e-51) return -1000; /* prevent invalid operation */
    return 20*log10(lin);
}



9 comment(s) | add a comment | nofrills version


Matlab Time Domain Impulse Response Inverter/Divider

References : Posted by arcane[AT]arcanemethods[DOT]com

Notes :
Matlab code for time domain inversion of an impulse response or the division of two of them (transfer function.) The main teoplitz function is given both as a .m file and as a .c file for Matlab'w MEX compilation. The latter is much faster.



Code :
function inv=invimplms(den,n,d)
% syntax inv=invimplms(den,n,d)
%         den - denominator impulse
%         n   - length of result
%         d   - delay of result
%         inv - inverse impulse response of length n with delay d
%
% Levinson-Durbin algorithm from Proakis and Manolokis p.865
%
% Author: Bob Cain, May 1, 2001 arcane[AT]arcanemethods[DOT]com

    m=xcorr(den,n-1);
    m=m(n:end);
    b=[den(d+1:-1:1);zeros(n-d-1,1)];
    inv=toepsolve(m,b);



function quo=divimplms(num,den,n,d)
%Syntax quo=divimplms(num,den,n,d)
%       num - numerator impulse
%       den - denominator impulse
%       n   - length of result
%       d   - delay of result
%        quo - quotient impulse response of length n delayed by d
%
% Levinson-Durbin algorithm from Proakis and Manolokis p.865
%
% Author: Bob Cain, May 1, 2001 arcane@arcanemethods.com

    m=xcorr(den,n-1);
    m=m(n:end);
    b=xcorr([zeros(d,1);num],den,n-1);
    b=b(n:-1:1);
    quo=toepsolve(m,b);


function hinv=toepsolve(r,q)
% Solve Toeplitz system of equations.
%    Solves R*hinv = q, where R is the symmetric Toeplitz matrix
%    whos first column is r
%    Assumes all inputs are real
%    Inputs:  
%       r - first column of Toeplitz matrix, length n
%       q - rhs vector, length n
%    Outputs:
%       hinv - length n solution
%
%   Algorithm from Roberts & Mullis, p.233
%
%   Author: T. Krauss, Sept 10, 1997
%
%   Modified: R. Cain, Dec 16, 2004 to remove a pair of transposes
%             that caused errors.

    n=length(q);
    a=zeros(n+1,2);
    a(1,1)=1;

    hinv=zeros(n,1);
    hinv(1)=q(1)/r(1);

    alpha=r(1);
    c=1;
    d=2;

    for k=1:n-1,
       a(k+1,c)=0;
       a(1,d)=1;
       beta=0;
       j=1:k;
       beta=sum(r(k+2-j).*a(j,c))/alpha;
       a(j+1,d)=a(j+1,c)-beta*a(k+1-j,c);
       alpha=alpha*(1-beta^2);
       hinv(k+1,1)=(q(k+1)-sum(r(k+2-j).*hinv(j,1)))/alpha;
       hinv(j)=hinv(j)+a(k+2-j,d)*hinv(k+1);
       temp=c;
       c=d;
       d=temp;
    end


-----What follows is the .c version of toepsolve--------

#include <math.h>
#include "mex.h"

/*  function hinv = toepsolve(r,q);
*  TOEPSOLVE  Solve Toeplitz system of equations.
*    Solves R*hinv = q, where R is the symmetric Toeplitz matrix
*    whos first column is r
*    Assumes all inputs are real
*    Inputs:  
*       r - first column of Toeplitz matrix, length n
*       q - rhs vector, length n
*    Outputs:
*       hinv - length n solution
*
*   Algorithm from Roberts & Mullis, p.233
*
*   Author: T. Krauss, Sept 10, 1997
*
*   Modified: R. Cain, Dec 16, 2004 to replace unnecessasary
*             n by n matrix allocation for a with an n by 2 rotating
*             buffer and to more closely match the .m function.
*/
void mexFunction(
    int nlhs,
    mxArray *plhs[],
    int nrhs,
    const mxArray *prhs[]
)
{
   double (*a)[2],*hinv,alpha,beta;
   int c,d,temp,j,k;
  
   double eps = mxGetEps();
   int n = (mxGetN(prhs[0])>=mxGetM(prhs[0])) ? mxGetN(prhs[0]) : mxGetM(prhs[0]) ;
   double *r = mxGetPr(prhs[0]);
   double *q = mxGetPr(prhs[1]);
  
   a = (double (*)[2])mxCalloc((n+1)*2,sizeof(double));
   if (a == NULL) {
       mexErrMsgTxt("Sorry, failed to allocate buffer.");
   }
   a[0][0]=1.0;
  
   plhs[0] = mxCreateDoubleMatrix(n,1,0);
   hinv = mxGetPr(plhs[0]);
   hinv[0] = q[0]/r[0];
  
   alpha=r[0];
   c=0;
   d=1;
  
   for (k = 1; k < n; k++) {
       a[k][c] = 0;
       a[0][d] = 1.0;
       beta = 0.0;
       for (j = 1; j <= k; j++) {
           beta += r[k+1-j]*a[j-1][c];
       }
       beta /= alpha;
       for (j = 1; j <= k; j++) {
           a[j][d] = a[j][c] - beta*a[k-j][c];
       }
       alpha *= (1 - beta*beta);
       hinv[k] = q[k];
       for (j = 1; j <= k; j++) {
           hinv[k] -= r[k+1-j]*hinv[j-1];
       }
       hinv[k] /= alpha;
       for (j = 1; j <= k; j++) {
           hinv[j-1] += a[k+1-j][d]*hinv[k];
       }
       temp=c;
       c=d;
       d=temp;
   } /* loop over k */
  
   mxFree(a);

   return;
}





no comments on this item | add a comment | nofrills version


MATLAB-Tools for SNDAN

References : Posted by Markus Sapp
Linked file : other001.zip

Notes :
(see linkfile)



no comments on this item | add a comment | nofrills version


matlab/octave code for minblep table generation

References : Posted by dfl[at]ccrma[dot]stanford[dot]edu

Notes :
When I tested this code, it was running with each function in a separate file... so it might need some tweaking (endfunction statements?) if you try and run it all as one file.

Enjoy!

PS There's a C++ version by Daniel Werner here.
http://www.experimentalscene.com/?type=2&id=1
Not sure if it the output is any different than my version.
(eg no thresholding in minphase calculation)


Code :
% Octave/Matlab code to generate a minblep table for bandlimited synthesis
%% original minblep technique described by Eli Brandt:
%% http://www.cs.cmu.edu/~eli/L/icmc01/hardsync.html

% (c) David Lowenfels 2004
% you may use this code freely to generate your tables,
% but please send me a free copy of the software that you
% make with it, or at least send me an email to say hello
% and put my name in the software credits :)
% (IIRC: mps and clipdb functions are from Julius Smith)

% usage:
%  fc = dilation factor
%  Nzc = number of zero crossings
%  omega = oversampling factor
%  thresh = dB threshold for minimum phase calc

mbtable = minblep( fc, Nzc, omega, thresh );
mblen = length( mbtable );
save -binary mbtable.mat mbtable ktable nzc mblen;

*********************************************
function [out] = minblep( fc, Nzc, omega, thresh )

out = filter( 1, [1 -1], minblip( fc, Nzc, omega, thresh ) );

len = length( out );
normal = mean( out( floor(len*0.7):len ) )
out = out / normal; %% normalize

%% now truncate so it ends at proper phase cycle for minimum discontinuity
thresh = 1e-6;
for i = len:-1:len-1000
%  pause
  a = out(i) - thresh - 1;
    b = out(i-1) - thresh - 1;
%    i
    if(  (abs(a) < thresh) & (a > b) )
      break;
    endif
endfor

%out = out';
out = out(1:i);


*********************************************


function [out] = minblip( fc, Nzc, omega, thresh )
if (nargin < 4 )
  thresh = -100;
end
if (nargin < 3 )
  omega = 64;
end
if (nargin < 2 )
  Nzc = 16;
end
if (nargin < 1 )
  fc = 0.9;
end

blip = sinctable( omega, Nzc, fc );
%% length(blip) must be nextpow2! (if fc < 1 );

mag = fft( blip );
out = real( ifft( mps( mag, thresh ) ) );

*********************************************

function [sm] = mps(s, thresh)
% [sm] = mps(s)
% create minimum-phase spectrum sm from complex spectrum s

if (nargin < 2 )
  thresh = -100;
endif

s = clipdb(s, thresh);
sm = exp( fft( fold( ifft( log( s )))));

*********************************************
function [clipped] = clipdb(s,cutoff)
% [clipped] = clipdb(s,cutoff)
% Clip magnitude of s at its maximum + cutoff in dB.
% Example: clip(s,-100) makes sure the minimum magnitude
% of s is not more than 100dB below its maximum magnitude.
% If s is zero, nothing is done.

as = abs(s);
mas = max(as(:));
if mas==0, return; end
if cutoff >= 0, return; end
thresh = mas*10^(cutoff/20); % db to linear
toosmall = find(as < thresh);
clipped = s;
clipped(toosmall) = thresh;
*********************************************

function [out, phase] = sinctable( omega, Nzc, fc )

if (nargin < 3 )
  fc = 1.0 %% cutoff frequency
end %if
if (nargin < 2 )
  Nzc = 16  %% number of zero crossings
end %if
if (nargin < 1 )
  omega = 64 %% oversampling factor
end %if

Nzc = Nzc / fc %% This ensures more flatness at the ends.

phase = linspace( -Nzc, Nzc, Nzc*omega*2 );

%sinc = sin( pi * fc * phase) ./ (pi * fc * phase);

num = sin( pi*fc*phase );
den = pi*fc*phase;

len = length( phase );
sinc = zeros( len, 1 );

%sinc = num ./ den;

for i=1:len
  if ( den(i) ~= 0 )
    sinc(i) = num(i) / den(i);
    else
    sinc(i) = 1;
  end
end %for

out = sinc;
window = blackman( len );
out = out .* window;



no comments on this item | add a comment | nofrills version


MDCT and IMDCT based on FFTW3

Type : analysis and synthesis filterbank
References : Posted by shuhua dot zhang at gmail dot com

Notes :
MDCT/IMDCT is the most widely used filterbank in digital audio coding, e.g. MP3, AAC, WMA, OGG Vorbis, ATRAC.

suppose input x and N=size(x,1)/2. the MDCT transform matrix is
C=cos(pi/N*([0:2*N-1]'+.5+.5*N)*([0:N-1]+.5));
then MDCT spectrum for input x is
y=C'*x;
A well known fast algorithm is based on FFT :
(1) fold column-wisely the 2*N rows into N rows
(2) complex arrange the N rows into N/2 rows
(3) pre-twiddle, N/2-point complex fft, post-twiddle
(4) reorder to form the MDCT spectrum
in fact, (2)-(4) is a fast DCT-IV algorithm.

Implementation of the algorithm can be found in faac, but a little bit mess to extract for standalone use, and I ran into that problem. So I wrote some c codes to implement MDCT/IMDCT for any length that is a multiple of 4. Hopefully they will be useful to people here.

I benchmarked the codes using 3 FFT routines, FFT in faac, kiss_fft, and the awful FFTW. MDCT based on FFTW is the fastest, 2048-point MDCT single precision 10^5 times in 1.54s, about 50% of FFT in faac on my Petium IV 3G Hz.

An author of the FFTW, Steven G. Johnson, has a hard-coded fixed size MDCT of 256 input samples(http://jdj.mit.edu/~stevenj/mdct_128nr.c). My code is 13% slower than his.


Using the codes is very simple:
(1) init (declare first "extern void* mdctf_init(int)")
void* m_plan = mdctf_init(N);
(2) run mdct/imdct as many times as you wish
mdctf(freq, time, m_plan);
(3) free
mdctf_free(m_plan);

Of course you need the the fftw library. On Linux, gcc options are "-O2 -lfftw3f -lm". This is single precision.


Enjoy :)


Code :
/*********************************************************
  MDCT/IMDCT of 4x length, Single Precision, based on FFTW
  shuhua dot zhang at gmail dot com
  Dept. of E.E., Tsinghua University
*********************************************************/

#include <stdio.h>
#include <stdlib.h>
#include <math.h>
#include <fftw3.h>


typedef struct {
    int              N;              // Number of time data points
    float*           twiddle;        // Twiddle factor
    fftwf_complex*   fft_in;         // fft workspace, input
    fftwf_complex*   fft_out;        // fft workspace, output
    fftwf_plan       fft_plan;       // fft configuration
} mdctf_plan;


mdctf_plan* mdctf_init(int N);
void mdctf_free(mdctf_plan* m_plan);
void mdctf(float* mdct_line, float* time_signal, mdctf_plan* m_plan);
void imdctf(float* time_signal, float* mdct_line, mdctf_plan* m_plan);


mdctf_plan* mdctf_init(int N)
{
    mdctf_plan* m_plan;
    double alpha, omiga, scale;
    int    n;

    if( 0x00 != (N & 0x03))
    {
        fprintf(stderr, " Expecting N a multiple of 4\n");    
        return NULL;
    }

    m_plan = (mdctf_plan*) malloc(sizeof(mdctf_plan));
    
    m_plan->N = N;

    m_plan->twiddle = (float*) malloc(sizeof(float) * N >> 1);
    alpha = 2.f * M_PI / (8.f * N);
    omiga = 2.f * M_PI / N;
    scale = sqrt(sqrt(2.f / N));    
    for(n = 0; n < (N >> 2); n++)
    {    
        m_plan->twiddle[2*n+0] = (float) (scale * cos(omiga * n + alpha));
        m_plan->twiddle[2*n+1] = (float) (scale * sin(omiga * n + alpha));
    }    

    m_plan->fft_in   = (fftwf_complex*) fftwf_malloc(sizeof(fftwf_complex) * N >> 2);    
    m_plan->fft_out  = (fftwf_complex*) fftwf_malloc(sizeof(fftwf_complex) * N >> 2);    
    m_plan->fft_plan = fftwf_plan_dft_1d(N >> 2,
                                         m_plan->fft_in,
                                         m_plan->fft_out,    
                                         FFTW_FORWARD,
                                         FFTW_MEASURE);

    return m_plan;

}


void mdctf_free(mdctf_plan* m_plan)
{
    fftwf_destroy_plan(m_plan->fft_plan);
    fftwf_free(m_plan->fft_in);
    fftwf_free(m_plan->fft_out);
    free(m_plan->twiddle);
    free(m_plan);
}


void mdctf(float* mdct_line, float* time_signal, mdctf_plan* m_plan)
{
    float  *xr, *xi, r0, i0;
    float  *cos_tw, *sin_tw, c, s;
    int     N4, N2, N34, N54, n;

    N4  = (m_plan->N) >> 2;
    N2  = 2 * N4;
    N34 = 3 * N4;
    N54 = 5 * N4;

    cos_tw = m_plan->twiddle;
    sin_tw = cos_tw + 1;
    
    /* odd/even folding and pre-twiddle */
    xr = (float*) m_plan->fft_in;
    xi = xr + 1;
    for(n = 0; n < N4; n += 2)
    {
        r0 = time_signal[N34-1-n] + time_signal[N34+n];    
        i0 = time_signal[N4+n]    - time_signal[N4-1-n];    
        
        c = cos_tw[n];
        s = sin_tw[n];

        xr[n] = r0 * c + i0 * s;
        xi[n] = i0 * c - r0 * s;
    }

    for(; n < N2; n += 2)
    {
        r0 = time_signal[N34-1-n] - time_signal[-N4+n];    
        i0 = time_signal[N4+n]    + time_signal[N54-1-n];    
        
        c = cos_tw[n];
        s = sin_tw[n];

        xr[n] = r0 * c + i0 * s;
        xi[n] = i0 * c - r0 * s;
    }

    /* complex FFT of N/4 long */
    fftwf_execute(m_plan->fft_plan);

    /* post-twiddle */
    xr = (float*) m_plan->fft_out;
    xi = xr + 1;
    for(n = 0; n < N2; n += 2)
    {
        r0 = xr[n];
        i0 = xi[n];
        
        c = cos_tw[n];
        s = sin_tw[n];    

        mdct_line[n]      = - r0 * c - i0 * s;
        mdct_line[N2-1-n] = - r0 * s + i0 * c;
    }
}


void imdctf(float* time_signal, float* mdct_line, mdctf_plan* m_plan)
{
    float  *xr, *xi, r0, i0, r1, i1;
    float  *cos_tw, *sin_tw, c, s;
    int     N4, N2, N34, N54, n;

    N4  = (m_plan->N) >> 2;
    N2  = 2 * N4;
    N34 = 3 * N4;
    N54 = 5 * N4;

    cos_tw = m_plan->twiddle;
    sin_tw = cos_tw + 1;
    
       /* pre-twiddle */
    xr = (float*) m_plan->fft_in;
    xi = xr + 1;
    for(n = 0; n < N2; n += 2)
    {
        r0 =  mdct_line[n];
        i0 =  mdct_line[N2-1-n];
        
        c = cos_tw[n];
        s = sin_tw[n];    
        
        xr[n] = -2.f * (i0 * s + r0 * c);
        xi[n] = -2.f * (i0 * c - r0 * s);
    }
    
    /* complex FFT of N/4 long */
    fftwf_execute(m_plan->fft_plan);

    /* odd/even expanding and post-twiddle */
    xr = (float*) m_plan->fft_out;
    xi = xr + 1;
    for(n = 0; n < N4; n += 2)
    {
        r0 = xr[n];
        i0 = xi[n];    
        
        c = cos_tw[n];
        s = sin_tw[n];

        r1 = r0 * c + i0 * s;
        i1 = r0 * s - i0 * c;

        time_signal[N34-1-n] =  r1;
        time_signal[N34+n]   =  r1;
        time_signal[N4+n]    =  i1;
        time_signal[N4-1-n]  = -i1;
    }

    for(; n < N2; n += 2)
    {
        r0 = xr[n];
        i0 = xi[n];    
        
        c = cos_tw[n];
        s = sin_tw[n];
        
        r1 = r0 * c + i0 * s;
        i1 = r0 * s - i0 * c;
        
        time_signal[N34-1-n] =  r1;
        time_signal[-N4+n]   = -r1;
        time_signal[N4+n]    =  i1;
        time_signal[N54-1-n] =  i1;
    }
}




2 comment(s) | add a comment | nofrills version


Measuring interpollation noise

References : Posted by Jon Watte

Notes :
You can easily estimate the error by evaluating the actual function and
evaluating your interpolator at each of the mid-points between your
samples. The absolute difference between these values, over the absolute
value of the "correct" value, is your relative error. log10 of your relative
error times 20 is an estimate of your quantization noise in dB. Example:

You have a table for every 0.5 "index units". The value at index unit 72.0
is 0.995 and the value at index unit 72.5 is 0.999. The interpolated value
at index 72.25 is 0.997. Suppose the actual function value at that point was
0.998; you would have an error of 0.001 which is a relative error of 0.001002004..
log10(error) is about -2.99913, which times 20 is about -59.98. Thus, that's
your quantization noise at that position in the table. Repeat for each pair of
samples in the table.

Note: I said "quantization noise" not "aliasing noise". The aliasing noise will,
as far as I know, only happen when you start up-sampling without band-limiting
and get frequency aliasing (wrap-around), and thus is mostly independent of
what specific interpolation mechanism you're using.




no comments on this item | add a comment | nofrills version


MIDI note/frequency conversion

Type : -
References : Posted by tobybear[AT]web[DOT]de

Notes :
I get often asked about simple things like MIDI note/frequency conversion, so I thought I could as well post some source code about this.
The following is Pascal/Delphi syntax, but it shouldn't be a problem to convert it to almost any language in no time.

Uses for this code are mainly for initializing oscillators to the right frequency based upon a given MIDI note, but you might also check what MIDI note is closest to a given frequency for pitch detection etc.
In realtime applications it might be a good idea to get rid of the power and log2 calculations and generate a lookup table on initialization.

A full Pascal/Delphi unit with these functions (including lookup table generation) and a simple demo application can be downloaded here:
http://tobybear.phreque.com/dsp_conv.zip

If you have any comments/suggestions, please send them to: tobybear@web.de


Code :
// MIDI NOTE/FREQUENCY CONVERSIONS

const notes:array[0..11] of string= ('C ','C#','D ','D#','E ','F ','F#','G ','G#','A ','A#','B ');
const base_a4=440; // set A4=440Hz

// converts from MIDI note number to frequency
// example: NoteToFrequency(12)=32.703
function NoteToFrequency(n:integer):double;
begin
if (n>=0)and(n<=119) then
  result:=base_a4*power(2,(n-57)/12)
else result:=-1;
end;

// converts from MIDI note number to string
// example: NoteToName(12)='C 1'
function NoteToName(n:integer):string;
begin
if (n>=0)and(n<=119) then
  result:=notes[n mod 12]+inttostr(n div 12)
else result:='---';
end;

// converts from frequency to closest MIDI note
// example: FrequencyToNote(443)=57 (A 4)
function FrequencyToNote(f:double):integer;
begin
result:=round(12*log2(f/base_a4))+57;
end;

// converts from string to MIDI note
// example: NameToNote('A4')=57
function NameToNote(s:string):integer;
var c,i:integer;
begin
if length(s)=2 then s:=s[1]+' '+s[2];
if length(s)<>3 then begin result:=-1;exit end;
s:=uppercase(s);
c:=-1;
for i:=0 to 11 do
  if notes[i]=copy(s,1,2) then
  begin
   c:=i;
   break
  end;
try
  i:=strtoint(s[3]);
  result:=i*12+c;
except
  result:=-1;
end;
if c<0 then result:=-1;
end;


3 comment(s) | add a comment | nofrills version


Millimeter to DB (faders...)

References : Posted by James McCartney

Notes :
These two functions reproduce a traditional professional
mixer fader taper.

MMtoDB converts millimeters of fader travel from the
bottom of the fader for a 100 millimeter fader into
decibels. DBtoMM is the inverse.

The taper is as follows from the top:
The top of the fader is +10 dB
100 mm to 52 mm : -5 dB per 12 mm
52 mm to 16 mm : -10 dB per 12 mm
16 mm to 4 mm : -20 dB per 12 mm
4 mm to 0 mm : fade to zero. (in these functions I go to -200dB
which is effectively zero for up to 32 bit audio.)


Code :
float MMtoDB(float mm)
{
        float db;

        mm = 100. - mm;

        if (mm <= 0.) {
                db = 10.;
        } else if (mm < 48.) {
                db = 10. - 5./12. * mm;
        } else if (mm < 84.) {
                db = -10. - 10./12. * (mm - 48.);
        } else if (mm < 96.) {
                db = -40. - 20./12. * (mm - 84.);
        } else if (mm < 100.) {
                db = -60. - 35. * (mm - 96.);
        } else db = -200.;
        return db;
}

float DBtoMM(float db)
{
        float mm;
        if (db >= 10.) {
                mm = 0.;
        } else if (db > -10.) {
                mm = -12./5. * (db - 10.);
        } else if (db > -40.) {
                mm = 48. - 12./10. * (db + 10.);
        } else if (db > -60.) {
                mm = 84. - 12./20. * (db + 40.);
        } else if (db > -200.) {
                mm = 96. - 1./35. * (db + 60.);
        } else mm = 100.;

        mm = 100. - mm;

        return mm;
}


2 comment(s) | add a comment | nofrills version


Moog Filter

Type : Antti's version (nonlinearities)
References : Posted by Christian[at]savioursofsoul[dot]de

Notes :
Here is a Delphi/Object Pascal translation of Antti's Moog Filter.

Antti wrote:

"At last DAFX I published a paper presenting a non-linear model of the Moog ladder. For that, see http://dafx04.na.infn.it/WebProc/Proc/P_061.pdf

I used quite different approach in that one. A half-sample delay ([0.5 0.5] FIR filter basically) is inserted in the feedback loop. The remaining tuning and resonance error are corrected with polynomials. This approach depends on using at least 2X oversampling - the response after nyquist/2 is abysmal but that's taken care of by the oversampling.

Victor Lazzarini has implemented my model in CSound:
http://www.csounds.com/udo/displayOpcode.php?opcode_id=32

In summary: You can use various methods, but you will need some numerically derived correction to realize exact tuning and resonance control. If you can afford 2X oversampling, use Victor's CSound code - the tuning has been tested to be very close ideal.

Ps. Remember to use real oversampling instead of the "double sampling" the CSound code uses."

I did not implemented real oversampling, but i inserted additional noise, which simulates the resistance noise and also avoids denormal problems...


Code :
http://www.savioursofsoul.de/Christian/MoogFilter.pas

14 comment(s) | add a comment | nofrills version


Moog VCF

Type : 24db resonant lowpass
References : CSound source code, Stilson/Smith CCRMA paper.

Notes :
Digital approximation of Moog VCF. Fairly easy to calculate coefficients, fairly easy to process algorithm, good sound.

Code :
//Init
cutoff = cutoff freq in Hz
fs = sampling frequency //(e.g. 44100Hz)
res = resonance [0 - 1] //(minimum - maximum)

f = 2 * cutoff / fs; //[0 - 1]
k = 3.6*f - 1.6*f*f -1; //(Empirical tunning)
p = (k+1)*0.5;
scale = e^((1-p)*1.386249;
r = res*scale;
y4 = output;

y1=y2=y3=y4=oldx=oldy1=oldy2=oldy3=0;

//Loop
//--Inverted feed back for corner peaking
x = input - r*y4;

//Four cascaded onepole filters (bilinear transform)
y1=x*p + oldx*p - k*y1;
y2=y1*p+oldy1*p - k*y2;
y3=y2*p+oldy2*p - k*y3;
y4=y3*p+oldy3*p - k*y4;

//Clipper band limited sigmoid
y4 = y4 - (y4^3)/6;

oldx = x;
oldy1 = y1;
oldy2 = y2;
oldy3 = y3;


12 comment(s) | add a comment | nofrills version


Moog VCF, variation 1

Type : 24db resonant lowpass
References : CSound source code, Stilson/Smith CCRMA paper., Paul Kellett version

Notes :
The second "q =" line previously used exp() - I'm not sure if what I've done is any faster, but this line needs playing with anyway as it controls which frequencies will self-oscillate. I
think it could be tweaked to sound better than it currently does.

Highpass / Bandpass :

They are only 6dB/oct, but still seem musically useful - the 'fruity' sound of the 24dB/oct lowpass is retained.


Code :
// Moog 24 dB/oct resonant lowpass VCF
// References: CSound source code, Stilson/Smith CCRMA paper.
// Modified by paul.kellett@maxim.abel.co.uk July 2000

  float f, p, q;             //filter coefficients
  float b0, b1, b2, b3, b4;  //filter buffers (beware denormals!)
  float t1, t2;              //temporary buffers

// Set coefficients given frequency & resonance [0.0...1.0]

  q = 1.0f - frequency;
  p = frequency + 0.8f * frequency * q;
  f = p + p - 1.0f;
  q = resonance * (1.0f + 0.5f * q * (1.0f - q + 5.6f * q * q));

// Filter (in [-1.0...+1.0])

  in -= q * b4;                          //feedback
  t1 = b1;  b1 = (in + b0) * p - b1 * f;
  t2 = b2;  b2 = (b1 + t1) * p - b2 * f;
  t1 = b3;  b3 = (b2 + t2) * p - b3 * f;
            b4 = (b3 + t1) * p - b4 * f;
  b4 = b4 - b4 * b4 * b4 * 0.166667f;    //clipping
  b0 = in;

// Lowpass  output:  b4
// Highpass output:  in - b4;
// Bandpass output:  3.0f * (b3 - b4);


7 comment(s) | add a comment | nofrills version


Moog VCF, variation 2

Type : 24db resonant lowpass
References : CSound source code, Stilson/Smith CCRMA paper., Timo Tossavainen (?) version

Notes :
in[x] and out[x] are member variables, init to 0.0 the controls:

fc = cutoff, nearly linear [0,1] -> [0, fs/2]
res = resonance [0, 4] -> [no resonance, self-oscillation]


Code :
Tdouble MoogVCF::run(double input, double fc, double res)
{
  double f = fc * 1.16;
  double fb = res * (1.0 - 0.15 * f * f);
  input -= out4 * fb;
  input *= 0.35013 * (f*f)*(f*f);
  out1 = input + 0.3 * in1 + (1 - f) * out1; // Pole 1
  in1  = input;
  out2 = out1 + 0.3 * in2 + (1 - f) * out2;  // Pole 2
  in2  = out1;
  out3 = out2 + 0.3 * in3 + (1 - f) * out3;  // Pole 3
  in3  = out2;
  out4 = out3 + 0.3 * in4 + (1 - f) * out4;  // Pole 4
  in4  = out3;
  return out4;
}


11 comment(s) | add a comment | nofrills version


Most simple and smooth feedback delay

Type : Feedback delay
References : Posted by antiprosynthesis[AT]hotmail[DOT]com

Notes :
fDlyTime = delay time parameter (0-1)

i = input index
j = delay index


Code :
if( i >= SampleRate )
    i = 0;

j = i - (fDlyTime * SampleRate);

if( j < 0 )
    j += SampleRate;

Output = DlyBuffer[ i++ ] = Input + (DlyBuffer[ j ] * fFeedback);


4 comment(s) | add a comment | nofrills version


Most simple static delay

Type : Static delay
References : Posted by antiprosynthesis[AT]hotmail[DOT]com

Notes :
This is the most simple static delay (just delays the input sound an amount of samples). Very useful for newbies also probably very easy to change in a feedback delay (for comb filters for example).

Note: fDlyTime is the delay time parameter (0 to 1)

i = input index
j = output index


Code :
if( i >= SampleRate )
    i = 0;

DlyBuffer[ i ] = Input;

j = i - (fDlyTime * SampleRate);

i++;

if( j < 0 )
    j = SampleRate + j;

Output = DlyBuffer[ j ];


9 comment(s) | add a comment | nofrills version


Motorola 56300 Disassembler

Type : disassembler
References : Posted by chris_townsend[AT]digidesign[DOT]com
Linked file : Disassemble56k.zip

Notes :
This code contains functions to disassemble Motorola 56k opcodes. The code was originally created by Stephen Davis at Stanford. I made minor modifications to support many 56300 opcodes, although it would nice to add them all at some point. Specifically, I added support for CLB, NORMF, immediate AND, immediate OR, multi-bit ASR/ASL, multi-bit LSL/LSR, MAX, MAXM, BRA, Bcc, BSR, BScc, DMAC, MACsu, MACuu, and conditional ALU instructions.



3 comment(s) | add a comment | nofrills version


Noise Shaping Class

Type : Dithering with 9th order noise shaping
References : Posted by cshei[AT]indiana.edu
Linked file : NS9dither16.h

Notes :
This is an implemetation of a 9th order noise shaping & dithering class, that runs quite fast (it has one function that uses Intel x86 assembly, but you can replace it with a different rounding function if you are running on a non-Intel platform). _aligned_malloc and _aligned_free require the MSVC++ Processor Pack, available from www.microsoft.com. You can replace them with "new" and "delete," but allocating aligned memory seems to make it run faster. Also, you can replace ZeroMemory with a memset that sets the memory to 0 if you aren't using Win32.
Input should be floats from -32768 to 32767 (processS will clip at these points for you, but clipping is bad when you are trying to convert floats to shorts). Note to reviewer - it would probably be better if you put the code in a file such as NSDither.h and have a link to it - it's rather long.

(see linked file)




2 comment(s) | add a comment | nofrills version


Nonblocking multiprocessor/multithread algorithms in C++

Type : queue, stack, garbage collection, memory allocation, templates for atomic algorithms and types
References : Posted by joshscholar[AT]yahoo[DOT]com
Linked file : ATOMIC.H

Notes :
see linked file...



2 comment(s) | add a comment | nofrills version


Notch filter

Type : 2 poles 2 zeros IIR
References : Posted by Olli Niemitalo

Notes :
Creates a muted spot in the spectrum with adjustable steepness. A complex conjugate pair of zeros on the z- plane unit circle and neutralizing poles approaching at the same angles from inside the unit circle.

Code :
Parameters:
0 =< freq =< samplerate/2
0 =< q < 1 (The higher, the narrower)

AlgoAlgo=double pi = 3.141592654;
double sqrt2 = sqrt(2.0);

double freq = 2050; // Change! (zero & pole angle)
double q = 0.4;     // Change! (pole magnitude)

double z1x = cos(2*pi*freq/samplerate);
double a0a2 = (1-q)*(1-q)/(2*(fabs(z1x)+1)) + q;
double a1 = -2*z1x*a0a2;
double b1 = -2*z1x*q;
double b2 = q*q;
double reg0, reg1, reg2;

unsigned int streamofs;
reg1 = 0;
reg2 = 0;

/* Main loop */
for (streamofs = 0; streamofs < streamsize; streamofs++)
{
  reg0 = a0a2 * ((double)fromstream[streamofs]
                 + fromstream[streamofs+2])
       + a1 * fromstream[streamofs+1]
       - b1 * reg1
       - b2 * reg2;

  reg2 = reg1;
  reg1 = reg0;

  int temp = reg0;

  /* Check clipping */
  if (temp > 32767) {
    temp = 32767;
  } else if (temp < -32768) temp = -32768;

  /* Store new value */
  tostream[streamofs] = temp;
}


6 comment(s) | add a comment | nofrills version


One pole filter, LP and HP

Type : Simple 1 pole LP and HP filter
References : Posted by scoofy[AT]inf[DOT]elte[DOT]hu

Notes :
Slope: 6dB/Oct

Reference: www.dspguide.com


Code :
Process loop (lowpass):
out = a0*in - b1*tmp;
tmp = out;

Simple HP version: subtract lowpass output from the input (has strange behaviour towards nyquist):
out = a0*in - b1*tmp;
tmp = out;
hp = in-out;

Coefficient calculation:
x = exp(-2.0*pi*freq/samplerate);
a0 = 1.0-x;
b1 = -x;






6 comment(s) | add a comment | nofrills version


One pole LP and HP

References : Posted by Bram
Code :
LP:
recursion: tmp = (1-p)*in + p*tmp with output = tmp
coefficient: p = (2-cos(x)) - sqrt((2-cos(x))^2 - 1) with x = 2*pi*cutoff/samplerate
coeficient approximation: p = (1 - 2*cutoff/samplerate)^2

HP:
recursion: tmp = (p-1)*in - p*tmp with output = tmp
coefficient: p = (2+cos(x)) - sqrt((2+cos(x))^2 - 1) with x = 2*pi*cutoff/samplerate
coeficient approximation: p = (2*cutoff/samplerate)^2


4 comment(s) | add a comment | nofrills version


One pole, one zero LP/HP

References : Posted by mistert[AT]inwind[DOT]it
Code :
void SetLPF(float fCut, float fSampling)
{
    float w = 2.0 * fSampling;
    float Norm;

    fCut *= 2.0F * PI;
    Norm = 1.0 / (fCut + w);
    b1 = (w - fCut) * Norm;
    a0 = a1 = fCut * Norm;
}

void SetHPF(float fCut, float fSampling)
{
    float w = 2.0 * fSampling;
    float Norm;

    fCut *= 2.0F * PI;
    Norm = 1.0 / (fCut + w);
    a0 = w * Norm;
    a1 = -a0;
    b1 = (w - fCut) * Norm;
}

Where
out[n] = in[n]*a0 + in[n-1]*a1 + out[n-1]*b1;


6 comment(s) | add a comment | nofrills version


One zero, LP/HP

References : Posted by Bram

Notes :
LP is only 'valid' for cutoffs > samplerate/4
HP is only 'valid' for cutoffs < samplerate/4


Code :
theta = cutoff*2*pi / samplerate

LP:
H(z) = (1+p*z^(-1)) / (1+p)
out[i] = 1/(1+p) * in[i] + p/(1+p) * in[i-1];
p = (1-2*cos(theta)) - sqrt((1-2*cos(theta))^2 - 1)
Pi/2 < theta < Pi

HP:
H(z) = (1-p*z^(-1)) / (1+p)
out[i] = 1/(1+p) * in[i] - p/(1+p) * in[i-1];
p = (1+2*cos(theta)) - sqrt((1+2*cos(theta))^2 - 1)
0 < theta < Pi/2


5 comment(s) | add a comment | nofrills version


One-Liner IIR Filters (1st order)

Type : IIR 1-pole
References : Posted by chris at ariescode dot com

Notes :
Here is a collection of one liner IIR filters.
Each filter has been transformed into a single C++ expression.

The filter parameter is f or g, and the state variable that needs to be kept around between interations is s.

- Christian


Code :
    101 Leaky Integrator

        a0 = 1
        b1 = 1 - f

        out = s += in - f * s;


    102 Basic Lowpass (all-pole)

        A first order lowpass filter, by finite difference appoximation (differentials --> differences).

        a0 = f
    b1 = 1 - f

        out = s += f * ( in - s );


    103 Lowpass with inverted control

    Same as above, except for different filter parameter is now inverted.
        In this case, g equals the location of the pole.

        a0 = g - 1
    b1 = g

        out = s = in + g * ( s - in );


    104 Lowpass with zero at Nyquist

        A first order lowpass filter, by via the conformal map of the z-plane (0..infinity --> 0..Nyquist).

        a0 = f
        a1 = f
        b1 = 1 - 2 * f

    s = temp + ( out = s + ( temp = f * ( in - s ) ) );


    105 Basic Highpass (DC-blocker)

        Input complement to basic lowpass, yields a finite difference highpass filter.

        a0 = 1 - f
        a1 = f - 1
        b1 = 1 - f

        out = in - ( s += f * ( in - s ) );


    106 Highpass with forced unity gain at Nyquist

        Input complement to filter 104, yields a conformal map highpass filter.

        a0 = 1 - f
        a1 = f - 1
        b1 = 1 - 2 * f

        out = in + temp - ( s += 2 * ( temp = f * ( in - s ) ) );


    107 Basic Allpass

        This corresponds to a first order allpass filter,
        where g is the location of the pole in the range -1..1.

        a0 = -g
        a1 = 1
        b1 = g

    s = in + g * ( out = s - g * in );






1 comment(s) | add a comment | nofrills version


Output Limiter using Envelope Follower in C++

References : Posted by thevinn at yahoo dot com

Notes :
Here's a Limiter class that will automatically compress a signal if it would cause clipping.

You can control the attack and decay parameters of the limiter. The attack determines how quickly the limiter will respond to a sudden increase in output level. I have found that attack=10ms and decay=500ms works very well for my application.

This C++ example demonstrates the use of template parameters to allow the same piece of code to work with either floats or doubles (without needing to make a duplicate of the code). As well as allowing the same code to work with interleaved audio data (any number of channels) or linear, via the "skip" parameter. Note that even in this case, the compiler produces fully optimized output in the case where the template is instantiated for a compile-time constant value of skip.

In Limiter::Process() you can see the envelope class getting called for one sample, this shows how even calling a function for a single sample can get fully optimized out by the compiler if code is structured correctly.

While this is a fairly simple algorithm, I wanted to share the technique for using template parameters to develop routines that can work with any size floating point representation or multichannel audio data, while still remaining fully optimized.

These classes were based on ideas found in the musicdsp.org archives.


Code :
class EnvelopeFollower
{
public:
    EnvelopeFollower();

    void Setup( double attackMs, double releaseMs, int sampleRate );

    template<class T, int skip>
    void Process( size_t count, const T *src );

    double envelope;

protected:
    double a;
    double r;
};

//----------

inline EnvelopeFollower::EnvelopeFollower()
{
    envelope=0;
}

inline void EnvelopeFollower::Setup( double attackMs, double releaseMs, int sampleRate )
{
    a = pow( 0.01, 1.0 / ( attackMs * sampleRate * 0.001 ) );
    r = pow( 0.01, 1.0 / ( releaseMs * sampleRate * 0.001 ) );
}

template<class T, int skip>
void EnvelopeFollower::Process( size_t count, const T *src )
{
    while( count-- )
    {
        double v=::fabs( *src );
        src+=skip;
        if( v>envelope )
            envelope = a * ( envelope - v ) + v;
        else
            envelope = r * ( envelope - v ) + v;
    }
}

//----------

struct Limiter
{
    void Setup( double attackMs, double releaseMs, int sampleRate );

    template<class T, int skip>
    void Process( size_t nSamples, T *dest );

private:
    EnvelopeFollower e;
};

//----------

inline void Limiter::Setup( double attackMs, double releaseMs, int sampleRate )
{
    e.Setup( attackMs, releaseMs, sampleRate );
}

template<class T, int skip>
void Limiter::Process( size_t count, T *dest )
{
    while( count-- )
    {
        T v=*dest;
        // don't worry, this should get optimized
        e.Process<T, skip>( 1, &v );
        if( e.envelope>1 )
            *dest=*dest/e.envelope;
        dest+=skip;
    }
}


no comments on this item | add a comment | nofrills version


PADsynth synthesys method

Type : wavetable generation
References : Posted by zynaddsubfx[at]yahoo[dot]com

Notes :
Please see the full description of the algorithm with public domain c++ code here:
http://zynaddsubfx.sourceforge.net/doc/PADsynth/PADsynth.htm


Code :
It's here:
http://zynaddsubfx.sourceforge.net/doc/PADsynth/PADsynth.htm
You may copy it (everything is public domain).
Paul


3 comment(s) | add a comment | nofrills version


Parabolic shaper

References : Posted by azertopia at free dot fr

Notes :
This function can be used for oscillators or shaper.
it can be driven by a phase accumulator or an audio input.


Code :
Function Parashape(inp:single):single;
var fgh,tgh:single;
begin
fgh    := inp ;
fgh    := 0.25-f_abs(fgh) ;
tgh    := fgh ;
tgh    := 1-2*f_abs(tgh);
fgh    := fgh*8;
result := fgh*tgh ;
end;
// f_abs is the function of ddsputils unit.


no comments on this item | add a comment | nofrills version


Parallel combs delay calculation

References : Posted by Juhana Sadeharju ( kouhia[AT]nic[DOT]funet[DOT]fi )

Notes :
This formula can be found from a patent related to parallel combs structure. The formula places the first echoes coming out of parallel combs to uniformly distributed sequence. If T_ ,...,T_n are the delay lines in increasing order, the formula can be derived by setting T_(k-1)/T_k = Constant and T_n/(2*T_1) = Constant, where 2*T_1 is the echo coming just after the echo T_n.
I figured this out myself as it is not told in the patent. The formula is not the best which one can come up. I use a search method to find echo sequences which are uniform enough for long enough time. The formula is uniform for a short time only.
The formula doesn't work good for series allpass and FDN structures, for which a similar formula can be derived with the same idea. The search method works for these structures as well.




no comments on this item | add a comment | nofrills version


Peak/Notch filter

Type : peak/notch
References : Posted by tobybear[AT]web[DOT]de

Notes :
// Peak/Notch filter
// I don't know anymore where this came from, just found it on
// my hard drive :-)
// Seems to be a peak/notch filter with adjustable slope
// steepness, though slope gets rather wide the lower the
// frequency is.
// "cut" and "steep" range is from 0..1
// Try to feed it with white noise, then the peak output does
// rather well eliminate all other frequencies except the given
// frequency in higher frequency ranges.


Code :
var f,r:single;
    outp,outp1,outp2:single; // init these with 0!
const p4=1.0e-24; // Pentium 4 denormal problem elimination

function PeakNotch(inp,cut,steep:single;ftype:integer):single;
begin
r:=steep*0.99609375;
f:=cos(pi*cut);
a0:=(1-r)*sqrt(r*(r-4*(f*f)+2)+1);
b1:=2*f*r;
b2:=-(r*r);
outp:=a0*inp+b1*outp1+b2*outp2+p4;
outp2:=outp1;
outp1:=outp;
if ftype=0 then
  result:=outp //peak
else
  result:=inp-outp; //notch
end;


2 comment(s) | add a comment | nofrills version


Perfect LP4 filter

Type : LP
References : Posted by azertopia at free dot fr

Notes :
hacked from the exemple of user script in FL Edison

Code :
TLP24DB = class
constructor create;
procedure process(inp,Frq,Res:single;SR:integer);
private
t, t2, x, f, k, p, r, y1, y2, y3, y4, oldx, oldy1, oldy2, oldy3: Single;
public outlp:single;
end;
----------------------------------------
implementation

constructor TLP24DB.create;
begin
  y1:=0;
  y2:=0;
  y3:=0;
  y4:=0;
  oldx:=0;
  oldy1:=0;
  oldy2:=0;
  oldy3:=0;
end;
procedure TLP24DB.process(inp: Single; Frq: Single; Res: Single; SR: Integer);
begin
  f := (Frq+Frq) / SR;
  p:=f*(1.8-0.8*f);
  k:=p+p-1.0;
  t:=(1.0-p)*1.386249;
  t2:=12.0+t*t;
  r := res*(t2+6.0*t)/(t2-6.0*t);
  x := inp - r*y4;
  y1:=x*p + oldx*p - k*y1;
  y2:=y1*p+oldy1*p - k*y2;
  y3:=y2*p+oldy2*p - k*y3;
  y4:=y3*p+oldy3*p - k*y4;
  y4 := y4 - ((y4*y4*y4)/6.0);
  oldx := x;
  oldy1 := y1+_kd;
  oldy2 := y2+_kd;;
  oldy3 := y3+_kd;;
  outlp := y4;
end;

// the result is in outlp
// 1/ call MyTLP24DB.Process
// 2/then get the result from outlp.
// this filter have a fantastic sound w/a very special res
// _kd is the denormal killer value.


2 comment(s) | add a comment | nofrills version


Phase equalization

Type : Allpass
References : Posted by Uli the Grasso

Notes :
The idea is simple: One can equalize the phase response of a system, for example of a loudspeaker, by approximating its phase response by an FIR filter and then turn around the coefficients of the filter. At http://grassomusic.de/english/phaseeq.htm you find more info and an Octave script.



5 comment(s) | add a comment | nofrills version


Phase modulation Vs. Frequency modulation

References : Posted by Bram
Linked file : SimpleOscillator.h

Notes :
This code shows what the difference is betwee FM and PM.
The code is NOT optimised, nor should it be used like this.
It is an <b>EXAMPLE</b>

See linked file.




no comments on this item | add a comment | nofrills version


Phase modulation Vs. Frequency modulation II

References : Posted by James McCartney

Notes :
The difference between FM & PM in a digital oscillator is that FM is added to the frequency before the phase integration, while PM is added to the phase after the phase integration. Phase integration is when the old phase for the oscillator is added to the current frequency (in radians per sample) to get the new phase for the oscillator. The equivalent PM modulator to obtain the same waveform as FM is the integral of the FM modulator. Since the integral of sine waves are inverted cosine waves this is no problem. In modulators with multiple partials, the equivalent PM modulator will have different relative partial amplitudes. For example, the integral of a square wave is a triangle wave; they have the same harmonic content, but the relative partial amplitudes are different. These differences make no difference since we are not trying to exactly recreate FM, but real (or nonreal) instruments.

The reason PM is better is because in PM and FM there can be non-zero energy produced at 0 Hz, which in FM will produce a shift in pitch if the FM wave is used again as a modulator, however in PM the DC component will only produce a phase shift. Another reason PM is better is that the modulation index (which determines the number of sidebands produced and which in normal FM is calculated as the modulator amplitude divided by frequency of modulator) is not dependant on the frequency of the modulator, it is always equal to the amplitude of the modulator in radians. The benefit of solving the DC frequency shift problem, is that cascaded carrier-modulator pairs and feedback modulation are possible. The simpler calculation of modulation index makes it easier to have voices keep the same harmonic structure throughout all pitches.

The basic mathematics of phase modulation are available in any text on electronic communication theory.

Below is some C code for a digital oscillator that implements FM,PM,and AM. It illustrates the difference in implementation of FM & PM. It is only meant as an example, and not as an efficient implementation.


Code :
/* Example implementation of digital oscillator with FM, PM, & AM */

#define PI 3.14159265358979
#define RADIANS_TO_INDEX (512.0 / (2.0 * PI))

typedef struct{    /* oscillator data */
    double freq;   /* oscillator frequency in radians per sample */
    double phase;  /* accumulated oscillator phase in radians */
    double wavetable[512]; /* waveform lookup table */
} OscilRec;


/* oscil - compute 1 sample of oscillator output whose freq. phase and
*    wavetable are in the OscilRec structure pointed to by orec.
*/
double oscil(orec, fm, pm, am)
    OscilRec *orec;  /* pointer to the oscil's data */
    double fm; /* frequency modulation input  in radians per sample */
    double pm; /* phase modulation input      in radians */
    double am; /* amplitude modulation input  in any units you want */
{
    long tableindex;            /* index into wavetable */
    double instantaneous_freq;  /* oscillator freq  + freq  modulation */
    double instantaneous_phase; /* oscillator phase + phase modulation */
    double output;              /* oscillator output */
    
    instantaneous_freq  = orec->freq  + fm; /* get instantaneous freq */
    orec->phase += instantaneous_freq;      /* accumulate phase */
    instantaneous_phase = orec->phase + pm; /* get instantaneous phase */
    
    /* convert to lookup table index */
    tableindex = RADIANS_TO_INDEX * instantaneous_phase;
    tableindex &= 511; /* make it mod 512 === eliminate multiples of 2*k*PI */
    
    output = orec->wavetable[tableindex] * am; /* lookup and mult by am input */
    
    return (output);  /* return oscillator output */
}


1 comment(s) | add a comment | nofrills version


Phaser code

References : Posted by Ross Bencina
Linked file : phaser.cpp

Notes :
(see linked file)



10 comment(s) | add a comment | nofrills version


Piecewise quadratic approximate exponential function

Type : Approximation of base-2 exponential function
References : Posted by Johannes M.-R.

Notes :
The code assumes round-to-zero mode, and ieee 754 float.
To achieve other bases, multiply the input by the logarithmus dualis of the base.


Code :
inline float fpow2(const float y)
{
    union
    {
        float f;
        int i;
    } c;

    int integer = (int)y;
    if(y < 0)
        integer = integer-1;

    float frac = y - (float)integer;

    c.i = (integer+127) << 23;
    c.f *= 0.33977f*frac*frac + (1.0f-0.33977f)*frac + 1.0f;

    return c.f;
}


no comments on this item | add a comment | nofrills version


Pink noise filter

References : Posted by Paul Kellett
Linked file : pink.txt

Notes :