Thursday, April 5, 2018

frequency - Why should one use windowing functions for FFT?



So I just revised my pitch calculation algorithm using a harmonic product spectrum algorithm. I was just curious about why this explanation of Harmonic Product Spectrum states that you need to implement a Hanning Window to the data set. What would be the effect of implementing other Window functions on a data set (and then FFTing it)? Which Windowing function is actually the best for frequency detection? Here are the relevant methods I have used in my code:


/**
* Calculates the Frequency based off of the byte array,
* @param bytes The audioData you want to analyze
* @return The calculated frequency in Hertz.
*/
private int getFrequency(byte[] bytes){
double[] audioData = this.bytesToDoubleArray(bytes);
audioData = applyHanningWindow(audioData);
Complex[] complex = new Complex[audioData.length];

for(int i = 0; i complex[i] = new Complex(audioData[i], 0);
}
Complex[] fftTransformed = FFT.fft(complex);
//return calculateFrequency(fftTransformed);
System.out.println("Max size:" + (fftTransformed.length*getFFTBinSize(fftTransformed.length)/4));
return calculateFundamentalFrequency(fftTransformed,4);
}

private double[] applyHanningWindow(double[] data){

return applyHanningWindow(data, 0, data.length);
}

private double[] applyHanningWindow(double[] signal_in, int pos, int size)
{
for (int i = pos; i < pos + size; i++)
{
int j = i - pos; // j = index into Hann window function
signal_in[i] = (double)(signal_in[i] * 0.5 * (1.0 - Math.cos(2.0 * Math.PI * j / size)));
}

return signal_in;
}


/**
* Harmonic Product Spectrum
* @param fftData
* @param n
* @return
*/

private int calculateFundamentalFrequency(Complex[] fftData, int n){
Complex[][] data = new Complex[n][fftData.length/n];
for(int i = 0; i for(int j = 0; j data[i][j] = fftData[j*(i+1)];
}
}
Complex[] result = new Complex[fftData.length/n];//Combines the arrays
for(int i = 0; i Complex tmp = new Complex(1,0);

for(int j = 0; j tmp = tmp.times(data[j][i]);
}
result[i] = tmp;
}
//Calculates Maximum Magnitude of the array
double max = Double.MIN_VALUE;
int index = -1;
for(int i = 0; i Complex c = result[i];

double tmp = c.getMagnitude();
if(tmp>max){
max = tmp;;
index = i;
}
}
return index*getFFTBinSize(fftData.length);
}

Answer



The FFT can only be performed over a limited chunk of data. The basic math is based on the assumption that the time domain signal is periodic, i.e. your chunk of data is repeated in time. That typically results in a major discontinuity at the edges of the chunk. Let's look at a quick example: FFT size = 1000 points, Sample Rate = 1000 Hz, Frequency resolution = 1Hz. If you have a 10 Hz sine wave you don't have a discontinuity since exactly 10 periods fit into your FFT window and the values (and derivatives) at the edges are the same. The FFT of this siganl will be zero except for a single value in bin #10. This also works for a 11 Hz sine wave just as well. However, for 10.3 HZ sine wave you end up with a lot of discontinuity and the FFT will have energy in all bins with maximum around 10 or 11 and then "skirts" that roll trail off to the sides. So a small change in frequency results in a massive change in the FFT picture.



Windowing is used to avoid this: Windows make sure that the data at the edges are zero, so there is no discontinuity. However multiplication in the time domain is convolution in the frequency domain and that results in widening of spectral lines and also in side lobes. Teh choice of window controls the trade offs between main lobe width and side lobe spacing and height. Your application specific requirements determine what window to use and there are dozens of choices. Hanning is just one of them. It's basically "the window of choice if you don't have any better ideas". Personally, I prefer Kaiser windows as they have a continuous parameter that can control the window behavior over a wide range.


In general FFT is not a great method for pitch detection. For most audio signals the maximum in the spectrum is NOT the fundamental (typically harmonics have higher energy), in order to get decent resolution you need a long pieces of data but that makes the algorithm very slow and sluggish to respond to changes. Much better options are phase looked loops, delay looked loops, autocorrelations, max/min tracker, zero crossing tracker, etc.


No comments:

Post a Comment

digital communications - Understanding the Matched Filter

I have a question about matched filtering. Does the matched filter maximise the SNR at the moment of decision only? As far as I understand, ...