I need to reconstruct the envelope of a sound.
Audio data are professionally-recorded natural sounds (speech, bird songs) with very little noise. I would prefer working in the time domain rather than in the frequency domain (I've seen some algorithms based on FFT transformations that looked overcomplicated for what I need). The algorithm will be implemented in an interpreted language so it needs to stay "light" in computation.
As a first approach, I considered using a peak detection algorithm, then doing a linear interpolation between the peaks. But aren't there some pitfalls with such a naive approach? Are there standard ways of implementing envelope reconstruction in the time domain that would better suit my needs?
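To make that concrete, here is roughly what I had in mind, written as C-style pseudocode purely for illustration (the real implementation will be in an interpreted language, and all the names below are made up):

#include <math.h>   /* fabsf */

/* Peak picking on the rectified signal, then linear interpolation
   between consecutive peaks. */
void naive_envelope(const float *x, float *env, int N)
{
    int prev = 0;                       /* index of the last peak found */
    env[0] = fabsf(x[0]);
    for (int i = 1; i < N - 1; i++) {
        float a = fabsf(x[i]);
        /* treat a sample as a peak if it dominates both neighbours */
        if (a >= fabsf(x[i - 1]) && a >= fabsf(x[i + 1])) {
            for (int k = prev; k <= i; k++) {   /* interpolate prev..i */
                float t = (float)(k - prev) / (float)(i - prev);
                env[k] = (1.0f - t) * env[prev] + t * a;
            }
            prev = i;
        }
    }
    for (int k = prev; k < N; k++)      /* hold the last peak value to the end */
        env[k] = env[prev];
}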
FWIW, I'm not familiar with digital signal processing vocabulary, so do not hesitate to reword my question if I misused some terms.
Answer
As I said in a comment, you can get the envelope of a signal by running it through a lowpass filter.
The steps usually required for this are:
Go through all the samples x[n] and take the absolute value of each one (i.e. rectify the signal, converting negative samples to positive values).
Implement a lowpass FIR filter by creating a filter kernel h of appropriate length M. Note that the ideal lowpass kernel (a sin(x)/x, or sinc, shape) should normally be multiplied by a window function (e.g. Hamming, Blackman, etc.); a sketch of this appears after the convolution loop below.
Convolve your signal with the filter kernel:
for (int i = 0; i < N; i++)        /* for every input sample */
    for (int j = 0; j < M; j++)    /* for every kernel tap   */
        y[i + j] = y[i + j] + x[i] * h[j];
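For the kernel itself (step 2), here is a minimal sketch assuming a Hamming window and a cutoff fc given as a fraction of the sample rate; the function name and constants are only illustrative:

#include <math.h>

/* M-tap windowed-sinc lowpass kernel (Hamming window).
 * fc is the cutoff as a fraction of the sample rate (0 < fc < 0.5);
 * M should be odd so the kernel is symmetric about its centre tap. */
void make_lowpass_kernel(float h[], int M, float fc)
{
    const float pi = 3.14159265f;
    int mid = (M - 1) / 2;
    float sum = 0.0f;
    for (int j = 0; j < M; j++) {
        int n = j - mid;
        /* ideal lowpass impulse response sin(2*pi*fc*n)/(pi*n),
           taking the limit 2*fc at n = 0 */
        float ideal = (n == 0) ? 2.0f * fc
                               : sinf(2.0f * pi * fc * n) / (pi * n);
        /* Hamming window to tame the ripple caused by truncation */
        float w = 0.54f - 0.46f * cosf(2.0f * pi * j / (M - 1));
        h[j] = ideal * w;
        sum += h[j];
    }
    /* normalise for unity gain at DC so the envelope keeps its scale */
    for (int j = 0; j < M; j++)
        h[j] /= sum;
}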
Note that the output (filtered) signal is longer than the input by M-1 samples and, for a symmetric kernel, the envelope comes out delayed by (M-1)/2 samples. Also, M is often chosen to be odd in order to create a perfectly symmetric filter, although I'm sure not all implementations adhere to this.
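Putting the three steps together might look like the sketch below; the buffer sizes, the cutoff value and the kernel helper from the previous sketch are placeholders to tune to your material, not part of any existing library:

#include <math.h>
#include <string.h>

enum { N = 48000, M = 301 };    /* example input length and odd kernel length */

void make_lowpass_kernel(float h[], int len, float fc);  /* sketch above */

float x[N];                     /* input signal, filled elsewhere */
float h[M];                     /* lowpass kernel                 */
float y[N + M - 1];             /* convolution output             */

void envelope(void)
{
    /* step 1: full-wave rectification */
    for (int i = 0; i < N; i++)
        x[i] = fabsf(x[i]);

    /* step 2: build the kernel; a cutoff of 0.001 of the sample rate
       (~48 Hz at 48 kHz) is only an illustrative starting point */
    make_lowpass_kernel(h, M, 0.001f);

    /* step 3: convolution, the same loop as above */
    memset(y, 0, sizeof y);
    for (int i = 0; i < N; i++)
        for (int j = 0; j < M; j++)
            y[i + j] = y[i + j] + x[i] * h[j];

    /* the envelope aligned with x[i] is y[i + (M - 1) / 2], since a
       symmetric kernel delays its output by (M - 1) / 2 samples */
}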