Wednesday, December 5, 2018

cross correlation - Subtracting Original Audio Signal


Here is my scenario: a speaker emits a chirp, and a microphone records it along with all the echoes that follow. The function describing the chirp is already known.


My goal: subtract the original chirp from the recorded sound, and leave only the echoes.


I think this question is hard for me in these ways:



  1. To match the original chirp to the recorded sound, I don't know which approach to use: cross-correlation, or minimizing sum(abs(difference between the two vectors)), where the difference is taken element-wise.


  2. The echo may overlap the chirp itself, so it is probably better to cross-correlate only the first small portion of the chirp.

  3. The amplitude of the recorded sound depends on the microphone.
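The three points above can be sketched together in Python with NumPy and SciPy. This is only an illustration on synthetic data, not a definitive method: the chirp parameters, the 200-sample head length, and the delays and gains of the simulated recording are all assumptions made up for the example. It cross-correlates only the first portion of the chirp (point 2), takes the strongest peak as the direct arrival (point 1), and estimates the unknown amplitude by least squares (point 3) before subtracting.

```python
import numpy as np
from scipy.signal import chirp, correlate

fs = 16_000                                  # assumed sampling rate in Hz
t = np.arange(int(0.05 * fs)) / fs           # 50 ms chirp (assumed duration)
ref = chirp(t, f0=500, f1=6000, t1=t[-1], method="linear")

# Synthetic recording (hypothetical values): direct sound with gain 0.6
# arriving 40 samples late, plus one weaker echo 200 samples after it.
rec = np.zeros(4000)
rec[40:40 + ref.size] += 0.6 * ref
rec[240:240 + ref.size] += 0.2 * ref

# Use only the head of the chirp, assuming no echo arrives within it.
head = ref[:200]

# In "valid" mode xc[k] = sum(rec[k:k + len(head)] * head), so the index
# of the strongest peak is the delay of the direct sound in samples.
xc = correlate(rec, head, mode="valid")
delay = int(np.argmax(np.abs(xc)))

# Least-squares estimate of the unknown recording gain.
gain = xc[delay] / np.dot(head, head)

# Subtract the scaled, aligned chirp; what remains is the echo.
residual = rec.copy()
residual[delay:delay + ref.size] -= gain * ref
```

In a real room the direct sound is also coloured by the speaker and microphone, so a plain scaled copy of the chirp will not cancel it exactly and some residue will remain.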


Thank you very much!



Answer



Well my friend, let me answer your questions here. You've asked why subtracting the recorded sweep from the original one will not produce any information about echoes. So I took an exponential sine sweep running from $5\,\mathrm{Hz}$ up to $7995\,\mathrm{Hz}$, with a sampling frequency of $16\,\mathrm{kHz}$. Then I applied some crude filtering:


[Figure: the filtering applied to the sweep]


As a result we get the following:


[Figure: the filtered sweep]


You can see that the amplitude is modulated according to the filtering applied. Now, if we want to get some information about the impulse response of our system, let's try the simple subtraction you suggested. This is what we get:



[Figure: difference between the original and the filtered sweep]


All this time-domain signal tells you is how the two signals differ at different frequencies over time. Any information about reflections is far from straightforward to read off. This is simple filtering, so there are no reflections, but imagine you have some reverberation at the end: your recorded signal is then longer than the one played, so you have a tail of decaying sinusoids, which is a nightmare to analyse. But when you deconvolve the filtered signal with the inverse filter, this is what you get:


[Figure: impulse response obtained by deconvolution]
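For readers who want to try the deconvolution step, here is a minimal sketch in Python. It is not the code used for the plots in this answer. The sweep range and sampling frequency are the ones quoted above; the 2-second duration, the 101-tap low-pass FIR standing in for the "crude filtering", and the particular inverse filter (a time-reversed sweep with a decaying envelope, as commonly used with exponential sweeps) are assumptions.

```python
import numpy as np
from scipy.signal import fftconvolve, firwin

fs = 16_000                   # sampling frequency quoted in the answer
f1, f2 = 5.0, 7995.0          # sweep range quoted in the answer
T = 2.0                       # sweep duration in seconds (assumed)
N = int(T * fs)
t = np.arange(N) / fs
R = np.log(f2 / f1)

# Exponential sine sweep with instantaneous frequency f1 * exp(t * R / T).
sweep = np.sin(2 * np.pi * f1 * T / R * (np.exp(t * R / T) - 1.0))

# Inverse filter (assumed form): the time-reversed sweep with a decaying
# envelope, so its amplitude is proportional to instantaneous frequency
# and compensates the sweep's low-frequency emphasis.
inverse = sweep[::-1] * np.exp(-t * R / T)

# Sweep convolved with the inverse filter approximates a unit impulse
# delayed by N - 1 samples; use its value there for normalisation.
scale = fftconvolve(sweep, inverse)[N - 1]

# Hypothetical stand-in for the crude filtering: 101-tap low-pass FIR.
h = firwin(101, 3000, fs=fs)
recorded = fftconvolve(sweep, h)

# Deconvolve: convolve the recording with the inverse filter and keep
# the part that starts at the zero-delay index N - 1.
ir = fftconvolve(recorded, inverse)[N - 1:N - 1 + h.size] / scale
```

With these assumptions `ir` should closely follow the FIR taps `h`, up to small errors near the band edges of the sweep.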


How beautiful is that! ;) You've obtained the impulse response of the filter that was applied at the very beginning. What's more, you can take the Fourier transform of that impulse response, and that will give you the frequency response of the filter. I quickly did some frequency analysis, so you can see the result below. The upper normalised frequency axis is forced to be logarithmic. You can easily relate the result to the filtering applied at the beginning. What's more, you can deduce that this plugin uses an FIR filter, because the phase is linear!


[Figure: magnitude and phase response of the filter]
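The frequency analysis and the linear-phase check can be sketched as follows. This is an illustration under assumptions, not the analysis behind the plot: the impulse response here is a hypothetical symmetric 101-tap FIR rather than a measured one, and the FFT length and the 2500 Hz passband limit are arbitrary choices.

```python
import numpy as np
from scipy.signal import firwin

fs = 16_000
# Hypothetical impulse response: a symmetric (linear-phase) low-pass FIR.
h = firwin(101, 3000, fs=fs)

# Frequency response via the FFT of the impulse response (zero-padded).
n_fft = 4096
H = np.fft.rfft(h, n=n_fft)
freqs = np.fft.rfftfreq(n_fft, d=1 / fs)
mag_db = 20 * np.log10(np.abs(H) + 1e-12)
phase = np.unwrap(np.angle(H))

# Fit a straight line to the passband phase. For a linear-phase filter
# the fit is essentially exact and its slope gives the constant delay.
passband = freqs < 2500
slope, intercept = np.polyfit(freqs[passband], phase[passband], 1)
group_delay_samples = -slope * fs / (2 * np.pi)
deviation = np.max(np.abs(phase[passband]
                          - (slope * freqs[passband] + intercept)))
```

A symmetric FIR with 101 taps delays every frequency by 50 samples, so a tiny `deviation` together with a group delay of about 50 is what "the phase is linear" looks like numerically.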


If this were, for example, a measurement of an acoustic space, your impulse response would look like this:


[Figure: impulse response of an acoustic space]


From that you can very easily detect any reflections, the delays between arrivals via different paths, and even flutter echo or other fancy stuff.
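As a rough sketch of how reflections could be read off an impulse response, the Python snippet below picks peaks in a synthetic one. Everything in it is made up for illustration: the arrival positions and amplitudes, the noise level, the 10% detection threshold, the minimum peak spacing, and the assumed speed of sound of 343 m/s.

```python
import numpy as np
from scipy.signal import find_peaks

fs = 16_000          # sampling frequency in Hz
c = 343.0            # assumed speed of sound in m/s

# Synthetic impulse response (hypothetical): a direct arrival followed
# by two reflections, with a little background noise.
ir = np.zeros(2048)
ir[100] = 1.0        # direct sound
ir[420] = 0.5        # first reflection
ir[900] = -0.3       # second reflection (inverted polarity)
rng = np.random.default_rng(0)
ir += 0.005 * rng.standard_normal(ir.size)

# Pick peaks of the rectified response above 10% of the maximum,
# at least 1 ms (16 samples) apart.
envelope = np.abs(ir)
peaks, _ = find_peaks(envelope, height=0.1 * envelope.max(), distance=16)

# Delays relative to the direct arrival, and the extra path length
# each reflection travelled compared with the direct sound.
direct = peaks[0]
delays_ms = (peaks - direct) / fs * 1e3
extra_path_m = (peaks - direct) / fs * c
```

A measured room response is much denser than this, so in practice the threshold and spacing need tuning, and smoothing the envelope first usually helps.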


Because you are interested in detecting echoes, I suggest you read the work of Dietsch (unfortunately in German): Ein objektives Kriterium zur Erfassung von Echostörungen bei Musik- und Sprachdarbietungen


Above all, read the following thesis: Evaluation of objective echo criteria. The author investigates echoes in depth, and there are even MATLAB files in the appendix.



Some background on the theory behind swept-sine measurements: Advancements in Impulse Response Measurements by Sine Sweeps, and: Simultaneous Measurement of Impulse Response and Distortion with a Swept-Sine Technique


And a bit more about the quantities that can be derived from the IR, to convince you how powerful it is: Theoretical and Applied Room Acoustics Including Parameters and IR Derivatives (although that one is aimed more at architectural acoustics)


I hope I've convinced you. Good luck!

