Friday, June 7, 2019

noise - peak limiting/audio compression formula needed


I'm looking for a formula to effectively compress an audio waveform to limit peaks. This isn't an "automatic volume control" application where one would control amplifier gain to maintain a volume level, but rather I want to limit ("soft" truncate) individual peaks. (I know this introduces harmonics, but I'm trying to analyze the data, not listen to it.)


My (very crude) formula so far is:


factor = (10 * average / level) + exp(-sqrt(0.1 * level / average))

Where level is the instantaneous sound level, average is the historical average sound level, and factor is a multiplier used to produce the "adjusted" level (factor times level).


Further, this multiplier is only applied if it computes to a value less than 1. Otherwise level is left unadjusted.



The intent is to limit the adjusted level to some multiple (about 15x with this formula) of the historical average. This formula is sorta what I need, but exhibits a "dip" as the numbers get larger. That is, the adjusted level (i.e., factor times level) increases up to a point with increasing unadjusted level but then, rather than approaching an asymptote, actually begins to get smaller. (In fact, the first term was added primarily to prevent the adjusted level from going to zero with extremely high values.)
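To make the dip concrete, here is a minimal Python sketch of the formula above (the function name `adjusted_level` and the test ratios are only illustrative, and `level` is assumed to be positive). Multiplying out, the adjusted level is 10 * average + level * exp(-sqrt(0.1 * level / average)), which peaks at roughly 15.4 times the average when level is 40 times the average and then falls back toward 10 times the average.

```python
import math

def adjusted_level(level, average):
    """Apply the formula from the question; the factor is used only if it is below 1."""
    factor = (10 * average / level) + math.exp(-math.sqrt(0.1 * level / average))
    return level * factor if factor < 1 else level

average = 1.0
# Sweep the input level as a multiple of the average to expose the dip.
for ratio in (10, 20, 40, 100, 1000):
    print(ratio, round(adjusted_level(ratio * average, average), 2))
```

The printed values rise until the ratio reaches 40 and then shrink again, which is the behavior described above.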


(The reason for wanting to limit the values this way is primarily so transient noise doesn't seriously upset the running average of the sound level. But when you're analyzing snores "transient noise" is quite significant, so I can't simply squelch it.)


So, can anyone suggest something better? (It seems that asymptotic behavior is easy to produce when you don't want it, but hard when you do.)



Answer



Two problems here: how to get a reliable estimate of the level, and how to compress the data.



  • Use robust statistics, such as the median or quantiles, on the original (not peak-limited) data instead of a running average, so that your "typical level" detection is not thrown off by outliers.

  • $k \times \tanh(\frac{x}{k})$ works nicely as a $C^\infty$ compression formula, and it is actually what happens in some audio circuits (those using operational transconductance amplifiers, OTAs). To get adaptive compression that preserves the dynamics of the original signal and just removes transients, make $k$ track the smoothed "average" level.
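As a short worked check of why the tanh curve avoids the "dip" from the question (standard properties of tanh, with $x$ the input sample and $k$ the limiting level as above):

$$k \tanh\left(\frac{x}{k}\right) \approx x - \frac{x^3}{3k^2} \quad (|x| \ll k), \qquad k \tanh\left(\frac{x}{k}\right) \to \pm k \quad (x \to \pm\infty)$$

$$\frac{d}{dx}\left[k \tanh\left(\frac{x}{k}\right)\right] = \operatorname{sech}^2\left(\frac{x}{k}\right) > 0$$

So small samples pass through almost unchanged, large samples saturate at $\pm k$, and because the slope is always positive the output never decreases as the input grows.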


Example (plot not reproduced here):




  • Blue: original signal

  • Green: 2 x median of the absolute value over a sliding window as a "typical level" detection

  • Red: tanh compression (formula given above with k equal to the level plotted in green)
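The processing behind these three traces can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code used for the plot: the window length of 101 samples, the edge handling (shorter windows at the ends), the small floor on `k`, the synthetic test signal, and the names `typical_level` and `tanh_compress` are all made up for the example.

```python
import math
import random
from statistics import median

def typical_level(samples, window=101, scale=2.0):
    """Scaled sliding-window median of |x| (the "green" trace); window size is an assumption."""
    half = window // 2
    mags = [abs(s) for s in samples]
    n = len(mags)
    # Windows are simply truncated near the ends of the signal.
    return [scale * median(mags[max(0, i - half):min(n, i + half + 1)])
            for i in range(n)]

def tanh_compress(samples, levels, floor=1e-12):
    """Per-sample soft limiter k * tanh(x / k) (the "red" trace)."""
    out = []
    for s, k in zip(samples, levels):
        k = max(k, floor)  # avoid dividing by zero during silence
        out.append(k * math.tanh(s / k))
    return out

# Synthetic test signal: low-level noise with one large transient.
rng = random.Random(0)
x = [0.1 * rng.gauss(0.0, 1.0) for _ in range(2000)]
x[1000] = 5.0

k = typical_level(x)
y = tanh_compress(x, k)
print(x[1000], k[1000], y[1000])
```

Because the median barely moves when a single outlier enters the window, `k` stays near the background level and the transient is clamped to about `k`, while ordinary samples are only mildly compressed.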

