Monday, May 27, 2019

fft - What are best practices to compute an audio spectrogram?


The spectrogram is generally defined as the squared magnitude of the FFT. However, many implementations seem to use just the magnitude, without squaring it.
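To make the two variants concrete, here is a minimal NumPy sketch that computes both from the same short-time FFT. The helper `stft_frames`, the frame length, the hop size, the Hann window, and the 1 kHz test tone are all illustrative assumptions, not taken from any particular library's implementation.

```python
import numpy as np

def stft_frames(signal, frame_len=256, hop=128):
    """Hypothetical helper: Hann-windowed frames followed by a one-sided FFT."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack(
        [signal[i * hop : i * hop + frame_len] * window for i in range(n_frames)]
    )
    # Result shape: (n_frames, frame_len // 2 + 1)
    return np.fft.rfft(frames, axis=1)

# Assumed test signal: one second of a 1 kHz sine sampled at 8 kHz.
fs = 8000
t = np.arange(fs) / fs
x = np.sin(2 * np.pi * 1000 * t)

spectrum = stft_frames(x)
magnitude_spec = np.abs(spectrum)   # magnitude spectrogram, |X|
power_spec = magnitude_spec ** 2    # power spectrogram, |X|^2
```

On a decibel scale the two carry the same information, since 10·log10(|X|²) equals 20·log10(|X|). They differ on a linear scale, where squaring stretches the dynamic range.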


Moreover, an audio signal is by convention scaled between -1 and 1. This scaling often requires an extra step in implementations (in Python, for example), and that step is not always done.
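As an illustration of that extra step, the sketch below scales 16-bit PCM samples to floating point. It assumes the audio arrives as a NumPy `int16` array; the helper name `pcm16_to_float` is hypothetical, and other bit depths would need a different divisor.

```python
import numpy as np

def pcm16_to_float(samples):
    """Hypothetical helper: scale int16 PCM samples to floats in [-1.0, 1.0)."""
    samples = np.asarray(samples)
    if samples.dtype != np.int16:
        raise TypeError("expected int16 PCM samples")
    # int16 spans -32768..32767, so dividing by 32768 maps it into [-1, 1).
    return samples.astype(np.float32) / 32768.0

# Assumed example values covering the int16 range.
raw = np.array([-32768, -16384, 0, 16384, 32767], dtype=np.int16)
scaled = pcm16_to_float(raw)
```

Because the FFT is linear, skipping this step multiplies every magnitude by the same constant (32768 here) and every squared magnitude by its square, so the shape of the spectrogram is unchanged but its absolute values are not.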


Finally, what are the best practices for computing an audio spectrogram?
- Squared magnitude of the FFT, or plain magnitude?
- Integer audio values, or values scaled to the range -1 to 1?


EDIT


As the comments point out, these choices have no consequences if the aim is only to plot an image of the spectrogram.


However, I would like to use the spectrogram matrix as the input for sound analysis and recognition. In this case, the computation process matters, and I find it curious that implementations differ so often.



