The spectrogram is generally defined as the squared magnitude of the FFT. However, in many implementations, people seem to use just the magnitude, without squaring it.
Moreover, an audio signal is by convention scaled between -1 and 1. This scaling often requires an extra step in implementations, in Python for example, and it is not always done.
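For illustration, here is a minimal sketch of that extra normalisation step, assuming a 16-bit PCM WAV file read with SciPy (the file name is hypothetical):

```python
import numpy as np
from scipy.io import wavfile

# scipy.io.wavfile.read returns raw int16 samples for a 16-bit PCM WAV
rate, samples = wavfile.read("example.wav")  # hypothetical file name

# normalise to the conventional [-1, 1] range by dividing by the
# full-scale value of the integer sample type
samples = samples.astype(np.float64) / np.iinfo(np.int16).max
```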
Finally, what are the best practices for computing an audio spectrogram?

- squared magnitude of the FFT, or magnitude of the FFT?
- integer audio values, or values scaled to the range -1 to 1?
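To make the two conventions concrete, here is a minimal sketch using SciPy's STFT, assuming `samples` and `rate` come from the normalisation step above (the window parameters are arbitrary example values):

```python
import numpy as np
from scipy.signal import stft

# short-time Fourier transform; nperseg/noverlap are arbitrary example values
f, t, Z = stft(samples, fs=rate, nperseg=1024, noverlap=512)

mag_spec = np.abs(Z)       # "magnitude" convention: |STFT|
pow_spec = np.abs(Z) ** 2  # "power" convention:     |STFT|^2
```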
EDIT
As the comments point out, these questions have no consequences if the aim is only to plot an image of the spectrogram.
However, I would like to use the spectrogram matrix as an input for sound analysis and recognition. In this case the computation process matters, and I find it curious that implementations differ so often.
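One concrete observation: in the log (dB) domain typically used for display, the two conventions differ only by a constant factor of 2, since 10·log10(|X|^2) = 20·log10(|X|), which is why the plotted image looks the same up to the colour scale. Whether that factor matters for recognition depends on the downstream processing. A small sketch, continuing from the arrays above:

```python
eps = 1e-10  # floor to avoid taking the log of zero

mag_db = 20 * np.log10(np.maximum(mag_spec, eps))
pow_db = 10 * np.log10(np.maximum(pow_spec, eps ** 2))

# the two arrays are identical up to floating-point error, so features
# derived from them differ only by a global scale factor
print(np.allclose(mag_db, pow_db))  # True
```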