Tuesday, December 17, 2019

fft - Comparing the pronunciation similarity of two voices with MFCC + DTW


To compare wave file A with wave file B, I compute the MFCCs of each file and then use FastDTW to measure the distance between the two sets of MFCCs.
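To make the distance step concrete, here is a minimal sketch, assuming each file has already been converted into a list of MFCC frames (the extraction itself is not shown). It uses the exact quadratic-time DTW recurrence rather than FastDTW's approximation, and the names `euclidean`, `dtw_distance`, and the toy frames are made up for illustration only.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def dtw_distance(seq_a, seq_b, dist=euclidean):
    """Exact DTW: total cost of the cheapest alignment of two frame sequences."""
    n, m = len(seq_a), len(seq_b)
    inf = float("inf")
    # cost[i][j] = cheapest cost of aligning seq_a[:i] with seq_b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = dist(seq_a[i - 1], seq_b[j - 1])
            # Extend the cheapest of the three allowed predecessor cells.
            cost[i][j] = d + min(cost[i - 1][j],      # repeat a frame of seq_b
                                 cost[i][j - 1],      # repeat a frame of seq_a
                                 cost[i - 1][j - 1])  # advance both
    return cost[n][m]


# Toy stand-ins for MFCC frames (hypothetical values, 2 coefficients per frame).
frames_a = [[0.0, 1.0], [1.0, 2.0], [2.0, 3.0]]
frames_a_slow = [[0.0, 1.0], [0.0, 1.0], [1.0, 2.0], [2.0, 3.0], [2.0, 3.0]]
frames_b = [[5.0, 5.0], [6.0, 4.0], [7.0, 3.0]]

print(dtw_distance(frames_a, frames_a))       # same sequence
print(dtw_distance(frames_a, frames_a_slow))  # same content, spoken slower
print(dtw_distance(frames_a, frames_b))       # different content
```

In this sketch the result is an accumulated frame-to-frame distance along the best alignment, so it is 0 only when the two sequences can be aligned exactly, and it grows with the length of the recordings rather than being a percentage.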



I compared A.wav against several other wave files. The values below are the resulting Euclidean distance values.



675.0095954620155 A.wav vs. A2.wav
998.7554375714773 A.wav vs. B.wav
976.1293903229977 A.wav vs. B2.wav
856.4364672719398 A.wav vs. C.wav
645.8353052519245 A.wav vs. C2.wav





  1. I do not know how to interpret these Euclidean distance values in terms of similarity.

  2. Is the similarity higher as the distance value approaches zero?

  3. How can I tell from a distance value whether two recordings are a close match?

  4. Isn't DTW meant to show how similar two voices are?



