I've been trying to implement generalized cross correlation with a PHAT weighting function for a while now, and cannot get it to work. I have tried correlating using MATLAB's xcorr.m file, and it works for getting a correct delay tau (on simulated sinusoidal signals).
The code for the signals are thus:
Fs = 8000;
dt = 1/Fs;%0.125e-3
f1 = 100;
tdelay = 0.625e-03;%try different values
t3 = (0:dt:(1)-dt)';
x3 = cos(2*pi*f1*t3);
x4 = cos(2*pi*f1*(t3-tdelay));
As can be seen in part of the source code for xcorr.m
, the cross correlation is implemented thus:
%Transform both vectors
X = fft(x,2^nextpow2(2*M-1));
Y = fft(y,2^nextpow2(2*M-1));
% Compute cross-correlation
c = ifft(X.*conj(Y));
According to definitions of GCC-PHAT, the only addition I needed to make was to divide the product by its own magnitude, before taking the ifft. Here is my version with this change.
%Transform both vectors
X = fft(x,2^nextpow2(2*M-1));
Y = fft(y,2^nextpow2(2*M-1));
% Compute cross-correlation
R = X.*conj(Y);
c = ifft(R./abs(R));
However I always end up with a tau of zero with the PHAT weighting! On closer examination of the array produced as a result of this division, it seems as if the first value of R is a real value (with no imaginary component), and so when divided by its magnitude it becomes 1. All the other values in array R are complex, so do not end up as 1 when divided by their own magnitude and so end up with a value of <1.
This can be seen below, for the 1st 10 values of R.
K>> R(1:10,1)
ans =
0.000000000000000 + 0.000000000000000i
-0.494299608718696 - 0.003002230689022i
-0.002678647083223 - 0.000032538742345i
-0.488954228290329 - 0.008909374553649i
-0.010656518992354 - 0.000258902698589i
-0.478379290671260 - 0.014528074329782i
-0.023760667475633 - 0.000865926459320i
-0.462803929640386 - 0.019677623220519i
-0.041707017319469 - 0.002026674993917i
-0.442565618721743 - 0.024194329448597i
K>> abs(R(1:10,1))
ans =
0.000000000000000
0.494308725968464
0.002678844707371
0.489035391682370
0.010659663580139
0.478599844010494
0.023776441018801
0.463222070011989
0.041756229537848
0.443226457301486
K>> R(1:10,1)./abs(R(1:10,1))
ans =
1.000000000000000 + 0.000000000000000i
-0.999981555555690 - 0.006073594357736i
-0.999926227844417 - 0.012146557900713i
-0.999834033705084 - 0.018218261306199i
-0.999705001216859 - 0.024288074069393i
-0.999539169638280 - 0.030355367874845i
-0.999336589392997 - 0.036419515378054i
-0.999097322000240 - 0.042479891383435i
-0.998821440083941 - 0.048535871565711i
-0.998509027227826 - 0.054586834901284i
As can be seen above, the largest value ends up being at the 1st index, when we divide R by its own magnitude. So once the ifft is taken, the highest value is ALWAYS at the beginning of the array, which gives a lag and timediff of zero...even when I set the delay between the 2 identical signals at the beginning to a nonzero value (e.g. delay = 0.75e-03).
What am I doing wrong?? Any help appreciated.
Rory
P.S. if anyone is wondering why I'm bothering with the PHAT weighting, its because it should give far better results (in theory) in a real life scenario, for TDOA.
No comments:
Post a Comment