I'm currently designing a curve-matching algorithm and, having already explored many ideas, I'm asking for your help. So, if you have any advice on how to handle this problem, feel free to answer!
I have the spatio-temporal response of a flash LIDAR to one short pulse. The acquired data is stored in a 3-D matrix:
- The first and second dimensions are spatial: as in a 2-D image, they represent the azimuth and elevation of a pixel.
- The third dimension is temporal: for each pixel, the illumination is recorded over a fixed duration and stored along this dimension. Typically, the curve observed is a Gaussian whose parameters (variance, peak amplitude and delay) are affected by the environment surrounding the LIDAR.
To be more specific, the laser pulse is distorted by the environment: multi-path propagation of the light affects the Gaussian's variance, the object's distance to the LIDAR affects the delay (and the amplitude), and the object's reflectivity affects the amplitude of the Gaussian.
Since real-world objects are contiguous and homogeneously distributed in space, neighboring pixels are likely to have a similar reflectivity and distance to the LIDAR, so neighboring Gaussians are very similar.
What I'm trying to do is quantify the similarity of a pixel's impulse response to that of its neighborhood (a 3x3 neighborhood, for instance). This is very close to the concept of a gradient: I want to detect pixels that are part of a homogeneous object and those at the border of an object, where there is a disruption.
For now I've implemented two different algorithms. The first is similar to an image derivative adapted to the 3-D structure, using a 2-D Laplacian filter. The second measures similarity using the Sum of Absolute Differences (SAD), also adapted to the 3-D structure of the data (see the sketch below).
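For reference, the SAD measure I have in mind looks roughly like the following (a minimal NumPy sketch rather than my actual implementation; I'm assuming here that the cube is indexed as (azimuth, elevation, time), as described above):

```python
import numpy as np

def sad_to_neighborhood(data):
    """Mean Sum of Absolute Differences between each pixel's waveform
    and the waveforms of its 8 spatial neighbors (3x3 neighborhood).

    data: ndarray of shape (n_az, n_el, n_t). Returns an (n_az, n_el) map
    where high values indicate a disruption (object border).
    """
    n_az, n_el, _ = data.shape
    sad = np.zeros((n_az, n_el))
    for d_az in (-1, 0, 1):
        for d_el in (-1, 0, 1):
            if d_az == 0 and d_el == 0:
                continue
            # np.roll wraps around at the image borders; border pixels
            # would need special handling in a real implementation.
            shifted = np.roll(data, shift=(d_az, d_el), axis=(0, 1))
            sad += np.abs(data - shifted).sum(axis=-1)
    return sad / 8.0
```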
These algorithms work reasonably well, but since the signal can be noisy and sometimes very far from a Gaussian shape, I think (and hope) that better algorithms can be found.
For now my research is focused on curve-fitting before comparison (the Levenberg–Marquardt algorithm, for instance), but this has proved very computationally expensive!
If you know about other methods or concepts that could help resolve my problem, feel free to participate :)
Thank you!
EDIT:
Matrix size is 32x128x128
Answer
The two most obvious things you can try are:
- Fit a Gaussian to each pixel's waveform and then cluster the fitted parameters
- Estimate the similarity of the waveforms directly and then cluster that
Since you know that the return waveform conforms to a Gaussian, it is better to use a method that takes this into account.
So, basically, for every pixel time course you will end up with a fitted $\mu, \sigma^2$ (and amplitude). You can then use these, plus any other target-dependent feature, to characterise the surface.
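A minimal sketch of what such a per-pixel fit could look like, assuming Python and `scipy.optimize.curve_fit` (the amplitude $a$ is fitted alongside $\mu$ and $\sigma$; the initial-guess heuristics are my own assumption):

```python
import numpy as np
from scipy.optimize import curve_fit

def gaussian(t, a, mu, sigma):
    return a * np.exp(-0.5 * ((t - mu) / sigma) ** 2)

def fit_pixel(waveform):
    """Fit a single Gaussian to one pixel's time course.
    Returns (a, mu, sigma), or None if the optimiser does not converge."""
    t = np.arange(waveform.size, dtype=float)
    # Crude initial guess taken straight from the raw samples.
    p0 = (waveform.max(), float(np.argmax(waveform)), waveform.size / 10.0)
    try:
        popt, _ = curve_fit(gaussian, t, waveform, p0=p0, maxfev=200)
        return popt
    except RuntimeError:
        return None
```

Clustering the resulting $(a, \mu, \sigma)$ maps, or simply looking at their spatial gradients, then gives you the homogeneous regions and the borders directly in parameter space.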
> These algorithms work reasonably well, but since the signal can be noisy and sometimes very far from a Gaussian shape, I think (and hope) that better algorithms can be found.
The fact that you know your data is supposed to conform to a specific model is a great deal of knowledge, and you should take that into account as much as possible. "...sometimes very far from Gaussian..." doesn't always mean "not Gaussian". You might, for example, be getting two or more returns in the same time course, which means you might have to apply something like a kernel density estimator or an iterative fitting process to pick out each individual return. For more information, please see this paper.
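As a rough illustration of the multiple-return case (only a sketch in Python/SciPy, not the method of the paper above): fit a small sum of Gaussians and keep whichever model explains the waveform best.

```python
import numpy as np
from scipy.optimize import curve_fit

def two_gaussians(t, a1, mu1, s1, a2, mu2, s2):
    g = lambda a, mu, s: a * np.exp(-0.5 * ((t - mu) / s) ** 2)
    return g(a1, mu1, s1) + g(a2, mu2, s2)

def fit_two_returns(waveform, min_separation=3):
    """Very rough two-return fit: seed the second component at the largest
    sample away from the main peak and let the optimiser refine both."""
    t = np.arange(waveform.size, dtype=float)
    peak1 = int(np.argmax(waveform))
    far = np.abs(t - peak1) > min_separation
    peak2 = int(t[far][np.argmax(waveform[far])])
    p0 = (waveform[peak1], peak1, 2.0, waveform[peak2], peak2, 2.0)
    popt, _ = curve_fit(two_gaussians, t, waveform, p0=p0, maxfev=500)
    return popt
```

Comparing the residuals of the one-Gaussian and two-Gaussian fits (e.g. with an information criterion) then tells you whether the extra return is actually justified by the data.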
The second option is to estimate the "similarity" of the waveforms directly. In this case, given two time series $U, V \in \mathbb{R}^N$, you are looking for a function $F(U, V, \Theta) \rightarrow \mathbb{R}$ which simply returns a number proportional to how "similar" the two waveforms are, given some additional parameters $\Theta$. Most of the time, "similar" is another word for "co-varying".
The most obvious of the possible $F$s is the discrete cross-correlation. In that case $F_{xcorr}(U, V)$ returns a number in $[-1, 1]$ which tells you how similar $U$ and $V$ are. Another possible $F$ is coherence. In this case, the parameters include the frequency component at which you want the coherence calculated, e.g. $F_{coher}(U, V, k)$ for the $k^{th}$ component, which returns a number in $[0, 1]$. Another possible $F$ is Mutual Information, which is a slightly different take on "similarity" and more challenging to apply correctly to continuous (rather than digital) signals.
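A hedged sketch of two such $F$s in Python (a zero-lag normalised cross-correlation, and `scipy.signal.coherence` for the coherence; the exact parameterisation, sampling rate and segment length here are my own assumptions):

```python
import numpy as np
from scipy.signal import coherence

def xcorr_similarity(u, v):
    """Normalised cross-correlation at zero lag (Pearson correlation), in [-1, 1]."""
    u = (u - u.mean()) / (u.std() + 1e-12)
    v = (v - v.mean()) / (v.std() + 1e-12)
    return float(np.mean(u * v))

def coherence_similarity(u, v, k, fs=1.0):
    """Magnitude-squared coherence of u and v at the k-th frequency bin, in [0, 1].
    Note: short waveforms give only a few Welch segments, so the estimate is coarse
    (with a single segment the coherence is trivially 1)."""
    nperseg = max(8, len(u) // 4)
    _, Cxy = coherence(u, v, fs=fs, nperseg=nperseg)
    return float(Cxy[k])
```

`np.correlate(u, v, mode='full')` gives the full lagged cross-correlation if you want to tolerate small delay differences between neighbouring pixels; applied between each pixel and its 3x3 neighbours, either measure gives you a similarity map you can threshold or cluster.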
And then, on top of the metrics mentioned above, you have variations such as wavelet-based cross-correlation instead of plain cross-correlation, LZ complexity instead of Mutual Information, and so on.
The field of similarity metrics is vast, and there is a lot of work happening in the processing of electroencephalography (EEG) signals for Functional Connectivity, which basically involves a lot of these evaluations of similarity under very challenging conditions of waveform noise and artifacts. For more information, please see this, this and this.
These publications are not exactly on your topic, but they use a great deal of similar methods and tools for estimating the similarity between two time series, which is what you are trying to do as well. So, when reading them, try to map the terminology and then search remote-sensing journals for similar work. I am sure such a search would return much more relevant results adapted to the specifics of estimating the similarity of laser pulse returns.
Hope this helps.