Sunday, October 1, 2017

image processing - How does the Hessian feature detector work?



I know about the Harris corner detector, and I understand the basic idea of its second moment matrix, $$M = \left[ \begin{array}{cc} I_x^2 & I_xI_y \\ I_xI_y & I_y^2 \end{array} \right]$$ Edges and other unstable points can be removed via $M$.


But the Hessian detector uses the Hessian matrix to detect key points and remove edges, $$\mathcal{H} = \left[ \begin{array}{cc} I_{xx} & I_{xy} \\ I_{xy} & I_{yy} \end{array} \right]$$ and I don't understand how $\mathcal{H}$ can remove edges and detect stable points. What is the intuitive idea behind it?



Answer



I will try to avoid math, because math and "how to do it" tutorials can be easily found.


So, I start by pointing out one VERY important thing: one does not compute Harris for a single pixel, but for a vicinity (a patch of the image) around that pixel! Let $I_{xx}(i), I_{xy}(i), \dots$ be the derivatives at a pixel $i$ in the vicinity $V$ of the current pixel $i_0$; then,


$$H = \left[ \begin{array}{cc} \sum_{i\in V} I_{xx}(i)\, w(i-i_0) & \sum_{i\in V} I_{xy}(i)\, w(i-i_0) \\ \sum_{i\in V} I_{xy}(i)\, w(i-i_0) & \sum_{i\in V} I_{yy}(i)\, w(i-i_0) \end{array} \right]$$


Here $w(t)$ is a Gaussian kernel. The equation above tells you to integrate the derivative values over the vicinity $V$ around the current pixel. Each neighbour's value is multiplied by a weight that shrinks as its distance from $i_0$ increases. The falloff follows a Gaussian, because $w(t)$ is a Gaussian centred at $i_0$. And that's it for the math.
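
The weighted sum above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the derivative maps `Ixx`, `Ixy`, `Iyy` are taken as given (nested lists), the vicinity $V$ is a square window of half-width `radius`, and the Gaussian is left unnormalised. The function names and the default `radius` and `sigma` are made up for this example.

```python
import math

def gaussian_weight(dy, dx, sigma):
    """Unnormalised Gaussian weight w(i - i0) for the offset (dy, dx)."""
    return math.exp(-(dx * dx + dy * dy) / (2.0 * sigma * sigma))

def weighted_sum(values, y0, x0, radius, sigma):
    """Sum values[y][x] * w((y, x) - (y0, x0)) over a square vicinity V."""
    total = 0.0
    for y in range(max(0, y0 - radius), min(len(values), y0 + radius + 1)):
        for x in range(max(0, x0 - radius), min(len(values[0]), x0 + radius + 1)):
            total += values[y][x] * gaussian_weight(y - y0, x - x0, sigma)
    return total

def integrated_hessian(Ixx, Ixy, Iyy, y0, x0, radius=2, sigma=1.0):
    """Return the 2x2 matrix H at pixel (y0, x0) from per-pixel derivative maps."""
    hxx = weighted_sum(Ixx, y0, x0, radius, sigma)
    hxy = weighted_sum(Ixy, y0, x0, radius, sigma)
    hyy = weighted_sum(Iyy, y0, x0, radius, sigma)
    return [[hxx, hxy], [hxy, hyy]]

# Toy derivative maps (constant values) just to exercise the function.
Ixx = [[1.0] * 5 for _ in range(5)]
Ixy = [[0.0] * 5 for _ in range(5)]
Iyy = [[2.0] * 5 for _ in range(5)]
H = integrated_hessian(Ixx, Ixy, Iyy, 2, 2)
print(H)
```

Neighbours far from $i_0$ contribute almost nothing, so the choice of `radius` matters little once it is a few times `sigma`.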


Now, back to the empirical observations. If you use solely the derivatives and the pixel is part of a linear structure (an edge), you get a strong response from the derivatives. On the other hand, if the pixel is at a corner (an intersection of two edges), the derivative responses will cancel each other out.


That said, the Hessian is able to capture the local structure in that vicinity without this "cancelling" effect. BUT, very important: you have to integrate in order to get a proper Hessian.
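
To make the Hessian itself concrete, here is a minimal sketch that estimates $I_{xx}$, $I_{xy}$, $I_{yy}$ by finite differences and looks at the determinant, which is the usual Hessian-detector response (the answer does not spell this out). Two assumptions are made: the synthetic images are already smooth, so the Gaussian-weighted integration is skipped, and the "line" is a smooth ridge, not a step edge. All names and sizes are hypothetical.

```python
import math

def hessian_at(img, y, x):
    """Finite-difference second derivatives (Ixx, Ixy, Iyy) at pixel (y, x)."""
    ixx = img[y][x + 1] - 2.0 * img[y][x] + img[y][x - 1]
    iyy = img[y + 1][x] - 2.0 * img[y][x] + img[y - 1][x]
    ixy = (img[y + 1][x + 1] - img[y + 1][x - 1]
           - img[y - 1][x + 1] + img[y - 1][x - 1]) / 4.0
    return ixx, ixy, iyy

def det_hessian(img, y, x):
    """Determinant of the 2x2 Hessian at pixel (y, x)."""
    ixx, ixy, iyy = hessian_at(img, y, x)
    return ixx * iyy - ixy * ixy

size, centre, sigma = 9, 4, 1.5

# Isolated bright blob: intensity varies in every direction around the centre.
blob = [[math.exp(-((x - centre) ** 2 + (y - centre) ** 2) / (2 * sigma ** 2))
         for x in range(size)] for y in range(size)]

# Vertical ridge: intensity varies along x only, nothing changes along y.
ridge = [[math.exp(-((x - centre) ** 2) / (2 * sigma ** 2))
          for x in range(size)] for y in range(size)]

blob_det = det_hessian(blob, centre, centre)
ridge_det = det_hessian(ridge, centre, centre)
print(blob_det, ridge_det)
```

On the ridge nothing changes along the line, so $I_{yy}$ and $I_{xy}$ vanish and the determinant is zero; on the blob both second derivatives are large and the determinant is clearly positive. Thresholding the determinant therefore keeps the blob-like point and rejects the straight structure.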


Having a Hessian, obtained using the Harris method or by other means, one might want to extract information about the vicinity. There are methods that give numerical values for how likely it is that the current pixel lies on an edge, at a corner, etc. Check the corner detection theory.
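
One common way to get such numerical values from a symmetric 2x2 matrix $[[a, b], [b, c]]$ is to look at its two eigenvalues, or at the Harris response $\det - k\,\mathrm{trace}^2$. The sketch below assumes the matrix entries are already computed; the tolerance `tol` and the value `k=0.04` are illustrative choices, and `describe` is a hypothetical helper, not a standard function.

```python
import math

def eigenvalues_2x2(a, b, c):
    """Eigenvalues of the symmetric matrix [[a, b], [b, c]], larger one first."""
    mean = (a + c) / 2.0
    radius = math.hypot((a - c) / 2.0, b)
    return mean + radius, mean - radius

def harris_response(a, b, c, k=0.04):
    """Harris measure det - k * trace^2; k=0.04 is an assumed, commonly quoted value."""
    det = a * c - b * b
    trace = a + c
    return det - k * trace * trace

def describe(a, b, c, tol=1e-6):
    """Rough label from how many eigenvalues are significantly non-zero."""
    l1, l2 = eigenvalues_2x2(a, b, c)
    significant = (abs(l1) > tol) + (abs(l2) > tol)
    return ["flat", "edge-like", "corner-like"][significant]

print(describe(2.0, 0.0, 2.0))  # two strong directions
print(describe(1.0, 1.0, 1.0))  # one strong direction (a diagonal edge)
print(describe(0.0, 0.0, 0.0))  # no structure
```

Two significant eigenvalues mean the vicinity changes in two independent directions (corner-like), one means it changes in a single direction (edge-like), and none means a flat region. With the Harris measure the same cases show up as a positive, negative, and near-zero response respectively.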



Now, about "stable points" or salient points. Picture yourself in a foreign town with no GPS and only a good map. If you are "teleported" into the middle of a street, you might locate the street on the map, but you cannot tell where exactly you are on that street, or in which direction you should go to move left or right (with respect to the map). Imagine now that you are at an intersection. Then you can point out your position on the map precisely! (Of course, assume that two streets don't intersect more than once.)


Imagine now that you must match two images. One acts as the map, and the other as the city. You must find pixels that can be uniquely described, so that you can do the matching. Check the images on this post for an example of matching. These points are called salient points. Moreover, corner points tend not to change their 'cornerness' when the image is scaled, translated, rotated, skewed, etc. (affine transforms). This is why they are called "stable".


Some points in the image can be uniquely identified. These pixels are located at corners or at intersections of lines. Imagine that your vicinity $V$ is on a line. Except for the orientation of the line, you cannot learn anything else from that vicinity. But if $V$ is on a corner, then you can find out the directions of the lines that intersect, maybe the angle, etc.
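
The line-versus-corner difference can be checked on two tiny synthetic images. The sketch below is a simplification under stated assumptions: it uses the second moment matrix $M$ from the question (products of first derivatives), central differences for the derivatives, and a plain unweighted square window for the vicinity $V$ (no Gaussian weighting). All names and sizes are made up for the illustration.

```python
import math

def gradient(img, y, x):
    """Central-difference first derivatives (gx, gy) at pixel (y, x)."""
    gx = (img[y][x + 1] - img[y][x - 1]) / 2.0
    gy = (img[y + 1][x] - img[y - 1][x]) / 2.0
    return gx, gy

def second_moment(img, y0, x0, radius):
    """Entries (sxx, sxy, syy) of M summed over a square vicinity V."""
    sxx = sxy = syy = 0.0
    for y in range(y0 - radius, y0 + radius + 1):
        for x in range(x0 - radius, x0 + radius + 1):
            gx, gy = gradient(img, y, x)
            sxx += gx * gx
            sxy += gx * gy
            syy += gy * gy
    return sxx, sxy, syy

def smaller_eigenvalue(sxx, sxy, syy):
    """Smaller eigenvalue of the symmetric matrix [[sxx, sxy], [sxy, syy]]."""
    mean = (sxx + syy) / 2.0
    return mean - math.hypot((sxx - syy) / 2.0, sxy)

size = 9
# Vertical step edge: dark left part, bright right part.
edge = [[1.0 if x >= 4 else 0.0 for x in range(size)] for y in range(size)]
# Corner: bright only in the lower-right quadrant.
corner = [[1.0 if (x >= 4 and y >= 4) else 0.0 for x in range(size)]
          for y in range(size)]

edge_min = smaller_eigenvalue(*second_moment(edge, 4, 4, 2))
corner_min = smaller_eigenvalue(*second_moment(corner, 4, 4, 2))
print(edge_min, corner_min)  # 0.0 1.25
```

On the edge the smaller eigenvalue is zero: the vicinity only reveals one direction, the orientation of the line. On the corner both eigenvalues are non-zero, so the vicinity pins down two directions and therefore a position.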


Not all corner points are salient, but corner points are the ones with a good chance of being salient.


Hope it helps!


p.s. For how to decide whether a point is a corner or not, take a look at the Harris paper.


p.p.s. More on matching, search for SIFT or SURF.


p.p.p.s. There is a "generalization" of the Harris method, called the Structure Tensor. Check Knutsson's seminal work!

