I need to implement a script that generates features from an input image using a Gabor filter. I have no experience with wavelets and I'm only just learning Fourier analysis (I understand the basic idea behind Fourier analysis and the transform), so neither helps me much with the Gabor filter yet, and I need to have an implementation done in a week. I don't need to understand the foundations of the Gabor filter function, but I would like to understand, to some extent, what it is and what it does. What are the parameters? What do they mean? What is the output of the function? For example, this is the formula I copied from Wikipedia:
$$g(x,y;\lambda, \theta, \psi, \sigma,\gamma) = \exp\left(-\frac{x'^2+\gamma^2 y'^2}{2\sigma^2}\right)\exp\left(i\left(2\pi\frac{x'}{\lambda}+\psi\right)\right)$$
Now my obvious question is: What does this mean? What do the variables mean? According to Wikipedia:
$x, y$: I assume these coordinates specify the pixel value of an image at coordinates $(x,y)$ (2. This is OK, understood)
$\lambda$: represents the wavelength of the sinusoidal factor (Sinusoidal factor, huh? 3a. How do you select it? 3b. Where does it come from? 3c. Is it an arbitrary number or what? Freely chosen?)
$\theta$: represents the orientation of the normal to the parallel stripes of a Gabor function (4. What does that mean?)
$\psi$: is the phase offset (5. offset of what? How is this value determined? Is it freely chosen?)
$\sigma$: is the sigma/standard deviation of the Gaussian envelope (6. Need more explanation...)
$\gamma$: is the spatial aspect ratio, and specifies the ellipticity of the support of the Gabor function (7. Again need more details and more explanation)
And most importantly:
$$g(x,y;\lambda, \theta, \psi, \sigma,\gamma) = X$$
- What is the output value $X$? What does it mean?
As I mentioned, I don't need a thorough explanation of the theory, because I bet it is long, and reading a 1000-page book on an unknown subject is not an option for me right now. I need a black-box understanding of this function so that I can implement it in code and, most importantly, understand what the input is and what the output is.
Thank you for any help!! =)
P.S.
I read this post:
https://math.stackexchange.com/questions/259877/value-of-x-y-in-computing-gabor-filter-function
but it doesn't answer my question thoroughly enough :)
A MATLAB implementation is in this answer: https://dsp.stackexchange.com/a/14201/5737
1) The Wikipedia formula is a little too general.
2) If you know the basics of the Fourier transform, then you know, at least empirically, that the image is viewed as a superposition of sinusoidal waves of various frequencies oriented in all kinds of directions. Each "pixel" in the transform tells us the "intensity" of one such wave, and the position of that "pixel" tells us the frequency and orientation of the wave. In practice, one wants to select only certain waves, with a specific frequency and a specific orientation.
So there you have it: the Gabor filter is one of many so-called band-pass filters that allow you to "cut" the Fourier transform and isolate only specific information. Another important point is that each Fourier "pixel" is a complex value (with a real and an imaginary part).
3) Parameters. Two of them were already implied above:
3.a) The tuning frequency $f_0$, or equivalently the tuning period $P_0$ or $\lambda$, establishes which kind of sine wave the filter responds to best ($f_0 = 1/P_0 = 1/\lambda$, or $f_0 = \pi/\lambda$, depending on the specific implementation). Basically, a smaller $P_0$ means a denser sine wave and a larger $P_0$ means wider waves. $P_0$ is in pixels (3, 5, 30, etc.). Don't go under 3 pixels or beyond $W/2$, or you get nasty effects; $W$ is the width of the image, if the image is square. You specify this with the P0 parameter in the MATLAB code.
3.b) The central angle. These waves can have any direction, and you want to select only waves at a specific angle. So the second parameter is the tuning angle, $\theta_0$, or $\theta$ in your formula, usually in radians. This is Orient in the MATLAB code.
One cannot isolate exactly one frequency or exactly one orientation. (Search for the uncertainty principle in the textbooks. Yes, it is similar to the one in physics.) But one can tune how much of the nearby frequencies and orientations will leak through. The next two parameters specify that:
3.c) $\Delta f$, the frequency bandwidth, expressed in octaves. Useful values are 1.5, 2, 3. Larger values mean capturing a broader range of frequencies. There is a price for a tighter band-pass: poorer spatial localization. Why? Again, see a textbook. This is FBW in the MATLAB code.
3.d) $\Delta \theta$, the angle bandwidth, expressed in radians. $\pi/3$ or $\pi/2$ works just fine. This is ABW in the MATLAB code.
The relation between $\Delta f$, $\Delta \theta$ from the MATLAB code and $\sigma, \gamma$ from Wikipedia is given by a formula, but it is not essential for understanding Gabor. $\psi$ is, again, not important for a basic understanding.
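For readers who do want the link between the frequency bandwidth and $\sigma$, here is a minimal sketch. It assumes the half-response bandwidth convention that usually accompanies the Wikipedia formula, $\sigma/\lambda = \frac{1}{\pi}\sqrt{\frac{\ln 2}{2}}\cdot\frac{2^b+1}{2^b-1}$ with $b$ in octaves; the FBW parameter of the MATLAB code may be defined differently, so treat this as an illustration and not as that code's formula. The function names are made up for this example, and the angle bandwidth and $\gamma$ are not covered.

```python
import math


def sigma_from_bandwidth(wavelength, bandwidth_octaves):
    """Gaussian sigma (in pixels) for a wavelength (pixels) and bandwidth (octaves).

    Assumes the half-response bandwidth convention (an assumption, see above).
    """
    k = math.sqrt(math.log(2) / 2) / math.pi
    ratio = 2.0 ** bandwidth_octaves
    return wavelength * k * (ratio + 1) / (ratio - 1)


def bandwidth_from_sigma(wavelength, sigma):
    """Inverse relation: bandwidth in octaves from sigma and wavelength."""
    c = math.sqrt(math.log(2) / 2)
    s = math.pi * sigma / wavelength
    return math.log2((s + c) / (s - c))


# A bandwidth of one octave gives sigma of roughly 0.56 * wavelength.
sigma = sigma_from_bandwidth(10.0, 1.0)
print(round(sigma, 2))
```

A wider bandwidth gives a smaller $\sigma$, which is the trade-off mentioned in 3.c: a broader range of frequencies comes with a tighter spatial envelope, and the other way around.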
So you have it: $X$ from Wikipedia, evaluated over a grid of $(x, y)$ offsets around the centre of the filter, is a 2D matrix of (complex) numbers containing a convolution mask. You take the original image, filter it with the convolution mask and get another image. This new image is the "Gabor response" for the original image.
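To make the "2D matrix of numbers" concrete, here is a minimal sketch in plain Python that samples the Wikipedia formula on a small grid. It uses the rotated coordinates from the Wikipedia article, $x' = x\cos\theta + y\sin\theta$ and $y' = -x\sin\theta + y\cos\theta$. The function name, the odd-size restriction and the default $\sigma = 0.56\lambda$ are choices made for this illustration; this is not the MATLAB implementation linked above.

```python
import cmath
import math


def gabor_kernel(size, wavelength, theta, psi=0.0, sigma=None, gamma=1.0):
    """Sample the complex Gabor function on a size x size grid (size must be odd).

    Returns a list of rows; kernel[row][col] is g(x, y) with x = col - size // 2
    and y = row - size // 2, so the centre entry corresponds to (0, 0).
    """
    if size % 2 == 0:
        raise ValueError("size must be odd so the mask has a centre pixel")
    if sigma is None:
        sigma = 0.56 * wavelength  # illustrative default, about one octave
    half = size // 2
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            # Rotate the coordinates by theta.
            xr = x * math.cos(theta) + y * math.sin(theta)
            yr = -x * math.sin(theta) + y * math.cos(theta)
            # Gaussian envelope times complex sinusoidal carrier.
            envelope = math.exp(-(xr ** 2 + (gamma * yr) ** 2) / (2 * sigma ** 2))
            carrier = cmath.exp(1j * (2 * math.pi * xr / wavelength + psi))
            row.append(envelope * carrier)
        kernel.append(row)
    return kernel


mask = gabor_kernel(size=9, wavelength=4.0, theta=0.0)
print(len(mask), len(mask[0]))  # number of rows and columns of the mask
```

Each entry is a complex number. With $\psi = 0$ the real part is a cosine-shaped (even) mask and the imaginary part is a sine-shaped (odd) mask; many implementations store them as two separate real matrices.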
The MATLAB code does these two steps together: it constructs the Gabor filter with the specified parameters and performs the convolution. The results ReConv, ImConv are the responses. Each response "pixel" has a real and an imaginary part. If you want to use this code, you usually must compute the energy of the response (strictly, its magnitude) for each pixel: $E = \sqrt{a^2+b^2}$, where $a$ is the real part of the response (ReConv) and $b$ is the imaginary part (ImConv).
There you have it: 1) Build a Gabor filter by specifying $P_0, \theta_0, \Delta f, \Delta \theta$. 2) Convolve your image with the filter. You will get two values for each pixel. 3) Compute the energy $E$ to get the intensity of the response for each pixel in the original image.
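The three steps can be sketched end to end in plain Python. This is an illustration under stated assumptions, not the MATLAB code: the filter is parameterised directly by $\lambda$, $\theta$ and $\sigma$ (as in the Wikipedia formula) instead of by bandwidths, the image border is zero-padded, and all function names are hypothetical. The test image is synthetic: vertical stripes with a period of 8 pixels.

```python
import cmath
import math


def gabor_kernel(size, wavelength, theta, sigma, gamma=1.0, psi=0.0):
    """Step 1: complex Gabor mask, size x size with size odd."""
    half = size // 2
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            xr = x * math.cos(theta) + y * math.sin(theta)
            yr = -x * math.sin(theta) + y * math.cos(theta)
            envelope = math.exp(-(xr ** 2 + (gamma * yr) ** 2) / (2 * sigma ** 2))
            row.append(envelope * cmath.exp(1j * (2 * math.pi * xr / wavelength + psi)))
        kernel.append(row)
    return kernel


def convolve2d(image, kernel):
    """Step 2: 2D convolution, same output size, zeros outside the image."""
    rows, cols = len(image), len(image[0])
    half = len(kernel) // 2
    out = [[0j] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            acc = 0j
            for i, kernel_row in enumerate(kernel):
                for j, k in enumerate(kernel_row):
                    rr, cc = r - (i - half), c - (j - half)
                    if 0 <= rr < rows and 0 <= cc < cols:
                        acc += k * image[rr][cc]
            out[r][c] = acc  # real part and imaginary part of the response
    return out


def energy(response):
    """Step 3: E = sqrt(a^2 + b^2) for every complex response pixel."""
    return [[abs(v) for v in row] for row in response]


def striped_image(size, period):
    """Synthetic test image: vertical stripes with the given period in pixels."""
    return [[math.cos(2 * math.pi * c / period) for c in range(size)]
            for _ in range(size)]


image = striped_image(32, 8)
e_match = energy(convolve2d(image, gabor_kernel(17, 8.0, 0.0, 3.0)))
e_cross = energy(convolve2d(image, gabor_kernel(17, 8.0, math.pi / 2, 3.0)))
# Compare the two responses at the centre of the image.
print(e_match[16][16], e_cross[16][16])
```

The filter tuned to the stripes ($\lambda = 8$, $\theta = 0$) should give a much larger energy at the centre of the image than the same filter rotated by $\pi/2$, which is the selectivity described in point 2). For real images a library convolution will be far faster than these nested loops.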
Another intuition: Suppose you want to select edges running perpendicular to the direction $\pi/6$ with a width of about 20 pixels. You can build a Gabor filter with $P_0 \approx 20, \theta_0 = \pi/6, \Delta f = 2, \Delta \theta = \pi/2$. TAKE CARE: there is no exact relation between the size of your edge and the $P_0$ parameter, so you should try various values and see what works best. As for the other parameters ($\Delta f, \Delta \theta$), touch them only once you have some experience tuning the first two.
Hope it helps!
Cristi
Update
Here is a site which allows you to play a bit with Gabor parameters and note the results: http://www.cogsci.nl/pages/gabor-generator.php
After a quick look, both the frequency and angle bandwidths are tied together there as "Standard Deviation in pixels ... to the Gauss envelope". The rest of the parameters are easily identified. However, note that there are numerical differences! E.g. the phase can be expressed in the 0-1 interval or the 0-$\pi$ interval.