I am using this link to study Generalized Hough Transform. I will explain it briefly. In order to detect a shape, an R-table is created for the shape with $\phi$ indicating gradient orientation for points on the edge of the shape. Then $(r, \beta)$ is used to indicate the vector from each of these points to the centroid in the shape. While detecting the shape, we will first find all the edge points and their gradient orientation $(\phi)$. Based on this $\phi$, $(r, \beta)$ values are used to calculate centroid points. Based on the votes on centroid points (and R table), the shape can be detected.
My question is related to rotation of the shape which is explained towards the end. If the shape is rotated by $\theta$, then the gradient orientation $(\phi)$ for a given edge point changes. So, shouldn't we do either one of the following:
1) Rotate all the $\phi$ values in R table by $\theta$?
OR
2) Rotate gradient vector by $(-\theta)$ and then calculate $\phi$ for the edge point?
Answer
I was having the same issue as you had: most resources I found online seem to miss this issue about the rotating gradient orientation, as did A_A in their answer (which to be fair is very clear in the other aspects of the GHT).
I found that the original paper by Ballard addresses this very thing in section 4.5 (the emphasis is mine):
We denote a particular $R$-table for a shape $S$ by $R(\phi)$. $R$ can be viewed as a multiply-vector-valued function. It is easy to see that simple transformations to this table will allow it to detect scaled or rotated instances of the same shape.
For example if the shape is scaled by $s$ and this transformation is denoted by $T_s$, then $T_s[R(\phi)] = sR(\phi)$ i.e.,all the vectors are scaled by $s$. Also, if the object is rotated by $\theta$ and this transformation is denoted by $T_\theta$, then $T_\theta[R(\phi)] = Rot\{R[(\phi-\theta)\ (mod\ 2\pi)],\ \theta\}$ i.e., all the indices are incremented by $-\theta$ modulo $2\pi$, the appropriate vectors $r$ are found, and then they are rotated by $\theta$.
So I am pretty sure that your option (1) is a correct way of going about this: when varying $\theta$ you index an offset-by-$\theta$ entry on the $R$-table.
A disclaimer is in order: I am no expert on the matter. In any case, it seems pretty clear to me from the paper, but perhaps I'm missing some invariance which lets us ignore the offset. Anyway, hope this helps.
Edit: Indeed, I just finished implementing the GHT and it won't work properly without the $\theta$-offset.
This first image shows the result without applying the offset before entering the table (the red crosses on the left indicate detected centroids corresponding with the object on the right). As you can see, the rotated object is not detected (the false positive on the lower-left side disappears if I use a stricter threshold (i.e, minimum amount of votes)).
The following results from offsetting by $-\theta$ before retrieving the $R$-table entries (and is also resistant to a higher threshold, meaning it's not just a translation of the false positive):
No comments:
Post a Comment