I have found algorithms that seem the same to me, but they are described under different names (in the field of adaptive filtering).
For example:
LMS (least mean squares) seems to be the same as GD (stochastic gradient descent).
Stochastic gradient descent is often called just gradient descent, which seems to be something different (but still similar) according to Wikipedia.
The extension of GD (or LMS) with normalization always seems to be called NLMS; why not NGD?
Why are different names used? Are the algorithms really the same thing, or am I wrong?
Answer
The LMS algorithm is based on the idea of gradient descent to search for the optimal (minimum error) condition, with a cost function equal to the mean squared error at the filter output. However, it doesn't actually calculate the gradient directly, as that would require knowing:
$$ E(\mathbf{x}[n]e^*[n]) $$
where $\mathbf{x}[n]$ is the input vector and $e[n]$ is the filter output error at time instant $n$. The value of this expectation would typically be something that isn't known ahead of time, so it is instead approximated (in most cases) by the instantaneous value $\mathbf{x}[n]e^*[n]$. This approximation (described in more detail at Wikipedia) is the key feature of the LMS filter structure.
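Written out, with a step size $\mu$ and coefficient vector $\mathbf{w}[n]$ (symbols introduced here only for illustration, not part of the original answer), the two updates are:

$$ \mathbf{w}[n+1] = \mathbf{w}[n] + \mu\,E(\mathbf{x}[n]e^*[n]) \qquad \text{(true steepest descent)} $$

$$ \mathbf{w}[n+1] = \mathbf{w}[n] + \mu\,\mathbf{x}[n]e^*[n] \qquad \text{(LMS, instantaneous approximation)} $$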
Gradient descent just refers to the method used to hunt for the minimum-cost solution; it doesn't force the use of any particular cost function. An LMS filter is a specialization of gradient descent that uses a mean-squared error cost function and the aforementioned approximation for the gradient at each time step.
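In that general form, with some cost function $J(\mathbf{w})$ and step size $\mu$ (again, notation assumed here for illustration), gradient descent is just

$$ \mathbf{w}[n+1] = \mathbf{w}[n] - \mu\,\nabla_{\mathbf{w}} J(\mathbf{w}[n]) $$

and choosing $J(\mathbf{w}) = E(|e[n]|^2)$ together with the instantaneous gradient estimate above is what turns it into LMS.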
Stochastic gradient descent would refer to techniques that don't directly compute the gradient, instead using an approximation to the gradient that has similar stochastic properties (e.g. the same expected value). LMS would be an example of an algorithm that uses a SGD approach, using the approximation I described above.
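As a concrete illustration of that approximation in code, here is a minimal real-valued LMS sketch in Python/NumPy. The function name `lms_filter`, the step size `mu`, and the system-identification usage below are assumptions made for this example, not anything specified in the answer:

```python
import numpy as np

def lms_filter(x, d, num_taps, mu):
    """Adapt FIR coefficients w so that w . x_vec[n] tracks d[n] (real-valued LMS sketch)."""
    n_samples = len(x)
    w = np.zeros(num_taps)            # filter coefficients
    e = np.zeros(n_samples)           # error signal
    for n in range(num_taps - 1, n_samples):
        x_vec = x[n - num_taps + 1:n + 1][::-1]   # [x[n], x[n-1], ..., x[n-M+1]]
        y = np.dot(w, x_vec)                      # filter output
        e[n] = d[n] - y                           # instantaneous error
        # LMS: the instantaneous product x_vec * e[n] stands in for E(x[n] e*[n])
        w = w + mu * e[n] * x_vec
        # The NLMS variant would instead normalize the step by the input energy:
        # w = w + (mu / (np.dot(x_vec, x_vec) + 1e-8)) * e[n] * x_vec
    return w, e

# Example: identify an unknown 3-tap FIR system from noisy observations
rng = np.random.default_rng(0)
x = rng.standard_normal(5000)
h_true = np.array([0.5, -0.3, 0.1])
d = np.convolve(x, h_true)[:len(x)] + 0.01 * rng.standard_normal(len(x))
w_hat, e = lms_filter(x, d, num_taps=3, mu=0.05)
print(w_hat)   # should end up close to h_true after convergence
```

With a small enough step size the coefficients converge (in the mean) toward the optimal solution; NLMS simply normalizes the step by the instantaneous input power, which is where the "N" in the name comes from.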