Tuesday, January 30, 2018

lowpass filter - Differences between filtering and polynomial regression smoothing?


What are the differences between classical low-pass filtering (with an IIR or FIR filter) and "smoothing" by localized N-th degree polynomial regression and/or interpolation (in the case of upsampling), specifically when N is greater than 1 but less than the local number of points used in the regression fit?



Answer



Both low-pass filtering and polynomial regression smoothing could be seen as approximations of a function. However, the means of doing so are different. The key question to ask here is "Can you do one in terms of the other?", and the short answer is "not always", for reasons that are explained below.


When smoothing by filtering, the key operation is convolution, where $y(n)=x(n)*h(n)$, which in the frequency domain translates to $y=F^{-1}(F(x)F(h))$, where $F$ denotes the Discrete Fourier Transform (and $F^{-1}$ its inverse). The Discrete Fourier Transform (e.g. $F(x)$) offers an approximation of $x$ as a sum of trigonometric functions. When $h$ is a low-pass filter, only a small number of low-frequency components are retained and the abrupt changes in $x$ are smoothed out. This places low-pass filtering in the context of function approximation using trigonometric functions as the basis functions, but it is worth revisiting the convolution formula to note that, when filtering, $y(n)$ (the output of the filter) depends on $x(n)$ as well as on a weighted sum of past samples of $x$, with the weighting determined by the "shape" of $h$. (Similar considerations hold for IIR filters, of course, with the addition of past values of $y(n)$ as well.)
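To make this concrete, here is a minimal sketch in Python (NumPy/SciPy) of smoothing by FIR low-pass filtering. The test signal, the 31-tap kernel length and the cutoff of 0.1 (relative to Nyquist) are illustrative assumptions only, not values from the question:

```python
import numpy as np
from scipy.signal import firwin, lfilter

rng = np.random.default_rng(0)
n = np.arange(500)
x = np.sin(2 * np.pi * 0.01 * n) + 0.3 * rng.standard_normal(n.size)  # noisy signal

h = firwin(31, cutoff=0.1)   # low-pass FIR kernel h(n)
y = lfilter(h, [1.0], x)     # y(n) = x(n) * h(n): each output sample is a
                             # weighted sum of the current and past input samples
```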


When smoothing by some $n$-th degree polynomial, though, the output of the interpolant depends only on $x(n)$ and a mixture of (different) basis functions (also called monomials). What are these different basis functions? A constant ($a_0x^0$), a line ($a_1x$), a parabola ($a_2x^2$), and so on (please refer to this for a nice illustration). Usually though, when dealing with equidistant samples in time, and for reasons to do with numerical accuracy, what is used is Newton's form of the polynomial. The reason I am citing this is that through it, it is easy to see that when performing linear interpolation you could construct a filter kernel that returns a linearly weighted sum of available samples, just as a low-order interpolating polynomial would use "lines" to interpolate between two samples. But at higher degrees, the two approximation methods would return different results (due to the differences in the basis functions).
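As a sketch of local polynomial regression smoothing, here is the Savitzky-Golay filter (which fits a low-degree polynomial over a sliding window), followed by the same fit done "by hand" for a single window. The 11-sample window, the cubic degree and the window centred at sample 100 are arbitrary assumptions for illustration:

```python
import numpy as np
from scipy.signal import savgol_filter

rng = np.random.default_rng(0)
n = np.arange(500)
x = np.sin(2 * np.pi * 0.01 * n) + 0.3 * rng.standard_normal(n.size)

# Local polynomial regression smoothing over an 11-sample window, cubic fit
y_sg = savgol_filter(x, window_length=11, polyorder=3)

# The same idea for one window: fit a_0 + a_1*t + a_2*t^2 + a_3*t^3 to the
# 11 samples around index 100 and evaluate the fit at the window centre (t = 0)
t = np.arange(-5, 6)
coeffs = np.polyfit(t, x[100 - 5:100 + 6], deg=3)
y_centre = np.polyval(coeffs, 0)
```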



As I wrote above, the statement that past values of $x(n)$ are not taken into account is not strict. This is a subtle point. Usually, when building a polynomial, values outside the given interval (the "past" and "future" of the signal) are not considered. It is, however, possible to include them by fixing the derivatives at the edges of the interval, and if this is done repeatedly (like a non-overlapping sliding window) then, effectively, the "past samples" of $x(n)$ would be taken into account. (This is the trick that splines use and, in fact, there is a convolution expression for bicubic interpolation. However, please note here that the interpretation of $x$ is different when talking about splines; note the point about normalisation.)
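A minimal sketch of this idea with a cubic spline, where the matching of first and second derivatives at the knots is what couples each local cubic piece to its neighbours. The coarse sampling grid and test signal are assumed for illustration:

```python
import numpy as np
from scipy.interpolate import CubicSpline

n = np.arange(0, 20)               # coarse sample instants (the knots)
x = np.sin(2 * np.pi * 0.05 * n)   # coarse samples

cs = CubicSpline(n, x)             # piecewise cubics with matched 1st/2nd derivatives at knots
n_fine = np.linspace(0, 19, 191)
x_fine = cs(n_fine)                # upsampled / interpolated values
```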


The reason for sometimes using filtering as interpolation, say for instance in the case of "Sinc Interpolation", is that it also makes sense from a physical point of view. The idealised representation of a band-limited system (e.g. a (linear) amplifier or a lens in an optical system) in the time domain is the sinc pulse. The frequency-domain representation of a sinc pulse is a rectangular "pulse". Therefore, with very few assumptions, we expect a missing value to be more or less near its neighbours (within limits, of course). If this was performed with some $n$-th order polynomial (for higher $n$), then in a way we "fix" how a missing value is related to its neighbours, which might not always be realistic (why should the sound-pressure values of a wave-front hitting a microphone be constrained to have the shape of an $x^3$, for example? That puts an assumption on how the sound source behaves which might not always be true). Please note that I do not imply any suitability of an interpolation scheme from a psychophysics point of view here, which involves the processing of the brain (see Lanczos resampling, for example). I am strictly speaking about the constraints imposed by interpolation when one tries to "guess" missing values objectively.
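A minimal sketch of sinc interpolation of samples taken at integer instants: the reconstruction is a sum of sinc pulses, one centred on each sample. The test signal and the 10x denser evaluation grid are assumptions for illustration:

```python
import numpy as np

def sinc_interp(x, t):
    """Band-limited reconstruction of samples x (taken at n = 0, 1, 2, ...) at times t."""
    n = np.arange(len(x))
    # One sinc per sample: x_hat(t) = sum_k x[k] * sinc(t - k)
    return np.dot(x, np.sinc(t[None, :] - n[:, None]))

x = np.cos(2 * np.pi * 0.05 * np.arange(32))   # band-limited test signal
t = np.linspace(0, 31, 311)                    # 10x denser time grid
x_hat = sinc_interp(x, t)                      # interpolated values
```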


There is no universal "best method"; it pretty much depends on the interpolation problem you are faced with.


I hope this helps.


P.S. The artifacts generated by each of the two approximation methods are different as well; see for example the Gibbs phenomenon and overfitting, although overfitting is "at the other side" of your question.


