Given a filter used to shape the digital signal, $p(x)$, and given that we do not want the filter combination to cause any ISI, what "matched" filter, $q(x)$, will maximise the SNR?
Matched filters are used in digital communications to maximize the signal-to-noise ratio (SNR). Often a root-raised-cosine filter is used to shape the signal, since it is band-limited and the same filter can be applied to the received signal to improve the SNR without causing inter-symbol interference (ISI).
However, if a less optimal filter is used to shape the signal, then using the same filter at the receiver can introduce ISI. It is not immediately obvious what the best choice of filter at the receiving end is.
My understanding is that the SNR is maximized by maximizing $\int p(x)q(x)\,dx$, so I want to maximize this while satisfying the constraint that the filters cause no ISI: $(p*q)(kT) = 0$ for every nonzero integer $k$, where $T$ is the symbol period.
Presumably one could do this by solving an Euler-Lagrange equation with some Lagrange multipliers for the constraints. Is there an easier way, or am I making a mistake, or going in the wrong direction?
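For the case of linear modulation on the AWGN channel with equiprobable symbols (a very common case), the optimum approach is to use a filter that is truly matched to the symbol waveform. For a real pulse that is symmetric in time (in general the matched filter is the time-reversed conjugate, $q(x) = p^*(-x)$), that is simply:
$$ q(x) = p(x) $$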
Using a matched filter provides the optimum signal-to-noise ratio at the filter output at each decision instant. This is easy to see when you remember that a matched filter acts like a sliding cross-correlator between its input signal and the expected symbol waveform, correlating the two at all possible lags. At the optimum decision instants, the filter's impulse response (typically scaled to have unit energy) lines up exactly with a transmitted symbol, analogous to the zero-lag condition of the cross-correlation. At that instant, the filter's output is a data-dependent multiple of the correlation between the received symbol and the filter (e.g. for BPSK with symbol energy $E_s$, an unnormalized matched filter outputs $E_s$ or $-E_s$, and a unit-energy one outputs $\sqrt{E_s}$ or $-\sqrt{E_s}$), plus a noise term.
The noise power at the filter output at the sampling instant does not depend on the time-domain shape of the filter's impulse response, only on the impulse response's total energy (as noted previously, typically unity). Therefore, the signal-to-noise ratio is maximized by maximizing the signal component of the filter output at the sampling instant. By choosing the receiver filter to be matched to the symbol shape, we have done so: by the Cauchy-Schwarz inequality, of all impulse responses with a given energy, the one with the same shape as the symbol waveform has the largest correlation with it. Thus, the matched filter provides the maximum SNR for the AWGN channel case.
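If you'd like to convince yourself numerically, here is a minimal sketch in Python with NumPy. The truncated Gaussian pulse, the 8 samples per symbol, the competing filters, and the helper name `output_snr` are all my own illustrative assumptions, not anything from a particular system. It compares the zero-lag output SNR of a unit-energy matched template against two other unit-energy templates in white noise.

```python
import numpy as np

def output_snr(p, q, noise_var=1.0):
    """SNR at the zero-lag sampling instant for pulse p and template q.

    Signal sample: the zero-lag correlation sum(p * q).
    Noise variance: noise_var * sum(q**2), which holds for white noise.
    """
    signal = np.dot(p, q)
    noise = noise_var * np.dot(q, q)
    return signal**2 / noise

# Hypothetical pulse: a truncated Gaussian, 8 samples per symbol.
t = np.arange(-16, 17) / 8.0
p = np.exp(-t**2 / (2 * 0.4**2))

# Three unit-energy templates: matched, rectangular, and random.
matched = p / np.linalg.norm(p)
rect = np.ones_like(p) / np.sqrt(p.size)
rng = np.random.default_rng(0)
random_q = rng.standard_normal(p.size)
random_q /= np.linalg.norm(random_q)

snr_matched = output_snr(p, matched)
snr_rect = output_snr(p, rect)
snr_random = output_snr(p, random_q)
print(snr_matched, snr_rect, snr_random)
```

Since every template has the same energy, the noise term is identical in all three cases and only the correlation with the pulse differs, so the matched template cannot lose; its SNR works out to the pulse energy divided by the noise variance.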
With that bout of hand-waving out of the way (you can definitely get at it with more mathematical rigor, but I'm an engineer and this is a free service; if you want to dig into the details, check any digital communication theory text), you might be thinking that I forgot that you asked about the non-ideal, ISI case. Fear not, for I assert that if you know the transmitted pulse shape, the matched filter is still the optimum choice for the AWGN channel.
The key: if you know the responses of the pulse-shaping and receiver detection filters $p(x)$ and $q(x)$, and the last "few" transmitted symbols, you can calculate what the ISI induced by those previous symbols would be and account for it accordingly; it is a deterministic quantity. The amount of symbol history that you require is related to the amount of ISI you have, i.e. how many symbol periods the cascaded filter response smears across.
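To make "deterministic quantity" concrete, here is a small sketch under some loud assumptions: the symbol-spaced taps `h` of the cascade $p*q$ are made-up numbers, the response is treated as causal (only earlier symbols interfere), the symbols are BPSK, and there is no noise. The helper name `isi_from_history` is hypothetical too.

```python
import numpy as np

# Hypothetical symbol-spaced samples of the cascade p*q:
# h[0] is the desired tap, h[1:] smear earlier symbols into the current one.
h = np.array([1.0, 0.5, 0.2])

def isi_from_history(h, previous):
    """ISI contributed by earlier symbols; previous[0] is the most recent."""
    previous = np.asarray(previous, dtype=float)
    return float(np.dot(h[1:1 + previous.size], previous))

symbols = np.array([1, -1, -1, 1, 1], dtype=float)   # BPSK symbols
received = np.convolve(symbols, h)[:symbols.size]     # noiseless filter output

n = 4
history = symbols[n - 1::-1][:h.size - 1]   # a[n-1], a[n-2]
isi = isi_from_history(h, history)
corrected = received[n] - isi               # equals h[0] * symbols[n]
print(isi, corrected)
```

The history only needs `len(h) - 1` symbols here, which is the "few" symbols mentioned above: one per symbol period that the cascaded response smears across.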
Of course, you typically don't know with certainty what the previous few symbols were; if you did, then you might be at a high enough SNR that your ISI can be neglected. In the more interesting case, you can't make that assumption. Instead, a maximum-likelihood sequence detection approach is employed using the Viterbi algorithm. This process is referred to as Viterbi equalization, because in this model you treat the ISI induced by the pulse shape like a soft-valued convolutional code that is applied to your transmit waveform. The time duration of the ISI in the Viterbi equalizer defines the required number of algorithm states, similar to the constraint length in a convolutional code.
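For a feel of what that looks like, here is a toy sketch of maximum-likelihood sequence detection with the Viterbi algorithm. It is not how any particular receiver implements it; the function name `viterbi_equalize`, the ISI taps, the BPSK alphabet, the noise level, and the assumption that nothing was transmitted before the first symbol are all mine.

```python
import numpy as np

def viterbi_equalize(received, h, alphabet=(-1.0, 1.0)):
    """MLSE of symbols from received = convolve(symbols, h) + noise.

    Assumes the symbols before the first one are zero (no prior ISI).
    A state is the tuple of the last len(h)-1 symbols, most recent first.
    """
    memory = len(h) - 1
    start = (0.0,) * memory
    # survivors maps state -> (accumulated squared error, decided symbols)
    survivors = {start: (0.0, [])}
    for r in received:
        candidates = {}
        for state, (metric, path) in survivors.items():
            for a in alphabet:
                # Noiseless sample this branch would produce.
                expected = h[0] * a + sum(hk * s for hk, s in zip(h[1:], state))
                new_metric = metric + (r - expected) ** 2
                new_state = ((a,) + state)[:memory]
                best = candidates.get(new_state)
                if best is None or new_metric < best[0]:
                    candidates[new_state] = (new_metric, path + [a])
        survivors = candidates
    best_metric, best_path = min(survivors.values(), key=lambda v: v[0])
    return np.array(best_path)

h = np.array([1.0, 0.5, 0.2])        # hypothetical symbol-spaced ISI taps
rng = np.random.default_rng(1)
symbols = rng.choice([-1.0, 1.0], size=20)
received = np.convolve(symbols, h)[:symbols.size]
received = received + 0.1 * rng.standard_normal(symbols.size)
decoded = viterbi_equalize(received, h)
print(np.array_equal(decoded, symbols))
```

With two ISI taps beyond the main one and a binary alphabet, the trellis settles at $2^2 = 4$ states. Each extra symbol period of ISI doubles that, which is why the duration of the ISI drives the complexity.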
This approach is often used in systems that have the non-optimum pulse shape that you noted; one notable example is GSM (which uses a Gaussian pulse shape that extends across multiple symbol intervals). One great reference on this topic was published by Sklar in 2003:
B. Sklar, “How I learned to love the trellis,” IEEE Signal Processing Magazine, pp. 87-102, May 2003.