Preston Bohm
spectroscopy signal processing

Matched Filters in Spectroscopy: Pulling Weak Lines Out of Noisy Spectra

[Interactive figure: Spectral Matched-Filter Explorer for the model x = αg + Bβ + n. Controls: true line depth α (0.020 AU), detector noise σ (0.0040 AU), lamp drift and baseline curvature (0.50×), template mismatch shift (0.0 nm), and ground-truth shape. Panels: ① measured spectrum x(λ) over 400 wavelength channels, with the fitted model α̂g + Bβ̂; ② spectral template g(λ) (doublet) as fitted, including mismatch; ③ residuals x − α̂g − Bβ̂ with a ±1σ band; ④ detection statistic z(λ₀) = α̂/σα̂ with the template scanned across the spectrum and a 5σ threshold. Readouts: α̂ at line center, σα̂, z = α̂/σα̂, and residual RMS/σ.]

The explorer above simulates a weak absorption feature buried in noise on a slowly varying baseline. Crank up the lamp drift to watch the generalized least-squares (GLS) estimator absorb the baseline while the line holds its significance, use the mismatch slider to see template errors bias the amplitude, and toggle between doublet and Lorentzian shapes to see how line geometry changes the response.

The problem: weak features on ugly backgrounds

A recurring task in absorption and emission spectroscopy is deciding whether a known feature is present in a spectrum and, if so, how strong it is. The line shape is known from laboratory reference data, which supply a spectral template, but the feature is weak: a few milli-absorbance units against comparable detector noise. Worse, real spectra never sit on a flat baseline. Lamp intensity drifts between reference and sample, detector response varies with wavelength, and stray light and unresolved broadband absorbers add baseline curvature. These nuisance backgrounds are typically far larger than the signal. This is the everyday situation in differential optical absorption spectroscopy (DOAS) [6] and in any instrument hunting for small lines.

Write the measured spectrum (in absorbance, the negative log of transmission against a reference) as a vector over N wavelength channels:

\[ \mathbf{x} \;=\; \alpha\,\mathbf{g} \;+\; \mathbf{B}\boldsymbol{\beta} \;+\; \mathbf{n} \]

Here \( \mathbf{g} \in \mathbb{R}^N \) is the template (the expected absorbance shape, unit peak depth) and \( \alpha \) the unknown amplitude, proportional to column density or concentration via Beer–Lambert. The matrix \( \mathbf{B} \in \mathbb{R}^{N \times m} \) collects the nuisance terms: a column of ones for the offset, a slope, quadratic curvature, and low-order shapes (polynomials or splines) for lamp drift or detector response. The vector \( \mathbf{n} \) is detector noise with covariance \( \mathbf{C} = \mathbb{E}[\mathbf{n}\mathbf{n}^\top] \), covering shot noise, read noise, and channel-to-channel correlations from the readout or prior smoothing.
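As a concrete illustration of the model, here is a minimal NumPy sketch that builds one synthetic spectrum. The wavelength grid, the Gaussian line shape, the baseline coefficients, and the white-noise assumption are all made up for the example; only the channel count, line depth, and noise level echo the explorer's defaults.

```python
import numpy as np

rng = np.random.default_rng(0)             # fixed seed for reproducibility

N = 400                                    # wavelength channels
lam = np.linspace(400.0, 500.0, N)         # hypothetical wavelength grid, nm

# Template g: a Gaussian line with (near) unit peak depth. Illustrative shape only.
g = np.exp(-0.5 * ((lam - 450.0) / 1.0) ** 2)

# Nuisance matrix B: offset, slope, and quadratic curvature on a scaled axis.
u = (lam - lam.mean()) / (lam[-1] - lam[0])
B = np.column_stack([np.ones(N), u, u**2])

alpha_true = 0.020                          # line depth, AU
beta_true = np.array([0.10, 0.05, -0.08])   # made-up baseline coefficients, AU
sigma = 0.0040                              # detector noise per channel, AU

# Simplest noise case: white, equal variance, so C = sigma^2 I.
n = rng.normal(0.0, sigma, N)

# Measured absorbance spectrum: x = alpha g + B beta + n
x = alpha_true * g + B @ beta_true + n
```

Scaling the wavelength axis before forming the polynomial columns keeps B well conditioned; the raw wavelengths in nm would work in principle but make the normal equations numerically less comfortable.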

The matched filter as a noise-weighted projection

Forget the baseline for a moment (\( \boldsymbol{\beta} = 0 \)) and ask for the best linear estimate of \( \alpha \). Generalized least squares gives the minimum-variance unbiased answer [3], [5]:

\[ \hat{\alpha} \;=\; \frac{\mathbf{g}^\top \mathbf{C}^{-1} \mathbf{x}}{\mathbf{g}^\top \mathbf{C}^{-1} \mathbf{g}}, \qquad \operatorname{var}(\hat{\alpha}) \;=\; \frac{1}{\mathbf{g}^\top \mathbf{C}^{-1} \mathbf{g}} \]

This is the matched filter in honest form: project the data onto the template, weighting every channel by the inverse noise covariance. Low-noise channels count more, and correlated noise patterns are rotated away before the projection. The classical result that the optimal detector is a "time-reversed conjugate copy of the signal" [1], [4] is exactly this inner product written as a convolution.
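The estimator and its variance translate almost line for line into NumPy. The sketch below assumes a Gaussian template and a hypothetical diagonal covariance whose noise level rises across the detector; the function name `matched_filter` is illustrative, not taken from any library.

```python
import numpy as np

def matched_filter(x, g, C):
    """GLS amplitude estimate and its 1-sigma uncertainty (no baseline terms)."""
    w = np.linalg.solve(C, g)      # C^{-1} g, without forming the inverse
    info = g @ w                   # g^T C^{-1} g
    alpha_hat = (w @ x) / info     # g^T C^{-1} x / g^T C^{-1} g (C symmetric)
    return alpha_hat, 1.0 / np.sqrt(info)

rng = np.random.default_rng(1)
N = 400
lam = np.linspace(400.0, 500.0, N)                 # hypothetical grid, nm
g = np.exp(-0.5 * ((lam - 450.0) / 1.0) ** 2)      # illustrative line shape

# Assumed noise model: per-channel sigma rising linearly across the detector.
sig = np.linspace(0.003, 0.006, N)
C = np.diag(sig**2)

x = 0.020 * g + rng.normal(0.0, sig)               # beta = 0: flat baseline
alpha_hat, sigma_alpha = matched_filter(x, g, C)
print(f"alpha_hat = {alpha_hat:.4f} +/- {sigma_alpha:.4f} AU")
```

Solving the linear system instead of inverting C is the usual numerical choice; for a purely diagonal C the same weights can be obtained by dividing by the per-channel variances.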

The textbook simplification

\[ \hat{\alpha} \;=\; \frac{\mathbf{g}^\top \mathbf{x}}{\mathbf{g}^\top \mathbf{g}} \]

drops \( \mathbf{C}^{-1} \), legitimate only when the noise is white with equal variance in every channel, \( \mathbf{C} = \sigma^2 \mathbf{I} \), so the weighting cancels. For a real spectrometer, where shot noise scales with signal level, hot pixels exist, and the readout correlates neighboring channels, the plain dot product is no longer optimal and its quoted uncertainty is wrong. It stays unbiased for the signal but blind to a bigger problem: the baseline.
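To put numbers on "no longer optimal" and "quoted uncertainty is wrong", the sketch below compares three variances for a hypothetical detector with a few noisy channels under the line: the variance one would quote assuming white noise, \( \sigma^2 / \mathbf{g}^\top\mathbf{g} \); the actual variance of the plain dot product, \( \mathbf{g}^\top \mathbf{C}\,\mathbf{g} / (\mathbf{g}^\top\mathbf{g})^2 \); and the GLS variance \( 1 / \mathbf{g}^\top \mathbf{C}^{-1} \mathbf{g} \). The noise levels and the choice of hot channels are assumptions for illustration.

```python
import numpy as np

N = 400
lam = np.linspace(400.0, 500.0, N)                 # hypothetical grid, nm
g = np.exp(-0.5 * ((lam - 450.0) / 1.0) ** 2)      # illustrative line shape

# Assumed noise: nominal 0.004 AU, with three "hot" channels at line center.
sigma_nominal = 0.0040
sig = np.full(N, sigma_nominal)
hot = np.argsort(g)[-3:]                           # channels where g is largest
sig[hot] = 0.0200

gg = g @ g
var_quoted = sigma_nominal**2 / gg                 # what the white-noise formula claims
var_plain = (g @ (sig**2 * g)) / gg**2             # true variance of g^T x / g^T g
var_gls = 1.0 / (g @ (g / sig**2))                 # variance of the weighted estimator

print(f"quoted sigma (white assumption): {np.sqrt(var_quoted):.5f} AU")
print(f"actual sigma, plain dot product: {np.sqrt(var_plain):.5f} AU")
print(f"actual sigma, GLS weighting:     {np.sqrt(var_gls):.5f} AU")
```

The plain dot product can never beat the weighted estimator (a consequence of the Cauchy-Schwarz inequality), and in this setup its quoted error bar is too small because the hot channels sit where the template weight is largest.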

Nuisance backgrounds and generalized least squares

The naive dot product treats everything in \( \mathbf{x} \) overlapping the template as signal. A sloped or curved baseline overlaps every template at some level, so lamp drift leaks into \( \hat{\alpha} \) as bias, visible in the explorer as structured residuals when you raise the drift slider. The fix is not to subtract a hand-drawn baseline first but to estimate signal and baseline jointly. Stack the template and nuisance columns into one design matrix \( \mathbf{A} = [\,\mathbf{g} \;\; \mathbf{B}\,] \) and solve the generalized least-squares problem [3]:

\[ \begin{bmatrix} \hat{\alpha} \\ \hat{\boldsymbol{\beta}} \end{bmatrix} \;=\; \left( \mathbf{A}^\top \mathbf{C}^{-1} \mathbf{A} \right)^{-1} \mathbf{A}^\top \mathbf{C}^{-1} \mathbf{x} \]
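A minimal sketch of the joint solve, set beside the naive dot product for comparison. The helper name `gls_fit`, the Gaussian template, the baseline coefficients, and the white-noise covariance are all assumptions made for the example.

```python
import numpy as np

def gls_fit(x, A, C):
    """Solve (A^T C^-1 A) theta = A^T C^-1 x; return theta and its covariance."""
    CiA = np.linalg.solve(C, A)                 # C^{-1} A
    F = A.T @ CiA                               # A^T C^{-1} A
    theta = np.linalg.solve(F, CiA.T @ x)       # CiA.T @ x = A^T C^{-1} x
    return theta, np.linalg.inv(F)

rng = np.random.default_rng(2)
N = 400
lam = np.linspace(400.0, 500.0, N)              # hypothetical grid, nm
g = np.exp(-0.5 * ((lam - 450.0) / 1.0) ** 2)   # illustrative line shape
u = (lam - lam.mean()) / (lam[-1] - lam[0])
B = np.column_stack([np.ones(N), u, u**2])      # offset, slope, curvature

alpha_true, sigma = 0.020, 0.0040
beta_true = np.array([0.10, 0.05, -0.08])       # made-up baseline, AU
C = sigma**2 * np.eye(N)
x = alpha_true * g + B @ beta_true + rng.normal(0.0, sigma, N)

A = np.column_stack([g, B])                     # design matrix [g  B]
theta, cov = gls_fit(x, A, C)
alpha_gls, sigma_alpha = theta[0], np.sqrt(cov[0, 0])

alpha_naive = (g @ x) / (g @ g)                 # ignores the baseline entirely

print(f"naive dot product: {alpha_naive:.4f} AU")
print(f"joint GLS:         {alpha_gls:.4f} +/- {sigma_alpha:.4f} AU")
```

With these made-up numbers the naive estimate is dominated by the baseline offset leaking through the template, while the joint fit should land within its quoted uncertainty of the true depth.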

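An equivalent, illuminating form: first project both data and template onto the subspace orthogonal to the baseline columns (orthogonal in the \( \mathbf{C}^{-1} \)-weighted inner product, which reduces to the ordinary one for white noise), giving the projected vectors \( \tilde{\mathbf{x}}, \tilde{\mathbf{g}} \), then matched-filter what is left: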

\[ \hat{\alpha} \;=\; \frac{\tilde{\mathbf{g}}^\top \mathbf{C}^{-1} \tilde{\mathbf{x}}}{\tilde{\mathbf{g}}^\top \mathbf{C}^{-1} \tilde{\mathbf{g}}} \]

In words: only the part of the template that cannot be mimicked by offset, slope, curvature, or lamp drift carries information about the line. This is the "differential" in DOAS: broadband structure goes to the polynomial, and only the narrow differential structure of the cross-section is used for quantification [6]. The nuisance model costs a modest variance increase, \( \tilde{\mathbf{g}}^\top \mathbf{C}^{-1} \tilde{\mathbf{g}} \le \mathbf{g}^\top \mathbf{C}^{-1} \mathbf{g} \), since some template energy is sacrificed to the baseline subspace. That trade is almost always worth it.
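The equivalence and the variance inequality can be checked numerically. The sketch below takes the projection in the \( \mathbf{C}^{-1} \)-weighted sense, \( \mathbf{P} = \mathbf{I} - \mathbf{B}(\mathbf{B}^\top \mathbf{C}^{-1} \mathbf{B})^{-1} \mathbf{B}^\top \mathbf{C}^{-1} \), which is what makes the two forms agree when the noise is not white. The template, baseline, and noise model are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(3)
N = 400
lam = np.linspace(400.0, 500.0, N)                 # hypothetical grid, nm
g = np.exp(-0.5 * ((lam - 450.0) / 1.0) ** 2)      # illustrative line shape
u = (lam - lam.mean()) / (lam[-1] - lam[0])
B = np.column_stack([np.ones(N), u, u**2])         # offset, slope, curvature

# Assumed heteroscedastic noise, so the weighting actually matters.
sig = np.linspace(0.003, 0.006, N)
Ci = np.diag(1.0 / sig**2)                         # C^{-1} for diagonal C

x = 0.020 * g + B @ np.array([0.10, 0.05, -0.08]) + rng.normal(0.0, sig)

# Weighted projector that removes everything the baseline columns can explain.
P = np.eye(N) - B @ np.linalg.solve(B.T @ Ci @ B, B.T @ Ci)
g_t, x_t = P @ g, P @ x

info_proj = g_t @ Ci @ g_t                         # g~^T C^{-1} g~
alpha_proj = (g_t @ Ci @ x_t) / info_proj

# Joint solve over A = [g  B] for comparison.
A = np.column_stack([g, B])
F = A.T @ Ci @ A
alpha_joint = np.linalg.solve(F, A.T @ Ci @ x)[0]
var_joint = np.linalg.inv(F)[0, 0]

info_plain = g @ Ci @ g                            # g^T C^{-1} g, no baseline
print(f"alpha: projected {alpha_proj:.5f}, joint {alpha_joint:.5f}")
print(f"information kept after projection: {info_proj / info_plain:.3f}")
```

The printed ratio is the fraction of template "energy" that survives the baseline projection; a narrow line against a low-order polynomial keeps most of it, which is why the variance penalty is modest.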

Detection: the test statistic

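Estimation and detection are two views of the same projection. Under the no-signal hypothesis, with Gaussian noise and the baseline fitted jointly, \( \hat{\alpha} \) is zero-mean Gaussian with variance \( 1 / (\tilde{\mathbf{g}}^\top \mathbf{C}^{-1} \tilde{\mathbf{g}}) \), so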

\[ z \;=\; \frac{\hat{\alpha}}{\sigma_{\hat{\alpha}}}, \qquad \sigma_{\hat{\alpha}} = \left( \tilde{\mathbf{g}}^\top \mathbf{C}^{-1} \tilde{\mathbf{g}} \right)^{-1/2} \]

is a standard normal score, and thresholding \( z \) is the generalized likelihood-ratio test for this model [2]. Panel ④ computes \( z \) with the template re-centered at every candidate wavelength \( \lambda_0 \), a matched-filter scan; a genuine line produces a sharp peak at its true position. The expected significance of a line of depth \( \alpha \) is \( \mathbb{E}[z] = \alpha \sqrt{ \tilde{\mathbf{g}}^\top \mathbf{C}^{-1} \tilde{\mathbf{g}} } \): deeper lines, more channels, and quieter detectors all help just as intuition says.
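A matched-filter scan of this kind can be sketched in a few lines: refit the line-plus-baseline model with the template centered on each channel in turn and record \( z \). The code below assumes white noise, a Gaussian template, and a quadratic baseline; it is a sketch of the idea, not the explorer's implementation, and the helper names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(4)
N = 400
lam = np.linspace(400.0, 500.0, N)                 # hypothetical grid, nm
u = (lam - lam.mean()) / (lam[-1] - lam[0])
B = np.column_stack([np.ones(N), u, u**2])         # offset, slope, curvature
alpha_true, sigma = 0.020, 0.0040

def template(center, width=1.0):
    """Illustrative Gaussian line shape centered at `center` (nm)."""
    return np.exp(-0.5 * ((lam - center) / width) ** 2)

def fit_line(x, g, B, sigma):
    """Joint least-squares fit for white noise; returns (alpha_hat, sigma_alpha)."""
    A = np.column_stack([g, B])
    cov = sigma**2 * np.linalg.inv(A.T @ A)        # (A^T C^-1 A)^-1 with C = sigma^2 I
    theta = cov @ (A.T @ x) / sigma**2
    return theta[0], np.sqrt(cov[0, 0])

x = (alpha_true * template(450.0)
     + B @ np.array([0.10, 0.05, -0.08])           # made-up baseline, AU
     + rng.normal(0.0, sigma, N))

# Scan: re-center the template on every channel and compute z there.
z = np.array([a / s for a, s in (fit_line(x, template(c), B, sigma) for c in lam)])

peak = lam[np.argmax(z)]
detected = z.max() > 5.0                           # 5-sigma threshold

# Expected significance at the true position: alpha * sqrt(g~^T C^-1 g~).
expected_z = alpha_true / fit_line(x, template(450.0), B, sigma)[1]
print(f"peak z = {z.max():.1f} at {peak:.2f} nm (expected about {expected_z:.1f})")
```

Scanning many candidate positions means many chances for noise to cross the threshold, which is one practical reason for a threshold as conservative as 5σ.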

Limitations

References

  1. G. L. Turin, "An introduction to matched filters," IRE Trans. Inf. Theory, vol. 6, no. 3, pp. 311–329, Jun. 1960, doi: 10.1109/TIT.1960.1057571.
  2. S. M. Kay, Fundamentals of Statistical Signal Processing, Volume II: Detection Theory. Upper Saddle River, NJ, USA: Prentice-Hall, 1998.
  3. S. M. Kay, Fundamentals of Statistical Signal Processing, Volume I: Estimation Theory. Englewood Cliffs, NJ, USA: Prentice-Hall, 1993.
  4. D. O. North, "An analysis of the factors which determine signal/noise discrimination in pulsed-carrier systems," Proc. IEEE, vol. 51, no. 7, pp. 1016–1027, Jul. 1963, doi: 10.1109/PROC.1963.2383.
  5. A. C. Aitken, "On least squares and linear combination of observations," Proc. Roy. Soc. Edinburgh, vol. 55, pp. 42–48, 1936, doi: 10.1017/S0370164600014346.
  6. U. Platt and J. Stutz, Differential Optical Absorption Spectroscopy: Principles and Applications. Berlin, Germany: Springer, 2008, doi: 10.1007/978-3-540-75776-4.
  7. C. D. Rodgers, Inverse Methods for Atmospheric Sounding: Theory and Practice. Singapore: World Scientific, 2000, doi: 10.1142/3171.