I basicaly understood how the Rubberband/Baseline-correction works.
- The given spectrum is divided into (N) ranges.
- The lowest points in every range are determined.
- The initial baseline is built out of those points.
- Now all the points on the spectrum are drawn down by the difference between the lowest point in the current range and the lowest point on the baseline.
There are some nuances, though, that I do not know how to handle. E.g., what if one of the points is exactly on the border between two ranges, etc.
Plus, I have to be able to prove that the algorithm that I am writing is a solid one and can be referenced by other works or scientific papers.
If anyone could give me some reference I would be very pleased.
This can be easily done in R
or Python
. There are well-tested functions available, so you don't have to worry about any boundaries or nuances. Moreover, both are free and popular among scientists.
There is a special package to handle spectral data, called hyperSpec. The rubberband baseline correction is already implemented there (function spc.rubberband
). All details are highlighted in the documentation. The usage is pretty simple:
require(hyperSpec)
spc <- read.spe("paracetamol.SPE")
baseline <- spc.rubberband(spc)
corrected <- spc - baseline
There is no (to the best of my knowledge) out-of-the box solution for python, but you can use scipy.spatial.ConvexHull
function to find indices of all points that form a convex hull around your spectrum. Suppose that the spectrum is contained in x
and y
arrays:
import numpy as np
from scipy.spatial import ConvexHull
def rubberband(x, y):
# Find the convex hull
v = ConvexHull(np.array(zip(x, y))).vertices
Array v
contains indices of the vertex points, arranged in the CCW direction, e.g. [892, 125, 93, 0, 4, 89, 701, 1023]
. We have to extract part where v
is ascending, e.g. 0–1023.
# Rotate convex hull vertices until they start from the lowest one
v = np.roll(v, -v.argmin())
# Leave only the ascending part
v = v[:v.argmax()]
# Create baseline using linear interpolation between vertices
return np.interp(x, x[v], y[v])
Now the baseline is corrected like this:
y = y - rubberband(x, y)