chemometrics.IHMRegression

class chemometrics.IHMRegression(features, peak_parameters, bl_order=2, spectra_generator=<function pseudo_voigt_spectra>, method='LG', gradient_truncation=20)

Bases: IHM

Indirect Hard Modeling (IHM) of spectra with OLS prediction

IHM models spectra based on a mechanistic description of multiple pure component spectra, each consisting of a set of peaks. The spectra are described by the peak parameters. For new spectra, the mechanistic spectral model is adjusted by parameter optimization. This makes it possible to correct for a variety of effects such as instrument-specific shifts or sensor variability. The weights of the optimized spectral parameter set are used for concentration predictions. Concentrations should be given as molalities. At low concentrations, the solute concentrations may be approximated as mass concentration, molarity, mole fraction, weight fraction, etc. (see Notes).

Note: The first component spectrum is always assumed to be the solvent. The predicted molality of the solvent is always constant and should, by definition, equal the inverse molar mass 1/M_W [mol/kg].
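As a quick sanity check on the constant solvent molality, the arithmetic can be sketched as follows. Water as the solvent and the helper name `solvent_molality` are assumptions made for illustration only; neither comes from the library.

```python
# Molar mass of water in kg/mol (18.015 g/mol). Water is an assumed example solvent.
M_W_WATER = 18.015e-3


def solvent_molality(molar_mass_kg_per_mol):
    """Moles of solvent per kilogram of solvent, i.e. 1 / M_W."""
    return 1.0 / molar_mass_kg_per_mol


b_water = solvent_molality(M_W_WATER)
print(f"solvent molality: {b_water:.2f} mol/kg")
```

For water this gives roughly 55.5 mol/kg, which is the value the solvent prediction should stay close to regardless of the solute content.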

Parameters

features : ndarray of shape (n_features, 1), default=None

Feature-related x variable

peak_parameters : list of ndarrays

List of peak parameter arrays

bl_order : int, default=2

Order of the background polynomial

spectra_generator : function, default=pseudo_voigt_spectra

Reference to the spectra-generating function

method : {‘LG’}, default=‘LG’

Algorithm for the spectral fit:
  • ‘LG’: largest gradient method as described in [EKriesten].

gradient_truncation : int, default=20

For peak parameter fitting, only step along the most important gradient directions, up to the number of directions given by gradient_truncation.
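The truncation described for gradient_truncation can be pictured as ranking the gradient entries by magnitude and keeping only the first few. The sketch below assumes exactly that and nothing more: `largest_gradient_indices` is a hypothetical helper, not a chemometrics function, and the library's actual step selection may differ.

```python
import numpy as np


def largest_gradient_indices(gradient, gradient_truncation=20):
    """Indices of the largest-magnitude gradient entries, largest first."""
    gradient = np.asarray(gradient, dtype=float)
    n_keep = min(gradient_truncation, gradient.size)
    order = np.argsort(-np.abs(gradient))  # sort by descending magnitude
    return order[:n_keep]


# Made-up gradient with five peak parameters; keep the three steepest directions.
grad = np.array([0.1, -3.0, 0.5, 2.0, -0.05])
selected = largest_gradient_indices(grad, gradient_truncation=3)
print(selected)
```

Only the parameters at the selected indices would then be updated, which keeps the number of simultaneously optimized peak parameters bounded.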

Attributes

n_components_ : int

Number of components in the model

linearized_breakpoints_ : ndarray

Vector indicating where the different sections of the linearized parameter vector end. Structure: (background parameters, component weights, component shifts, spectra parameters)

regressor_ : LinearRegression

Estimator which converts weights to concentrations
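To illustrate how a breakpoint vector of the kind stored in linearized_breakpoints_ could be used, the sketch below splits a flat parameter vector into the four sections named above. The section sizes are invented, and it is an assumption that an order-2 background contributes three coefficients and that the last breakpoint marks the end of the vector; the library's internal layout may differ.

```python
import numpy as np

# Hypothetical section sizes: background (order 2 -> 3 coefficients),
# 2 component weights, 2 component shifts, 6 spectra (peak) parameters.
section_sizes = [3, 2, 2, 6]
breakpoints = np.cumsum(section_sizes)  # end index of each section

# Flat (linearized) parameter vector with placeholder values 0..12.
theta = np.arange(breakpoints[-1], dtype=float)

# np.split expects the interior split points, so the final end index is dropped.
background, weights, shifts, peak_params = np.split(theta, breakpoints[:-1])
print(background, weights, shifts, peak_params)
```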

Notes

The current optimization strategy follows the largest gradient approach described in [EKriesten]. To reduce the complexity of the optimization problem, the global parameters (background, spectral shift, spectral weights) are optimized first. Peak parameters are then optimized one by one, selected by gradient size, up to the number set by gradient_truncation.

Concentration: The optical spectroscopic sensor probes an unknown volume $V$. Each chemical species in the probed volume contributes to the Raman signal with a weight $w_i$ proportional to the number of moles $N_i$ in the probed volume:

.. math:: w_i \propto N_i

Introducing a proportionality constant $K_i$, this becomes:

.. math:: w_i K_i = N_i

The solvent is denoted by a subscript $S$. Following the same argument, and using $m_S = N_S M_{W,S}$, the solvent mass in the probed volume is:

.. math:: m_S = w_S K'_S = w_S K_S M_{W,S}

with $M_{W,S}$ the molar mass of the solvent.

The molality is then given by:

.. math:: b_i = \frac{N_i}{m_S} = \frac{w_i K_i}{w_S K'_S} = k_i \frac{w_i}{w_S}

where $k_i = K_i / K'_S$ is a constant for each species. For the solvent itself, $k_S = K_S / K'_S = 1 / M_{W,S}$, which is the constant solvent molality mentioned above.
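The relation b_i = k_i w_i / w_S suggests that k_i can be calibrated from samples of known molality and then reused for prediction. The following is a minimal sketch of that idea using a least-squares fit through the origin. All numbers are made up, `molality_from_weights` is a hypothetical helper, and the class itself uses a LinearRegression estimator (regressor_), which may behave differently, for example by fitting an intercept.

```python
import numpy as np


def molality_from_weights(w_solute, w_solvent, k):
    """Apply b_i = k_i * w_i / w_S element-wise."""
    return k * np.asarray(w_solute, dtype=float) / np.asarray(w_solvent, dtype=float)


# Made-up calibration data: weight ratios w_i / w_S and known molalities in mol/kg.
ratios = np.array([0.10, 0.20, 0.40])
b_known = np.array([0.05, 0.10, 0.20])

# Least-squares slope through the origin: k = (r . b) / (r . r).
k_hat = float(ratios @ b_known / (ratios @ ratios))

# Predict the molality of a new sample from its fitted weights.
b_new = float(molality_from_weights(0.3, 1.0, k_hat))
print(k_hat, b_new)
```

Because only the ratio w_i / w_S enters, the unknown probed volume cancels out, which is why molality is the natural concentration unit here.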

features:

feature or spectral dimension (e.g. wavelength, wavenumber)

peak:

a feature-associated effect described by a scaled probability function

component:

a chemical species described by a linear combination of peaks

baseline:

slowly varying effect not associated with a specific component

spectra:

a linear combination of multiple components and baseline effects
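The terms above can be tied together in a small synthetic example: peaks form components, and weighted components plus a polynomial baseline form a spectrum. This is a sketch under stated assumptions. The `pseudo_voigt` function below is a generic textbook parameterization (unit height, full width at half maximum, mixing factor eta) written for this illustration; it is not the library's pseudo_voigt_spectra, whose parameter layout is not documented here. All peak values are invented.

```python
import numpy as np


def pseudo_voigt(x, center, width, eta):
    """Unit-height pseudo-Voigt: eta * Lorentzian + (1 - eta) * Gaussian.

    width is the full width at half maximum of both parts.
    """
    lorentz = 1.0 / (1.0 + ((x - center) / (0.5 * width)) ** 2)
    gauss = np.exp(-4.0 * np.log(2.0) * ((x - center) / width) ** 2)
    return eta * lorentz + (1.0 - eta) * gauss


def component_spectrum(x, peaks):
    """Component = linear combination of peaks given as (height, center, width, eta)."""
    return sum(h * pseudo_voigt(x, c, w, eta) for h, c, w, eta in peaks)


# Features: the spectral axis (e.g. wavenumber), here in arbitrary units.
x = np.linspace(0.0, 100.0, 501)

# Two hypothetical components; the first plays the role of the solvent.
solvent = [(1.0, 30.0, 6.0, 0.5), (0.4, 70.0, 8.0, 0.2)]
solute = [(0.8, 50.0, 5.0, 0.7)]

# Baseline: slowly varying polynomial of order 2 (highest power first).
baseline = np.polyval([1e-5, -1e-3, 0.05], x)

# Spectrum: weighted components plus baseline.
weights = [1.0, 0.25]
spectrum = (
    weights[0] * component_spectrum(x, solvent)
    + weights[1] * component_spectrum(x, solute)
    + baseline
)
```

Fitting runs this construction in reverse: the weights, shifts, baseline coefficients, and peak parameters are adjusted until the modeled spectrum matches the measured one.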

__init__(features, peak_parameters, bl_order=2, spectra_generator=<function pseudo_voigt_spectra>, method='LG', gradient_truncation=20)

Methods

__init__(features, peak_parameters[, ...])

fit(X, y)

Calibrate regression model of IHM

fit_transform(X[, y])

Fit to data, then transform it.

get_params([deep])

Get parameters for this estimator.

predict(X)

Predict concentrations from given X

set_params(**params)

Set the parameters of this estimator.

transform(X[, y])

Transform spectra into the IHM parameter set

fit(X, y)

Calibrate regression model of IHM

fit_transform(X, y=None, **fit_params)

Fit to data, then transform it.

Fits transformer to X and y with optional parameters fit_params and returns a transformed version of X.

Parameters
  • X (array-like of shape (n_samples, n_features)) – Input samples.

  • y (array-like of shape (n_samples,) or (n_samples, n_outputs), default=None) – Target values (None for unsupervised transformations).

  • **fit_params (dict) – Additional fit parameters.

Returns

X_new – Transformed array.

Return type

ndarray of shape (n_samples, n_features_new)

get_params(deep=True)

Get parameters for this estimator.

Parameters

deep (bool, default=True) – If True, will return the parameters for this estimator and contained subobjects that are estimators.

Returns

params – Parameter names mapped to their values.

Return type

dict

predict(X)

Predict concentrations from given X

set_params(**params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.

Parameters

**params (dict) – Estimator parameters.

Returns

self – Estimator instance.

Return type

estimator instance
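The <component>__<parameter> naming can be seen with any scikit-learn pipeline. The sketch below uses StandardScaler and LinearRegression as stand-in steps, chosen only because they need no constructor arguments; the step names "scale" and "reg" are arbitrary. An IHMRegression instance placed in a pipeline would be addressed the same way through its step name.

```python
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Nested object: a pipeline with two named steps.
pipe = Pipeline([("scale", StandardScaler()), ("reg", LinearRegression())])

# Update parameters of the nested steps via <component>__<parameter>.
pipe.set_params(reg__fit_intercept=False, scale__with_mean=False)

# get_params(deep=True) reports the nested parameters under the same names.
params = pipe.get_params(deep=True)
print(params["reg__fit_intercept"], params["scale__with_mean"])
```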

transform(X, y=None)

Transform spectra into the IHM parameter set