Simple and effective way for data preprocessing selection based on design of experiments

J. Gerretzen, E. Szymańska, J.J. Jansen, J. Bart, H.J. van Manen, E.R. van den Heuvel, L.M.C. Buydens

Research output: Contribution to journal › Article › Academic › peer-review

32 Citations (Scopus)

Abstract

The selection of optimal preprocessing is among the main bottlenecks in chemometric data analysis. Preprocessing currently is a burden, since a multitude of different preprocessing methods is available for, e.g., baseline correction, smoothing, and alignment, but it is not clear beforehand which method(s) should be used for which data set. The process of preprocessing selection is often limited to trial-and-error and is therefore considered somewhat subjective. In this paper, we present a novel, simple, and effective approach for preprocessing selection. The defining feature of this approach is a design of experiments. On the basis of the design, model performance of a few well-chosen preprocessing methods, and combinations thereof (called strategies) is evaluated. Interpretation of the main effects and interactions subsequently enables the selection of an optimal preprocessing strategy. The presented approach is applied to eight different spectroscopic data sets, covering both calibration and classification challenges. We show that the approach is able to select a preprocessing strategy which improves model performance by at least 50% compared to the raw data; in most cases, it leads to a strategy very close to the true optimum. Our approach makes preprocessing selection fast, insightful, and objective. © 2015 American Chemical Society.
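The design-based selection described in the abstract can be illustrated with a minimal sketch. It assumes a two-level (off/on) full factorial design over three preprocessing steps named in the abstract; the function names, the step names as factor labels, and the made-up error function are all hypothetical and are not taken from the paper, whose actual design and performance measure may differ.

```python
from itertools import product

# Hypothetical factor labels; each preprocessing step is either off (0) or on (1).
FACTORS = ("baseline", "smoothing", "alignment")


def full_factorial(factors):
    """Return every off/on combination (a 'strategy') of the given steps."""
    return [dict(zip(factors, levels))
            for levels in product((0, 1), repeat=len(factors))]


def main_effects(strategies, responses):
    """Main effect of a step: mean response with it on minus mean with it off."""
    effects = {}
    for factor in strategies[0]:
        on = [r for s, r in zip(strategies, responses) if s[factor] == 1]
        off = [r for s, r in zip(strategies, responses) if s[factor] == 0]
        effects[factor] = sum(on) / len(on) - sum(off) / len(off)
    return effects


def toy_error(strategy):
    """Made-up prediction error standing in for real model performance."""
    b, s, a = (strategy[f] for f in FACTORS)
    # Includes a baseline x smoothing interaction term for illustration.
    return 10.0 - 4.0 * b - 1.0 * s + 0.5 * a - 1.0 * b * s


strategies = full_factorial(FACTORS)
errors = [toy_error(s) for s in strategies]
effects = main_effects(strategies, errors)

# Lower error is better, so pick the strategy with the smallest response.
best_index = min(range(len(strategies)), key=lambda i: errors[i])
best_strategy = strategies[best_index]
```

In this toy setting a negative main effect means that switching the step on lowers the error on average, so such steps are candidates for the selected strategy. In practice `toy_error` would be replaced by a cross-validated performance estimate of the calibration or classification model.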
Original language: English
Pages (from-to): 12096-12103
Number of pages: 8
Journal: Analytical Chemistry
Volume: 87
Issue number: 24
DOI: https://doi.org/10.1021/acs.analchem.5b02832
Publication status: Published - 2015

Fingerprint

Design of experiments
Calibration

Bibliographical note

Cited By: 1

Export Date: 29 June 2016

CODEN: ANCHA

Cite this

Gerretzen, J., Szymańska, E., Jansen, J. J., Bart, J., van Manen, H. J., van den Heuvel, E. R., & Buydens, L. M. C. (2015). Simple and effective way for data preprocessing selection based on design of experiments. Analytical Chemistry, 87(24), 12096-12103. https://doi.org/10.1021/acs.analchem.5b02832
Gerretzen, J. ; Szymańska, E. ; Jansen, J.J. ; Bart, J. ; van Manen, H.J. ; van den Heuvel, E.R. ; Buydens, L.M.C. / Simple and effective way for data preprocessing selection based on design of experiments. In: Analytical Chemistry. 2015 ; Vol. 87, No. 24. pp. 12096-12103.
@article{5dd3c58f9fe34b26be0e52d5b0e8403e,
title = "Simple and effective way for data preprocessing selection based on design of experiments",
abstract = "The selection of optimal preprocessing is among the main bottlenecks in chemometric data analysis. Preprocessing currently is a burden, since a multitude of different preprocessing methods is available for, e.g., baseline correction, smoothing, and alignment, but it is not clear beforehand which method(s) should be used for which data set. The process of preprocessing selection is often limited to trial-and-error and is therefore considered somewhat subjective. In this paper, we present a novel, simple, and effective approach for preprocessing selection. The defining feature of this approach is a design of experiments. On the basis of the design, model performance of a few well-chosen preprocessing methods, and combinations thereof (called strategies) is evaluated. Interpretation of the main effects and interactions subsequently enables the selection of an optimal preprocessing strategy. The presented approach is applied to eight different spectroscopic data sets, covering both calibration and classification challenges. We show that the approach is able to select a preprocessing strategy which improves model performance by at least 50{\%} compared to the raw data; in most cases, it leads to a strategy very close to the true optimum. Our approach makes preprocessing selection fast, insightful, and objective. {\textcopyright} 2015 American Chemical Society.",
author = "J. Gerretzen and E. Szymańska and J.J. Jansen and J. Bart and {van Manen}, H.J. and {van den Heuvel}, E.R. and L.M.C. Buydens",
note = "Cited By :1 Export Date: 29 June 2016 CODEN: ANCHA",
year = "2015",
doi = "10.1021/acs.analchem.5b02832",
language = "English",
volume = "87",
pages = "12096--12103",
journal = "Analytical Chemistry",
issn = "0003-2700",
publisher = "American Chemical Society",
number = "24"
}

Gerretzen, J, Szymańska, E, Jansen, JJ, Bart, J, van Manen, HJ, van den Heuvel, ER & Buydens, LMC 2015, 'Simple and effective way for data preprocessing selection based on design of experiments', Analytical Chemistry, vol. 87, no. 24, pp. 12096-12103. https://doi.org/10.1021/acs.analchem.5b02832

Simple and effective way for data preprocessing selection based on design of experiments. / Gerretzen, J.; Szymańska, E.; Jansen, J.J.; Bart, J.; van Manen, H.J.; van den Heuvel, E.R.; Buydens, L.M.C.

In: Analytical Chemistry, Vol. 87, No. 24, 2015, p. 12096-12103.

TY - JOUR
T1 - Simple and effective way for data preprocessing selection based on design of experiments
AU - Gerretzen, J.
AU - Szymańska, E.
AU - Jansen, J.J.
AU - Bart, J.
AU - van Manen, H.J.
AU - van den Heuvel, E.R.
AU - Buydens, L.M.C.
N1 - Cited By :1 Export Date: 29 June 2016 CODEN: ANCHA
PY - 2015
Y1 - 2015
AB - The selection of optimal preprocessing is among the main bottlenecks in chemometric data analysis. Preprocessing currently is a burden, since a multitude of different preprocessing methods is available for, e.g., baseline correction, smoothing, and alignment, but it is not clear beforehand which method(s) should be used for which data set. The process of preprocessing selection is often limited to trial-and-error and is therefore considered somewhat subjective. In this paper, we present a novel, simple, and effective approach for preprocessing selection. The defining feature of this approach is a design of experiments. On the basis of the design, model performance of a few well-chosen preprocessing methods, and combinations thereof (called strategies) is evaluated. Interpretation of the main effects and interactions subsequently enables the selection of an optimal preprocessing strategy. The presented approach is applied to eight different spectroscopic data sets, covering both calibration and classification challenges. We show that the approach is able to select a preprocessing strategy which improves model performance by at least 50% compared to the raw data; in most cases, it leads to a strategy very close to the true optimum. Our approach makes preprocessing selection fast, insightful, and objective. © 2015 American Chemical Society.
U2 - 10.1021/acs.analchem.5b02832
DO - 10.1021/acs.analchem.5b02832
M3 - Article
C2 - 26632985
VL - 87
SP - 12096
EP - 12103
JO - Analytical Chemistry
JF - Analytical Chemistry
SN - 0003-2700
IS - 24
ER -