TY - JOUR
T1 - Convolutional neural network for automated peak detection in reversed-phase liquid chromatography
AU - Kensert, Alexander
AU - Bosten, Emery
AU - Collaerts, Gilles
AU - Efthymiadis, Kyriakos
AU - Van Broeck, Peter
AU - Desmet, Gert
AU - Cabooter, Deirdre
PY - 2022/6/7
Y1 - 2022/6/7
AB - Although commercially available software provides options for automatic peak detection, visual inspection and manual corrections are often needed. Peak detection algorithms commonly employed require carefully written rules and thresholds to increase true positive rates and decrease false positive rates. In this study, a deep learning model, specifically, a convolutional neural network (CNN), was implemented to perform automatic peak detection in reversed-phase liquid chromatography (RPLC). The model inputs a whole chromatogram and outputs predicted locations, probabilities, and areas of the peaks. The obtained results on a simulated validation set demonstrated that the model performed well (ROC-AUC of 0.996), and comparably or better than a derivative-based approach using the Savitzky-Golay algorithm for detecting peaks on experimental chromatograms (8.6% increase in true positives). In addition, predicted peak probabilities (typically between 0.5 and 1.0 for true positives) gave an indication of how confident the CNN model was in the peaks detected. The CNN model was trained entirely on simulated chromatograms (a training set of 1,000,000 chromatograms), and thus no effort had to be put into collecting and labeling chromatograms. A potential major drawback of this approach, namely training a CNN model on simulated chromatograms, is the risk of not capturing the actual “chromatogram space” well enough that is needed to perform accurate peak detection in real chromatograms.
KW - Convolutional neural networks
KW - Machine learning
KW - Method development
KW - Peak finding
UR - http://www.scopus.com/inward/record.url?scp=85128196704&partnerID=8YFLogxK
U2 - 10.1016/j.chroma.2022.463005
DO - 10.1016/j.chroma.2022.463005
M3 - Article
AN - SCOPUS:85128196704
VL - 1672
JO - Journal of Chromatography. A
JF - Journal of Chromatography. A
SN - 0021-9673
M1 - 463005
ER -