Abstract
Breath sample analysis is a relatively novel clinical tool that has been used successfully to diagnose various pathological conditions. In this study, breath samples were analyzed by selected ion flow tube mass spectrometry (SIFT-MS), a quantitative technique for determining the absolute concentration of trace compounds in air and breath samples. SIFT-MS requires neither sample preparation nor a calibration standard, and because the instrument is connected directly to the tube into which patients exhale, it is easy to operate.
The aim of this study was to distinguish breath samples of cystic fibrosis and asthma patients from those of healthy individuals, based on their SIFT-MS spectra. SIFT-MS spectra with 708 variables were recorded for 55 samples belonging to 3 groups (asthmatic, healthy and cystic fibrosis).
Multivariate data analysis tools were used to study the data set. In a first step, the significant variables in the spectra (those distinguishable from noise) were determined using Dong’s algorithm. The spectra were pre-treated with various data pre-treatment techniques to highlight the information relevant for classifying the samples.
Subsequently, an unsupervised technique, Principal Component Analysis (PCA), was used to evaluate whether the expected groups (healthy, cystic fibrosis, asthma) could be observed in the sample set. With the best pre-processing (DOSC, direct orthogonal signal correction), the 3 groups were observed in the PCA score plot, although subgroups were also observed. The important variables in the spectra were identified from the corresponding PCA loadings plot. The same variables were found regardless of the pre-processing applied: H3O+ 37+, H3O+ 55+, NO+ 30+, NO+ 48+, O2+ 32+, O2+ 37+, O2+ 55+.
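The relationship between a PCA score plot and its loadings can be illustrated with a minimal sketch. It is not the analysis performed in the study: it assumes synthetic three-variable data, plain mean centering instead of DOSC, and a power iteration written with the Python standard library in place of a chemometrics package. All names (`mean_center`, `first_loading`, `project`, `spectra`) are hypothetical.

```python
import math
import random


def mean_center(rows):
    """Subtract the column mean from every variable."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    return [[r[j] - means[j] for j in range(p)] for r in rows]


def first_loading(centered, iterations=200):
    """Approximate the first PCA loading vector by power iteration on X^T X."""
    p = len(centered[0])
    v = [1.0 / math.sqrt(p)] * p  # arbitrary unit-length start vector
    for _ in range(iterations):
        scores = [sum(x * w for x, w in zip(row, v)) for row in centered]
        new_v = [sum(s * row[j] for s, row in zip(scores, centered)) for j in range(p)]
        norm = math.sqrt(sum(w * w for w in new_v))
        v = [w / norm for w in new_v]
    return v


def project(centered, loading):
    """Scores are the projections of the centered samples onto the loading."""
    return [sum(x * w for x, w in zip(row, loading)) for row in centered]


# Synthetic data (an assumption): two groups that differ only in variable 0.
rng = random.Random(0)
group_a = [[10.0 + rng.gauss(0, 1), rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(10)]
group_b = [[rng.gauss(0, 1), rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(10)]
spectra = group_a + group_b

centered = mean_center(spectra)
loading = first_loading(centered)
scores = project(centered, loading)

# The variable with the largest absolute loading drives the group separation.
top_variable = max(range(len(loading)), key=lambda j: abs(loading[j]))
print("loading:", [round(w, 3) for w in loading])
print("most important variable index:", top_variable)
```

In this toy setting, the scores separate the two groups along the first component, and the loading vector points to the variable responsible for that separation. This mirrors how a loadings plot is read to find important variables.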
Further, classification and discrimination models were built in order to classify and identify new samples. The best classifications were obtained with KNN (K nearest neighbors) and PCA-LDA (principal component analysis-linear discriminant analysis). These models assigned all samples to their proper class (sensitivity, specificity, accuracy and non-error rate were all 100% in both calibration and cross-validation).
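How such figures of merit are obtained under cross-validation can be sketched as follows. The example makes several assumptions not stated in the abstract: leave-one-out cross-validation, k = 3, Euclidean distance, and a small, trivially separable toy data set. The function names are hypothetical, and the output does not reproduce the study's results.

```python
import math
from collections import Counter


def knn_predict(train_x, train_y, query, k=3):
    """Majority vote among the k nearest training samples (Euclidean distance)."""
    nearest = sorted((math.dist(x, query), label) for x, label in zip(train_x, train_y))
    votes = Counter(label for _, label in nearest[:k])
    return votes.most_common(1)[0][0]


def leave_one_out(xs, ys, k=3):
    """Predict each sample with a model built from all other samples."""
    return [
        knn_predict(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:], xs[i], k)
        for i in range(len(xs))
    ]


def class_metrics(truth, preds, positive):
    """Sensitivity and specificity for one class treated as 'positive'."""
    tp = sum(t == positive and p == positive for t, p in zip(truth, preds))
    fn = sum(t == positive and p != positive for t, p in zip(truth, preds))
    tn = sum(t != positive and p != positive for t, p in zip(truth, preds))
    fp = sum(t != positive and p == positive for t, p in zip(truth, preds))
    return {"sensitivity": tp / (tp + fn), "specificity": tn / (tn + fp)}


# Toy data (an assumption): three compact, well-separated clusters in 2D.
samples = [
    (0, 0), (0, 1), (1, 0), (1, 1),          # healthy
    (10, 0), (10, 1), (11, 0), (11, 1),      # asthma
    (0, 10), (0, 11), (1, 10), (1, 11),      # cystic fibrosis
]
labels = ["healthy"] * 4 + ["asthma"] * 4 + ["cystic fibrosis"] * 4

predictions = leave_one_out(samples, labels, k=3)
accuracy = sum(t == p for t, p in zip(labels, predictions)) / len(labels)
per_class = {c: class_metrics(labels, predictions, c) for c in sorted(set(labels))}
# Non-error rate: the mean of the per-class sensitivities.
non_error_rate = sum(m["sensitivity"] for m in per_class.values()) / len(per_class)

print("accuracy:", accuracy)
print("non-error rate:", non_error_rate)
for name, metrics in per_class.items():
    print(name, metrics)
```

Computing sensitivity and specificity per class, rather than accuracy alone, shows whether one group is systematically confused with another. This matters when the classes are of unequal size.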
In the future, the performance of these models on a broader data set of new samples with known class origin (healthy, asthma or cystic fibrosis) should be evaluated further. However, based on the current results, our approach appears to have potential for further development toward clinical diagnosis.
Original language | English |
---|---|
Publication status | Published - 2017 |
Event | XXth annual BSMS meeting, Heverlee - Leuven, Belgium Duration: 8 Feb 2017 → 8 Feb 2017 |
Conference
Conference | XXth annual BSMS meeting, Heverlee |
---|---|
Country/Territory | Belgium |
City | Leuven |
Period | 8/02/17 → 8/02/17 |