Combining spectroscopy and machine learning processing for precise sample classification

Student thesis: Master's Thesis

Abstract

Laser-based sorting machines enable non-destructive food classification based on optical
spectroscopy and therefore have high potential to improve food safety. State-of-the-art
processing of the spectroscopic measurements, however, rarely uses machine learning
techniques. This thesis investigates how machine learning can improve the
performance of state-of-the-art sorting machines.
In a first case study, raw potatoes are classified according to their acrylamide level after
frying, using ultraviolet-visible and near-infrared (UVVIS/NIR) reflection spectroscopy. With
a broadband illumination and detection system, perfect cross-validated classification
is obtained with Linear Discriminant Analysis or Extreme Learning Machine without preprocessing,
and with Naïve Bayes, Partial Least Squares or Neural Networks applied to the
first or second derivative signal. In an industrial, laser-based setting with a limited number of
illumination systems and detectors, Linear Discriminant Analysis with adapted prior probabilities
was found to yield the best results, with accuracies between 92% and 100% for the different
acrylamide levels. The most important illumination wavelengths were selected using
a sequential feature selection search.
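
The sequential feature selection search mentioned above can be illustrated with a minimal sketch. It assumes a greedy forward search scored by leave-one-out cross-validation. To stay self-contained, a simple nearest-class-mean classifier stands in for Linear Discriminant Analysis, and prior probabilities are not modelled. The function names, the three-column toy data and the class labels are hypothetical and are not taken from the thesis.

```python
from typing import List, Sequence

def nearest_mean_predict(train_X: List[Sequence[float]], train_y: List[int],
                         x: Sequence[float], features: List[int]) -> int:
    """Stand-in classifier: pick the class whose mean is closest on the chosen features."""
    best_label, best_dist = -1, float("inf")
    for label in sorted(set(train_y)):
        rows = [r for r, y in zip(train_X, train_y) if y == label]
        dist = 0.0
        for f in features:
            mean_f = sum(r[f] for r in rows) / len(rows)
            dist += (x[f] - mean_f) ** 2
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label

def loo_accuracy(X: List[Sequence[float]], y: List[int], features: List[int]) -> float:
    """Leave-one-out cross-validated accuracy using only the chosen features."""
    hits = 0
    for i in range(len(X)):
        train_X = X[:i] + X[i + 1:]
        train_y = y[:i] + y[i + 1:]
        if nearest_mean_predict(train_X, train_y, X[i], features) == y[i]:
            hits += 1
    return hits / len(X)

def sequential_forward_selection(X: List[Sequence[float]], y: List[int],
                                 n_select: int) -> List[int]:
    """Greedily add the feature (wavelength index) that most improves the score."""
    selected: List[int] = []
    remaining = list(range(len(X[0])))
    while remaining and len(selected) < n_select:
        best = max(remaining, key=lambda f: loo_accuracy(X, y, selected + [f]))
        selected.append(best)
        remaining.remove(best)
    return selected

# Hypothetical reflectance values at three illumination wavelengths (columns).
# Only the middle column separates the two made-up classes.
X = [
    [0.50, 0.10, 0.30],
    [0.52, 0.12, 0.70],
    [0.49, 0.11, 0.40],
    [0.51, 0.90, 0.60],
    [0.50, 0.88, 0.35],
    [0.48, 0.91, 0.65],
]
y = [0, 0, 0, 1, 1, 1]

print(sequential_forward_selection(X, y, 1))  # [1]
```

In this toy setting the search picks the single wavelength that separates the classes. In an industrial configuration, `n_select` would correspond to the number of laser sources that can be installed.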
In a second case study, a general nut sorting methodology was developed based on the
reflection and fluorescence properties of the nuts, covering the detection of foreign objects and molds
in combination with an evaluation of the product quality. Furthermore, the integration of
both the nut quality and safety evaluation into a single sorting configuration was targeted.
This was done for four nut varieties: hazelnuts, walnuts, almonds and pistachios. A novel
classification scheme, which uses a cascade of individual classifiers based on the different
measurement techniques, was implemented. Machine learning processing was indispensable
to obtain satisfactory results. For all nut varieties, using a combination of Quadratic Discriminant
Analysis, Extreme Learning Machine and Support Vector Machines, a false negative rate of
at most 8% was obtained, while the false positive rate for almost all bad sample types was
at most 3%. The shrivelled samples were found to be the most difficult to classify correctly,
with a false positive rate of up to 15%. The limited number of commercially available laser
sources did not significantly affect the results, while the detector sensitivity and bandwidth
had the most influence on the performance of the classifiers using fluorescence measurements.
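
As a rough illustration of a classifier cascade and of the two error rates quoted above, the sketch below chains two rejection stages and counts misclassifications. It rests on several assumptions: "positive" is read as "accepted as good product", so a false negative is a good nut that is rejected and a false positive is a bad sample that is accepted. The threshold rules stand in for the Quadratic Discriminant Analysis, Extreme Learning Machine and Support Vector Machine stages, and all names, thresholds and sample values are hypothetical.

```python
from typing import Callable, Dict, List, Tuple

Sample = Dict[str, float]
Stage = Callable[[Sample], bool]  # returns True when the stage rejects the sample

def reflection_stage(sample: Sample) -> bool:
    """Hypothetical stage on reflection data: reject low-reflectance objects."""
    return sample["reflectance"] < 0.5

def fluorescence_stage(sample: Sample) -> bool:
    """Hypothetical stage on fluorescence data: reject strongly fluorescing objects."""
    return sample["fluorescence"] > 0.5

def cascade_accepts(sample: Sample, stages: List[Stage]) -> bool:
    """A sample is accepted only if no stage rejects it; stages run in order."""
    return not any(stage(sample) for stage in stages)

def error_rates(samples: List[Sample], is_good: List[bool],
                stages: List[Stage]) -> Tuple[float, float]:
    """Return (false negative rate, false positive rate) under the stated convention."""
    false_neg = false_pos = n_good = n_bad = 0
    for sample, good in zip(samples, is_good):
        accepted = cascade_accepts(sample, stages)
        if good:
            n_good += 1
            false_neg += int(not accepted)  # good product wrongly rejected
        else:
            n_bad += 1
            false_pos += int(accepted)      # bad sample wrongly accepted
    return false_neg / n_good, false_pos / n_bad

# Made-up measurements: four good nuts followed by four bad samples.
samples = [
    {"reflectance": 0.80, "fluorescence": 0.10},
    {"reflectance": 0.70, "fluorescence": 0.20},
    {"reflectance": 0.75, "fluorescence": 0.60},  # good, but rejected by stage 2
    {"reflectance": 0.90, "fluorescence": 0.15},
    {"reflectance": 0.20, "fluorescence": 0.10},  # rejected by stage 1
    {"reflectance": 0.70, "fluorescence": 0.90},  # rejected by stage 2
    {"reflectance": 0.65, "fluorescence": 0.30},  # bad, but passes both stages
    {"reflectance": 0.30, "fluorescence": 0.80},
]
is_good = [True, True, True, True, False, False, False, False]

fnr, fpr = error_rates(samples, is_good, [reflection_stage, fluorescence_stage])
print(fnr, fpr)  # 0.25 0.25
```

Because `any` stops at the first stage that rejects, later stages only see samples that earlier stages let through, which is the property that makes a cascade attractive when measurement techniques differ in cost.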
It can be concluded that the combination of machine learning and spectroscopy contributes
to an improved classification performance and can feasibly be integrated into industrial
optical sorting machines.

Date of award: 2020
Original language: English
Supervisor: Martin Virte (Advisor)
