Application of Machine Learning Techniques for Ecotope Classification based on Hyperspectral Images

Jonathan Cheung-Wai Chan, Desire Paelinckx

Research output: Commissioned report


The objective of this study is to use timely and inexpensive remote sensing techniques for ecotope classification. Ecotope classification provides ecological insight into land use and is therefore an important input for environmental policy-making. Hyperspectral data, with its fine spectral resolution, helps to discern ecotopes that could not be distinguished with conventional broadband visible and near-infrared data. Given the large volume of the data, a suitable processing procedure should be robust and highly repeatable. We investigated the use of well-established machine learning methods for processing hyperspectral data. Decision tree classifiers have been widely examined for the classification of remotely sensed data and have proven to be competitive learners with many advantages: no assumptions about the data, easy interpretation, fast training, and high repeatability. Voting classification with ensemble classifiers is a simple and robust approach that can be combined with any learning model to improve accuracy. In this study, we applied boosting, a voting classification method, with decision trees as the base learner. A wrapper approach to feature selection, which takes the induction algorithm into account, was adopted. Post-classification using multi-scale anisotropic diffusion is applied to produce natural boundaries in the image, with a probabilistic measure based on decomposed likelihood and granularity used for scale selection. This processing smooths out noise and produces more homogeneous regions. A Level II classification scheme of the Biological Valuation Map was adapted. A 16-class scheme with tree and grassland categories was extracted for our experiments. Urban land uses and water surfaces were excluded to focus on the classes of interest.
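To make the voting idea concrete, the following is a minimal sketch of boosting with very small decision trees (single-split stumps) as the base learner. It assumes the standard binary AdaBoost weighting scheme, which is not necessarily the boosting variant used in the report, and the toy one-band data and all function names are hypothetical.

```python
import math

def fit_stump(X, y, w):
    """Find the single-split tree (stump) with the lowest weighted error.

    Returns (feature index, threshold, polarity, weighted error).
    """
    best = None
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X}):
            for polarity in (1, -1):
                preds = [polarity if row[j] <= t else -polarity for row in X]
                err = sum(wi for wi, p, yi in zip(w, preds, y) if p != yi)
                if best is None or err < best[3]:
                    best = (j, t, polarity, err)
    return best

def stump_predict(stump, row):
    j, t, polarity, _ = stump
    return polarity if row[j] <= t else -polarity

def adaboost(X, y, trials):
    """Train `trials` stumps, reweighting the samples after each trial (AdaBoost)."""
    n = len(X)
    w = [1.0 / n] * n
    ensemble = []
    for _ in range(trials):
        stump = fit_stump(X, y, w)
        err = min(max(stump[3], 1e-10), 1 - 1e-10)  # keep the log finite
        alpha = 0.5 * math.log((1 - err) / err)      # vote weight of this stump
        ensemble.append((alpha, stump))
        # Increase the weight of misclassified samples, decrease the rest.
        w = [wi * math.exp(-alpha * yi * stump_predict(stump, row))
             for wi, row, yi in zip(w, X, y)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble

def predict(ensemble, row):
    """Weighted vote of all stumps in the ensemble."""
    score = sum(alpha * stump_predict(s, row) for alpha, s in ensemble)
    return 1 if score >= 0 else -1

# Hypothetical toy data: one band, two classes (+1 / -1) that no single
# threshold can separate.
X = [[1], [2], [3], [4], [5], [6]]
y = [1, 1, -1, -1, 1, 1]

single = adaboost(X, y, trials=1)
boosted = adaboost(X, y, trials=3)
single_preds = [predict(single, row) for row in X]
boosted_preds = [predict(boosted, row) for row in X]
```

On this toy set a single stump cannot fit the middle interval, while the weighted vote of three stumps can, which illustrates why voting over weak trees tends to raise accuracy.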

Our results show that a decision tree classifier achieved 60% accuracy. Voting classification increased accuracy by 8 percentage points to 68% for the two major class categories. Wrapper-based feature selection identified 17% (21 out of 126 bands) of the original wavebands; with these, accuracy was comparable to that obtained with all bands, while computation time was reduced by 86% at 99 boosting trials. For comparison, we also used the 22 best wavebands chosen in an independent but comparable study by Thenkabail et al. (2004). Accuracies were similar at 68%, although the machine learning feature selection focused more on early shortwave infrared bands. More than one-third (eight out of 21) of the selected wavebands fall into the early shortwave infrared region (1.3-1.9 µm), which is sensitive to the moisture content of vegetation or soil and has been identified as useful for estimating vegetation stress. Only 3 selected bands fall into the presumably important near-infrared (0.75-1.05 µm) and far near-infrared (1.05-1.30 µm) ranges. These results point to the importance of the shortwave infrared for Biological Valuation Map mapping. To show the usefulness of the hyperspectral approach, a multi-spectral analysis using six simulated Landsat TM bands was conducted for comparison with the HyMap inputs. The accuracy was 48.6% (without boosting), compared to 60.2% using 126 hyperspectral bands.
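The proportions quoted above can be checked with a few lines of arithmetic. The figures are taken directly from the abstract; the variable names are illustrative only.

```python
# Figures quoted in the abstract.
total_bands = 126
selected_bands = 21
swir_selected = 8            # selected bands in the 1.3-1.9 µm region
acc_tm_bands = 48.6          # % accuracy, six simulated Landsat TM bands
acc_hyperspectral = 60.2     # % accuracy, 126 HyMap bands (no boosting)

share_selected = selected_bands / total_bands   # fraction of bands kept
share_swir = swir_selected / selected_bands     # fraction of kept bands in early SWIR
gain_hyperspectral = acc_hyperspectral - acc_tm_bands  # percentage points

print(f"bands kept: {share_selected:.1%}")
print(f"early SWIR share of kept bands: {share_swir:.1%}")
print(f"hyperspectral gain: {gain_hyperspectral:.1f} points")
```

The selected subset is one sixth of the bands (about 17%), the early shortwave infrared share is a little over one-third, and the hyperspectral input gains roughly 11.6 percentage points over the simulated multi-spectral input.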

Some classes within the grassland categories are not separable. However, the confusion matrix suggests that classes belonging to permanent grasslands can be merged to form a new class (permanent grassland with nature value). This new class, together with arable land and temporary species-poor grassland, can be mapped with good accuracy. This is valuable for BVM mapping because it can greatly reduce the resources needed to identify such classes. Classification accuracies for the tree categories are comparatively higher. Promising classes include deciduous forests (61%-81%), poplar and conifer plantations (92.5%), orchards (63%-98%) and scrubs (70%). In all cases the true accuracy may be even higher, because some apparent 'misclassifications' might reflect the real situation. Finally, the diffusion-based post-classification filter improves homogeneity and enhances visual interpretation of the final map.

In conclusion, we found hyperspectral data effective for identifying ecotopes, and remote sensing methods provide first-hand, timely information for the Biological Valuation Map, which is important for environmental policy makers. The proposed machine learning algorithms using decision trees and voting classification, given the accuracy obtained and their fast computation, are well suited to classification problems with high-dimensional inputs, such as hyperspectral classification.
Original language: English
Number of pages: 114
Status: Published - 1 Oct 2005

Publication series

Name: Final Report for the ECOMALT Project (SR/03/046)

