Abstract
The paper presents an original filter approach for effective feature selection in microarray data, which are characterized by a large number of input variables and few samples. The approach is based on a new information-theoretic selection criterion, the double input symmetrical relevance (DISR), which relies on a measure of variable complementarity. This measure evaluates the additional information that a set of variables provides about the output, relative to the sum of the individual variable contributions. We show that a variable selection approach based on DISR can be formulated as a quadratic optimization problem: the dispersion sum problem (DSP). To solve this problem, we use a strategy based on backward elimination and sequential replacement (BESR). The combination of BESR and the DISR criterion is compared, in theoretical and experimental terms, to recently proposed information-theoretic criteria. Experimental results on a synthetic dataset and on a set of eleven microarray classification tasks show that the proposed technique is competitive with existing filter selection methods.
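The notion of variable complementarity described in the abstract can be illustrated with a small sketch. It assumes discrete variables and plug-in (empirical) entropy estimates. The function names `entropy`, `mutual_information`, and `complementarity` are hypothetical and are not taken from the paper. The sketch covers only the complementarity measure, not the DISR criterion or the BESR search.

```python
from collections import Counter
from math import log2


def entropy(samples):
    """Empirical Shannon entropy, in bits, of a sequence of hashable symbols."""
    n = len(samples)
    return -sum((c / n) * log2(c / n) for c in Counter(samples).values())


def mutual_information(xs, ys):
    """Empirical I(X;Y) = H(X) + H(Y) - H(X,Y), in bits."""
    return entropy(xs) + entropy(ys) - entropy(list(zip(xs, ys)))


def complementarity(x1, x2, y):
    """I(X1,X2;Y) minus the sum of the single-variable terms I(X1;Y) + I(X2;Y).

    Illustrative assumption: a positive value means the pair carries more
    information about y than the two variables taken separately.
    """
    pair = list(zip(x1, x2))
    singles = mutual_information(x1, y) + mutual_information(x2, y)
    return mutual_information(pair, y) - singles


# Toy data (not from the paper): y is the XOR of two binary inputs.
x1 = [0, 0, 1, 1]
x2 = [0, 1, 0, 1]
y = [a ^ b for a, b in zip(x1, x2)]

print(mutual_information(x1, y))   # information carried by x1 alone
print(complementarity(x1, x2, y))  # extra information carried by the pair
```

In this XOR toy case each input alone carries no information about the output, while the pair determines it completely. A criterion that ranks variables one at a time would therefore miss both inputs, which is the situation a complementarity-aware criterion is meant to handle.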
Field | Value |
---|---|
Original language | English |
Pages (from-to) | 261-274 |
Number of pages | 14 |
Journal | IEEE Journal of Selected Topics in Signal Processing |
Volume | 2 |
Publication status | Published - 2008 |
Keywords
- feature extraction
- filtering theory
- quadratic programming
- signal classification