Information-Theoretic Feature Selection in Microarray Data Using Variable Complementarity

Colas Schretter, Patrick E. Meyer, Gianluca Bontempi

Research output: Contribution to journal › Article › peer-review

216 Citations (Scopus)

Abstract

The paper presents an original filter approach for effective feature selection in microarray data, which are characterized by a large number of input variables and few samples. The approach is based on a new information-theoretic selection criterion, the double input symmetrical relevance (DISR), which relies on a measure of variable complementarity. This measure evaluates the additional information that a set of variables provides about the output, relative to the sum of the individual contributions of its variables. We show that a variable selection approach based on DISR can be formulated as a quadratic optimization problem: the dispersion sum problem (DSP). To solve this problem, we use a strategy based on backward elimination and sequential replacement (BESR). The combination of BESR and the DISR criterion is compared, in theoretical and experimental terms, to recently proposed information-theoretic criteria. Experimental results on a synthetic dataset as well as on a set of eleven microarray classification tasks show that the proposed technique is competitive with existing filter selection methods.
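The notion of complementarity in the abstract can be illustrated with a small sketch. It assumes that, for two variables, the measure is read as I(X1,X2;Y) - I(X1;Y) - I(X2;Y), estimated from empirical frequencies. The function names and the XOR toy data are hypothetical and chosen only for illustration; this is not the DISR criterion itself, nor the authors' implementation.

```python
from collections import Counter
from itertools import product
from math import log2


def entropy(samples):
    """Empirical Shannon entropy, in bits, of a list of hashable outcomes."""
    n = len(samples)
    return -sum((c / n) * log2(c / n) for c in Counter(samples).values())


def mutual_information(xs, ys):
    """Empirical I(X;Y) = H(X) + H(Y) - H(X,Y), in bits."""
    return entropy(xs) + entropy(ys) - entropy(list(zip(xs, ys)))


def complementarity(x1, x2, y):
    """Assumed reading of the measure: I(X1,X2;Y) - I(X1;Y) - I(X2;Y).

    A positive value means the pair tells us more about Y than the
    two variables do when their individual contributions are summed.
    """
    joint = list(zip(x1, x2))
    return (mutual_information(joint, y)
            - mutual_information(x1, y)
            - mutual_information(x2, y))


# Toy data (illustrative only): the four equally likely input pairs,
# with the output defined as Y = X1 XOR X2.
pairs = list(product([0, 1], repeat=2))
x1 = [a for a, _ in pairs]
x2 = [b for _, b in pairs]
y = [a ^ b for a, b in pairs]

gain = complementarity(x1, x2, y)
print(f"I(X1;Y) = {mutual_information(x1, y):.3f} bits")
print(f"I(X2;Y) = {mutual_information(x2, y):.3f} bits")
print(f"complementarity = {gain:.3f} bits")
```

In this toy case each variable alone carries no information about the output, while the pair determines it completely, so a criterion that ranks variables one at a time would miss both.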
Original language: English
Pages (from-to): 261-274
Number of pages: 14
Journal: IEEE Journal of Selected Topics in Signal Processing
Volume: 2
Publication status: Published - 2008

Keywords

  • feature extraction
  • filtering theory
  • quadratic programming
  • signal classification
