Interpretable Semi-Supervised Classifier for Predicting Cancer Stages

Isel Grau, Dipankar Sengupta, Ann Nowe

Research output: Chapter in Book/Report/Conference proceeding › Chapter › Research › peer-review



Machine learning techniques in medicine have been at the forefront of addressing challenges such as diagnosis, prognosis prediction, or precision medicine. In this field, data is sometimes abundant but comes from different sources or lacks assigned labels. Manually labeling this data to assemble a curated dataset for supervised classification can be costly. Semi-supervised classification offers a wide range of methods for leveraging unlabeled data when learning prediction models. However, these classifiers are commonly deep or ensemble learning structures that often result in black boxes. The requirement of interpretable models for medical settings led us to propose the self-labeling grey-box classifier, which outperforms other semi-supervised classifiers on benchmark datasets while providing interpretability. In this chapter, we illustrate applications of the self-labeling grey-box to omics and clinical datasets from The Cancer Genome Atlas. We show that the self-labeling grey-box accurately predicts the stages of rare cancers by leveraging unlabeled instances from more common cancer types. We discuss the resulting insights and the features influencing prediction, as well as a global representation of the knowledge through decision trees or rule lists, which can aid clinicians and researchers.
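
The abstract names the self-labeling idea but does not spell out its steps. The sketch below is one plausible reading, not the authors' implementation: an opaque model is fitted on the few labeled instances, it assigns pseudo-labels to the unlabeled ones, and a simple interpretable model is then fitted on the enlarged set. Everything specific here is an assumption made for illustration: the k-nearest-neighbour vote standing in for the black box, the one-rule decision stump standing in for the decision tree or rule list, the synthetic two-feature data, the class names "early" and "late", and the helper names `knn_predict`, `fit_stump`, and `sample`.

```python
import random
from collections import Counter


def knn_predict(train_X, train_y, x, k=5):
    """Black-box stand-in: majority vote among the k nearest labeled points."""
    nearest = sorted(
        range(len(train_X)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(train_X[i], x)),
    )[:k]
    return Counter(train_y[i] for i in nearest).most_common(1)[0][0]


def fit_stump(X, y):
    """White-box stand-in: the single rule 'feature j <= t' with the fewest errors."""
    best = None
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X}):
            left = [label for row, label in zip(X, y) if row[j] <= t]
            right = [label for row, label in zip(X, y) if row[j] > t]
            if not left or not right:
                continue  # skip thresholds that do not split the data
            left_cls = Counter(left).most_common(1)[0][0]
            right_cls = Counter(right).most_common(1)[0][0]
            errors = sum(v != left_cls for v in left) + sum(v != right_cls for v in right)
            if best is None or errors < best[0]:
                best = (errors, j, t, left_cls, right_cls)
    _, j, t, left_cls, right_cls = best
    return {"feature": j, "threshold": t, "if_le": left_cls, "else": right_cls}


def sample(center, n, rng):
    """Draw n synthetic 2-D points around a center (toy data, not TCGA)."""
    return [[rng.gauss(center[0], 1.0), rng.gauss(center[1], 1.0)] for _ in range(n)]


rng = random.Random(0)

# A small labeled set and a much larger unlabeled set.
X_lab = sample((0, 0), 10, rng) + sample((4, 0), 10, rng)
y_lab = ["early"] * 10 + ["late"] * 10
X_unlab = sample((0, 0), 100, rng) + sample((4, 0), 100, rng)

# Step 1: the black box assigns pseudo-labels to the unlabeled instances.
pseudo = [knn_predict(X_lab, y_lab, x) for x in X_unlab]

# Step 2: the white box is fitted on labeled plus pseudo-labeled instances.
rule = fit_stump(X_lab + X_unlab, y_lab + pseudo)
print(rule)
```

The printed dictionary is the whole model: one feature, one threshold, and the class predicted on each side. A deeper tree or a rule list would play the same role on real data, at the cost of a longer explanation.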
Original language: English
Title of host publication: Machine Learning, Big Data, and IoT for Medical Informatics
Editors: Pardeep Kumar, Yugal Kumar, Mohamed Tawhid
Number of pages: 19
ISBN (Electronic): 9780128217818
ISBN (Print): 9780128217771
Publication status: Published - 1 Jan 2021

