Abstract

Ensemble modeling is an increasingly popular data science technique that combines the knowledge of multiple base learners to enhance predictive performance. In this paper, the idea was to increase predictive performance by retaining three algorithms when testing multiple classifiers: (a) the best overall performing algorithm, based on the harmonic mean of sensitivity and specificity (HMSS); (b) the most sensitive model; and (c) the most specific model. This approach boils down to majority voting between the predictions of these three base learners. As an exemplary study, a case of identifying a prolonged QT interval after administering drugs involved in a drug-drug interaction with increased risk of QT prolongation (QT-DDI) is presented. Performance measures included accuracy, sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV). Overall performance was measured by calculating the HMSS. Results show an increase in all performance measures compared to the original best performing algorithm, except for specificity, where performance remained stable. The presented approach is fairly simple and shows potential to increase predictive performance, even without adjusting the default cut-offs used to differentiate between high-risk and low-risk cases. Future research should look at a way of combining all tested algorithms instead of using only three. Similarly, this approach should be tested on a multiclass prediction problem.
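The selection and voting steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the binary 0/1 label encoding, and the toy predictions are all assumptions made for illustration, and the numbers are invented rather than taken from the study. The sketch also assumes that, if one model wins more than one of the three roles, its vote is simply counted more than once; the abstract does not say how such ties are handled.

```python
from typing import Dict, List, Sequence, Tuple


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (tp, tn, fp, fn) for binary labels encoded as 0/1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def sensitivity(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    tp, _, _, fn = confusion_counts(y_true, y_pred)
    return tp / (tp + fn) if (tp + fn) else 0.0


def specificity(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    _, tn, fp, _ = confusion_counts(y_true, y_pred)
    return tn / (tn + fp) if (tn + fp) else 0.0


def hmss(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Harmonic mean of sensitivity and specificity."""
    se, sp = sensitivity(y_true, y_pred), specificity(y_true, y_pred)
    return 2 * se * sp / (se + sp) if (se + sp) else 0.0


def select_three(y_true: Sequence[int], predictions: Dict[str, List[int]]) -> Tuple[str, str, str]:
    """Pick the best-HMSS, most sensitive, and most specific model names."""
    best = max(predictions, key=lambda name: hmss(y_true, predictions[name]))
    most_sensitive = max(predictions, key=lambda name: sensitivity(y_true, predictions[name]))
    most_specific = max(predictions, key=lambda name: specificity(y_true, predictions[name]))
    return best, most_sensitive, most_specific


def majority_vote(*member_predictions: Sequence[int]) -> List[int]:
    """Predict 1 where more than half of the members predict 1."""
    return [1 if 2 * sum(votes) > len(votes) else 0 for votes in zip(*member_predictions)]


# Invented toy data (hypothetical models "a", "b", "c"), not study results.
y_true = [1, 1, 1, 0, 0, 0]
predictions = {
    "a": [1, 1, 0, 0, 0, 1],  # balanced sensitivity and specificity
    "b": [1, 1, 1, 1, 1, 0],  # very sensitive, poorly specific
    "c": [0, 0, 1, 0, 0, 0],  # very specific, poorly sensitive
}

selected = select_three(y_true, predictions)
ensemble = majority_vote(*(predictions[name] for name in selected))
print(selected, ensemble, hmss(y_true, ensemble))
```

In a real evaluation, the three models would presumably be selected on data separate from the data used to report the ensemble's performance; the sketch reuses one toy set only to stay short.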

Original language: English
Pages (from-to): 435-439
Number of pages: 5
Journal: Studies in Health Technology and Informatics
Volume: 294
Publication status: Published - 2022

Keywords

  • Algorithms
  • Data Science
  • Humans
  • Sensitivity and Specificity

Title: Optimization of Performance by Combining Most Sensitive and Specific Models in Data Science Results in Majority Voting Ensemble