Abstract
Counterfactual explanations are crucial for enabling users to comprehend and engage with machine learning models, particularly in high-stakes domains such as finance and healthcare, where decisions have significant impacts on individuals' lives. However, existing methods often lack the flexibility to accommodate users' unique constraints and preferences. We present a framework for Flexible Counterfactual Explanations, with an implementation using Generative Adversarial Networks (FCE-GAN). Our approach introduces counterfactual templates, allowing users to dynamically specify mutable and immutable features at inference time, to generate flexible counterfactual explanations for any black-box model. For instance, in heart disease risk prediction, patients may be unable or unwilling to change certain factors like age or family history, while being open to modifying diet or exercise habits. Our framework employs a two-stage process: first generating candidate counterfactuals, then selecting those meeting predefined quality measures. We demonstrate our framework's effectiveness on the Adult UCI income and Heart Disease Risk Prediction datasets, showing improved performance in generating diverse, realistic, and actionable counterfactual explanations compared to existing methods. Our approach offers a powerful, generalizable tool for enhancing model interpretability and fairness in critical decision-making systems, with the flexibility to accommodate various generative models beyond GANs.
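
The two-stage recipe described above (generate candidates that perturb only the user-marked mutable features, then keep those that flip the black-box prediction) can be illustrated with a minimal sketch. The code below is an assumption-laden illustration, not the authors' FCE-GAN: the trained generator is replaced by simple Gaussian perturbations, and all names (`generate_candidates`, `select_counterfactuals`, `mutable_mask`) are hypothetical.

```python
# Minimal sketch of a two-stage flexible-counterfactual loop with a
# counterfactual template (mutability mask). Illustrative only; a real
# FCE-GAN would sample Stage 1 candidates from a trained generator.
import numpy as np

def generate_candidates(x, mutable_mask, n=100, scale=0.5, rng=None):
    """Stage 1: propose candidates by perturbing only the mutable features."""
    rng = rng or np.random.default_rng(0)
    noise = rng.normal(0.0, scale, size=(n, x.shape[0]))
    return x + noise * mutable_mask  # immutable features stay fixed

def select_counterfactuals(candidates, x, predict, target_class, k=5):
    """Stage 2: keep candidates that flip the black-box prediction,
    then rank survivors by proximity to the original instance."""
    flipped = candidates[predict(candidates) == target_class]
    if flipped.size == 0:
        return flipped
    order = np.argsort(np.linalg.norm(flipped - x, axis=1))
    return flipped[order[:k]]

# Toy black-box classifier: "high risk" if the feature sum exceeds 2.
predict = lambda X: (X.sum(axis=1) > 2.0).astype(int)

x = np.array([1.0, 1.0, 0.8])             # original instance, predicted class 1
mutable_mask = np.array([0.0, 1.0, 1.0])  # user marks feature 0 as immutable

cands = generate_candidates(x, mutable_mask, n=500)
cfs = select_counterfactuals(cands, x, predict, target_class=0)
print(cfs)
```

Swapping the Gaussian sampler for a conditional generative model, and the proximity ranking for the paper's predefined quality measures, would recover the setting the abstract describes; this sketch only fixes the control flow and the role of the template mask.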
Original language | English |
---|---|
Publication status | Unpublished - 18 Nov 2024 |
Event | BNAIC/BeNeLearn 2024: Joint International Scientific Conferences on AI and Machine Learning, Jaarbeurs Supernova, Utrecht, Netherlands; Duration: 18 Nov 2024 → 20 Nov 2024; Conference number: 36; https://bnaic2024.sites.uu.nl/ |
Conference
Conference | BNAIC/BeNeLearn 2024: Joint International Scientific Conferences on AI and Machine Learning |
---|---|
Abbreviated title | BNAIC/BeNeLearn 2024 |
Country/Territory | Netherlands |
City | Utrecht |
Period | 18/11/24 → 20/11/24 |
Internet address | https://bnaic2024.sites.uu.nl/ |