Abstract
Despite promising developments in Explainable Artificial Intelligence (XAI), the
practical value of XAI methods remains under-explored and insufficiently validated in real-world settings. Robust and context-aware evaluation is essential,
not only to produce understandable explanations but also to ensure their
trustworthiness and usability for intended users; yet it tends to be overlooked
because there are no clear guidelines on how to design an evaluation with users.
This study addresses this gap with two main goals: (1) to develop a framework of well-defined, atomic properties that characterise the user experience
of XAI in healthcare; and (2) to provide clear, context-sensitive guidelines for
defining evaluation strategies based on system characteristics.
We conducted a systematic review of 82 user studies, sourced from five
databases, all situated within healthcare settings and focused on evaluating
AI-generated explanations. The analysis was guided by a predefined coding
scheme informed by an existing evaluation framework, complemented by
inductive codes developed iteratively.
The review yields three key contributions: (1) a synthesis of current evaluation practices, highlighting a growing focus on human-centred approaches
in healthcare XAI; (2) insights into the interrelations among explanation
properties; and (3) an updated framework and a set of actionable guidelines
to support interdisciplinary teams in designing and implementing effective
evaluation strategies for XAI systems tailored to specific application contexts.
| Original language | English |
| --- | --- |
| Number of pages | 114 |
| Status | Submitted - 2025 |