Profit-based evaluation measures are a necessity in business-oriented contexts, as they help companies make cost-optimal decisions. Among the measures that capture the true nature of costs and benefits in binary classification, the expected maximum profit (EMP) has been used successfully for churn prediction and credit scoring, and has been defined in general for binary classification problems. However, despite its competitive results against the most frequently used measures, the EMP relies on a fixed probability distribution of costs and benefits, whose range is not entirely known in real applications. In this paper, we propose to extend this measure by adding random shocks to these distributions. We call this new measure the R-EMP, following the convention of the analogous EMP measure. Our metric adds a stochastic component to each point of the cost-benefit distributions, assuming that costs and benefits have a fixed probability, but that the range of their distribution is subject to an external shock, which can be different for each cost or benefit. The experimental setup focuses on a credit scoring application using a dataset from a Chilean financial institution, with attribute selection for a logistic regression performed using the AUC, EMP, H-measure, and R-EMP as the selection criteria. The results indicate that the R-EMP is the most robust metric for achieving the greatest profit for the company under uncertain external conditions.
- Business analytics
- Performance measures
- Profit-driven analytics
- Supervised binary classification