Israeli credit company MAX changed the modeling approach for its retail credit portfolio in order to become more value-driven. It moved from traditional segment-based default models, which focus on the probability of a loan not reaching its end date, to predicting different value drivers on a customer level.
To enable this, MIcompany introduced a new biostatistics-based modeling approach, Competing Hazard modeling, which reduced the Mean Absolute Error of the models by 51% to 81%.
After integrating these predictions in a Customer Lifetime Value calculation, MAX was able to refocus its commercial efforts on relevant value pools, and improve the customer offering based on these insights.
Supported by a value monitoring dashboard and a model management engine, MAX has structurally changed its retail credit strategy, and is ready to develop new use cases within retail credit to further grow within the Israeli market.
January 21st, 2019 was a historic day for Leumi Card: having been formally separated from Leumi Bank, it gained the opportunity to grow its customer-facing position. The company acted quickly; it was rebranded to MAX, with a sharpened product portfolio and the ambition to become Israel’s leader in the credit market by becoming a one-stop shop for retail and business customers.
For this, MAX had to understand the actual value it gains from its current customers, so it could focus its acquisition efforts on customer segments that it can serve well, with appropriate products.
To assess value creation on a customer level, you first need to define value and calculate it at the required level of granularity. Then, you need to predict how this value will develop over time, which differs strongly per customer type. In the retail credit industry context: if a customer is unable to fulfill their payments, the entire open debt and future interest income is lost. We call this a default event, which logically has a major impact on the customer value.
Thus, having a precise value prediction was crucial for MAX.
MAX had to deal with two other facts. Firstly, the event of an “early pay-off” can occur, in which a client closes their loan before the end date. As this event has a different impact on customer value, it needs to be modeled as well. The early pay-off event interacts with the default event, meaning that they can’t both happen for the same loan, so each event needs its own model, and that model has to account for the other event. Secondly, historical data had limited value, since the financial market changes fast and MAX’s application policy has been adjusted multiple times in the past few years. As historical data is used to predict future events under the assumption that behavior and policies are stable, such changing circumstances limit the usability of the data.
Given the challenge at hand, traditional modeling techniques were not applicable.
However, MAX showed its innovative character: it used biostatistics-based modeling techniques, which proved to work well for this specific challenge.
These techniques enabled MAX to understand and predict customer behavior, and to calculate expected value on a granular level.
Within a few weeks of model implementation, MAX used the model results to define new business initiatives. Supported by a new sales dashboard, business teams started to launch new value-driven campaigns, while MAX’s elite data science team monitored model performance in the Model Management Center.
This article details how MAX was able to tackle the challenge of customer value modeling in the retail credit industry, while creating sustainable business impact in the process.
A. Biostatistics modeling is used for retail credit modeling
When focusing on personal loans in the retail credit industry, the main modeling challenge is determining the loan finalization probability, i.e. the probability that a customer will fulfill all of their agreed interest and installment payments.
For accurate assessments of these probabilities, survival modeling techniques are typically chosen. Survival analysis is a branch of statistics for analyzing the expected duration of time until an event happens. These survival models result in a prediction of survival per time slot: the probability that the end state has not been reached by the end of that time slot.
When dealing with one possible default event, a standard survival modeling technique could be used, such as the Kaplan-Meier estimator or a Cox proportional hazards model.
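To make the single-event case concrete, the following is a minimal sketch of a Kaplan-Meier estimator in plain Python. It is an illustration only, not the implementation used at MAX; the function name `kaplan_meier` and the toy loan data are assumptions introduced for this example.

```python
from collections import Counter

def kaplan_meier(durations, observed):
    """Return [(t, S(t))] for each distinct event time t.

    durations: time until the loan ended or was last observed
    observed:  True if the end event (e.g. default) occurred, False if censored
    """
    events = Counter(t for t, e in zip(durations, observed) if e)
    survival, curve = 1.0, []
    for t in sorted(events):
        # Loans that are still running (at risk) just before time t
        at_risk = sum(1 for d in durations if d >= t)
        survival *= 1.0 - events[t] / at_risk
        curve.append((t, survival))
    return curve

# Hypothetical toy data: months until default (True) or censoring (False)
durations = [2, 3, 3, 5, 6, 6, 8, 10]
observed = [True, True, False, True, False, True, False, False]
km_curve = kaplan_meier(durations, observed)
for t, s in km_curve:
    print(f"month {t}: survival {s:.3f}")
```

Censored loans (those still running when last observed) count towards the number at risk until they drop out, but never trigger a step in the curve.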
However, in MAX’s situation, there are multiple possible end states of a loan. A loan can end on the agreed end date, or it can be discontinued in one of several scenarios, such as default. The probability of reaching each of the end states depends on different types of features, such as customer behavior, customer characteristics, and product characteristics.
The specific complexity of this situation is that the end states interfere with each other: once a loan has defaulted, it cannot end in any other scenario. Dutch researchers H. Putter, M. Fiocco and R. B. Geskus (2007) state: “one of the fundamental assumptions underlying the Kaplan-Meier estimator is violated: the assumption of independence of time to event and the censoring distributions” (p. 2390).
To solve this challenge, we realized we could learn much from our biostatistics colleagues.
In medical research, often more than one type of event plays a role. Usually, one type of event can be singled out as the event of interest. The other event types may prevent the event of interest from occurring. For example, in researching the time to staphylococcus infection during hospital stay in patients with burn wounds, censoring may occur due to death or hospital discharge. Similar situations arise in bone marrow, cancer and HIV research. In such cases of interfering effects, the Cox proportional hazards model is still used, but the results are interpreted differently. This modeling approach is called Competing Hazard modeling.
While unprecedented in credit modeling applications, MAX used this biostatistics approach: incorporating the two hazards in the model and including the interfering effects in the model results, which resulted in a granular loan duration prediction.
Let’s have a look at the statistical side of this solution and its implications for the modeling challenge. If you wish to skip the mathematical part, the business story continues below.
In standard survival modeling techniques, we define $h(t_j)$ as the probability of failing at time $t_j$, and
$$h(t_j) = \frac{d_j}{n_j}$$
where $d_j$ = number of failures at time $t_j$, and $n_j$ = number of people “alive” after time $t_{j-1}$.
Then, $S(t)$ is defined as the survival function, which is the probability of survival up until time $t$, implying that:
$$S(t) = \prod_{j:\, t_j \le t} \bigl(1 - h(t_j)\bigr)$$
In Competing Hazards modeling, we have the following analogues: we define $h_k(t_j)$ as the probability of failing at time $t_j$ due to cause k, and $S(t)$ as the overall survival function, i.e. the probability of survival from any cause up until time $t$.
The Cox proportional hazards are then calculated as follows:
$$h_k(t \mid X) = h_{k,0}(t)\, \exp\bigl(\beta_k^{\top} X\bigr)$$
With $h_{k,0}(t)$ being the baseline and $\exp(\beta_k^{\top} X)$ being the covariate function, per cause k.
The modeling works as follows: for each cause a Cox-model is developed, with its own variables and parameter output. This results in an estimation of the unconditional probability of failing from cause k at $t_j$, being the product of the hazard $h_k(t_j)$ and the estimated probability of being event-free at $t_{j-1}$:
$$p_k(t_j) = h_k(t_j)\, S(t_{j-1})$$
With this, the cumulative incidence of cause k at time t is estimated as the sum of these terms, being:
$$I_k(t) = \sum_{j:\, t_j \le t} h_k(t_j)\, S(t_{j-1})$$
In conclusion, one Cox proportional hazards model is fitted and scored per cause, each focusing on a specific hazard, while all causes are taken into account in the training phase.
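The cumulative incidence calculation can be sketched in a few lines of Python. This is a simplified, nonparametric version without covariates, so it is not the per-cause Cox models described above; it only illustrates how each cause-specific hazard is multiplied by the overall event-free probability of the previous time point and then summed. The function name `cumulative_incidence`, the cause labels and the toy data are assumptions made for this example.

```python
def cumulative_incidence(durations, causes):
    """Nonparametric cumulative incidence per cause.

    durations: time at which each loan ended or was last observed
    causes:    0 = censored (loan still running), otherwise a cause label
    Returns (times, {cause: [I_k(t) per time]}, [S(t) per time]).
    """
    labels = sorted({c for c in causes if c != 0})
    times = sorted({t for t, c in zip(durations, causes) if c != 0})
    survival = 1.0                        # overall event-free probability S(t_{j-1})
    running = {k: 0.0 for k in labels}
    incidence = {k: [] for k in labels}
    overall = []
    for t in times:
        n = sum(1 for d in durations if d >= t)            # at risk at t_j
        total_hazard = 0.0
        for k in labels:
            d_k = sum(1 for d, c in zip(durations, causes) if d == t and c == k)
            h_k = d_k / n                                  # cause-specific hazard
            running[k] += h_k * survival                   # h_k(t_j) * S(t_{j-1})
            incidence[k].append(running[k])
            total_hazard += h_k
        survival *= 1.0 - total_hazard                     # update to S(t_j)
        overall.append(survival)
    return times, incidence, overall

# Hypothetical toy portfolio of ten loans (durations in months)
durations = [1, 2, 2, 3, 4, 4, 5, 6, 6, 6]
causes = ["default", "early_payoff", 0, "early_payoff",
          "default", "early_payoff", 0, 0, 0, 0]
times, incidence, overall = cumulative_incidence(durations, causes)
print(times, incidence["default"][-1], incidence["early_payoff"][-1], overall[-1])
```

At every time point the cumulative incidences of all causes plus the overall survival add up to one, which is exactly the property that is lost when each cause is analyzed with a separate Kaplan-Meier curve.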
Going back to our credit modeling challenge, each survival modeling approach answers a different research question. The standard survival modeling techniques, such as Cox proportional hazards, answer the question “at each time t, what is the probability that individuals will be unable to finish their loan due to scenario a or b?”. This question is interesting to answer, but is not the correct question for our needs.
Since each risk has a different effect on the expected customer value, we need to answer the question “at each time t, what are the unique probabilities for each scenario?”
Let’s go through a simple example to understand the business effect of using each survival modeling technique.
Say we want to sell a 3-year loan to a private customer, who is considered a risky customer since they have a history of late payments. We want to have a calculation that considers the different ending scenarios for this loan, in order to adjust the interest rate.
When applying a standard survival modeling technique, the model may predict that in 20% of cases, the customer will fail to finish their loan with the pre-arranged terms, due to any of the scenarios. Without distinguishing between scenarios, we might get a wrong assessment of the risks and their likelihood, leading to business decisions, such as acceptance or price setting, that do not consider risks correctly.
With competing hazards, we know that the loan is 4% likely to be in scenario a, and 16% likely to be in scenario b (still adding up to the 20% of the standard model). Now we have a clear view on the risks of the loan, allowing us to correctly make business decisions relating to this customer.
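A short Python sketch shows why the split matters for pricing. The 4% and 16% come from the example above; everything else is an assumption made purely for illustration: that scenario a is a default and scenario b an early pay-off, and the three margin amounts per end state.

```python
def expected_margin(full_margin, scenarios):
    """Probability-weighted margin over all end states of a loan.

    scenarios: {name: (probability, margin if that scenario happens)}
    The remaining probability mass is the loan ending on time with full_margin.
    """
    p_other = sum(p for p, _ in scenarios.values())
    if not 0.0 <= p_other <= 1.0:
        raise ValueError("scenario probabilities must sum to at most 1")
    weighted = sum(p * m for p, m in scenarios.values())
    return (1.0 - p_other) * full_margin + weighted

# Hypothetical margins per end state (not MAX figures)
FULL_MARGIN = 3000.0    # loan ends on the agreed end date
MARGIN_A = -12000.0     # assumed: scenario a = default, open debt is lost
MARGIN_B = 1000.0       # assumed: scenario b = early pay-off, less interest income

# Competing hazards view: 4% scenario a, 16% scenario b
split = expected_margin(FULL_MARGIN, {"a": (0.04, MARGIN_A), "b": (0.16, MARGIN_B)})
# Pooled view, pessimistic reading: the full 20% is treated as scenario a
pooled = expected_margin(FULL_MARGIN, {"a": (0.20, MARGIN_A)})
print(split, pooled)
```

Under these assumed margins the same 20% total leads to very different expected margins, so an interest rate based on the pooled figure would misprice the loan.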
B. Steering on customer value is the answer to volume-profit challenges
Customer Lifetime Value (CLV) steering means making business decisions based on expected value creation of a customer, during their entire customer lifetime.
This business method is beneficial for customers, since their expected loyalty and product development are rewarded with fair offers for services and products. The company benefits by having a compass to focus its efforts on valuable customer segments, with a longer time scope in mind. Using this method, the company can balance short-term volume ambitions against long-term profits.
Customer Lifetime Value is built from four components: expected income, expected costs, risk and expected lifetime. We detail the multi-dimensional challenge of predicting these components in the retail credit industry.
To determine the expected income, a good understanding of the product income model is required. Based on this model and the exact product characteristics, the periodic cashflows and discounting procedures are translated into the expected income per time unit.
The expected costs on a customer level are determined by allocating acquisition (channel) costs to individual customers and calculating expected customer treatment costs, based on customer characteristics and historical behavior.
By combining expected income and costs, the expected margin on a customer level can be determined.
This is the maximum expected margin, as it assumes the standard end state of “ending the loan on time”.
The expected margin should be corrected for the probabilities of the different scenarios and their margin implications.
To determine the expected customer lifetime, the probability of taking a new loan after reaching the standard end date of the loan is calculated, based on customer and product characteristics and customer behavior. After integrating the probabilities of the other end states, the expected lifetime on a customer level can be determined.
By combining the four components and discounting, the Customer Lifetime Value is calculated.
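As a rough illustration of how the components come together, the sketch below discounts a stream of expected periodic margins, each weighted by the probability that the loan is still active. It is a deliberately simplified assumption, not MAX’s actual CLV formula: it ignores scenario-specific losses and renewal probabilities, and the function name and parameters are hypothetical.

```python
def customer_lifetime_value(margins, survival, annual_rate, periods_per_year=12):
    """Discounted sum of expected periodic margins.

    margins:     expected margin (income minus costs) per period while the loan is active
    survival:    probability that the loan is still active in each period
    annual_rate: yearly discount rate, converted to a simple per-period rate
    """
    if len(margins) != len(survival):
        raise ValueError("margins and survival must have the same length")
    rate = annual_rate / periods_per_year
    return sum(
        m * s / (1.0 + rate) ** (t + 1)
        for t, (m, s) in enumerate(zip(margins, survival))
    )

# Hypothetical example: three monthly margins of 100 and a declining survival curve
clv_no_discount = customer_lifetime_value([100.0] * 3, [1.0, 0.9, 0.8], annual_rate=0.0)
clv_discounted = customer_lifetime_value([100.0] * 2, [1.0, 1.0], annual_rate=0.12)
print(clv_no_discount, round(clv_discounted, 2))
```

In a fuller version, the survival curve would come from the competing hazards models, and each end state would contribute its own margin implication.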
After calculating this customer-driven metric, MAX was able to determine the value creation of its different customer segments and locate strength and weakness points. It was also able to evaluate historical campaigns on value creation and compare this with volume generation.
MAX decided to balance these drivers, aiming for sustainable growth.
As stated before, MAX developed the CLV metric to drive its future decision making. To ensure achieving actual business impact, while trusting the model outcomes, MAX did three things.
They defined quick-win value creation initiatives, and developed a value performance dashboard to validate these actions. Furthermore, to ensure that the model outcomes are stable and trustworthy, they have built a Model Management Center, which monitors model performance and sends out an alert in case of deviations.
A. Value creation opportunities from new model
During the model development phase, MAX set up a cross-functional business team. The team was focused on defining specific and concrete short-term value-creation interventions. Typical questions were: how to grow quantities, while keeping a positive value of acquisition? For which customer segments should we limit sales, due to alarming values? How to balance the price and loan offer, to ensure value creation while keeping an attractive market position?
Based on the business strategy, specific growth engines were defined, aimed at reducing value losses and growing in specific value segments.
Within a few weeks after model implementation, the different quick-win actions were set in motion. This resulted in improved offers for MAX’s customers, while maintaining attractive value creation for MAX.
B. Sales dashboard to enable value steering
MAX realized that it required standardized insights on value creation, for two main reasons. Firstly, to track the results of value creation initiatives. Secondly, to detect new possible interventions.
This effort would also create one source of truth about retail credit sales performance, which has been an on-going challenge as well.
With this aim, the business team designed three standard views. The Top View is the management summary, showing sales development in terms of quantities, created value and other financial KPIs.
The Analytics View enables business users to explore performance on key dimensions. Also, a new process was set up to validate the success of launched campaigns using a Sales Funnel View. The dashboards were integrated in MAX’s existing infrastructure, enabling the different end users to act upon them.
“The creation of one centralized truth of business performance is crucial for our business steering. At the same time, we have integrated data from different departments into one infrastructure, which enables us to innovate on our further modeling.”
Lena Dubilier, CDO MAX
C. Monitoring model performance in MCC by elite DS-team
MAX developed new models, which were well-performing and thoroughly validated. This is crucial, as important business decisions are made based on the model outcomes.
To monitor model performance on an ongoing basis, MAX developed a Model Control Center. In this control center the model quality is tracked, and automated triggers signal when the performance drops for one of the defined micro-segments.
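Such a trigger could look roughly like the Python sketch below, which compares the Mean Absolute Error per micro-segment against a baseline. The segment names, the 25% tolerance and both function names are hypothetical choices for this illustration; the actual rules used in MAX’s control center are not described here.

```python
from collections import defaultdict

def segment_mae(records):
    """records: iterable of (segment, predicted, actual) -> {segment: MAE}."""
    errors = defaultdict(list)
    for segment, predicted, actual in records:
        errors[segment].append(abs(predicted - actual))
    return {seg: sum(errs) / len(errs) for seg, errs in errors.items()}

def segments_to_alert(current, baseline, tolerance=0.25):
    """Segments whose MAE grew by more than `tolerance` relative to the baseline."""
    return sorted(
        seg for seg, mae in current.items()
        if seg in baseline and mae > baseline[seg] * (1.0 + tolerance)
    )

# Hypothetical scored loans: (micro-segment, predicted probability, observed outcome)
records = [
    ("young_short", 0.10, 0.0),
    ("young_short", 0.20, 0.0),
    ("senior_long", 0.05, 0.0),
    ("senior_long", 0.90, 1.0),
]
baseline = {"young_short": 0.10, "senior_long": 0.08}   # MAE at validation time (assumed)
current = segment_mae(records)
alerts = segments_to_alert(current, baseline)
print(alerts)
```

A relative tolerance keeps the trigger comparable across segments with very different error levels; an absolute threshold would be an equally valid design choice.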
MAX’s elite data science team took ownership of the models, tracking their performance and intervening when required. This team was selected based on their analytical potential and trained to become all-round data science leaders in the international Data & AI Talent Program, which is part of the curriculum of GAIn – the Global AI network. During this 3-year educational program, the Data Scientists are trained by international trainers in the areas of Data and Technology, Machine Learning and Statistics, Leadership and Change, and Impact and Opportunities. During this program the Data Scientists developed an integrated skillset, which enables them to manage and lead AI projects.
MAX has made it happen: implementing the first business interventions and tracking their immediate impact on sales and value, while ensuring the model predictions are stable and validated. MAX will continue to develop new use cases within retail credit, while incorporating the required dose of analytical skill and creativity to tackle unprecedented modeling exercises, truly taking value creation to the next level.