In the world of financial services where risk management is paramount, we’ve all seen artificial intelligence and machine learning rapidly transforming the landscape. In fact, a recent
survey by the Bank of England and the Financial Conduct Authority (FCA) revealed that
72% of UK financial firms are already using or developing AI/ML applications, and this trend is accelerating, with
the median number of ML applications projected to increase 3.5-fold over the next three years. This growth is not surprising: AI/ML models hold the promise of unlocking insights from vast amounts of data, enabling financial organisations
to make smarter, more informed decisions, and enhance their risk management strategies.
The survey’s findings are consistent with observations that I've made through my work with UK financial services institutions. However, I have found that the move towards AI/ML methodologies is more advanced within Fintechs and Challenger Banks which,
unlike the High Street Banks, may not suffer from actual limitations due to legacy systems or perceived limitations relating to their Internal Ratings-Based (IRB) status.
Fintechs and Challenger Banks have typically recruited tech-savvy data scientists with a deep understanding of the array of alternative advanced techniques that are available. Meanwhile, major banks still hold a significant advantage in terms of experience
and data. They have decades of experience in building credit models, have established model development standards, and have a thorough understanding of the underlying data.
The question now is whether the principles that underpin the development of traditional models remain wholly relevant to the new generation of AI-powered models which are mathematically derived in a completely different way.
Model Development: Traditional vs AI/ML
Traditional scorecard development has long adhered to meticulous sample design, ensuring that the applications during the sample window are both stable and reflective of proposals most recently received. It is typical for Population Stability Indices or Characteristic
Stability Indices to be calculated, and for any patterns that extend beyond reasonable expectations of seasonal variation to be investigated in detail. This approach hinges on the notion of a bespoke development sample tailored to the specific population it
serves. The composition or segmental mix and its specificity is seen as a key factor in the suitability of the model development sample.
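As an illustration of the stability check mentioned above, here is a minimal sketch of a Population Stability Index calculation. It assumes both samples have already been binned into the same score bands; the function name `psi`, the `eps` floor for empty bands and the example counts are all hypothetical. A Characteristic Stability Index uses the same arithmetic, applied to the bins of a single characteristic rather than to score bands.

```python
import math


def psi(expected_counts, actual_counts, eps=1e-6):
    """Population Stability Index between two samples binned identically.

    expected_counts: accounts per band in the development sample.
    actual_counts:   accounts per band in the recent sample.
    eps:             floor on each proportion so empty bands do not
                     cause a division by zero or log(0) (an assumption).
    """
    if len(expected_counts) != len(actual_counts):
        raise ValueError("both samples must use the same bands")
    expected_total = sum(expected_counts)
    actual_total = sum(actual_counts)
    total = 0.0
    for e, a in zip(expected_counts, actual_counts):
        p = max(e / expected_total, eps)  # development proportion
        q = max(a / actual_total, eps)    # recent proportion
        total += (q - p) * math.log(q / p)
    return total


# Hypothetical band counts for a development sample and a recent sample.
development = [120, 300, 380, 200]
recent = [150, 310, 350, 190]
shift = psi(development, recent)
```

A value of zero means the two distributions are identical. Commonly quoted rules of thumb treat values below roughly 0.1 as stable and values above roughly 0.25 as a material shift, although the thresholds applied in practice vary between institutions.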
Interestingly, we often see that AI/ML models exhibit a significant degree of cross-learning. This is where models display stronger performance when the training sample is extended to include additional observations that might not traditionally be considered
directly relevant. For example, we see superior performance from models trained on an expanded sample window versus equivalent models optimised on a period that simply aligns to the independent test sample. This is unlikely to happen using linear models!
Similar findings can be seen when adjacent segments or groups are added to the training samples. Indeed, AI/ML models thrive when developed upon large and diverse data sets. These phenomena will have implications for sample design and choice of exclusions within
model developments of the future, potentially rewriting conventional wisdom.
Similarly, many credit scorecard developments have incorporated segmentation, whereby a model is built for each of a number of sub-populations (e.g. Thin File / Thick File, Clean / Dirty). The benefit of this approach is that, by building multiple models, some
non-linearity can be captured. Of course, the choice of segmentation is not always obvious and is unlikely to be optimal; however, some performance uplift is usually achieved. Given that AI/ML models are built because of their ability to capture non-linearity, there
is limited need for segmented models here, unless there are fundamental differences in data structure. Therefore, although AI/ML models are more complex, fewer of them should be required.
Another area of focus within traditional scorecard development is the process of moving from fine-to-coarse classing. Here, the modeller seeks to divide continuous data into several ordinal groups so that the underlying bad rate shows a logical
progression and is based on sufficient volume to give a reliable result. Advanced methodologies within AI/ML models eliminate the need for fine-to-coarse classing as the grouping is achieved by the underlying methodology, generating smooth response profiles
rather than the step-changes seen as scorecard attribute boundaries are crossed. Furthermore, many training routines now include the option to add constraints to ensure features have a logical impact on the model predictions.
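To make the traditional fine-to-coarse step concrete, the sketch below pools adjacent fine bins until every class holds a minimum volume and the bad rate moves in one direction across classes. It is one simple greedy approach, not a standard or definitive algorithm, and in practice the step is often carried out with manual judgement. The function names, the `min_count` threshold and the example bins are hypothetical.

```python
def bad_rate(cls):
    """Bad rate of a class stored as [upper_edge, goods, bads]."""
    n = cls[1] + cls[2]
    return cls[2] / n if n else 0.0


def coarse_class(fine_bins, min_count=100, increasing=False):
    """Pool adjacent fine bins into coarse classes.

    fine_bins:  (upper_edge, goods, bads) tuples ordered by upper_edge.
    min_count:  minimum accounts required in each coarse class.
    increasing: True if the bad rate should rise with the characteristic,
                False if it should fall (e.g. a score).
    """
    classes = []

    def settle():
        # Pool the last two classes while the earlier one is too thin or
        # the bad rate moves the wrong way between them.
        while len(classes) > 1:
            prev, last = classes[-2], classes[-1]
            too_small = prev[1] + prev[2] < min_count
            if increasing:
                out_of_order = bad_rate(last) < bad_rate(prev)
            else:
                out_of_order = bad_rate(last) > bad_rate(prev)
            if not (too_small or out_of_order):
                break
            classes[-2:] = [[last[0], prev[1] + last[1], prev[2] + last[2]]]

    for upper, goods, bads in fine_bins:
        classes.append([upper, goods, bads])
        settle()

    # The final class has no right-hand neighbour to absorb it, so fold it
    # back into its predecessor if it is still too thin.
    if len(classes) > 1 and classes[-1][1] + classes[-1][2] < min_count:
        prev, last = classes[-2], classes[-1]
        classes[-2:] = [[last[0], prev[1] + last[1], prev[2] + last[2]]]
        settle()

    return [tuple(c) for c in classes]


# Hypothetical fine bins for a score: (upper edge, goods, bads).
fine = [(500, 80, 20), (550, 45, 5), (600, 130, 20), (650, 190, 10), (700, 58, 2)]
coarse = coarse_class(fine, min_count=100)
```

In this example the thin 550 band is absorbed into the 600 band and the thin top band is folded into the 650 band, leaving three classes whose bad rate falls as the score rises. Tree-based AI/ML methods find their own split points, which is why this step is not needed there.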
As the wave of AI/ML model development surges in the coming years, a fusion of deep knowledge of underlying credit data and advanced methodology is key. While new challenges arise in this new generation of models, such as unintended bias and explainability,
historical concerns will become less relevant.