Efficiency
Maintaining a complex rule engine with hundreds of interdependent rules that express constantly changing fraud patterns isn’t easy, and it certainly isn’t scalable. Machine learning based solutions, in contrast, scale with your cloud infrastructure: the only practical difference between processing 1k and 100k transactions is the figure on the invoice from your cloud provider. The workload for data scientists and machine learning engineers stays roughly the same at either scale, provided they use proper tooling and automate repetitive tasks such as data collection and model retraining.
Automatic adaptation via retraining
Concept drift is far less troublesome for machine learning based solutions. In rule-based engines, a change in fraud patterns calls for manual recalibration of existing rules and research into new ones, work that cannot easily be automated. ML models, by comparison, only require rerunning training on fresh data samples and (sometimes) engineering new features that capture the shift in the underlying phenomenon, which is exactly what concept drift describes. Retraining can be automated easily, so once again ML models prove more cost-effective.
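To make the retraining argument concrete, here is a minimal sketch of an automated retraining job. It assumes a scikit-learn gradient boosting model and uses synthetic data as a stand-in for whatever pipeline feeds you freshly labeled transactions; the helper and file names are illustrative, not part of any particular product.

```python
# Minimal retraining sketch: assumes scikit-learn; synthetic data stands in
# for a real feed of freshly labeled transactions.
import joblib
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier


def load_recent_transactions():
    # Placeholder: in a real system this would query the last N days of
    # labeled transactions from a warehouse or feature store.
    return make_classification(n_samples=10_000, n_features=20,
                               weights=[0.99, 0.01], random_state=0)


def retrain(model_path="fraud_model.joblib"):
    X, y = load_recent_transactions()
    model = GradientBoostingClassifier()
    model.fit(X, y)                  # refit on the newest sample
    joblib.dump(model, model_path)   # hand the new model to the serving layer
    return model


if __name__ == "__main__":
    retrain()
```

Wired to a scheduler (cron, Airflow and the like), a job like this keeps the model in step with drifting fraud patterns without anyone rewriting rules by hand.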
Automatic detection of fraud patterns
Today, you can attend an online bootcamp that teaches you how to commit fraud effectively, much the same way one might take an online course to learn programming. This means that obvious fraud patterns, encoded in rule-based engines that haven’t evolved nearly as much as ML in recent years, will be swiftly bypassed by modern fraudsters. In light of this, the automatic fraud pattern detection that comes with ML models is a necessity rather than a luxury.
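As one hedged illustration of what automatic pattern detection can look like in practice, the sketch below scores transactions with an isolation forest, an unsupervised technique that flags records deviating from the bulk of traffic without anyone writing a rule for them. The features and data here are synthetic stand-ins.

```python
# Unsupervised anomaly scoring sketch: an isolation forest surfaces unusual
# transactions without hand-written rules. Data below is synthetic.
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(0)

# Stand-in features: amount, hour of day, distance from home address.
normal = rng.normal(loc=[50, 14, 5], scale=[20, 4, 3], size=(5_000, 3))
odd = rng.normal(loc=[900, 3, 400], scale=[100, 1, 50], size=(20, 3))
transactions = np.vstack([normal, odd])

detector = IsolationForest(contamination=0.01, random_state=0)
detector.fit(transactions)

# score_samples: lower scores mean "more anomalous".
scores = detector.score_samples(transactions)
suspicious = np.argsort(scores)[:20]   # indices of the 20 most unusual rows
print(suspicious)
```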
Power of ensembles
Many modern ML algorithms work as ensembles (e.g. random forest, gradient boosting). Under the hood, these algorithms build numerous separate classifiers, each trained independently on a different subset of the data and each learning slightly different things about fraud patterns. At prediction time they vote on the score for every transaction, which mitigates the problem of bias. If a fraudster comes from the other side of the world and is half the age of the analyst who writes the rules, the bias transferred from analyst to code can open a gateway for fraudsters from unfamiliar backgrounds. Ensembles partially alleviate this single point of failure, because no single model’s blind spot decides the outcome on its own.
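A minimal sketch of that voting behavior, assuming scikit-learn’s random forest and synthetic, imbalanced data as a stand-in for real transactions:

```python
# Ensemble voting sketch: each tree in a random forest is trained on a
# bootstrap sample, and predict_proba averages their votes into a score.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# ~1% positive class to mimic a fraud-like imbalance.
X, y = make_classification(n_samples=20_000, n_features=15,
                           weights=[0.99, 0.01], random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, stratify=y,
                                                    random_state=0)

# 300 trees, each seeing slightly different data, learn slightly different
# pictures of what "fraud" looks like.
forest = RandomForestClassifier(n_estimators=300, class_weight="balanced",
                                random_state=0)
forest.fit(X_train, y_train)

# The averaged vote of the trees becomes the fraud score per transaction.
scores = forest.predict_proba(X_test)[:, 1]
print(scores[:10])
```

Because every tree sees a different sample, no single tree’s blind spot determines the final score on its own.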
Explainability
Rule-based systems hold a strong advantage over ML models in terms of explainability: there is little ambiguity over why a certain transaction was blocked. Some ML algorithms (especially deep neural networks, the most hyped of all ML techniques) work as black boxes - there is no easy way of saying why they returned a certain value for a given input. Fortunately, most fraud detection datasets consist of structured, tabular data (albeit heavily imbalanced), and that is exactly where decision-tree-based algorithms work really well. Predictions of such models can be easily explained using packages like ELI5 (which stands for “Explain Like I'm 5”) that let us see which transaction traits contribute to its likelihood of being fraudulent - just like in rule-based systems. Even when the algorithm is not tree-based, there are many tools that try to demystify the internal workings of those black boxes, deep neural networks included. XAI, which stands for “Explainable Artificial Intelligence”, is a young field that has gained a lot of attention recently because many real-world applications of ML models demand explainability.
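Here is a hedged sketch of explaining a single prediction with ELI5 for a tree-based model; the exact calls and output format can vary between ELI5 and scikit-learn versions, and the feature names and data below are placeholders.

```python
# Explainability sketch: ELI5 shows which features push one transaction
# towards "fraud" (per-prediction) and which matter most overall (global).
import eli5
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

feature_names = [f"feature_{i}" for i in range(10)]  # placeholder names
X, y = make_classification(n_samples=5_000, n_features=10,
                           weights=[0.99, 0.01], random_state=0)

model = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

# Per-prediction view: contributions of each feature for one transaction.
explanation = eli5.explain_prediction(model, X[0], feature_names=feature_names)
print(eli5.format_as_text(explanation))

# Global view: overall feature weights, loosely analogous to rule weights.
print(eli5.format_as_text(eli5.explain_weights(model,
                                               feature_names=feature_names)))
```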