Combating insurance fraud with machine learning
Most insurance companies depend on human expertise and business rules-based software to protect themselves from fraud. However, people move on. And the drive for digital transformation and process automation means data and scenarios change faster than you can update the rules.
By Georgios Kapetanvasileiou, Analytical Consultant at SAS
Machine learning has the potential to allow insurers to move from the current state of 'detect and react' to 'predict and prevent'.
It excels at automating the process of taking large volumes of data, analysing multiple fraud indicators in parallel - which taken individually may often be quite normal - and finding potential fraud. Generally, there are two ways to teach or train a machine learning algorithm, which depend on the available data: supervised and unsupervised learning.
In predictive modelling, or supervised learning, algorithms make predictions based on a set of examples from historical data. You can present an algorithm with historical claims information and the associated outcomes, often called labelled data. It will attempt to identify the underlying patterns in fraudulent cases.
Once the algorithm has been trained on past examples, you can use it to infer the probability of a new claim being fraudulent. AKSigorta Insurance is using advanced predictive modelling as part of its investigation process. The company has managed to increase its fraud detection rate by 66% and prevent fraud in real time.
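The train-then-score workflow described above can be sketched in a few lines of Python. This is only an illustration, assuming a hand-written logistic regression and a tiny, made-up claims history with two hypothetical features scaled to the range 0 to 1 (claim amount and policy age). It is not the approach used by any insurer mentioned here, and a real project would normally rely on a tested machine learning library.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_logistic(rows, labels, lr=0.5, epochs=2000):
    """Fit a logistic regression with plain batch gradient descent."""
    weights = [0.0] * len(rows[0])
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for x, y in zip(rows, labels):
            error = sigmoid(bias + sum(w * v for w, v in zip(weights, x))) - y
            for j, v in enumerate(x):
                grad_w[j] += error * v
            grad_b += error
        weights = [w - lr * g / len(rows) for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / len(rows)
    return weights, bias

def fraud_probability(model, claim):
    """Score a claim with a trained model; returns a value between 0 and 1."""
    weights, bias = model
    return sigmoid(bias + sum(w * v for w, v in zip(weights, claim)))

# Hypothetical labelled history: (scaled claim amount, scaled policy age).
history = [(0.10, 0.90), (0.20, 0.80), (0.15, 0.70), (0.30, 0.60),
           (0.90, 0.10), (0.80, 0.20), (0.70, 0.15), (0.85, 0.30)]
labels = [0, 0, 0, 0, 1, 1, 1, 1]  # 1 = confirmed fraud, 0 = genuine

model = train_logistic(history, labels)

# Score a new, unseen claim: a large amount on a very young policy.
new_claim = (0.85, 0.10)
print(f"fraud probability: {fraud_probability(model, new_claim):.2f}")
```

The output is a probability rather than a verdict, so an investigator can rank incoming claims and decide where to set the referral threshold.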
There is a wide variety of predictive modelling algorithms to choose from, so users should take into account issues such as accuracy, interpretability, training time and ease of use. There is no single approach that works universally. Even experienced data scientists have to try different methods to find the right algorithm for a specific problem.
It is, therefore, best to start simple and explore more advanced machine learning methodologies later. Decision trees, for example, are an excellent way to start exploring complex relationships within data. They are relatively easy to implement and fast to train on large volumes of data.
More importantly, they are very easy to understand or interpret, and can be a good starting point for new business rules.
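To show why a tree translates so readily into a business rule, here is a minimal sketch of the smallest possible decision tree: a single split chosen by Gini impurity. The feature names and the eight example claims are hypothetical, and a real tree would keep splitting each branch in the same way.

```python
def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)

def best_split(rows, labels, feature_names):
    """Try every midpoint of every feature; keep the purest split."""
    best = None
    for j, name in enumerate(feature_names):
        values = sorted({row[j] for row in rows})
        for low, high in zip(values, values[1:]):
            threshold = (low + high) / 2
            left = [y for row, y in zip(rows, labels) if row[j] <= threshold]
            right = [y for row, y in zip(rows, labels) if row[j] > threshold]
            impurity = (len(left) * gini(left) + len(right) * gini(right)) / len(rows)
            if best is None or impurity < best[0]:
                best = (impurity, name, threshold, left, right)
    return best

# Hypothetical history: (claims in the last year, days since policy start).
feature_names = ["claims_last_year", "days_since_policy_start"]
rows = [(0, 400), (1, 250), (0, 600), (1, 30), (4, 20), (3, 300), (5, 15), (1, 500)]
labels = [0, 0, 0, 0, 1, 1, 1, 0]  # 1 = confirmed fraud

impurity, feature, threshold, left, right = best_split(rows, labels, feature_names)

# Turn the split into a readable rule pointing at the riskier branch.
left_rate = sum(left) / len(left)
right_rate = sum(right) / len(right)
operator = ">" if right_rate > left_rate else "<="
rule = f"IF {feature} {operator} {threshold} THEN refer for investigation"
print(rule)
```

Because the model is literally a threshold on a named field, a fraud team can review it, challenge it and, if it holds up, adopt it as a rule.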
Other options for more accuracy
Decision trees can, however, become unstable over time, because small changes in the training data can produce a very different tree. When accuracy becomes a priority, practitioners should look at other options. Support Vector Machines (SVMs) and neural networks can learn complex class boundaries and generalise well to unseen cases. They have been extensively used for fraud detection.
Tree-based ensemble methods, such as gradient boosting and random forests, have also become more popular in recent years. Ideally, analysts should try multiple approaches in parallel before deciding what works best.
Supervised learning is effective in identifying familiar cases of fraudulent activity but cannot uncover new patterns. Another challenge is the limited number of fraud examples available to train the algorithm. Fraud is a relatively rare event, after all. The ratio of fraud to non-fraud cases can sometimes be as extreme as 1 to 10,000. This means that predictive algorithms tend to be overwhelmed by the sheer volume of non-fraud cases, and may miss the fraudulent ones. Labelling new data for training a model can also be time-consuming and expensive.
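A little arithmetic shows how misleading a headline accuracy figure can be at a fraud rate of 1 in 10,000. The sketch below assumes a hypothetical portfolio of one million claims and a "model" that simply labels every claim as genuine.

```python
# Hypothetical portfolio with a fraud rate of 1 in 10,000.
total_claims = 1_000_000
fraud_claims = total_claims // 10_000          # 100 fraudulent claims
genuine_claims = total_claims - fraud_claims   # 999,900 genuine claims

# A useless "model" that labels every claim as genuine.
correct_predictions = genuine_claims  # right on every genuine claim, wrong on every fraud
frauds_caught = 0

accuracy = correct_predictions / total_claims  # share of all claims labelled correctly
recall = frauds_caught / fraud_claims          # share of actual fraud that was caught

print(f"accuracy: {accuracy:.2%}")  # accuracy: 99.99%
print(f"recall:   {recall:.0%}")    # recall:   0%
```

This is why fraud models are usually judged on how much fraud they catch and how many false alarms they raise, rather than on overall accuracy.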
Unsupervised learning algorithms are trained against data with no historical labels. In other words, the algorithm is not given the answer or outcome beforehand. It is merely asked to explore the data and uncover any 'interesting' structures within them.
For example, given certain behavioural information, unsupervised learning algorithms can identify groups (or clusters) of customer transactions that appear similar. Anything that appears different or rare could be flagged as an anomaly (or an outlier) for further investigation.
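As a rough sketch of the clustering idea, the code below groups transactions with a basic k-means routine and flags anything that sits unusually far from its nearest cluster centre. The eleven data points, the two starting centres and the "three times the median distance" cut-off are all assumptions chosen for illustration.

```python
import math
from statistics import median

def kmeans(points, centroids, iterations=10):
    """Basic k-means: assign each point to its nearest centre, then re-centre."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        centroids = [
            tuple(sum(axis) / len(cluster) for axis in zip(*cluster)) if cluster else c
            for cluster, c in zip(clusters, centroids)
        ]
    return centroids

# Hypothetical, already-scaled behavioural features for eleven transactions.
points = [
    (1.0, 1.0), (1.2, 0.8), (0.8, 1.1), (1.1, 1.2), (0.9, 0.9),  # one typical group
    (5.0, 5.0), (5.2, 4.9), (4.8, 5.1), (5.1, 5.2), (4.9, 4.8),  # another typical group
    (9.0, 1.0),                                                    # unlike either group
]

# Fixed starting centres keep this example deterministic.
centroids = kmeans(points, centroids=[points[0], points[5]])

# Distance from each transaction to its nearest cluster centre.
distances = [min(math.dist(p, c) for c in centroids) for p in points]

# Assumed cut-off: flag anything more than three times the median distance.
threshold = 3 * median(distances)
anomalies = [p for p, d in zip(points, distances) if d > threshold]
print(anomalies)  # [(9.0, 1.0)]
```

No labels were needed: the last transaction is flagged only because it does not resemble either group.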
Unsupervised learning methods can, therefore, identify both existing and new types of fraud. They are not restricted to predefined labels, so can quickly adapt to new and emerging patterns of dishonest behaviour. For example, a New Zealand health insurer used unsupervised learning methods to identify cases where practitioners were deliberately overcharging patients for a particular procedure or providing unnecessary treatment for certain diagnoses.
Traditional unsupervised anomaly detection methods include univariate outlier analysis and clustering-based methods such as k-means. However, the recent move towards digitalisation means much larger volumes of data from a wider range of sources, which these methods can struggle to handle.
New algorithms, such as Support Vector Data Description, Isolation Forest or Autoencoders, have been introduced to address this. These may be a more efficient way of detecting anomalies and allow for faster reaction to new fraud.
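To give a flavour of how one of these newer methods works, here is a simplified, hand-written sketch of the Isolation Forest idea: data is split at random over and over, and points that end up alone after very few splits are scored as anomalous. It assumes a small made-up data set and, unlike the usual algorithm, builds every tree on the full data set rather than a random subsample. Production work would normally use a tested library implementation.

```python
import math
import random

def build_tree(points, rng, max_depth, depth=0):
    """Recursively split on a random feature at a random value."""
    if len(points) <= 1 or depth >= max_depth:
        return {"size": len(points)}
    feature = rng.randrange(len(points[0]))
    low = min(p[feature] for p in points)
    high = max(p[feature] for p in points)
    if low == high:
        return {"size": len(points)}
    split = rng.uniform(low, high)
    left = [p for p in points if p[feature] < split]
    right = [p for p in points if p[feature] >= split]
    return {"feature": feature, "split": split,
            "left": build_tree(left, rng, max_depth, depth + 1),
            "right": build_tree(right, rng, max_depth, depth + 1)}

def average_path_length(n):
    """Expected depth of an unsuccessful binary-search-tree lookup among n points."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    return 2.0 * (math.log(n - 1) + 0.5772156649) - 2.0 * (n - 1) / n

def path_length(tree, point, depth=0):
    """Number of splits needed to reach the leaf that holds this point."""
    if "feature" not in tree:
        return depth + average_path_length(tree["size"])
    branch = "left" if point[tree["feature"]] < tree["split"] else "right"
    return path_length(tree[branch], point, depth + 1)

def anomaly_scores(points, n_trees=100, seed=0):
    """Scores near 1 mean 'isolated quickly'; typical points score lower."""
    rng = random.Random(seed)
    max_depth = math.ceil(math.log2(len(points)))
    trees = [build_tree(points, rng, max_depth) for _ in range(n_trees)]
    normaliser = average_path_length(len(points))
    return [2 ** (-(sum(path_length(t, p) for t in trees) / n_trees) / normaliser)
            for p in points]

# Hypothetical data: 25 ordinary points on a small grid plus one distant point.
points = [(x / 4, y / 4) for x in range(5) for y in range(5)] + [(10.0, 10.0)]
scores = anomaly_scores(points)
most_anomalous = points[scores.index(max(scores))]
print(most_anomalous)
```

The distant point is cut off from the rest after one or two random splits in almost every tree, while the ordinary points need many more, which is what drives its score up.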
Social network analysis
The methods described above are useful for identifying opportunistic fraud. However, many fraudsters today operate as part of professional, organised rings. Activity may include staged motor accidents to collect insurance payouts, ghost brokering, or collusion between patients and health practitioners to inflate claim amounts. These career fraudsters can repeatedly disguise their identities and evolve their way of operating over time.
Social network analysis is a tool for analysing and visually representing relationships between known entities. Examples of such links could be different applicants sharing the same telephone number or IP address, or a single motor accident involving multiple people.
Social network methods can automate the process of drawing connections from disparate data sources and visually representing them as a network. This can significantly reduce investigation time - in one case, from ten days to just two hours. In the UK, a large P&C insurer saved £7m a year by uncovering groups of collaborating fraudsters using network analytics.
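The core of the linking step can be sketched without any specialist software. The example below assumes five hypothetical applications, connects any two that share a telephone number or IP address, and then reports each connected group. Real network analytics adds entity resolution, many more link types and visualisation on top of this.

```python
from collections import defaultdict

def find_linked_groups(applications, min_size=2):
    """Group applications that are connected through shared attribute values."""
    # Two-sided graph: application IDs on one side, attribute values on the other.
    graph = defaultdict(set)
    for app_id, attributes in applications.items():
        for kind, value in attributes.items():
            attribute_node = (kind, value)
            graph[app_id].add(attribute_node)
            graph[attribute_node].add(app_id)

    seen, groups = set(), []
    for app_id in applications:
        if app_id in seen:
            continue
        # Walk outwards from this application to everything reachable from it.
        stack, members = [app_id], set()
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            if node in applications:
                members.add(node)
            stack.extend(graph[node] - seen)
        if len(members) >= min_size:
            groups.append(sorted(members))
    return groups

# Hypothetical applications; the phone numbers and IP addresses are made up.
applications = {
    "A1": {"phone": "555-0101", "ip": "203.0.113.5"},
    "A2": {"phone": "555-0101", "ip": "203.0.113.9"},
    "A3": {"phone": "555-0199", "ip": "203.0.113.9"},
    "A4": {"phone": "555-0142", "ip": "198.51.100.7"},
    "A5": {"phone": "555-0177", "ip": "192.0.2.44"},
}

groups = find_linked_groups(applications)
print(groups)  # [['A1', 'A2', 'A3']]
```

A1 and A3 share nothing directly, but both are tied to A2, one by telephone number and one by IP address. Indirect links of this kind are easy to miss when claims are reviewed one at a time.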
A hybrid approach
No single technique, however, is capable of systematically identifying all complex fraud schemes. Instead, insurers need to combine sophisticated business rules with advanced machine learning approaches. This will allow them to cast the net wide while improving accuracy and reducing false positives, making fraud detection more efficient.
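One possible way to combine these signals is sketched below. It is an assumption for illustration rather than a prescribed design: the two rules, the score thresholds and the three outcomes are all hypothetical. Any single signal sends a claim for review, which casts the net wide, while a full investigation requires at least two independent signals, which helps keep false positives down.

```python
# Hypothetical business rules, each a named check on a claim record.
RULES = {
    "claim within 30 days of policy start": lambda c: c["days_since_policy_start"] < 30,
    "claim above 10,000": lambda c: c["amount"] > 10_000,
}

def triage(claim, fraud_probability, anomaly_score):
    """Combine rule hits, a supervised score and an unsupervised score."""
    signals = [name for name, rule in RULES.items() if rule(claim)]
    if fraud_probability >= 0.8:   # assumed threshold for the predictive model
        signals.append("high model score")
    if anomaly_score >= 0.7:       # assumed threshold for the anomaly detector
        signals.append("unusual behaviour")

    if len(signals) >= 2:
        return "investigate", signals
    if signals:
        return "review", signals
    return "fast-track", signals

decision, reasons = triage(
    {"days_since_policy_start": 12, "amount": 4_000},
    fraud_probability=0.91,
    anomaly_score=0.20,
)
print(decision, reasons)
```

Returning the reasons alongside the decision gives the investigator a starting point and makes the outcome easier to explain.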