We hear a lot about artificial intelligence. What is artificial intelligence? How would you define it?
The field of artificial intelligence (AI) had its beginnings in the fifties and was, for a long time, mainly the focus of academic and industry research projects. The main inspiration was our human brain and relevant cognitive functions. At that time, the main AI approaches were based on logic and computer programs. AI based on learning from data took longer to mature and had its breakthrough with the significant increase of available storage and computational resources as well as the vastly larger quantities of data.
In a simplified way, AI is the attempt to teach computers how to learn from examples, adjust their actions accordingly and perform a human-like task. Machine learning (ML) is a subset of AI and includes algorithms which allow computers to learn from data without being explicitly programmed.
As Terrence Sejnowski wrote in The Deep Learning Revolution: “Data are the new oil. Learning algorithms are refineries that extract information from raw data; information can be used to create knowledge; knowledge leads to understanding; and understanding leads to wisdom.” It is a long journey from raw data to building wise computers, and we are only at the very beginning. However, it is fascinating to see, and to be able to benefit from, the progress that has already been made in the area.
What is fraud detection? How does it work?
Fraud detection is a process designed to identify anomalies in the data and thereby prevent unauthorized financial activities, i.e. unauthorized payment transactions.
The anomalies are traditionally detected by applying rule-based models (e.g. a certain number of transactions within a timeframe, of a certain amount, in a particular merchant category, or swipe transactions originating from specific countries). In the last few years, ML models have attracted growing interest because they can learn from historical fraud patterns (e.g. the cumulated amount in the last 24 hours combined with the cumulated number of transactions) and recognize them in the future.
In general, every payment you make with your debit card is vetted by the fraud detection process to ensure the legitimacy of the payment. It goes through a ruleset and possibly also an ML model.
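To make the rule-based approach concrete, here is a minimal sketch of a velocity rule built on the two features mentioned above: the cumulated amount and the number of transactions in the last 24 hours. The thresholds, function names and sample data are hypothetical and chosen only for illustration; a production ruleset would be far richer.

```python
from datetime import datetime, timedelta

# Hypothetical thresholds, chosen only for illustration.
MAX_TX_COUNT = 5
MAX_TOTAL_AMOUNT = 2000.0
WINDOW = timedelta(hours=24)


def window_features(history, now):
    """Return (count, cumulated amount) of transactions in the last 24 hours."""
    recent = [amount for ts, amount in history if now - WINDOW < ts <= now]
    return len(recent), sum(recent)


def is_suspicious(history, new_tx):
    """Flag the new transaction if it pushes either feature over its threshold."""
    ts, amount = new_tx
    count, total = window_features(history, ts)
    return (count + 1) > MAX_TX_COUNT or (total + amount) > MAX_TOTAL_AMOUNT


now = datetime(2024, 1, 2, 12, 0)
history = [
    (now - timedelta(hours=30), 900.0),  # older than 24 hours, ignored
    (now - timedelta(hours=5), 800.0),
    (now - timedelta(hours=1), 700.0),
]

small = is_suspicious(history, (now, 100.0))  # cumulated amount stays below the limit
large = is_suspicious(history, (now, 600.0))  # cumulated amount exceeds the limit
```

The same two features could also serve as inputs to an ML model, which would learn the decision boundary from historical cases instead of relying on hand-set thresholds.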
What are the main challenges with fraud detection? How does machine learning help tackle them?
With the rapid rise of e-commerce markets, the sophistication and dynamics behind fraudulent behavior have also increased. The enhancement and maintenance of rule-based models is becoming not only more complex but also more time-consuming. It is exactly in this context that the application of machine learning can help. It reduces manual work by learning new patterns itself and adjusting the decision-making process according to human feedback on fraudulent/non-fraudulent behavior.
Another benefit is what is known as collective intelligence. I believe that, with the increasing maturity of privacy-preserving technologies, i.e. the ability to share insights without sharing the data per se, involved parties such as banks, payment service providers, etc. will become more open to contributing to the collective intelligence network. This concept will enable the training of ML models with more data and therefore better prevent fraudulent behavior.
What are the main techniques used in the area?
Rule-based fraud detection is still the most common practice. However, with the rise of AI during the last decade, ML-based solutions are also slowly gaining in popularity. The techniques used range from decision-tree-type algorithms to deep learning. In addition, we see more and more solutions leveraging unsupervised learning to detect new patterns of fraud. Feeding classifiers with new labeled data, the output of unsupervised learning, further improves the ability to react quickly to unknown fraud patterns.
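As a rough illustration of the unsupervised side, the sketch below scores transaction amounts without using any labels, by measuring how far each one lies from the median in units of the median absolute deviation (MAD). This is one simple outlier score picked for the example, not necessarily the technique used in practice; the threshold and the sample amounts are hypothetical.

```python
from statistics import median


def robust_scores(values):
    """Score each value by its distance from the median, in MAD units."""
    med = median(values)
    mad = median([abs(v - med) for v in values])
    if mad == 0:
        # Degenerate case: at least half of the values equal the median,
        # so this simple score cannot rank them.
        return [0.0 for _ in values]
    return [abs(v - med) / mad for v in values]


def flag_outliers(values, threshold=5.0):
    """Return the indices whose score exceeds the (hypothetical) threshold."""
    return [i for i, s in enumerate(robust_scores(values)) if s > threshold]


amounts = [42.0, 38.5, 51.0, 45.0, 40.0, 47.5, 980.0, 44.0]
candidates = flag_outliers(amounts)  # indices of amounts that deviate strongly
```

Flagged candidates would typically be reviewed and labeled, and those labels could then be used to retrain a supervised classifier.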
Machine learning is often called a black box because it’s difficult to understand how a decision was made. What do you think?
Yes, machine learning is not always self-explanatory or straightforward. I therefore believe that knowledge is crucial to increase both acceptance of and confidence in these systems. It is important to educate users about the main principles behind machine learning, its advantages and its limitations. In addition, making the decision process of a machine learning model understandable is almost as important as the accuracy of the model. Explainable AI is a new concept that aims at bringing transparency and traceability to decision-making powered by machine learning.
Machine learning is the use and development of computer systems that are able to learn and adapt without following explicit instructions, by using algorithms and statistical models to analyze and draw inferences from patterns in data. [source]
Machine learning includes two main approaches: supervised and unsupervised learning. In supervised learning, we train models based on labeled data, e.g. “fraud”/“no fraud”, and use these models to classify future events. Unsupervised learning aims to identify clusters of data that show similar properties. These methods are used when labeled data is not available.
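A minimal sketch of supervised learning, assuming a single feature (the transaction amount) and labels of 1 for “fraud” and 0 for “no fraud”: a one-level decision tree, often called a decision stump, picks the threshold that misclassifies the fewest labeled examples. The function names and the toy data are hypothetical.

```python
def train_stump(amounts, labels):
    """Pick the amount threshold that misclassifies the fewest labeled examples.

    The stump predicts fraud (1) when amount > threshold, else no fraud (0).
    """
    best_threshold, best_errors = None, len(labels) + 1
    for threshold in sorted(set(amounts)):
        errors = sum(
            (amount > threshold) != bool(label)
            for amount, label in zip(amounts, labels)
        )
        if errors < best_errors:
            best_threshold, best_errors = threshold, errors
    return best_threshold


def predict(threshold, amount):
    """Classify a future transaction with the learned threshold."""
    return 1 if amount > threshold else 0


# Toy labeled history: 1 = fraud, 0 = no fraud.
amounts = [20.0, 35.0, 50.0, 80.0, 400.0, 650.0, 900.0]
labels = [0, 0, 0, 0, 1, 1, 1]

threshold = train_stump(amounts, labels)
```

Real models combine many such splits over many features, but the principle is the same: the decision boundary is learned from labeled history rather than written by hand.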
Deep learning is a class of machine learning, and it involves the use of multiple layers in the network – learning process – to progressively extract higher-level features from the raw input. [source]
Julinda Gllavata
Julinda has been with SIX for more than 7 years. In her current role, she leads the Banking Services data science team. Prior to joining SIX, Julinda worked for different international companies such as Accenture, Bosch and OMRON.
Her educational path led her from Albania to Germany to pursue her PhD studies. The focus of her dissertation was on the application of machine learning techniques in image processing and pattern recognition.