Knowledge Graph: Facilitates Fraud Analytics

ISB Institute of Data Science
Oct 22, 2020

Dr. Shruti Mantri

Email: shruti_mantri@isb.edu

ISB Institute of Data Science

Vishal Siram

Email: vishalsiram_anil@isb.edu

ISB Institute of Data Science

Today, companies rely increasingly on AI and analytics applications in their day-to-day business decision making. Advances in AI now focus on hybrid intelligence, which fuses learning from data with reasoning over knowledge to make decisions. A Knowledge Graph enables machines to incorporate human expertise when making decisions and brings context to AI applications. Connecting datasets lets a business gain context from the knowledge it already has. For organizations to stay competitive in the current knowledge economy, it is crucial to manage knowledge efficiently and be ready for changes that might be either a threat or an opportunity to their business. This is where Knowledge Graphs facilitate decision making by connecting datasets. Most of the tech giants, such as Amazon, Facebook, Microsoft, and Google, have invested millions of dollars to create their own Knowledge Graphs. A Knowledge Graph processes and represents data and knowledge in a format very close to the way a human brain processes and stores information. This enables users to quickly access closely connected objects and adapt the connection based on the context.

A Knowledge Graph brings together machine learning and graph technologies to give AI the context it needs. Knowledge Graphs represent a complex network of information in a meaningful way, similar to human intelligence, by integrating data from a wide range of data silos and incorporating learning and reasoning. In this article, we highlight use cases of Knowledge Graphs for financial services. Augmented analytics supported by Knowledge Graphs and machine learning enables companies to identify fraudulent patterns and investigate specific criminal links.

The global fraud detection and prevention market was valued at USD 28.9 billion in 2019 and is expected to reach USD 85.3 billion by 2025. The market is segmented into BFSI (banking, financial services, and insurance), Retail, Telecommunication, Manufacturing, Healthcare, and others. Of these domains, BFSI is the most exposed to attack because it deals with sensitive data: the volume of transactions and the digitalization of payments in the sector give attackers the opportunity to exploit vulnerabilities in digital systems. Geographically, North America was the largest fraud detection and prevention market in 2019 [1]. This is attributed to an increase in fraudulent activity exploiting vulnerabilities in digital systems, and the growing adoption of artificial intelligence, machine learning, and IoT for fraud detection in organizations is predicted to boost the North American market during the forecast period. The number of fraud cases in the Asia-Pacific region has also increased over the last decade, and the market there is expected to grow during the forecast period, owing to the adoption of fraud detection and prevention software and systems across various verticals.

Knowledge Graphs empowered by machine learning and reasoning capabilities allow companies to better identify fraudulent patterns by traversing many interconnected entities in a large network in real time. The technique rests on understanding the connections and relationships between entities: for example, when a transaction happens, a digital footprint is created that connects the two entities involved in the transaction. Knowledge Graphs build on graph databases to identify the links relevant to fraud detection. A graph is a collection of nodes and edges, where each node represents an entity and each edge describes a relationship between entities. A graph database is a type of NoSQL database that uses graph theory to store, map, and query relationships. Graph databases can be used to process financial and purchase transactions in real time and identify the relationships between the parties involved in each transaction.
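To make the node-and-edge idea concrete, here is a minimal Python sketch of transactions stored as a graph. It is an illustration only, not how a graph database is implemented: the customers, stores, and amounts are hypothetical toy data, and the helper names `neighbours` and `two_hops` are invented for this example.

```python
from collections import defaultdict

# Hypothetical toy data: (customer, merchant, amount) purchase records.
transactions = [
    ("Alice", "Store A", 25.0),
    ("Alice", "Store B", 40.0),
    ("Bob", "Store A", 12.5),
]

# Nodes are entities; each edge links the two parties of one transaction.
nodes = set()
edges = defaultdict(list)  # node -> list of (neighbour, amount)

for customer, merchant, amount in transactions:
    nodes.update([customer, merchant])
    edges[customer].append((merchant, amount))
    edges[merchant].append((customer, amount))  # keep the reverse link too


def neighbours(node):
    """Return the entities directly connected to `node`, sorted by name."""
    return sorted({other for other, _ in edges[node]})


def two_hops(node):
    """Entities reached through one intermediate entity,
    e.g. other customers who shopped at the same merchant."""
    found = set()
    for first in neighbours(node):
        found.update(neighbours(first))
    found.discard(node)
    return sorted(found)


print(neighbours("Store A"))  # customers linked to Store A
print(two_hops("Alice"))      # customers sharing a merchant with Alice
```

Following edges from one entity to the next is the traversal that graph databases are built to do efficiently on much larger networks.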

Credit Card Fraud Transaction Detection and Analysis

Photo by William Iven on Unsplash

Problem statement:

Banks, merchants, and payment gateway companies lose billions of dollars every year to credit card fraud. Attackers steal credit card details using a variety of techniques, for example by intercepting card details with a reader chip inserted in the card-swiping machine at a petrol pump, or by stealing data in a mass breach of a large retailer’s dataset. In December 2013, police in Abington, Pennsylvania arrested two post office employees for stealing credit card information and using it to buy more than $50,000 worth of merchandise. The post office clerks: (i) copied the credit card information of some of their customers while processing transactions; (ii) located these customers’ home addresses; (iii) used the credit card numbers to place online orders for goods or gift cards, to be delivered to their victims’ home addresses; and (iv) had an accomplice wait at each address to intercept the deliveries.

Methodology Used for Knowledge Graph Based Fraud Detection

A Knowledge Graph is used to represent every transaction as part of a graph, identify the common denominator in the fraud cases, and find the origin of the scam.

Data Collection for the Use Case

A Neo4j sample dataset [5] is used to construct the Knowledge Graph and identify the origin of the credit card fraud and its spread.

Design and Development of Knowledge Graph

Step 1: Identify the customers and merchants involved in fraudulent transactions

MATCH (victim:Person)-[r:HAS_BOUGHT_AT]->(merchant)
WHERE r.status = "Disputed"
RETURN victim.name AS `Customer Name`, merchant.name AS `Store Name`, r.amount AS Amount, r.time AS `Transaction Time`
ORDER BY `Transaction Time` DESC
Figure 1. Customers and Merchants involved in Fraudulent Transactions

Step 2: Identify the date and time of the legitimate transactions made before the fraudulent ones

MATCH (victim:Person)-[r:HAS_BOUGHT_AT]->(merchant)
WHERE r.status = "Disputed"
MATCH (victim)-[t:HAS_BOUGHT_AT]->(othermerchants)
WHERE t.status = "Undisputed" AND t.time < r.time
WITH victim, othermerchants, t ORDER BY t.time DESC
RETURN victim.name AS `Customer Name`, othermerchants.name AS `Store Name`, t.amount AS Amount, t.time AS `Transaction Time`
ORDER BY `Transaction Time` DESC
Figure 2. Date and time of the legitimate transactions made before the fraudulent ones
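For readers without a Neo4j instance at hand, the logic of the two queries so far can be mirrored in plain Python. This is a rough sketch under stated assumptions: the `Purchase` records below are hypothetical toy data standing in for `HAS_BOUGHT_AT` relationships (they are not taken from the Neo4j dataset), and the function names are invented for illustration. Unlike the Cypher query, which can repeat a row when a victim has several disputed transactions, this sketch returns each earlier purchase once.

```python
from datetime import date
from typing import NamedTuple


class Purchase(NamedTuple):
    customer: str
    merchant: str
    status: str  # "Disputed" or "Undisputed"
    amount: float
    time: date


# Hypothetical toy data, not taken from the Neo4j sample dataset.
PURCHASES = [
    Purchase("Alice", "Store A", "Undisputed", 25.0, date(2020, 1, 3)),
    Purchase("Alice", "Store B", "Undisputed", 40.0, date(2020, 1, 5)),
    Purchase("Alice", "Store C", "Disputed", 900.0, date(2020, 1, 9)),
    Purchase("Alice", "Store D", "Undisputed", 15.0, date(2020, 1, 12)),
    Purchase("Bob", "Store A", "Undisputed", 12.5, date(2020, 1, 4)),
    Purchase("Bob", "Store E", "Disputed", 650.0, date(2020, 1, 8)),
    Purchase("Carol", "Store B", "Undisputed", 30.0, date(2020, 1, 6)),
]


def disputed(purchases):
    """Step 1: disputed purchases, newest first."""
    rows = [p for p in purchases if p.status == "Disputed"]
    return sorted(rows, key=lambda p: p.time, reverse=True)


def prior_undisputed(purchases):
    """Step 2: undisputed purchases a victim made before a disputed one."""
    # A purchase precedes at least one dispute exactly when it is earlier
    # than the victim's latest disputed purchase.
    latest = {}
    for p in purchases:
        if p.status == "Disputed":
            latest[p.customer] = max(p.time, latest.get(p.customer, p.time))
    rows = [
        p for p in purchases
        if p.status == "Undisputed"
        and p.customer in latest
        and p.time < latest[p.customer]
    ]
    return sorted(rows, key=lambda p: p.time, reverse=True)


for p in prior_undisputed(PURCHASES):
    print(p.customer, p.merchant, p.amount, p.time)
```

Purchases made after the dispute (Alice at Store D) and purchases by customers with no dispute (Carol) are filtered out, which matches the `t.time < r.time` condition in the query.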

Step 3: Identify the common denominator (common merchant in all the seemingly innocuous transactions)

MATCH (victim:Person)-[r:HAS_BOUGHT_AT]->(merchant)
WHERE r.status = "Disputed"
MATCH (victim)-[t:HAS_BOUGHT_AT]->(othermerchants)
WHERE t.status = "Undisputed" AND t.time < r.time
WITH victim, othermerchants, t ORDER BY t.time DESC
RETURN DISTINCT othermerchants.name AS `Suspicious Store`, count(DISTINCT t) AS Count, collect(DISTINCT victim.name) AS Victims
ORDER BY Count DESC
Figure 3. Identification of Common Denominator in all fraudulent transactions.
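The common-denominator step is essentially a group-and-count over the earlier legitimate purchases. The Python sketch below illustrates that aggregation under loud assumptions: the tuples are hypothetical toy data (the fourth field is simply a day number), and `suspicious_stores` is a name invented for this example rather than part of Neo4j or the original dataset.

```python
from collections import defaultdict

# Hypothetical toy data: (customer, merchant, status, day).
PURCHASES = [
    ("Alice", "Store A", "Undisputed", 3),
    ("Alice", "Store B", "Undisputed", 5),
    ("Alice", "Store C", "Disputed", 9),
    ("Bob", "Store A", "Undisputed", 4),
    ("Bob", "Store E", "Disputed", 8),
    ("Carol", "Store A", "Undisputed", 2),
    ("Carol", "Store B", "Undisputed", 6),
    ("Carol", "Store F", "Disputed", 7),
]


def suspicious_stores(purchases):
    """Rank merchants by undisputed purchases that preceded a victim's dispute."""
    # Latest disputed day per victim.
    latest_dispute = {}
    for customer, _, status, day in purchases:
        if status == "Disputed":
            latest_dispute[customer] = max(day, latest_dispute.get(customer, day))

    counts = defaultdict(int)    # merchant -> number of earlier purchases
    victims = defaultdict(set)   # merchant -> victims who shopped there
    for customer, merchant, status, day in purchases:
        # Customers without a dispute compare against -inf and are skipped.
        if status == "Undisputed" and day < latest_dispute.get(customer, float("-inf")):
            counts[merchant] += 1
            victims[merchant].add(customer)

    # Highest count first; ties broken alphabetically for a stable order.
    ranked = sorted(counts, key=lambda m: (-counts[m], m))
    return [(m, counts[m], sorted(victims[m])) for m in ranked]


for store, count, names in suspicious_stores(PURCHASES):
    print(store, count, names)
```

In this toy data, the store that every victim visited before their disputed transaction rises to the top of the ranking, which is the same signal the Cypher query surfaces.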

In each instance of a fraudulent transaction, the credit card holder had visited Walmart (Figure 3) in the days just prior to the fraudulent transaction. For example, Olivia’s first disputed transaction happened after a transaction at Walmart. We now also know the location and the date on which the customers’ credit card numbers were stolen. With a graph visualization solution, we could inspect the data to confirm our intuition. We can then alert the authorities and the merchant to the situation. They should have enough information to take it from there!

Conclusion:

Detecting fraud in real time, as quickly as possible, is extremely important to stop criminals from causing massive damage. As business processes become faster and more automated, the time margins for detecting fraud are becoming narrower, increasing the call for real-time solutions. Sophisticated criminals have learned to target systems with vulnerabilities. Traditional technologies, while still suitable and necessary for certain types of prevention, are not designed to detect elaborate fraud rings. This is where graph databases add value. Creating a Knowledge Graph with a semantic description of the information context gives users access to a machine-readable representation of the complex interdependencies that form a real-world model of the knowledge domain. The key to integrating knowledge efficiently among various systems and human users, in order to detect and prevent fraudulent activities, is to provide knowledge representation and reasoning in a machine-readable form.

References:

[1] ‘Global Fraud Detection and Prevention Market was valued at USD 28.9 billion in 2019 and is Expected to Reach USD 85.3 billion by 2025, Observing a CAGR of 17.8% during 2020–2025’, Vynz Research.

[2] ‘Detecting fraud with link analysis’, Cambridge Intelligence.

[3] ‘Knowledge Graphs for Financial Services’, Deloitte, 2020.

[4] G. Sadowski and P. Rathle, ‘Fraud Detection: Discovering Connections with Graph Database Technology’, Neo4j.

[5] Data Source: https://neo4j.com/graphgist/credit-card-fraud-detection

ISB Institute of Data Science

ISB Institute of Data Science (IIDS) brings together data science enthusiasts to drive research into AI and Data Science.