ISACA Hyderabad - Dec 14th 2014

Fraud Detection in Banking using Big Data
By Madhu Malapaka
madhu@wilshiresoft.com
For ISACA, Hyderabad Chapter
Date: 14th Dec 2014
Revised: 14th Dec 2014
Wilshire Software Technologies
Agenda
• Common Banking Frauds
• Fraud Fighting Activities
• Enterprise Fraud Systems Diagnostic Anatomy
• Big Data
• Hadoop Ecosystem
• Banks Data Source
• Social Network Data Providers
• Big Data Integration – Technology Stack
• Reporting Tools
Fraud
• A deception deliberately practiced in order to secure unfair or unlawful gain, or to cause loss to another party.
Common Banking Frauds
• A bank is typically exposed to different types of fraud.
Fraud Fighting Activities
• Fraud fighting activities can be grouped into three primary categories:
  ▪ Fraud Prevention - Proactive
  ▪ Fraud Detection - Reactive
  ▪ Fraud Investigation - Action
Enterprise Fraud Systems Diagnostic Anatomy
Source: www.executiveboard.com
[Diagram. Labels: ATMs, Online, Credit, Policy, Users, Data Collection, Data Analysis, Compliance, Fraud Detection, External Data Feeds, Data Logs, Legal Action, Business Process Change, Adopt New Technologies, Report Management]
[Diagram. Same labels as the preceding diagram, plus FraudMAP™ and Reputation Manager 360]
Monitoring Account Holder Behavior
• Monitoring is organized around different phases or aspects of the online banking process.
How Banks Can Leverage the Data Mining Capabilities of Big Data for Fraud Detection
BIG DATA
• Velocity
  ▪ Moves at very high rates (think sensor-driven systems).
  ▪ Valuable in its temporal, high-velocity state.
• Volume
  ▪ Fast-moving data creates massive historical archives.
  ▪ Valuable for mining patterns, trends and relationships.
• Variety
  ▪ Structured (logs, business transactions).
  ▪ Semi-structured and unstructured.
BIG DATA BY HADOOP
▪ Hadoop is a combination of:
  • HDFS: storage
  • MapReduce: computation
▪ Hadoop Distributed File System (HDFS)
  • Distributed file system for redundant storage.
  • Designed to reliably store data on commodity hardware.
▪ MapReduce
  • A programming model for distributed data processing.
  • Its data processing primitives are functions: Mappers and Reducers.
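As a rough illustration of the Mapper/Reducer model, the sketch below totals transaction amounts per account. It is plain Python running on one machine, not Hadoop itself. The record layout, the account IDs and the function names `mapper`, `reducer` and `run_job` are assumptions made for this example; on a real cluster the grouping step between the two phases is handled by the framework.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple

# Hypothetical input records: (account_id, amount). Not from the original slides.
Transaction = Tuple[str, float]


def mapper(record: Transaction) -> Iterator[Tuple[str, float]]:
    """Map phase: emit one (key, value) pair per input record."""
    account_id, amount = record
    yield account_id, amount


def reducer(key: str, values: Iterable[float]) -> Tuple[str, float]:
    """Reduce phase: combine all values that share a key."""
    return key, sum(values)


def run_job(records: Iterable[Transaction]) -> Dict[str, float]:
    """Single-process stand-in for map -> shuffle -> reduce."""
    grouped: Dict[str, List[float]] = defaultdict(list)
    for record in records:
        for key, value in mapper(record):
            grouped[key].append(value)  # shuffle: group values by key
    return dict(reducer(key, values) for key, values in grouped.items())


transactions = [("ACC-1", 120.0), ("ACC-2", 40.0), ("ACC-1", 980.0)]
totals = run_job(transactions)
```

Because each mapper call looks at a single record and each reducer call looks at a single key, both phases can be spread across many machines, which is the property the programming model relies on.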
Hadoop Ecosystem
▪ Pig
  • High-level data flow language.
  • Made of two components:
    ▪ Data processing language, Pig Latin (Pig scripts).
    ▪ Compiler to translate Pig Latin to MapReduce.
▪ Hive
  • Data warehousing layer on top of Hadoop.
  • Allows analysis and queries using a SQL-like language.
▪ Mahout
  • Scalable machine learning algorithms on top of Hadoop.
Hadoop Ecosystem
▪ Sqoop
  • A tool to automate data transfer between structured datastores and Hadoop.
▪ Flume
  • Distributed data/log collection service.
  • Collects data and logs from their sources and puts them in a centralized location for storage and processing.
Banks Data Source
Identify Data Sources
• Consider what data sources you’ll need to take advantage of.
  ▪ Existing data sources
    • This includes a wide variety of data, such as transactional data, survey data, web logs, etc.
  ▪ Purchased data sources
    • Does your organization use supplemental data, such as demographics?
    • If not, consider whether social media and news streams would complement your current data to create additional project value.
Social Network Data Providers
• Data from these providers serves as input for building the big data store and can be integrated with the bank’s customer data.
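One way to picture this integration is a join between provider records and the bank's customer records on a shared identifier. The sketch below is only an illustration: the field names (`customer_id`, `email`, `handle`, `city`), the sample records, the `integrate` function and the choice of email as the matching key are all assumptions, not something the slides specify.

```python
from typing import Dict, List

# Hypothetical bank customer records (field names are assumptions).
customers: List[Dict[str, str]] = [
    {"customer_id": "C001", "email": "a@example.com", "city": "Hyderabad"},
    {"customer_id": "C002", "email": "b@example.com", "city": "Chennai"},
]

# Hypothetical records from a social network data provider.
social_profiles: List[Dict[str, str]] = [
    {"email": "a@example.com", "handle": "@user_a", "city": "Hyderabad"},
    {"email": "b@example.com", "handle": "@user_b", "city": "Mumbai"},
    {"email": "z@example.com", "handle": "@user_z", "city": "Pune"},
]


def integrate(customers, social_profiles):
    """Left-join customers to social profiles on email (assumed key)."""
    by_email = {p["email"]: p for p in social_profiles}
    merged = []
    for customer in customers:
        profile = by_email.get(customer["email"])
        merged.append({
            **customer,
            "handle": profile["handle"] if profile else None,
            # Flag a disagreement between the two sources for later review.
            "city_mismatch": bool(profile) and profile["city"] != customer["city"],
        })
    return merged


merged = integrate(customers, social_profiles)
```

Every customer is kept even when no provider record matches, so the bank's own data stays the primary record and the external data only enriches it.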
Banks Internal and Purchased Data
• CRM/customer support
• POS/purchases
• Email/documents/collaboration
• BI & data warehouse
• System & network logs
• Web logs/clickstream
• Google Analytics/Omniture
• Facebook/Twitter/Yelp/Foursquare/Google
• Experian/Epsilon/Acxiom
• Mobile devices
• Sensors
• Product reviews
• Google search results
• + more
Result: many terabytes of data, sometimes many petabytes (BIG DATA).
Big Data Integration – Technology Stack
[Diagram. Labels: Analytics, Data Logs, RDBMS]
Reporting Tools
81% of global banks say Big Data is a top priority in 2015.
Are You Ready?
Thank You!
• Questions?
Wilshire Software Technologies, based in Hyderabad, India, is engaged in consulting and training for Big Data analytics.
Contact Information:
Madhu Malapaka
Managing Director
Wilshire Software Technologies
Hyderabad, India
Cell +91 800 820 4581
madhu@wilshiresoft.com
www.wilshiresoft.com