Global IDs Presentation

Graph Data Analytics
Resolving Complexity at an Enterprise Scale
Arka Mukherjee, Ph.D.
Global IDs
Arka.Mukherjee@globalids.com
www.globalids.com
Topics
1. The “Complex Data” Context
2. Current Challenges
3. Governance Methodology
Proprietary
© 2013 Global IDs
The “Complex Data” Context
The Big Shift
The cost structure is unsustainable
The cost of managing information is going up exponentially.
The complexity growth is unmanageable
Financial Services Institutions:
1. Complex data ecosystems
2. Highly dynamic
3. Limited traceability
4. Systemic risk: hard to measure
Question
How can enterprises handle the cost and complexity of managing complex data landscapes?
Global IDs Focus
To organize enterprise data landscapes
Global IDs: Product Suite
Global IDs Software Products

Governance Suites:
Metadata Governance Suite
Master Data Governance Suite
Enterprise Data Governance Suite
Big Data Governance Suite

Objectives:
1. Create Transparency
2. Improve Quality
3. Accelerate Integration
4. Embed Analytics

#    Function      Deliverable
1    Discover      Comprehensive Data Asset Inventory
2    Ingest        Inventory of External Data Assets
3    Profile       Semantic Metadata Repository
4    Classify      Business Taxonomies
5    Map           Business Ontologies
6    Search        Enterprise Search
7    Model         Master Data Models
8    Monitor       Change Monitors, Impact Analysis
9    Rules         Rules Repository
10   Validation    Data Quality Metrics
11   Stewardship   RACI Matrix of Data Stewards
12   Dashboards    Master Data Governance Portals
13   Move          Data Repositories in Relational Databases or Hadoop
14   Standardize   Enriched Master Data
15   Integrate     Integrated Master Data
16   Distribute    Data Services for Master Data
17   Analyze       Reporting and Ad-Hoc Analysis
18   Measure       KPIs and Trend Metrics
19   Link          Graph Databases with Linked Data
20   Visualize     Dashboards and Infographics

Under Development Using Hadoop Stack
© Global IDs Inc. (2001-2013)
Challenges
The typical Financial Institution’s
# Databases: > 1,000
# Tables: > 200,000
# Columns: > 2,000,000
Question
How can we understand the relationships across 2,000,000
attributes?
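To give a sense of scale, a brute-force answer would compare every attribute with every other one. The sketch below only does the arithmetic for the 2,000,000 figure on the slide; the comparison throughput is a hypothetical number chosen for illustration, not a measurement.

```python
import math

# Figure taken from the slide: more than 2,000,000 columns (attributes).
attributes = 2_000_000

# Number of unordered attribute pairs: n * (n - 1) / 2.
pairs = math.comb(attributes, 2)
print(f"{pairs:,}")  # 1,999,999,000,000

# Hypothetical throughput, assumed only for illustration.
comparisons_per_second = 1_000_000
days = pairs / comparisons_per_second / 86_400
print(f"{days:.1f} days")  # 23.1 days
```

Even at that assumed rate, one full pass takes weeks, which is one way to read the claim that pairwise approaches do not scale.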
Converging Data Variety
Data Content:
• Structured
• Multi-Structured
• Unstructured
Converging Data Ecosystems
Data Ecosystems:
• Social Data
• Machine Data
• Enterprise Data
Current Approaches do not Scale
# Databases
Small: > 1,000
Average: > 10,000
Large: > 100,000
A New Approach is Required
5 Utilize Graph Structures for Governance
Graph Analytics: Use Cases
Key Challenges
• Vast diversity and volume of metadata and data
• Storage and indexing of metadata to facilitate search and navigation
• Understanding the connection between different pieces of metadata (Crosswalk)
Utilize Graph Structures for Storing Complex Data
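As a minimal sketch of what storing metadata in a graph structure could look like, the example below keeps databases, tables, and columns as nodes and uses labeled edges for containment and crosswalk links. The class, node names, and edge labels are all hypothetical and are not taken from the Global IDs product.

```python
from collections import defaultdict


class MetadataGraph:
    """Tiny adjacency-list graph; nodes are strings, edges carry a label."""

    def __init__(self):
        self.edges = defaultdict(list)  # node -> [(label, neighbor), ...]

    def add_edge(self, source, label, target):
        self.edges[source].append((label, target))

    def neighbors(self, node, label=None):
        # Return targets reachable from node, optionally filtered by edge label.
        return [t for (l, t) in self.edges.get(node, []) if label is None or l == label]


graph = MetadataGraph()
# Hypothetical assets; the names are illustrative only.
graph.add_edge("db:crm", "contains", "table:crm.customer")
graph.add_edge("table:crm.customer", "contains", "column:crm.customer.email")
graph.add_edge("db:billing", "contains", "table:billing.account")
graph.add_edge("table:billing.account", "contains", "column:billing.account.contact_email")
# A crosswalk edge connects two pieces of metadata that describe the same thing.
graph.add_edge("column:crm.customer.email", "same_as", "column:billing.account.contact_email")

print(graph.neighbors("column:crm.customer.email", "same_as"))
```

The same structure can hold both the physical hierarchy and the crosswalk links, so navigation and relationship lookups use one traversal mechanism.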
Use Case 1: Enterprise Metadata Search with Hadoop
Use Case 2: Unstructured Data Integration
Use Case 3: Cross Database Similarity Mapping
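The slide does not say how similarity is computed. One simple possibility, shown here purely as an assumption, is Jaccard similarity over sampled column values: the share of distinct values two columns have in common. Column names, sample values, and the threshold are hypothetical.

```python
def jaccard(a, b):
    """Jaccard similarity of two value collections: |A & B| / |A | B|."""
    set_a, set_b = set(a), set(b)
    if not set_a and not set_b:
        return 0.0
    return len(set_a & set_b) / len(set_a | set_b)


def similar_pairs(columns, threshold=0.5):
    """Return (left, right, score) for every column pair at or above threshold."""
    names = sorted(columns)
    result = []
    for i, left in enumerate(names):
        for right in names[i + 1:]:
            score = jaccard(columns[left], columns[right])
            if score >= threshold:
                result.append((left, right, score))
    return result


# Hypothetical sampled values from columns in different databases.
columns = {
    "crm.customer.country": ["US", "DE", "FR", "IN"],
    "billing.account.country_code": ["US", "DE", "FR", "JP"],
    "hr.employee.grade": ["A", "B", "C"],
}

print(similar_pairs(columns))
# [('billing.account.country_code', 'crm.customer.country', 0.6)]
```

This compares every pair, so at enterprise scale it would need blocking or sampling in front of it; the sketch only illustrates the scoring step.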
Use Case 4: Graph Analytics
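The slide does not specify which analyses are run. As one common example of graph analytics, and only as an assumption about what might be useful here, the sketch below groups linked attributes into connected components, so that each component is a cluster of columns believed to be related. The link data is hypothetical.

```python
from collections import defaultdict, deque


def connected_components(links):
    """Group nodes into connected components using breadth-first search."""
    adjacency = defaultdict(set)
    for a, b in links:
        adjacency[a].add(b)
        adjacency[b].add(a)

    seen = set()
    components = []
    for start in sorted(adjacency):
        if start in seen:
            continue
        queue = deque([start])
        seen.add(start)
        component = []
        while queue:
            node = queue.popleft()
            component.append(node)
            for neighbor in adjacency[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    queue.append(neighbor)
        components.append(sorted(component))
    return components


# Hypothetical links between columns, e.g. produced by a similarity mapping step.
links = [
    ("crm.customer.email", "billing.account.contact_email"),
    ("billing.account.contact_email", "support.ticket.requester_email"),
    ("hr.employee.grade", "payroll.staff.grade"),
]

for cluster in connected_components(links):
    print(cluster)
```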
Demo
Methodology
What we do
1. Scan
2. Analyze
3. Map / Organize
4. Govern
Automation
1: Scan
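A scan step can be pictured as reading each database's catalog and recording every table and column. The sketch below does this for a single in-memory SQLite database; SQLite and the function name `scan_sqlite` are assumptions made for a self-contained example, not the tooling used in the presentation.

```python
import sqlite3


def scan_sqlite(connection, database_name):
    """Return (database, table, column, declared_type) rows for one SQLite database."""
    inventory = []
    tables = connection.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    for (table,) in tables:
        # PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk).
        for _cid, column, col_type, *_rest in connection.execute(
            f'PRAGMA table_info("{table}")'
        ):
            inventory.append((database_name, table, column, col_type))
    return inventory


# Hypothetical demo schema.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, email TEXT)")
conn.execute("CREATE TABLE account (id INTEGER, balance REAL)")

for row in scan_sqlite(conn, "demo"):
    print(row)
```

Repeating such a scan across many databases yields the kind of database, table, and column inventory that the later steps operate on.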
2: Semantic Analysis
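The slide gives no detail on how semantic analysis works. A very simple form of it, sketched here as an assumption, is to guess what a column holds by matching sampled values against known patterns. The pattern set, labels, and the 80% threshold are hypothetical.

```python
import re

# Hypothetical pattern catalog; a real one would be far larger.
PATTERNS = {
    "email": re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$"),
    "iso_date": re.compile(r"^\d{4}-\d{2}-\d{2}$"),
    "integer": re.compile(r"^-?\d+$"),
}


def classify(values, min_share=0.8):
    """Label a column by the first pattern matching at least min_share of its values."""
    values = [v for v in values if v]  # ignore empty strings and None
    if not values:
        return "unknown"
    for label, pattern in PATTERNS.items():
        matches = sum(1 for v in values if pattern.match(v))
        if matches / len(values) >= min_share:
            return label
    return "unknown"


print(classify(["a@x.com", "b@y.org"]))        # email
print(classify(["2013-01-05", "2013-02-10"]))  # iso_date
print(classify(["foo", "bar"]))                # unknown
```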
3: Automate Semantic Mapping
4: Link the Data Landscape
Thank You!