Data Virtualization and Information as a Service

Data Virtualization & Information As A Service (IaaS)
By Anil Allewar
Senior Solutions Architect - Synerzip
www.synerzip.com
About Me!!
• Anil Allewar
• Senior Solutions Architect @ Synerzip
• Technology Evangelist & speaker
• Core interests: JEE, EAI, EII
Confidential
Agenda
• Use cases
• What does it mean?
• Architecture explained
• Implementation Frameworks
• Demo
• Questions?
Why does it make sense?
Use Cases
[Diagram labels: Data Mart; Data Warehouse; ETL (three instances); Financial Data; OLTP Data; 3rd Party Data; Custom Program; Web Service 1; Web Service 2; Legacy Data; Excel files]
Traditional Data Integration
[Diagram labels: Business Applications; Enterprise Information System; ETL (two instances); Source System (two instances)]
Problems with ETL
• More than one copy of data for staging
• Intermediate data => errors
• Lead time to add a new source
• Domain knowledge needed for mapping
• Batch process => no real-time data
Problems with DBMS consolidation
• Alternate approach => a single EIS (say, an RDBMS)
• Extensive changes to existing apps
• Might not satisfy everyone’s requirements
Data Virtualization & Federation
• Single API to access data
• Only metadata stored at the virtualization layer
• Real-time access without copying/moving data
• Federate data across heterogeneous/homogeneous sources
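The idea of a metadata-only layer that federates sources at query time can be sketched in a few lines of Python. This is an illustration under stated assumptions, not how any particular product works: an in-memory SQLite database stands in for a relational source, a list of dicts stands in for a document store, and `VirtualLayer`, `read_customers` and the sample rows are all hypothetical names.

```python
import sqlite3

# Source 1: a relational store (in-memory SQLite stands in for an RDBMS).
rdbms = sqlite3.connect(":memory:")
rdbms.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT)")
rdbms.executemany("INSERT INTO customer VALUES (?, ?)", [(1, "Asha"), (2, "Ravi")])

# Source 2: a document-style store (a list of dicts stands in for a NoSQL collection).
orders = [
    {"customer_id": 1, "item": "laptop"},
    {"customer_id": 2, "item": "phone"},
]


class VirtualLayer:
    """Holds only metadata: a name for each source and how to read it on demand."""

    def __init__(self):
        self._sources = {}

    def register(self, name, reader):
        # Only the reader callable is stored; no rows are copied here.
        self._sources[name] = reader

    def read(self, name):
        return self._sources[name]()

    def customer_orders(self):
        # Federate at query time: join rows fetched live from both sources.
        names = {row["id"]: row["name"] for row in self.read("customer")}
        return [
            {"name": names[o["customer_id"]], "item": o["item"]}
            for o in self.read("orders")
            if o["customer_id"] in names
        ]


def read_customers():
    cur = rdbms.execute("SELECT id, name FROM customer")
    return [{"id": cid, "name": name} for cid, name in cur]


layer = VirtualLayer()
layer.register("customer", read_customers)
layer.register("orders", lambda: list(orders))

first = layer.customer_orders()

# A change in a source is visible on the next query, with no batch reload.
orders.append({"customer_id": 1, "item": "monitor"})
second = layer.customer_orders()
```

Because the layer stores only reader callables, the second call sees the new order without any staging copy, which is the contrast with batch ETL drawn earlier in the deck.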
Data Virtualization
Architecture
[Diagram labels: User Application; Common Access API; Virtual Database; Runtime & Query Engine; Translator 1 with Connector 1; Translator 2 with Connector 2]
Vendors
• Commercial Products
– Composite Software
• http://www.compositesw.com/data-virtualization/
– Denodo
• http://www.denodo.com/en/product/overview.php?n=h
– IBM
• http://www-03.ibm.com/software/products/en/ibminfofedeserv
– Informatica
• http://www.informatica.com/us/data-virtualization/
– Red Hat
• http://www.redhat.com/products/jbossenterprisemiddleware/data-virtualization/
• Open Source
– JBoss Teiid
• http://teiid.jboss.org/
Selected Platform – JBoss Teiid
• Open source
• JEE standards
• Number of relational/NoSQL/ERP/CRM data stores
• Active & responsive community
• Add custom EIS support using JEE components
• Synerzip contribution: defect discovery, root cause analysis, feature verification
Teiid Components
• Virtual Database
– Container for components used to integrate data from multiple data sources
• Source Models
– Structure and characteristics of physical data sources
• View Models
– Structure and characteristics of abstract structures you want to expose to your applications
• Teiid Designer
– Eclipse-based UI to dynamically discover data source objects and apply data federation
– Generate a virtual database from 1 or more sources
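The relationship between a virtual database, its source models and its view models can be pictured as plain metadata objects. The sketch below is a conceptual model only and does not reflect Teiid's actual VDB file format or classes; `SourceModel`, `ViewModel`, `VirtualDatabase`, `SalesVDB` and the column names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class SourceModel:
    """Describes the structure of one physical source table (metadata only)."""
    name: str
    columns: list


@dataclass
class ViewModel:
    """An abstract structure exposed to applications, mapped onto source columns."""
    name: str
    # view column -> (source model name, source column)
    mapping: dict


@dataclass
class VirtualDatabase:
    """Container for the source and view models that are deployed together."""
    name: str
    sources: dict = field(default_factory=dict)
    views: dict = field(default_factory=dict)

    def add_source(self, model):
        self.sources[model.name] = model

    def add_view(self, view):
        # Validate that every view column points at a known source column.
        for view_col, (src, col) in view.mapping.items():
            if src not in self.sources or col not in self.sources[src].columns:
                raise ValueError(f"{view.name}.{view_col} maps to unknown {src}.{col}")
        self.views[view.name] = view

    def lineage(self, view_name):
        """Return the set of source models a view depends on."""
        return {src for src, _ in self.views[view_name].mapping.values()}


vdb = VirtualDatabase("SalesVDB")
vdb.add_source(SourceModel("mysql_customer", ["id", "name"]))
vdb.add_source(SourceModel("mongo_orders", ["customer_id", "item"]))
vdb.add_view(ViewModel("CustomerOrders", {
    "customer_name": ("mysql_customer", "name"),
    "item": ("mongo_orders", "item"),
}))
```

Here the view `CustomerOrders` draws on two source models, so `vdb.lineage("CustomerOrders")` reports both; a view that points at a column missing from its source model is rejected when it is added.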
Teiid Components
• Translator
– Provides an abstraction layer between the Teiid Query Engine and the source system
– Converts Teiid SQL commands to source-specific execution commands
– Converts result data from the source system to the Teiid-specific format
• Resource Adapter
– Provides connectivity to the physical data source
– Integration provided through the Java Connector Architecture (JCA) API
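The split between translator and resource adapter can be illustrated with a toy example. This is not the Teiid translator or JCA API (both are Java); it is a hedged Python sketch in which every class and function name is hypothetical. A translator turns one neutral query into a source-specific command and normalizes the results, while a connector only knows how to execute commands against its source.

```python
import sqlite3


class SqlTranslator:
    """Turns a neutral query into SQL and turns result tuples into dict rows."""

    def to_source(self, table, columns, where):
        # Identifiers come from trusted model metadata; values are bound as parameters.
        sql = f"SELECT {', '.join(columns)} FROM {table}"
        if where:
            sql += " WHERE " + " AND ".join(f"{col} = ?" for col in where)
        return sql, list(where.values())

    def to_rows(self, columns, result):
        return [dict(zip(columns, record)) for record in result]


class DocumentTranslator:
    """Turns the same neutral query into a filter document for a document store."""

    def to_source(self, table, columns, where):
        return {"collection": table, "filter": dict(where)}

    def to_rows(self, columns, result):
        return [{col: doc.get(col) for col in columns} for doc in result]


class SqliteConnector:
    """Connectivity only: executes a (sql, params) command on a SQLite connection."""

    def __init__(self, connection):
        self.connection = connection

    def execute(self, command):
        sql, params = command
        return self.connection.execute(sql, params).fetchall()


class InMemoryDocumentConnector:
    """Connectivity only: filters dicts held in memory, standing in for a NoSQL store."""

    def __init__(self, collections):
        self.collections = collections

    def execute(self, command):
        docs = self.collections[command["collection"]]
        wanted = command["filter"].items()
        return [d for d in docs if all(d.get(k) == v for k, v in wanted)]


def run(translator, connector, table, columns, where=None):
    """What a query engine does per source: translate, execute, normalize."""
    command = translator.to_source(table, columns, where or {})
    return translator.to_rows(columns, connector.execute(command))


db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE customer (id INTEGER, name TEXT)")
db.executemany("INSERT INTO customer VALUES (?, ?)", [(1, "Asha"), (2, "Ravi")])
docs = {"orders": [{"customer_id": 1, "item": "laptop"},
                   {"customer_id": 2, "item": "phone"}]}

sql_rows = run(SqlTranslator(), SqliteConnector(db), "customer", ["id", "name"], {"id": 1})
doc_rows = run(DocumentTranslator(), InMemoryDocumentConnector(docs),
               "orders", ["item"], {"customer_id": 1})
```

Both calls return rows in the same shape (a list of dicts), even though one source was queried with SQL and the other with a filter document.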
Teiid – Supported EIS
• Amazon SimpleDB
• Apache Accumulo
• Apache SOLR
• Cassandra
• File
• Google Spreadsheet
• JPA
• LDAP
• Excel – as file
• SalesForce
• JDBC
– MS Access, DB2, Derby, excelodbc, Greenplum, H2, Hive (for accessing Hadoop), Oracle, Teradata and most RDBMS
• MongoDB
• Object
• OData
• OLAP
• Web Services
• SAP Netweaver Gateway
Performance Characteristics
• Access same data using Oracle and Teiid drivers
[Chart: No. of rows vs. time (ms), no Blobs; series: Oracle-JDBC, Teiid-JDBC]
– Retrieval times comparable when accessing tables having no Blobs
Performance Characteristics
[Chart: No. of rows vs. time (ms), Blobs; series: Oracle-JDBC, Teiid-JDBC; row-count axis values: 0, 0, 2, 42, 21,804, 32,531, 185,454]
– Teiid slower when accessing Blob data
• Can be tuned
Demo
[Diagram labels: JDBC Client; JDBC API; Federated VDB; Teiid Runtime & Query Engine; MySQL Translator; RDBMS Resource Adapter; MySQL; MongoDB Translator; MongoDB Resource Adapter; MongoDB]
Demo-Steps
• Pre-requisites
– MySQL server 5.5+ installed
– MongoDB 2.4.x+ installed
• Steps
– Load the MySQL and MongoDB databases with sample data
– Set up the environment – JBoss, Eclipse
– Create a Teiid project in Eclipse using Teiid Designer
• Import the source model using JDBC
• Create the virtual model and federate data from the source model
• Create a virtual database (VDB) and deploy it to JBoss
– Access data using a JDBC client or through a browser using OData
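For the final step, a JDBC client needs a connection URL that names the deployed VDB. The helper below is a small sketch that builds such a URL, assuming the Teiid URL form `jdbc:teiid:<vdb>@mm://<host>:<port>` (with `mms` for SSL) and the default JDBC port 31000; the function name and the VDB name `FederatedVDB` are hypothetical, and the host and port should be checked against the actual server configuration.

```python
def teiid_jdbc_url(vdb_name, host="localhost", port=31000, secure=False):
    """Build a Teiid JDBC URL of the form jdbc:teiid:<vdb>@mm[s]://<host>:<port>."""
    if not vdb_name or any(ch in vdb_name for ch in "@;/ "):
        raise ValueError("VDB name must be non-empty and free of URL delimiters")
    scheme = "mms" if secure else "mm"  # "mms" is assumed to select the SSL transport
    return f"jdbc:teiid:{vdb_name}@{scheme}://{host}:{port}"


url = teiid_jdbc_url("FederatedVDB")  # hypothetical VDB name
```

The resulting string would be passed to a Java client together with the Teiid JDBC driver; once connected, the federated views are queried like tables in any other database.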
Demo – Scenario
Federated Data
Demo – Connection Profile
Demo – Source Model
Demo – Source Model Generation
Demo – Map Source To View
Demo – Association
Demo – Data Federation
Demo – Source Code
• Source code
– https://github.com/Synerzip/JBossTeiid
– Contains
• Configuration files
• Instructions
• “How-to” videos
• VDBs, source models and view models
Conclusion
• Data virtualization and federation is a rapidly emerging approach that addresses the traditional BI/ETL problems of duplicated staging data, batch-only refresh and long lead times for new sources.
• It shortens time to market, delivers data across the enterprise as a service and provides real-time access to enterprise data.
Contact Me
• anil.allewar@synerzip.com
Questions?
Hemant Elhence
hemant@synerzip.com
469.322.0349
Synerzip in a Nutshell
1. Software product development partner for small/mid-sized technology companies
• Exclusive focus on small/mid-sized technology companies, typically venture-backed companies in growth phase
• By definition, all Synerzip work is the IP of its respective clients
• Deep experience in full SDLC – design, dev, QA/testing, deployment
2. Dedicated team of high caliber software professionals for each client
• Seamlessly extends client’s local team, offering full transparency
• Stable teams with very low turn-over
• NOT just “staff augmentation”, but provide full mgmt support
3. Actually reduces risk of development/delivery
• Experienced team - uses appropriate level of engineering discipline
• Practices Agile development – responsive, yet disciplined
4. Reduces cost – dual-shore team, 50% cost advantage
5. Offers long term flexibility – allows (facilitates) taking offshore team captive – aka “BOT” option
Our Clients
Thanks!
Call Us for a Free Consultation!
Hemant Elhence
hemant@synerzip.com
469.322.0349