red hat jboss data virtualization

RED HAT JBOSS
DATA VIRTUALIZATION
Bill Kemp
Sr. Solutions Architect
August 28, 2014
Red Hat is…
“By running tests and executing numerous examples for specific teams, we were able to prove […] not
only would the solution work, but it will perform better & at a fraction of the costs.”
MICHAEL BLAKE, Director, Systems & Architecture
RED HAT JBOSS MIDDLEWARE
Innovate faster, in a smarter way
A family of lightweight, enterprise-grade products
that are ideal for open hybrid cloud environments.
Red Hat JBoss Middleware
ACCELERATE, INTEGRATE, AUTOMATE
Business Process Management
• JBoss BRMS
• JBoss BPM Suite
Application Integration
• JBoss A-MQ
• JBoss Fuse
• JBoss Fuse Service Works
Data Integration
• JBoss Data Virtualization
Foundation
• JBoss EAP
• JBoss Web Server
• JBoss Data Grid
Management Tools
• JBoss Operations Network
Development Tools
• JBoss Developer Studio
User Interaction
• JBoss Portal
Agenda
● Business Problem
● Product Overview
● Customer Stories
● Competition
● Prospecting Guidance
● Pricing & Promotions
Business Challenges
Data Driven Economy
Data is becoming the new raw
material of business: an economic
input almost on a par with capital and
labor. “Every day I wake up and ask,
‘how can I flow data better, manage
data better, analyze data better?’”
CIO - Wal-Mart
Data Challenges Getting Bigger
Big Data, Cloud, and Mobile
Existing Data Integration approaches are not sufficient
● Extracting and moving data adds latency and cost
● Every project solves data access and integration in a different way
● Solutions are tightly coupled to data sources
● Poor flexibility and agility
[Diagram: data consumers (BI reports, operational reports, enterprise applications, SOA applications, mobile applications) are subject to constant change; data sources (Hadoop, NoSQL, cloud apps, data warehouse & databases, mainframe, XML, CSV & Excel files, enterprise apps) are siloed and complex. Between them: integration complexity. How to align?]
Business Objective
Turn Data into Actionable Information
Only 28% of users have any meaningful data access
Over 70% of BI project effort lies in the integration of source data
• Reduce costs for finding and accessing highly fragmented data
• Improve time to market for new products and services by simplifying data access and integration
• Deliver the IT solution agility necessary to capitalize on constantly changing market conditions
• Transform fragmented data into actionable information that delivers competitive advantage
Technology Overview
What does Data Virtualization software do?
Turn Fragmented Data into Actionable Information
Data Virtualization software virtually
unifies data spread across various
disparate sources and makes it
available to applications as a single
consolidated data source.
DATA CONSUMERS
BI Reports
The data virtualization software implements a three-step process to bridge data sources and data consumers:
● Connect: Fast access to data from diverse data sources.
● Compose: Easily create unified virtual data models and views by combining and transforming data from multiple sources.
● Consume: Expose consistent information to data consumers in the right form through standard data access methods.
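The Connect, Compose, Consume steps above can be sketched in a few lines. This is only an illustration, not how JBoss Data Virtualization itself is used: two in-memory SQLite databases stand in for independent sources, and the names crm, billing, customers, invoices, and customer_totals are hypothetical.

```python
import sqlite3

# Connect: two independent "sources", modelled here as separate
# in-memory SQLite databases attached to one connection (an assumption
# made for the sketch; real sources would be different systems).
conn = sqlite3.connect(":memory:")
conn.execute("ATTACH DATABASE ':memory:' AS crm")
conn.execute("ATTACH DATABASE ':memory:' AS billing")
conn.execute("CREATE TABLE crm.customers (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE billing.invoices (customer_id INTEGER, amount REAL)")
conn.executemany("INSERT INTO crm.customers VALUES (?, ?)",
                 [(1, "Acme"), (2, "Globex")])
conn.executemany("INSERT INTO billing.invoices VALUES (?, ?)",
                 [(1, 100.0), (1, 50.0), (2, 75.0)])

# Compose: a view that combines both sources. No data is copied;
# the join runs when the view is queried.
conn.execute("""
    CREATE TEMP VIEW customer_totals AS
    SELECT c.name AS name, SUM(i.amount) AS total
    FROM crm.customers AS c
    JOIN billing.invoices AS i ON i.customer_id = c.id
    GROUP BY c.id, c.name
""")


def customer_totals():
    """Consume: callers query the view like a single consolidated source."""
    return conn.execute(
        "SELECT name, total FROM customer_totals ORDER BY name"
    ).fetchall()
```

A consumer only sees customer_totals; either underlying table could move or change shape and only the view definition would need updating.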
[Diagram: data consumers (including SOA applications) get easy, real-time information access to a virtual consolidated data source. The data virtualization software (Consume, Compose, Connect) virtualizes, abstracts, and federates siloed and complex data sources: Oracle DW, SAP, Salesforce.com, and XML, CSV & Excel files.]
Turn Siloed Data into Actionable Information
Data Consumers: Mobile Applications; ESB, ETL; BI Reports & Analytics; SOA Applications & Portals
Easy, Real-time Information Access
JBoss Data Virtualization
• Consume: Standard based Data Provisioning (JDBC, ODBC, SOAP, REST, OData)
• Compose: Unified Virtual Database / Common Data Model (Virtualize, Transform, Federate)
• Connect: Native Data Connectivity
• Supporting capabilities: Design Tools, Dashboard, Optimization, Data Transformations, Caching, Security, Metadata
Data Sources (Siloed & Complex): Hadoop, NoSQL, Cloud Apps, Data Warehouse & Databases, Mainframe, XML, CSV & Excel Files, Enterprise Apps
Consider...
Inconsistent,
Incomplete
Information
Uninformed,
Delayed Decisions
Costly Business Risk
and Exposure
How would your organization change…
● If data were readily reusable in place rather than requiring significant effort to build new intermediary data tiers?
● If data could be repurposed quickly into new applications and business processes?
● If all applications and business processes could get all of the information needed in the form needed, where needed and when needed?
JBoss Data Virtualization – Use Cases
Self-Service
Business
Intelligence
The virtual, reusable data model provides a business-friendly representation of data. Users
can interact with their data without knowing the complexities of the database or where the
data is stored, and multiple BI tools can acquire data from a centralized data layer. Gain
better insights from Big Data by using JBoss Data Virtualization to integrate it with
existing information sources.
360°
Unified
View
Deliver a complete view of master & transactional data in real-time. The virtual data layer
serves as a unified, enterprise-wide view of business information that improves users’ ability
to understand and leverage enterprise data.
Agile SOA
Data
Services
A data virtualization layer delivers the data services layer missing from SOA applications.
JBoss Data Virtualization increases agility and loose coupling: virtual data stores avoid the
need to touch underlying sources, data services encapsulate the data access logic, and
multiple business services can acquire data from a centralized data layer.
Regulatory
Compliance
A data virtualization layer delivers data firewall functionality. JBoss Data Virtualization
improves data quality via centralized access control, a robust security infrastructure, and a
reduction in physical copies of data, thus reducing risk. Furthermore, the metadata
repository catalogs enterprise data locations and the relationships between the data in
various data stores, enabling transparency and visibility.
Enable Self-Service Business Intelligence
Shared, Reusable Logic = Lighter, Faster Client Development
[Diagram, BI tool centric (non-sharable and duplicated): each BI tool (Microsoft, Cognos) carries its own presentation logic, KPI calculations, semantic data model, data security policy, data transformation logic, data integration logic, and data access logic on top of the sources (databases, ERP app, data warehouse, cloud app).]
[Diagram, with JBoss Data Virtualization (shared and reusable): the BI tools (Microsoft, Cognos) keep only their presentation logic; KPI calculations, the semantic data model, the data security policy, and the data transformation, integration, and access logic are shared in JBoss Data Virtualization over the same sources.]
360° Unified View
Complete View of Master and Transactional Data in Real-time
[Diagram: BI reports, CRM apps, portal, and workflow consume shared and reusable views (Unified Customer View, Unified Product View, Unified xBusiness View, …) from JBoss Data Virtualization, which draws on operational data sources (enterprise apps, databases), a master data management hub, and a data repository.]
Agile SOA Data Services
Shared, Reusable Logic = Lighter, Faster Service Development
[Diagram, without data virtualization (non-sharable and duplicated): each web service carries its own business logic, semantic data model, data security policy, data transformation logic, data integration logic, and data access logic on top of the sources (databases, ERP app, data warehouse, cloud app).]
[Diagram, with JBoss Data Virtualization (shared and reusable): the web services keep only their business logic; the semantic data model, the data security policy, and the data transformation, integration, and access logic are shared in JBoss Data Virtualization over the same sources.]
JBoss Data Virtualization
Key Business Values
Increase ROA
• Improved utilization of data assets
• Derive more value from existing investments
• Complements existing systems
Boost Agility
• Better/faster than hand coding
• Faster, less costly than batch data movement
• Data virtualization provides loose coupling
Improve Productivity
• Right data at the right time to the right people
• Decision support, BI with a complete view of information
Better Information Control
• Powerful security, Auditing, Data Firewall
• Avoid data silo proliferation
• Central data access and policy, Compliance
JBoss Data Virtualization
Key Differentiators
Lowest TCO
Openness
Cloud Ready
Comprehensive
Performance
• Cost leadership lowers the adoption barrier
• Core-based subscriptions provide flexibility across small to large deployments
• Open, community-based innovation
• No vendor lock-in
• Private, public and hybrid cloud deployments
• Integrated with the JBoss Middleware portfolio for end-to-end business solutions
• Single-vendor support simplifies IT operations
• Fast query processing optimizations, low footprint
• Comprehensive data provisioning options
• Quick data visualization through business dashboard
Customer Success
Self-Service BI and
Hybrid data
integration use case
Global Biotech Company
Self-Service Data for Self-Service Business Intelligence
● Situation/Needs
– Needed to integrate cloud application data (salesforce.com) with on-premise, real-time data (role mgmt, territory mgmt and authentication systems) for operational reporting and monitoring
– Need to ensure HIPAA compliance
– Need to support multiple BI tools
● Solution
– Used Data Virtualization to provide a unified interface to data for multiple BI tools
– Virtual views isolate BI applications from changes in the source data systems
– Single point of data access ensured security policy enforcement and HIPAA compliance
● Benefits
– Enabled business users to use the BI tools of their choice while IT ensured better control of information
– Rapid development cycle through the use of common data models
– Sensitive data is protected to meet strict compliance requirements
[Diagram labels: Portal, Spotfire, Crystal Reports, Business Objects; Web Service, JDBC, JNDI; JBoss Data Virtualization (Consume, Compose, Connect); Cloud CRM, Navigator Security, Role Membership, LDAP Server]
Unified 360° view use case
Regional Bank
Single View of Loans Processing
● Situation / Needs:
– Thousands of loans in process
– Management seeks visibility and control, while loan operations needs to speed up funding steps
– Loan data spread across many databases/systems
● Solution:
– Consolidate all data into “virtual data mart”
– Transformation of data differences
– Provide real-time data access to management portal and loan workflow system
● Benefits:
– Management gets timely information on funding needs, exposure and operating metrics
– Loan officers receive all the information to process the loan faster
– Sensitive data is protected
[Diagram labels: Loan Processing Workflow Mgmt., Management Reporting; Web Services; JBoss Data Virtualization (Consume, Compose, Connect); Loan Origination & Approvals, Risk Analysis, Loan Funding]
Data firewall use case
Large US Bank
VISA Data Security & Governance
● Situation / Needs:
– VISA PCI mandates protection of cardholder info
– Can’t maintain common security policy across multiple data stores
● Solution:
– Create “data firewall” across many data sources
– Federate rather than replicate
– Common access policy across all sources
– Common data definitions
– Audit trail
● Benefits:
– One set of data security policies
– Can prove to regulators that data is protected
[Diagram labels: Web Portal; JBoss Data Virtualization (Consume, Compose, Connect); Data Sources]
Agile SOA Data
Services use case
Multinational Insurance Company
SOA Data Services Layer
● Situation/Needs:
– Deploying SOA reference architecture
– Want common data model across all sources
– Don’t want “tightly bound” physical data sources
– Change data sources without breaking apps/services
● Solution:
– All data is accessed via data services
– Data Virtualization provides abstraction and logical data model for enterprise
– Expose data as Web services and SQL
● Benefits:
– All applications will “get” the same data through use of common model
– Easier to expose data to new applications
– Easier to make changes to data sources
[Diagram labels: SOA Applications; SOA/ESB; JBoss Data Virtualization (Consume, Compose, Connect); Data Sources]
Big Data integration
use case
Gain Better Insight from Big Data
Intelligent Inventory Management
● Objective:
– Right merchandise, at right time and price
● Problem:
– Cannot utilize social data and sentiment analysis with their inventory and purchase management system
● Solution:
– Leverage JBoss Data Virtualization to mash up sentiment analysis data with inventory and purchasing system data. Leverage BRMS to optimize pricing and stocking decisions.
[Diagram labels: Analytical Apps; JBoss BRMS (Data Driven Decision Management); JBoss Data Virtualization (Consume, Compose, Connect); Hive (Sentiment Analysis); Purchase Mgmt Application; Inventory Databases]
Better Together - Big Data and Data Virtualization
Big Data is not another Silo - Customers Combine Multiple Technologies
● Combine structured and unstructured analysis
– Augment data warehouse with additional external sources, such as social media
● Combine high velocity and historical analysis
– Analyze and react to data in motion; adjust models with deep historical analysis
● Reuse structured data for analysis
– Experimentation and ad-hoc analysis with structured data
Better Together - Big Data and Data Virtualization
Capture, Process and Integrate Data Volume, Velocity, Variety
[Diagram. Integrate & Analyze: BI Analytics (historical, operational, predictive) and SOA Composite Applications over Data Integration (JBoss Data Virtualization). Capture & Process: In-memory Cache (JBoss Data Grid) and Messaging and Event Processing (JBoss A-MQ and JBoss BRMS). Data: Structured Data, Streaming Data, Semi-Structured Data, Hadoop. Platform: Red Hat Storage, Red Hat Enterprise Linux & Virtualization.]
Product Details
JBoss Data Virtualization:
Supported Data Sources
Enterprise RDBMS:
• Oracle
• IBM DB2
• Microsoft SQL Server
• Sybase ASE
• MySQL
• PostgreSQL
• Ingres
Enterprise EDW:
• Teradata
• Netezza
• Greenplum
Hadoop:
• Apache
• HortonWorks
• Cloudera
• More coming…
Office Productivity:
• Microsoft Excel
• Microsoft Access
• Google Spreadsheets
Specialty Data Sources:
• ModeShape Repository
• Mondrian
• MetaMatrix
• LDAP
NoSQL:
• JBoss Data Grid
• MongoDB
• More coming…
Enterprise & Cloud
Applications:
• Salesforce.com
• SAP
Technology Connectors:
• Flat Files, XML Files,
XML over HTTP
• SOAP Web Services
• REST Web Services
• OData Services
Key New Features and Capabilities
● Data connectivity enhancements
– Hadoop Integration (Hive – Big Data)
– NoSQL (MongoDB – Tech Preview) and JBoss Data Grid
– OData support (SAP integration)
● Developer Productivity improvements
– New VDB Designer 8 and integration with JBoss Developer Studio v7
– Enhanced column level security
– VDB import/reuse, and native queries
● Simplified deployment and packaging
– Requires JBoss EAP only; included with subscription
– Removed dependency on SOA Platform
● Business Dashboard
– New rapid data reporting/visualization capability
Business Dashboard
Quickly Visualize your Data
OData Support
● OData (OASIS Open Data Protocol)
● https://www.oasis-open.org/committees/tc_home.php?wg_abbrev=odata
● Objective: OASIS OData TC works to simplify the querying and sharing of data
across disparate applications and multiple stakeholders for re-use in the
enterprise, Cloud, and mobile devices. A REST-based protocol, OData builds on
HTTP, AtomPub, and JSON using URIs to address and access data feed resources.
It enables information to be accessed from a variety of sources including (but not
limited to) relational databases, file systems, content management systems, and
traditional Web sites.
● Data Services v6 supports OData in two ways:
● Connect to and access OData sources
● Act as an OData server to client applications
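Because OData addresses data through URIs, a client request is just a URL carrying system query options such as $select, $filter, and $top. The helper below is a minimal sketch of building such a URL; the function name, the service root http://localhost:8080/odata/sales, and the entity set customer_totals are hypothetical and do not describe the product's actual endpoint layout.

```python
from urllib.parse import quote, urlencode


def odata_query_url(service_root, entity_set, select=None, filter_expr=None, top=None):
    """Build an OData request URL from a few standard system query options.

    service_root and entity_set are placeholders chosen by the caller;
    nothing here contacts a server.
    """
    options = {}
    if select:
        options["$select"] = ",".join(select)
    if filter_expr:
        options["$filter"] = filter_expr
    if top is not None:
        options["$top"] = str(top)

    url = f"{service_root.rstrip('/')}/{quote(entity_set)}"
    if options:
        # Keep "$" and "," readable; percent-encode spaces and other characters.
        url += "?" + urlencode(options, safe="$,", quote_via=quote)
    return url
```

For example, odata_query_url("http://localhost:8080/odata/sales", "customer_totals", select=["name", "total"], filter_expr="total gt 100", top=5) yields a URL that any HTTP client could fetch.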
Data Virtualization Designer
Model Driven Development
Eclipse-based graphical
tool for
• modeling,
• analyzing,
• integrating,
• resolving semantic
differences and
• testing
multiple data sources to
produce
• Relational,
• XML and
• Web Service Views
that expose your business
data without any
programming.
• Shows structural
transformations and
dependencies
• Defines
transformations
Metadata Repository & Governance
S-RAMP: SOA Repository Artifact Model & Protocol
● OASIS specification that defines:
– a common data model for repositories
– an interaction protocol to facilitate the use of common tooling and sharing data.
● S-RAMP repository capabilities:
– Store and retrieve content and metadata
– Classification of artifacts (e.g. XSD, WSDL, VDB, ...)
● Clients interact via ATOM/REST
● XPath2 based query language
● Integration with Maven
[Diagram labels: ATOM Binding (REST); Core Model; Documents; Derived Models (Read Only); JCR Storage (ModeShape + Infinispan)]
Semantic Mediation & Integration
Consumers: Business Intelligence Applications, Search Applications, Web Services, XML Document (<a> <b> … </b> </a>)
Semantic Data Services: Claims, Billing, Policies, …
Application views of information:
• Relational, XML
Data Dictionary:
• Based on logical data model or XML schema
• Support for multiple COIs
• Support for multiple versions
Authoritative Sources:
• Mapped to logical view
Multiple Internal/External Information Sources
[Diagram: transformations (T) relate the fields Location_ID, bldg_type, bldg_id, Location_Type, Depot_Number, SITENUM, and Facility_ID across the layers.]
JBoss Data Virtualization
Logical Architecture
Data Consumers
Data Sources
JBoss Data Virtualization
System Flow
Tooling, VirtualDB, Engine, Server
Users create data models
based on metadata:
• Imported from data
sources
• Supplied via DDL
• Provided by Engine
• Specified by user
Models are packaged in a
Virtual Database (VDB)
Virtual Databases (VDBs) are deployment archives similar to .WAR.
VDBs contain
• Source metadata and models
• View metadata and models
• System metadata
• Connection information, which is bound to sources at deployment time
VDBs are deployed to the query engine
[Diagram labels, VDB internals: Source Models, View Models, Connector Binding Properties, Manifest Info]
The Query Engine is the core data virtualization functionality: a federating relational query engine with a rule- and cost-based optimizer, advanced query planner, caching, and hint processing.
The Query Engine hosts VDBs, binds to data sources, and performs query execution and results processing.
[Diagram labels: Data Consumer Apps (C1, C2); JDBC API; Query Engine hosting a VDB; Connector Binding (1) to an Oracle DB; Connector Binding (2) to a SQL Server DB]
The server runtime environment is JBoss EAP.
The Teiid query engine is hosted in JBoss EAP and uses key container-provided services:
• Transaction manager
• JAAS security framework
• Container-managed data sources
• EAP management infrastructure
• EAP deployment
The server exposes views/services to consumers and manages connections and connection pools for data sources.
[Diagram labels: Applications; JDBC Socket Transport, ODBC Socket Transport, Admin Socket Transport (Admin / AdminShell); JBoss EAP hosting the JDV Runtime Engine (BufferMgr, Threading, Local Caches, etc.), VDBs, and Translators; JCA data sources (xxx-ds.xml, yyy-ds.xml, zzz-ds.xml, Embedded DS); Security (JAAS); Transaction Manager; RHQ Profile Service]
Rich Security Capabilities
Multiple forms of Authentication:
– Client Authentication: LoginModules (File, LDAP); Kerberos (JDBC/ODBC);
HTTP Basic, WS UsernameToken Profile (Web Services)
• PassThrough Authentication
– Source Authentication: Source credentials, Caller Identity (same credentials
as client), RoleBasedCredentialMap (credentials per role), Execution
payload/Custom
Authorization:
– Create, Read, Update, Delete, Execute permissions
– Row-based security
– Column masking
Additional Security:
– Transport encryption (SSL: Anon, 1-way, 2-way)
– Password encryption
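The row-based security and column masking items listed under Authorization can be illustrated with a small sketch. It is not the product's policy syntax: apply_policy, mask_card, the role names, and the sample card data are all hypothetical, chosen only to show the two mechanisms side by side.

```python
def apply_policy(rows, roles, row_filter, column_masks):
    """Return the rows a caller may see, with restricted columns masked.

    row_filter(row, roles) -> bool decides row visibility (row-based security).
    column_masks maps a column name to (role_that_sees_clear_text, mask_function).
    """
    result = []
    for row in rows:
        if not row_filter(row, roles):
            continue  # row-based security: hide the whole row
        masked = dict(row)
        for column, (clear_role, mask) in column_masks.items():
            if clear_role not in roles:
                masked[column] = mask(row[column])  # column masking
        result.append(masked)
    return result


def mask_card(number):
    """Keep only the last four digits of a card number."""
    return "*" * (len(number) - 4) + number[-4:]


def same_region(row, roles):
    """Hypothetical rule: regional roles see their own region, auditors see all."""
    return row["region"].lower() in roles or "auditor" in roles


CARDS = [
    {"region": "EU", "holder": "Ann", "card": "4111111111111111"},
    {"region": "US", "holder": "Bob", "card": "5500000000000004"},
]
MASKS = {"card": ("auditor", mask_card)}
```

Calling apply_policy(CARDS, {"eu"}, same_region, MASKS) returns only the EU row with the card number masked, while a caller holding the auditor role sees both rows in clear text. Defining the rule once in the virtual layer is what lets every consumer inherit it.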
Transactions Support
● All scopes are handled by JBoss Transactions JTA
● Three scopes
– Global (through XAResource)
– Local (autocommit = false)
– Command (autocommit = true)
● Command scope behavior is handled through txnAutoWrap={ON|OFF|DETECT}
● Isolation level is set on a per connector basis.
Customization & Extensibility
Many forms of customization available:
– Extended connectors/translators
– New connectors/translators
– User-defined functions
– Custom logging
– Administrative API
– XML-based virtual database, DDL support
– Custom metadata injection
– Embeddable engine
Performance Optimization
Load Handling
● Memory Usage – the BufferManager acts as a memory manager for batches (with passivation) to ensure that memory will not be exhausted.
● Non-blocking source queries – rather than waiting for source query results, processor threads detach from the plan and pick up a plan that has work.
● Time slicing – plans produce batches for a time slice before re-queuing and allowing their thread to do other work (preemptive control only between batches).
● Caching – ResultSets, processing plans, internal materialized views, etc.
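The time-slicing behavior described above can be sketched with plain iterators. This is a simplified, single-threaded illustration and not the engine's scheduler: run_time_sliced, the plan names, and the injected clock are hypothetical. Each plan yields batches, and the scheduler only checks the clock between batches, mirroring "preemptive control only between batches".

```python
import time
from collections import deque


def run_time_sliced(plans, slice_seconds, clock=time.monotonic):
    """Round-robin over plans, letting each produce batches for one time slice.

    plans maps a plan name to an iterator of batches. A plan that still has
    work when its slice ends is re-queued behind the others.
    Returns the (plan name, batch) pairs in the order they were produced.
    """
    queue = deque(plans.items())
    produced = []
    while queue:
        name, batches = queue.popleft()
        deadline = clock() + slice_seconds
        finished = False
        while True:
            try:
                batch = next(batches)
            except StopIteration:
                finished = True
                break
            produced.append((name, batch))
            if clock() >= deadline:  # checked only between batches
                break
        if not finished:
            queue.append((name, batches))
    return produced


# Deterministic demo: a fake clock that advances one unit per reading.
_ticks = iter(range(1000))
demo_order = run_time_sliced(
    {"A": iter(["a1", "a2", "a3"]), "B": iter(["b1"])},
    slice_seconds=2,
    clock=lambda: next(_ticks),
)
```

In the demo, plan A is interrupted after two batches so that plan B can run before A finishes, which is the point of time slicing: a long plan cannot starve a short one.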
Performance Optimization
Caching & Materialized View
Multiple levels of caching to meet performance requirements and manage load on source systems
• Materialized Views
– External or internal materialized views
– Ability to override use of materialized views
• Result set Caching
– Applied to results returned from user queries and virtual procedure calls
– Configurable time to live and max. number of entries
• Code Table Caching
– Suited for integrating reference data with transaction/operational data, e.g. country code, state code, etc.
• Caching hints to set time-to-live, memory preference, and updatability
[Diagram labels: In-coming Query; JBoss Data Virtualization Server; Result set Cache (Cached? Yes/No; Save? Yes/No); Results; Virtual Database; Virtual Table with Materialization Support; Materialized Table; Source Tables in Oracle, SQL Server, and Files (XML, Text etc.)]
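The result set caching described above, with a configurable time to live and maximum number of entries, can be sketched as follows. This is an illustrative stand-in rather than the server's cache: the class name ResultSetCache, its parameters, and the least-recently-used eviction choice are assumptions made for the example.

```python
import time
from collections import OrderedDict


class ResultSetCache:
    """Cache query results keyed by SQL text and parameters.

    Entries expire after ttl_seconds; when max_entries is exceeded the
    least recently used entry is dropped (an assumed eviction policy).
    """

    def __init__(self, max_entries, ttl_seconds, clock=time.monotonic):
        self._max = max_entries
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = OrderedDict()  # key -> (stored_at, rows)

    def __len__(self):
        return len(self._entries)

    def get_or_load(self, sql, params, loader):
        """Return cached rows if still fresh, else call loader(sql, params)."""
        key = (sql, tuple(params))
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now - entry[0] < self._ttl:
            self._entries.move_to_end(key)  # mark as recently used
            return entry[1]
        rows = loader(sql, params)  # cache miss or expired: hit the source
        self._entries[key] = (now, rows)
        self._entries.move_to_end(key)
        while len(self._entries) > self._max:
            self._entries.popitem(last=False)  # evict least recently used
        return rows
```

The loader is only invoked on a miss or after expiry, which is how a cache of this kind reduces load on source systems for repeated queries.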
Performance Optimization
Query
● Access Patterns – criteria requirements on pushdown queries
● Pushdown – decompose user query into source queries
– Projection minimization to remove unused select items
– Decompose aggregates over joins/unions
– Generating SQL matching Teiid system functions
● Dependent Joins (can use hints) – feed equi-join values from one side of the join to the other
● Partition aware aggregation and joins
● Optional Join (can use hints) – removes an unused join child
● Multi-source models – allows for multiple homogeneous schemas to be used through the same model.
● Copy Criteria – uses criteria transitivity to minimize join tuples.
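The dependent join in the list above can be sketched with two separate SQLite connections standing in for two sources. This is a simplified illustration of the idea, not the optimizer's implementation: the table layouts, sample rows, and the dependent_join function are hypothetical. The join keys found on the small side are pushed to the other source as an IN list, so only matching rows travel back.

```python
import sqlite3


def make_source(ddl, insert_sql, rows):
    """Create an independent in-memory database acting as one data source."""
    db = sqlite3.connect(":memory:")
    db.execute(ddl)
    db.executemany(insert_sql, rows)
    return db


departments = make_source(
    "CREATE TABLE departments (dept_id INTEGER, dept_name TEXT)",
    "INSERT INTO departments VALUES (?, ?)",
    [(1, "Engineering"), (2, "Sales")],
)
employees = make_source(
    "CREATE TABLE employees (lastname TEXT, dept_id INTEGER)",
    "INSERT INTO employees VALUES (?, ?)",
    [("Smith", 1), ("Jones", 2), ("Ito", 1)],
)


def dependent_join(dept_name):
    """Join employees to departments across sources using a dependent join."""
    # Independent side: evaluate the selective criteria first.
    dept_rows = departments.execute(
        "SELECT dept_id, dept_name FROM departments WHERE dept_name = ?",
        (dept_name,),
    ).fetchall()
    keys = sorted({dept_id for dept_id, _ in dept_rows})
    if not keys:
        return []  # nothing to feed to the other side

    # Dependent side: feed the equi-join values into the source query.
    placeholders = ",".join("?" for _ in keys)
    emp_rows = employees.execute(
        f"SELECT lastname, dept_id FROM employees "
        f"WHERE dept_id IN ({placeholders}) ORDER BY lastname",
        keys,
    ).fetchall()

    # Final join happens in the engine, over the already reduced row sets.
    names = dict(dept_rows)
    return [(lastname, names[dept_id]) for lastname, dept_id in emp_rows]
```

Without the IN list, every employee row would have to be fetched and filtered in the engine; with it, the second source does the filtering.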
Performance Optimization
Query Planning
● Distinct phases: parse, resolve, validation, rewrite, optimization, process plan creation.
● Rewrite canonicalizes and simplifies.
● The optimization phase follows with rules/hints/costing
– Non-federated optimization is similar to mature RDBMS
● Optimizer plan structure is a flexible tree - distinct from the command form and processing plans.
● Planning is typically quick and deterministic – prepared plans are recommended
Thank You
Q&A
Additional Position Slides
Integration Technologies
When to use What?
[Chart: Responsiveness (Batch to Real Time) against Integration Style (Data to Process), positioning Service Oriented (ESB), Data Virtualization, and Extract, Transform, Load (ETL).]
Data Virtualization
Complements SOA-Centric Integration (ESB)
Our key message is that SOA-centric approaches to implementing data integration/synchronization require large amounts of ‘service/workflow development’ and result in solutions with lots of moving parts, which can benefit from a model-based data virtualization technology that requires no data integration coding.
SOA-Centric Integration | Data Virtualization
Multi-step process or workflow development using graphical tooling | Real-time transactional access to data across multiple heterogeneous data sources for operational data needs
Data is treated as a special type of ‘step’ that typically contains a SQL statement to execute against a source | Specialized, graphical tooling for easy mapping between different models of data
Resulting approach is static and cannot be queried | On demand, query-able access and update of real-time up-to-date data
Relational or XML data only | Any data source
Data Virtualization
Complements Extract, Transform, Load (ETL)
Our key message is that most operational data consumption problems cannot be solved with a data warehouse but instead require specific tooling and technology focusing on model-based data consumption, integration, and exchange.
ETL | Data Virtualization
Bulk / batch data operations for data consolidation, reporting and analysis | Real-time bi-directional access to data across multiple heterogeneous data sources for operational and analytical data needs
Involves periodically moving / copying / consolidating large amounts of data | No moving or copying of data required – finer grained operational data sets
No on-demand access to real-time data | On demand access and update of real-time up-to-date data
Limited data sources only (relational, structured files) | Any data source
Additional Position Slides
Top 10 Ways
Data Virtualization enables
Agile business intelligence development
#1 Data Flattening- Simplified Tables
#2 Tools Agnostic Common Data Model
[Diagram: data consumers (Jaspersoft, Cognos, Business Object, Microsoft) share a reusable, common, semantic data model in a JBoss Data Virtualization virtual DB over the data sources.]
#3 Centralized Data Transformation
[Diagram: data consumers (Report 1 to Report 4) all receive the value as 1234567890. JBoss Data Virtualization provides format consistency over data sources that store it as 123-4567890, (123)-4567890, 123/456/7890, 123,456,7890, and [123]-4567890.]
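The format-consistency idea on this slide amounts to defining one transformation centrally instead of once per report. A minimal sketch, assuming the canonical form is the bare digit string shown to the reports; the function name normalize_phone is hypothetical:

```python
import re


def normalize_phone(raw):
    """Reduce any of the source formats to the digits-only canonical form."""
    return re.sub(r"\D", "", raw)


SOURCE_FORMATS = [
    "123-4567890",
    "(123)-4567890",
    "123/456/7890",
    "123,456,7890",
    "[123]-4567890",
]
canonical = {normalize_phone(value) for value in SOURCE_FORMATS}
```

All five source spellings collapse to the same value, so every consumer sees one format regardless of which source supplied the row.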
#4 Centralized Business KPIs & Metrics
Calculations
[Diagram: data consumers (BI App 1 to BI App 4) use KPIs such as Net Profit, Operating Margin, and Net Sales that are calculated once in JBoss Data Virtualization over the data sources.]
#5 Centralize Data Integration
[Diagram: data consumers (BI App 1 to BI App 4) use virtual master data (Virtual Customer Master, Virtual Product Master) integrated in JBoss Data Virtualization over the data sources.]
#6 Ubiquitous Data Consumption
[Diagram: data consumers (BI App 1 to BI App 4) access JBoss Data Virtualization through standard based provisioning (JDBC, ODBC, SOAP, REST, XML, JMS, POJO, Hibernate) over the data sources.]
#7 Optimized Data Access
● Federating relational query engine.
● Rule and cost based optimizer, advanced query planner
● Multi-level caching
● Pushdown Queries
#8 No Data Latency
Virtual Table
SELECT e.title, e.lastname
FROM Employees AS e
JOIN Departments AS d ON e.dept_id = d.dept_id
WHERE year(e.birthday) >= 1970
  AND d.dept_name = 'Engineering'
Data Source(s)
#9 Minimize Need for Data Replication and
Duplication
Activities required to setup a physical vs. virtual data mart
Physical data mart:
• Define data structure
• Define ETL logic
• Prepare HW server
• Install and configure RDBMS
• Create database
• Physical DB design and tuning
• Load tables and set up batch updates
• Requires DBA and developer to maintain and manage
VS.
Virtual data mart:
• Design data structure
• Define mappings
• Define virtual tables
• Enable caching (if needed)
#10 Centralize Security
● Data Sanitization
● Column level masking
● Access and audit control
● Centralize compliance policies
Appendix