InfoSphere CDC - Real Time data Integration UC

CDC Transformation and Delivery
Data at the speed of business
© 2008 IBM Corporation
What is CDC
▪ Change Data Capture
• Capture data events in source database and move only the changes to the target
▪ Many different ways of doing CDC
• Timestamps
• Triggers
• API
• Log-based
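As an illustration of the simplest approach in the list above (timestamps), here is a minimal sketch. It is not how InfoSphere CDC works (the product is log-based); the `customers` table, the `updated_at` column and the `capture_changes` function are assumptions made up for this example.

```python
import sqlite3

def capture_changes(conn, last_seen):
    """Return rows modified after `last_seen`, plus the new high-water mark."""
    rows = conn.execute(
        "SELECT id, name, updated_at FROM customers "
        "WHERE updated_at > ? ORDER BY updated_at",
        (last_seen,),
    ).fetchall()
    new_mark = rows[-1][2] if rows else last_seen
    return rows, new_mark

def demo():
    # Hypothetical source table with a last-modified column.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, updated_at INTEGER)"
    )
    conn.executemany(
        "INSERT INTO customers VALUES (?, ?, ?)",
        [(1, "Ann", 100), (2, "Bob", 105)],
    )
    first, mark = capture_changes(conn, 0)        # initial pull sees both rows
    conn.execute("UPDATE customers SET name = 'Robert', updated_at = 110 WHERE id = 2")
    second, mark = capture_changes(conn, mark)    # next pull sees only the changed row
    return first, second, mark
```

Note that a timestamp poll like this cannot see deleted rows and adds query load on the source, which is one reason log-based capture is often preferred.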
What fuels the IBM CDC Roadmap?
▪ The widest breadth of functionality:
• Batch/pull and real-time push processing
• Guaranteed delivery/transactional integrity
• Multiple topologies (peer-to-peer, 1-to-many, many-to-1, unidirectional, bidirectional)
• Homogeneous & heterogeneous data synchronization
▪ Broadest range of sources and targets
• Log-based capture agents for DB2 (on all platforms), Oracle, SQL Server, Sybase, IMS, VSAM, IDMS, ADABAS
• Native/parallel applies for all RDBMS and JMS
• Multiple data delivery protocols (TCP/IP, JMS)
▪ Industry-leading performance and scalability
• End-to-end throughput and low latency
• Parallel apply to target system
• Low impact on source database systems
What fuels the IBM CDC Roadmap? (continued)
▪ 3000+ customers using the existing CDC products for:
• HA/DR (DB backup, fault tolerance)
• Real-time reporting/off-load querying
• Application co-existence (migrations, upgrades, modernization)
• eCommerce (web apps, portals, data distribution)
• Dynamic Data Warehousing, Master Data Management
▪ 700+ people in engineering focused on Information Integration, including 170+ focused on CDC technologies
▪ The most comprehensive suite of data integration products
• BoB transform / cleanse / discovery, metadata management, scalable performance, services enabled for SOA architectures
• 5000+ customers using Information Server components
The IBM Solution: IBM Information Server
Delivering information you can trust
IBM Information Server
Unified Deployment
• Discover, model, and govern information structure and content
• Standardize, merge, and correct information
• Combine and restructure information for new uses
• Capture, virtualize and move information for inline delivery
Unified Metadata Management
InfoSphere CDC Solution
▪ Provides real-time change data capture and delivery for
• Dynamic warehousing and real-time reporting
• Synchronization and replication
• Event detection
▪ Delivers real-time changed data to Information Server, applications and targets or message queues, without impacting performance of production systems
▪ Minimal impact on production systems
▪ High scalability and end-to-end performance
▪ Guaranteed data integrity
▪ Proven heterogeneous support
[Diagram labels: Developers, Architects, DataMirror]
Key Value Proposition
LATENCY
1. Near zero latency for pervasive integration projects.
2. ETL can also deliver low latency, but at what impact to production systems and mission-critical applications?
IMPACT
1. Reduces risk to operational systems.
2. Non-intrusive to applications and databases.
3. Uses native DB logs, with documented overhead of 2-5%.
4. No use of disk-based staging or triggers.
5. Management easily integrated into existing IT operations.
6. Helps reduce/manage operational windows.
CONSISTENT DATA DELIVERY
1. Data pushed from source, delivered in continuous stream, continuous with business operations.
2. Transaction consistency maintained to preserve units of work, referential integrity.
3. Full transaction granularity, before and after image of all transactional changes.
4. Data event aware, can be used to trigger specific business processes.
5. Fault tolerance, recover to last committed transaction.
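Points 2 and 5 above (preserving units of work, recovering to the last committed transaction) can be sketched as follows. This is only an illustration under assumed names: the `(txn_id, op, payload)` record shape and the `committed_units` function are hypothetical and do not reflect the product's internal log format.

```python
from collections import defaultdict

def committed_units(log_records):
    """Group log records by transaction and yield them only on COMMIT.

    Each record is a (txn_id, op, payload) tuple; op is one of
    "INSERT", "UPDATE", "DELETE", "COMMIT", "ROLLBACK".
    """
    pending = defaultdict(list)
    for txn_id, op, payload in log_records:
        if op == "COMMIT":
            # Deliver the whole unit of work at once, in original order.
            yield txn_id, pending.pop(txn_id, [])
        elif op == "ROLLBACK":
            pending.pop(txn_id, None)
        else:
            pending[txn_id].append((op, payload))
    # Anything still pending was never committed and is not delivered.

log = [
    (1, "INSERT", {"id": 10}),
    (2, "UPDATE", {"id": 7}),
    (1, "UPDATE", {"id": 10}),
    (2, "ROLLBACK", None),
    (1, "COMMIT", None),
    (3, "DELETE", {"id": 4}),  # no COMMIT seen: stays undelivered
]
units = list(committed_units(log))
```

Because the target only ever receives complete, committed units, a restart can resume from the last delivered commit without exposing partial transactions.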
Architecture
[Diagram labels: Java-based GUI for admin & monitoring; Publisher; Subscriber; Source Engine and Metadata; Target Engine and Metadata; Journal Log; Redo/Archive Logs; TCP/IP; Database; ODS; Audit; JMS; Business Process; Flat files; Direct to existing ETL]
Databases
Oracle, DB2, DB2 UDB, SQL Server, Sybase, Teradata, Netezza, PointBase
IMS, VSAM, IDMS, Adabas, DataCom - Classic
Platforms
z/OS, System i5, Red Hat and SUSE Linux, AIX, HP/UX (PA-RISC and Itanium), Solaris SPARC, Tru64 UNIX, Windows
Messaging Middleware
MQSeries, Sun Open Message Queue (JMS), TIBCO, BEA AquaLogic, Oracle Fusion Middleware
Use Cases
Customer examples
1. Building A Low Latency ODS for Operational Reporting and Auditing
“Solution deployed to improve visibility into lines of business for organizations with
Operational BI and Data Auditing requirements”
[Diagram: Production Servers (ERP, Finance; Manufacturing), each with a native OLTP DB log, feeding an Operational Data Store (ODS)]
Each OLTP insert, update and delete operation can be stored as an insert, update and delete to maintain a synchronized copy of data.
All OLTP insert, update and delete operations can be stored as inserts to maintain a complete transaction history.
Add relevant information such as timestamp, transaction type, source system ID, and ID of the user who changed the transaction.
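The two apply styles described above (synchronized copy versus full transaction history with added audit columns) can be contrasted in a short sketch. The change-record fields (`op`, `key`, `row`, `ts`, `user`) and both function names are assumptions chosen for this example, not product APIs.

```python
def apply_mirror(target, change):
    """Replay the change as-is so `target` stays a synchronized copy."""
    op, key, row = change["op"], change["key"], change.get("row")
    if op == "DELETE":
        target.pop(key, None)
    else:  # INSERT or UPDATE
        target[key] = row

def apply_audit(history, change, source_system):
    """Store every change as a new row, with audit columns added."""
    history.append({
        "key": change["key"],
        "row": change.get("row"),
        "txn_type": change["op"],
        "changed_at": change["ts"],
        "changed_by": change["user"],
        "source_system": source_system,
    })

changes = [
    {"op": "INSERT", "key": 1, "row": {"qty": 5}, "ts": 100, "user": "ann"},
    {"op": "UPDATE", "key": 1, "row": {"qty": 8}, "ts": 101, "user": "bob"},
    {"op": "DELETE", "key": 1, "ts": 102, "user": "ann"},
]
mirror, history = {}, []
for c in changes:
    apply_mirror(mirror, c)
    apply_audit(history, c, source_system="ERP")
```

After the three changes the mirror is empty again (the row was deleted), while the audit history keeps all three operations with who made them and when.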
2. Complementing An Existing ETL Technology
“Solution deployed to improve visibility into lines of business (i.e. Dynamic Warehousing)
and help manage impact concerns caused by ETL on mission critical systems”
[Diagram: Production Server (Point Of Sale, Retail) with native OLTP DB log; continuous feed to a Stage on the ETL Server; scheduled batch ETL into the Data Warehouse (EDW)]
Stage can be:
1. Relational Table
2. Flat File
3. Message Queue
4. Direct to ETL
Complementary ETL Technologies:
1. Informatica “Power Center”
2. Business Objects “Data Integrator”
3. Ab Initio
4. IBM “DataStage” (has native integration)
3. Continuous Feed Of A Business Intelligence Appliance
“Solution deployed to improve visibility into lines of business by combining the
cost/performance benefits of a BI Appliance with real-time data feeds”.
[Diagram: Production Server (ERP, Manufacturing) with native OLTP DB log; continuous “CDC” feed to a flat-file Stage on the Staging Server; appliance Load API into the Appliance Nodes/Cluster]
Flat file containing transaction changes is viewed as an external file to the appliance.
Load threshold is based on number of transactions or time interval.
Once the threshold is reached, call the appliance “load API” to bulk load transactions into the appliance.
Supported Appliances
1. Teradata
2. Netezza
3. GreenPlum
4. Paraccel
5. IBM Balanced Warehouse
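The threshold-based load described on this slide (flush on a transaction count or a time interval, then call the appliance load API) can be sketched as below. `ThresholdLoader` and its `bulk_load` callback are hypothetical stand-ins; no real appliance API is used, and timestamps are passed in explicitly to keep the example deterministic.

```python
class ThresholdLoader:
    """Buffer changes and flush when a count or time threshold is reached."""

    def __init__(self, bulk_load, max_rows=1000, max_seconds=60.0):
        self.bulk_load = bulk_load      # hypothetical stand-in for a "load API"
        self.max_rows = max_rows
        self.max_seconds = max_seconds
        self.buffer = []
        self.window_start = None

    def add(self, change, now):
        if not self.buffer:
            self.window_start = now     # a new batch window starts here
        self.buffer.append(change)
        if (len(self.buffer) >= self.max_rows
                or now - self.window_start >= self.max_seconds):
            self.flush()

    def flush(self):
        if self.buffer:
            self.bulk_load(list(self.buffer))
            self.buffer.clear()

batches = []
loader = ThresholdLoader(batches.append, max_rows=3, max_seconds=10.0)
for t, change in [(0, "a"), (1, "b"), (2, "c"), (3, "d"), (20, "e")]:
    loader.add(change, now=t)
loader.flush()  # final flush for anything still buffered
```

In this sketch the time threshold is only checked when a change arrives; a real deployment would presumably also need a timer so that a quiet source does not leave changes waiting in the buffer.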
4. Data Event Synchronization via an Enterprise Service Bus
“Solution deployed to provide real time data feeds for SOA and application integration business requirements”.
[Diagram: Production Servers (Billing, CRM; Telco) with native OLTP DB logs; continuous “CDC” feeds to Queue 1 on the ESB; ETL. Legend: CDC/Replication Process, Other Technology, CDC/Replication License]
A license would reside on the server that hosts the message-oriented middleware.
Complementary ESB Technologies:
1. IBM “MQ Series”
2. TIBCO “Business Works”
3. BEA “Aqualogic”
4. WebMethods “Fabric”
5. e-Commerce Application Synchronization
“Solution deployed to provide continuous customer, sales and inventory visibility in web-based e-commerce applications”.
[Diagram: Website Orders; Production Server (Inventory, Corporate); Point Of Sale (Retail, Downtown Store); each with a native OLTP DB log]
Provides continuous bi-directional synchronization between web-based applications and mission-critical business applications.
Helps organizations improve the customer online shopping experience with improved visibility into inventory and customer shopping activities.
6. Data Synchronization for Upgrades, Migrations and Workload Balancing
“Solution deployed to help IT support application, database and platform migrations”.
[Diagram: Production Server (ERP, Manufacturing) and Testing Server (ERP, Manufacturing), each with a native OLTP DB log; links labeled Upgrades, Migrations and Workload Balancing]
Keep data synchronized between the current production server and a server deployed to test a new application upgrade/version, or a hardware/OS upgrade.
Workload balancing capability (i.e. master-to-master support) allows database instances to remain synchronized where dual or double data entry is a requirement (i.e. data entry occurring on both systems at the same time).
7. Offloading Production Query & Reporting Cycles
“Solution deployed to allow organizations to offload the impact of query and reporting to a non-mission-critical system”.
[Diagram: Production Server(s) (Finance 1, Finance 2, Finance 3; Services) with native OLTP DB logs; “Table Copy” to a Reporting Server used for Report and Query]
Reporting server can also be used for consolidation requirements, i.e. consolidating financials from multiple branches into a single corporate instance.
Replication frequency generally varies from continuous (near real-time) to periodic. Table-level refresh or copy can be used in addition to log-based change data capture.
8. Data Backup And Availability
“Solution deployed to allow organizations to back up copies of critical data for recovery where a full disaster solution is not a requirement”.
[Diagram: Production Server (Finance) with native OLTP DB log; continuous “CDC” feed to a backup instance on the Availability Server, or to a Backup in another partition (Partition 1, Partition 2)]
Availability of data only; does not support DDL replication.
Exact image replication to produce a backup copy on a separate server or in a different partition on the same server.
A separate license is not required for each partition used on the production server.
Thank You