IBM and Information Management Today

advertisement
IBM Data Strategy
DB2 101
Agenda







The Fillmore Group, IBM Business Partner
What’s New in 2013?
Data Governance
Information Integration
Data Warehouse and Big Data
Mainframe Update
Competing with Oracle
The Fillmore Group, Inc.






Frank Fillmore, DB2 Gold Consultant, IBM Champion
Kim May, IBM Champion
Dozens of experienced, certified consultants
Information Management Consulting Services
IBM Authorized Training Partner
IBM Information Management Software Reseller
Services


DB2 – implementation, migration, and tuning
BigData





BigInsights (IBM’s Hadoop distribution)
InfoSphere Streams
Data Governance tools
Information Integration
Data Warehousing
IBM Authorized Training

Exclusive NA provider of IBM BigData training






DW611
DW641
DW652
DW723
DW731
“IBM InfoSphere BigInsights Foundation”
“BigInsights Analytics for Business Analysts”
“BigInsights Analytics for Programmers”
“Programming for InfoSphere Streams V3 with SPL”
“Administration of InfoSphere Streams V3”
DB2, InfoSphere, Information Server, Optim
IBM 2013 News

PureData System for Hadoop




8x faster deployment than custom-built solutions
First appliance with built-in analytics accelerator
Only Hadoop system with built-in archiving tools
DB2 for LUW v10.5 with BLU Acceleration



8-25x faster reporting and analytics
10x storage space savings seen during beta test
No indexes, aggregates, tuning, or SQL / schema changes
BLU Acceleration

Dynamic In-Memory



Actionable Compression


Patented compression technique that preserves order so that the data can
be used without decompressing
Parallel Vector Processing



In-memory columnar processing
Dynamic movement of data from storage
Multi-core and SIMD parallelism
(Single Instruction Multiple Data)
Data Skipping

Skips unnecessary processing of irrelevant data

Data Servers - DB2 AESE bundle

Data Governance

Information Integration

Data Warehouse
2013 IBM Portfolio
What’s in the Information
Management portfolio?
DB2
Single Instance (scaleup)
Scalable
Basic
Read/write
Rapid deployment
Storage efficient
Multi-tenancy
SSD-aware?
DB2 pureScale
DB2 Data
Partitioning
Facility
Clustered Database
(scale-out)
Partitioned Database
(scale-out)
Massive scale
Non-partitionable
Transactional
Read/write
Highly Available
High concurrency
Fast response time
Storage efficient
Multi-tenancy
Massive scale
Partitionable
Analytic
Read-mostly
Mixed query types
High concurrency
High throughput
Storage efficient
Multi-tenancy
“shared disk”
“shared nothing”
PureData System
for Transactions
PureData System
For
Operational Analytics
DB2 with
BLU Acceleration
Columnar Compression
Database
Memory scale
Ultra-high speed
Analytic
Read-mostly
Column store
Column compressed
Very storage efficient
DB2 pureScale
 Unlimited Capacity
– Buy only what you need, add capacity as your
needs grow
 Application Transparency
– Avoid the risk and cost of application changes
 Continuous Availability
– Deliver uninterrupted access to your data with
consistent performance
Learning from the undisputed Gold Standard... System z
pureScale - Scalability and High Availability

Efficient Centralized Locking and Caching
 As the cluster grows, DB2 maintains one place to go for locking information and
shared pages
 Optimized for very high speed access
 DB2 pureScale uses Remote Direct Memory Access (RDMA) to communicate with
the powerHA pureScale server
 No IP socket calls, no interrupts, no context switching

Results
 Near Linear Scalability to large
numbers of servers
 Constant awareness of what each
member is doing
 If one member fails, no need to block
I/O from other members
 Recovery runs at memory speeds
Member 1
Member 2
Member 3
CF
Group Buffer
Pool
Group Lock
Manager
PowerHA
pureScale
Delivering trusted information for smarter business decisions
– across the entire information supply chain
INTEGRATE
MANAGE
InfoSphere
Information Server
DB2, Informix
FileNet
ANALYZE
solidDB
PureData
System
InfoSphere BigInsights
InfoSphere
MDM
Cognos,
InfoSphere
InfoSphere Streams
Data Explorer
External
Information
Sources
Business
Analytic
Applications
GOVERN
Quality
InfoSphere
Information
Server
Lifecycle
InfoSphere
Optim
Security & Privacy
InfoSphere
Guardium
Data Governance
DB2 101
IBM Optim solutions
Managing data throughout its lifecycle in heterogeneous environments
Discover
Speed understanding and project time through relationship
discovery within and across data sources
Understand sensitive data to protect and secure it

Retire

Test Data Management
Discover
Understand
Classify
Production
Training
•Subset
•Mask


Easily refresh & maintain right sized non-production
environments, while reducing storage costs
Improve application quality and deploy new functionality
more quickly
Data Masking
•Compare Development
•Refresh
Test
Protect sensitive information from misuse & fraud
Prevent data breaches and associated fines


Data Growth Management
Reduce hardware, storage & maintenance costs
Streamline application upgrades and improve application
performance


Application Retirement
Archive


Retire legacy & redundant applications while retaining data
Ensure application-independent access to archive data
InfoSphere Guardium
OPM Slide Needed
InfoSphere Guardium
Guardium Slide Needed
Optim Query Capture and Replay
Capture production workloads and
replay them in testing environments
Record and replay SQL
Application
Requirements
•
Minimize unexpected
production problems
•
Shorten testing cycles
•
Develop more realistic
database testing scenarios
Source Database
Record
InfoSphere
Optim Query
Capture and
Replay
Play
Benefits
Test
Database
•
Identify database problems
sooner with validation reports
and performance tuning
•
Use actual production
workloads for testing rather
than fabricated scenarios
•
Extend quality testing efforts
to include the data layer
Optim Performance Manager
OPM Slide Needed
OPM – Locking Dashboard
Add-ons:
 Query Tuner
 Extended Insight
Information Integration
DB2 101
IBM InfoSphere Information Server
MANAGE
INTEGRATE
ANALYZE
DB2, Informix
InfoSphere
Information Server
InfoSphere BigInsights
InfoSphere InfoSphere
Warehouse
MDM
InfoSphere
Cognos,
Streams
InfoSphere
Warehouse
FileNet
solidDB
External
Information
Sources
Business
Analytic
Applications
GOVERN
Quality
InfoSphere
Information
Server
Lifecycle
InfoSphere
Optim
Security & Privacy
InfoSphere
Guardium
IBM’s Information Integration
DB2,
Informix
FileNet
solidDB
InfoSphere InfoSphere
Optim
Guardium
Organizations Use Integration for...
DataStage, QualityStage
…data loading, cleansing,
migrating, movement, high
availability and high
performance
Foundation Tools
…deep analysis, design,
metadata, collaboration
and data governance
CDC, Federation, Replication
…data delivery for low impact,
timely access to critical business
data and operational data
requirements
InfoSphere Foundation Tools
Foundation tools help profile, model, define,
map and govern information that is spread
across your enterprise, so your business can
deliver the right information, to the right
people, at the right time




InfoSphere Information Analyzer
InfoSphere Information Architect
InfoSphere FastTrack
Business Glossary
 Foundation tools are
designed to be deployed in
a heterogeneous IT
environment to leverage
existing IT investments.
 They work with any IBM
or non-IBM data source,
business intelligence tool
or operating system -- or
in conjunction with the
Tools’ own comprehensive
set of integration
products.
InfoSphere QualityStage
Requirements
Standardize, cleanse and
deduplicate data, ensuring a complete,
accurate view of information
QualityStage
 Resolution of data quality
issues
 Standardization of data
formats
 Cleanse data
 Manage duplicate data
 Enable ongoing quality
Benefits
 Removes duplicates
• Redundancies
• Lack of standards
• Unlinked records
• Incorrect data
Complete &
accurate
view of
information
 Cross-references matching
records
 Survives a single, complete
record
 Validate and enriches data
InfoSphere DataStage
Requirements
Integrate, transform and deliver
data on demand across multiple
sources and targets including
databases and enterprise
applications
DataStage
 Integrate and transform
multiple, complex, and
disparate sources of
information
 Demand for data is diverse –
DW, MDM, Analytics,
Applications, and real time
Benefits
 Transform and aggregate any
volume of information
 Deliver data in batch or real
time through visually designed
logic
 Hundreds of built-in
transformation functions
 Metadata-driven productivity,
enabling collaboration
IBM - InfoSphere Data Replication (IIDR)
Real-time change data capture and
delivery to deliver critical information
at the speed of business
IBM replication technologies – bundled!

SQL Replication



Q Replication




Easy to set up
Staging tables
High volume, low latency
Native Oracle and DB2 sources and targets
WebSphere MQ transport layer
Change Data Capture (CDC)


Broadest set of heterogeneous sources and targets
TCP/IP transport layer
InfoSphere Change Data Capture
Real-time change data capture and
delivery to deliver critical information
at the speed of business
Change Data Capture
Requirements
 Rapid data delivery for mainly
heterogeneous environments
 Minimize CPU utilization on
source systems
 Increased visibility into lines of
business
Benefits
• DB2
on LUW, i, z/OS
• Informix
• Oracle
• Sybase ASE
• SQL Server
• PureData System for Analytics (Netezza)
 Delivers real-time data for
information management
projects
 Minimize batch windows to
optimize ETL processes
 Flexible implementation for
multiple topologies
InfoSphere Replication Server
High volume, low latency data
replication for continuous business
availability
Replication Server
Requirements
 Rapid data delivery for DB2
and Oracle environments
 Resiliency to hardware,
software, and network
failures
 Visibility into replication
processes for monitoring
Benefits
 Delivers real-time data for
information management
projects
 Deep performance monitoring
& troubleshooting
 To publish data to MQ, use
InfoSphere Data Event
Publisher
.
InfoSphere Federation Server
Enable access and delivery of diverse
and distributed information as if it
were in one system
Federation Server
Requirements
 Access to critical information
in disparate databases/apps
 Physical replication not viable
 Budget constraints prevent
new database purchases
Benefits
 Virtually consolidates data
from multiple lines of
business
 Cost-effective point of
access to multiple DBs
 Enable SOA with
InfoSphere Information
Services Director
InfoSphere MDM Product Portfolio
IBM® Initiate® Master
Data Service®
IBM® InfoSphere™
MDM Server
Virtual master registry for
delivering single, trusted
versions of master data
& their relationships
Physical master repository
of customer, account
& product data for the
business to consume
as services
Governance
Single View
IBM® InfoSphere™
Identity Insight
Analytical capabilities to
discover where threat &
fraud exists to enable
immediate action
Common
Components
IBM® InfoSphere™
MDM Server for PIM
Authors, enriches and
maintains product and other
master data for a 360 degree
view
What is Master Data Management?

Discipline that provides a consistent understanding of master data entities (customer, product, etc.)

A set of functionality for data governance that provides mechanisms
& governance for consistent use of master data across the organization

Is designed to accommodate, control and manage change
SOURCE SYSTEMS
MASTER DATA
MANAGEMENT
CRM
Name:
B. Jones
Address: 35 West 15th Street
Address: Toledo, OH 12345
ERP
Name:
William Jones
Address: 35 West 15th St.
Address: Toledo, OH 12345
INFORMATION
First:
Bill
Last:
Jones
Address:
35 West 15th Street
City:
Toledo
ENTERPRISE
APPLICATIONS
Sales
Customer
Support
State/Zip: OH / 12345
Legacy
Name:
Billie Jones
Address: 36 West 15th St.
Address: Toledo, OH 12345
Gender:
M
Age:
50
DOB:
1/1/65
Claims
IBM provides a cost-effective, rapidly deployable solution
to complex customer data management challenges
Entity Resolution and Analysis
…It is used to identify the use of false identities and networks of individuals who are
trying to hide their relationships to each other. The same technologies or analyses
are used in the detection of fraud networks, racketeering and money-laundering. -
Gartner
Who is who?
Identity
Resolution
Company&
External
Sources
 Establish Identity
 Physical/Digital
Attributes
 People & Organizations
 Multicultural Names
Who knows who?
Who does what?
Relationship
Resolution
Complex Event
Processing
 Obvious & Non-Obvious
 Links people & groups
 Degrees of Separation
 Role Alerts
 Events & Transactions
 Criteria Based Alerting
 Quantify Identity
Activities
Consuming
Applications
!
Threat and
Fraud
Alerts
Data Warehouse and Big Data
DB2 101
Simplicity, Flexibility, Choice
IBM Data Warehouse & Analytics Solutions
PureData System
for Analytics
True Appliance
PureData System for
Operational Analytics
Flexible Integrated System
DB2 Advanced
Enterprise Server
Edition v10.5
Custom Solution
• IBM investment in solution
design, integration and upgrades
• IBM investment in solution design,
integration and upgrades
• Client investment in solution
design, integration and upgrades
• Speed and ease of deployment and
administration
• Flexibility of multiple options platform, capacity, and integrated
software
• Complete flexibility to mix and
optimize software, servers and
storage for the complete range
and mix of workloads
• Optimized performance for
a specific workload range
True Appliance
Simplicity
• Customizable to optimize for
a range and mix of workloads
Flexible Integrated System
The right mix of simplicity and flexibility
Custom Solution
Flexibility
Why “Big Data”?
Homeland Security
Up to
10,000
times
larger
600,000 records/sec, 50B/day
1-2 ms/decision
320TB for Deep Analytics
Telco Promotions
Data Scale
Data at Rest
100,000records/sec, 6B/day
10 ms/decision
270TB for Predictive Analytics
Multi-Channel sales
Weblogs and transactions
Traditional
Data warehouse
& Business
Intelligence
yr
mo
Occasional
wk
day
Predictive analytics to up sell
and cross sell
1 sec/decision
Data in Motion
hr
min
Frequent
sec
Decision Frequency
Up to 10,000
times faster
…
ms
Real-time
s
Smart Traffic
250K GPS probes/sec
630K segments/sec
2 ms/decision, 4K vehicles
1 – Unlock Big Data
InfoSphere Data
Explorer
Analytic Applications
BI /
Exploration / Functional Industry Predictive Content
BI /
Reporting Visualization
App
App
Analytics Analytics
Reporting
IBM Big Data Platform
2 – Analyze Raw Rata
Visualization
& Discovery
Application
Development
Systems
Management
3 – Simplify your
warehouse
PureData System for
Analytics
InfoSphere
BigInsights
Accelerators
Hadoop
System
Stream
Computing
Data
Warehouse
5 – Analyze Streaming
Data
4 – Reduce costs with
Hadoop
InfoSphere Streams
InfoSphere
BigInsights
Information Integration & Governance
Traditional Approach
New Approach
Structured, analytical, logical
Creative, holistic thought, intuition
Data
Warehouse
Transaction
Data
Hadoop and
Streams
Multimedia
Web Logs
Internal App
Data
Mainframe
Data
Social Data
Structured
Repeatable
Linear
OLTP System
Data
Unstructured Text Data:
Exploratory
emails
Dynamic
Sensor data:
images
ERP
Data
RFID
Traditional
Sources
New
Sources
Mainframe Strategy
DB2 101
DB2 Analytics Accelerator for z/OS
Netezza appliance connected to System z only accessible through DB2
Blending System z and Netezza
What is the value?
technologies to deliver unparalleled,
•
Fast, predictable response times
for “right-time” analysis
•
Accelerate analytic query
response times
•
Improve price/performance for
analytic workloads
•
Minimize the need to create data
marts for performance
•
Highly secure environment for
sensitive data analysis
•
Transparent to the application and
user
mixed workload performance for
complex analytic business needs.
Large Insurance Company – Business Reporting
“we had this up and running in days with queries that ran over 1000 times faster”
Query
Query 1
Query 2
Query 3
Query 4
Query 5
Query 6
Query 7
Query 8
Query 9
Total
Total
Rows
Rows
Reviewed Returned
2,813,571
853,320
2,813,571
585,780
8,260,214
274
2,813,571
601,197
3,422,765
508
4,290,648
165
361,521
58,236
3,425.29
724
4,130,107
137
DB2 Only
DB2 with
IDAA
Hours Sec(s)
2:39 9,540
2:16 8,220
1:16 4,560
1:08 4,080
0:57 4,080
0:53 3,180
0:51 3,120
0:44 2,640
0:42 2,520
Hours Sec(s)
0.0
5
0.0
5
0.0
6
0.0
5
0.0
70
0.0
6
0.0
4
0.0
2
0.1
193
 DB2 Analytics Accelerator (Netezza 1000-12)
• Production ready - 1 person, 2 days
• Choose a Table for “Acceleration”
 Table Acceleration Setup in 2 Hours
• DB2 “Add Accelerator
• Choose a Table for “Acceleration”
• Load the Table (DB2 Loads Data to the Accelerator
• Knowledge Transfer
• Query Comparisons
Times
Faster
1,908
1,644
760
816
58
530
780
1,320
13
 Initial Load Performance
• 400 GB Loaded in 29 Minutes
• 570 Million Rows
• Loaded 800 GB to 1.3 TB per hour
 Extreme Query Acceleration - 1908x faster
• 2 Hours 39 minutes to 5 Seconds
 CPU Utilization Reduction to 35%
IDAA - High Performance Storage Saver (HPSS)
Oracle Takeout Strategy
DB2 101

IBM DB2 Advanced Enterprise Edition is
as low as 1/3rd the price1 of Oracle

DB2 on POWER is as 3X faster per core2
than Oracle Database on SPARC

Breakthrough migration technology delivers
up to 98% compatibility3 with Oracle PL/SQL
1.
PRICE based on price per core of comparable hardware and publicly avail U.S. info on 11/11/2011 for IBM DB2 Advanced Enterprise Edition + Oracle software w/comparable capabilities. Price comparison is NOT based on
the specific benchmarks listed here. IBM: 100 Processor Value Units. Oracle: assumes 1.0 processor multiplier. Both incl. Y1 maint/support.
2.
PERFORMANCE: www.tpc.org (http://www.tpc.org) as of 11/11/11 [IBM Power 780 (3 x 64 C)(24 Ch/192 C/768 Th); 10,366,254 tpmC; $1.38/tpmC; avail 10/13/10 v. Oracle SPARC SuperCluster w/T3-4 Servers (27 x 64
C)(108 Ch/1728 C/13824 Th); 30,249,688 tpmC; $1.01/tpmC; avail 6/1/11]. TPC-C is a trademark of Transaction Performance Processing Council. www.sap.com/solutions/benchmark/
(http://www.sap.com/solutions/benchmark/) as of 11/11/11 [IBM Power 795 (32 P/256 C/1024 Th); 126063 users/2-tier SAP ERP 6.0 pack4/AIX 7.1 + DB2 9.7; cert 2010046 v. Oracle SPARC Enterprise Server M9000 (64
P/256 C/512 Th); 39100 users/2-tier SAP ERP 6.0/Solaris 10, Oracle 10g; cert 2008042]. SAP is registered trademark of SAP AG in Germany and in several other countries.
3.
COMPATIBILITY: Based on internal tests and reported client experience from June 30, 2010 through July 20, 2011.
Business Value Assessment Tool
Calculate your savings
with DB2
• The tool uses a simple graphical
interface - enter a few key inputs
and assumptions.
• Assumptions can be modified if
inputs change.
• Cost categories show detailed
calculations.
• Talk to your IBM Rep or The
Fillmore Group to run an
assessment.
45
45
Migration – Discovery to Implementation
Step 1
Step 2
Review
Process
Review
Business
Value
Calculate
ROI
(BVA)
Oracle
Contract
& ULA
BVA
46
46
Step 3
Attend
DB2
Training
Review
Proposal
Perform
App
Migration
Test
IBM
POC
Convert
IBM Database Facts:








Has more patents than Oracle and SQL Server combined
Over 1300 Core Database developers located in 5 main development laboratories
around the world.
Over 300 researchers located in labs in the United State, Canada, Japan and Israel
are currently working on advancement in database technologies for IBM. Examples
of past research projects which have developed into product include : database
compression, heterogeneous replication, advancements in query optimization for
optimal performance, high availability , pureScale and cubing engines. All of these
are best of breed in the marketplace.
25 of the top 25 banks
9 of the top 10 insurance providers
23 of the top 25 U.S. retailers
Migrated over 100 customers from SAP / Oracle to SAP / DB2 in last 2 years
Over 700 customers have made the SAP / DB2 decision
DB2 101 Summary
DB2 101
Our next event

Mid-Atlantic IM Virtual Users Group


http://www.thefillmoregroup.com/virtual-user-groups/midatlantic-im-vug/
“IBM Data Management Announcements - What You
Should Know”




https://www2.gotomeeting.com/register/387815986
Tuesday, May 7, 2013; 11:00 AM - 12:00 PM EDT
Kim May, The Fillmore Group
Warren Heising, IBM
Resources

The Fillmore Group’s blog


Kim May



http://www.thefillmoregroup.com/blog
kim.may@thefillmoregroup.com
http://twitter.com/KimMayTFG
Frank Fillmore




frank.fillmore@thefillmoregroup.com
http://twitter.com/ffillmorejr
http://tinyurl.com/ChannelDB2
Flipboard for iPad, iPhone, Android: “BigData”
Download