IBM Data Strategy DB2 101 Agenda The Fillmore Group, IBM Business Partner What’s New in 2013? Data Governance Information Integration Data Warehouse and Big Data Mainframe Update Competing with Oracle The Fillmore Group, Inc. Frank Fillmore, DB2 Gold Consultant, IBM Champion Kim May, IBM Champion Dozens of experienced, certified consultants Information Management Consulting Services IBM Authorized Training Partner IBM Information Management Software Reseller Services DB2 – implementation, migration, and tuning BigData BigInsights (IBM’s Hadoop distribution) InfoSphere Streams Data Governance tools Information Integration Data Warehousing IBM Authorized Training Exclusive NA provider of IBM BigData training DW611 DW641 DW652 DW723 DW731 “IBM InfoSphere BigInsights Foundation” “BigInsights Analytics for Business Analysts” “BigInsights Analytics for Programmers” “Programming for InfoSphere Streams V3 with SPL” “Administration of InfoSphere Streams V3” DB2, InfoSphere, Information Server, Optim IBM 2013 News PureData System for Hadoop 8x faster deployment than custom-built solutions First appliance with built-in analytics accelerator Only Hadoop system with built-in archiving tools DB2 for LUW v10.5 with BLU Acceleration 8-25x faster reporting and analytics 10x storage space savings seen during beta test No indexes, aggregates, tuning, or SQL / schema changes BLU Acceleration Dynamic In-Memory Actionable Compression Patented compression technique that preserves order so that the data can be used without decompressing Parallel Vector Processing In-memory columnar processing Dynamic movement of data from storage Multi-core and SIMD parallelism (Single Instruction Multiple Data) Data Skipping Skips unnecessary processing of irrelevant data Data Servers - DB2 AESE bundle Data Governance Information Integration Data Warehouse 2013 IBM Portfolio What’s in the Information Management portfolio? DB2 Single Instance (scaleup) Scalable Basic Read/write Rapid deployment Storage efficient Multi-tenancy SSD-aware? DB2 pureScale DB2 Data Partitioning Facility Clustered Database (scale-out) Partitioned Database (scale-out) Massive scale Non-partitionable Transactional Read/write Highly Available High concurrency Fast response time Storage efficient Multi-tenancy Massive scale Partitionable Analytic Read-mostly Mixed query types High concurrency High throughput Storage efficient Multi-tenancy “shared disk” “shared nothing” PureData System for Transactions PureData System For Operational Analytics DB2 with BLU Acceleration Columnar Compression Database Memory scale Ultra-high speed Analytic Read-mostly Column store Column compressed Very storage efficient DB2 pureScale Unlimited Capacity – Buy only what you need, add capacity as your needs grow Application Transparency – Avoid the risk and cost of application changes Continuous Availability – Deliver uninterrupted access to your data with consistent performance Learning from the undisputed Gold Standard... System z pureScale - Scalability and High Availability Efficient Centralized Locking and Caching As the cluster grows, DB2 maintains one place to go for locking information and shared pages Optimized for very high speed access DB2 pureScale uses Remote Direct Memory Access (RDMA) to communicate with the powerHA pureScale server No IP socket calls, no interrupts, no context switching Results Near Linear Scalability to large numbers of servers Constant awareness of what each member is doing If one member fails, no need to block I/O from other members Recovery runs at memory speeds Member 1 Member 2 Member 3 CF Group Buffer Pool Group Lock Manager PowerHA pureScale Delivering trusted information for smarter business decisions – across the entire information supply chain INTEGRATE MANAGE InfoSphere Information Server DB2, Informix FileNet ANALYZE solidDB PureData System InfoSphere BigInsights InfoSphere MDM Cognos, InfoSphere InfoSphere Streams Data Explorer External Information Sources Business Analytic Applications GOVERN Quality InfoSphere Information Server Lifecycle InfoSphere Optim Security & Privacy InfoSphere Guardium Data Governance DB2 101 IBM Optim solutions Managing data throughout its lifecycle in heterogeneous environments Discover Speed understanding and project time through relationship discovery within and across data sources Understand sensitive data to protect and secure it Retire Test Data Management Discover Understand Classify Production Training •Subset •Mask Easily refresh & maintain right sized non-production environments, while reducing storage costs Improve application quality and deploy new functionality more quickly Data Masking •Compare Development •Refresh Test Protect sensitive information from misuse & fraud Prevent data breaches and associated fines Data Growth Management Reduce hardware, storage & maintenance costs Streamline application upgrades and improve application performance Application Retirement Archive Retire legacy & redundant applications while retaining data Ensure application-independent access to archive data InfoSphere Guardium OPM Slide Needed InfoSphere Guardium Guardium Slide Needed Optim Query Capture and Replay Capture production workloads and replay them in testing environments Record and replay SQL Application Requirements • Minimize unexpected production problems • Shorten testing cycles • Develop more realistic database testing scenarios Source Database Record InfoSphere Optim Query Capture and Replay Play Benefits Test Database • Identify database problems sooner with validation reports and performance tuning • Use actual production workloads for testing rather than fabricated scenarios • Extend quality testing efforts to include the data layer Optim Performance Manager OPM Slide Needed OPM – Locking Dashboard Add-ons: Query Tuner Extended Insight Information Integration DB2 101 IBM InfoSphere Information Server MANAGE INTEGRATE ANALYZE DB2, Informix InfoSphere Information Server InfoSphere BigInsights InfoSphere InfoSphere Warehouse MDM InfoSphere Cognos, Streams InfoSphere Warehouse FileNet solidDB External Information Sources Business Analytic Applications GOVERN Quality InfoSphere Information Server Lifecycle InfoSphere Optim Security & Privacy InfoSphere Guardium IBM’s Information Integration DB2, Informix FileNet solidDB InfoSphere InfoSphere Optim Guardium Organizations Use Integration for... DataStage, QualityStage …data loading, cleansing, migrating, movement, high availability and high performance Foundation Tools …deep analysis, design, metadata, collaboration and data governance CDC, Federation, Replication …data delivery for low impact, timely access to critical business data and operational data requirements InfoSphere Foundation Tools Foundation tools help profile, model, define, map and govern information that is spread across your enterprise, so your business can deliver the right information, to the right people, at the right time InfoSphere Information Analyzer InfoSphere Information Architect InfoSphere FastTrack Business Glossary Foundation tools are designed to be deployed in a heterogeneous IT environment to leverage existing IT investments. They work with any IBM or non-IBM data source, business intelligence tool or operating system -- or in conjunction with the Tools’ own comprehensive set of integration products. InfoSphere QualityStage Requirements Standardize, cleanse and deduplicate data, ensuring a complete, accurate view of information QualityStage Resolution of data quality issues Standardization of data formats Cleanse data Manage duplicate data Enable ongoing quality Benefits Removes duplicates • Redundancies • Lack of standards • Unlinked records • Incorrect data Complete & accurate view of information Cross-references matching records Survives a single, complete record Validate and enriches data InfoSphere DataStage Requirements Integrate, transform and deliver data on demand across multiple sources and targets including databases and enterprise applications DataStage Integrate and transform multiple, complex, and disparate sources of information Demand for data is diverse – DW, MDM, Analytics, Applications, and real time Benefits Transform and aggregate any volume of information Deliver data in batch or real time through visually designed logic Hundreds of built-in transformation functions Metadata-driven productivity, enabling collaboration IBM - InfoSphere Data Replication (IIDR) Real-time change data capture and delivery to deliver critical information at the speed of business IBM replication technologies – bundled! SQL Replication Q Replication Easy to set up Staging tables High volume, low latency Native Oracle and DB2 sources and targets WebSphere MQ transport layer Change Data Capture (CDC) Broadest set of heterogeneous sources and targets TCP/IP transport layer InfoSphere Change Data Capture Real-time change data capture and delivery to deliver critical information at the speed of business Change Data Capture Requirements Rapid data delivery for mainly heterogeneous environments Minimize CPU utilization on source systems Increased visibility into lines of business Benefits • DB2 on LUW, i, z/OS • Informix • Oracle • Sybase ASE • SQL Server • PureData System for Analytics (Netezza) Delivers real-time data for information management projects Minimize batch windows to optimize ETL processes Flexible implementation for multiple topologies InfoSphere Replication Server High volume, low latency data replication for continuous business availability Replication Server Requirements Rapid data delivery for DB2 and Oracle environments Resiliency to hardware, software, and network failures Visibility into replication processes for monitoring Benefits Delivers real-time data for information management projects Deep performance monitoring & troubleshooting To publish data to MQ, use InfoSphere Data Event Publisher . InfoSphere Federation Server Enable access and delivery of diverse and distributed information as if it were in one system Federation Server Requirements Access to critical information in disparate databases/apps Physical replication not viable Budget constraints prevent new database purchases Benefits Virtually consolidates data from multiple lines of business Cost-effective point of access to multiple DBs Enable SOA with InfoSphere Information Services Director InfoSphere MDM Product Portfolio IBM® Initiate® Master Data Service® IBM® InfoSphere™ MDM Server Virtual master registry for delivering single, trusted versions of master data & their relationships Physical master repository of customer, account & product data for the business to consume as services Governance Single View IBM® InfoSphere™ Identity Insight Analytical capabilities to discover where threat & fraud exists to enable immediate action Common Components IBM® InfoSphere™ MDM Server for PIM Authors, enriches and maintains product and other master data for a 360 degree view What is Master Data Management? Discipline that provides a consistent understanding of master data entities (customer, product, etc.) A set of functionality for data governance that provides mechanisms & governance for consistent use of master data across the organization Is designed to accommodate, control and manage change SOURCE SYSTEMS MASTER DATA MANAGEMENT CRM Name: B. Jones Address: 35 West 15th Street Address: Toledo, OH 12345 ERP Name: William Jones Address: 35 West 15th St. Address: Toledo, OH 12345 INFORMATION First: Bill Last: Jones Address: 35 West 15th Street City: Toledo ENTERPRISE APPLICATIONS Sales Customer Support State/Zip: OH / 12345 Legacy Name: Billie Jones Address: 36 West 15th St. Address: Toledo, OH 12345 Gender: M Age: 50 DOB: 1/1/65 Claims IBM provides a cost-effective, rapidly deployable solution to complex customer data management challenges Entity Resolution and Analysis …It is used to identify the use of false identities and networks of individuals who are trying to hide their relationships to each other. The same technologies or analyses are used in the detection of fraud networks, racketeering and money-laundering. - Gartner Who is who? Identity Resolution Company& External Sources Establish Identity Physical/Digital Attributes People & Organizations Multicultural Names Who knows who? Who does what? Relationship Resolution Complex Event Processing Obvious & Non-Obvious Links people & groups Degrees of Separation Role Alerts Events & Transactions Criteria Based Alerting Quantify Identity Activities Consuming Applications ! Threat and Fraud Alerts Data Warehouse and Big Data DB2 101 Simplicity, Flexibility, Choice IBM Data Warehouse & Analytics Solutions PureData System for Analytics True Appliance PureData System for Operational Analytics Flexible Integrated System DB2 Advanced Enterprise Server Edition v10.5 Custom Solution • IBM investment in solution design, integration and upgrades • IBM investment in solution design, integration and upgrades • Client investment in solution design, integration and upgrades • Speed and ease of deployment and administration • Flexibility of multiple options platform, capacity, and integrated software • Complete flexibility to mix and optimize software, servers and storage for the complete range and mix of workloads • Optimized performance for a specific workload range True Appliance Simplicity • Customizable to optimize for a range and mix of workloads Flexible Integrated System The right mix of simplicity and flexibility Custom Solution Flexibility Why “Big Data”? Homeland Security Up to 10,000 times larger 600,000 records/sec, 50B/day 1-2 ms/decision 320TB for Deep Analytics Telco Promotions Data Scale Data at Rest 100,000records/sec, 6B/day 10 ms/decision 270TB for Predictive Analytics Multi-Channel sales Weblogs and transactions Traditional Data warehouse & Business Intelligence yr mo Occasional wk day Predictive analytics to up sell and cross sell 1 sec/decision Data in Motion hr min Frequent sec Decision Frequency Up to 10,000 times faster … ms Real-time s Smart Traffic 250K GPS probes/sec 630K segments/sec 2 ms/decision, 4K vehicles 1 – Unlock Big Data InfoSphere Data Explorer Analytic Applications BI / Exploration / Functional Industry Predictive Content BI / Reporting Visualization App App Analytics Analytics Reporting IBM Big Data Platform 2 – Analyze Raw Rata Visualization & Discovery Application Development Systems Management 3 – Simplify your warehouse PureData System for Analytics InfoSphere BigInsights Accelerators Hadoop System Stream Computing Data Warehouse 5 – Analyze Streaming Data 4 – Reduce costs with Hadoop InfoSphere Streams InfoSphere BigInsights Information Integration & Governance Traditional Approach New Approach Structured, analytical, logical Creative, holistic thought, intuition Data Warehouse Transaction Data Hadoop and Streams Multimedia Web Logs Internal App Data Mainframe Data Social Data Structured Repeatable Linear OLTP System Data Unstructured Text Data: Exploratory emails Dynamic Sensor data: images ERP Data RFID Traditional Sources New Sources Mainframe Strategy DB2 101 DB2 Analytics Accelerator for z/OS Netezza appliance connected to System z only accessible through DB2 Blending System z and Netezza What is the value? technologies to deliver unparalleled, • Fast, predictable response times for “right-time” analysis • Accelerate analytic query response times • Improve price/performance for analytic workloads • Minimize the need to create data marts for performance • Highly secure environment for sensitive data analysis • Transparent to the application and user mixed workload performance for complex analytic business needs. Large Insurance Company – Business Reporting “we had this up and running in days with queries that ran over 1000 times faster” Query Query 1 Query 2 Query 3 Query 4 Query 5 Query 6 Query 7 Query 8 Query 9 Total Total Rows Rows Reviewed Returned 2,813,571 853,320 2,813,571 585,780 8,260,214 274 2,813,571 601,197 3,422,765 508 4,290,648 165 361,521 58,236 3,425.29 724 4,130,107 137 DB2 Only DB2 with IDAA Hours Sec(s) 2:39 9,540 2:16 8,220 1:16 4,560 1:08 4,080 0:57 4,080 0:53 3,180 0:51 3,120 0:44 2,640 0:42 2,520 Hours Sec(s) 0.0 5 0.0 5 0.0 6 0.0 5 0.0 70 0.0 6 0.0 4 0.0 2 0.1 193 DB2 Analytics Accelerator (Netezza 1000-12) • Production ready - 1 person, 2 days • Choose a Table for “Acceleration” Table Acceleration Setup in 2 Hours • DB2 “Add Accelerator • Choose a Table for “Acceleration” • Load the Table (DB2 Loads Data to the Accelerator • Knowledge Transfer • Query Comparisons Times Faster 1,908 1,644 760 816 58 530 780 1,320 13 Initial Load Performance • 400 GB Loaded in 29 Minutes • 570 Million Rows • Loaded 800 GB to 1.3 TB per hour Extreme Query Acceleration - 1908x faster • 2 Hours 39 minutes to 5 Seconds CPU Utilization Reduction to 35% IDAA - High Performance Storage Saver (HPSS) Oracle Takeout Strategy DB2 101 IBM DB2 Advanced Enterprise Edition is as low as 1/3rd the price1 of Oracle DB2 on POWER is as 3X faster per core2 than Oracle Database on SPARC Breakthrough migration technology delivers up to 98% compatibility3 with Oracle PL/SQL 1. PRICE based on price per core of comparable hardware and publicly avail U.S. info on 11/11/2011 for IBM DB2 Advanced Enterprise Edition + Oracle software w/comparable capabilities. Price comparison is NOT based on the specific benchmarks listed here. IBM: 100 Processor Value Units. Oracle: assumes 1.0 processor multiplier. Both incl. Y1 maint/support. 2. PERFORMANCE: www.tpc.org (http://www.tpc.org) as of 11/11/11 [IBM Power 780 (3 x 64 C)(24 Ch/192 C/768 Th); 10,366,254 tpmC; $1.38/tpmC; avail 10/13/10 v. Oracle SPARC SuperCluster w/T3-4 Servers (27 x 64 C)(108 Ch/1728 C/13824 Th); 30,249,688 tpmC; $1.01/tpmC; avail 6/1/11]. TPC-C is a trademark of Transaction Performance Processing Council. www.sap.com/solutions/benchmark/ (http://www.sap.com/solutions/benchmark/) as of 11/11/11 [IBM Power 795 (32 P/256 C/1024 Th); 126063 users/2-tier SAP ERP 6.0 pack4/AIX 7.1 + DB2 9.7; cert 2010046 v. Oracle SPARC Enterprise Server M9000 (64 P/256 C/512 Th); 39100 users/2-tier SAP ERP 6.0/Solaris 10, Oracle 10g; cert 2008042]. SAP is registered trademark of SAP AG in Germany and in several other countries. 3. COMPATIBILITY: Based on internal tests and reported client experience from June 30, 2010 through July 20, 2011. Business Value Assessment Tool Calculate your savings with DB2 • The tool uses a simple graphical interface - enter a few key inputs and assumptions. • Assumptions can be modified if inputs change. • Cost categories show detailed calculations. • Talk to your IBM Rep or The Fillmore Group to run an assessment. 45 45 Migration – Discovery to Implementation Step 1 Step 2 Review Process Review Business Value Calculate ROI (BVA) Oracle Contract & ULA BVA 46 46 Step 3 Attend DB2 Training Review Proposal Perform App Migration Test IBM POC Convert IBM Database Facts: Has more patents than Oracle and SQL Server combined Over 1300 Core Database developers located in 5 main development laboratories around the world. Over 300 researchers located in labs in the United State, Canada, Japan and Israel are currently working on advancement in database technologies for IBM. Examples of past research projects which have developed into product include : database compression, heterogeneous replication, advancements in query optimization for optimal performance, high availability , pureScale and cubing engines. All of these are best of breed in the marketplace. 25 of the top 25 banks 9 of the top 10 insurance providers 23 of the top 25 U.S. retailers Migrated over 100 customers from SAP / Oracle to SAP / DB2 in last 2 years Over 700 customers have made the SAP / DB2 decision DB2 101 Summary DB2 101 Our next event Mid-Atlantic IM Virtual Users Group http://www.thefillmoregroup.com/virtual-user-groups/midatlantic-im-vug/ “IBM Data Management Announcements - What You Should Know” https://www2.gotomeeting.com/register/387815986 Tuesday, May 7, 2013; 11:00 AM - 12:00 PM EDT Kim May, The Fillmore Group Warren Heising, IBM Resources The Fillmore Group’s blog Kim May http://www.thefillmoregroup.com/blog kim.may@thefillmoregroup.com http://twitter.com/KimMayTFG Frank Fillmore frank.fillmore@thefillmoregroup.com http://twitter.com/ffillmorejr http://tinyurl.com/ChannelDB2 Flipboard for iPad, iPhone, Android: “BigData”