Data Aging Strategies in SAP Business Warehouse BW 7.3

Rainer Uhle, SAP Product Manager
Dr. Peter Zimmerer, SAP Development Architect
Mannheim, Rosengarten - June 22, 2011

Disclaimer

This presentation outlines our general product direction and should not be relied on in making a purchase decision. This presentation is not subject to your license agreement or any other agreement with SAP. SAP has no obligation to pursue any course of business outlined in this presentation or to develop or release any functionality mentioned in this presentation. This presentation and SAP's strategy and possible future developments are subject to change and may be changed by SAP at any time for any reason without notice. This document is provided without a warranty of any kind, either express or implied, including but not limited to, the implied warranties of merchantability, fitness for a particular purpose, or non-infringement. SAP assumes no responsibility for errors or omissions in this document, except if such damages were caused by SAP intentionally or grossly negligent.

© 2011 SAP AG. All rights reserved.

You Need Complete and Trusted Information to Make Good Business Decisions

“90% of upper level management feel they don’t have the necessary information for critical business decisions; 50% of them are afraid they are making poor decisions because of it.”
“BI strategies are deemed to fail without a trusted data foundation”
“The #1 risk for building a data mart or data warehouse is data quality”

How Good is the Data Behind My Dashboard?

• Where did these numbers come from?
• Are we considering all our relevant sources?
• Are these terms consistent with our business definitions?
• How current is this data? When was it last updated?
• Can I trust this data enough to make my critical decisions?
• Has the data passed all our business rule checks?
Enterprise Data Warehouse (EDW) Characteristics and Requirements

SAP NetWeaver Business Warehouse: Strong EDW Capabilities

Integrated, scalable Enterprise Data Warehouse (EDW) platform

EDW = DBMS + X, where X stands for:
• Business Content
• Modeling Patterns
• Reliable Data Acquisition
• Streamlined Operations
• Lifecycle Management

Fast, sustainable implementation through:
– Business Content

Openness and data quality through:
– Out-of-the-box integration for data originating in SAP systems
– Integration with SAP BusinessObjects Data Services (Data Integrator and Data Quality Management)

Efficient data management through:
– Management of data consistency, database abstraction, database neutrality
– Sophisticated security, authorization and identity handling
– High availability

Sophisticated lifecycle management at different levels:
– System
– Meta data
– Data (nearline storage, archiving)

What does BW know about my Business?

Introduction to the Term "Layered, Scalable Architecture (LSA)"

Layered, Scalable Architecture (LSA) is SAP's standard term for a common, unified understanding. The LSA is a reference architecture and not only a data model. At the center is the service idea of the reference architecture: each layer provides a service that can be used.

• Layered: Layer-based data model in which each layer performs a specific task.
• Scalable: The data model is scalable and can be enhanced, for example by other source systems, regions and scenarios.
• Architecture: The LSA is an architecture that is applied in the entire BW system.
The LSA Reference Architecture Layers

Reporting

BI Applications (Architected Data Mart Layer):
• Reporting Layer: Layer optimized for reporting (consists of InfoCubes and MultiProviders)
• Business Transformation Layer: Application of business logic for the applications
• Operational Data Store: Near real-time reporting, close to operational reporting

EDW Layer (single point of truth, reusable, granular, complete history):
• Data Propagation Layer: Easily digestible, consumable, integrated and independent data
• Harmonisation Layer: Harmonization, securing data quality, plausibility
• Corporate Memory: Structure close to the source system, complete storage of history as granular as possible, “Master the Unknown”
• Data Acquisition Layer: Extractor inbox, 1:1 mapping, temporary storage

Data Sources

LSA Data Flow Templates as Content

SAP NetWeaver BW Adoption

Productive SAP NetWeaver BW systems: constant growth

[Chart: number of productive SAP NetWeaver BW systems per quarter, Q1 09 to Q4 10; plotted values range from 13,359 to 15,238]

• Adoption of SAP NetWeaver BW constantly growing
• Unaffected by the economic downturn in 2009
• More than 12,000 customers running more than 15,000 productive systems

Stable product, large installed base, constant growth

Analyst Opinions

Forrester 2011

SAP BW EDW and Reality: "60 TB Proof of Concept" on RDBMS (IBM / DB2)

Discussions about corporate DWH architectures (EDW) are frequently driven by fears and prejudices. This results in vague questions like: Can BW handle 30, 40, ..., 100 terabytes?

The answer: SAP BW - 60 TB Proof of Concept
Aggregation “on the fly”

[Diagram: SAP NetWeaver 7.0 Business Intelligence with the BW Analytical Engine and the SAP NetWeaver BW Accelerator. Labels: Information; Query & Response; Merging and results preparation for BI queries; InfoCube; Indexing; Query Run Time (*)]

(*) Property setting ('load index into main memory') or schedule program RSDDTREX_INDEX_LOAD_UNLOAD

BWA Linear Scalability - Data Volume vs. Resources (25 TB Showcase 2009)

BWA resources   Total DB size   Index creation   Multiuser reporting    Avg. report     Avg. # records
                                throughput       throughput             response time   touched per report
27 blades       5 TB            0.6 TB / h       100,000 reports / h    4.5 sec         6 M records
81 blades       15 TB           1.1 TB / h       101,000 reports / h    4.2 sec         22 M records
135 blades      25 TB           1.2 TB / h       101,000 reports / h    4.2 sec         37 M records

Bill Inmon's Corporate Information Factory & Nearline Storage

[Diagram: Corporate Information Factory. Labels include corporate applications (ERP, CRM, eComm., Bus. Int.), changed data, ETL, staging area, EDW, global ODS, local ODS, oper. mart, departmental data marts (Acctg, Finance, Marketing, Sales), DSS applications, exploration warehouse / data mining, granularity manager, session analysis, dialogue manager, cookie cognition, preformatted dialogues, web logs, Internet, cross media storage management, near line storage, archives. Source: Bill Inmon]

Data-Aging Strategies for Volume Performance

Information lifecycle according to importance/age:

Storage type     Online Database                           Nearline Storage (read only)      Classic Archive (read only)
Data category    Frequently read / changed data (actual)   Infrequently read data (mature)   Very rarely read data (aged)
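The three storage tiers in the data-aging table can be illustrated with a small Python sketch that assigns a record to a tier by its age. This is an illustration only: the thresholds (24 and 60 months) and the names `storage_tier` and `months_between` are assumptions made for the example; they are not values or functions from SAP BW or from this presentation. In practice the boundaries between "actual", "mature" and "aged" follow from business requirements.

```python
from datetime import date
from enum import Enum


class StorageTier(Enum):
    ONLINE_DATABASE = "Online Database"
    NEARLINE_STORAGE = "Nearline Storage (read only)"
    CLASSIC_ARCHIVE = "Classic Archive (read only)"


# Hypothetical thresholds in months; real values depend on business requirements.
MATURE_AFTER_MONTHS = 24
AGED_AFTER_MONTHS = 60


def months_between(earlier: date, later: date) -> int:
    """Whole calendar months from `earlier` to `later` (day of month ignored)."""
    return (later.year - earlier.year) * 12 + (later.month - earlier.month)


def storage_tier(posting_date: date, today: date) -> StorageTier:
    """Map a record's age to one of the three tiers of the data-aging table."""
    age = months_between(posting_date, today)
    if age >= AGED_AFTER_MONTHS:
        return StorageTier.CLASSIC_ARCHIVE    # very rarely read data (aged)
    if age >= MATURE_AFTER_MONTHS:
        return StorageTier.NEARLINE_STORAGE   # infrequently read data (mature)
    return StorageTier.ONLINE_DATABASE        # frequently read / changed data (actual)


if __name__ == "__main__":
    today = date(2011, 6, 22)
    for posting_date in (date(2011, 1, 15), date(2008, 3, 1), date(2004, 12, 31)):
        print(posting_date, "->", storage_tier(posting_date, today).value)
```

A purely time-based rule like this mirrors the "mainly time-based archiving" approach; other characteristics could be added as further conditions.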
Key Facts about SAP NLS

• NLS should be a part of an Information Lifecycle Management (ILM) strategy
• Based on well-established SAP / SAP BW archiving concepts
• Data archived in NLS can be incorporated into reporting
• Data consistency guaranteed before deleting the data from source
• High compression rate (up to 95%)
• Supports archiving of InfoCubes and DataStore Objects
• Saves storage costs and other system resources
• NLS is an application from a third-party vendor, running on a separate system
• Mainly time-based archiving, yet can also be based on other characteristics
• Lock of the archived data slice in the original InfoProviders
• Process Chain support
• Increases retention period for analysis data
• Scheduling and monitoring of archiving sessions from the SAP BW system
• Copes with changes in the meta data of the BW objects of the archived data
• Included in the query statistic data collection (RSRT)

Evolution by SAP NetWeaver BW Releases

SAP NetWeaver BW 7.00
• Enhanced Look-Up API
• Suspension and selective continuation of archiving processes within Process Chains
• Restore of an archiving request with all successors
• Smaller Data Object size for ADK-based Nearline Solution without semantic grouping

SAP NetWeaver BW 7.01 (EhP1)
• Support of write-optimized DataStore Objects for ADK archiving and the Nearline Storage interface
• Request-based archiving
• Enhanced status and job monitoring within the InfoProvider management view

SAP NetWeaver BW 7.30
• Support for accessing Nearline Storage data for MultiProviders
• Feature to allow archiving from uncompressed InfoCubes
• Archiving of Semantic Partitioned Objects (SPO) with SP1
• Automatic rebuild of BW Accelerator index possible

The Nearline Storage Solution for SAP NetWeaver BW

Based on the Nearline Storage Interface, development partners can implement their solutions for archiving and NLS in SAP BW.

3rd-party NLS solutions:
• are implemented within the SAP BW ABAP stack in partner-specific namespaces
• have to pass a certification process
• can offer a specific application area in the SAP Support Portal
• have to be licensed in addition to SAP licenses
• can have a different release cycle compared to SAP NetWeaver BW

Present development partners (in alphabetical order of their products):

NLS Partner Solution                          Certified since SAP BW 7.0
CBW® - PBS Software                           yes
Dynamic NearLine Access® - SAND Technology    yes
DB2 Viper 9.5® - IBM                          7.01 SP6
DataVard OutBoard 1.0                         yes

(see also http://www.sap.com/ecosystem/customers/directories/SearchSolution.epx )

Customer Adoption - BW Archiving and Nearline Storage

[Chart: customer adoption of BW archiving and Nearline Storage, based on 895 customer messages]
Data Analysis and Assistance for ROI Analysis

Sizing of Nearline Storage solutions:
• Hardware sizing of the Nearline Storage solution has to be done by the vendor
• Different Nearline Storage technologies on the market
  – From database solutions, to file-based solutions, to column-based storage solutions

Data volume services by SAP Active Global Support (AGS), http://service.sap.com/dvm :
• Deliver a thorough analysis of BW objects distribution
• Can help in estimating the data volume that may be archived / transferred to NLS for the largest InfoProviders within the system
• Consider only “technical facts” (and not the customer’s “business requirements”)

Data Management with Nearline Storage: Implementation Aspects

1 Look-up during Transformation
2 Create a Data Archiving Process
3 Create and schedule archiving requests
4 Restore archiving requests
5 Load data to subsequent Data Targets
6 Query Settings
7 MultiProvider Settings

[Diagram: LSA data flow annotated with the numbers above. Labels: Reporting Layer (Architected Data Marts) with SAP Sales InfoCube, MultiProvider and Nearline Storage; Data Propagation Layer with Nearline Storage, DAP and DTPs; LSA Corporate Memory; Data Acquisition Layer with InfoSource, DTP, PSA, InfoPackage and DataSource]

Design Aspects - Nearline Storage (NLS) vs. BW Accelerator (BWA)

[Diagram: storage options positioned by access frequency (very frequently, frequently, not frequently, rarely). Labels: BI InfoMarts (InfoCube), BWA (Acceleration), RDBMS, Nearline Storage, ADK Archive (Archiving), Acquisition]

Data Management at Query Runtime

The Data Manager identifies the availability of alternative data storage of any kind, such as:
1. Data resides in the InfoProvider in the database
2. Data resides in a classical Aggregate
3. Data resides in the BW Accelerator Index
4. Data resides in an NLS Partition

[Screenshot labels: Aggregate Types • BW Accelerator Index • NLS Partition]
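The four storage locations listed for the Data Manager can be pictured with a short Python sketch. It is a simplified model, not the actual BW Data Manager logic: the class `InfoProviderState`, the function `storages_to_read`, the YYYYMM period encoding, and the preference order for online data (BWA index, then aggregate, then database tables) are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class InfoProviderState:
    """Hypothetical description of where an InfoProvider's data lives."""
    has_bwa_index: bool        # a BW Accelerator index exists
    has_aggregate: bool        # a suitable classical aggregate exists
    nls_archived_up_to: int    # last archived period as YYYYMM; 0 = nothing archived


def storages_to_read(state: InfoProviderState,
                     period_from: int,
                     period_to: int,
                     read_nearline: bool) -> List[str]:
    """Return the storages a query on [period_from, period_to] would touch.

    Assumed preference for online data: BWA index, then classical
    aggregate, then the InfoProvider tables in the database.
    """
    storages: List[str] = []

    # Online part: any requested period newer than the archived time slice.
    if period_to > state.nls_archived_up_to:
        if state.has_bwa_index:
            storages.append("BW Accelerator index")
        elif state.has_aggregate:
            storages.append("classical aggregate")
        else:
            storages.append("InfoProvider (database)")

    # Nearline part: only if the query overlaps the archived slice
    # and nearline reading is switched on for this query.
    if read_nearline and period_from <= state.nls_archived_up_to:
        storages.append("NLS partition")

    return storages


if __name__ == "__main__":
    cube = InfoProviderState(has_bwa_index=True, has_aggregate=False,
                             nls_archived_up_to=200812)
    print(storages_to_read(cube, 200801, 201012, read_nearline=True))
    print(storages_to_read(cube, 200801, 201012, read_nearline=False))
```

The `read_nearline` flag stands in for the nearline read settings that can be made at query, MultiProvider and InfoProvider level.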
NLS Related MultiProvider Settings

Nearline read mode
• disabled altogether
• enabled altogether
• according to InfoProvider settings

MultiProvider: Query Runtime Statistics

Listing of Basis Providers and NLS partitions used during query execution

NLS Related Query Designer Settings

Fixed NLS settings (Reporting)
• read NLS
• do not read NLS
• see InfoProvider settings

NLS Related Query Designer Settings: Variable

Variable NLS settings (Dialog)
• read NLS
• do not read NLS
• see InfoProvider settings

InfoCube: Archiving of Uncompressed Data

• Central setting in the Data Archiving Process (DAP)
• Valid for all archiving requests and DAP variants
• Can be changed during operation
• Prerequisite: only already processed requests (aggregates, Delta DTP)

[Screenshot label: Allow Archiving for noncompressed data]

Data Management at Archiving Runtime

During the delete phase of the archiving request, the new setup of the BWA index is offered in the dialog.

BWA consistency is reflected during DAP processing.

Optimized Support for Navigational Attributes

Optimized support for navigational attributes during query processing on NLS:
• Navigational attributes are master data attributes that can be used to navigate/filter in queries.
• Master data attributes are located outside the InfoCube persistence in the extended star schema and thus are not a component of the NLS data stock.

Previous solution:
– Selections for navigational attributes were not transferred to NLS as selections …
– The attribute values were assigned subsequently and filtered in the result set
– Performance problems for highly selective attribute values

Improvement:
– Selections for navigational attributes are first converted to a selection on the characteristic that carries the attribute (max. 100 characteristic values)
– The attribute selection is replaced by this characteristic selection in the query selection.

DSO Lookup for "Nearlined" Partitions

SAP NetWeaver BW 7.30 will come with a separate transformation rule type, a DSO lookup. If an NLS solution is attached to the BW system, the lookup will automatically read from both the “online” and the “nearlined” data partitions.

Data Access within the APD

With SAP NetWeaver BW 7.30, the Analysis Process Designer will be enabled to read from Nearline Storage also for the source type “Read data from InfoProvider”.

[Screenshot label: Option to allow reading from NLS for InfoProvider sources]

Reload Data from both Online and Nearline Partitions for InfoCubes

[Screenshot label: Option to extract data from both the Online and Nearline Partition in a single DTP]

Transaction LISTCUBE

[Screenshot label: Read data from NLS combined]

Archiving of Semantic Partitioned Objects

Facts:
• Semantic Partitioning is possible for InfoCubes (only standard InfoCubes) and DSOs (standard and write-optimized).
• There is not a DAP per PartProvider but only one DAP for the entire SPO. As a consequence, there is not a set of tables / files created in the NLS system per PartProvider but only one set of tables / files per SPO.
• The DAP itself has the same options / settings as for a regular InfoProvider. However, the DAP must contain the logical partitioning criterion as an additional archiving criterion so that data can be archived, reloaded, or restored for a dedicated Semantic Partition.

[Screenshot label: Semantic Partitioning criterion]

Archiving of Semantic Partitioned Objects

Since archiving is not carried out per PartProvider, there is no “Archive” tab within the administration user interface. Instead, an archiving request can be scheduled by means of a dedicated / global button.
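Returning to the slide "Optimized Support for Navigational Attributes": the improvement described there, converting a selection on a navigational attribute into a selection on the characteristic that carries it, can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not SAP's implementation. The function name, the dictionary layout of the master data, and the example values are hypothetical. The slides give the limit of 100 characteristic values but do not say what happens above it; returning `None` so that the caller can fall back to filtering the result set is an assumption.

```python
from typing import Dict, List, Optional, Set

MAX_CHARACTERISTIC_VALUES = 100  # limit stated on the slide


def attribute_to_characteristic_selection(
    master_data: Dict[str, Dict[str, str]],
    attribute: str,
    selected_values: Set[str],
) -> Optional[List[str]]:
    """Convert a navigational-attribute selection into a selection on the
    characteristic that carries the attribute.

    master_data maps characteristic value -> {attribute name: attribute value}.
    Returns the sorted characteristic values, or None if more than
    MAX_CHARACTERISTIC_VALUES qualify (assumed fallback signal).
    """
    matches = sorted(
        key for key, attrs in master_data.items()
        if attrs.get(attribute) in selected_values
    )
    if len(matches) > MAX_CHARACTERISTIC_VALUES:
        return None
    return matches


if __name__ == "__main__":
    # Hypothetical master data: customer -> attributes
    customers = {
        "C001": {"COUNTRY": "DE"},
        "C002": {"COUNTRY": "US"},
        "C003": {"COUNTRY": "DE"},
    }
    # Query filter "COUNTRY = DE" becomes "CUSTOMER in (C001, C003)",
    # which can be passed to NLS because CUSTOMER is stored there.
    print(attribute_to_characteristic_selection(customers, "COUNTRY", {"DE"}))
```

The point of the conversion is that the nearline store only holds the characteristic, not its master data attributes, so only a characteristic selection can be pushed down to it.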
Archiving of Semantic Partitioned Objects (continued)

[Screenshot label: Maintain Archiving]

An archiving request can be scheduled to archive data from all available partitions or only from a dedicated partition (which is equivalent to an archiving run restricted to that semantic partition).

[Screenshot label: Cross-partition archiving or only for a specific partition]

Reading Data from SPOs

In SAP NetWeaver BW 7.30, data held in a Nearline Storage system can be read by a query that is directly flagged to read data from NLS (query properties for reading NLS data no longer have to be maintained via transaction RSRT).

A query can be set to read or not to read data from NLS. The same can be specified at InfoProvider level, and that setting can also be taken into consideration.

Summary and Outlook

Latest Enhancements
• Enhanced lookup support, especially for temporal lookups (non-equal lookup conditions)
• Request-based archiving for InfoCubes (avoids compression before archiving) (BW 7.30)
• Combined DTP extraction from online and archive partition of an InfoCube (BW 7.30)
• Enhanced NLS support for Semantically Partitioned Objects (SPO) based on standard InfoCubes and standard DSOs (BW 7.30 SP 1). NLS support for SPOs based on write-optimized DSOs is available with SP3.
• NLS support for DSO lookup within transformations (DSO lookup feature to be released with SAP NetWeaver BW 7.30 with lookup for online data only)
• Master data deletion to consider data within NLS

Medium term
• NLS support for BW 7.3 running on HANA In-Memory
• Physical deletion of NLS requests from the Nearline Storage (BW 7.30 SP5)

Long term
• Archiving of InfoCubes with non-cumulative key figures, as well as InfoSets and HybridProviders
• Archiving of master data and hierarchies
• Archiving with free selection criteria (not only time slice archiving)

Planned Roadmap HANA & SAP NetWeaver BW

[Roadmap diagram with timeline marks 2006, 2009, 2010, 2011 and "Future direction". Labels:]

BW 7.0 / BWA 7.0
• Major release
• BW Accelerator
• New features and improvements across all components

BW 7.0 EhP1 (7.01)
• Go-to release for integration with SAP BusinessObjects BI

BW 7.3 / BWA 7.2
• Major step on Enterprise Data Warehousing scalability and flexibility
• BW Accelerator: additional performance
• Integration improvements with SAP BusinessObjects Data Services

BW 7.3 SPnn
• BW running on HANA as the underlying In-Memory DB platform
• In-Memory for Enterprise Data Warehousing
• Integrated Planning In-Memory enabled

HANA V1.0 and HANA V1.0 SPSnn
• Real-time operational analytics on mass data
• Rapid creation of agile data marts
• Non-disruptive deployments of HANA side by side with ERP and/or BW
• Additional calculation capabilities
• Primary persistence layer under BW; eliminates need for separate database
• Models for SAP business content enabling new applications

Future direction
• SAP NetWeaver BW evolving to a fully In-Memory enabled EDW solution on top of HANA

Data-Aging Strategies: Nearline Storage Only

Information lifecycle according to importance/age:

Storage type     Online Database                           Nearline Storage (read only)      Classic Archive (read only)
Data category    Frequently read / changed data (actual)   Infrequently read data (mature)   Very rarely read data (aged)

[Diagram arrow label: Archive]

Current situation:
• Nearline Storage is the leading and only persistency
• No isolated delete from Nearline Storage possible
• Workaround: restore to Online Database and delete from there

Data-Aging Strategies: Classic Archive + Nearline Storage

Information lifecycle according to importance/age:

Storage type     Online Database                           Nearline Storage (read only)      Classic Archive (read only)
Data category    Frequently read / changed data (actual)   Infrequently read data (mature)   Very rarely read data (aged)

[Diagram arrow label: Archive (ADK … + NLS)]

Current situation:
• ADK (Classic) Archive is the leading persistency
• Nearline Storage is filled from ADK Archive during the verification phase
• Nearline Storage is strictly coupled to ADK Archive (no independent delete)
Details for the Planned NLS Deletion Features (for SAP BW 7.3, SP05)

1) Data resides in NLS only (without ADK)
• First step: "logical" deletion of NLS data (set NLS request to "Invalid")
• NLS status in the NLS archiving request list will be set to "Marked for Deletion" / "Deleted"
• NLS data will be deleted asynchronously using a clean-up job or (later) a Process Chain
• Time slices will remain locked

2) Data resides in NLS and ADK
• Request can only be deleted from NLS; data in ADK stays untouched
• ADK delete is not supported from the NLS dialog (see SAP Data Life Cycle / Retention concepts in ERP)
• Later restore from ADK to NLS supported

Data resides in NLS (only)

[Screenshot: (Final) Deletion of Nearline Request]

Data resides in NLS only

Three alternatives lead to Nearline Request status "Deleted":
• Finally deleted from NLS (after successful archiving)
• Restored (deleted from NLS but stored in Online-DB again)
• Invalidated (never deleted from Online-DB)

Data resides in ADK and NLS

[Screenshot: Restore deleted Nearline Request from ADK]

Data resides in ADK and NLS

[Screenshot: New Nearline Request after Restore from ADK]

Thank You!

Contact information:
rainer.uhle@sap.com
SAP NW BW PM
SAP AG - Walldorf