Data Aging Strategies in SAP BW 7.3

Data Aging Strategies in SAP Business Warehouse (BW) 7.3
Rainer Uhle, SAP Product Manager
Dr. Peter Zimmerer, SAP Development Architect
Mannheim, Rosengarten - June 22, 2011
Disclaimer
This presentation outlines our general product direction and should not be relied on in
making a purchase decision. This presentation is not subject to your license agreement
or any other agreement with SAP. SAP has no obligation to pursue any course of
business outlined in this presentation or to develop or release any functionality
mentioned in this presentation. This presentation and SAP's strategy and possible future
developments are subject to change and may be changed by SAP at any time for any
reason without notice. This document is provided without a warranty of any kind, either
express or implied, including but not limited to, the implied warranties of
merchantability, fitness for a particular purpose, or non-infringement. SAP assumes no
responsibility for errors or omissions in this document, except if such damages were
caused by SAP intentionally or grossly negligent.
© 2011 SAP AG. All rights reserved.
You Need Complete and Trusted Information
to Make Good Business Decisions
“90% of upper level management feel they don’t have the necessary information for critical business decisions; 50% of them are afraid they are making poor decisions because of it.”
“BI strategies are doomed to fail without a trusted data foundation.”
“The #1 risk for building a data mart or data warehouse is data quality.”
How Good is the Data Behind My Dashboard?
Where did these numbers come from? Are we considering all our relevant sources?
Are these terms consistent with our business definitions?
How current is this data? When was it last updated?
Can I trust this data enough to make my critical decisions? Has the data passed all our business rule checks?
Enterprise Data Warehouse (EDW)
Characteristics and Requirements
SAP NetWeaver Business Warehouse
Strong EDW capabilities
Integrated, scalable Enterprise Data Warehouse (EDW) platform
EDW = DBMS + X
[Diagram labels: Business Content; Reliable Data Acquisition; Streamlined Operations; Lifecycle Management]
Fast, sustainable implementation through:
• Modeling Patterns
• Business Content
Openness and data quality through:
• Out-of-the-box integration for data originating in SAP systems
• Integrated with SAP BusinessObjects Data Services (Data Integrator and Data Quality Management)
Efficient data management through:
• Management of data consistency, database abstraction, database neutral
• Sophisticated Security, Authorization and Identity Handling
• High availability
Enable sophisticated lifecycle management at different levels:
• System
• Meta Data
• Data (Nearline storage, archiving)
What does BW know about my Business?
Introduction to the term "Layered, Scalable Architecture (LSA)"
The Layered, Scalable Architecture (LSA) is SAP's standard term for establishing a common, unified understanding.
The LSA is a reference architecture, not only a data model.
At its center is the service idea of the reference architecture: each layer provides a service that can be used.
Layered: Layer-based data model in which each layer performs a specific task.
Scalable: The data model is scalable and can be enhanced, for example by other source systems, regions and scenarios.
Architecture: The LSA is an architecture that is applied in the entire BW system.
The LSA Reference Architecture layers
BI Applications (Architected Data Mart Layer)
• Reporting Layer: layer optimized for reporting (consists of InfoCubes and MultiProviders)
• Business Transformation Layer: application of business logic for the applications
EDW Layer (single point of truth, reusable, granular, complete history)
• Data Propagation Layer: easily digestible, consumable, integrated and independent data
• Harmonisation Layer: harmonization, securing data quality, plausibility
• Corporate Memory: source-system-close structure, complete storage of history as granular as possible, “Master the Unknown”
• Data Acquisition Layer: extractor inbox, 1:1 mapping, temporary storage
Operational Data Store: near real-time reporting, close to operational reporting
Data Sources
LSA Data Flow Templates as Content
SAP NetWeaver BW adoption
Productive SAP NetWeaver BW systems – constant growth
[Chart: productive SAP NetWeaver BW systems per quarter, Q1 09 to Q4 10; axis range 12,000 to 16,000; data labels in ascending order: 13,359, 13,728, 13,910, 14,214, 14,446, 14,687, 14,948, 15,238]
Adoption of SAP NetWeaver BW constantly growing
• Unaffected by economic downturn in 2009
• More than 12,000 customers referring to more than 15,000 productive systems
Stable Product, Large Installed Base, Constant Growth
Analyst Opinions
Forrester 2011
SAP BW EDW and Reality: “60 TB Proof of Concept” on RDBMS (IBM DB2)
Discussions about corporate DWH architectures (EDW) are frequently driven by fears and prejudices. This results in vague questions like:
Can BW handle 30, 40, ..., 100 terabytes?
The answer:
SAP BW - 60 TB Proof of Concept
SAP NetWeaver BW Accelerator
SAP NetWeaver 7.0 Business Intelligence
BW Accelerator Query Run Time
[Diagram labels: BW Analytical Engine; Query & Response; InfoCube Indexing; Aggregation “on the fly”; Merging and results preparation for BI queries; Information]
(*) Property setting (“load index into main memory”) or schedule program RSDDTREX_INDEX_LOAD_UNLOAD
BWA Linear Scalability - Data Volume vs. Resources (25 TB Showcase 2009)
Total DB size | BWA resources | Index creation throughput | Multiuser reporting throughput | Avg. report response time | Avg. # records touched per report
5 TB | 27 blades | 0.6 TB/h | 100,000 reports/h | 4.5 sec | 6 M records
15 TB | 81 blades | 1.1 TB/h | 101,000 reports/h | 4.2 sec | 22 M records
25 TB | 135 blades | 1.2 TB/h | 101,000 reports/h | 4.2 sec | 37 M records
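The scalability claim can be checked with a little arithmetic. The sketch below assumes the slide pairs 5 TB with 27 blades, 15 TB with 81 blades and 25 TB with 135 blades; the variable and function names are illustrative only.

```python
# Showcase figures as read from the slide:
# (total DB size in TB, BWA blades, avg. report response time in seconds)
showcase = [
    (5, 27, 4.5),
    (15, 81, 4.2),
    (25, 135, 4.2),
]


def tb_per_blade(db_size_tb, blades):
    """Data volume that each blade has to serve."""
    return db_size_tb / blades


ratios = [round(tb_per_blade(size, blades), 3) for size, blades, _ in showcase]

for (size, blades, response), ratio in zip(showcase, ratios):
    print(f"{size:>2} TB on {blades:>3} blades: {ratio} TB per blade, {response} s")
```

Under that reading, the data volume per blade is the same in all three configurations while the average response time stays between 4.2 and 4.5 seconds.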
Bill Inmon’s Corporate Information Factory & Nearline Storage
[Diagram labels: DSS Applications; Departmental Data Marts (Acctg, Finance, Marketing, Sales); ERP; CRM; eComm.; Bus. Int.; Internet; Corporate Applications; Changed Data; Staging Area; ETL; EDW; Global ODS; local ODS; Oper. Mart; Exploration warehouse / data mining; Granularity Manager; Session Analysis; Dialogue Manager; Cookie Cognition; Preformatted dialogues; Web Logs; Cross media Storage Management; Near line Storage; Archives]
Source: Bill Inmon
Data-Aging Strategies for Volume Performance
Information lifecycle according to importance/age
Storage types: Online Database; Nearline Storage (read only); Classic Archive (read only)
Data categories:
• Frequently read / changed data (actual)
• Infrequently read data (mature)
• Very rarely read data (aged)
Key facts about SAP NLS
• NLS should be a part of an Information Lifecycle Management (ILM) strategy
• Based on well-established SAP / SAP BW archiving concepts
• Data archived in NLS can be incorporated into reporting
• Data consistency guaranteed before deleting the data from source
• High compression rate (up to 95%)
• Supports archiving of InfoCubes and DataStore Objects
• Saves storage costs and other system resources
• NLS is an application from a third party vendor, running on a separate system
• Mainly time-based archiving, yet can also be based on other characteristics
• Lock of the archived data slice in the original InfoProviders
• Process Chain support
• Increases retention period for analysis data
• Scheduling and Monitoring of archiving sessions from SAP BW system
• Copes with changes in the meta data to the BW objects of the archived data
• Included in the query statistic data collection (RSRT)
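Two of these facts, consistency before deletion and the lock on the archived slice, can be illustrated with a small sketch. This is not SAP code: `InfoProvider`, `archive_time_slice` and `load` are hypothetical names, and the in-memory lists merely stand in for the online database and the nearline store.

```python
from dataclasses import dataclass, field


@dataclass
class InfoProvider:
    """Hypothetical stand-in for an InfoCube or DataStore Object."""
    rows: list                                   # dicts with a "year" key
    locked_years: set = field(default_factory=set)


def archive_time_slice(provider, nearline, year):
    """Move one year to nearline storage; delete online only after verification."""
    to_move = [row for row in provider.rows if row["year"] == year]
    start = len(nearline)
    nearline.extend(to_move)                     # 1. copy phase
    if nearline[start:] != to_move:              # 2. verification phase
        raise RuntimeError("verification failed; online data left untouched")
    provider.locked_years.add(year)              # 3. lock the archived slice
    provider.rows = [row for row in provider.rows if row["year"] != year]  # 4. delete
    return len(to_move)


def load(provider, row):
    """Reject loads into a time slice that is already archived."""
    if row["year"] in provider.locked_years:
        raise ValueError(f"time slice {row['year']} is archived and locked")
    provider.rows.append(row)


cube = InfoProvider(rows=[{"year": 2008, "amount": 10}, {"year": 2011, "amount": 7}])
nls = []
moved = archive_time_slice(cube, nls, 2008)
```

After the call, the 2008 row lives only in `nls`, and a later `load` of a 2008 row into `cube` raises an error because that slice is locked.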
Evolution by SAP NetWeaver BW Releases
SAP NetWeaver BW 7.00
• Enhanced Look-Up API
• Suspension and selective continuation of archiving processes within Process Chains
• Restore of an archiving request with all successors
• Smaller Data Object size for ADK-based Nearline Solution without semantic grouping
SAP NetWeaver BW 7.01 (EhP1)
• Support of write-optimized DataStore Objects for ADK archiving and the Nearline-Storage interface
• Request-based archiving
• Enhanced status and job monitoring within InfoProvider management view
SAP NetWeaver BW 7.30
• Support for accessing Nearline-Storage data for MultiProviders
• Feature to allow archiving from uncompressed InfoCubes
• Archiving of Semantic Partitioned Objects (SPO) with SP1
• Automatic rebuild of BW Accelerator index possible
The Nearline Storage Solution for SAP NetWeaver BW
Based on the Nearline Storage Interface, development partners can implement their solutions for archiving and NLS into SAP BW.
3rd Party NLS Solutions
• are implemented within the SAP BW ABAP stack in partner-specific namespaces
• have to pass a certification process
• can offer a specific Application Area in the SAP Support Portal
• have to be licensed in addition to SAP licenses
• can have a different release cycle compared to SAP NetWeaver BW
Present development partners (in alphabetical order of their products), certified since SAP BW 7.0:
• CBW® – PBS Software: yes
• Dynamic NearLine Access® - SAND Technology: yes
• DB2 Viper 9.5® - IBM: 7.01 SP6
• DataVard OutBoard 1.0: yes
(see also http://www.sap.com/ecosystem/customers/directories/SearchSolution.epx )
Customer Adoption - BW Archiving and Nearline Storage
(based on 895 customer messages)
Data analysis and assistance for ROI analysis
• Sizing of Nearline Storage solutions:
– Hardware sizing of the Nearline Storage solution has to be done by the vendor
– Different Nearline Storage technologies are on the market: from database solutions, to file-based solutions, to column-based storage solutions
• Data volume services by SAP Active Global Support (AGS)
– http://service.sap.com/dvm
– Deliver a thorough analysis of BW objects distribution
– Can help in estimating the data volume that may be archived / transferred to NLS for the largest InfoProviders within the system
– Consider only “technical facts” (and not the customer’s “business requirements”)
Data Management with Nearline Storage
Implementation Aspects
1 Create a Data Archiving Process
2 Create and schedule archiving requests
3 Restore archiving requests
4 Load data to subsequent Data Targets
5 Look-up during Transformation
6 Query Settings
7 MultiProvider Settings
[Diagram labels: LSA; DataSource; InfoPackage; PSA; InfoSource; DTP; DAP; Data Acquisition Layer; Corporate Memory; Data Propagation Layer; Reporting Layer (Architected Data Marts); SAP Sales InfoCube; MultiProvider; Nearline Storage]
Design Aspects – Nearline Storage (NLS) vs. BW Accelerator (BWA)
[Diagram labels: BI; InfoMarts (InfoCube); Acquisition; BWA; Acceleration; RDBMS; Nearline Storage; ADK Archive; Archiving]
Access: very frequently, frequently, not frequently, rarely
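One way to read this slide is as a placement rule by access frequency. The sketch below assumes that the four access levels correspond, in order, to BWA, RDBMS, Nearline Storage and ADK archive; that pairing is an interpretation of the diagram, and the numeric thresholds are purely hypothetical.

```python
from enum import Enum


class Storage(Enum):
    BWA = "BW Accelerator"
    RDBMS = "online database"
    NLS = "nearline storage"
    ADK = "ADK archive"


def choose_storage(reads_per_month: float) -> Storage:
    """Pick a storage tier from access frequency (hypothetical thresholds)."""
    if reads_per_month >= 1000:
        return Storage.BWA      # accessed very frequently
    if reads_per_month >= 50:
        return Storage.RDBMS    # accessed frequently
    if reads_per_month >= 1:
        return Storage.NLS      # accessed not frequently
    return Storage.ADK          # accessed rarely


placement = {reads: choose_storage(reads) for reads in (5000, 120, 3, 0.1)}
```

In practice the thresholds would come from the customer's own usage statistics and retention requirements rather than from fixed numbers like these.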
Data Management at Query Runtime
The Data Manager identifies the availability of alternative data storage of any kind, such as:
1. Data resides in the InfoProvider in the database
2. Data resides in a classical Aggregate
3. Data resides in the BW Accelerator Index
4. Data resides in an NLS Partition
Aggregate Types
• BW Accelerator Index
• NLS Partition
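As a rough illustration of what such a storage decision could look like, the sketch below splits a query's time selection into an online part and a nearline part. It is not the Data Manager's actual logic: `plan_reads`, the year-based cut-off and the preference for the BWA index over the database are all assumptions made for the example, and classical aggregates are left out.

```python
def plan_reads(years, archived_before, has_bwa_index, read_nearline=True):
    """Return a list of (storage, years) parts needed to answer one query.

    years           -- years selected by the query
    archived_before -- hypothetical cut-off: older years sit in the NLS partition
    has_bwa_index   -- whether a BW Accelerator index exists for the online data
    read_nearline   -- whether the query is allowed to read from NLS
    """
    online = [year for year in years if year >= archived_before]
    nearline = [year for year in years if year < archived_before]

    parts = []
    if online:
        storage = "BW Accelerator index" if has_bwa_index else "InfoProvider in database"
        parts.append((storage, online))
    if nearline and read_nearline:
        parts.append(("NLS partition", nearline))
    return parts


plan = plan_reads([2007, 2008, 2010, 2011], archived_before=2009, has_bwa_index=True)
```

A query that touches only recent years never reaches the NLS partition in this sketch, which is the behaviour one would want for the frequently used online data.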
NLS Related MultiProvider Settings
Nearline read mode
• always disabled
• always enabled
• according to InfoProvider settings
MultiProvider: Query Runtime Statistics
Listing of Basis Providers and NLS
partitions used during Query execution
NLS Related Query Designer Settings
Reporting
Fixed NLS Settings
• read NLS
• do not read NLS
• see InfoProvider settings
NLS Related Query Designer Settings: Variable
Variable NLS Settings
(Dialog)
• read NLS
• do not read NLS
• see InfoProvider settings
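The three options offered in the Query Designer can be thought of as a small resolution rule. The following sketch only illustrates that idea; `NearlineRead` and `effective_nearline_read` are hypothetical names, and how BW actually combines the settings internally is not described on the slides.

```python
from enum import Enum


class NearlineRead(Enum):
    READ = "read NLS"
    DO_NOT_READ = "do not read NLS"
    SEE_INFOPROVIDER = "see InfoProvider settings"


def effective_nearline_read(query_setting: NearlineRead,
                            infoprovider_reads_nls: bool) -> bool:
    """Decide whether a query reads NLS data.

    Assumption: an explicit query setting wins, and only the third option
    defers to the setting maintained on the InfoProvider.
    """
    if query_setting is NearlineRead.READ:
        return True
    if query_setting is NearlineRead.DO_NOT_READ:
        return False
    return infoprovider_reads_nls


reads_nls = effective_nearline_read(NearlineRead.SEE_INFOPROVIDER,
                                    infoprovider_reads_nls=True)
```

With a variable setting, the same choice would simply be made by the user in the dialog at execution time instead of being fixed in the query definition.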
InfoCube: Archiving of Uncompressed Data
Central setting in Data Archiving Process (DAP)
• Valid for all archiving requests and DAP variants
• Can be changed during operation
• Prerequisite: only already processed requests (aggregates, Delta DTP)
Allow archiving for non-compressed data
Data Management at Archiving Runtime
During the delete phase of the archiving request, a rebuild of the BWA index is offered in the dialog.
BWA consistency reflected during DAP processing
Optimized Support for Navigational Attributes
Optimized support for navigational attributes during query processing on NLS
Navigational attributes are master data attributes that can be used to navigate/filter in queries. Master data attributes are located outside the InfoCube persistence in the extended star schema and thus are not a component of the NLS data stock.
Previous solution:
– Selections for navigational attributes were not transferred to NLS as selections …
– The attribute values were assigned subsequently and filtered in the result set
– Performance problems for highly selective attribute values
Improvement:
– Selections for navigational attributes are first converted to a selection on the characteristic bearing the attributes (max. 100 characteristic values)
– The attribute selection is replaced by this characteristic selection in the query selection.
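The improvement can be pictured as a lookup against master data before the NLS read. The sketch below is a simplified illustration, not the BW implementation: the function name and data layout are made up, and returning `None` above the 100-value limit (so that the caller filters the result set as before) is an assumption about the fallback.

```python
MAX_CHARACTERISTIC_VALUES = 100  # limit named on the slide


def convert_attribute_selection(master_data, attribute, wanted_values):
    """Translate a navigational-attribute filter into a characteristic filter.

    master_data maps each characteristic value to its attributes,
    e.g. {"C001": {"country": "DE"}}. Returns the sorted list of matching
    characteristic values, or None when there are too many to push down.
    """
    matches = sorted(
        key for key, attrs in master_data.items()
        if attrs.get(attribute) in wanted_values
    )
    if len(matches) > MAX_CHARACTERISTIC_VALUES:
        return None  # assumed fallback: filter in the result set instead
    return matches


customers = {
    "C001": {"country": "DE"},
    "C002": {"country": "FR"},
    "C003": {"country": "DE"},
}
selection = convert_attribute_selection(customers, "country", {"DE"})
```

Here the filter "country = DE" becomes the characteristic selection `["C001", "C003"]`, which can be evaluated inside the nearline store because it no longer depends on master data.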
DSO Lookup for “nearlined” Partitions
SAP NetWeaver BW 7.30 will come up with a separate transformation rule type, a DSO lookup.
If an NLS solution is attached to the BW system, the lookup will automatically read from both the “online” and “near-lined” data partitions.
Data Access within the APD
With SAP NetWeaver BW 7.30, the Analysis Process Designer will be enabled to read from Nearline Storage also for the source type “Read data from InfoProvider”.
Option to allow reading from NLS for InfoProvider sources
Reload data from both Online and Nearline partitions for InfoCubes
Option to extract data from both the Online and Nearline Partition in a single DTP
Transaction LISTCUBE
Read data from NLS combined
Archiving of Semantic Partitioned Objects
Facts:
• Semantic partitioning is possible for InfoCubes (standard InfoCubes only) and DSOs (standard and write-optimized).
• There is not one DAP per PartProvider but only one DAP for the entire SPO. As a consequence, the NLS system holds one set of tables / files per SPO rather than one set per PartProvider.
• The DAP itself has the same options / settings as for a regular InfoProvider. However, the DAP must contain the logical partitioning criterion as an additional archiving criterion so that data can be archived, reloaded, or restored for a dedicated semantic partition.
Semantic Partitioning criterion
Archiving of Semantic Partitioned Objects
Since archiving is not carried out per PartProvider, there is no “Archive” tab within the administration user interface. Instead, an archiving request can be scheduled by means of a dedicated / global button.
Maintain Archiving
Archiving of Semantic Partitioned Objects
An archiving request can be scheduled to archive data from all available partitions or only from a dedicated partition (which is equivalent to an archiving run restricted to that semantic partition).
Cross-partition archiving or only for a specific partition
Reading data from SPOs
Query
In SAP NetWeaver BW 7.30, data held in a Nearline Storage system can be read by a query that is directly flagged to read data from NLS (the query properties for reading NLS data no longer have to be maintained via transaction RSRT).
A query can be set to read or not to read data from an NLS. The same can also be specified at InfoProvider level, and that setting can be taken into consideration as well.
Summary and Outlook
Latest Enhancements
• Enhanced lookup support especially for temporal lookups (non-equal lookup conditions)
• Request-based archiving for InfoCubes (avoid compression before archiving) (BW 7.30)
• Combined DTP extraction from online and archive partition of an InfoCube (BW 7.30)
• Enhanced NLS support for Semantically Partitioned Objects (SPO) based on standard InfoCubes and standard DSOs (BW 7.30 SP 1). NLS support for SPOs based on write-optimized DSOs is available with SP3.
• NLS support for DSO lookup within transformations (DSO lookup feature to be released with SAP NetWeaver BW 7.30 with lookup for online data only)
• Master Data deletion to consider data within NLS
Medium term
• NLS support for BW 7.3 running on HANA In-Memory
• Physical deletion of NLS requests from the Nearline Storage (BW 7.30 SP5)
Long term
• Archiving of InfoCubes with non-cumulative key figures, as well as InfoSets and HybridProviders
• Archiving of master data and hierarchies
• Archiving with free selection criteria (not only time slice archiving)
Planned Roadmap HANA & SAP NetWeaver BW
Timeline: 2006, 2009, 2010, 2011, future direction
BW 7.0 / BWA 7.0
• Major release
• BW Accelerator
BW 7.0 EhP1 (7.01)
• New features and improvements across all components
• Go-to release for integration with SAP BusinessObjects BI
BW 7.3 / BWA 7.2
• Major step on Enterprise Data Warehousing scalability and flexibility
• BW Accelerator: additional performance
• Integration improvements with SAP BusinessObjects Data Services
HANA V1.0
• Real-time operational analytics on mass data
• Rapid creation of agile data marts
• Non-disruptive deployments of HANA side by side with ERP and/or BW
BW 7.3 SPnn
• BW running on HANA as the underlying In-Memory DB platform
• In-Memory for Enterprise Data Warehousing
• Integrated Planning In-Memory enabled
HANA V1.0 SPSnn
• Additional calculation capabilities
• Primary persistence layer under BW; eliminates need for separate database
• Models for SAP business content enabling new applications
Future direction: SAP NetWeaver BW evolving to a fully In-Memory enabled EDW solution on top of HANA
Data-Aging Strategies: Nearline Storage Only
Information lifecycle according to importance/age
Storage types: Online Database; Nearline Storage (read only); Classic Archive (read only)
Data categories:
• Frequently read / changed data (actual)
• Infrequently read data (mature)
• Very rarely read data (aged)
[Diagram label: Archive]
Current Situation
• Nearline Storage is the leading and only persistency
• No isolated Delete from Nearline Storage possible
• Workaround: Restore to Online Database and delete from there
Data-Aging Strategies: Classic Archive + Nearline Storage
Information lifecycle according to importance/age
Storage types: Online Database; Nearline Storage (read only); Classic Archive (read only)
Data categories:
• Frequently read / changed data (actual)
• Infrequently read data (mature)
• Very rarely read data (aged)
[Diagram label: Archive (ADK … + NLS)]
Current Situation
• ADK (Classic) Archive is the leading persistency
• Nearline Storage is filled from ADK Archive during Verification Phase
• Nearline Storage is strictly coupled to ADK Archive (no independent Delete)
Details for the planned NLS Deletion Features (for SAP BW 7.3, SP05)
1) Data resides in NLS only (without ADK)
• First step: "logical" deletion of NLS data (set NLS request to "Invalid"). The NLS status in the NLS archiving request list will be set to "Marked for Deletion" / "Deleted"
• NLS data will be deleted asynchronously using a clean-up job or (later) a Process Chain
• Time slices will remain locked
2) Data resides in NLS and ADK
• Request can only be deleted from NLS; data in ADK stays untouched
• ADK delete is not supported from the NLS dialog (see SAP Data Life Cycle / Retention concepts in ERP)
• Later restore from ADK to NLS supported
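The two-step deletion for the NLS-only case can be sketched as a small state machine. This is an illustration under assumptions, not the planned SAP implementation: the class, the status values other than those quoted on the slide, and the dictionary standing in for the nearline store are all hypothetical.

```python
from enum import Enum


class Status(Enum):
    ACTIVE = "Active"                        # hypothetical name for a valid request
    MARKED_FOR_DELETION = "Marked for Deletion"
    DELETED = "Deleted"


class NearlineRequest:
    def __init__(self, request_id, time_slice):
        self.request_id = request_id
        self.time_slice = time_slice
        self.status = Status.ACTIVE
        self.slice_locked = True             # archived time slices are locked

    def delete_logically(self):
        """Step 1: invalidate the request without touching the stored data."""
        if self.status is not Status.ACTIVE:
            raise ValueError(f"request {self.request_id} is already {self.status.value}")
        self.status = Status.MARKED_FOR_DELETION


def clean_up(requests, nearline_store):
    """Step 2: asynchronous clean-up job that physically removes marked requests."""
    removed = []
    for request in requests:
        if request.status is Status.MARKED_FOR_DELETION:
            nearline_store.pop(request.request_id, None)
            request.status = Status.DELETED
            removed.append(request.request_id)
            # The time slice stays locked even after physical deletion.
    return removed


store = {"R1": ["rows of 2008"], "R2": ["rows of 2009"]}
r1, r2 = NearlineRequest("R1", 2008), NearlineRequest("R2", 2009)
r1.delete_logically()
removed = clean_up([r1, r2], store)
```

Separating the logical step from the physical clean-up means the dialog returns immediately, while the potentially long-running deletion in the nearline system happens later in a job.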
Data resides in NLS (only)
(Final) Deletion of Nearline Request
Data resides in NLS only
Three alternatives lead to Nearline Request status "Deleted":
• Finally Deleted from NLS (after successful archiving)
• Restored (deleted from NLS but stored in Online-DB again)
• Invalidated (never deleted from Online-DB)
Data resides in ADK and NLS
Restore deleted Nearline Request from ADK
Data resides in ADK and NLS
New Nearline Request after Restore from ADK
Thank You!
Contact information:
rainer.uhle@sap.com
SAP NW BW PM
SAP AG - Walldorf