Bringing Value of Big Data to Business: SAP's Integrated Strategy

Bringing Value of Big Data to Business:
SAP's Integrated Strategy [1]
Group 6 - Ziqi Fan, Sheng Chen
SAP’s Integrated Big Data Strategy
• SAP is attempting to create an integrated approach
that allows companies to perform all of the following
operations in one environment:
– Perform analytics;
– Make big data operational;
– Support applications for high-resolution
management.
Architecture Vision of SAP’s Integrated
Big Data
• SAP HANA, an in-memory database, is the key to
SAP’s integrated strategy.
• HANA DB takes advantage of the low cost of main
memory (RAM), the data processing abilities of multi-core
processors, and the fast data access of solid-state drives
relative to traditional hard drives to deliver better
performance for analytical and transactional applications.
• It offers a multi-engine query processing environment
that supports relational data as well as graph and text
processing for semi-structured and unstructured data
management within the same system.
• HANA DB is 100% ACID compliant.
Main-Memory DB Query Optimization [3]
• Logical Optimization
– Almost the same as in a conventional database
• Physical Optimization
– Goal: minimize execution costs with respect to a
given cost model
– Quite different from a conventional database,
because I/O is no longer the dominant cost factor
• A “simple” cost model
$T = T_{Mem} + T_{CPU}$
• CPU Cost
$T_{CPU} = c_0 + c_1 \cdot n + c_2 \cdot m$
$c_0$ - fixed startup costs
$c_1$ - per-tuple costs for processing input tuples
$c_2$ - per-tuple costs for producing output tuples
$n$ - number of input tuples
$m$ - number of output tuples
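The CPU cost formula above can be evaluated directly. The following is a minimal sketch; the function name `cpu_cost` and the constant values (taken here to be nanoseconds) are hypothetical and chosen only for illustration, since real values would have to be calibrated per operator and machine.

```python
def cpu_cost(c0: float, c1: float, c2: float, n: int, m: int) -> float:
    """Evaluate T_CPU = c0 + c1*n + c2*m.

    c0: fixed startup cost
    c1: cost per input tuple processed
    c2: cost per output tuple produced
    n:  number of input tuples
    m:  number of output tuples
    """
    return c0 + c1 * n + c2 * m


# Hypothetical example: a selection that reads 1,000,000 tuples
# and emits 10,000 of them (all constants are made-up values).
selection_cost = cpu_cost(c0=500.0, c1=2.0, c2=4.0, n=1_000_000, m=10_000)
print(selection_cost)
```

Because the model is linear in n and m, the per-tuple constants dominate for large inputs and the startup cost c0 matters only for very small ones.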
• Memory Access Cost
$M_i^s$ - number of cache misses at level i for sequential access
$M_i^r$ - number of cache misses at level i for random access
$l_i^s$ - cache latency of level i for sequential access
$l_i^r$ - cache latency of level i for random access
Estimating $M_i^s$ and $M_i^r$ is very difficult!
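The definitions above suggest that the memory access cost sums, over all cache levels i, the number of misses times the corresponding latency, i.e. $T_{Mem} = \sum_i (M_i^s \cdot l_i^s + M_i^r \cdot l_i^r)$. That form is an assumption inferred from the symbol definitions, since the formula itself is not reproduced here. A minimal sketch under that assumption follows; the names `CacheLevel`, `memory_cost`, and `total_cost`, and all numeric values, are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class CacheLevel:
    seq_misses: int      # M_i^s: misses at this level, sequential access
    rand_misses: int     # M_i^r: misses at this level, random access
    seq_latency: float   # l_i^s: latency per sequential miss
    rand_latency: float  # l_i^r: latency per random miss


def memory_cost(levels: Iterable[CacheLevel]) -> float:
    """Assumed form: T_Mem = sum_i (M_i^s * l_i^s + M_i^r * l_i^r)."""
    return sum(
        lvl.seq_misses * lvl.seq_latency + lvl.rand_misses * lvl.rand_latency
        for lvl in levels
    )


def total_cost(levels: Iterable[CacheLevel], t_cpu: float) -> float:
    """T = T_Mem + T_CPU, the "simple" cost model."""
    return memory_cost(levels) + t_cpu


# Hypothetical two-level hierarchy (made-up miss counts and latencies).
hierarchy = [
    CacheLevel(seq_misses=1000, rand_misses=200, seq_latency=5.0, rand_latency=10.0),
    CacheLevel(seq_misses=100, rand_misses=50, seq_latency=50.0, rand_latency=100.0),
]
print(memory_cost(hierarchy))
print(total_cost(hierarchy, t_cpu=3000.0))
```

The latencies can be measured once per machine; the miss counts depend on the operator and its data, which is why estimating them is the hard part.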
• Basic Access Patterns
– single sequential traversal
– repetitive sequential traversal
– single random traversal
– random access
– etc.
• Compound Access Patterns
– Nested-loop join
– Hash join
– etc.
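To illustrate how a compound pattern could be built from basic ones, the sketch below estimates cache misses for a nested-loop join as one single sequential traversal of the outer input plus a repetitive sequential traversal of the inner input. It is a deliberately simplified model, not the one from [3]. It assumes a single cache level, that a region larger than the cache is fully evicted between passes, and that the two inputs do not interfere in the cache. All function names and parameter values are hypothetical.

```python
def seq_traversal_misses(n_items: int, item_bytes: int, line_bytes: int) -> int:
    """Single sequential traversal: one miss per cache line touched."""
    total_bytes = n_items * item_bytes
    return (total_bytes + line_bytes - 1) // line_bytes  # integer ceiling


def repeated_seq_traversal_misses(
    n_items: int, item_bytes: int, line_bytes: int,
    cache_bytes: int, repetitions: int,
) -> int:
    """Repetitive sequential traversal.

    Simplifying assumption: if the region fits in the cache, only the
    first pass misses; otherwise every pass misses on every line.
    """
    one_pass = seq_traversal_misses(n_items, item_bytes, line_bytes)
    if n_items * item_bytes <= cache_bytes:
        return one_pass
    return repetitions * one_pass


def nested_loop_join_misses(
    outer_items: int, inner_items: int, item_bytes: int,
    line_bytes: int, cache_bytes: int,
) -> int:
    """Compound pattern: outer scanned once, inner scanned once per outer item."""
    outer = seq_traversal_misses(outer_items, item_bytes, line_bytes)
    inner = repeated_seq_traversal_misses(
        inner_items, item_bytes, line_bytes, cache_bytes, repetitions=outer_items
    )
    return outer + inner


# Hypothetical setup: 8-byte items, 64-byte cache lines, 32 KiB cache.
print(nested_loop_join_misses(1000, 100_000, 8, 64, 32 * 1024))  # inner exceeds cache
print(nested_loop_join_misses(1000, 1000, 8, 64, 32 * 1024))     # inner fits in cache
```

The gap between the two calls shows why the physical optimizer cares whether the inner input fits in the cache: under these assumptions the miss count, and with it the memory access cost, changes by several orders of magnitude.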
• [1] Dan Woods, “Bringing Value of Big Data to Business: SAP's
Integrated Strategy”, Forbes, 01/05/2012
• [2]
• [3] S. Manegold, “Understanding, Modeling, and Improving
Main-Memory Database Performance”, SIKS Dissertation Series
No. 2002-17, ISBN 90 6196 5179, pp. 71-104