Data Management in Enterprise Apps: Some Perspectives
Dr. Vishal Sikka
Chief Software Architect
SAP AG

- A Brief Introduction to SAP and Data Management in Our Applications
- The Current Situation: Some Existing and Emerging “Divides”
- Our Approach to Two of These Divides
- The Lessons Learned and Some Open Problems

SAP at a Glance
Who we are
- Founded in 1972
- 2005 revenues: €8.5 billion
- 34,600+ customers
- 37,500+ employees
- 12+ million users in 120 countries
- 1,600+ partners
What we do
- Largest enterprise applications company in the world
- Serve most back-end and front-end business processes
- Leader in ERP, CRM, SCM, …
- Leading platform to build and run apps on
- 25+ industry solutions
[Diagram labels: Mobile, Duet, RSS, Embedded, Voice, Portal, Project Muse, RFID, Widgets, Forms, SAP GUI, Dashboards; SAP Composites, Other Composites; CRM, SRM, SCM, PLM, mySAP ERP; SAP NetWeaver; Home Grown / ISV, SAP, Legacy; Biz partner Infrastructure; Industry Standards; Data Infrastructure]
SAP AG 2006, SAP Tech Ed 2006 - Shai Agassi / 3

Our data management requirements are massive
- SAP for Engineering & Construction: customer with 5,000 concurrent active users
- SAP NetWeaver Portal: customer with 300,000 users (20,000 concurrent)
- SAP for Utilities: 25 million business partners; 85 million service and sales orders per year
- mySAP ERP HCM: customer with payroll calculations for 500,000 employees in 3 hours
- mySAP ERP: a customer with 5 users on a laptop
- SAP for Consumer Products: customer with 1.4 million sales order line items per day
- mySAP SCM: customer with 4.5 million characteristic combinations and 512 GB memory in liveCache
- SAP NetWeaver BI: customer with a 40 TB live database; average DB size of the top 10 live BI customers: 5.5 TB
[Diagram label: mySAP Business Suite]

Data Management from SAP’s Perspective
- There is >10 PB of transactional and analytical data processed by SAP apps worldwide
- We are the largest applications consumer and reseller of data worldwide
- Our data is of many different types, shapes and sizes: transactional, analytical, text/unstructured, master, events, …
- Master data has different requirements and different optimizations
- Significant need for deriving value from this data

Data through the SAP Lens: “Not All Data Is Alike”
Progression over time:
- Transactional data: order ~100 GB; write > read; many changes; accurate; consistent; performance; all back-end apps
- Analytical data: order > TB; read only; slow changes; many queries; flexibility; performance
- Master data: order ~1 GB; mostly read; mid change; many queries; distributed
- Event data: order < TB; many writes; few queries; distributed; filtering; correlation
- Textual and unstructured data: order > TB; mostly read; slow change; many queries; unstructured; contextual

3-tier C/S Architecture of Basis: Our Application Server
[Diagram labels: Presentation (SAP GUI clients); Application (Dispatcher, Work Processes, each with a Roll Area; Buffers); Database (Database Management System, Database)]

Memory Management in Basis outside the DBMS
[Diagram labels: SAP UI clients; Application Server 1 … Application Server n, each with Request Queue, Dispatcher, Work Process 1 … Work Process n, Shared Memory and Buffers; Enqueue Process, Enqueue Table; Database Management System, Database]
Buffers in the application server significantly improve performance. In a classical 3-tier system, network round trips reduced the benefit of the DBMS cache, while TCO optimization required a single DB to serve 10+ app servers.
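As a rough illustration of why an application-server buffer helps, the sketch below puts a read-through buffer in front of a stand-in database and counts the round trips it saves. It is not the Basis implementation: `TableBuffer`, `load_from_db`, and the `DATABASE` dictionary are hypothetical names introduced only for this example.

```python
class TableBuffer:
    """Read-through buffer kept in application-server memory (hypothetical sketch)."""

    def __init__(self, load_from_db):
        self._load_from_db = load_from_db  # callable that performs one DB round trip
        self._rows = {}                    # buffered rows, keyed by primary key
        self.db_round_trips = 0

    def read(self, key):
        # Only a buffer miss costs a round trip to the database server.
        if key not in self._rows:
            self.db_round_trips += 1
            self._rows[key] = self._load_from_db(key)
        return self._rows[key]

    def invalidate(self, key):
        # Called after a write so the next read fetches the fresh row.
        self._rows.pop(key, None)


# Stand-in for a table in the central database (assumption for the example).
DATABASE = {"DE": "Germany", "US": "United States"}

buffer = TableBuffer(DATABASE.__getitem__)
names = [buffer.read("DE") for _ in range(1000)]
print(buffer.db_round_trips)  # 1000 reads, a single database round trip
```

The buffer trades freshness for speed, which is why an explicit `invalidate` step is part of the sketch: with many application servers sharing one database, each server's buffer has to be told when a row changes.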
Application-level locking (Enqueue and Application LUW) compensates for the DBMS's lack of fine-grained locking and of the transaction support that application servers need (multiple users accessing the same DB, complex screen processing with workflow on the front end).
Numerous other optimizations and DB abstractions.

Bringing Data Closer to Applications: SAP liveCache
- liveCache is a main-memory DB component used in SAP SCM’s APO
- Rapid Planning Matrix in the automotive industry
  - Common ERP system: plan the manufacturing of 20,000 cars / day
  - Needed volumes are much higher
  - liveCache enables planning 500,000 cars / hour
- Demand Planning (DP)
  - Interactive planning: 10x performance gain compared to a DB-based solution
  - Consistent storage of data (no need for aggregation/disaggregation batch jobs)
- Production Planning (PP/DS)
  - Performance gain of 15x in rescheduling production runs and DS heuristics
  - Data volume 5x higher in the planning board compared to a common ERP system
- Consolidation of data structures via generic liveCache data types
  - E.g. one order data type with multiple attributes instead of a few dozen different specific order types in ERP
- Bringing development teams closer together
  - The liveCache applications team bridges technology knowledge with business process knowledge by working with the application team on the usage of liveCache, as well as on the optimization of business logic
  - Common team working together for several years
- 3000+ happy deployments

- A Brief Introduction to SAP and Data Management in Our Applications
- The Current Situation: Some Existing and Emerging “Divides”
- Our Approach to Some of These Divides
- The Lessons Learned and Some Open Problems

New needs: Innovate, Be flexible, Stay high-performant
“Once my system is up and running, you, SAP, can touch my core processes once every 5 years ...
and it needs to be a Saturday … and my CEO wants me to innovate every quarter”
CIO, Fortune 1000 Manufacturing Company

New requirements, New “divides”
- More decoupled business processes
- More visible physical-digital divide
  - Infrastructure subjected to much higher volumes (events, sensors, …)
- Greater need for in-context usage
  - Multiple UIs
- More visible work-personal divide
  - Users are a lot more used to search; lack of structure is academic to them
- Different requirements on front-end than on back-end
  - e.g. easier front-end application composition
- Many more deployment options
- Greater flexibility
  - Easy integration, better component semantics
- New application architectures are necessary: SOA is the biggest component, but there are others
[Diagram labels: Mobile, Duet, RSS, Forms, Embedded, Project Muse, Widgets, RFID, Voice, Portal, SAP GUI, Dashboards; SAP Composites, Other Composites; CRM, SRM, SCM, PLM, mySAP ERP; SAP NetWeaver; Home Grown / ISV, SAP, Legacy; Biz partner Infrastructure; Industry Standards; Data Infrastructure]

Technology Shifts
Architectural shift (1990 -> 2006)
- Disk-based data storage -> In-memory data stores
- Simple consumption of applications (fat client UI, EDI) -> Multi-channel UI, high event volume, cross-industry value chains
- General-purpose, application-agnostic database -> Application-aware and intelligent data management
Technology drivers
Driver               1990          2006           Improvement
CPU                  0.05 MIPS/$   7.15 MIPS/$    143x
Memory               0.02 MB/$     5 MB/$         250x
Addressable memory   16 bits       64 bits        2^48x
Network speed        100 Mbps      10 Gbps        100x
Disk speed           5 kilo RPM    15 kilo RPM    3x

- A Brief Introduction to SAP and Data Management in Our Applications
- The Current Situation: Some Existing and Emerging “Divides”
- Our Approach to Two of These Divides
- The Lessons Learned and Some Open Problems

Addressing DB Architecture Gap: SAP BI Accelerator
[Diagram labels: any source, any tool, legacy]
- Performance: 1 billion
  records analyzed in 3 seconds
- Delivery: off-the-shelf hardware, appliance setup
- Predictability: consistent response, no tuning, fast load
- Integration: built for and closely integrated with SAP NW BI

Addressing DB Architecture Gap: SAP BI Accelerator
- Performance: 1 billion records analyzed in 3 seconds
- Affordability: off-the-shelf hardware, appliance setup
- Agility: consistent response, no tuning, fast load
- Integration: closely integrated with SAP BI

BI Accelerator Key Technology
- Main memory technology
  - Inspired by text search
  - On-the-fly aggregation
  - L2 cache miss optimization
  - Scalability by adding blades
- Column-based data structures
  - Highly compressed: dictionary-based, Golomb, sparse, ...
  - Fast updates with write-optimized delta mechanism
  - Compressed data structures for read access
- Parallel and distributed execution engine
  - Distributed joins, horizontal table split
  - Intelligent partitioning (along join paths)
  - Data distribution optimizer
- Model-based data layer
  - Exploit data model for performance optimization and data distribution
[Diagram labels: SAP BI Application Server, Database Server, SAP BI Accelerator, Storage subsystem]

Key Benefits
- Predictable (near constant) query response time
  - Query execution shifted from DB to BI Accelerator
  - Fast in-memory full table scans guarantee stable response times
  - Column-based data structures support fast joins
  - Intelligent partitioning and data distribution allow massive parallelization
- Reduced maintenance costs
  - Simplified cube modeling (normalization for semantic reasons only)
  - No more aggregates (or aggregate administration)
  - Less need for DB optimization
- Reduced hardware costs
  - Commodity hardware (blades) with standard equipment
  - Linear scalability with number of processors / cores
  - Use of blade infrastructure instead of big SMP box
  - Packaged as an appliance
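To make the column-based ideas on the Key Technology slide more concrete, here is a minimal sketch of a dictionary-encoded column with a write-optimized delta, plus an aggregation computed on the fly by a full scan. It is an illustration under simplifying assumptions, not the BI Accelerator's implementation: the names `DictColumn`, `merge_delta`, and `aggregate` are hypothetical, the Golomb and sparse encodings are left out, and everything runs in a single process rather than across blades.

```python
from collections import defaultdict


class DictColumn:
    """One column: dictionary-encoded main part plus an append-only delta."""

    def __init__(self):
        self.dictionary = []  # sorted distinct values of the main part
        self.codes = []       # one small integer per row of the main part
        self.delta = []       # recent writes, kept uncompressed

    def append(self, value):
        # Writes go to the delta only, so the compressed part is never rewritten.
        self.delta.append(value)

    def merge_delta(self):
        # Periodically fold the delta into a freshly encoded main part.
        values = self.values()
        self.dictionary = sorted(set(values))
        position = {v: i for i, v in enumerate(self.dictionary)}
        self.codes = [position[v] for v in values]
        self.delta = []

    def values(self):
        # Decode every row: main part first, then the delta.
        return [self.dictionary[c] for c in self.codes] + list(self.delta)


def aggregate(group_col, measure_col):
    """Full scan of two columns, summing the measure per group on the fly."""
    totals = defaultdict(int)
    for group, amount in zip(group_col.values(), measure_col.values()):
        totals[group] += amount
    return dict(totals)


region, revenue = DictColumn(), DictColumn()
for r, v in [("EMEA", 10), ("APJ", 5), ("EMEA", 7)]:
    region.append(r)
    revenue.append(v)
region.merge_delta()
revenue.merge_delta()

region.append("APJ")  # a new row arrives and lands in the delta
revenue.append(3)

print(aggregate(region, revenue))  # {'EMEA': 17, 'APJ': 8}
```

Because the sum is computed at query time from the columns themselves, no precomputed aggregate has to be maintained, which is the point behind "no more aggregates" on the Key Benefits slide.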
SAP Enterprise Search
- Search in the enterprise
  - Business objects
- Business context awareness
  - Role
  - Authorizations, compliance
  - Current work context
- Graceful degradation with decreasing structure
- Multiple clients
  - Stand-alone and embedded into applications
- Integration into non-SAP sources
- SAP Enterprise Search is a stand-alone business search xApp and a framework for search as a service
[Diagram labels: Portal, Desktop, Devices, Office; SAP Enterprise Search; SAP NetWeaver Business Process Platform; Desktop Search Service, Internet Search Service; Search, Indexing; 3rd party, mySAP Bus. Suite, Documents, R/3 via BAPIs]

SAP Enterprise Search
- Access more information from any place
  - Get the right answer to enterprise questions anywhere, anytime
  - Access data from your workplace or mobile device
- Simple to use: open to everyone
  - Pre-built common queries
  - Smart context
- Better answers: leverage context information and metadata
  - Support targeted search for object types
  - Enhance search and displays with contextual metadata: related queries, object scoping
- Go deep: find the right information across all your sources
  - Penetrate entire corporate data sources including
  - Search for documents and business objects simultaneously
  - Ensure service-oriented, multi-device scalable operation
- Reach out: embed search into everyday tools
  - Design simple search front ends suited to the respective devices, including Portal, Desktop, SMS, e-mail, mobile

The Argo Widget

Enterprise Search Example

Enterprise Search Example (Cont’d)

- A Brief Introduction to SAP and Data Management in Our Applications
- The Current Situation: Some Existing and Emerging “Divides”
- Our Approach to Some of These Divides
- The Lessons Learned and Some Open Problems

Master Data Management
Characterized by business
entities with:
- Multiple data models
- Multiple application sources
Reference models:
- Single logical model
- Multiple physical models
Access characteristics:
- Master source of truth: serves as reference data; few systems write; many systems read
- Data federation: no single source of truth
- 360° view of data: full analytics view; full operational view
Master Data Management Architecture
[Diagram labels: MDM Application Services; Quality, Visibility, Meta-data, Validation, Analytics, Governance; Unified Data Management Layer; Distributed Query; Multiple Data Source Management; Data Mappings; Connectivity Fabric; Legacy Data, Unstructured Data, Structured Data, Events, Services]

Event Processing
Characterized by:
- Continuous streams of near real-time data
  - Event streams, data (IN)
  - Significant main memory processing
  - Continuous evaluation of rules
- Edge devices as data producers (RFID, sensor data) generate a significant number of events and orders-of-magnitude more data, e.g., shop floor sensor devices
- Large volume of event data dictates pre-processing for consumption
- Events externalized non-invasively for several forms of consumption
- Automatic correlation and context determination of business events

Event Management
- High data flow rate and large volume need parallel processing
[Diagram labels: Input Streams, Business Events; Filters; Correlation Engine, Correlation Rules, Event Memory/Storage; Output Streams, Response; Actions, Query Results, BI/Reports, Alerts]

Lessons Learned
- It’s not the technology, stupid.
  - Application perspectives provide grounding for data management.
  - So learn what the apps’ needs are.
- One size does not fit all.
  - Applications’ data mgmt needs are changing, and this requires a rethink of data mgmt architecture.
  - So let’s go rethink data mgmt for the enterprise.
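The Event Management slide names filters, a correlation engine, correlation rules, and event memory. The sketch below shows one way those pieces could fit together for a single rule: raise an alert when the same source reports a given number of fault events within a time window. The rule, the event fields (`time`, `source`, `type`), and the names `filter_events` and `CorrelationEngine` are assumptions made up for this illustration. The sketch is single-threaded and expects events in time order, whereas the slide calls for parallel processing at real volumes.

```python
from collections import defaultdict, deque


def filter_events(events, wanted_types):
    """Filter stage: drop event types that no correlation rule cares about."""
    return (e for e in events if e["type"] in wanted_types)


class CorrelationEngine:
    """Alert when one source reports `threshold` events within `window` seconds."""

    def __init__(self, threshold, window):
        self.threshold = threshold
        self.window = window
        self.memory = defaultdict(deque)  # event memory: recent timestamps per source

    def process(self, event):
        recent = self.memory[event["source"]]
        recent.append(event["time"])
        # Forget timestamps that have fallen out of the correlation window.
        while recent and event["time"] - recent[0] > self.window:
            recent.popleft()
        if len(recent) >= self.threshold:
            recent.clear()  # start counting afresh after an alert
            return {"alert": "repeated_fault", "source": event["source"], "time": event["time"]}
        return None


# Hypothetical shop-floor input stream, ordered by time.
stream = [
    {"time": 0, "source": "press-1", "type": "fault"},
    {"time": 2, "source": "press-1", "type": "heartbeat"},
    {"time": 4, "source": "press-1", "type": "fault"},
    {"time": 5, "source": "press-2", "type": "fault"},
    {"time": 9, "source": "press-1", "type": "fault"},
    {"time": 30, "source": "press-2", "type": "fault"},
]

engine = CorrelationEngine(threshold=3, window=10)
alerts = []
for event in filter_events(stream, {"fault"}):
    alert = engine.process(event)
    if alert is not None:
        alerts.append(alert)

print(alerts)  # one alert, for press-1 at time 9
```

Filtering before correlation keeps the event memory small, which matches the slide's point that large event volumes dictate pre-processing before consumption.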