AN OVERVIEW OF BUSINESS INTELLIGENCE TECHNOLOGY
Source: Communications of the ACM, Vol. 54 No. 8
Surajit Chaudhuri, Umeshwar Dayal, Vivek Narasayya
Presented by: Sneha Maniraju, Prince Bajracharya

MOTIVATION
In the Sea of data, Business Intelligence tells you where the waves are.

SUMMARY
• BI Software Market
• Applications
• End-to-end process of how BI works

BI SOFTWARE MARKET
• 4 Giants: SAP, Oracle, IBM, Microsoft
• “Worldwide sales of business intelligence software grew a hearty 22% in 2008, according to Gartner, proving that many companies see BI as a good investment during tough economic times”
• “BI Software Market to reach nearly $13 Billion in revenue by 2014”

APPLICATIONS
• RISK MANAGEMENT
• MAXIMIZING PROFITS
• CUSTOMERS
• HEALTH CARE

Typical BI Architecture

DATA SOURCES
• STRUCTURED
• UNSTRUCTURED

Data Movement, Streaming Engine
• Extract-Transform-Load (ETL): refers to a collection of tools that play a crucial role in helping discover and correct data quality issues and efficiently load large volumes of data into the warehouse.
• Data Quality: find errors, inconsistencies, and missing information
• Data Profiling tool: check for duplicate keys, find keys
• Extracting Structures: parse strings to separate attribute values
• Deduplication: remove duplicate entries
• Data load and refresh: efficiently capture the data to be moved and move it to the warehouse
• Complex Event Processing (CEP): engines that support BI tasks in near real time, that is, make business decisions based on the operational data itself

DATA WAREHOUSE SERVERS
• Relational DBMS
  • Executes complex SQL queries as efficiently as possible against very large databases
  • Query optimization: takes a complex query and compiles it into an execution plan. The execution plan is a composition of physical operators (such as Index Scan, Hash Join, Sort) that, when evaluated, generates the results of the query.
  • To ensure that the execution plan scales well to large databases, data partitioning and parallel query processing are used.

DATA WAREHOUSE SERVERS (continued)
• MapReduce engine:
  • A MapReduce job can be executed directly on schema-less input files
  • The engine automatically handles important issues such as data partitioning, node failures, managing the flow of data across nodes, and heterogeneity of nodes
  • There have been recent efforts to develop engines that take a SQL-like query and automatically compile it into a sequence of jobs on a MapReduce engine

MID-TIER SERVERS
Mid-tier servers provide specialized functionality for different BI scenarios
• Online analytic processing (OLAP) servers efficiently expose the multidimensional view of data to applications or users and enable common BI operations such as filtering, aggregation, drill-down, and pivoting
• In-memory BI engines are appearing that exploit today’s large main memory sizes to dramatically improve the performance of multidimensional queries (no I/O overhead)

Student Name      Exam                 Result
John Collins      Database             70
John Collins      Programming          72
John Collins      Operating Systems    60
Larry Wall        Database             80
Larry Wall        Programming          99
Larry Wall        Operating Systems    70
Linus Torvalds    Database             80
Linus Torvalds    Programming          90
Linus Torvalds    Operating Systems    9

MID-TIER SERVERS (continued)
• Reporting servers enable the definition, efficient execution, and rendering of reports; for example, report total sales by region for this year and compare them with sales from last year
• Enterprise search engines support the keyword search paradigm over text and structured data in the warehouse
• Data mining engines enable in-depth analysis of data that goes well beyond what is offered by OLAP or reporting servers, and provide the ability to build predictive models to help answer business questions

FRONT-END APPLICATIONS

Relationship to Course
• Data Warehouses
• Online Analytical Processing (OLAP)
• Data mining

References
• http://cacm.acm.org/magazines/2011/8/114953-an-overview-of-business-intelligence-technology/pdf
• http://www.computerweekly.com/Articles/2009/07/30/237111/Using-business-intelligence-to-steer-through-the-recession.htm
• http://www.cio.com/article/375365/Gas_Prices_How_Oil_Companies_Use_Business_Intelligence_To_Maximize_Profits?page=2&taxonomyId=3002
• http://www.panorama.com/industry-news/articleview.html?name=Business-intelligence-in-healthcare-579093
• http://www.zaptechnology.com/solutions/microsoftcrm/microsoft-crm-capabilities.asp
• http://mohamednabeel.blogspot.com/2011/03/starting-subsandwitch-business.html
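
Appendix: OLAP operations on the exam-results table

The mid-tier slide lists filtering, aggregation, and pivoting as common OLAP operations and shows a flat Student Name / Exam / Result table. The sketch below illustrates those three operations on that table in plain Python. It is only an illustration, not how an OLAP server is implemented: the function names `pivot`, `aggregate`, and `filter_rows` are hypothetical, averaging is an assumed choice of aggregate, and every database exam is assumed to be spelled "Database".

```python
from collections import defaultdict

# Rows copied from the exam-results table: (student, exam, result).
ROWS = [
    ("John Collins", "Database", 70),
    ("John Collins", "Programming", 72),
    ("John Collins", "Operating Systems", 60),
    ("Larry Wall", "Database", 80),
    ("Larry Wall", "Programming", 99),
    ("Larry Wall", "Operating Systems", 70),
    ("Linus Torvalds", "Database", 80),
    ("Linus Torvalds", "Programming", 90),
    ("Linus Torvalds", "Operating Systems", 9),
]


def pivot(rows):
    """Pivoting: turn flat rows into a student-by-exam grid."""
    grid = defaultdict(dict)
    for student, exam, result in rows:
        grid[student][exam] = result
    return dict(grid)


def aggregate(rows, dim):
    """Aggregation: average result per value of one dimension.

    dim = 0 groups by student, dim = 1 groups by exam.
    """
    groups = defaultdict(list)
    for row in rows:
        groups[row[dim]].append(row[2])
    return {key: sum(vals) / len(vals) for key, vals in groups.items()}


def filter_rows(rows, exam):
    """Filtering: keep only the rows for one exam."""
    return [row for row in rows if row[1] == exam]


if __name__ == "__main__":
    for student, exams in pivot(ROWS).items():
        print(student, exams)
    print(aggregate(ROWS, dim=1))
    print(filter_rows(ROWS, "Programming"))
```

Grouping by exam and then looking at the rows behind one exam is a small-scale version of drill-down: the aggregate gives the summary, and the filter exposes the detail underneath it.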