AN OVERVIEW OF BUSINESS INTELLIGENCE TECHNOLOGY
Source:
Communications of the ACM, Vol. 54 No. 8
Surajit Chaudhuri, Umeshwar Dayal, Vivek Narasayya
Presented by:
Sneha Maniraju
Prince Bajracharya
MOTIVATION
In the sea of data, Business Intelligence tells you where the waves are.
SUMMARY
• BI Software Market
• Applications
• End-to-end process of how BI works
BI SOFTWARE MARKET
• 4 Giants: SAP, Oracle, IBM, Microsoft
• “Worldwide sales of business intelligence software grew a hearty 22% in 2008, according to Gartner, proving that many companies see BI as a good investment during tough economic times”
• “BI Software Market to reach nearly $13 Billion in revenue by 2014”
APPLICATIONS
• RISK MANAGEMENT
• MAXIMIZING PROFITS
• CUSTOMERS
• HEALTH CARE
Typical BI Architecture
DATA SOURCES
STRUCTURED
UNSTRUCTURED
Data Movement, Streaming Engine
• Extract-Transform-Load (ETL): a collection of tools that play a crucial role in discovering and correcting data quality issues and in efficiently loading large volumes of data into the warehouse.
  • Data quality: find errors, inconsistencies, and missing information
  • Data profiling tools: check for duplicate keys and discover candidate keys
  • Extracting structure: parse strings to separate attribute values
  • Deduplication: remove duplicate entries
  • Data load and refresh: efficiently capture the data to be moved and move it to the warehouse
• Complex Event Processing (CEP): engines that support BI tasks in near real time, that is, making business decisions based on the operational data itself
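The ETL steps listed above can be sketched in a few lines of Python. This is a minimal illustration only, not how any particular ETL tool works: the raw record layout, the `ADDRESS` pattern, and the function names `extract`, `transform`, and `load` are all assumptions made for the example.

```python
import re

# Hypothetical raw records from a source system: "id;name;address".
RAW_ROWS = [
    "1; John Collins ; Seattle, WA 98101",
    "2;Larry Wall;Portland, OR 97201",
    "2;Larry Wall;Portland, OR 97201",  # duplicate key
    "3;Linus Torvalds;",                # missing address
]

# Assumed address layout: "City, ST 12345".
ADDRESS = re.compile(r"^(?P<city>[^,]+),\s*(?P<state>[A-Z]{2})\s+(?P<zip>\d{5})$")


def extract(rows):
    """Split each raw line into id, name, and address fields."""
    for line in rows:
        cust_id, name, address = (part.strip() for part in line.split(";"))
        yield {"id": int(cust_id), "name": name, "address": address}


def transform(records):
    """Deduplicate on the key, parse the address string, reject bad rows."""
    seen, clean, rejected = set(), [], []
    for rec in records:
        if rec["id"] in seen:
            continue  # deduplication: keep the first row seen for each key
        seen.add(rec["id"])
        match = ADDRESS.match(rec["address"])
        if match is None:
            rejected.append(rec)  # data quality: missing or malformed address
            continue
        clean.append({"id": rec["id"], "name": rec["name"], **match.groupdict()})
    return clean, rejected


def load(clean, warehouse):
    """Append the cleaned rows to the (in-memory) warehouse table."""
    warehouse.extend(clean)
    return len(clean)


warehouse = []
clean, rejected = transform(extract(RAW_ROWS))
loaded = load(clean, warehouse)
print(loaded, "rows loaded;", len(rejected), "rejected")
```

Rejected rows are kept rather than silently dropped, so that a data quality issue can be inspected and corrected before the next refresh.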
DATA WAREHOUSE SERVERS
• Relational DBMS
  • Executes complex SQL queries as efficiently as possible against very large databases
  • Query optimization: takes a complex query and compiles it into an execution plan, a composition of physical operators (such as Index Scan, Hash Join, Sort) that, when evaluated, generates the results of the query
  • Data partitioning and parallel query processing are used so that the execution plan scales well to large databases
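To make "a composition of physical operators" concrete, the sketch below hand-builds one possible plan for a small join query out of Python generators. The tables, the operator functions, and the chosen plan are assumptions for illustration; a real optimizer would pick the plan itself, and this toy version uses a plain table scan rather than an index scan.

```python
from operator import itemgetter

# Hypothetical tables, invented for this example.
STUDENTS = [(1, "John Collins"), (2, "Larry Wall"), (3, "Linus Torvalds")]
RESULTS = [  # (student_id, exam, score)
    (1, "Database", 70),
    (2, "Database", 80),
    (3, "Programming", 90),
    (2, "Programming", 99),
]


def scan(table):
    """Table scan: emit every row of the table."""
    yield from table


def select(rows, predicate):
    """Filter: pass through only the rows that satisfy the predicate."""
    return (row for row in rows if predicate(row))


def hash_join(build, probe, build_key, probe_key):
    """Hash join: hash the build input, then probe it with the other input."""
    table = {}
    for row in build:
        table.setdefault(row[build_key], []).append(row)
    for row in probe:
        for match in table.get(row[probe_key], []):
            yield match + row


def sort(rows, key):
    """Sort: order all rows by one column position."""
    yield from sorted(rows, key=itemgetter(key))


# One plan for:
#   SELECT s.name, r.exam, r.score
#   FROM students s JOIN results r ON s.id = r.student_id
#   WHERE r.score >= 80 ORDER BY r.score
plan = sort(
    hash_join(
        scan(STUDENTS),
        select(scan(RESULTS), lambda r: r[2] >= 80),
        build_key=0,
        probe_key=0,
    ),
    key=4,  # score column in the joined row (id, name, student_id, exam, score)
)
rows = [(name, exam, score) for _, name, _, exam, score in plan]
print(rows)
```

Pushing the filter below the join, as done here, is the kind of choice an optimizer makes when it compares alternative plans for the same query.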
DATA WAREHOUSE SERVERS
• MapReduce engine
  • A MapReduce job can be executed directly on schema-less input files
  • The engine automatically handles important issues such as data partitioning, node failures, managing the flow of data across nodes, and heterogeneity of nodes
  • There have been recent efforts to develop engines that take a SQL-like query and automatically compile it into a sequence of jobs on a MapReduce engine
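The MapReduce programming model can be illustrated with a single-process sketch. The log format, `map_fn`, `reduce_fn`, and `run_job` are assumptions made for this example; a real engine would distribute the map and reduce tasks across nodes and handle partitioning and failures, none of which is modeled here.

```python
from collections import defaultdict

# Hypothetical schema-less input: free-form text lines, one event per line.
LINES = [
    "2011-08-01 region=west sale=120",
    "2011-08-01 region=east sale=80",
    "2011-08-02 region=west sale=30",
    "malformed line without fields",
]


def map_fn(line):
    """Map: impose structure at read time and emit (region, amount) pairs."""
    fields = dict(tok.split("=", 1) for tok in line.split() if "=" in tok)
    if "region" in fields and "sale" in fields:
        yield fields["region"], int(fields["sale"])


def reduce_fn(key, values):
    """Reduce: combine all values that share a key."""
    return key, sum(values)


def run_job(lines, mapper, reducer):
    """Run map, group by key (the shuffle step), then reduce, in one process."""
    groups = defaultdict(list)
    for line in lines:
        for key, value in mapper(line):
            groups[key].append(value)
    return dict(reducer(key, values) for key, values in sorted(groups.items()))


totals = run_job(LINES, map_fn, reduce_fn)
print(totals)
```

The job is roughly what a SQL-like layer might generate for `SELECT region, SUM(sale) ... GROUP BY region`: the grouping column becomes the map output key and the aggregate becomes the reduce function.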
MID-TIER SERVERS
Mid-tier servers provide specialized functionality for different BI scenarios
• Online analytic processing (OLAP) servers efficiently expose the multidimensional view of data to applications or users and enable common BI operations such as filtering, aggregation, drill-down and pivoting
• In-memory BI engines are appearing that exploit today’s large main memory sizes to dramatically improve the performance of multidimensional queries (no I/O overhead)
Student Name      Exam                 Result
John Collins      Database             70
John Collins      Programming          72
John Collins      Operating Systems    60
Larry Wall        Database             80
Larry Wall        Programming          99
Larry Wall        Operating Systems    70
Linus Torvalds    Database             80
Linus Torvalds    Programming          90
Linus Torvalds    Operating Systems    9
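The exam results above have two dimensions (student and exam) and one measure (result), so they can be used to sketch the OLAP operations mentioned earlier. The code below is an illustration in plain Python, not an OLAP server API: the function names `pivot`, `roll_up`, and `slice_rows` are assumptions, and the data is copied from the table with exam names spelled consistently.

```python
from collections import defaultdict

# (student, exam, result) rows, copied from the table above.
RESULTS = [
    ("John Collins", "Database", 70),
    ("John Collins", "Programming", 72),
    ("John Collins", "Operating Systems", 60),
    ("Larry Wall", "Database", 80),
    ("Larry Wall", "Programming", 99),
    ("Larry Wall", "Operating Systems", 70),
    ("Linus Torvalds", "Database", 80),
    ("Linus Torvalds", "Programming", 90),
    ("Linus Torvalds", "Operating Systems", 9),
]


def pivot(rows):
    """Pivot: one row per student, one column per exam."""
    grid = defaultdict(dict)
    for student, exam, result in rows:
        grid[student][exam] = result
    return dict(grid)


def roll_up(rows, dim):
    """Aggregate: average result grouped by one dimension (0=student, 1=exam)."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[dim]].append(row[2])
    return {key: sum(vals) / len(vals) for key, vals in groups.items()}


def slice_rows(rows, exam):
    """Filter: keep only the rows for a single exam."""
    return [row for row in rows if row[1] == exam]


grid = pivot(RESULTS)
avg_by_exam = roll_up(RESULTS, dim=1)
database_only = slice_rows(RESULTS, "Database")

print(grid["Larry Wall"])
print(avg_by_exam)
```

Drill-down is the reverse of `roll_up`: starting from the per-exam average, return to the individual student rows behind it, which is what `slice_rows` retrieves for one exam.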
MID-TIER SERVERS
• Reporting servers enable definition, efficient execution and rendering of reports, for example, reporting total sales by region for this year and comparing them with sales from last year
• Enterprise search engines support the keyword search paradigm over text and structured data in the warehouse
• Data mining engines enable in-depth analysis of data that goes well beyond what is offered by OLAP or reporting servers, and provide the ability to build predictive models to help answer business questions
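The reporting example above (total sales by region for this year compared with last year) maps naturally onto a single SQL aggregation. The sketch below runs it against an in-memory SQLite database using Python's standard `sqlite3` module. The `sales` table, its columns, and all figures are invented for illustration and do not come from the source article.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, year INTEGER, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [  # hypothetical figures
        ("East", 2010, 100.0),
        ("East", 2011, 120.0),
        ("East", 2011, 30.0),
        ("West", 2010, 200.0),
        ("West", 2011, 180.0),
    ],
)

# Conditional sums put both years side by side in one row per region.
REPORT_SQL = """
SELECT region,
       SUM(CASE WHEN year = :year     THEN amount ELSE 0.0 END) AS current_total,
       SUM(CASE WHEN year = :year - 1 THEN amount ELSE 0.0 END) AS previous_total
FROM sales
GROUP BY region
ORDER BY region
"""

report = conn.execute(REPORT_SQL, {"year": 2011}).fetchall()

# Render the report as aligned text, with the year-over-year change.
print(f"{'Region':<8}{'This year':>12}{'Last year':>12}{'Change':>10}")
for region, current_total, previous_total in report:
    change = current_total - previous_total
    print(f"{region:<8}{current_total:>12.1f}{previous_total:>12.1f}{change:>+10.1f}")
```

A reporting server adds what this sketch leaves out: stored report definitions, scheduling, caching of results, and rendering to formats such as HTML or PDF.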
FRONT-END APPLICATIONS
Relationship to Course
Data Warehouses
Online Analytical Processing (OLAP)
Data mining
References
• http://cacm.acm.org/magazines/2011/8/114953-an-overview-ofbusiness-intelligence-technology/pdf
• http://www.computerweekly.com/Articles/2009/07/30/237111/Using-business-intelligence-to-steer-through-the-recession.htm
• http://www.cio.com/article/375365/Gas_Prices_How_Oil_Companies_Use_Business_Intelligence_To_Maximize_Profits?page=2&taxonomyId=3002
• http://www.panorama.com/industry-news/articleview.html?name=Business-intelligence-in-healthcare-579093
• http://www.zaptechnology.com/solutions/microsoftcrm/microsoft-crm-capabilities.asp
• http://mohamednabeel.blogspot.com/2011/03/starting-subsandwitch-business.html