Uploaded by Ansar Yergeshov

Lectures 6,7

Business intelligence
technologies and tools.
OLAP technologies
Instructor
assoc. prof. Sembina G.K.
BUSINESS INTELLIGENCE (BI)
Business intelligence (BI) is a set of theories,
methodologies, architectures, and technologies that
transform raw data into meaningful and useful
information for business purposes.
Business intelligence (BI) comprises the
strategies and technologies used by enterprises for
the data analysis of business information.
BI helps to transform:
• Data into information
• Information into knowledge
• Knowledge into decisions
• Finally, decisions into action
THE BI GOAL
BI is about using data to help enterprise users make better business decisions.
FUNCTIONS OF BUSINESS INTELLIGENCE
Common functions of business intelligence
technologies include reporting, online analytical
processing, analytics, dashboard development, data
mining, process mining, complex event processing,
business performance management, benchmarking,
text mining, predictive analytics, and prescriptive
analytics.
BI technologies can handle large amounts of
structured and sometimes unstructured data to help
identify, develop, and otherwise create new strategic
business opportunities.
BUSINESS INTELLIGENCE TOOLS
DATA WAREHOUSING
Data warehouse (DW or DWH) is a system used
for reporting and data analysis and is considered a
core component of business intelligence.
DWs are central repositories of integrated data
from one or more disparate sources. They store
current and historical data in a single place, where
it is used to create analytical reports for workers
throughout the enterprise.
Extract, transform, load (ETL) and Extract,
load, transform (E-LT) are the two main
approaches used to build a data warehouse system.
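The difference between the two approaches can be sketched in a few lines of Python. This is only an illustration: the source rows, field names, and the dictionary standing in for the warehouse are all hypothetical.

```python
# Hypothetical raw rows from two sources; names and values are made up.
crm_rows = [{"customer": " Alice ", "amount": "120.50"}]
shop_rows = [{"customer": "BOB", "amount": "80"}]


def extract():
    """Extract: pull raw rows from every source."""
    return crm_rows + shop_rows


def transform(rows):
    """Transform: clean customer names and convert amounts to numbers."""
    return [
        {"customer": r["customer"].strip().title(), "amount": float(r["amount"])}
        for r in rows
    ]


def etl(warehouse):
    """ETL: transform before loading, so only clean rows reach the warehouse."""
    warehouse["sales"] = transform(extract())


def elt(warehouse):
    """E-LT: load the raw rows first, then transform inside the warehouse."""
    warehouse["raw_sales"] = extract()
    warehouse["sales"] = transform(warehouse["raw_sales"])


etl_wh, elt_wh = {}, {}
etl(etl_wh)
elt(elt_wh)
print(etl_wh["sales"])
print(sorted(elt_wh))  # the E-LT warehouse also keeps the raw rows
```

Both routes end with the same clean table; in this sketch they differ only in where the transformation runs and whether the raw data is kept in the warehouse.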
THE ETL PROCESS
BUSINESS PERFORMANCE MANAGEMENT
(BPM) AND DATA MINING
BPM is a set of performance management and
analytic processes that enables the management of
an organization's performance to achieve one or
more pre-selected goals.
Data mining is a process of discovering patterns
in large data sets involving methods at the
intersection of machine learning, statistics, and
database systems.
PROCESS MINING, QUERYING, AND
REPORTING
Process mining is a family of techniques
relating the fields of data science and process
management to support the analysis of
operational processes based on event logs.
The goal of process mining is to turn event data
into insights and actions.
Querying: a request for specific data or
information from a database
Reporting: sharing operating and financial data
analysis with decision-makers so they can draw
conclusions and make decisions
DECISION ENGINEERING AND DASHBOARDS
Decision engineering is applying relevant
knowledge to design, build, maintain, and
improve systems for making decisions.
A dashboard is a type of graphical user
interface which often provides at-a-glance views
of key performance indicators (KPIs) relevant to
a particular objective or business process. In
other usage, "dashboard" is another name for
"report" and is considered a form of data
visualization. The dashboard is often accessible
through a web browser and is usually linked to
regularly updating data sources.
BUSINESS INTELLIGENCE BENEFITS
• Improved decision making
• Integrating architecture
• Common user interface for data reporting and analysis
• Common data repository fosters a single version of company data
• Improved organizational performance
THE BEST BUSINESS INTELLIGENCE
TOOLS FOR 2020
https://synergytop.com/blog/top-10-businessintelligence-tools-in-2020/
ONLINE ANALYTICAL PROCESSING
(OLAP)
OLAP is a category of software that allows
users to analyze information from multiple
database systems at the same time.
It is a technology that enables analysts to
extract and view business data from different
points of view.
Analysts frequently need to group, aggregate
and join data. These operations in relational
databases are resource intensive.
With OLAP, data can be pre-calculated and
pre-aggregated, making analysis faster.
OLAP CONCEPTUAL DATA MODEL
• An OLAP cube is a multidimensional dataset built from the data warehouse.
• The goal of OLAP is to support ad-hoc querying for the business analyst.
• Business analysts are familiar with spreadsheets.
• OLAP extends the spreadsheet analysis model to work with warehouse data.
THE OLAP PROCESS
HOW DOES OLAP WORK?
There are multiple steps in OLAP:
• First, data is extracted from various data sources and formats, like text files and spreadsheets. This data is then stored in the data warehouse.
• Next, the data is cleaned, transformed, and stored in OLAP cubes.
• Once in the OLAP cubes, information is pre-calculated and pre-aggregated for further analysis.
• Lastly, the user gets the data from the OLAP cubes by running queries against them.
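The pre-aggregation step can be illustrated with a minimal sketch. It assumes a tiny, made-up set of cleaned warehouse rows and stores every possible subtotal in a dictionary; real OLAP engines are far more selective about which aggregates they materialize.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical cleaned warehouse rows: (country, product, quarter, sales).
rows = [
    ("Spain", "TV", "Q1", 100),
    ("Spain", "PC", "Q1", 250),
    ("Germany", "TV", "Q1", 300),
    ("Germany", "TV", "Q2", 150),
]


def build_cube(rows):
    """Pre-aggregate sales for every subset of the dimensions.

    "*" marks a dimension that has been summed over.
    """
    cube = defaultdict(int)
    for *coords, sales in rows:
        for k in range(len(coords) + 1):
            for kept in combinations(range(len(coords)), k):
                key = tuple(c if i in kept else "*" for i, c in enumerate(coords))
                cube[key] += sales
    return dict(cube)


cube = build_cube(rows)

# Queries are now dictionary lookups instead of scans over the rows.
print(cube[("Germany", "TV", "*")])  # 450
print(cube[("*", "*", "Q1")])        # 650
print(cube[("*", "*", "*")])         # 800
```

The work of grouping and summing is paid once, when the cube is built, which is why later queries are fast.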
OLAP CUBE
OLAP databases are divided into one or more
cubes.
At the core of the OLAP concept is the OLAP cube.
The OLAP cube is a data structure optimized for
very quick data analysis.
The OLAP cube consists of numeric facts called
measures, which are categorized by dimensions.
An OLAP cube is also called a hypercube if the
number of dimensions is greater than 3.
OLAP contains multidimensional data, usually
obtained from different and unrelated sources.
A spreadsheet is not an optimal tool for such data.
The cube can store and analyze multidimensional data
in a logical and orderly manner.
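As a rough illustration of "measures categorized by dimensions", a cube can be modeled as a mapping from dimension coordinates to a set of measures. The dimension names, members, and numbers below are invented for the sketch.

```python
DIMENSIONS = ("date", "country", "product")

# Each cell: (date, country, product) -> measures. All values are made up.
cells = {
    ("1Qtr", "Portugal", "TV"): {"units": 10, "revenue": 5000.0},
    ("1Qtr", "Spain", "PC"): {"units": 4, "revenue": 3600.0},
    ("2Qtr", "Germany", "VCR"): {"units": 7, "revenue": 1400.0},
    ("2Qtr", "Spain", "TV"): {"units": 3, "revenue": 1500.0},
}


def members(dimension):
    """Distinct values (members) that appear along one dimension."""
    axis = DIMENSIONS.index(dimension)
    return sorted({coords[axis] for coords in cells})


def total(measure, **fixed):
    """Sum a measure over all cells matching the fixed dimension values."""
    return sum(
        facts[measure]
        for coords, facts in cells.items()
        if all(coords[DIMENSIONS.index(d)] == v for d, v in fixed.items())
    )


print(members("country"))               # ['Germany', 'Portugal', 'Spain']
print(total("units", country="Spain"))  # 7
print(total("revenue", product="TV"))   # 6500.0
```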
[Figure: three-dimensional OLAP cube. Axes: Date (1Qtr, 2Qtr, 3Qtr, 4Qtr, sum), Country (Portugal, Spain, Germany, sum), and product items (TV, PC, VCR, sum).]
DIMENSIONS
• Dimensions are how people like to segment, or slice, the data. Dimensions are how you want to see the data.
• You usually want to see data by product, country, date, account, employee, …
• Almost anytime someone asks a question, they describe how they want to see it. For example, sales by store by month.
• Dimensions are made up of attributes and may or may not include hierarchies:
  • Year – Semester – Quarter – Month – Day
  • Product Category – Product Subcategory – Product
MEASURES
• Measures are what you want to see
• They are almost always numeric
• They are often additive
  • Dollar sales, unit sales, profit, expenses, and more
• Some measures are not additive
  • Date of last shipment
  • Inventory counts and number of unique customers
• Measures may be KPIs
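A small sketch of why some measures are additive and others are not, using invented order rows: sales totals can be summed across months, while the number of unique customers cannot.

```python
# Hypothetical order rows: (month, customer, amount).
orders = [
    ("Jan", "alice", 100),
    ("Jan", "bob", 50),
    ("Feb", "alice", 70),
    ("Feb", "carol", 30),
]


def sales(rows):
    """Additive measure: total order amount."""
    return sum(amount for _, _, amount in rows)


def unique_customers(rows):
    """Non-additive measure: count of distinct customers."""
    return len({customer for _, customer, _ in rows})


jan = [r for r in orders if r[0] == "Jan"]
feb = [r for r in orders if r[0] == "Feb"]

# Additive: the overall total equals the sum of the monthly totals.
assert sales(orders) == sales(jan) + sales(feb)

# Not additive: alice bought in both months, so adding the monthly
# counts would count her twice.
print(unique_customers(jan) + unique_customers(feb))  # 4
print(unique_customers(orders))                       # 3
```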
ATTRIBUTES
• Attributes represent different ways of looking at something in a dimension.
• Attributes are individual values that make up dimensions
  • A Time dimension may have a Month attribute, a Year attribute, and so forth
  • A Geography dimension may have a Country attribute, a Region attribute, a City attribute, and so on
  • A Product dimension may have a Part Number attribute, a size attribute, a color attribute, a manufacturer attribute, and more
HIERARCHIES
• Most dimensions contain hierarchies which allow users to drill down on data.
• You can put attributes into a hierarchical structure to assist user analysis
• One of the most common functions in BI is to "drill down" to a more detailed level
• For example, a Time hierarchy might go from Year to Quarter to Month to Day
• Another Time hierarchy might go from Year to Month to Week to Day to Hour
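Drilling down a hierarchy can be sketched as re-aggregating the same facts at a finer level. The example below assumes the Year > Quarter > Month > Day hierarchy and a handful of invented daily sales facts.

```python
from collections import defaultdict
from datetime import date

# Hypothetical daily sales facts: (day, amount).
facts = [
    (date(2020, 1, 15), 100),
    (date(2020, 2, 3), 40),
    (date(2020, 4, 20), 60),
    (date(2020, 4, 21), 25),
]


def quarter(d):
    """Label of the calendar quarter a date falls in, e.g. 'Q1'."""
    return f"Q{(d.month - 1) // 3 + 1}"


# Each level maps a date to its path in the Time hierarchy.
LEVELS = {
    "year": lambda d: (d.year,),
    "quarter": lambda d: (d.year, quarter(d)),
    "month": lambda d: (d.year, quarter(d), d.month),
    "day": lambda d: (d.year, quarter(d), d.month, d.day),
}


def aggregate(level):
    """Total sales grouped at the chosen level of the hierarchy."""
    totals = defaultdict(int)
    for day, amount in facts:
        totals[LEVELS[level](day)] += amount
    return dict(totals)


print(aggregate("year"))     # {(2020,): 225}
print(aggregate("quarter"))  # {(2020, 'Q1'): 140, (2020, 'Q2'): 85}
print(aggregate("month"))    # drill down one more level
```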
BASIC ANALYTICAL OPERATIONS OF OLAP
Four types of analytical
operations in OLAP are:
Roll-up
Drill-down
Slice and dice
Pivot (rotate)
1. ROLL-UP
Roll-up is also known as "consolidation" or
"aggregation."
In this example, the cities New Jersey and
Los Angeles are rolled up into the country USA.
The sales amounts of New Jersey and
Los Angeles are 440 and 1560
respectively. They become 2000 after roll-up.
In this aggregation process, the
location hierarchy moves up from city to
country.
A roll-up either climbs a dimension hierarchy
or removes one or more dimensions. In this
example, the Quarter dimension is also removed.
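The roll-up in this example can be sketched as follows. The city totals (440 and 1560) come from the slide; the split of each total across quarters is hypothetical and only there so that a Quarter dimension exists to be removed.

```python
from collections import defaultdict

# Location hierarchy: city -> country.
CITY_TO_COUNTRY = {"New Jersey": "USA", "Los Angeles": "USA"}

# (city, quarter) -> sales. The quarterly split is made up; only the
# per-city totals of 440 and 1560 are taken from the example.
sales = {
    ("New Jersey", "Q1"): 200,
    ("New Jersey", "Q2"): 240,
    ("Los Angeles", "Q1"): 700,
    ("Los Angeles", "Q2"): 860,
}


def roll_up(cells):
    """Move location from city to country and drop the quarter dimension."""
    totals = defaultdict(int)
    for (city, _quarter), amount in cells.items():
        totals[CITY_TO_COUNTRY[city]] += amount
    return dict(totals)


print(roll_up(sales))  # {'USA': 2000}
```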
2. DRILL-DOWN
In drill-down, data is
fragmented into
smaller parts. It is the
opposite of the roll-up
process.
In this example, quarter Q1 is drilled
down to the months January,
February, and March.
The corresponding sales are
shown for each month.
3A. SLICE
Here, one dimension is selected, and a
new sub-cube is created.
3B. DICE
Dice is similar to a slice. The difference is that
in a dice you select two or more dimensions,
which results in the creation of a sub-cube.
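Slice and dice can be compared side by side in a short sketch. The cube contents and the helper names `slice_cube` and `dice_cube` are illustrative assumptions, not part of any particular OLAP product.

```python
DIMENSIONS = ("quarter", "country", "product")

# Hypothetical cube: (quarter, country, product) -> sales.
cube = {
    ("Q1", "Spain", "TV"): 100,
    ("Q1", "Spain", "PC"): 250,
    ("Q1", "Germany", "TV"): 300,
    ("Q2", "Germany", "TV"): 150,
    ("Q2", "Portugal", "PC"): 90,
}


def slice_cube(cube, dimension, value):
    """Fix one dimension to a single value; the sub-cube loses that dimension."""
    axis = DIMENSIONS.index(dimension)
    return {
        key[:axis] + key[axis + 1:]: sales
        for key, sales in cube.items()
        if key[axis] == value
    }


def dice_cube(cube, **allowed):
    """Keep cells whose values fall in the allowed sets on two or more dimensions."""
    return {
        key: sales
        for key, sales in cube.items()
        if all(key[DIMENSIONS.index(d)] in values for d, values in allowed.items())
    }


# Slice: only quarter Q1, leaving a (country, product) sub-cube.
print(slice_cube(cube, "quarter", "Q1"))

# Dice: restrict both quarter and country, keeping all three dimensions.
print(dice_cube(cube, quarter={"Q1", "Q2"}, country={"Germany", "Portugal"}))
```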
4. PIVOT
In pivot, you rotate the data axes to
provide an alternative presentation
of the data.
In the following example, the
pivot is based on item types.
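A pivot changes only the layout, not the values. The sketch below assumes a small made-up two-dimensional view and shows the same cells first with countries as rows, then rotated so that item types become the rows.

```python
# Hypothetical two-dimensional view: (country, item) -> sales.
sales = {
    ("Spain", "TV"): 100,
    ("Spain", "PC"): 250,
    ("Germany", "TV"): 300,
}


def table(cells, row_axis, col_axis):
    """Lay out cells as nested dicts: table[row][column] = value."""
    out = {}
    for key, value in cells.items():
        out.setdefault(key[row_axis], {})[key[col_axis]] = value
    return out


by_country = table(sales, row_axis=0, col_axis=1)
by_item = table(sales, row_axis=1, col_axis=0)  # axes rotated

print(by_country)  # {'Spain': {'TV': 100, 'PC': 250}, 'Germany': {'TV': 300}}
print(by_item)     # {'TV': {'Spain': 100, 'Germany': 300}, 'PC': {'Spain': 250}}
```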
TYPES OF OLAP SYSTEMS
[Figure: OLAP hierarchical structure]
• Relational OLAP (ROLAP) works with data that exists in a relational database. Fact and dimension tables are stored as relational tables. It also allows multidimensional analysis of data and is the fastest-growing type of OLAP.
• Multidimensional OLAP (MOLAP) uses array-based multidimensional storage engines to display multidimensional views of data. Basically, it uses an OLAP cube.
• Hybrid OLAP (HOLAP) is a mixture of both ROLAP and MOLAP. It offers the fast computation of MOLAP and the higher scalability of ROLAP. HOLAP uses two databases.
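The storage difference between ROLAP and MOLAP can be sketched with the Python standard library. An in-memory SQLite database stands in for the relational store and a nested list stands in for the array-based engine; the table names, column names, and data are hypothetical.

```python
import sqlite3

# ROLAP: fact and dimension tables live in a relational database, and
# aggregates are computed with joins and GROUP BY at query time.
db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE dim_country (country_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE fact_sales (country_id INTEGER, product_id INTEGER, amount INTEGER);
    INSERT INTO dim_country VALUES (0, 'Spain'), (1, 'Germany');
    INSERT INTO dim_product VALUES (0, 'TV'), (1, 'PC');
    INSERT INTO fact_sales VALUES (0, 0, 100), (0, 1, 250), (1, 0, 300), (1, 0, 150);
""")
rolap = dict(db.execute("""
    SELECT c.name, SUM(f.amount)
    FROM fact_sales AS f JOIN dim_country AS c USING (country_id)
    GROUP BY c.name
"""))

# MOLAP: the same facts are loaded into a dense array indexed by the
# position of each dimension member (rows: countries, columns: products).
countries, products = ["Spain", "Germany"], ["TV", "PC"]
array = [[0] * len(products) for _ in countries]
for c, p, amount in db.execute(
    "SELECT country_id, product_id, amount FROM fact_sales"
):
    array[c][p] += amount
molap = {name: sum(array[i]) for i, name in enumerate(countries)}

print(rolap)
print(molap)
```

Both views give the same totals per country; the trade-off is that the relational version aggregates at query time, while the array version reads values that were combined when the cube was loaded.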
TYPES OF OLAP SYSTEMS
• WOLAP. Web OLAP is an OLAP system accessible via a web browser. WOLAP is a three-tiered architecture.
• DOLAP. In Desktop OLAP, a user downloads a part of the data from the database to their desktop and analyzes it locally.
• Mobile OLAP (also abbreviated MOLAP, not to be confused with multidimensional OLAP). Mobile OLAP helps users to access and analyze OLAP data using their mobile devices.
• SOLAP. Spatial OLAP is created to facilitate management of both spatial and non-spatial data in a Geographic Information System (GIS).
ADVANTAGES OF OLAP
• OLAP is a platform for all types of business needs, including planning, budgeting, reporting, and analysis.
• Information and calculations are consistent in an OLAP cube. This is a crucial benefit.
• Quickly create and analyze "what if" scenarios.
• Easily search the OLAP database for broad or specific terms.
• OLAP provides the building blocks for business modeling tools, data mining tools, and performance reporting tools.
• Allows users to slice and dice cube data by various dimensions, measures, and filters.
• It is good for analyzing time series.
• Finding some clusters and outliers is easy with OLAP.
• It is a powerful online analytical processing and visualization system that provides fast response times.
DISADVANTAGES OF OLAP
• OLAP requires organizing data into a star or snowflake schema. These schemas are complicated to implement and administer.
• You cannot have a large number of dimensions in a single OLAP cube.
• Transactional data cannot be accessed with an OLAP system.
• Any modification in an OLAP cube needs a full update of the cube. This is a time-consuming process.
What is Business Intelligence and an OLAP Cube?
• https://www.youtube.com/watch?v=yoE6bgJv08E&feature=youtu.be
Thank you for your attention!