Finding Hidden Intelligence with Predictive Analysis of Data Mining

The Analyst's Perspective:
Advanced BI with PowerPivot DAX,
SharePoint Dashboards, and SQL
Data Mining
Rafal Lukawiecki
Strategic Consultant, Project Botticelli Ltd
rafal@projectbotticelli.com
Objectives
Introduce more advanced BI analytics from
Microsoft
Discuss using SharePoint 2010 as a BI
Dashboard environment
This seminar is based on a number of sources, including a few dozen
Microsoft-owned presentations, used with permission. Thank you to
Chris Dial, Tara Seppa, Aydin Gencler, Ivan Kosyakov, Bryan Bredehoeft,
Marin Bezic, and Donald Farmer with his entire team for all the support.
The information herein is for informational purposes only and represents the opinions and views of Project Botticelli and/or Rafal Lukawiecki. The
material presented is not certain and may vary based on several factors. Microsoft makes no warranties, express, implied or statutory, as to the
information in this presentation.
Portions © 2010 Project Botticelli Ltd & entire material © 2010 Microsoft Corp. Some slides contain quotations from copyrighted materials by
other authors, as individually attributed or as already covered by Microsoft Copyright ownerships. All rights reserved. Microsoft, Windows,
Windows Vista and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries. The
information herein is for informational purposes only and represents the current view of Project Botticelli Ltd as of the date of this
presentation. Because Project Botticelli & Microsoft must respond to changing market conditions, it should not be interpreted to be a
commitment on the part of Microsoft, and Microsoft and Project Botticelli cannot guarantee the accuracy of any information provided after the
date of this presentation. Project Botticelli makes no warranties, express, implied or statutory, as to the information in this presentation. E&OE.
PowerPivot on SharePoint 2010
Manageability
PowerPivot for SharePoint 2010
Managed Self-Service Business Intelligence
Collaborative, shared gallery of PowerPivots
IT Pro management
Lifecycle & Workflow
Server Resource Management
Share Insights
Common view of organizational performance
1. PowerPivot for SharePoint: Uploading
Documents to Server
2. Galleries
Managing the BI Environment
User driven application administration
and monitoring
Manage and facilitate access to
secure organizational data
1. PowerPivot Management Dashboard
2. Anticipating a self-created BI that can
become an organisational concern
PowerPivot Client Architecture
PowerPivot Server Architecture
PowerPivot – Part of SharePoint
PowerPivot SharePoint Integration
[Diagram labels: Power User (Excel, RB, PerfPoint); Standard User (Browser); NLB; WFE; Excel Web Access; App Servers; Excel Services; PowerPivot System Service; AS Engine; Content DBs; Data Sources; SharePoint Farm]
PowerPivot DAX
Data Analysis Expressions (DAX)
Simple Excel-style formulas
Define new fields in the PivotTable field list
Enable Excel users to perform powerful data
analysis using the skills they already have
Has elements of MDX but does not replace
MDX
Data Analysis Expressions (DAX)
No notion of addressing individual cells or ranges
DAX functions refer to columns in the data
Sample DAX expressions
= [First Name] & " " & [Last Name]
Means: string concatenation, just like Excel
=SUM(Sales[Amount])
Means: the SUM function takes a column name instead of a range of cells
=RELATED(Product[Cost])
Means: the new RELATED function follows the relationship between tables
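The three sample expressions can be illustrated outside of PowerPivot. The following is a minimal Python sketch of their semantics, not of how the PowerPivot engine evaluates DAX. The tables, the `ProductKey` relationship column, and the helper names `full_name`, `sum_column`, and `related` are all hypothetical and exist only for this illustration.

```python
# Hypothetical in-memory tables standing in for PowerPivot tables.
# Product is keyed by ProductKey; Sales rows point at it via ProductKey.
products = {
    1: {"Name": "Bike", "Cost": 250.0},
    2: {"Name": "Helmet", "Cost": 20.0},
}
sales = [
    {"First Name": "Ann", "Last Name": "Lee", "ProductKey": 1, "Amount": 400.0},
    {"First Name": "Bob", "Last Name": "Ray", "ProductKey": 2, "Amount": 35.0},
]


def full_name(row):
    # Like = [First Name] & " " & [Last Name]: evaluated once per row,
    # referring to columns by name rather than to cells.
    return row["First Name"] + " " + row["Last Name"]


def sum_column(table, column):
    # Like =SUM(Sales[Amount]): the argument is a whole column, not a range.
    return sum(row[column] for row in table)


def related(row, lookup_table, key_column, column):
    # Like =RELATED(Product[Cost]): follow the relationship from the
    # current row to the single matching row in the related table.
    return lookup_table[row[key_column]][column]


if __name__ == "__main__":
    for row in sales:
        print(full_name(row), related(row, products, "ProductKey", "Cost"))
    print(sum_column(sales, "Amount"))
```

The point of the sketch is that every function works on named columns and rows of a table, which is why DAX has no notion of cell addresses.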
DAX Aggregation Functions
DAX implements aggregation functions from Excel, including SUM,
AVERAGE, MIN, MAX, and COUNT, but instead of taking multiple
arguments (a list of ranges), they take a reference to a column.
DAX also adds some new aggregation functions, which aggregate
any expression over the rows of a table:
SUMX(Table, Expression)
AVERAGEX(Table, Expression)
COUNTAX(Table, Expression)
MINX(Table, Expression)
MAXX(Table, Expression)
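The idea behind these functions is that the expression is evaluated once for each row of the table and the per-row results are then aggregated. A rough Python sketch of that behaviour follows. It is a simplification: blank handling is reduced to treating `None` as blank in `countax`, and the `order_lines` table with its `Qty` and `Price` columns is made up for the example.

```python
def sumx(table, expression):
    # Evaluate the expression for every row, then add up the results.
    return sum(expression(row) for row in table)


def averagex(table, expression):
    # Arithmetic mean of the per-row results; None for an empty table.
    values = [expression(row) for row in table]
    return sum(values) / len(values) if values else None


def countax(table, expression):
    # Count the rows for which the expression yields a non-blank result.
    return sum(1 for row in table if expression(row) is not None)


def minx(table, expression):
    # Smallest per-row result.
    return min(expression(row) for row in table)


def maxx(table, expression):
    # Largest per-row result.
    return max(expression(row) for row in table)


# Hypothetical table: one dict per row.
order_lines = [
    {"Qty": 2, "Price": 10.0},
    {"Qty": 1, "Price": 5.0},
    {"Qty": 3, "Price": 2.0},
]


def line_total(row):
    # Row-level expression, comparable to [Qty] * [Price].
    return row["Qty"] * row["Price"]


if __name__ == "__main__":
    print(sumx(order_lines, line_total))
    print(minx(order_lines, line_total), maxx(order_lines, line_total))
```

A plain SUM over a column could not compute the revenue here, because the value being summed (quantity times price) does not exist as a column. That is the gap the X functions fill.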
More than 80 Excel Functions in DAX
Date and Time
DATE, DATEVALUE, DAY, EDATE, EOMONTH, HOUR, MINUTE, MONTH, NOW, SECOND, TIME, TIMEVALUE, TODAY, WEEKDAY, WEEKNUM, YEAR, YEARFRAC
Information
ISBLANK, ISERROR, ISLOGICAL, ISNONTEXT, ISNUMBER, ISTEXT
Logical
AND, IF, IFERROR, NOT, OR, FALSE, TRUE
Math and Trig
ABS, CEILING, ISO.CEILING, EXP, FACT, FLOOR, INT, LN, LOG, LOG10, MOD, MROUND, PI, POWER, QUOTIENT, RAND, RANDBETWEEN, ROUND, ROUNDDOWN, ROUNDUP, SIGN, SQRT, SUM, SUMSQ, TRUNC
Statistical
AVERAGE, AVERAGEA, COUNT, COUNTA, COUNTBLANK, MAX, MAXA, MIN, MINA
Text
CONCATENATE, EXACT, FIND, FIXED, LEFT, LEN, LOWER, MID, REPLACE, REPT, RIGHT, SEARCH, SUBSTITUTE, TRIM, UPPER, VALUE
Example: Functions over a Time Period
TotalMTD (Expression, Date_Column [, SetFilter])
TotalQTD (Expression, Date_Column [, SetFilter])
TotalYTD (Expression, Date_Column [, SetFilter] [,YE_Date])
OpeningBalanceMonth (Expression, Date_Column [,SetFilter])
OpeningBalanceQuarter (Expression, Date_Column [,SetFilter])
OpeningBalanceYear (Expression, Date_Column [,SetFilter] [,YE_Date])
ClosingBalanceMonth (Expression, Date_Column [,SetFilter])
ClosingBalanceQuarter (Expression, Date_Column [,SetFilter])
ClosingBalanceYear (Expression, Date_Column [,SetFilter] [,YE_Date])
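To make the "to date" semantics concrete, here is a small Python sketch of what TotalMTD, TotalQTD, and TotalYTD compute when the expression is a simple sum. It assumes a calendar year ending on 31 December (no YE_Date argument) and no SetFilter. The `sales` rows and the function names are hypothetical, and the sketch ignores how DAX derives the "current" date from the filter context.

```python
from datetime import date

# Hypothetical fact rows: (transaction date, amount).
sales = [
    (date(2009, 12, 31), 999.0),
    (date(2010, 1, 15), 100.0),
    (date(2010, 2, 10), 50.0),
    (date(2010, 3, 5), 25.0),
    (date(2010, 3, 20), 10.0),
    (date(2010, 4, 2), 40.0),
]


def total_between(rows, start, end):
    # Sum the amounts whose date falls in the inclusive range [start, end].
    return sum(amount for day, amount in rows if start <= day <= end)


def total_mtd(rows, as_of):
    # Month to date: from the first day of as_of's month through as_of.
    return total_between(rows, date(as_of.year, as_of.month, 1), as_of)


def total_qtd(rows, as_of):
    # Quarter to date: quarters start in months 1, 4, 7, and 10.
    first_month = 3 * ((as_of.month - 1) // 3) + 1
    return total_between(rows, date(as_of.year, first_month, 1), as_of)


def total_ytd(rows, as_of):
    # Year to date, assuming the year starts on 1 January.
    return total_between(rows, date(as_of.year, 1, 1), as_of)


if __name__ == "__main__":
    as_of = date(2010, 4, 30)
    print(total_mtd(sales, as_of), total_qtd(sales, as_of), total_ytd(sales, as_of))
```

The opening and closing balance functions follow the same pattern, except that they evaluate the expression at a single boundary date of the period instead of accumulating over it.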
1. DAX for Creating New Columns
2. DAX for Creating Calculated Measures
SharePoint 2010 BI
Dashboards:
PerformancePoint Services
PPS in SharePoint 2010
PerformancePoint Services in SharePoint 2010
improve over PerformancePoint Server 2007:
SharePoint handles all security, management, backup, and the
repository of dashboards
Decomposition Tree
KPI Details
Scorecard drilldown, dynamic hierarchies, calculated KPIs
Dynamic, up-to-date filters for time intelligence
SharePoint Dashboard Designer is smoother
Better accessibility
Analytic charts with value filtering and server-based
conditional formatting
Monitoring with PPS
Business users can build performance dashboards easily
Analytics with PPS
Integration of KPIs and analytics
Multidimensional slice and dice, drill-across, drill-to-detail, root-cause analysis, prediction, and centralized business logic definitions
No coding
Reporting and Consolidation in PPS
Combine operational and financial data into one report
No need to reconsolidate manually
Dynamic and standard reports
Consistent live reports published from Excel to Reporting Services and SharePoint
1. Building a Dashboard, Scorecard, and a
KPI Using SharePoint Server
PerformancePoint Services
Visualising BI with Microsoft
Visio and SharePoint 2010
Two Trends that Lead to…
The Messy Diagram
Data Visualization
Fault Analysis Tree
Color By Value
Text Callouts
Status Indicators
Data Bars
Data Visualization
Manufacturing
Specialized Shapes
Strategy Maps
Visualize PPS Scorecard data in context
Data Mining with SQL Server
What does Data Mining Do?
Explores Your Data
Finds Patterns
Performs Predictions
Data Mining Techniques
Algorithm: Description
Decision Trees: Finds the odds of an outcome based on values in a training set
Association Rules: Identifies relationships between cases
Clustering: Classifies cases into distinctive groups based on any attribute sets
Naïve Bayes: Clearly shows the differences in a particular variable for various data elements
Sequence Clustering: Groups or clusters data based on a sequence of previous events
Time Series: Analyzes and forecasts time-based data, combining the power of ARTXP (developed by Microsoft Research) for short-term predictions with ARIMA (in SQL 2008) for long-term accuracy
Neural Nets: Seeks to uncover non-intuitive relationships in data
Linear Regression: Determines the relationship between columns in order to predict an outcome
Logistic Regression: Determines the relationship between columns in order to evaluate the probability that a column will contain a specific state
1. Association Rules for Market Basket
Analysis
2. Automatic recommendation engine
using DMX queries
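As background for these two demos, the core quantities behind market basket analysis can be sketched in a few lines of Python. This is not the Microsoft Association Rules algorithm and it does not use DMX. It only illustrates the usual definitions of support and confidence and a naive recommendation built on them. The baskets, the function names, and the 0.5 confidence threshold are all assumptions made for the example.

```python
# Hypothetical shopping baskets: each basket is a set of item names.
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]


def support(itemset, baskets):
    # Fraction of baskets that contain every item in the itemset.
    return sum(1 for basket in baskets if itemset <= basket) / len(baskets)


def confidence(antecedent, consequent, baskets):
    # Estimated P(consequent | antecedent) for the rule
    # antecedent -> consequent.
    return support(antecedent | consequent, baskets) / support(antecedent, baskets)


def recommend(item, baskets, min_confidence=0.5):
    # Naive recommendation: every other item whose single-item rule
    # item -> other meets the confidence threshold, in alphabetical order.
    candidates = set().union(*baskets) - {item}
    return sorted(
        other
        for other in candidates
        if confidence({item}, {other}, baskets) >= min_confidence
    )


if __name__ == "__main__":
    print(support({"bread", "milk"}, baskets))
    print(recommend("milk", baskets))
```

A production recommendation engine would also filter rules by minimum support and rank them by a measure such as lift or importance. The sketch leaves this out to keep the definitions visible.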
Summary
SharePoint makes PowerPivot manageable
Advanced self-service analysis requires a rich
expression language: DAX
Team and organisational BI dashboards and
scorecards are easy to build using SharePoint
2010
Data Mining enables advanced pattern
(correlation) discovery in your data
© 2010 Microsoft Corporation & Project Botticelli Ltd. All rights reserved.