Big Data for BI - Beyond the Hype - Pentaho

BI for Big Data
Beyond the Hype
© 2013, Pentaho. All Rights Reserved. pentaho.com. Worldwide +1 (866) 660-7555
Pentaho Mission
The Future of Analytics: Big Data Exploration without Boundaries
Modern, unified data integration and business analytics platform
• Native integration into big data ecosystem
• Embeddable, cloud-ready analytics
Fast and Broad Innovation
• Open source development model
Critical mass achieved
• Over 1,000 commercial customers
• Over 10,000 production deployments
Ian Fyfe
Big Data Solutions Engineering, Pentaho
Ian brings over 20 years of experience in the business analytics software market
with roles spanning consulting services, pre-sales engineering, product
management and product marketing. Ian started his career by co-founding a
business intelligence startup and has worked at Business Objects, Informix,
Epiphany, PeopleSoft and Jaspersoft.
Common Use Cases
The Value of Big Data for our Customers
Big opportunities
Drive incremental revenue
• Predict customer behavior across all channels
• Understand and monetize customer behavior
Improve operational effectiveness
• Machines/sensors: predict failures, network attacks
• Financial risk management: reduce fraud, increase security
Reduce data warehouse cost
• Integrate new data sources without increased database cost
• Provide online access to ‘dark data’
Example Use Cases Today
Transactional
Non-Transactional
• Fraud detection
• Web pages, blogs etc.
• Financial services / stock markets
• Documents
• Physical events
• Application events
Sub-Transactional
• Machine events
• Weblogs
• Social/online media
• Telecoms events
Click Stream Analytics
From buying patterns to revenue
Business Challenge
• Monetize buying patterns hidden in billions of data points
• Quickly analyze multi-channel click stream data
Pentaho Benefits
• Reduced ETL time to analyze blended data from Hadoop, HBase & data warehouse
• Use of big data analytics to grow revenue from targeted campaigns
Device Data Analytics
Big Data for Fortune 100 Enterprise Storage provider
Business Challenge
• Affordably scale machine data from storage devices for customer support app
• Predict device failure
• Enhance product performance
Pentaho Benefits
• Easy to use ETL & analysis for Hadoop, HBase & Oracle data sources
• 15x cost improvement
• Stronger performance against customer SLAs
Innovative Organizations Use Pentaho to Unlock Value from Big Data Stores
Online Retailer
Understanding the buying patterns of 5 million users from click stream data stored in Hadoop & HBase
Mobile & Digital Media
Embedded Pentaho to measure massive volumes of mobile and event data generated from mobile devices stored in MongoDB
Gaming
Better monetization of premium game features through analyzing large volumes of player data stored in MongoDB & Infobright
Travel & Entertainment
Helping thousands of travel partners like expedia.co.uk and thomascook.fr improve promotional targeting using HBase and Hadoop
Social Commerce
Better campaign performance through monitoring social media, page clicks and email marketing data stored in HP Vertica
Healthcare
Embedded Pentaho to better patient care & compliance through analysis of unstructured digital pen data stored in CouchDB
Pentaho Embedded Analytics
New Revenue Stream in Eight Weeks
Business Challenge
• Gain new revenue source from add-on module with reporting, analysis & dashboards
• Get to market fast to differentiate
Pentaho Benefits
• Easy to embed & brand
• Broad capabilities result in new revenue stream
• Increased functionality & compelling visualizations
Embedded Analytics
Pentaho Uniquely Positioned to Win
Why We Win in Embedded:
• Architectural ‘sweet spot’ for Pentaho platform
• Flexible pricing, adaptable to fit partner pricing
• Open source and innovation
• Fastest time-to-market for embedded analytics
Continued Leadership:
• Cloud & multi-tenancy ease-of-use
• Simplified REST services for ISVs
• BI Platform SDK enhancements – deep solution examples, tutorials and training
• Continued focus on standards and extensibility
[Screenshots: Dashboard Designer, Dashboard Framework]
Big Data Technologies
BI Strengths and Weaknesses
The Current Solutions
Current Database Solutions are designed for structured data.
• Optimized to answer known questions quickly
• Schemas dictate form/context
• Difficult to adapt to new data types and new questions
• Expensive at petabyte scale
[Chart: gigabytes of data created (in billions), 0 to 10,000, 2005 to 2015; series: structured data, unstructured data; callout: 10%]
Main Big Data Technologies
Hadoop
• Low cost, reliable scale-out architecture
• Distributed computing
• Proven success in Fortune 500 companies
• Exploding interest
NoSQL Databases
• Huge horizontal scaling and high availability
• Highly optimized for retrieval and appending
Types
• Document stores
• Key Value stores
• Graph databases
Analytic Databases (Analytic RDBMS)
• Optimized for bulk-load and fast aggregate query workloads
Types
• Column-oriented
• MPP
• In-memory
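To illustrate why a column-oriented layout suits fast aggregate queries, here is a minimal sketch in plain Python. It is not how any particular analytic database is implemented; the sample records and the helper name to_columns are assumptions made up for this example.

```python
# Hypothetical sales records, used only for illustration.
rows = [
    {"order_id": 1, "region": "EMEA", "amount": 120.0},
    {"order_id": 2, "region": "APAC", "amount": 75.5},
    {"order_id": 3, "region": "EMEA", "amount": 42.0},
]


def to_columns(records):
    """Pivot a list of row dicts into a dict of column lists."""
    columns = {}
    for record in records:
        for name, value in record.items():
            columns.setdefault(name, []).append(value)
    return columns


columns = to_columns(rows)

# Row layout: every full record is visited just to read one field.
total_from_rows = sum(record["amount"] for record in rows)

# Column layout: the aggregate scans a single list and ignores the rest.
total_from_columns = sum(columns["amount"])
```

Both totals are the same; the difference is how much data has to be touched. On disk, a column store can read only the "amount" values, which is where the benefit for aggregate workloads would come from.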
Hadoop Core Components
HADOOP DISTRIBUTED FILE SYSTEM (HDFS)
❯ Massive redundant storage across a commodity cluster
MAPREDUCE
❯ Map: distribute a computational problem across a cluster
❯ Reduce: collect the answers to all the sub-problems and combine them
MANY DISTROS AVAILABLE
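The map and reduce steps can be sketched with a word count, the usual introductory example. This is a single-process illustration in plain Python, not the Hadoop API; the function names and the sample input are assumptions chosen for the sketch, and a real cluster would run the map and reduce calls in parallel on different nodes.

```python
from collections import defaultdict


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one input split."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Group emitted values by key, as the framework does between phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce: combine all values for one key into a single result."""
    return key, sum(values)


def word_count(documents):
    """Run map, shuffle and reduce in sequence over all documents."""
    pairs = [pair for doc in documents for pair in map_phase(doc)]
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


# Hypothetical input: each string stands in for one split of a large file.
counts = word_count(["big data big hype", "beyond the hype"])
```

Because each map call sees only its own split and each reduce call sees only one key, the work can be spread across machines without the steps needing to share state.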
Major Hadoop Utilities
Apache Pig: High-level language for expressing data analysis programs
Apache Hive: SQL-like language and metadata repository
Apache HBase: The Hadoop database. Random, real-time read/write access
Hue: Browser-based desktop interface for interacting with Hadoop
Apache ZooKeeper: Highly reliable distributed coordination service
Oozie: Server-based workflow engine for Hadoop activities
Flume: Distributed service for collecting and aggregating log and event data
Sqoop: Integrating Hadoop with RDBMS
Apache Whirr: Library for running Hadoop in the cloud
Hadoop & Databases
Big Data Platform Challenges
“The working conditions can be shocking”
ETL Developer
Challenges
1. Somewhat immature
2. Lack of tooling
3. Steep technical learning curve
4. Hiring qualified people
5. Availability of enterprise-ready products and tools
6. High latency (Hadoop)
7. Running inside the cluster
Challenges
Ingestion / Manipulation / Integration
Scheduling
Modeling
WOULD YOU RATHER DO THIS?
… OR THIS?
Investigating
BI & Big Data Solutions
Questions to Ask
Business Drivers
1. Mandate to reduce EDW costs?
2. Clear use case that you need to solve?
3. Do you have access to the technical skill set?
Technical
1. Do you have more than one kind of big data store, for example Hadoop as well as HBase, MongoDB or Cassandra?
2. Would you prefer to use the same tool for big data stores in addition to your traditional relational data stores?
3. Are you OK waiting minutes or even hours to access your big data?
4. Are you OK using a spreadsheet-like interface to access and analyze your data?
5. Do you need complete BI capabilities, including reporting, interactive visualization, and predictive analytics?
6. Do you need to enrich your big data with data from outside of the big data platform?
7. Is the big data you want to analyze bigger than the amount of memory you have available?
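The last technical question matters because a tool that loads a whole data set into memory stops working once the data outgrows it. One common alternative is to stream records and keep only a running summary. The sketch below shows the idea in plain Python; the function name count_by_key and the tiny click stream sample are assumptions for illustration, not part of any Pentaho product.

```python
import csv
import io
from collections import Counter


def count_by_key(lines, key):
    """Stream rows one at a time, keeping only a running tally in memory."""
    totals = Counter()
    for row in csv.DictReader(lines):
        totals[row[key]] += 1
    return totals


# Stand-in for a large click stream file. A real input could be opened with
# open(path, newline="") and would be read lazily in the same way.
sample = io.StringIO("user,page\nu1,home\nu2,cart\nu1,cart\n")
page_hits = count_by_key(sample, "page")
```

Memory use here depends on the number of distinct keys rather than the number of rows, so the same approach would keep working as the input grows.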
http://blog.pentaho.com/tag/ian-fyfe/
Demo
Complete Big Data Analytics & Visual Data Management
Data Ingestion, Manipulation, Integration
Enterprise & Ad Hoc Reporting
Data Discovery, Visualization
Predictive Analytics
Pentaho Big Data Analytics
Hadoop | NoSQL | Analytic Databases | Relational
Open Discussion
Thank You
JOIN THE CONVERSATION. YOU CAN FIND US ON:
blog.pentaho.com
Facebook.com/Pentaho
@Pentaho
Pentaho Business Analytics