BI for Big Data
Beyond the Hype

© 2013, Pentaho. All Rights Reserved. pentaho.com. Worldwide +1 (866) 660-7555

Pentaho Mission
The Future of Analytics: Big Data Exploration without Boundaries
Modern, unified data integration and business analytics platform
• Native integration into big data ecosystem
• Embeddable, cloud-ready analytics
Fast and Broad Innovation
• Open source development model
Critical mass achieved
• Over 1,000 commercial customers
• Over 10,000 production deployments

Ian Fyfe
Big Data Solutions Engineering, Pentaho
Ian brings over 20 years of experience in the business analytics software market with roles spanning consulting services, pre-sales engineering, product management and product marketing. Ian started his career by co-founding a business intelligence startup and has worked at Business Objects, Informix, Epiphany, PeopleSoft and Jaspersoft.

Common Use Cases

The Value of Big Data for our Customers
Big opportunities
Drive incremental revenue
• Predict customer behavior across all channels
• Understand and monetize customer behavior
Improve operational effectiveness
• Machines/sensors: predict failures, network attacks
• Financial risk management: reduce fraud, increase security
Reduce data warehouse cost
• Integrate new data sources without increased database cost
• Provide online access to ‘dark data’

Example Use Cases Today
Transactional / Non-Transactional
• Fraud detection
• Web pages, blogs etc.
• Financial services / stock markets
• Documents
• Physical events
• Application events
Sub-Transactional
• Machine events
• Weblogs
• Social/online media
• Telecoms events

Click Stream Analytics
From buying patterns to revenue
Business Challenge
• Monetize buying patterns hidden in billions of data points
• Quickly analyze multi-channel click stream data
Pentaho Benefits
• Reduced ETL time to analyze blended data from Hadoop, HBase & data warehouse
• Use of big data analytics to grow revenue from targeted campaigns

Device Data Analytics
Big Data for Fortune 100 Enterprise Storage provider
Business Challenge
• Affordably scale machine data from storage devices for customer support app
• Predict device failure
• Enhance product performance
Pentaho Benefits
• Easy to use ETL & analysis for Hadoop, HBase, & Oracle data sources
• 15x cost improvement
• Stronger performance against customer SLAs

Innovative Organizations Use Pentaho to Unlock Value from Big Data Stores
• Online Retailer: Understanding the buying patterns of 5 million users from click stream data stored in Hadoop & HBase
• Mobile & Digital Media: Embedded Pentaho to measure massive volumes of mobile and event data generated from mobile devices stored in MongoDB
• Gaming: Better monetization of premium game features through analyzing large volumes of player data stored in MongoDB & Infobright
• Travel & Entertainment: Helping thousands of travel partners like expedia.co.uk and thomascook.fr improve promotional targeting using HBase and Hadoop
• Social Commerce: Better campaign performance through monitoring social media, page clicks and email marketing data stored in HP Vertica
• Healthcare: Embedded Pentaho to better patient care & compliance through analysis of unstructured digital pen data stored in CouchDB

Pentaho Embedded Analytics
New Revenue Stream in Eight Weeks
Business Challenge
• Gain new revenue source from add-on module with reporting, analysis & dashboards
• Get to market fast to differentiate
Pentaho Benefits
• Easy to embed & brand
• Broad capabilities result in new revenue stream
• Increased functionality & compelling visualizations

Embedded Analytics
Pentaho Uniquely Positioned to Win
Why We Win in Embedded:
• Architectural ‘sweet spot’ for Pentaho platform
• Flexible pricing, adaptable to fit partner pricing
• Open source and innovation
• Fastest time-to-market for embedded analytics
Continued Leadership:
• Cloud & multi-tenancy ease-of-use
• Simplified REST services for ISVs
• BI Platform SDK enhancements: deep solution examples, tutorials and training
• Continued focus on standards and extensibility
Screenshots: Dashboard Designer, Dashboard Framework

Big Data Technologies
BI Strengths and Weaknesses

The Current Solutions
Current Database Solutions are designed for structured data.
• Optimized to answer known questions quickly
• Schemas dictate form/context
• Difficult to adapt to new data types and new questions
• Expensive at petabyte scale
Chart: GIGABYTES OF DATA CREATED (IN BILLIONS), 2005 to 2015, vertical axis 0 to 10,000, chart label 10%
Chart legend:
STRUCTURED DATA
UNSTRUCTURED DATA

Main Big Data Technologies
Hadoop
• Low cost, reliable scale-out architecture
• Distributed computing
• Proven success in Fortune 500 companies
• Exploding interest
NoSQL Databases
• Huge horizontal scaling and high availability
• Highly optimized for retrieval and appending
Types
• Document stores
• Key Value stores
• Graph databases
Analytic Databases (Analytic RDBMS)
• Optimized for bulk-load and fast aggregate query workloads
Types
• Column-oriented
• MPP
• In-memory

Hadoop Core Components
HADOOP DISTRIBUTED FILE SYSTEM (HDFS)
❯ Massive redundant storage across a commodity cluster
MAPREDUCE
❯ Map: distribute a computational problem across a cluster
❯ Reduce: Master node collects the answers to all the sub-problems and combines them
MANY DISTROS AVAILABLE

Major Hadoop Utilities
• Apache Pig: High-level language for expressing data analysis programs
• Apache Hive: SQL-like language and metadata repository
• Apache HBase: The Hadoop database. Random, real-time read/write access
• Hue: Browser-based desktop interface for interacting with Hadoop
• Apache Zookeeper: Highly reliable distributed coordination service
• Oozie: Server-based workflow engine for Hadoop activities
• Flume: Distributed service for collecting and aggregating log and event data
• Sqoop: Integrating Hadoop with RDBMS
• Apache Whirr: Library for running Hadoop in the cloud

Hadoop & Databases

Big Data Platform Challenges
“The working conditions can be shocking”
ETL Developer
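
For readers who want to see the Map and Reduce steps from the Hadoop Core Components slide in concrete form, here is a minimal single-process sketch in Python. It is an illustration only, not Hadoop or Pentaho code: the function names (map_phase, shuffle, reduce_phase, word_count) and the word-count task are assumptions chosen for the example, and a real cluster would run the map calls on many nodes in parallel rather than in a loop.

```python
from collections import defaultdict


def map_phase(chunk):
    """Map: turn one chunk of input into (word, 1) pairs.

    On a cluster, each chunk would be handled by a different node.
    """
    return [(word.lower(), 1) for word in chunk.split()]


def shuffle(pairs):
    """Group all emitted values by key so each key is reduced once."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Reduce: combine the partial answers for each key into one total."""
    return {key: sum(values) for key, values in grouped.items()}


def word_count(chunks):
    """Run map over every chunk, then shuffle and reduce the results."""
    mapped = []
    for chunk in chunks:  # hypothetical stand-in for parallel map tasks
        mapped.extend(map_phase(chunk))
    return reduce_phase(shuffle(mapped))


if __name__ == "__main__":
    sample_chunks = ["big data big insight", "data beats hype"]
    print(word_count(sample_chunks))
```

The split into three small functions mirrors the slide's wording: the problem is distributed as independent sub-problems (map), and the answers are then collected and combined (reduce). The shuffle step in between is what lets each key be combined exactly once.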

Challenges
1. Somewhat immature
2. Lack of tooling
3. Steep technical learning curve
4. Hiring qualified people
5. Availability of enterprise-ready products and tools
6. High latency (Hadoop)
7. Running inside the cluster

Challenges
• Ingestion / Manipulation / Integration
• Scheduling
• Modeling
WOULD YOU RATHER DO THIS? … OR THIS?

Investigating BI & Big Data Solutions

Questions to Ask
Business Drivers
1. Mandate to reduce EDW costs?
2. Clear use case that you need to solve?
3. Do you have access to the technical skill set?
Technical
1. Do you have more than one kind of big data store, for example Hadoop as well as HBase, MongoDB or Cassandra?
2. Would you prefer to use the same tool for big data stores in addition to your traditional relational data stores?
3. Are you OK waiting minutes or even hours to access your big data?
4. Are you OK using a spreadsheet-like interface to access and analyze your data?
5. Do you need complete BI capabilities, including reporting, interactive visualization, and predictive analytics?
6. Do you need to enrich your big data with data from outside of the big data platform?
7. Is the big data you want to analyze bigger than the amount of memory you have available?
http://blog.pentaho.com/tag/ian-fyfe/

Demo

Complete Big Data Analytics & Visual Data Management
• Data Ingestion, Manipulation, Integration
• Enterprise & Ad Hoc Reporting
• Data Discovery, Visualization
• Predictive Analytics
Pentaho Big Data Analytics
• Hadoop
• NoSQL
• Analytic Databases
• Relational

Open Discussion

Thank You
JOIN THE CONVERSATION. YOU CAN FIND US ON:
• blog.pentaho.com
• Facebook.com/Pentaho
• @Pentaho
• Pentaho Business Analytics