Hadoop Distribution Comparison

©2013 OpalSoft Big Data

The following slides compare the Hadoop distributions of five prominent big data companies in the market today across various aspects:

• Cloudera
• Hortonworks
• MapR
• Amazon EMR
• Intel Hadoop

Setup Flexibility:

• Cloudera: Easy setup using Cloudera Manager.
• Hortonworks: HDP Installer (Ambari). Requires a few different databases.
• MapR: Easy setup installer.
• Amazon EMR: Easy, via the Management Console.
• Intel Hadoop: Yes, via Intel Manager.

Security Infrastructure:

• Cloudera: Role-based authorization. No information about data-at-rest security; Hadoop itself supports encryption.
• Hortonworks: SSL-based authentication between client and server machines. No role-based access control, data classification, or compliance management.
• MapR: No mention of data classification or compliance management. Restrictions can be applied at the volume level to prevent unauthorized access to files.
• Amazon EMR: Amazon Virtual Private Cloud; IAM tools for access control by users and roles.
• Intel Hadoop: Encryption and decryption using Intel AES-NI.

Learning Support:

• Cloudera: Well documented; provides training and certifications.
• Hortonworks: Well documented; provides training and certifications.
• MapR: Well documented.
• Amazon EMR: Well documented, with tutorials.
• Intel Hadoop: Training provided by Intel.

Operation Tools:

• Cloudera: Web-based GUI, Cloudera Navigator (Enterprise subscription), Cloudera Manager (Enterprise subscription).
• Hortonworks: Ganglia, Nagios.
• MapR: Yes.
• Amazon EMR: Ganglia and other Amazon cloud monitoring facilities.
• Intel Hadoop: Intel Manager.

Operation Support:

• Cloudera: Professional support for production environments through Cloudera Enterprise Support. POC and development support is available too.
• Hortonworks: Professional support provided at 3 levels: Developer, Standard, Enterprise.
• MapR: 24/7 support, plus professional services for POC and implementation.
• Amazon EMR: 24/7.
• Intel Hadoop: 24/7.

Market Share:

• Cloudera: Used by many leading organisations.
• Hortonworks: Used by a few leading companies, though not as many as Cloudera.
• MapR: Used by many companies in the commercial, finance, and government sectors. Tested partner of Amazon AWS and Google Compute Engine.
• Amazon EMR: Widely used.
• Intel Hadoop: Not much information about customers. # customers mentioned on the website.

Developer/People Availability:

• Cloudera: Good.
• Hortonworks: Fair.
• MapR: No info.
• Amazon EMR: No info.
• Intel Hadoop: No info.

Editions Available:

• Cloudera: Standard, Enterprise.
• Hortonworks: Windows, HDP2, Sandbox.
• MapR: Comes with 3 support levels: M3, M5, M7.
• Amazon EMR: EC2; many data store options are available.
• Intel Hadoop: One.

Integration with BI Tools:

• Cloudera: Cloudera-developed connectors for BI tools: MicroStrategy, Netezza, Oracle, QlikView, Tableau, Teradata.
• Hortonworks: Basic ODBC drivers for BI integration.
• MapR: Well integrated with BI tools using JDBC, ODBC, NFS-based interfaces, and Hadoop interfaces.
• Amazon EMR: Can be used with BI tools.
• Intel Hadoop: No info.

Performance:

• Cloudera: No particular performance advantage.
• Hortonworks: No particular performance advantage.
• MapR: Faster than any other Hadoop distribution because of its native NFS-based file system, which leads to lower cost. HA for the JobTracker.
• Amazon EMR: HA for the JobTracker; faster.
• Intel Hadoop: Faster than a normal Hadoop setup, as the hardware and storage are all tuned for Hadoop.

Supported OS:

• Cloudera: RHEL, CentOS, SLES, Debian, Ubuntu, Oracle Enterprise Linux.
• Hortonworks: RHEL, CentOS, SLES, Ubuntu, Windows (the only distribution available for Windows).
• MapR: No info published.
• Amazon EMR: Amazon OS.
• Intel Hadoop: No info published.

Professional Services:

• Cloudera: Cluster certification, ETL pilot, analytics pilot, production readiness.
• Hortonworks: Yes.
• MapR: Yes.
• Amazon EMR: Yes.
• Intel Hadoop: Yes.