BigData - NoSQL
Hadoop - Couchbase

Tugdual “Tug” Grall
Technical Evangelist
email: tug@couchbase.com
twitter: @tgrall

About me
• Tugdual “Tug” Grall
  - Couchbase - Technical Evangelist
  - eXo - CTO
  - Oracle - Developer/Product Manager
  - Mainly Java/SOA
  - Developer in consulting firms
• Web
  - @tgrall
  - http://blog.grallandco.com
  - tgrall
• NantesJUG co-founder
• Pet Project: http://www.resultri.com

$30B Database Market Being Disrupted
[Chart: 2012: Relational Technology 95%, Other. 2027: Relational Technology <50%?, NoSQL Technology.]
All new database growth will be NoSQL

Operational vs. Analytic Databases
• Real-time, Interactive Databases (NoSQL): fast access to data
  - Couchbase, MongoDB, Cassandra, HBase
• Analytic Databases: get insights from data
  - Cloudera, Hortonworks, MapR

What Is Biggest Data Management Problem Driving Use of NoSQL in Coming Year?
• Lack of flexibility/rigid schemas: 49%
• Inability to scale out data: 35%
• Performance challenges: 29%
• Cost: 16%
• All of these: 12%
• Other: 11%
Source: Couchbase Survey, December 2011, n = 1351.

Hadoop & NoSQL

What is Sqoop?
Sqoop is a tool designed to transfer data between Hadoop and relational databases. You can use Sqoop to import data from a relational database management system (RDBMS) such as MySQL or Oracle into the Hadoop Distributed File System (HDFS), transform the data in Hadoop MapReduce, and then export the data back into an RDBMS.
sqoop.apache.org

What is Sqoop? Traditional ETL
[Diagram: Application, Data, T (transform), Data]

What is Sqoop? A different paradigm
[Diagram: Application, Data; Data]

What is Sqoop? A very scalable different paradigm
[Diagram: Application, Data (x3); Data]

What is Sqoop? Where did the Transform go?
[Diagram: Application, Data; four groups of TTT]

Sqoop Details
• Sqoop
• Default connection is via JDBC
• Lots of custom connectors
  - Couchbase, VoltDB, Vertica
  - Teradata, Netezza
  - Oracle, MySQL, Postgres

Ad and offer targeting
40 milliseconds to respond with the decision.
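
Returning to the Sqoop slides: the import and export round trip they describe (RDBMS to HDFS, then back to an RDBMS) can be sketched as two command lines. This is a minimal sketch, not a reference invocation. The host `db.example.com`, database `shop`, table names, HDFS paths, and user `etl_user` are all hypothetical, and authentication options are left out. The code only builds and prints the commands; it does not run Sqoop.

```python
import shlex


def sqoop_import_args(jdbc_url, table, target_dir, username, num_mappers=4):
    """Build the argument list for importing one RDBMS table into HDFS."""
    return [
        "sqoop", "import",
        "--connect", jdbc_url,        # default connection is via JDBC
        "--username", username,
        "--table", table,             # source table in the RDBMS
        "--target-dir", target_dir,   # destination directory in HDFS
        "--num-mappers", str(num_mappers),  # parallel map tasks for the copy
    ]


def sqoop_export_args(jdbc_url, table, export_dir, username):
    """Build the argument list for exporting HDFS files back into a table."""
    return [
        "sqoop", "export",
        "--connect", jdbc_url,
        "--username", username,
        "--table", table,             # destination table in the RDBMS
        "--export-dir", export_dir,   # HDFS directory holding the results
    ]


if __name__ == "__main__":
    # Hypothetical connection details, for illustration only.
    url = "jdbc:mysql://db.example.com/shop"
    print(shlex.join(sqoop_import_args(url, "orders", "/data/orders", "etl_user")))
    print(shlex.join(sqoop_export_args(url, "order_summary", "/data/order_summary", "etl_user")))
```

The transform step is absent from both commands on purpose: in this paradigm it runs as MapReduce jobs inside Hadoop between the import and the export.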
[Diagram: flows numbered 1 to 3, labeled "events", "profiles, campaigns", and "profiles, real time campaign statistics"]

Moving Parts

Content and Recommendation Targeting

Content Driven Site: Moving Parts

Couchbase

Couchbase Server Core Principles
• Easy Scalability: Grow cluster without application changes, without downtime, with a single click
• Always On 24x365: No downtime for software upgrades, hardware maintenance, etc.
• Consistent High Performance: Consistent sub-millisecond read and write response times with consistent high throughput
• Flexible Data Model: JSON document model with no fixed schema.

Couchbase Handles Real World Scale

Q&A
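
To make the "Flexible Data Model" principle concrete, here is a minimal sketch of what "JSON document model with no fixed schema" means in practice. It does not use the Couchbase SDK: a plain Python dict stands in for a key-value document store, and the key format, field names, and helper functions are hypothetical.

```python
import json

# A plain dict stands in for a key-value document store (an assumption
# made for illustration; this is not a Couchbase client).
bucket = {}


def put(key, doc):
    """Store a document as a JSON string under the given key."""
    bucket[key] = json.dumps(doc)


def get(key):
    """Load the JSON document stored under the given key."""
    return json.loads(bucket[key])


# Two documents of the same logical type with different sets of fields.
# No schema change is needed to add "interests" or "last_login".
put("user::1", {"type": "user", "name": "Ana", "email": "ana@example.com"})
put("user::2", {
    "type": "user",
    "name": "Ben",
    "interests": ["hadoop", "nosql"],
    "last_login": "2012-06-01",
})


def interests_of(key):
    """Read an optional field, falling back to an empty list when absent."""
    return get(key).get("interests", [])
```

The trade-off is that the application, not the database, decides how to handle a missing field, which is why `interests_of` supplies its own default.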