Tubular Labs Accelerates YouTube Audience Growth with Cloudera-powered Enterprise Data Hub

Company Overview

Every day, 1.2 million videos are uploaded and more than three billion videos are viewed on Google-owned YouTube, one of the most recognized and popular video-sharing websites on the planet. Its global reach and recent emphasis on channel development make YouTube an appealing vehicle for brands and content publishers to reach relevant audiences worldwide. Finding those audiences in the massive pool of YouTube viewers can be a challenge, but Tubular Labs is changing that with an enterprise data hub (EDH) powered by Cloudera. The real-time insights delivered by Tubular Labs’ platform, via Impala in particular, reveal who is engaging with the videos, when they are engaging, and how they are sharing them – empowering upwards of 2900 content creators and brand marketers including AwesomenessTV, Jamie Oliver, Maker Studios, Seventeen Magazine, and Vice to grow and engage relevant audiences like never before.

The Challenge

Before starting Tubular Labs with former YouTube analyst Allison Stern in June of 2012, founder and CEO Rob Gabel was an executive at Machinima, one of the first corporate entities that published content in the YouTube space – in Machinima’s case, for the gaming community. “Millions of gamers watch uploads of walk-throughs, players, news, and clips of game play on YouTube on demand, typically following a favorite game, such as Call of Duty or Halo,” said Gabel.

The data generated by this YouTube activity could be mined for audience insights that would help Machinima develop better products and build even more momentum within its online community, but tools to help Machinima understand and grow its YouTube following were lacking. YouTube video “success,” measured in views, didn’t tell the full story. That critical need became the genesis of Tubular Labs.
When Gabel and team began evaluating technologies for the architecture that would support their business, they identified several key requirements that would need to be addressed. The most critical of these: to deliver real-time reporting performance over enormous data sets. Other key requirements included:
• Ingesting data from multiple sources that accumulate hundreds of millions of data points each day
• Writing all of that information and reconstituting it into something actionable
• Executing very complex queries of this big data set on the fly
• Scaling with rapidly growing data volumes

CUSTOMER SUCCESS STORY

Key Highlights
Industry
• Digital Media
Locations
• Mountain View, CA, USA
Business Application Supported
• Enterprise data hub empowering YouTube audience insights in real time
Impact
• 56% faster YouTube subscription growth among Tubular customers; 52% faster growth in YouTube views
• Sub-second ad hoc query response via Cloudera Impala
Technologies in Use
• Hadoop Platform: Cloudera Enterprise
• Hadoop Components: Cloudera Impala, Cloudera Manager, HDFS, Hive, MapReduce
• Servers: Amazon Web Services EC2
Big Data Scale
• Millions of data points ingested daily from dozens of data sources

Gabel explained, “Questions about the data change constantly. ‘Which channels have audiences that are 80 percent male?’ ‘Which videos for the latest blockbuster film are getting the most Facebook likes and shares?’ We wanted to enable very complex queries that could answer any kind of question on-the-fly, even with our large volume of data. And we wanted to return those results in under a minute.”

Tubular’s search for technology solutions meeting these criteria surfaced few contenders. The team knew that traditional database systems wouldn’t provide the performance at scale that they’d need, especially on the limited budget of a start-up, so they looked into emerging open source technologies.
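To make the kind of ad hoc question Gabel describes more concrete, the sketch below answers “Which channels have audiences that are 80 percent male?” as a plain SQL filter. It is only an illustration: it runs against an in-memory SQLite database as a stand-in rather than Impala, and the table name `channel_audience`, its columns, and the sample rows are all hypothetical, not Tubular’s actual schema or data.

```python
import sqlite3

# Hypothetical sample rows (channel id, male audience share, subscribers),
# invented purely for illustration.
ROWS = [
    ("channel_a", 0.85, 120_000),
    ("channel_b", 0.42, 950_000),
    ("channel_c", 0.80, 30_000),
    ("channel_d", 0.79, 610_000),
]


def build_demo_db():
    """Create an in-memory SQLite database holding the hypothetical table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE channel_audience ("
        "channel_id TEXT PRIMARY KEY, male_share REAL, subscribers INTEGER)"
    )
    conn.executemany("INSERT INTO channel_audience VALUES (?, ?, ?)", ROWS)
    return conn


def channels_with_male_share(conn, min_share):
    """Return channel ids whose male audience share is at least min_share,
    largest channels first."""
    cur = conn.execute(
        "SELECT channel_id FROM channel_audience "
        "WHERE male_share >= ? ORDER BY subscribers DESC",
        (min_share,),
    )
    return [row[0] for row in cur.fetchall()]


if __name__ == "__main__":
    conn = build_demo_db()
    # Channels whose audience is at least 80 percent male.
    print(channels_with_male_share(conn, 0.80))
```

Because the threshold is passed as a query parameter, the same function covers variations of the question (70 percent, 90 percent, and so on) without rewriting the query, which is the flavor of “questions change constantly” that the requirements describe.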
Dave Koblas, Tubular’s vice president of engineering, said, “We considered many solutions ranging from Twitter Storm running on Cassandra to some very elaborate MySQL set-ups. Our original prototypes ran on a MySQL-based platform, but we quickly realized that it wasn’t a sustainable approach, and couldn’t support the required performance, analytic flexibility, or scale that we needed.”

With its enterprise data hub approach, Cloudera offered a centralized big data solution based on Apache Hadoop that uniquely addressed Tubular’s needs for an infrastructure that would deliver performance and analytic flexibility at scale, integrating tools such as Impala for real-time SQL-on-Hadoop queries and Cloudera Manager for automated system management. “When we looked at Cloudera Impala and the Apache Hadoop system, it was very clear that these would grow with us and meet our needs over the long haul,” said Koblas.

The company evaluated other SQL-on-Hadoop options too, but found that most of them were designed to take queries from running in minutes to tens of seconds, and their management tools were sub-par. Tubular’s goal was to deliver results in less than two seconds - to truly deliver a real-time platform with an interactive user interface (UI) that would ensure the best customer experience. “Impala was one of the only choices that was trying to deliver on that mission statement, as opposed to just making [Apache] Hive faster,” explained Koblas.

Tubular adopted Impala before it was released for general availability (GA). Koblas and team knew they’d be taking a bet on such an early technology, but having Cloudera’s enterprise-grade support behind the technology gave them confidence. “It was a risk and it has delivered in spades.”

Solution

In August 2013 - just weeks past the one-year anniversary of its founding - Tubular went into public beta with its AudienceGraph technology.
AudienceGraph relies on its EDH to track and synthesize billions of YouTube video views, comments, and related “events” as well as social engagements including tweets, comments, and likes. And eight months later, in April 2014, Tubular launched its Tubular Intelligence solution, delivering key enhancements built for enterprises.

The Tubular team continues to engage with Cloudera Proactive Support regularly to ensure fast, ongoing innovation and to make sure they’re getting the most out of their platform. The team relies on Cloudera Manager to manage the Hadoop system. “My experience with Cloudera Support has been really amazing,” Koblas added, “and if it wasn’t for Cloudera Manager, I think I would have pulled out all of my hair.”

“Tubular exists to put information recommendations in the hands of our customers,” said Gabel. Most of Tubular’s clients fall into one of two categories: content marketers and content publishers.
• Content marketers are selling a brand - a beauty product, an energy drink, or a deodorant for instance - and engage YouTube audiences with online advertising. “The ads’ viewers can leave comments and can share YouTube content across social networks,” explained Gabel. “Effective marketers participate in the conversation by replying to viewer comments or tweets. These conversations build rapport and positive sentiment between the brand and consumers, while helping to spread the word about the brand. But they can also deliver important insights. Our platform helps content marketers understand, for example, how a product is being received in the market, and by whom.”
• Content publishers include Machinima, the gaming publisher, and AwesomenessTV, one of Tubular’s first customers. “These are the new media companies with on-demand ‘channels’ on YouTube,” said Gabel.
“Not everyone can be a CBS, but companies are making high-quality content that’s engaging, and that entertains and informs people. They can use YouTube to distribute themselves as an on-demand channel instantly to 150 countries. It’s a disruptive model.”

For both content marketers and content publishers, the hardest part of YouTube is getting seen by relevant audiences. That’s where Tubular comes in. A YouTube-certified company and partner, Tubular uses APIs to ingest billions of social actions from YouTube, as well as from Twitter, web crawls, and dozens of other publicly available data sources. Koblas explained, “We normalize all of the data, feed it through the appropriate queuing systems, then load it into Cloudera and perform further manipulations with Hive scripts to make the data available for real-time analytics using Impala.”

The company’s AudienceGraph solution was released at approximately the same time that Cloudera launched Impala 1.0. Tubular enlisted Cloudera Professional Services to tune the speed, stability, and response time of its Impala installation. “It was fantastic,” said Gabel, “and gave our customers the responsive interface they were looking for.”

Today, thousands of Tubular customers use Impala to access, query, and explore synthesized, live information in real time. The company’s SaaS-based platform is hosted on the Amazon EC2 cloud. “For the Impala real-time nodes,” Koblas clarified, “we’re using dedicated, high-storage two-terabyte (TB) Amazon SSD machines which further enhance our ability to give real-time responses.”

The data is presented in both visual and tabular formats in a dashboard interface. Using real-time channel views, customers can see numbers of views and engagements, as well as number of subscribers. And they can run elaborate queries. They can see if anyone influential has been commenting on their videos. They can search their entire backlog for questions. “If they’re an enterprise customer,” said Gabel, “they can run complex queries to search hundreds of millions of videos or millions of channels to find, for example, the fastest-growing electronic dance music artists in Sweden that have not been signed to a YouTube network.” And they can get that information in seconds.

Figure: Sample Tubular Dashboard

Impact: Smarter Content, Faster Growth

The future of media, said Gabel, is about content, distribution, and information. “Tubular is here to help content marketers and content publishers leverage the big data element to gain competitive intelligence - and deeper insights - that will help them identify, grow, and engage passionate communities of followers on YouTube more effectively in the future.”

“Video is dominating the new world of content and YouTube presents a massive big data opportunity for publishers and content marketers to capitalize on, allowing them to better reach and understand their audiences,” said Doug Cutting, co-founder of Hadoop and chief architect at Cloudera.
“Tubular employs cutting-edge big data analytics to provide new insights, and we’re proud to be the platform empowering their enterprise data hub.”

In one case, AwesomenessTV was launching a brand new YouTube channel and wanted to attract and understand its teen audiences. The series turned to Tubular to generate an audience graph that helped answer questions like, “What types of content are teens interested in? What are their behaviors and interests?” Tubular was able to provide lists of the channel’s fans, super fans, and most influential fans that could be leveraged for outreach and activation. Later, when AwesomenessTV lost one of its show hosts, Tubular was able to recommend replacements - one of which was hired. AwesomenessTV soon reached a million subscribers on YouTube, and in May 2013 was purchased by DreamWorks Animation for US$33 million upfront, with additional payments of up to $117 million.

Overall, Tubular customers are growing their YouTube subscriber base 56% faster than non-Tubular users, and are growing their views 52% faster. And, it seems, the startup has created its own loyal following. One survey indicated that 72% of Tubular customers have referred the company to a friend. “We haven't spent a dollar on paid media. And beyond speaking at a few conferences, we haven't done any marketing. But we already have 2900 customers, including dozens of large enterprises,” noted Gabel. “So, it's working out.”

About Cloudera

Cloudera is revolutionizing enterprise data management by offering the first unified Platform for Big Data, an enterprise data hub built on Apache Hadoop. Cloudera offers enterprises one place to store, process and analyze all their data, empowering them to extend the value of existing investments while enabling fundamental new ways to derive value from their data.
Only Cloudera offers everything needed on a journey to an enterprise data hub, including software for business critical data challenges such as storage, access, management, analysis, security and search. As the leading educator of Hadoop professionals, Cloudera has trained over 40,000 individuals worldwide. Over 1,400 partners and a seasoned professional services team help deliver faster time to value. Finally, only Cloudera provides proactive and predictive support to run an enterprise data hub with confidence. Leading organizations in every industry plus top public sector organizations globally run Cloudera in production. www.cloudera.com

1-888-789-1488 or 1-650-362-0488
Cloudera, Inc. 1001 Page Mill Road, Palo Alto, CA 94304, USA

© 2015 Cloudera, Inc. All rights reserved. Cloudera and the Cloudera logo are trademarks or registered trademarks of Cloudera Inc. in the USA and other countries. All other trademarks are the property of their respective companies. Information is subject to change without notice.

cloudera-casestudy-tubular-labs-102