San Jose State University Creates ‘Data Wranglers’ in Partnership with Cloudera

Overview

San Jose State University (SJSU) is the oldest public institution of higher education on the United States’ West Coast. The school was founded in 1857 to train teachers for the “developing frontier.”[1] A lot has changed since then, but SJSU’s present-day tagline, “powering Silicon Valley,” demonstrates a consistent goal: to arm students with the knowledge and experience that will help them thrive in today’s science- and technology-driven market.

One of SJSU’s adjunct professors, Peter Zadrozny, educates students on big data analytics. Zadrozny brings a wealth of real-world experience to the courses he teaches, with a background as a software architect and developer at companies ranging from start-ups to the Fortune 500. His goal is to offer students hands-on experience with the big data technologies that hiring managers are looking for.

The Challenge

In describing his motivation to put together a practical big data course, Zadrozny explained, “I wanted to really teach students what big data is all about, and to help them get beyond the buzzword. Hiring managers want people that have gone through the steepest part of the learning curve. That’s the objective of what we’re trying to do here.” His goal is to create “data wranglers” – people who may not have deep domain expertise, but can apply technology and big data experience to a wide range of industry-specific challenges.

According to Zadrozny, there are two main areas that have driven today’s need for big data:

• The human digital footprint: Society’s movement from in-person activities, interactions, and transactions to the web has resulted in an exhaust of digital breadcrumbs that can be ingested and analyzed for better understanding of our needs, preferences, and behaviors. “We have Facebook, we email, we tweet, we check in with Foursquare – that is what I call the human digital footprint,” said Zadrozny.
• Machine data: The proliferation of mobile and digitally enabled devices also translates into big data that can be ingested and acted upon for better manufacturing, supply chain optimization, and other operational processes. Zadrozny explained, “All the servers, data centers, cloud services – those machines are producing a ton of log files that have to be processed.”

CUSTOMER SUCCESS STORY

Key Highlights

Industry
• Education

Location
• San Jose, CA, USA

Objectives
• Combine lecture-based education with practical experience
• Maximize marketability of students

Technologies In Use
• Hadoop Platform: Cloudera University License
• Hadoop Components: Cloudera Manager, Hive
• Server: GoGrid
• Analytic Tool: Splunk

The Solution

In developing the curriculum for SJSU’s Big Data Analytics course, Zadrozny decided the logical approach would be to teach Apache Hadoop and Splunk. Within the Hadoop curriculum, students learn Hive, which lets them apply existing analytical skills, including SQL, to the big data sets at the core of the emerging data economy. A key part of the course is having students deliver a big data project that demonstrates they know how to work with the tools.

In addition to partnering with Cloudera and Splunk, SJSU has established a partnership with GoGrid to give students a cloud-based environment on which to build their big data projects.

Zadrozny led SJSU’s participation in the Cloudera Academic Partnership (CAP) program to streamline and accelerate the Hadoop curriculum development. He noted, “When people think Hadoop, they think Cloudera. I have to give students something that makes them marketable.
If I don’t teach Hadoop on Cloudera, their chances of getting a job are slimmer.”

As part of the CAP program, Cloudera provides SJSU with:

• Course materials along with exercises that allow students to correlate theory with hands-on experience
• Discounts to Cloudera University training and certification exams
• A 12-month University License to Cloudera Enterprise for the university’s research staff
• Unlimited access to Cloudera Express or the Cloudera QuickStart VM for both professors and students

For their projects, students gain experience working with live data from sources such as the Federal Aviation Administration, Foursquare, IMDb, Twitter, and Yelp. They learn how to set up a Hadoop cluster, load data, query it using Hive, verify that their queries are running properly, and then visualize and communicate the results of their analyses.

“We encourage students to tell a story with the data,” explained Zadrozny. “As you start digging into it, you find interesting things, unusual facts, things that you wouldn’t have anticipated or that are historically relevant.”

Impact: Improved Marketability Through Practical Education

The Big Data Analytics course at SJSU is very popular, largely due to its integration of hands-on exercises. “The support of Cloudera with the Cloudera Academic Partnership has been incredible,” commented Zadrozny. “It provides slides and courseware that we can use for teaching the theory, and also allows students to get the experience they need. It’s not only theoretical; it is also very practical.”

“Whenever I go to job fairs, if I have something on my resume about Hive, Hadoop, or big data, that’s what hiring managers ask about,” said Tanuvir Singh, a student of the course pursuing his master’s degree in computer science.

Another student, Jaideep Katkar, reflected, “Before taking this course, the only thing I knew about big data was the three Vs.
After the course, I know all about big data – how Hadoop handles it, how to use Hive, all of its features. Big data is a hot topic right now and it will be very useful in my job.”

The students particularly value having access to Cloudera Manager, which simplifies cluster configuration and administration. “If we had done it manually, it would have taken a very long time. But with Cloudera Manager, it was very quick – it didn’t take more than five minutes to set up our cluster,” said graduate student Nikitha Ganesh.

Rohit Vobbilisetty, another Big Data Analytics student, enthusiastically summarized, “Hadoop and Hive have great importance in my resume. It’s great that I learned this.”

About Cloudera

Cloudera is revolutionizing enterprise data management by offering the first unified Platform for Big Data, an enterprise data hub built on Apache Hadoop. Cloudera offers enterprises one place to store, process, and analyze all their data, empowering them to extend the value of existing investments while enabling fundamental new ways to derive value from their data. Only Cloudera offers everything needed on a journey to an enterprise data hub, including software for business-critical data challenges such as storage, access, management, analysis, security, and search. As the leading educator of Hadoop professionals, Cloudera has trained over 40,000 individuals worldwide. Over 1,400 partners and a seasoned professional services team help deliver faster time to value. Finally, only Cloudera provides proactive and predictive support to run an enterprise data hub with confidence.
Leading organizations in every industry, plus top public sector organizations globally, run Cloudera in production.

www.cloudera.com
1-888-789-1488 or 1-650-362-0488
Cloudera, Inc., 1001 Page Mill Road, Palo Alto, CA 94304, USA

© 2015 Cloudera, Inc. All rights reserved. Cloudera and the Cloudera logo are trademarks or registered trademarks of Cloudera Inc. in the USA and other countries. All other trademarks are the property of their respective companies. Information is subject to change without notice.