Wrangling Customer Usage Data with Hadoop
Clearwire – Thursday, June 27th
Carmen Hall – IT Director
Mathew Johnson – Sr. IT Manager

Starting With…
• …a little ingenuITy!

ingenuITy Day @ Clearwire
• Opportunity for everyone in IT to innovate and present new and even crazy ideas
• One of those crazy ideas was from Roger Hosto
• Roger had the solution for Clearwire’s Big Data problem: Hadoop

But Wait!
• Now we had a solution for Big Data
• We needed a Big Data opportunity
• We had just the thing…

The Perfect Problem
• Customer Usage Data – our commodity to Wholesale partners

Totally (un)Wired
• Americans used more than 1,304 petabytes of wireless data in 2012, an increase of 69.3% over the previous 12 months' usage (827 PB)
• Clearwire processes over 3B individual usage detail records each month

Shifting Landscape
• The U.S. wireless industry is a $195.5 billion enterprise, larger than publishing, agriculture, hotels and lodging, air transportation and movies, just to name a few
• Prepaid/Pay-As-You-Go services' share of overall market penetration is 23.4%, driving higher exposure to lost revenue if usage delivery is delayed
• In some cases, a customer can consume data faster than we can bill for it

Anatomy Of Latency - Legacy
[Diagram: components AAA, ASN GW, PTS, SPB, Internet, OSS, SDU, IT Usage Processing, Wholesale Partners; latency labels: 1 Hour, Up to 90 Minutes]

Let’s Talk Numbers
• Assume a 2GB plan
• An HD movie from Netflix consumes 2+ GB per hour
• Assume wholesale price = $6/GB
• Assume the retail price for a GB of data (as top up or overage) ranges from $20 – $100

As if that wasn’t enough
• Clearwire was locked into a very expensive vendor contract which handled both network provisioning and usage delivery needs
• Legacy solution was not adaptable or flexible
• We needed something innovative, reliable, internally supportable, scalable – and we needed it fast

Putting ingenuITy to Work!
• Roger’s idea was suddenly a project
• We needed to build a platform to ingest, process, and provide cleaned usage data for downstream applications – and quickly
• We needed:
  • A Hadoop Cluster
  • 24x7 Operations
  • Code to ingest data and handle a myriad of business rules
  • Integration with legacy and new systems

Atlas was Born
• Development work began immediately on Clearwire’s private cloud infrastructure
• Selected BigTop Packaging of Apache Hadoop v1.0.1
• Custom code leveraging Hive and other common tools to ingest and process data was written
• Infrastructure was built

Hybrid Approach to Hadoop
• Virtual Edge Nodes
  • Leveraged our existing private cloud
• Physical Data Nodes
  • Per Unit Cost (Storage & CPU) was lower than existing infrastructure
• Smaller and more efficient than you think
  • 24 data nodes, each with 3TB of usable storage
  • Gives us 72TB of usable space
  • 3x block replication for production data
• Deployed identical DR/Analytics platform

Operational in No Time
• 2.5 months from project approval to production
• Leveraged our existing support organizations
• Solution leveraged common tools, did not require specialized teams
• Fault tolerance inherent within Hadoop helps us minimize late night calls
• An endless supply of data was quickly flowing through the system
• The results were looking good!
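
Editor's note on the cluster sizing from the Hybrid Approach slide: the arithmetic behind the 72TB figure, and what 3x block replication implies for unique data, can be sketched as below. This is only an illustration. The function and constant names are hypothetical, and it assumes the 72TB is measured before replication, which the slides do not state.

```python
# Illustrative sizing sketch; node count, per-node storage, and replication
# factor are taken from the slide. All names here are hypothetical.
DATA_NODES = 24
USABLE_TB_PER_NODE = 3
REPLICATION_FACTOR = 3


def cluster_capacity_tb(nodes: int, tb_per_node: float) -> float:
    """Total usable storage across all data nodes."""
    return nodes * tb_per_node


def unique_data_capacity_tb(total_tb: float, replication: int) -> float:
    """Approximate unique data that fits when every block is stored
    `replication` times. ASSUMPTION: total_tb is a pre-replication figure."""
    return total_tb / replication


total_tb = cluster_capacity_tb(DATA_NODES, USABLE_TB_PER_NODE)
unique_tb = unique_data_capacity_tb(total_tb, REPLICATION_FACTOR)

print(f"Total usable space: {total_tb} TB")
print(f"Approximate unique data at {REPLICATION_FACTOR}x replication: {unique_tb} TB")
```

Under that assumption, roughly a third of the usable space holds unique production data; the remainder is the redundant copies that give the cluster its fault tolerance.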
Real Results
• 65% improvement in end-to-end delivery times
  • From 2.5 hours to 1.3 hours
• Reduced catch-up time from upstream outages by more than half
• Reduced outage impacts by introducing flexibility to deliver partial files
• Eliminated 4-hour weekly usage delivery outages tied to provisioning system maintenance

Anatomy of Latency - Now
[Diagram: components AAA, ASN GW, PTS, SPB, Internet, OSS, SDU, Atlas, Medusa, Wholesale Partners; latency labels: 1 Hour, Average of 15 Minutes, ~6 Minutes, ~9 Minutes]

Real (Financial) Results
• 6-month return on investment
• Delivered at 1/3 the cost of competing solutions
• Foundational – Enabling Wholesale support plan of legacy platform migration
• Saving Clearwire tens of millions of dollars over life of contract and internalizing support and development

The Intangibles
• Proved to internal and external partners that we deliver what we promise with limited negative impacts to ongoing business
  • This was KEY to the speed at which we were able to migrate our billing platform
• Delivered more than just a single, targeted process – delivered an enterprise usage platform to grow from
• Kept true to our innovative spirit and the commitment to IT professionals that they can make a difference

Evolution – Proving More
The Atlas Hadoop platform is now a go-to IT solution
• LTE Usage Data – Now in production
• Other Data Sources – ESR Data
• Data Replication and real-time ETL
• Exploring opportunities with network team to move closer to usage generation
• Changing mindset of what IT can mean to an organization

Q&A