ANALYTICS AND BIG DATA

Philip Kim
Senior Director, Big Data and Analytics
UNDER ARMOUR®
pkim@underarmour.com

Overview
• OPPORTUNITY
• CENTER THE VISION
• TECHNICAL ARCHITECTURE
• USER STORIES
• ENGAGEMENT MODEL
• DESIGN TEAM STRUCTURE

SEIZING OPPORTUNITY … CROSSING THE BIG DATA CHASM

BIG DATA CHASM
• 70% of data generated by customers
• 80% of data stored
• 3% prepared for analysis
• 0.5% being analyzed
• <0.5% being operationalized
Source: Gartner Group

My basic chasm plan:
1. Create shared vision
2. Build fast & cheap
3. Deliver quick wins

UA’s VISION TO LEVERAGE BIG DATA
Center vision around the Customer/Athlete
[Diagram labels: CRM, ECOMM, ERP, 3RD PARTY, SOCIAL, RETAIL, WHOLESALE, MARKETING, PRODUCT, BRAND]
CREATE AUTHENTIC CONNECTIONS
AUGMENT PRODUCT INNOVATION
OPTIMIZE OPERATIONS
Distill real time data into impact on customer relationship across:
• Business
• Products
• Channel
• Geography
Enable actionable multi-channel customer engagement
Store everything to create a life time of value to the customer

UA TECH SLIDE
[Architecture diagram; labels listed in extraction order, grouping not recovered]
ANALYZE & ACT
• Single Sign-On
• Hi-Performance Cache / RT engines
• ETL and visualization API’s
• Analytics & Visualization IDE
• Retail
• Low latency data retrieval
• 3rd party data
• Wholesale
• Big data tools / processing API
• Cleanse & join new data models
• Social
CAPTURE
• Hadoop clusters
• HDFS in the cloud
• Master Data & Meta Data
STORE & PROCESS
MANAGE OPERATIONS
DATA ENGINEERS
SCRUM MASTER
BUSINESS USERS
DATA SCIENTISTS

EX. HARNESSING SOCIAL CONNECTIONS & DATA
[Customer timeline graphic; labels listed in extraction order]
• Brand House Purchase: 20Dec14
• Time in Store
• Last Login: 17Feb15 @1PM
• $9.99
• Shared Tweet: 4Jan15
• Updated Run & Shared with Personal Trainer: 4Jan15
• Last Login: 17Feb15 @11AM
• Loyalty Points
• $44.99
• $59.99
• Online Purchase: 1Feb15
• Products visited
• Gift: 22Dec14

EX. story #1 – Retail visualization

User story:
• As a retail analyst, I need to perform time series analysis to establish expected variation of actuals vs forecast so I can deep dive into the top / significant outliers and save 10 hours/week

Data & transformation:
• Create mockup of visualization
• Ingest transactional data
• Stage the data in HDFS
• Perform regression to normalize data prior to visualization

Aggregate data test:
• Ingest data from <start> to <end>
• Expected range of transactions ~50 million records
• ID & clean bad data algorithmically
• Verify & ID seasonality – adjust for time
• Validate time series patterns with analyst

Analytic questions:
1. What is the performance over time?
2. What are the key drivers or predictors of performance?
3. Can we use this model to reliably forecast performance?

ENGAGEMENT MODEL

User stories – examples ONLY method:
1. As Senior Mgr of Allocation, I need to forecast store sales by size so that I can allocate inventory more accurately and decrease inventory holding cost by $xxM
2. As a retail analyst, I need to perform time series analysis to establish expected variation of actuals vs forecast so I can deep dive into the top / significant outliers and save 10 hours/week
3. As the BD analyst, I need a shareable visualization of retail performance to recommend workforce planning with no impact on retail gross sales
4. As the strategic manager, I need to map existing store sales and extrapolate new store sales so that I can identify microsegmented markets and increase my gross revenue / sq. foot
5. As the supply chain VP, I need to forecast demand versus factory deliveries so I can reduce my days of inventory by $xx/Y
6. ………

PUT POINTS ON THE BOARD
COLLECT TO PRIORITIZE
[Chart: user stories 1 to 7 plotted by BUSINESS IMPACT (None to $50M) against EFFORT (Easy to Difficult) and grouped into PHASE 1 to PHASE 4]

TRANSFER TO A ROADMAP
[Roadmap graphic spanning Phase 1 to Phase 4; items listed in extraction order, phase assignment not recovered]
• Analytics & Visualization
• Time series for retail analytics
• Forecast inventory by customer size
• SC demand forecast
• Shareable visualization of retail
• Map existing store sales and new store sales
• Capacity (label appears four times)

Phase 1
• Phase 1 projects are easy problems with big benefits
• ID champions with appetite for change
• Timebox projects; iterate fast; minimal products!
Tip:
• Use Agile methodology

Phase 2
• Phase 2 projects are important and hard … reserve for your top talent!!!
• Larger teams; capital investments xx > $MM and payoffs xxx > $MM
Tip:
• LEAN before digitize

Phase 3 and Phase 4
• Phase 3 projects are medium
• Reduce friction in bulk with architecture … i.e. shift all projects to the easy axis by leveraging tech
• Phase 4 projects are the fillers for other phases or backlog when resources are available
Tip:
• Tech shifts are next year’s big projects

* Completed analytics labs

Team structure … fast delivery
• Define done … Small teams … Fast iteration
• Story acceptance … daily standups … deliver in 2 weeks
N*(N-1)/2
• Story accepted
• Iterative development
• Release to UAT

Big Data to visualization example:
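The slides do not show how story #1 (time series analysis of actuals vs forecast) was implemented. As a rough illustration only, the sketch below cleans bad records, removes a repeating seasonal pattern from the forecast error, and flags the points that fall outside an expected-variation band. Everything named here is an assumption rather than something taken from the deck: the function name `flag_outliers`, the weekly `period=7`, the `k=2.0` standard-deviation band, and the rule that missing or negative values count as bad data. It also swaps the regression step mentioned on the slide for a simpler seasonal-mean adjustment.

```python
from collections import defaultdict
from statistics import mean, pstdev


def flag_outliers(records, period=7, k=2.0):
    """Flag periods where actuals deviate unusually from forecast.

    records: iterable of (t, actual, forecast), with t an integer time index.
    Returns (t, adjusted_error) pairs, largest deviation first.
    """
    # Clean bad data (assumed rule: drop missing or negative values).
    clean = [(t, a, f) for t, a, f in records
             if a is not None and f is not None and a >= 0 and f >= 0]

    # Forecast error for each remaining period.
    errors = [(t, a - f) for t, a, f in clean]

    # Seasonality: average error at each position in the cycle.
    by_season = defaultdict(list)
    for t, e in errors:
        by_season[t % period].append(e)
    seasonal_mean = {s: mean(v) for s, v in by_season.items()}

    # Adjust for time by subtracting the seasonal average.
    adjusted = [(t, e - seasonal_mean[t % period]) for t, e in errors]
    if not adjusted:
        return []

    # Expected variation: k standard deviations of the adjusted errors.
    band = k * pstdev([e for _, e in adjusted])
    outliers = [(t, e) for t, e in adjusted if abs(e) > band]
    return sorted(outliers, key=lambda pair: abs(pair[1]), reverse=True)
```

At the ~50 million record scale quoted in the story, the same steps would more plausibly run as a distributed job over data staged in HDFS; the in-memory version above is only meant to make the logic readable.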
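The N*(N-1)/2 term on the team-structure slide is presumably the standard count of pairwise communication channels among N people, which is a common argument for keeping teams small. Assuming that reading, a minimal sketch (the function name `communication_paths` is illustrative, not from the deck):

```python
def communication_paths(team_size: int) -> int:
    """Count the distinct person-to-person pairs in a team of team_size people."""
    if team_size < 0:
        raise ValueError("team_size must be non-negative")
    # Each of N people can pair with N-1 others; halve to avoid double counting.
    return team_size * (team_size - 1) // 2


if __name__ == "__main__":
    for n in (3, 5, 10, 20):
        print(n, communication_paths(n))
```

Under this reading, a team of 5 has 10 channels while a team of 20 has 190, so coordination cost grows much faster than headcount.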