The Use of Big Data and Data Mining in Supply Chains
David L. Olson
College of Business Administration
University of Nebraska-Lincoln

BIG DATA (Davenport, 2014)
• Data too big to fit on single server
• Too unstructured to fit in row-and-column database
• Too continuously flowing to fit into static data warehouse
• THE MOST IMPORTANT ASPECT IS LACK OF STRUCTURE, NOT SIZE
• The point is to ANALYZE
  • Convert data into insights, innovation, business value
• Waller & Fawcett (2013)
  • Shed obsession for causality in exchange for simple correlations
  • Not knowing why, but only what

Governmental & Non-Profit Examples
Dobbs et al. 2014, McKinsey Report
• European & US food safety regulations
  • Need to monitor, gather data
  • Need to analyze
• Hospitals
  • Biological data
  • Operational data
  • Insurance data
• Schools
• Government
  • Monitor Web site use
  • Monitor use of apps

Data Types (Davenport, 2014)
• Text & Voice
  • Been around forever
  • Internet presence initiates a new era (text mining)
• Social Media data
  • Sentiment analysis – identify opinions from posted comments
• Sensor data
  • The “Internet of Things”
  • Digital cow – sensors in 2nd stomach
  • Humans – sensors for fitness, productivity, health
  • Industrial – manufacturing, transportation, energy grids

Contemporary Big Data Examples
• Baseball
  • Moneyball
• Flu detection
  • Google searches
• Wal-Mart disaster relief
  • Hurricane Katrina
  • Pop-tarts & water

Sathi (2012)
• Internal Corporate data
  • Generated by e-mails, logs, blogs, documents
  • Business process events
  • ERP
• External to firm
  • Social media
  • Competitor literature
  • Customer Web data
  • Complaints

Mayer-Schonberger & Cukier (2013)
• Logistics firm
  • Masses of data – product shipments
  • Turned into a source of revenue
• Accenture
  • Big data provides
    • Better customer service
    • More effective order fulfillment
    • Faster response to supply chain problems
    • Greater overall efficiency
• Zillow
  • Masses of real estate data

Supply Chain Analytics
• Big data supports real-time decision making
  • Grocery stores
  • Wal-Mart
  • American Airlines – yield management
  • Trucking – monitor real-time breakdown response

SUPPLY CHAIN ANALYTICS (Chae 2014)
• Data management resources
  • Data acquisition & management (RFID, ERP, database)
  • Analysis (data mining)
• IT-based supply chain planning resources
• Performance management resources
  • Statistical process control, Six Sigma, etc.

Knowledge Management

Resource | Elaboration
--- | ---
Performance management resources | How things are done (tacit knowledge, BPR); process control; Six Sigma
Information systems | Database, reports, decision support; cloud computing
Data sources | ERP & related systems; external sources (RFID, government publications, social media); big data
Analytics | Descriptive analysis; data mining (classification, prediction, clustering, link analysis, text mining); operations research (mathematical programming, stochastic modeling, Monte Carlo simulation)

Supply Chains & Big Data
• RFID/GPS
  • Tracking now affordable
• Manufacturing links to supply chains
  • Discrete manufacturing has for some time
  • Process industries (oil refining) behind

Example Supply Chain Big Data Sources
Waller & Fawcett (2013a) – Journal of Business Logistics

Data Type | Volume | Velocity | Variety
--- | --- | --- | ---
Sales | More detail – price, quantity, items, time of day, date, customer | From monthly & weekly to daily & hourly | Direct sales, distributor sales, Internet sales, international sales, competitor sales
Consumer | More detail – items browsed & bought, frequency, dollar value, timing (RFM+) | From click through to card usage | Shopper identification, emotion detection, “Likes”, “Tweets”, product reviews
Inventory | Perpetual inventory by style, color, size | From monthly updates to hourly updates | Warehouse, store, Internet store, vendor inventories
Location/Time | Sensor data to detect location, better inventory control | Frequent updates within store and in transit | Not only where, but what is close, who moved it, path, future path, mobile device evidence

Supply Chain Analytics Objectives
• Cost reduction
• Develop innovative new products & services
  • LinkedIn – developed array of offerings
  • Google
  • Zillow real estate site
• Reduce time needed to analyze
  • Department store chain – 73 million items
    • Reduced pricing optimization from 27 hours to around 1 hour
    • SAS high-performance analytics (HPA) – takes data out of Hadoop cluster, places in-memory on parallel computers
  • Financial asset management company
    • Analyze single bond issue, risk analysis using 25 variables, 100 simulations
    • With big data system can run 100 variables and 1 million simulations in 10 minutes
    • Better discovery process
• Support Internal Business Decisions
  • United Healthcare – insurance
    • Analyze customer attrition
  • Wells Fargo, Bank of America, Discover use for multichannel CRM
    • Unstructured data – website clicks, transaction records, banker notes, voice recordings from call centers

Responsibility Locus for SCA Projects

Objective | DISCOVERY | PRODUCTION
--- | --- | ---
Cost Savings | IT innovation group | IT architecture & operations
Product/Service Innovation | R&D/product development group | Product development or product management
Faster Decisions | Business unit or function analytics group | Executive
Better Decisions | Business unit or function analytics group | Executive

Vertical vs. Horizontal Data Scientists
• VERTICAL
  • In-depth technical knowledge of narrow field
  • Econometricians
  • Software engineers
• HORIZONTAL
  • Blend: business analysts, statisticians, computer scientists, domain experts
  • Vision with some technical knowledge
  • Focus on robust, efficient, simple, replicable, scalable applications
• Horizontal more marketable
• NEED A TEAM
• WANT TO AUTOMATE AS MUCH AS POSSIBLE

Big Data Opportunities to Improve
Waller & Fawcett (2013b) – Journal of Business Logistics
• Demand forecasting
  • Link real-time sensors to machine-learning algorithms
  • Bar-coded checkout & Wal-Mart RFID chips already exist
  • Enables real-time response
• Warehouse design & location
  • System design for optimality
  • A classical operations research problem
  • Can use network analysis to be more complete
• Supplier evaluation & selection
  • Probably the most commonly researched supply chain function
  • Can consider more factors, more up-to-date data
• Selection of transportation nodes
  • Real-time truck/rail assignment
  • Already exists

Company Examples (Davenport, 2014)

Company | Type | Big data use
--- | --- | ---
LinkedIn | Start-up | Coined “data scientist” – unified search
eBay | Start-up | Data hub, virtual data marts
Kyruus | Start-up | Data about physician networks – track patient leakage
Recorded Future | Start-up | Use Internet data to help predict
UPS | Established | Track packages, monitor vehicles & route them
United Healthcare | Established | Take voice calls, put in text, text-mine
Macys.com | Established | Personalization of ads
Bank of America | Established | Better understand customers by channel
Citigroup | Established | Monitor customer credit risk
Sears Holdings | Established | Real-time retail monitoring
Verizon Wireless | Established | Sell data on mobile phone user behavior (movement, buying)
Schneider International | Established | Trucking – sensors for location, driver behavior

US
• Great economic changes
  • Wages too high
• Outsourcing
  • Computer programming (service) to India
  • Manufacturing to China
• Technology
  • Robotics – no health benefits, no vacations, no complaints
  • Computers
    • ERP systems replacing multiple legacy systems
    • Lay off most human IT people
• Business Analytics
  • BIG DATA

Erik Brynjolfsson and Andrew McAfee, 2011, Digital Frontier Press
Race Against The Machine: How the Digital Revolution is Accelerating Innovation, Driving Productivity, and Irreversibly Transforming Employment and the Economy
• Computer progress advancing exponentially
• EFFECT ON
  • Jobs
  • Skills
  • Wages
  • The Economy

Supply Chain Areas with Big Data Impact
• Globalization – supply chain involvement
  • Japan; Asian Tigers; BRIC
• Digitization – supply chain enabler
  • Enterprise systems
  • Paradox: More Integrated Systems >> Fewer Systems People
• Energy supply – Big Data won’t predict major shifts
  • Peak Oil (Fracking)
  • Global warming
• Complexity – Medicare false positives
  • Unintended consequences
• DEREGULATION/PRIVATIZATION – reliance on statistics gone wrong
  • Home mortgage crisis

Potential Areas of Interest – SCA & Big Data
Friedman (The World is Flat)
• THREE CONVERGENCES
  • New players (through global access)
    • BRICS
  • New playing field (Web economy)
• Global warming
  • Green emphasis
• Cultural conflicts
• Ability to develop new ways