Move Over, Big Data! How Small, Simple Models Can Yield Big Insights
Richard C. Larson, Ph.D., rclarson@mit.edu
Mitsui Professor of Engineering Systems and Director of the Center for Engineering Systems Fundamentals, MIT
September 8, 2014
© Richard C. Larson 2014

• Fishing in the Ocean…
• Random location? No strategy?
• Or, location and strategy based on prior analysis?

In Trying to Make Sense of a Sea of Data, We Need Small Simple Models to Guide our Search

From an MIT SDM alum:
• I work on big data “stuff” in my day job and I think simple models are too often discounted, often due to bedazzlement by big data trends, tools, and the quest for the holy grail.

$D = C\sqrt{A/N}$
$F(y) = B(y) - [F(y-1) + F(y-2) + F(y-3)]$
$L = \lambda W = \rho/(1-\rho)$ implies $W = (1/\lambda)\,\rho/(1-\rho) = (1/\mu)/(1-\rho)$

• What we are not saying about Big Data and Data Analytics…
• What we are saying about small models…
• Ideally, in many applications, these two approaches are complementary, going hand in hand.
[Figure labels: Small Models, Big Data]

Outline
• Flaws of Averages
• Square root laws
• Nonlinearities in Queueing
• Case Study: Marrying Small Models and Big Data Analysis

If we are about to deal with lots of data, averages will be important.
• An average is one of the simplest operations on any dataset.
• We need to be savvy customers of averages!

Flaws of Averages
• Simple model: The average of N quantities, X1, X2, …, XN.
• Average = (X1 + X2 + … + XN)/N.
• Simple, right?

Flaws of Averages
• We tend to think in averages, often to the point of believing that the average is a constant describing all!
• Warning: Average River Depth is 4 feet!
• Mutual Fund: Average total annual returns: 7%.

Flaws of Averages
• We’ve all heard the joke: When Bill Gates walks into a crowded establishment, ON AVERAGE everyone becomes a millionaire!
• The mean salary of a tech worker in San Mateo County is $291,497.
• $81,000 of this is due to Mark Zuckerberg!
• Medians anyone?

Flaws of Averages
• Garrison Keillor: Lake Wobegon, where all the women are strong, all the men are good looking and all the children are above average.
• Possible? Impossible?

Flaws of Averages
• Movie Theaters: Estimate the fraction of offered seats that are sold.
• Movie Theater Management: What do they see?
• Typically – 5%
• Selection bias – occurs everywhere.

Flaws of Averages
• Selection bias – occurs everywhere.
• Think of waking up and being a chocolate chip in a chocolate chip cookie!
– Your perceived distribution of chips in a cookie
– Management’s experience…

Flaws of Averages
• Selection bias – Extends to friends on Facebook.
• Yes, it is true that on average my friends on Facebook have more friends than I do!
• How does this type of selection bias extend into your business?

Flaws of Averages
• Viral growth, R0.
• R0 initially from Germany, population growth
• In epidemics, R0 is the average number of new infections created by a newly infected person when almost everyone is susceptible to the disease.

Flaws of Averages
• Suppose R0 = 2.0. Consider two very different possibilities…
• 1: Every infection generates 2 more.
• 2: A new infection has a 50% chance of generating 4 new infections and a 50% chance of generating none.
• Can you picture the temporal dynamics of each case?

Ebola Summer 2014

Flaws of Averages
• “Outliers”: What to do with them?
• Many say: clip them off; they distort the analysis and mislead intuition.
• But “outliers” have determined the course of human history.
– Meteors hitting Planet Earth
– Richter 9 and above earthquakes
– Financial collapses.

Earthquakes
• Richter Scale: logarithmic. Each whole number step in the magnitude scale corresponds to the release of about 31 times more energy than the amount associated with the preceding whole number value.

Richter Magnitude   Seismic Energy Yield   Example
0.5                 5.6 kg (12.4 lb)       Hand grenade
1.0                 32 kg (70 lb)          Construction site blast
1.5                 178 kg (392 lb)        WWII conventional bombs
2.0                 1 metric ton           Late WWII conventional bombs
2.5                 5.6 metric tons        WWII blockbuster bomb
3.0                 32 metric tons         Massive Ordnance Air Blast bomb
3.5                 178 metric tons        Chernobyl nuclear disaster, 1986
4.0                 1 kiloton              Small atomic bomb
4.5                 5.6 kilotons           Average tornado (total energy)
5.0                 32 kilotons            Nagasaki atomic bomb
5.5                 178 kilotons           Little Skull Mtn., NV Quake, 1992
6.0                 1 megaton              Double Spring Flat, NV Quake, 1994
6.5                 5.6 megatons           Northridge quake, 1994
7.0                 50 megatons            Tsar Bomba, largest thermonuclear weapon ever tested
7.5                 178 megatons           Landers, CA Quake, 1992
8.0                 1 gigaton              San Francisco, CA Quake, 1906
8.5                 5.6 gigatons           Anchorage, AK Quake, 1964
9.0                 32 gigatons            2004 Indian Ocean earthquake
10.0

Flaws of Averages: Summary Points
• Averages can be deceiving.
• Treating a distribution as its average value usually results in incorrect inferences.
• Averages as experienced by one population may be very different from those experienced by another.
• Ignore “outliers” at your peril.

And we haven’t even considered…
• Regression to the Mean
• Variance
• Exponential smoothing
– Example: Baseball batting averages
• And much more…

One More Average: Based on Dimensionality Arguments
• Mean travel distance in a city, N police cars, area A.
$D = C\sqrt{A/N}$
• This is a Square Root Law.
• In our analysis of Big Data, we can look for this type of behavior.

Let’s Now Switch: From Averages to a Simple Operational Model

What Kinds of Queues Occur in Systems of Interest to ESD?
Queues, Queues Everywhere!

Queueing System
[Diagram: Arriving Customers → Queue of Waiting Customers → SERVICE FACILITY → Departing Customers]

Queues, Queues Everywhere!
• Queueing Theory: 100 Years Old!
• Most queues are complicated, and folks want to simulate almost all of the detail.
• And today there are numerous files of Big Data drawn from queues.
• But let’s look at simple models first, to guide us!

It May be Little, But It’s The Law!
$L = \lambda W$
• L = Time average number of customers in the system, both in queue and in service
• λ = Average rate of arrivals of customers into the system
• W = Mean time spent by a customer in the system, both in queue and in service

$L = \lambda W$
• Formula applies in all sorts of places, including those not normally thought of as queues.
• Example: Annual rate of new hires of assistant professors in a university.
• MIT: L = 1,000 tenure-track faculty members; W = mean duration of a faculty career.
• If W moves upwards from 20 to 22 years, λ moves down accordingly, since L = 1,000 remains constant.

The M/M/k Queue
Queue notation: Input/Service/Servers
• First M: “Memoryless” input process, meaning Poisson process
• Second M: “Memoryless” service time, meaning exponential probability density function
• k = number of servers.

M/M/1 Queue
$W \propto \dfrac{1}{1-\rho}$
[Figure: Mean Wait vs. Rho. W plotted against Rho from 0 to 1, annotated “Note the Elbow!” and “Queue Explodes!”]
Rho = ρ = fraction of time that server is busy serving customers

Elbow as we Increase the Number of Servers (k = 1, 2, 3, 9, 16)

Do You See Why Large Call Centers are More Productive?

What Do You See as a Role for Big Data Analysis Here?

D = Deterministic

Averages in Queues
• Performance degrades as arrival rate increases and/or mean service time increases.
• Performance degrades as variance of time between arrivals increases and/or variance of the service time increases.
• Can you think of examples?

Now for Final Switch: From Queueing Overview to Case Study

Queue Inference Engine: A Personal Big Data–Small Models Experience
• It started with Reams of Old-fashioned Paper-based Computer Printouts

Queue Inference Engine:
• Big Data:
– Time ATM card inserted;
– Time ATM Transaction completed.

Queue Inference Engine:
• Knowing the probability properties of the Arrival Process, a “Poisson Process,” we were able to derive a mathematically valid algorithm to determine many statistics of customers’ queue delays.
• It’s called an $O(N^3)$ algorithm, since the number of computations grows as the 3rd power of the number of customers in a busy period.

Queue Inference Engine:
• Imagine receiving your monthly bank statement and with it is a statement of the times you spent waiting in bank queues. The queues could include both those involving human tellers and automatic teller machines (ATMs).
• With the technology of the Queue Inference Engine (QIE) such an innovation is now well within the realm of possibility.

Queue Inference Engine:
• With our first results published in 1990, Dr.
David Simchi-Levi and others call this one of the first applications of Big Data analysis to modern-day problems: “This is just a beautiful example of how data drive new research…” (Simchi-Levi, 2014)
• But this “QIE” Big Data algorithm could not have been derived without marrying Small Models (with their behavior) with Big Data recursive thinking.

Big Data and Small Models

References
• Larson, Richard C., “The Queue Inference Engine: Deducing Queue Statistics From Transactional Data,” Management Science, vol. 36, no. 5, May 1990, pp. 586-601.
• Larson, Richard C., “Queue Inference Engine,” in Encyclopedia of Operations Research and Management Science, Centennial Edition, Saul I. Gass and Carl M. Harris (eds.), Kluwer, Boston, 2001, pp. 674-679.
• Jones, Lee K. and Richard C. Larson, “Efficient Computation of Probabilities of Events Described by Order Statistics and Applications to Queue Inference,” ORSA Journal on Computing, vol. 7, no. 1, Winter 1995, pp. 89-100.
• Gross, Donald and Richard C. Larson, “Queuing Systems,” in International Encyclopedia of Business and Management (IEBM), 2nd edition, 8-volume set, Malcolm Warner (ed.), Thomson Learning, London, U.K., 2001, pp. 5502-5513.
• Larson, Richard C. and Mauricio Gomez Diaz, “Nonfixed Retirement Age for University Professors: Modeling Its Effects on New Faculty Hires,” Service Science, vol. 4, no. 1, March 2012, pp. 69-78.
• Simchi-Levi, David, “OM Research: From Problem-Driven to Data-Driven Research,” M&SOM, vol. 16, no. 1, 2014, pp. 2-10.
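
Worked illustration: the M/M/1 elbow and Little’s Law
The M/M/1 formulas from the talk, W = (1/µ)/(1 − ρ) and L = λW = ρ/(1 − ρ), can be tabulated in a few lines to see the elbow numerically. What follows is only a minimal sketch, not part of the original slides. The function name `mm1_metrics`, its parameter names, and the sample utilization values are illustrative assumptions.

```python
from typing import Tuple


def mm1_metrics(arrival_rate: float, service_rate: float) -> Tuple[float, float, float]:
    """Return (rho, W, L) for a stable M/M/1 queue.

    arrival_rate is lambda, service_rate is mu (both hypothetical
    parameter names chosen for this sketch).
    """
    if arrival_rate <= 0 or service_rate <= 0:
        raise ValueError("rates must be positive")
    rho = arrival_rate / service_rate  # fraction of time the server is busy
    if rho >= 1:
        raise ValueError("the queue is unstable when rho >= 1")
    # Mean time in system: W = (1/mu) / (1 - rho)
    mean_time_in_system = (1.0 / service_rate) / (1.0 - rho)
    # Little's Law: L = lambda * W, which equals rho / (1 - rho) here
    mean_number_in_system = arrival_rate * mean_time_in_system
    return rho, mean_time_in_system, mean_number_in_system


if __name__ == "__main__":
    service_rate = 1.0  # one customer per unit time, so W is in mean service times
    for target_rho in (0.5, 0.8, 0.9, 0.95, 0.99):
        rho, w, l = mm1_metrics(target_rho * service_rate, service_rate)
        print(f"rho={rho:.2f}  W={w:8.2f}  L={l:8.2f}")
```

With µ = 1, W is 2 mean service times at ρ = 0.5 and 100 at ρ = 0.99, which is the elbow on the Mean Wait vs. Rho plot. Little’s Law also covers the faculty example by rearranging to λ = L/W: 1,000/20 = 50 new hires per year, falling to 1,000/22 ≈ 45.5 when the mean career lengthens to 22 years.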