(Really) Knowing Your Customer with Big Data

Hugh E. Williams
hugh@hughwilliams.com
@hughewilliams
http://hughewilliams.com

Test versus control experimentation
• Divide the customers into populations
• One (or two) populations are the control
• One or more populations are the tests
• Collect vast amounts of data from each population
• Compute metrics from the data, including confidence intervals
• Understand the results
• Make decisions

Example: Larger Images in eBay’s Search

Test versus control experimentation…
• It’s hard to segment the population:
  – Many users aren’t logged in
  – Many users use several browsers
  – Many users use several devices
• Attributing a result to a change is hard
  – Need to know that the user was affected by the change
  – Need to understand interaction effects of changes
  – Metrics are noisy
• Decisions are made with a confidence that isn’t 100%

What won’t data tell you?
• (Usually) An unambiguous picture
• What you don’t measure
• Exactly what to do
• How to leap from one mountain to another
• When to take risks
• When to be decisive

PS. I’m available for advising and consulting.
hugh@hughwilliams.com

THANK YOU
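
Appendix: a sketch of the test versus control steps

The steps in the test versus control slides (divide customers into populations, collect data, compute metrics with confidence intervals) can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration, not a description of any production system: the function names, the hash-based assignment, the 50/50 split, the conversion counts, and the normal-approximation interval are all hypothetical choices.

```python
import hashlib
import math


def assign_population(user_id: str, experiment: str, test_percent: int = 50) -> str:
    """Deterministically place a user in 'test' or 'control'.

    Hypothetical scheme: hash the experiment name plus a stable user ID.
    It only works when such an ID exists, which (as the slides note) is
    hard when users are logged out or use several browsers and devices.
    """
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode("utf-8")).hexdigest()
    bucket = int(digest, 16) % 100  # bucket in the range 0..99
    return "test" if bucket < test_percent else "control"


def rate_difference_ci(control_conversions, control_users,
                       test_conversions, test_users, z=1.96):
    """Return (difference, low, high) for test rate minus control rate.

    Uses the normal approximation for a difference of two proportions;
    z=1.96 corresponds to roughly 95% confidence.
    """
    p_control = control_conversions / control_users
    p_test = test_conversions / test_users
    diff = p_test - p_control
    std_err = math.sqrt(
        p_control * (1 - p_control) / control_users
        + p_test * (1 - p_test) / test_users
    )
    return diff, diff - z * std_err, diff + z * std_err


# Made-up counts, purely for illustration.
diff, low, high = rate_difference_ci(1000, 50_000, 1100, 50_000)

# If the interval excludes zero, the change is unlikely to be pure noise,
# but the confidence is still not 100%.
interval_excludes_zero = low > 0 or high < 0
print(f"lift = {diff:.4f}, interval = [{low:.4f}, {high:.4f}]")
```

An interval that excludes zero supports a decision but does not make it: interaction effects, metrics that were never measured, and the choice of when to take a risk all sit outside this calculation.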