(Really) Knowing Your Customer with Big Data
Hugh E. Williams
Test versus control experimentation
• Divide the customers into populations
• One population (or two) is the control
• One or more populations are the tests
• Collect vast amounts of data from each
• Compute metrics from the data, including confidence intervals
• Understand the results
• Make decisions
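As a rough illustration of the "compute metrics, including confidence intervals" step, here is a minimal sketch. It assumes the metric is a conversion rate and uses a normal-approximation (Wald) interval for the difference between two proportions. The function name and the counts are hypothetical and are not taken from the talk.

```python
import math

def diff_in_rates_ci(control_conv, control_n, test_conv, test_n, z=1.96):
    """Return (difference, lower, upper) for test rate minus control rate.

    Assumption: a normal-approximation interval; z=1.96 gives roughly 95%.
    """
    p_c = control_conv / control_n
    p_t = test_conv / test_n
    # Standard error of the difference between two independent proportions.
    se = math.sqrt(p_c * (1 - p_c) / control_n + p_t * (1 - p_t) / test_n)
    diff = p_t - p_c
    return diff, diff - z * se, diff + z * se

# Hypothetical counts, for illustration only.
diff, low, high = diff_in_rates_ci(1000, 20000, 1100, 20000)

# If the interval excludes zero, the difference is unlikely to be noise alone.
excludes_zero = low > 0 or high < 0
print(f"lift={diff:.4f}, interval=({low:.4f}, {high:.4f}), excludes zero: {excludes_zero}")
```

An interval that straddles zero means the test cannot be distinguished from the control at that confidence level, which is where the "understand the results" step begins.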
Example: Larger Images in eBay’s Search
Test versus control experimentation…
• It’s hard to segment the population:
– Many users aren’t logged in
– Many users use several browsers
– Many users use several devices
• Attributing a result to a change is hard
– Need to know that the user was affected by the change
– Need to understand interaction effects of changes
– Metrics are noisy
• Decisions are made with a confidence that isn’t
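The segmentation problem on this slide can be made concrete with a minimal sketch. It assumes populations are assigned by hashing whatever identifier is available (a user ID when logged in, otherwise a browser cookie). The function, experiment name, and identifiers below are hypothetical and do not describe eBay's actual system.

```python
import hashlib

def assign_population(identifier, experiment, populations=("control", "test")):
    """Deterministically map an identifier to a population for one experiment."""
    # Salting with the experiment name keeps assignments independent across experiments.
    digest = hashlib.sha256(f"{experiment}:{identifier}".encode("utf-8")).hexdigest()
    return populations[int(digest, 16) % len(populations)]

# The same identifier always lands in the same population.
first = assign_population("user-123", "larger-images")
second = assign_population("user-123", "larger-images")
print(first == second)

# A person who is not logged in is seen as one cookie per browser or device,
# so nothing ties these two assignments together.
laptop = assign_population("cookie-laptop-abc", "larger-images")
phone = assign_population("cookie-phone-xyz", "larger-images")
print(laptop, phone)
```

Because each cookie is hashed independently, one person using several browsers or devices may be exposed to both the test and the control experience, which blurs the comparison between populations.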
What won’t data tell you?
• (Usually) An unambiguous picture
• What you don’t measure
• Exactly what to do
• How to leap from one mountain to another
• When to take risks
• When to be decisive
