(Really) Knowing Your Customer with Big Data
Hugh E. Williams
[email protected]
Test versus control experimentation
• Divide the customers into populations
• One (or two) populations are the control
• One or more populations are the tests
• Collect vast amounts of data from each
• Compute metrics from the data, including
confidence intervals
• Understand the results
• Make decisions
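As a rough illustration of the steps above, here is a minimal Python sketch of two of them: splitting customers into populations and computing a metric with a confidence interval. It is not eBay's implementation. The function names, the hash-based assignment, the choice of conversion rate as the metric, and the normal-approximation interval are all assumptions made for this example.

```python
import hashlib
import math


def assign_bucket(user_id: str, experiment: str, n_buckets: int = 2) -> int:
    """Deterministically map a user to a population (hypothetical helper).

    Bucket 0 is treated as the control; the others are tests. Hashing the
    experiment name together with the user id keeps a user's assignment
    stable within one experiment and independent across experiments.
    """
    key = f"{experiment}:{user_id}".encode("utf-8")
    digest = hashlib.sha256(key).hexdigest()
    return int(digest, 16) % n_buckets


def diff_confidence_interval(conv_control: int, n_control: int,
                             conv_test: int, n_test: int,
                             z: float = 1.96) -> tuple[float, float]:
    """Approximate 95% interval for (test rate - control rate).

    Uses the normal approximation for a difference of two proportions,
    which is only reasonable when both populations are large.
    """
    p_c = conv_control / n_control
    p_t = conv_test / n_test
    std_err = math.sqrt(p_c * (1 - p_c) / n_control
                        + p_t * (1 - p_t) / n_test)
    diff = p_t - p_c
    return diff - z * std_err, diff + z * std_err


# Made-up counts, purely for illustration.
low, high = diff_confidence_interval(1000, 20000, 1100, 20000)
print(f"difference in conversion rate: [{low:.4f}, {high:.4f}]")
```

If the interval excludes zero, the difference is unlikely to be noise alone at that confidence level. Understanding the result and making the decision are still left to people.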
Example: Larger Images in eBay’s Search
Test versus control experimentation…
• It’s hard to segment the population:
– Many users aren’t logged in
– Many users use several browsers
– Many users use several devices
• Attributing a result to a change is hard
– Need to know that the user was affected by the change
– Need to understand interaction effects of changes
– Metrics are noisy
• Decisions are made with a confidence that isn’t
What won’t data tell you?
• (Usually) An unambiguous picture
• What you don’t measure
• Exactly what to do
• How to leap from one mountain to another
• When to take risks
• When to be decisive
PS. I’m available for advising and consulting. [email protected]