Henry Kautz
Department of Computer Science
Director, Institute for Data Science
•
Fungible [fuhn-juh-buhl] (adjective): Being of such nature or kind as to be freely exchangeable or replaceable, in whole or in part, for another of like nature or kind.
•
Related forms: fungibility (noun)
•
Data [dey-tuh] (noun): Individual facts, statistics, or items of information.
•
Related forms: metadata
•
Metadata [me-ta-day-tuh] (noun): A set of data that describes and gives information about other data.
•
Related forms: data
Data | Metadata
Phone call audio | Phone call numbers, time, date
Email body | Email headers: to, from, date, received-from, …
YouTube videos | User ID, number of views, comments, subscribers
Twitter text (140 characters) | GPS, sender ID, sender profile, followers / following
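The email example above can be made concrete with a short sketch. It assumes a simple, non-multipart message and uses Python's standard `email` module; the message text, the addresses, and the helper name `split_message` are hypothetical and invented purely for illustration.

```python
from email import message_from_string

# Hypothetical message, invented for illustration only.
RAW = """\
From: alice@example.com
To: bob@example.com
Date: Mon, 01 Mar 2010 09:30:00 -0500
Subject: Lunch?

Want to meet at noon?
"""

def split_message(raw):
    """Return (metadata, data): the header fields and the body text."""
    msg = message_from_string(raw)
    metadata = dict(msg.items())      # header name -> header value
    data = msg.get_payload().strip()  # body of a simple, non-multipart message
    return metadata, data

metadata, data = split_message(RAW)
print("Metadata:", metadata)
print("Data:", data)
```

Even without the body, the headers alone reveal who contacted whom and when, which is why the boundary between data and metadata matters.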
Organic Sensor Networks
•
52% of adults use online social networks
•
Real-time, location-aware smartphone access
•
Detailed measurements at a population scale
•
No active user participation
•
Fine granularity
•
Inference & Prediction
•
What will happen in the future?
•
What factors (places/events/actions) influence behavior?
•
GPS location + Tweet contents
Friend (mutual follows) relationship
•
Example of fungibility of data and metadata
•
Your privacy settings cannot hide your friendships!
•
New York City & Los Angeles
•
26M tweets over a month
•
1.2M unique users
•
7.6M geo-tagged tweets
[Figure: friendship-prediction pipeline. Labels: Twitter Feed; Location, Text, Network Structure; Learn Decision Tree; Belief Propagation; Predicted Friendships. Example edge values: 0.7, 0.8, 0.9; unknown edges marked "?".]
(2010)
•
Friends’ GPS locations
Your location
•
Your friends are noisy sensors of your location (and other things about you)
•
Your privacy settings cannot hide your location!
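To illustrate why friends act as noisy sensors, here is a deliberately simple sketch: guess a user's location as the place where most of their friends currently are. This majority-vote baseline is an assumption made for illustration, not the model used in the work presented; the function name `guess_location` and the place names are hypothetical.

```python
from collections import Counter

def guess_location(friend_locations):
    """Return the most common location among friends, or None if there are none."""
    counts = Counter(friend_locations)
    if not counts:
        return None
    place, _ = counts.most_common(1)[0]
    return place

# Hypothetical observations of where a user's friends last tweeted from.
friends = ["Midtown", "Brooklyn", "Midtown", "Queens", "Midtown"]
print(guess_location(friends))
```

Each friend's location is an unreliable signal on its own, but combining several of them can narrow down where the user is likely to be, even if the user shares no location data.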
Most Frequent
(2011)
•
Once we have access to even sporadic data about your location, messages, and social network, we can find many other signals about people's state and behavior
•
Individual (noisy, but useful)
•
Aggregate (can be highly accurate)
•
Aggregate accuracy comparable with:
•
Google Flu Trends
(R = 0.73)
•
CDC statistics
+
we can model fine-grained interactions between specific individuals
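Assuming the R value quoted above denotes Pearson's correlation coefficient between an aggregate estimate and a reference series, it could be computed along the following lines. This is a minimal sketch: the function name `pearson_r` and the two weekly series are hypothetical and do not come from the study.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)

# Hypothetical weekly values: a tweet-based estimate and a reference series.
estimated = [12, 15, 21, 30, 26, 18]
reference = [10, 14, 25, 28, 27, 15]
print(round(pearson_r(estimated, reference), 2))
```

A value near 1 means the aggregate estimate rises and falls with the reference series, even when individual-level signals are noisy.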
Fungibility of Science, Commerce, & Government
•
Nathan Eagle (MIT, Harvard, Northeastern)
•
“My research involves engineering computational tools, designed to explore how the petabytes of data generated about human movements, financial transactions, and communication patterns can be used for social good.”