Xiangnan_KDD_Debrief

KDD’14 Debrief
24th - 27th August, 2014
New York City, US
WING Monthly Meeting (Oct 24, 2014)
Presented by Xiangnan He
Opening Ceremony
Welcome Words
“Don’t spend your precious time asking
‘Why isn’t the world a better place?’
It will only be time wasted.
The question to ask is
‘How can I make it better?’
To that there is an answer.”
--- Leo Buscaglia
Overview
• The largest KDD conference ever.





Number of attendees: 2,200+ (last year: 1,176).
151 research papers (20% growth over KDD’13)
43 industry & govt. papers (30% growth)
26 workshops (75% growth)
12 tutorials (100% growth)
• What’s new?
  - Paper spotlights every morning (1 min/paper)
  - All papers are required to have a poster presented.
  - Networking Session: Building a Career in Data Science
Research Track
Reviewing Process
Submissions per Country
Acceptance by Subject Area
Predicting Paper Acceptance
Academia VS. Industry
Review Statistics
Research Topics
• Some technical topics that I found especially
notable/popular include:
  - Topic/graphical modeling (not only for text mining;
many tasks are addressed with this method)
  - Deep learning (2 tutorials, but no full papers)
  - Social networks and graph analytics (popular for the
last 10 years, and even more so this year)
  - Recommendations
  - Workforce analytics
Best Paper Awards
• Best paper:
Reducing the Sampling Complexity of Topic Models.
Aaron Q Li, Carnegie Mellon University; Amr Ahmed, Sujith
Ravi, Alexander J Smola, Google.
• Best student paper:
An Efficient Algorithm For Weak Hierarchical Lasso
Yashu Liu, Jie Wang, Jieping Ye; Arizona State University.
Test of Time Award
• Integrating Classification and Association Rule
Mining [KDD 1998], cited over 2,000 times.
Some interesting papers
• Mining Topics in Documents: Standing on the
Shoulders of Big Data.
Zhiyuan Chen, Bing Liu; University of Illinois at Chicago
• Matching Users and Items Across Domains to
Improve the Recommendation Quality.
Chung-Yi Li, Shou-De Lin; National Taiwan University
• FoodSIS: A Text Mining System to Improve the State of Food
Safety in Singapore
Kiran Kate, Sneha Chaudhari, Andy Prapanca, Jayant
Kalagnanam; IBM Research
• Mining Topics in Documents: Standing on the Shoulders of
Big Data.
Zhiyuan Chen, Bing Liu; University of Illinois at Chicago
• Proposed a variant of topic model that can generate more
accurate and coherent topics via integrating knowledge.
• Two kinds of knowledge:
  - Must-links, e.g. <battery, life>, <price, cheap>
  - Cannot-links, e.g. <life, movie>, <money, slow>
• The knowledge is mined through frequent itemset mining.
• Because mined knowledge can be wrong, the authors further
propose some rules to clean it up.
• The knowledge can be easily integrated into the inference
algorithm with the generalized Pólya urn model.
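The two ideas on this slide can be illustrated with a minimal Python sketch. It is not the authors' algorithm: the topic word lists, the `min_support` threshold, the `promotion` weight, and both function names are hypothetical, only word pairs (2-itemsets) are mined, and cannot-links and the knowledge clean-up rules are left out. The second function shows the basic generalized Pólya urn idea: seeing a word also adds a fraction of a count to its must-linked partners.

```python
from collections import Counter
from itertools import combinations


def mine_must_links(topic_word_sets, min_support):
    """Return word pairs that co-occur in at least `min_support` topics.

    This is frequent itemset mining restricted to 2-itemsets.
    """
    pair_counts = Counter()
    for words in topic_word_sets:
        # Sort so that each pair has a single canonical ordering.
        for pair in combinations(sorted(set(words)), 2):
            pair_counts[pair] += 1
    return {pair for pair, n in pair_counts.items() if n >= min_support}


def gpu_update(word_counts, word, must_links, promotion=0.3):
    """One generalized Polya urn step (hypothetical promotion weight).

    The observed word gains a full count; each must-linked partner
    gains a fractional count, nudging both words toward the same topic.
    """
    word_counts[word] += 1.0
    for a, b in must_links:
        if word == a:
            word_counts[b] += promotion
        elif word == b:
            word_counts[a] += promotion
    return word_counts


# Hypothetical top words of topics learned from earlier domains.
prior_topics = [
    ["battery", "life", "charge", "hour"],
    ["battery", "life", "power"],
    ["price", "cheap", "money"],
    ["price", "cheap", "battery", "life"],
]
must_links = mine_must_links(prior_topics, min_support=2)

counts = Counter()
gpu_update(counts, "battery", must_links)
```

With this toy input, only <battery, life> and <cheap, price> reach the support threshold, so observing "battery" also adds a partial count to "life".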
Innovation Award Talk
• Principles of Very Large Scale Modeling
by Pedro Domingos, from University of Washington.
• Three principles:
  1. Model the whole, not just the parts.
People (customers) influence each other, so model the whole
network, not each person separately.
  2. Tame complexity via hierarchical decomposition.
Two assumptions are made: 1) subparts are independent given the
part; 2) the probability of a class is the average over its subclasses.
The hierarchy plus these two assumptions makes inference tractable.
Example: Markov Logic Network + Sum-Product Theorem = Tractable
Markov Log
  3. Time and space should not depend on data size.
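The two assumptions under principle 2 can be written out as a short sketch. The notation is assumed here rather than taken from the talk: $x_1, \dots, x_n$ are the subparts of a part, $C_1, \dots, C_k$ are the subclasses of a class $C$, and the average is unweighted, as the slide states.

```latex
\begin{align*}
P(x_1, \dots, x_n \mid \text{part}) &= \prod_{i=1}^{n} P(x_i \mid \text{part}) \\
P(x \mid C) &= \frac{1}{k} \sum_{j=1}^{k} P(x \mid C_j)
\end{align*}
```

Applied recursively down the hierarchy, inference alternates between products (over subparts) and sums (over subclasses), so its cost should grow with the size of the hierarchy rather than with the number of joint configurations.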
THANK YOU!
Video recordings of KDD:
http://videolectures.net/kdd2014_newyork/