Xiangnan_KDD_Debrief
KDD’14 Debrief
24th - 27th August, 2014
New York City, US
WING Monthly Meeting (Oct 24, 2014)
Presented by Xiangnan He
Opening Ceremony
Welcome Words
“Don’t spend your precious time asking
‘Why isn’t the world a better place?’
It will only be time wasted.
The question to ask is
‘How can I make it better?’
To that there is an answer.”
--- Leo Buscaglia
Overview
• The largest KDD conference ever.
Number of attendees: 2200+ (last year: 1176).
151 research papers (20% growth over KDD’13)
43 industry & govt. papers (30% growth)
26 workshops (75% growth)
12 tutorials (100% growth)
• What’s new?
Paper spotlights every morning (1 min/paper)
Every paper must also be presented as a poster.
Networking Session: Building a Career in Data Science
Research Track
Reviewing Process
Submissions per Country
Acceptance by Subject Area
Predicting Paper Acceptance
Academia vs. Industry
Review Statistics
Research Topics
• Some technical topics that I found especially
notable/popular include:
Topic/Graphical modeling (not only for text mining;
many other tasks are addressed with this method)
Deep Learning (2 tutorials, but no full papers)
Social Networks and graph analytics (popular for the
last 10 years, and even more so this year)
Recommendations
Workforce analytics
Best Paper Awards
• Best paper:
Reducing the Sampling Complexity of Topic Models.
Aaron Q Li, Carnegie Mellon University; Amr Ahmed, Sujith
Ravi, Alexander J Smola, Google.
• Best student paper:
An Efficient Algorithm For Weak Hierarchical Lasso
Yashu Liu, Jie Wang, Jieping Ye, Arizona State
University.
Test of Time Award
• Integrating Classification and Association Rule
Mining [KDD 1998], cited over 2000 times.
Some interesting papers
• Mining Topics in Documents: Standing on the
Shoulders of Big Data.
Zhiyuan Chen, Bing Liu; University of Illinois at Chicago
• Matching Users and Items Across Domains to
Improve the Recommendation Quality.
Chung-Yi Li, Shou-De Lin; National Taiwan University
• FoodSIS: A Text Mining System to Improve the State of Food
Safety in Singapore
Kiran Kate, Sneha Chaudhari, Andy Prapanca, Jayant
Kalagnanam; IBM Research
• Mining Topics in Documents: Standing on the Shoulders of
Big Data.
Zhiyuan Chen, Bing Liu; University of Illinois at Chicago
• Proposes a variant of the topic model that generates more
accurate and coherent topics by integrating prior knowledge.
• 2 kinds of Knowledge:
Must-links, e.g. <battery, life>, <price, cheap>
Cannot-links, e.g. <life, movie>, <money, slow>
• The knowledge is mined through frequent itemset mining.
• But the mined knowledge can be wrong, so the authors further
propose some rules to clean it up.
• The knowledge can be easily integrated into the inference
algorithm with the generalized Pólya urn model.
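A minimal sketch of how must-link candidates could be mined with frequent itemset mining, restricted to itemsets of size 2. It assumes each "transaction" is the top-word list of one previously learned topic; the function name `mine_must_links`, the sample topics, and the support threshold are hypothetical and not taken from the paper.

```python
from collections import Counter
from itertools import combinations


def mine_must_links(topics, min_support):
    """Return word pairs that co-occur in at least `min_support` topics."""
    pair_counts = Counter()
    for top_words in topics:
        # Treat each topic's top-word list as one transaction and
        # count every unordered word pair at most once per topic.
        for pair in combinations(sorted(set(top_words)), 2):
            pair_counts[pair] += 1
    return {pair for pair, count in pair_counts.items() if count >= min_support}


# Hypothetical top words of topics learned earlier (illustrative data only).
prior_topics = [
    ["battery", "life", "charge", "hour"],
    ["battery", "life", "price", "cheap"],
    ["price", "cheap", "money", "value"],
    ["movie", "life", "story", "actor"],
]

must_links = mine_must_links(prior_topics, min_support=2)
print(sorted(must_links))
```

With this toy input, only <battery, life> and <cheap, price> reach the support threshold. A real system would still need the clean-up rules mentioned above, since frequent co-occurrence alone can produce wrong links.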
Innovation Award Talk
• Principles of Very Large Scale Modeling
by Pedro Domingos, from University of Washington.
• Three principles:
1. Model the whole, not just parts;
People (customers) influence each other - model the whole
network, not each person separately.
2. Tame complexity via hierarchical decomposition;
We can make 2 assumptions: 1) Subparts are independent given the
part; 2) The probability for a class is the average over its subclasses.
Combining the hierarchy with these 2 assumptions makes inference tractable.
Example: Markov Logic Network + Sum-Product Theorem = Tractable
Markov Log
3. Time and space should not depend on data size.
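The two assumptions under principle 2 can be illustrated with a small sketch: independent subparts multiply, and a class averages over its subclasses, so one bottom-up pass over the hierarchy gives the probability. The dictionary-based node layout, the helper names, and the customer example below are assumptions made for illustration, not the formulation used in the talk.

```python
from math import prod


def leaf(var, dist):
    # A leaf holds a distribution over the values of one attribute.
    return {"kind": "leaf", "var": var, "dist": dist}


def parts(*children):
    # Assumption 1: subparts are independent given the part.
    return {"kind": "parts", "children": children}


def subclasses(*children):
    # Assumption 2: the class probability is the average over subclasses.
    return {"kind": "subclasses", "children": children}


def probability(node, evidence):
    """Evaluate the hierarchy bottom-up, visiting each node once."""
    if node["kind"] == "leaf":
        return node["dist"][evidence[node["var"]]]
    values = [probability(child, evidence) for child in node["children"]]
    if node["kind"] == "parts":
        return prod(values)  # independent subparts multiply
    return sum(values) / len(values)  # uniform average over subclasses


# Hypothetical hierarchy: a customer is either a student or a retiree.
student = parts(leaf("age", {"young": 0.8, "old": 0.2}),
                leaf("spend", {"low": 0.7, "high": 0.3}))
retiree = parts(leaf("age", {"young": 0.1, "old": 0.9}),
                leaf("spend", {"low": 0.5, "high": 0.5}))
customer = subclasses(student, retiree)

p = probability(customer, {"age": "young", "spend": "low"})
print(round(p, 3))
```

Here the student branch gives 0.8 × 0.7 and the retiree branch gives 0.1 × 0.5, and the class probability is their average. The cost grows with the number of nodes in the hierarchy, which is the sense in which this kind of decomposition keeps inference tractable.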
THANK YOU!
Video recordings of KDD:
http://videolectures.net/kdd2014_newyork/