
Topic Evolution and Social Interactions:
How Authors Effect Research
Ding Zhou, Xiang Ji,
Hongyuan Zha, C. Lee Giles
CIKM’06
Advisor: Prof. Hsin-Hsi Chen
Reporter: Yu-Hui Chang
2008/09/10
“Given a seemingly new topic,
from where does this topic
evolve?”
“What author or authors
cause such a transition between
topics?”
Introduction
• To interpret and understand changes in topic
dynamics in documents, we look for the social
reasons why a topic evolves and becomes
dependent on other topics.
– Consider an actor a_u associated with topic t_i at time k.
For some reason, this actor meets and establishes a
social tie with actor a_v, who is mostly associated
with a new topic t_j, and they then start to work on
the new topic with higher probability.
Introduction
• We identify the Markov topic transition matrix
via maximum likelihood estimation of the 1st- and
2nd-order constraints brought about by the
hidden social interactions of authors.
Introduction
• Our contributions are:
– (1) a model of the topic dynamics in social documents that
connects the temporal topic dependency with the latent social
interactions;
– (2) a novel method to estimate the Markov transition matrix of
topics based on social interactions of different orders;
– (3) the use of the properties of finite-state Markov processes as
the basis for discovering a hierarchical clustering of topics,
where each cluster is a Markov metastable state;
– (4) a new topic-dependent metric for ranking social actors
based on their social impact. We test this metric by applying it
to CiteSeer authors.
Problem Definition
Social Network
Problem Formalization
• Transform the document-word matrix DW into the
document-topic matrix DT using LDA
• Using the document-author matrix DA, a collaboration
matrix A is obtained by setting {α_{i,j}}_{A×A} = A = (DA)^T DA
• Let the author set be Λ
• where a is the set of authors on a document and t is
the distribution over topics specifying this document
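A minimal sketch of the collaboration-matrix step, assuming DA is a binary document-by-author matrix (rows are documents, columns are authors). The function name `collaboration_matrix` and the toy data are illustrative and do not come from the paper:

```python
def collaboration_matrix(da):
    """Return A = (DA)^T DA for a document-by-author matrix DA."""
    n_authors = len(da[0]) if da else 0
    a = [[0] * n_authors for _ in range(n_authors)]
    for row in da:  # one document at a time
        for i in range(n_authors):
            if row[i]:
                for j in range(n_authors):
                    a[i][j] += row[i] * row[j]
    return a


# Toy data (assumed): 3 documents, 3 authors.
da = [
    [1, 1, 0],  # document 0 written by authors 0 and 1
    [1, 0, 1],  # document 1 written by authors 0 and 2
    [1, 1, 0],  # document 2 written by authors 0 and 1
]
A = collaboration_matrix(da)
```

With a binary DA, each diagonal entry of A counts the documents of one author and each off-diagonal entry counts the documents two authors wrote together.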
Multiple orders of social
interactions
Idea: “collaborations bring about new topics”.
Social Interactions & Markov
Topic Transition
Model Estimation &
Markov Metastable State
Discovery
Computing P(t_i|t_j) then costs O((NLT + NL^2)(A + A^2)),
which is bounded by O(A^2 NLT)
Markov Metastable State
Discovery
• A Markov chain is called nearly uncoupled if:
– the state space can be decomposed into several
disjoint subsets A_i such that ω_π(A_i|A_j) ≈ 1 for i = j
and ω_π(A_i|A_j) ≈ 0 for i ≠ j.
• Each aggregate in a nearly uncoupled Markov
chain M is called a metastable state of M.
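The condition above can be checked numerically on a toy chain. This is a sketch under stated assumptions: ω_π is read as the one-step probability of moving from one subset to another when the chain is started from the stationary distribution π restricted to the source subset, π is approximated by power iteration (which assumes the chain is irreducible and aperiodic), and the matrix `p`, the `blocks`, and the function names are all invented for illustration:

```python
def stationary_distribution(p, iters=1000):
    """Approximate the stationary distribution of a row-stochastic matrix."""
    n = len(p)
    pi = [1.0 / n] * n
    for _ in range(iters):  # repeated pi <- pi P
        pi = [sum(pi[a] * p[a][b] for a in range(n)) for b in range(n)]
    return pi


def coupling(p, pi, src, dst):
    """One-step probability of going from subset src to subset dst under pi."""
    mass = sum(pi[a] for a in src)
    flow = sum(pi[a] * p[a][b] for a in src for b in dst)
    return flow / mass


# Toy row-stochastic matrix with two tightly knit groups of states (assumed).
p = [
    [0.90, 0.09, 0.01, 0.00],
    [0.09, 0.90, 0.00, 0.01],
    [0.01, 0.00, 0.90, 0.09],
    [0.00, 0.01, 0.09, 0.90],
]
blocks = [[0, 1], [2, 3]]
pi = stationary_distribution(p)
# w[s][d] is the coupling from block s to block d.
w = [[coupling(p, pi, src, dst) for dst in blocks] for src in blocks]
```

For this toy chain the diagonal entries of `w` are close to 1 and the off-diagonal entries close to 0, so each block would count as a metastable state under the assumed reading.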
Experiment
Data preparation
• Corpus: CiteSeer
– over 739,135 academic documents
– 418,809 distinct authors (after name
disambiguation)
– published from 1991 to 2004
– authors with fewer than 50 publications (from
1991 to 2004) are eliminated
Data preparation
• Associate each document with its list of
disambiguated authors
• Perform a breadth-first search
– Search the co-authorship graph from several
predefined, well-known author seeds until the graph
is completely connected or no new nodes are found.
– Michael Jordan and Jiawei Han are chosen as seeds,
from statistical learning and from data mining and
databases, respectively.
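The seed-based search can be sketched as a standard breadth-first traversal. The adjacency-list representation, the function name `bfs_authors`, and the toy graph are assumptions made for illustration; the slides do not describe the actual data structures:

```python
from collections import deque


def bfs_authors(coauthors, seeds):
    """Collect every author reachable from the seeds in the co-authorship graph."""
    visited = set(seeds)
    queue = deque(seeds)
    while queue:  # stops when no new nodes are found
        author = queue.popleft()
        for neighbor in coauthors.get(author, ()):
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append(neighbor)
    return visited


# Toy co-authorship graph (assumed): author -> list of co-authors.
coauthors = {
    "seed_a": ["x", "y"],
    "x": ["seed_a", "y"],
    "y": ["seed_a", "x", "z"],
    "z": ["y"],
    "seed_b": ["w"],
    "w": ["seed_b"],
    "isolated": [],
}
reached = bfs_authors(coauthors, ["seed_a", "seed_b"])
```

Authors with no path to any seed, such as `isolated` here, are left out of the resulting subset.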
Discovered topics
• Train a Latent Dirichlet Allocation (LDA) model
with the number of topics set to T = 50
– T is small because we work only on a small subset of
CiteSeer authors (3,974 out of 418,809).
Discovered topics
Markov topic transition
• We use the properties of finite-state Markov processes
as the basis for discovering a hierarchical clustering of
topics, where each cluster is a Markov metastable
state.
[Figure: topic transition matrix after permutation]
Markov topic transition
• Permute the matrix Γ so that it is
approximately block diagonal
– The metastable states in effect reduce the
original Markov transition process to a new
Markov process with fewer states
– Each diagonal block can be seen as a metastable
state, which is a cluster of topics with tight
intra-transition edges.
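A small sketch of the permutation step, assuming the grouping of topics is already known. How the grouping is found is not shown on the slides, so the `order`, the toy matrix `gamma`, and both function names are hypothetical:

```python
def permute(matrix, order):
    """Reorder rows and columns of a square matrix by the same index order."""
    return [[matrix[i][j] for j in order] for i in order]


def off_block_mass(matrix, block_sizes):
    """Sum of the entries that fall outside the diagonal blocks."""
    labels = []
    for block, size in enumerate(block_sizes):
        labels.extend([block] * size)
    n = len(matrix)
    return sum(
        matrix[i][j] for i in range(n) for j in range(n) if labels[i] != labels[j]
    )


# Toy transition matrix (assumed): topics 0 and 2 interact, as do topics 1 and 3.
gamma = [
    [0.78, 0.02, 0.20, 0.00],
    [0.02, 0.68, 0.00, 0.30],
    [0.20, 0.00, 0.78, 0.02],
    [0.00, 0.30, 0.02, 0.68],
]
order = [0, 2, 1, 3]  # places the topics of each cluster next to each other
permuted = permute(gamma, order)
leak = off_block_mass(permuted, [2, 2])
```

After the permutation, the large entries sit in two 2×2 diagonal blocks and `leak` measures how far the matrix is from being exactly block diagonal.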
• We observe that the diagonal
elements show high
self-transition
probabilities
• Both matrices are almost
symmetric, meaning that the
pairwise transitions between
topics in the same mTopic
are largely balanced.
[Figure: transition graph among the five mTopics; node labels include data management, data mining, numerical analysis, machine learning]
• Transitions with probability lower than 0.16 are hidden
from the graph to clarify the major transitions among the
five mTopics.
• mT4 (numerical analysis) is essential among these
mTopics. It has a transition to mT5 (statistical
methods), which is tightly coupled with research
in mT1 (data management and data mining).
• The results also imply that researchers in mT3 (networks)
will be concerned with mT2 (systems).
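Hiding weak transitions before drawing the graph amounts to a simple threshold filter. In this sketch the function name `visible_transitions`, the state names, and every probability are made up for illustration and are not the paper's results; it also assumes `matrix[i][j]` is the probability of moving from state i to state j and keeps self-transitions:

```python
def visible_transitions(matrix, names, threshold=0.16):
    """Keep only transitions whose probability is at least the threshold."""
    edges = []
    for i, row in enumerate(matrix):
        for j, prob in enumerate(row):
            if prob >= threshold:
                edges.append((names[i], names[j], prob))
    return edges


# Toy aggregated transition matrix (assumed values).
names = ["a", "b", "c"]
matrix = [
    [0.70, 0.20, 0.10],
    [0.15, 0.80, 0.05],
    [0.16, 0.04, 0.80],
]
edges = visible_transitions(matrix, names)
```

Only the edges returned here would be drawn; the others are hidden from the graph, not removed from the underlying matrix.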
Who powers the topic transition
• We give a new metric δ(a_u), the author impact
ratio of a_u, which measures the difference between the
P(t_i|t_j) values obtained with and without a_u.
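The slide does not give the formula for δ(a_u), so the following is only a stand-in: it takes two already estimated transition matrices, one with the author's documents and one without, and uses the mean absolute difference as an assumed distance. The function name `impact` and the toy matrices are hypothetical:

```python
def impact(p_with, p_without):
    """Mean absolute difference between two topic transition matrices."""
    n = len(p_with)
    total = sum(
        abs(p_with[i][j] - p_without[i][j]) for i in range(n) for j in range(n)
    )
    return total / (n * n)


# Toy matrices (assumed): estimates with and without one author's documents.
p_with = [
    [0.6, 0.4],
    [0.3, 0.7],
]
p_without = [
    [0.8, 0.2],
    [0.3, 0.7],
]
delta = impact(p_with, p_without)
```

Under this stand-in, an author whose removal leaves the matrix unchanged scores 0, and larger values indicate a larger effect on the estimated transitions.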
Conclusion
• We relate social actors to their associated
topics and use them to derive topic trends.
• We model the topic dynamics as a Markov
chain and discover the probabilistic
dependency between topics from the latent
social interactions.
Thanks
Any Questions?