He11topic8 - socialnetworks-2011

Dynamics of Conversations
Presented by Junfeng He
Outline
Data Description
Observations
Branching Processes Model
– And why it is not sufficient
TI-Model (T: Time I: Identity)
– And why it is good
Anecdotal Examples
Data Description
Statistics
Each record (message) contains:
– id of the message
– id of the parent message
– the author of the message
– timestamp
Every thread is a tree with each node as a message
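The record layout above maps naturally onto a small data structure. The following is a minimal sketch, assuming hypothetical field names (`msg_id`, `parent_id`, `author`, `timestamp`) and assuming that the root message of a thread has a `parent_id` of `None`. It rebuilds the reply tree and computes the size and depth of a thread.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Message:
    # Field names are illustrative; the slides only list the four kinds of fields.
    msg_id: int
    parent_id: Optional[int]  # assumed None for the root message of a thread
    author: str
    timestamp: float


def build_children(messages):
    """Map each message id to the ids of its direct replies."""
    children = defaultdict(list)
    for m in messages:
        if m.parent_id is not None:
            children[m.parent_id].append(m.msg_id)
    return children


def size_and_depth(root_id, children):
    """Return (number of messages, depth in edges) of the thread rooted at root_id."""
    size, depth = 0, 0
    stack = [(root_id, 0)]
    while stack:
        node, level = stack.pop()
        size += 1
        depth = max(depth, level)
        for child in children.get(node, []):
            stack.append((child, level + 1))
    return size, depth


# A made-up four-message thread, used only to exercise the functions.
thread = [
    Message(1, None, "alice", 0.0),
    Message(2, 1, "bob", 1.0),
    Message(3, 1, "carol", 2.0),
    Message(4, 2, "alice", 3.0),
]
print(size_and_depth(1, build_children(thread)))
```

Depth is counted in edges here, so a thread consisting of a single root message has depth 0; the slides do not state which convention the data uses.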
Observations
Size and depth
– Heavy tailed for both size and depth
– The size is roughly quadratic in the depth
Preferential attachment (logarithmic diameter) is not sufficient here
Observations
Degree
– Power law
– Depends on the level
Observations
Authorship
– Polynomial relationship between the number of authors and the size of the thread
Branching Processes (BP) Model
Each thread starts with a root node
At the i-th step, each leaf at the i-th level independently generates a certain number of children according to a distribution p
The process terminates when no new children are generated
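The generation procedure above can be sketched as a short simulation. This is an illustrative sketch, not the presenter's code: `example_p` is a hypothetical offspring distribution with mean 0.9, and `max_size` is a safety cap that is not part of the model.

```python
import random


def simulate_bp(offspring, rng, max_size=10_000):
    """Simulate one thread under a branching process.

    offspring(rng) returns the number of children of one node (a draw from p).
    Returns (size, depth); max_size is a safety cap, not part of the model.
    """
    size, depth = 1, 0
    current_level = 1  # number of nodes at the current level (just the root)
    while current_level > 0 and size < max_size:
        # Every node at the current level draws its children independently.
        next_level = sum(offspring(rng) for _ in range(current_level))
        if next_level > 0:
            depth += 1
            size += next_level
        current_level = next_level
    return size, depth


def example_p(rng):
    # Hypothetical distribution: 0, 1, or 2 children, mean 0.3 + 2 * 0.3 = 0.9.
    return rng.choices([0, 1, 2], weights=[0.4, 0.3, 0.3])[0]


rng = random.Random(0)
threads = [simulate_bp(example_p, rng) for _ in range(1000)]
print(max(size for size, _ in threads), max(depth for _, depth in threads))
```

Collecting the simulated sizes and depths into histograms is one way to compare the model against the observed distributions.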
Properties of BP model
The size of the thread Z depends on the mean μ of the distribution p
The size following a power law is linked to p following a power law
If μ ≤ 1, the depth cannot follow a power law (i.e., it cannot be heavy-tailed)
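As a worked illustration of how the size depends on μ (a standard branching-process fact, not taken from the slides): level i contains μ^i nodes in expectation, so for μ < 1 the expected size is a geometric series.

```latex
\mathbb{E}[Z] \;=\; \sum_{i \ge 0} \mu^{i} \;=\; \frac{1}{1-\mu}, \qquad \mu < 1 .
```

For example, μ = 0.9 gives an expected thread size of 10 messages, and the expectation grows without bound as μ approaches 1.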
Why BP model is not sufficient
Depth distribution is inconsistent with observations
[Figure: depth distribution, BP model vs. observations]
Why BP model is not sufficient
It cannot capture
– The quadratic relationship between size and depth
– How degree depends on level
It ignores the time (order) and author information of each message
T-Model: model with time
– The probability of adding a child to a node v depends on its degree deg_v and its recency r_v [equation not preserved in this transcript]
– The probability is higher if
the degree of v is large, i.e., deg_v is large
the message is new, i.e., r_v is small
TI-Model: model with time and identity
Authors tend to respond to those who respond to their earlier message
v: one message, a(v): author of v
A’(v): authors on the path from the parent of v to the root
A: all authors in the whole data set
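The set A’(v) defined above is straightforward to compute from parent pointers. This is a minimal sketch assuming hypothetical dictionaries `parent` (message id to parent id, `None` for the root) and `author` (message id to author name). How the TI-model uses this set when choosing an author is not preserved in this transcript, so the sketch only computes the set.

```python
def path_authors(v, parent, author):
    """Return A'(v): authors of the messages on the path from parent(v) to the root."""
    authors = set()
    node = parent[v]
    while node is not None:
        authors.add(author[node])
        node = parent[node]
    return authors


# A made-up chain of four messages: 1 <- 2 <- 3 <- 4.
parent = {1: None, 2: 1, 3: 2, 4: 3}
author = {1: "alice", 2: "bob", 3: "alice", 4: "carol"}
print(sorted(path_authors(4, parent, author)))
```

In this example, "alice" wrote message 3 in reply to "bob", who had replied to her message 1, which is the kind of back-and-forth the identity component is meant to capture.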
Why is the TI-model good?
It captures the quadratic relationship between size and depth
[Figure: size vs. depth, observations vs. TI model]
It captures how degree depends on level
[Figure: degree by level, observations vs. TI model]
It captures the polynomial relationship between the number of authors and the number of messages
[Figure: number of authors vs. number of messages, observations vs. TI model]
Anecdotal Examples for Usenet
Preferential behavior
Recency
Identity copying
Anecdotal Examples for Y! Groups and Twitter
Preferential behavior
Recency
Thank You!