Who says what to whom on
Twitter
Xiaomei Wu
Winter A. Mason
Jake M. Hofman
Duncan J. Watts
Main research of this paper
Introduce a method for classifying users using Twitter Lists into
“elite” and “ordinary” users, further classifying elite users into
one of four categories of interest: media, celebrities,
organizations, and bloggers.
Investigate the flow of information among these categories,
finding that although audience attention is highly concentrated
on a minority of elite users, much of the information they
produce reaches the masses indirectly via a large population of
intermediaries.
Find that different categories of users emphasize different types
of content, and that different content types exhibit dramatically
different characteristic lifespans, ranging from less than a day to
months.
Motivation of this paper
Theories of communications have tended to focus
either on “mass” communication or on
“interpersonal” communication
New channels of mass communication: cable
television, satellite radio, specialist book and
magazine publishers, sponsored blogs, online
communities, and social news sites.
New channels of interpersonal
communication: personal blogs, email lists, and
social networking sites
“Masspersonal” channel: Twitter
Related work
Kwak et al. studied the
topological features of the
Twitter follower graph and
compared three measures of influence:
number of followers, PageRank, and number of
retweets
Cha et al. compared three
measures of influence:
number of followers, number
of retweets, and number of
mentions
Weng et al. compared number
of followers and PageRank
with a modified PageRank
measure that accounted
for topic
Bakshy et al. studied the
distribution of retweet
cascades on Twitter
Innovation of this paper
Shift attention to the flow of information
among different categories of users. Focus
on identifying four specific categories of “elite”
users: media, celebrities, organizations, and
bloggers.
Data and methods
Data set
Twitter follower
graph
Twitter firehose
Twitter list
Twitter follower graph
Observed by Kwak et al. as of July 31, 2009
Included 42M users and 1.5B edges
The follower graph is a directed network
characterized by skewed distributions both of
in-degree (followers) and out-degree (friends)
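As a minimal sketch of what "directed" means here, the snippet below counts in-degree (followers) and out-degree (friends) from a toy edge list. The (follower, followee) edge convention, the function name, and the sample account names are assumptions for illustration, not the original data or method.

```python
from collections import Counter

def degree_counts(edges):
    """Return (in_degree, out_degree) Counters for (follower, followee) edges."""
    in_degree = Counter()   # number of followers per user
    out_degree = Counter()  # number of friends (accounts followed) per user
    for follower, followee in edges:
        out_degree[follower] += 1
        in_degree[followee] += 1
    return in_degree, out_degree

# Hypothetical toy graph: three users follow "news_outlet"; alice also follows bob.
edges = [
    ("alice", "news_outlet"),
    ("bob", "news_outlet"),
    ("carol", "news_outlet"),
    ("alice", "bob"),
]
in_degree, out_degree = degree_counts(edges)
print(in_degree["news_outlet"], out_degree["alice"])
```

Even in this tiny example the in-degree is concentrated on one account, which is the kind of skew the slide describes.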
Twitter firehose
5B tweets generated over 223 days, from
July 28, 2009 to March 8, 2010, from the
Twitter firehose
Focus on the subset of 260M tweets containing bit.ly
URLs
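Selecting the tweets that contain bit.ly URLs could be sketched roughly as follows. The regular expression and the sample tweets are assumptions for illustration; the slides do not say how the subset was actually extracted.

```python
import re

# Assumed pattern: http(s)://bit.ly/<alphanumeric code>. Real links may vary.
BITLY_PATTERN = re.compile(r"https?://bit\.ly/[A-Za-z0-9]+")

def extract_bitly_urls(text):
    """Return all bit.ly URLs found in a tweet's text."""
    return BITLY_PATTERN.findall(text)

# Hypothetical sample tweets.
tweets = [
    "Great read: http://bit.ly/abc123",
    "Just had lunch",
    "See https://example.com/page for details",
]
tweets_with_bitly = [t for t in tweets if extract_bitly_urls(t)]
print(tweets_with_bitly)
```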
Twitter lists
Conclusions
Who listens to whom
Who listens to what
Two-step flow of information
Lifespan of content
Lifespan by category
Who listens to whom
0.05% of the population
accounts for almost half
of all posted URLs.
Attention is highly
homophilous: celebrities
follow celebrities,
media follow media,
and bloggers follow
bloggers.
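One simple way to make "homophilous attention" concrete is to measure, for each category, the share of its follow links that point to accounts in the same category. The sketch below does this on a toy graph. The function name, the category labels, and the edges are hypothetical, and this is not necessarily the measure used in the paper.

```python
from collections import defaultdict

def same_category_share(edges, category):
    """For each category, the fraction of its follow links that stay in-category.

    edges:    iterable of (follower, followee) pairs
    category: dict mapping each account to its category label
    """
    same = defaultdict(int)
    total = defaultdict(int)
    for follower, followee in edges:
        source_category = category[follower]
        total[source_category] += 1
        if category[followee] == source_category:
            same[source_category] += 1
    return {c: same[c] / total[c] for c in total}

# Hypothetical labels and follow links.
category = {"c1": "celebrity", "c2": "celebrity", "m1": "media", "m2": "media"}
edges = [("c1", "c2"), ("c2", "c1"), ("c1", "m1"), ("m1", "m2")]
share = same_category_share(edges, category)
print(share)
```

A share close to 1 for a category means its members mostly follow each other.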
Who listens to what
Categories: World News, U.S.
News, Business,
Sports, Health, Technology,
Science, Arts
Organizations show little interest
in business and arts-related
stories, and high interest in
science, technology, and
possibly world news. Celebrities,
by contrast, show greater
interest in sports and less
interest in health, while the
media shows somewhat greater
interest in U.S. news stories.
Two-step flow of information
Half the information that originates from the media
passes to the masses indirectly via a diffuse
intermediate layer of opinion leaders
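The idea of indirect exposure can be illustrated with a small sketch. A user receives a URL directly if they follow the media account that posted it, and indirectly if they do not but follow an intermediary who passed it on. The data structures, the function name, and this exact definition are assumptions for illustration, not the paper's procedure.

```python
def indirect_fraction(follows, originators, reposters):
    """Fraction of (user, URL) exposures that arrive only via an intermediary.

    follows:     dict user -> set of accounts the user follows
    originators: dict url -> media account that first posted the URL
    reposters:   dict url -> set of intermediaries who passed the URL on
    """
    direct = 0
    indirect = 0
    for followed in follows.values():
        for url, source in originators.items():
            if source in followed:
                direct += 1
            elif followed & reposters.get(url, set()):
                indirect += 1
    total = direct + indirect
    return indirect / total if total else 0.0

# Hypothetical example: u1 follows the media source, u2 only follows an
# intermediary, and u3 is never exposed to the URL.
follows = {"u1": {"media_a"}, "u2": {"friend"}, "u3": {"other"}}
originators = {"url1": "media_a"}
reposters = {"url1": {"friend"}}
fraction = indirect_fraction(follows, originators, reposters)
print(fraction)
```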
Lifespan of content
Different types of content exhibit different lifespans
Classic music videos, movie clips, and long-format
magazine articles have longer lifespans than daily news
stories
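A lifespan comparison like this can be sketched by taking, for each URL, the time between its first and last observed post. That definition, the function name, and the sample timestamps are assumptions made for illustration.

```python
from datetime import datetime

def url_lifespans(posts):
    """Map each URL to the time between its first and last observed post.

    posts: iterable of (url, timestamp) pairs, in any order
    """
    first_seen = {}
    last_seen = {}
    for url, timestamp in posts:
        if url not in first_seen or timestamp < first_seen[url]:
            first_seen[url] = timestamp
        if url not in last_seen or timestamp > last_seen[url]:
            last_seen[url] = timestamp
    return {url: last_seen[url] - first_seen[url] for url in first_seen}

# Hypothetical posts: a news story shared within one day, and a music
# video that is still being shared three months later.
posts = [
    ("news_story", datetime(2009, 8, 1, 9)),
    ("news_story", datetime(2009, 8, 1, 18)),
    ("music_video", datetime(2009, 8, 1)),
    ("music_video", datetime(2009, 11, 1)),
]
lifespans = url_lifespans(posts)
for url, span in lifespans.items():
    print(url, span)
```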
Lifespan by category
For the vast majority of
URLs, longevity is
determined by
rediscovery
For URLs introduced by
elite users, longevity is
determined by retweets
Strength & Weakness
Use Twitter as the
research object
Classify users, via Twitter lists, into
elite users and ordinary
users
Emphasize elite users
Restrict attention just to
URLs on Twitter
Overlook the
unanticipated
categories that may be
of equal or greater
relevance than the
selected four categories
Future work
Apply similar methods to quantifying information flow
via more traditional channels, such as TV and radio
Explore automatic classification schemes from which
additional user categories could emerge.
Extract content information in a more systematic
manner (the “what” of Lasswell’s maxim).
Focus more on the effects of
communication by merging the data on
information flow on Twitter with other sources of
outcome data.
Thank you