News Networks
San Francisco Bay Area News Ecology
Daniel Ramos
CS790G
Fall 2010
Outline
Introduction
Related Work
Methodology
Conclusion
Evolution of News
Keeping current on events has changed radically.
“Mass Media”
Radio, television, newspapers, magazines, books, etc.
“Networked Media”
Based on the Internet.
Collaborative and global in nature.
Role of Journalists
Traditional Reporting
Journalists worked mostly alone and locally.
National and international news came from other
organizations (e.g., the Associated Press).
Networked Media Reporting
Journalists can easily talk to others across the globe.
They can freelance for many news outlets.
Project Goals
Use network analysis to characterize a “news
ecosystem”.
Traditional news outlets are shrinking.
Start-up news organizations are quickly forming.
Using the San Francisco Bay Area as the case study.
Is it transitioning from “mass media” to “networked
media”?
Tracing Ties
Between news organizations
Between reporters and anybody else
Between users and news organizations
Potentially measure the density of ties
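One common way to quantify the density of ties is the fraction of possible directed links that actually exist. The following is a minimal sketch under that assumption; the function name and the sample sites are illustrative and not taken from the project.

```python
def directed_density(nodes, edges):
    """Fraction of possible directed ties (no self-links) that are present."""
    n = len(nodes)
    if n < 2:
        return 0.0
    # Ignore self-links and count each (source, target) pair once.
    distinct = {(u, v) for u, v in edges if u != v}
    return len(distinct) / (n * (n - 1))


# Hypothetical example: three sites with two ties between them.
sites = {"a.example", "b.example", "c.example"}
ties = [("a.example", "b.example"), ("b.example", "c.example")]
density = directed_density(sites, ties)
```

With three sites there are six possible directed ties, so two observed ties give a density of one third.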
Related Work
I. Himelboim, “The International Network
Structure of News Media: An Analysis of
Hyperlinks Usage in News Web sites”
Analyzed 6,298 foreign news stories.
223 news web sites.
73 countries.
Studied use of external hyperlinks.
Related Work (Cont.)
Found news sites rarely used external hyperlinks.
Only 6% had one or more.
If they did, it followed patterns based on:
Preferential attachment theory
World System Theory
Conclusions
Journalists are trained not to reveal sources.
News sites distrust outside sources.
External links lead users away from the news site.
Related Work (Cont.)
Gordon, Contractor, and Johnson, “Linking
Audiences to News and Information: A
Network Analysis of Chicago Websites”
Collected a list of 277 “seed sites”
Categorized the sites into: legacy, legacy-affiliated,
micropublisher, organization/institution, national
brand, and service.
Used a web crawler to examine links.
Related Work (Cont.)
Conclusion
Organizations are authorities.
Micropublishers and organizations are hubs.
Organizations are intermediaries and switchboards.
Organizations are most prestigious.
Methodology
Use network theory to study three main ties:
News Organization to News Organization
Journalists to the “Community”
Commenters to News Organizations
News Organization Ties
Compiled a list of 143 different web sites we
feel encompasses our news ecosystem.
Traditional news outlets' web sites
Blogs
Other non-traditional (e.g. news aggregators)
Use a web crawler to crawl the seed sites and
record all external links to a database.
Each site will be its own network at first.
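The link-recording step can be sketched in a few lines. This is not the crawler the project uses (WebSPHINX is listed under Tools); it is a standard-library Python illustration that assumes "external" means "different host name", and the sample page and URLs are made up.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collects the href of every <a> tag on a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def external_links(page_url, html):
    """Return absolute http(s) URLs on the page that point to a different host."""
    collector = LinkCollector()
    collector.feed(html)
    own_host = urlparse(page_url).netloc.lower()
    found = []
    for href in collector.hrefs:
        absolute = urljoin(page_url, href)  # resolve relative links
        parsed = urlparse(absolute)
        if parsed.scheme in ("http", "https") and parsed.netloc.lower() != own_host:
            found.append(absolute)
    return found


# Hypothetical page content for illustration.
sample = '<a href="/local">x</a> <a href="http://other.example/story">y</a>'
links = external_links("http://news.example/index.html", sample)
```

Comparing host names exactly treats subdomains of the same outlet as different sites, so a real crawl would likely need a rule for grouping them.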
News Organization Ties
Duplicate links won’t be stored separately; instead, the
number of references will be counted.
News Organization graphs will be generated
from the database.
Nodes are websites.
Edges are directional hyperlink references.
Edge weights are number of times linked.
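A minimal sketch of this graph, assuming the database yields (source URL, target URL) pairs and that a site is identified by its host name. The record layout and the sample URLs are assumptions made for illustration.

```python
from collections import Counter
from urllib.parse import urlparse


def build_site_graph(link_records):
    """Map each directed (source_site, target_site) edge to its link count."""
    weights = Counter()
    for source_url, target_url in link_records:
        source = urlparse(source_url).netloc.lower()
        target = urlparse(target_url).netloc.lower()
        # Skip malformed URLs and links that stay on the same site.
        if source and target and source != target:
            weights[(source, target)] += 1
    return weights


# Hypothetical link records.
records = [
    ("http://news.example/a", "http://blog.example/1"),
    ("http://news.example/b", "http://blog.example/2"),
    ("http://blog.example/1", "http://news.example/a"),
]
graph = build_site_graph(records)
```

Here the edge from news.example to blog.example has weight 2, and the reverse edge has weight 1.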
News Organization Ties
Categorizing Links
First pass will try to categorize linked sites as news sites if
they match the seed site list.
Second pass will require manual human coding.
Remove all links to sites deemed not to be news organizations.
Merge all independent networks together.
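The two passes and the merge might look like the sketch below. It assumes each per-site network is a mapping from (source, target) to link count, and that manual coding produces a set of approved targets; all names and sample data are hypothetical.

```python
from collections import Counter


def first_pass(targets, seed_sites):
    """Split link targets into seed-list matches and ones needing manual coding."""
    news = {t for t in targets if t in seed_sites}
    needs_review = set(targets) - news
    return news, needs_review


def merge_networks(per_site_networks, keep):
    """Sum edge weights across networks, keeping only approved targets."""
    merged = Counter()
    for network in per_site_networks:
        for (source, target), weight in network.items():
            if target in keep:
                merged[(source, target)] += weight
    return merged


# Hypothetical seed list and per-site networks.
seeds = {"news.example", "blog.example"}
net_a = {("news.example", "blog.example"): 2, ("news.example", "shop.example"): 5}
net_b = {("blog.example", "news.example"): 1}

targets = {target for net in (net_a, net_b) for (_, target) in net}
news, needs_review = first_pass(targets, seeds)
merged = merge_networks([net_a, net_b], keep=news)
```

Sites that a human coder approves in the second pass would simply be added to `keep` before merging.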
News Organization Ties
Metrics
Degree (both in and out)
Betweenness
Determine hubs and authorities.
Determine which sites link otherwise unconnected sites.
Centrality
Determine which sites are important to the network.
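As an illustration of two of these metrics, the sketch below computes in- and out-degree and unnormalized betweenness (Brandes' algorithm) for a directed, unweighted edge list. The project lists Pajek and GUESS or Gephi for analysis, so this is only a worked example, and using degree as a rough proxy for hubs and authorities is an assumption here.

```python
from collections import defaultdict, deque


def degrees(edges):
    """Return (in_degree, out_degree) dicts counting distinct neighbors."""
    in_deg, out_deg = defaultdict(int), defaultdict(int)
    for source, target in set(edges):
        out_deg[source] += 1
        in_deg[target] += 1
    return dict(in_deg), dict(out_deg)


def betweenness(edges):
    """Unnormalized betweenness for a directed, unweighted graph."""
    adj = defaultdict(set)
    nodes = set()
    for u, v in edges:
        adj[u].add(v)
        nodes.update((u, v))
    score = dict.fromkeys(nodes, 0.0)
    for s in nodes:
        # Breadth-first search from s, counting shortest paths (sigma).
        order, preds = [], defaultdict(list)
        sigma = dict.fromkeys(nodes, 0)
        sigma[s] = 1
        dist = {s: 0}
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Accumulate pair dependencies in reverse search order.
        delta = dict.fromkeys(nodes, 0.0)
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    return score


# Hypothetical network: a and d both reach c only through b.
edges = [("a", "b"), ("b", "c"), ("d", "b")]
in_deg, out_deg = degrees(edges)
scores = betweenness(edges)
```

In this toy network, b lies on both shortest paths that have an intermediate node, which is the kind of site that links otherwise unconnected sites.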
Journalists to the Community
Determine the linking patterns of reporters who
publish on the seed sites.
Traditional writing versus using the web to its full
potential.
Use a web crawler to crawl the seed sites and
record all external links to a database.
Focus on only a few larger sites.
No standard for bylines of article authors.
Requires site specific crawling rules.
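Site-specific rules could be kept as a small table of patterns, one per site. The sketch below assumes regular expressions are enough to find a byline; the site names and HTML patterns are invented for illustration and do not describe any real outlet.

```python
import re

# Hypothetical per-site byline patterns; real sites would need their own rules.
BYLINE_RULES = {
    "news.example": re.compile(r'<span class="byline">By ([^<]+)</span>'),
    "blog.example": re.compile(r'<meta name="author" content="([^"]+)"'),
}


def extract_author(site, html):
    """Return the author name using the site's rule, or None if nothing matches."""
    rule = BYLINE_RULES.get(site)
    if rule is None:
        return None  # no rule written for this site yet
    match = rule.search(html)
    return match.group(1).strip() if match else None


author = extract_author("news.example", '<span class="byline">By Jane Doe</span>')
```

Keeping the rules in one table makes it easy to see which sites are covered, which fits the plan to focus on only a few larger sites.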
Journalists to the Community
Journalist graph will be generated from the
database.
Forms a bipartite graph.
Nodes are authors and sites.
Edges represent an author linking to a site.
Some manual human coding required to remove
non-community sites.
Journalists to the Community
Metrics
Degree (both for journalists and sites)
Determine which authors cite more often
Determine which sites are referenced most often.
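Both degree measures can be read off the bipartite edge list. A minimal sketch, assuming the database yields (author, site) pairs; the names and data are hypothetical.

```python
from collections import defaultdict


def bipartite_degrees(author_site_links):
    """Return (author_degree, site_degree) counting distinct partners."""
    sites_by_author = defaultdict(set)
    authors_by_site = defaultdict(set)
    for author, site in author_site_links:
        sites_by_author[author].add(site)
        authors_by_site[site].add(author)
    author_degree = {a: len(s) for a, s in sites_by_author.items()}
    site_degree = {s: len(a) for s, a in authors_by_site.items()}
    return author_degree, site_degree


# Hypothetical author-to-site links; the repeated pair counts once.
links = [
    ("Author A", "city.example"),
    ("Author A", "school.example"),
    ("Author B", "city.example"),
    ("Author A", "city.example"),
]
author_degree, site_degree = bipartite_degrees(links)
```

Degree here counts distinct partners, so an author who links the same site many times still contributes one edge; counting raw citation frequency would need edge weights instead.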
Commenters to News Organizations
Determine the patterns of users who comment
on stories on the seed sites.
How do they interact with news organizations and
each other?
Use a web crawler to crawl the seed sites and
record all commenters to a database.
Focus on only a few larger sites.
No standard for user comments and accounts.
Requires site specific crawling rules.
Commenters to News Organizations
Commenter graph will be generated from the
database.
Forms a bipartite graph.
Each site will be its own graph.
Nodes are commenters and news stories.
Edges represent a user commenting on a story.
Might require some manual human coding to
remove spam & bots.
Commenters to News Organizations
Metrics
Degree (both for users and stories)
Determine which users comment most.
Determine which stories garner most attention.
Transform to a 1-mode network containing only users.
Edge weights are the number of stories both users
commented on.
Do users form clusters and communities?
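The 1-mode transformation can be sketched as follows, assuming the database yields (user, story) pairs. The function name and sample data are illustrative only.

```python
from collections import Counter, defaultdict
from itertools import combinations


def project_onto_users(comments):
    """Map each user pair to the number of stories both commented on."""
    users_by_story = defaultdict(set)
    for user, story in comments:
        users_by_story[story].add(user)
    weights = Counter()
    for users in users_by_story.values():
        # Sort so each pair is always stored in the same order.
        for pair in combinations(sorted(users), 2):
            weights[pair] += 1
    return weights


# Hypothetical comment records.
comments = [
    ("u1", "s1"), ("u2", "s1"),
    ("u1", "s2"), ("u2", "s2"), ("u3", "s2"),
]
user_network = project_onto_users(comments)
```

In this example u1 and u2 share two stories, while u3 shares one story with each of them. Clusters could then be sought in the weighted user network.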
Tools
WebSPHINX
Pajek
GUESS or Gephi
Conclusion
Is news media transitioning because of new
technologies, as it has in the past?
How is the Internet affecting news outlets,
journalists, and readers?
Hopefully, network theory and analysis can help
find these answers.
References
[1] I. Himelboim, "The International Network Structure of News Media: An Analysis of Hyperlinks Usage in News Web sites," Journal of Broadcasting & Electronic Media, vol. 54, no. 3, pp. 373-390, July 2010.
[2] R. Gordon, N. Contractor, and Z. P. Johnson, "Linking Audiences to News and Information: A Network Analysis of Chicago Websites," unpublished, http://www.cct.org/sites/cct.org/files/CNM_LinkingAudiences1.pdf