News Networks

San Francisco Bay Area News Ecology
Daniel Ramos
CS790G
Fall 2010
Outline

Introduction

Related Work

Methodology

Conclusion
Evolution of News

Keeping current on events has changed radically.

“Mass Media”


Radio, television, newspapers, magazines, books, etc.
“Networked Media”
Based on the Internet.
 Collaborative and global in nature.

Role of Journalists

Traditional Reporting
Journalists worked mostly alone and locally.
 National/International news from other
organizations (Associated Press).


Network Media Reporting
Journalists can easily talk to others across the globe.
 They can freelance for many news outlets.

Project Goals

Use network analysis to characterize a “news
ecosystem”.
Traditional news outlets are shrinking.
 Start-up news organizations are quickly forming.
 Using the San Francisco Bay area.
 Transitioning from “mass media” to “networked
media”?

Tracing Ties

Between news organizations

Between reporters and anybody else

Between users and news organizations

Potentially measure the density of ties
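
The slide does not say how density would be measured. A minimal sketch, assuming the common definition for a directed network without self-links (observed ties divided by the n(n-1) possible ties); the function name is illustrative only:

```python
def directed_density(num_nodes, num_edges):
    """Share of possible directed ties (no self-links) that are present."""
    if num_nodes < 2:
        return 0.0  # no ties are possible with fewer than two nodes
    return num_edges / (num_nodes * (num_nodes - 1))


# Hypothetical example: 4 sites joined by 3 directed hyperlink ties.
example_density = directed_density(4, 3)
```
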
Related Work

I. Himelboim, “The International Network
Structure of News Media: An Analysis of
Hyperlinks Usage in News Web sites”
Analyzed 6,298 foreign news stories.
 223 news web sites.
 73 countries.
 Studied use of external hyperlinks.

Related Work (Cont.)

Found that news sites rarely used external hyperlinks.
Only 6% had one or more.

When they did, the links followed patterns based on:
Preferential Attachment Theory
 World System Theory


Conclusions
Journalists are trained not to reveal sources.
 Distrust of outside sources.
 External links lead users away from the news site.

Related Work (Cont.)

Gordon, Contractor, and Johnson, “Linking
Audiences to News and Information: A
Network Analysis of Chicago Websites”
Collected a list of 277 “seed sites”
 Categorized the sites into: legacy, legacy-affiliated,
micropublisher, organization/institution, national
brand, and service.
 Used a web crawler to examine links.

Related Work (Cont.)

Conclusion
Organizations are authorities.
 Micropublishers and organizations are hubs.
 Organizations are intermediaries and switchboards.
 Organizations are most prestigious.
Methodology

Use network theory to study three main ties:

News Organization to News Organization

Journalists to the “Community”

Commenters to News Organizations
News Organization Ties

Compiled a list of 143 different web sites we
feel encompasses our news ecosystem.
Traditional news outlets web sites
 Blogs
 Other non-traditional (e.g. news aggregators)


Use a web crawler to crawl the seed sites and
record all external links to a database.

Each site will be its own network at first.
News Organization Ties

Duplicate links won’t be stored separately; instead,
the number of references will be recorded.

News Organization graphs will be generated
from the database.
Nodes are websites.
 Edges are directional hyperlink references.
 Edge weights are number of times linked.
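
The graph described above can be sketched as a mapping from (source site, target site) pairs to link counts. This is only an illustration under assumptions: the crawl records are taken to be (page URL, link URL) pairs, a "site" is taken to be the URL host with any leading "www." removed, and the names `site_of` and `build_site_graph` and the example URLs are hypothetical.

```python
from collections import Counter
from urllib.parse import urlsplit


def site_of(url):
    """Return the lowercase host of a URL, without a leading 'www.'."""
    host = (urlsplit(url).hostname or "").lower()
    return host[4:] if host.startswith("www.") else host


def build_site_graph(link_records):
    """Collapse (page_url, link_url) records into weighted directed edges."""
    weights = Counter()
    for page_url, link_url in link_records:
        src, dst = site_of(page_url), site_of(link_url)
        if src and dst and src != dst:  # keep external links only
            weights[(src, dst)] += 1    # repeated links raise the edge weight
    return weights


# Hypothetical crawl records; the third one is an internal link and is skipped.
records = [
    ("http://www.example-news.test/story1", "http://blog.example.test/post"),
    ("http://www.example-news.test/story2", "http://blog.example.test/other"),
    ("http://www.example-news.test/story2", "http://www.example-news.test/about"),
]
graph = build_site_graph(records)
```

Storing one weighted edge per site pair matches the idea of not recording duplicates while still counting references.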

News Organization Ties

Categorizing Links
First pass will try to categorize links as news sites
by matching them against the seed site list.
 Second pass will require manual human coding.

Remove all links deemed not to be news organizations.
Merge all independent networks together.
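
A minimal sketch of the first pass, assuming a link matches when its host equals a seed site or is a subdomain of one. The slides do not specify the matching rule, and the seed list, helper names, and URLs below are hypothetical. Links that do not match are set aside for the manual second pass.

```python
from urllib.parse import urlsplit

# Hypothetical stand-in for the real seed site list.
SEED_SITES = {"example-news.test", "example-blog.test"}


def host_of(url):
    """Return the lowercase host of a URL (empty string if it has none)."""
    return (urlsplit(url).hostname or "").lower()


def first_pass(link_urls, seed_sites):
    """Split links into seed-site matches and links needing manual coding."""
    matched, needs_review = [], []
    for url in link_urls:
        host = host_of(url)
        is_seed = any(host == s or host.endswith("." + s) for s in seed_sites)
        (matched if is_seed else needs_review).append(url)
    return matched, needs_review


links = [
    "http://www.example-news.test/politics/story",
    "http://example-blog.test/2010/post",
    "http://unknown-site.test/page",
]
matched, needs_review = first_pass(links, SEED_SITES)
```
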
News Organization Ties

Metrics

Degree (both in and out)
Determine hubs and authorities.

Betweenness
Determine which sites link otherwise unconnected sites.

Centrality
Determine which sites are important to the network.
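
As an illustration of the degree metric only, the sketch below counts in-links and out-links per site from a weighted edge mapping. It assumes edges are stored as {(source, target): weight}; the function name and the sample sites are hypothetical. High in-degree is used here as a rough signal of an authority and high out-degree as a rough signal of a hub, which is a simplification.

```python
from collections import defaultdict


def degree_table(weighted_edges):
    """Return per-site in/out degree and in/out total edge weight."""
    table = defaultdict(
        lambda: {"in": 0, "out": 0, "in_weight": 0, "out_weight": 0}
    )
    for (src, dst), weight in weighted_edges.items():
        table[src]["out"] += 1            # one more distinct site linked to
        table[src]["out_weight"] += weight
        table[dst]["in"] += 1             # one more distinct site linking in
        table[dst]["in_weight"] += weight
    return dict(table)


# Hypothetical edges: a.test and b.test both link to c.test.
edges = {
    ("a.test", "c.test"): 3,
    ("b.test", "c.test"): 1,
    ("a.test", "d.test"): 2,
}
degrees = degree_table(edges)
most_linked = max(degrees, key=lambda site: degrees[site]["in"])
most_linking = max(degrees, key=lambda site: degrees[site]["out"])
```
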
Journalists to the Community

Determine the linking patterns of reporters who
publish on the seed sites.
Traditional writing versus using the web to its full
potential.

Use a web crawler to crawl the seed sites and
record all external links to a database.
Focus on only a few larger sites.
 No standard for bylines of article authors.
 Requires site-specific crawling rules.

Journalists to the Community

Journalist graph will be generated from the
database.
Forms a bipartite graph.
 Nodes are authors and sites.
 Edges are an author linking a site.
 Some manual human coding required to remove
non-community sites.

Journalists to the Community

Metrics

Degree (both for journalists and sites)

Determine which authors cite more often

Determine which sites are referenced most often.
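
A small sketch of these degree counts on the bipartite graph, assuming the database yields (author, site) pairs, one per link found in an article. The function name and sample data are hypothetical. Repeated pairs are collapsed so that degree counts distinct neighbors.

```python
from collections import Counter


def bipartite_degrees(author_site_links):
    """Count distinct sites per author and distinct authors per site."""
    unique_edges = set(author_site_links)  # one edge per (author, site) pair
    author_degree = Counter(author for author, _ in unique_edges)
    site_degree = Counter(site for _, site in unique_edges)
    return author_degree, site_degree


# Hypothetical records; the repeated pair counts once.
links = [
    ("reporter_a", "city-hall.test"),
    ("reporter_a", "transit.test"),
    ("reporter_a", "city-hall.test"),
    ("reporter_b", "city-hall.test"),
]
author_degree, site_degree = bipartite_degrees(links)
```
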
Commenters to News Organizations

Determine the patterns of users who comment
on stories on the seed sites.
How do they interact with news organizations and
each other?

Use a web crawler to crawl the seed sites and
record all commenters to a database.
Focus on only a few larger sites.
 No standard for user comments and accounts.
 Requires site-specific crawling rules.

Commenters to News Organizations

Commenter graph will be generated from the
database.
Forms a bipartite graph.
 Each site will be its own graph.
 Nodes are commenters and news stories.
 Edges are a user commenting on a story.
 Might require some manual human coding to
remove spam & bots.

Commenters to News Organizations

Metrics

Degree (both for users and stories)
Determine which users comment most.
 Determine which stories garner most attention.


Project to a 1-mode network containing only users.
Edge weights are the number of stories that both users
commented on.
 Do users form clusters and communities?
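
The 1-mode network of users can be sketched as follows, assuming the crawl yields (user, story) pairs. The function name and sample data are hypothetical. Each pair of users gets an undirected edge whose weight is the number of distinct stories both commented on.

```python
from collections import defaultdict
from itertools import combinations


def project_to_users(comments):
    """Build user-user edge weights from (user, story) comment records."""
    users_by_story = defaultdict(set)
    for user, story in comments:
        users_by_story[story].add(user)  # repeat comments count once
    weights = defaultdict(int)
    for users in users_by_story.values():
        # Sorting gives each undirected pair a single canonical key.
        for first, second in combinations(sorted(users), 2):
            weights[(first, second)] += 1
    return dict(weights)


# Hypothetical comment records on two stories.
comments = [
    ("ann", "s1"), ("bob", "s1"),
    ("ann", "s2"), ("bob", "s2"), ("cat", "s2"), ("ann", "s2"),
]
user_edges = project_to_users(comments)
```

Clusters or communities could then be sought among the heavier edges of this user network; the slides do not name a specific method for that step.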

Tools

WebSPHINX

Pajek

GUESS or Gephi
Conclusion

Is news media transitioning because of new
technologies, as it has in the past?

How is the Internet affecting news outlets,
journalists, and readers?

Hopefully, network theory and analysis can help
find these answers.
References

[1] I. Himelboim, "The International Network Structure of
News Media: An Analysis of Hyperlinks Usage in News Web
sites," Journal of Broadcasting & Electronic Media, Volume
54, Issue 3, pp. 373-390, July 2010.

[2] R. Gordon, N. Contractor, and Z. P. Johnson, "Linking
Audiences to News and Information: A Network Analysis of
Chicago Websites," unpublished,
http://www.cct.org/sites/cct.org/files/CNM_LinkingAudiences1.pdf