lect04 - Duke Computer Science
Download
Report
Transcript lect04 - Duke Computer Science
The Internet
How valuable is a network?
Metcalfe’s Law
Domain Name System: translates betweens names and IP
addresses
Properties of the Internet
Heterogeneity
Redundancy
Packet-switched
1.08 billion online (Computer Industry Almanac 2005)
Who has access?
How important is access?
CompSci 001
4.1
Tim Berners-Lee
I want you to realize that, if you
can imagine a computer doing
something, you can program a
computer to do that.
Unbounded opportunity...
limited only by your
imagination. And a couple of
laws of physics.
TCP/IP, HTTP
How, Why, What, When?
CompSci 001
4.2
Graphs: Structures and Algorithms
How do packets of bits/information get routed on the internet
Message divided into packets on client (your) machine
Packets sent out using routing tables toward destination
• Packets may take different routes to destination
• What happens if packets lost or arrive out-of-order?
Routing tables store local information, not global (why?)
What about The Oracle of Bacon, Erdos Numbers, and Word
Ladders?
All can be modeled using graphs
What kind of connectivity does each concept model?
Graphs are everywhere in the world of algorithms (world?)
CompSci 001
4.3
Vocabulary
Graphs are collections of vertices
and edges (vertex also called
node)
Edge connects two vertices
• Direction can be important,
directed edge, directed graph
• Edge may have associated
weight/cost
A vertex sequence v0, v1, …, vn-1 is
a path where vk and vk+1 are
connected by an edge.
If some vertex is repeated, the
path is a cycle
A graph is connected if there is
a path between any pair of
vertices
CompSci 001
78
NYC
Phil
268
204
190
Wash DC
LGA
$412
Boston
394
$441
$186
LAX
$1701
DCA
$186
ORD
4.4
Network/Graph questions/algorithms
What vertices are reachable from a given vertex?
Two standard traversals: depth-first, breadth-first
Find connected components, groups of connected vertices
Shortest path between any two vertices (weighted graphs?)!
Longest path in a graph
No known efficient algorithm
Longest shortest path: Diameter of graph
Visit all vertices without repeating? Visit all edges?
With minimal cost? Hard!
What are the properties of the network?
Structural: Is it connected?
Statistical: What is the average number of neighbors?
CompSci 001
4.5
Network Nature of Society
Slides from Michael Kearns - Univ. of Pennsylvania
CompSci 001
4.6
Emerging science of networks
Examining apparent similarities between many human and technological
systems & organizations
Importance of network effects in such systems
How things are connected matters greatly
Structure, asymmetry and heterogeneity
Details of interaction matter greatly
The metaphor of viral spread
Dynamics of economic and strategic interaction
Qualitative and quantitative; can be very subtle
A revolution of
measurement
theory
breadth of vision
CompSci 001
(M. Kearns)
4.7
“Real World” Social Networks
Example: Acquaintanceship networks
vertices: people in the world
links: have met in person and know last names
hard to measure
Example: scientific collaboration
vertices: math and computer science researchers
links: between coauthors on a published paper
Erdos numbers : distance to Paul Erdos
Erdos was definitely a hub or connector; had 507 coauthors
how do we navigate in such networks?
CompSci 001
(M. Kearns)
4.8
Online Social Networks
A somewhat recent example: Friendster
vertices: subscribers to www.friendster.com
links: created via deliberate invitation
More recent and interesting: thefacebook
Join the Computer Science 1 group!
Older example: social interaction in LambdaMOO
LambdaMOO: chat environment with “emotes” or verbs
vertices: LambdaMOO users
links: defined by chat and verb exchange
could also examine “friend” and “foe” sub-networks
CompSci 001
(M. Kearns)
4.9
Content Networks
Example: document similarity
vertices: documents on the web
links: defined by document similarity (e.g. Google)
here’s a very nice visualization
not the web graph, but an overlay content network
Of course, every good scandal needs a network
vertices: CEOs, spies, stock brokers, other shifty characters
links: co-occurrence in the same article
Then there are conceptual networks
a thesaurus defines a network
so do the interactions in a mailing list
CompSci 001
(M. Kearns)
4.10
Business and Economic Networks
Example: eBay bidding
vertices: eBay users
links: represent bidder-seller or buyer-seller
fraud detection: bidding rings
Example: corporate boards
vertices: corporations
links: between companies that share a board member
Example: corporate partnerships
vertices: corporations
links: represent formal joint ventures
Example: goods exchange networks
vertices: buyers and sellers of commodities
links: represent “permissible” transactions
CompSci 001
(M. Kearns)
4.11
Physical Networks
Example: the Internet
vertices: Internet routers
links: physical connections
vertices: Autonomous Systems (e.g. ISPs)
links: represent peering agreements
latter example is both physical and business network
Compare to more traditional data networks
Example: the U.S. power grid
vertices: control stations on the power grid
links: high-voltage transmission lines
August 2003 blackout: classic example of interdependence
CompSci 001
(M. Kearns)
4.12
US Power Grid
CompSci 001
4.13
Business & Economic Networks
Example: eBay bidding
vertices: eBay users
links: represent bidder-seller or buyer-seller
fraud detection: bidding rings
Example: corporate boards
vertices: corporations
links: between companies that share a board member
Example: corporate partnerships
vertices: corporations
links: represent formal joint ventures
Example: goods exchange networks
vertices: buyers and sellers of commodities
links: represent “permissible” transactions
CompSci 001
4.14
Content Networks
Example: Document similarity
Vertices: documents on web
Edges: Weights defined by similarity
See TouchGraph GoogleBrowser
Conceptual network: thesaurus
Vertices: words
Edges: synonym relationships
CompSci 001
4.15
Enron
CompSci 001
4.16
Social networks
Example: Acquaintanceship networks
vertices: people in the world
links: have met in person and know last names
hard to measure
Example: scientific collaboration
vertices: math and computer science researchers
links: between coauthors on a published paper
Erdos numbers : distance to Paul Erdos
Erdos was definitely a hub or connector; had 507 coauthors
How do we navigate in such networks?
CompSci 001
4.17
CompSci 001
4.18
Acquaintanceship & more
CompSci 001
4.19
Network Models (Barabasi)
Differences between Internet, Kazaa, Chord
Building, modeling, predicting
Static networks, Dynamic networks
Modeling and simulation
Random and Scale-free
Implications?
Structure and Evolution
Modeling via Touchgraph
CompSci 001
4.20
Web-based social networks
http://trust.mindswap.org
Myspace
73,000,000
Passion.com 23,000,000
Friendster
21,000,000
Black Planet 17,000,000
Facebook
8,000,000
Who’s using these, what are they doing, how often are they
doing it, why are they doing it?
CompSci 001
4.21
Golbeck’s Criteria
Accessible over the web via a browser
Users explicitly state relationships
Not mined or inferred
Relationships visible and browsable by others
Reasons?
Support for users to make connections
Simple HTML pages don’t suffice
CompSci 001
4.22
CSE 112, Networked Life (UPenn)
Find the person in Facebook with the most friends
Document your process
Find the person with the fewest friends
What does this mean?
Search for profiles with some phrase that yields 30-100
matches
Graph degrees/friends, what is distribution?
CompSci 001
4.23
CompSci 1: Overview CS0
Audioscrobbler and last.fm
Collaborative filtering
What is a neighbor?
What is the network?
CompSci 001
4.24
What can we do with real data?
How do we find a graph’s diameter?
This is the maximal shortest path between any pair of
vertices
Can we do this in big graphs?
What is the center of a graph?
From rumor mills to terrorists
How is this related to diameter?
Demo GUESS (as augmented at Duke)
IM data, Audioscrobbler data
CompSci 001
4.25
My recommendations at Amazon
CompSci 001
4.26
And again…
CompSci 001
4.27
Collaborative Filtering
Goal: predict the utility of an item to a particular user based on
a database of user profiles
User profiles contain user preference information
Preference may be explicit or implicit
• Explicit means that a user votes explicitly on some scale
• Implicit means that the system interprets user behavior or
selections to impute a vote
Problems
Missing data: voting is neither complete nor uniform
Preferences may change over time
Interface issues
CompSci 001
4.28