Transcript Lecture 15

CS503: Fifteenth Lecture, Fall 2008
Graphs
Michael Barnathan
Here’s what we’ll be learning:
• Data Structures:
– Graphs.
• Theory:
– Graph nomenclature (there is a lot of it).
– Depth-first search.
– Breadth-first search.
– Best-first search.
Review: Trees
• A tree is a data structure in which every node
points to a set of “children”.
• A binary tree is a special case in which a node
may contain up to 2 children.
• Each node has exactly one parent, except the
root, which has no parent.
• There is thus only one unique path to every node.
– This is nice; it simplifies many of the algorithms.
– You very seldom need to backtrack.
Unique Paths
[Figure: two diagrams on vertices 1 to 5. One is a tree; the other is not a tree, because 4 has two parents and there are two ways to access it.]
There goes another assumption!
• What if we get rid of the assumption that each
node has one parent and one path?
• We’re not assuming much anymore… now
we’re just looking at connected nodes.
[Figure: vertices 1 to 5 connected without the one-parent restriction.]
Weird.
Graphs
• This data structure is called a graph.
• It is one of the most general data structures.
– Trees are special cases of graphs.
– Linked lists are special cases of graphs.
• Formally, a graph is simply a set of nodes V
connected by a set of lines E: G = <V,E>.
– The nodes are called vertices.
– The lines connecting them are edges.
– The number of edges incident to (touching) a vertex is called the
degree of that vertex.
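As a minimal sketch of the definition above, a graph can be stored literally as a vertex set V and an edge list E. The class name SimpleGraph, its fields, and the sample edges are assumptions made for illustration; they are not from the lecture.

```java
import java.util.List;
import java.util.Set;

// Hypothetical sketch: an undirected graph stored as G = <V,E>.
class SimpleGraph {
    // One undirected edge joining vertices u and v.
    static final class Edge {
        final int u, v;
        Edge(int u, int v) { this.u = u; this.v = v; }
    }

    final Set<Integer> vertices;   // V
    final List<Edge> edges;        // E

    SimpleGraph(Set<Integer> vertices, List<Edge> edges) {
        this.vertices = vertices;
        this.edges = edges;
    }

    // Degree = number of edge ends touching the vertex.
    // A self-loop touches its vertex at both ends, so it counts twice.
    int degree(int vertex) {
        int count = 0;
        for (Edge e : edges) {
            if (e.u == vertex) count++;
            if (e.v == vertex) count++;
        }
        return count;
    }

    public static void main(String[] args) {
        // Sample edges are made up for the demonstration.
        SimpleGraph g = new SimpleGraph(
            Set.of(1, 2, 3, 4, 5),
            List.of(new Edge(1, 2), new Edge(2, 3), new Edge(2, 5), new Edge(3, 4)));
        System.out.println(g.degree(2)); // prints 3
    }
}
```

Storing E as a flat list keeps the sketch close to the formal definition; looking up a degree this way scans every edge, which is fine for small examples.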
Example
[Figure: an example graph G = <V,E> on the vertices V = {1, 2, 3, 4, 5}, with the vertices and edges labeled.]
Why are they useful?
• Networks:
– Computer networks (routers!)
– Social networks.
– Spread of disease.
[Figure: a social network linking You, Alice, Bob, Mallory, and Trudy.]
• Roads, paths, travel:
[Figure: a road map connecting Larchwood, 71, Woodland, Jonathon, and Palmer.]
Undirected Graphs
[Figure: the road map connecting Larchwood, 71, Woodland, Jonathon, and Palmer, drawn with plain edges.]
• These are all two-way streets. Traffic can flow
both ways. We can turn from 71 onto
Larchwood, or Larchwood onto 71.
• The graph is therefore called undirected. The
edges can be traversed in either direction.
Directed Graphs
[Figure: the road map connecting Larchwood, 71, Woodland, Jonathon, and Palmer, with an arrow marking Larchwood as one-way.]
• What if Larchwood were one way only?
• You could not turn onto 71 from Larchwood, but could turn onto
Larchwood from 71.
• This is represented by adding arrows to edges to signify that the
edge only flows one way. Edges cannot be traversed against the
direction of the arrow.
• These are called directed edges and a graph containing at least one
of them is called a directed graph or digraph.
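The difference between a two-way and a one-way street can be sketched by storing, for each street, the set of streets you may turn onto. The class RoadMap and its method names are hypothetical; only the 71/Larchwood relationship comes from the slides.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: directed edges as "from -> set of allowed targets".
class RoadMap {
    private final Map<String, Set<String>> turns = new HashMap<>();

    // Directed edge: traffic may flow from -> to only.
    void addOneWay(String from, String to) {
        turns.computeIfAbsent(from, k -> new HashSet<>()).add(to);
    }

    // Undirected edge: behaves like a pair of opposite directed edges.
    void addTwoWay(String a, String b) {
        addOneWay(a, b);
        addOneWay(b, a);
    }

    // True if an edge may be traversed from `from` to `to`.
    boolean canTurn(String from, String to) {
        Set<String> targets = turns.get(from);
        return targets != null && targets.contains(to);
    }

    public static void main(String[] args) {
        RoadMap oneWay = new RoadMap();
        oneWay.addOneWay("71", "Larchwood");
        System.out.println(oneWay.canTurn("71", "Larchwood")); // prints true
        System.out.println(oneWay.canTurn("Larchwood", "71")); // prints false
    }
}
```

Treating an undirected edge as two directed edges is one common design choice; it lets the same traversal code handle both kinds of graph.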
Cycles
• It is possible for a graph to loop back on itself,
directly or indirectly.
• The loop is called a cycle or closed walk.
• The number of vertices in the loop is called the
length of the cycle.
• A graph with cycles is known as a cyclic graph,
while one that contains none is called acyclic.
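[Figure: two cycles. A vertex with an edge back to itself forms a cycle of length 1; a loop through vertices 1, 2, and 3 forms a cycle of length 3.]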
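To make the definition of cycle length concrete, here is a small sketch that checks whether a given sequence of vertices loops back on itself in a directed graph and, if so, reports its length. The class CycleCheck and the adjacency-map representation are assumptions for illustration. The check accepts any closed walk and does not require the vertices to be distinct.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: verify a closed walk and report its length.
class CycleCheck {
    // adj maps each vertex to the vertices its outgoing edges lead to.
    // Returns the number of vertices in the loop, or -1 if some step
    // (including the step from the last vertex back to the first) is not an edge.
    static int cycleLength(Map<Integer, Set<Integer>> adj, List<Integer> loop) {
        if (loop.isEmpty()) return -1;
        for (int i = 0; i < loop.size(); i++) {
            int from = loop.get(i);
            int to = loop.get((i + 1) % loop.size()); // wrap around to close the loop
            Set<Integer> targets = adj.get(from);
            if (targets == null || !targets.contains(to)) return -1;
        }
        return loop.size();
    }

    public static void main(String[] args) {
        // A vertex pointing at itself: cycle of length 1.
        System.out.println(cycleLength(Map.of(1, Set.of(1)), List.of(1))); // prints 1
        // 1 -> 2 -> 3 -> 1: cycle of length 3.
        Map<Integer, Set<Integer>> triangle = Map.of(1, Set.of(2), 2, Set.of(3), 3, Set.of(1));
        System.out.println(cycleLength(triangle, List.of(1, 2, 3))); // prints 3
    }
}
```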
Trees
• Since you don’t have a pointer back to the
parent, trees are directed acyclic graphs.
[Figure: a tree on vertices 1 to 5.]
Connected Components
• It is possible for some vertices to be isolated from others within the
same graph:
[Figure: vertices 1 to 5 drawn in separate groups with no edges between the groups. This is one graph.]
• Each group is called a connected component. Formally, two vertices
are in the same connected component if one may be reached from
the other. A connected graph has only one connected component.
• A strongly connected component is a group in which every vertex
in the group can be reached from every other vertex in the group.
• Question: are the connected components of the graph shown
above strongly connected? Why or why not?
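Counting connected components in an undirected graph can be sketched as follows: start a scan from every vertex that has not been reached yet, and count how many scans are needed. The class Components, the adjacency-list input, and the sample grouping of vertices are assumptions for illustration. The sketch handles ordinary (not strong) connectivity only.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: count the connected components of an undirected graph.
class Components {
    // adj lists each vertex's neighbors. Every vertex must appear as a key,
    // even if it has no neighbors, and each edge must be listed in both directions.
    static int count(Map<Integer, List<Integer>> adj) {
        Set<Integer> seen = new HashSet<>();
        int components = 0;
        for (Integer start : adj.keySet()) {
            if (!seen.add(start)) continue;   // already reached from an earlier start
            components++;                      // a new group begins here
            Deque<Integer> pending = new ArrayDeque<>();
            pending.push(start);
            while (!pending.isEmpty()) {
                Integer v = pending.pop();
                for (Integer next : adj.getOrDefault(v, List.of())) {
                    if (seen.add(next)) pending.push(next);
                }
            }
        }
        return components;
    }

    public static void main(String[] args) {
        // Assumed grouping: {1, 2} and {3, 4, 5}.
        Map<Integer, List<Integer>> adj = Map.of(
            1, List.of(2), 2, List.of(1),
            3, List.of(4, 5), 4, List.of(3), 5, List.of(3));
        System.out.println(count(adj)); // prints 2
    }
}
```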
Path Length
• A traversal starting at one vertex and ending
at another is called a path.
• The number of edges traversed to get from
the start to the end vertex is the path length.
• The minimal path length between two
vertices is the length of the shortest path that
connects them.
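One way to compute the minimal path length when every edge counts as 1 is to scan outward from the start vertex level by level, recording the distance at which each vertex is first reached. This is a sketch under assumptions: the class PathLength, the adjacency-list input, and the sample graph are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Queue;

// Hypothetical sketch: minimal path length (in edges) between two vertices.
class PathLength {
    // Returns the fewest edges needed to get from start to goal, or -1 if unreachable.
    static int shortest(Map<String, List<String>> adj, String start, String goal) {
        Map<String, Integer> dist = new HashMap<>();  // edges from start to each reached vertex
        Queue<String> queue = new ArrayDeque<>();
        dist.put(start, 0);
        queue.add(start);
        while (!queue.isEmpty()) {
            String v = queue.remove();
            if (v.equals(goal)) return dist.get(v);
            for (String next : adj.getOrDefault(v, List.of())) {
                if (!dist.containsKey(next)) {        // first time reached = shortest
                    dist.put(next, dist.get(v) + 1);
                    queue.add(next);
                }
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // Made-up graph: A-B, A-C, B-D, C-D, D-E.
        Map<String, List<String>> adj = Map.of(
            "A", List.of("B", "C"), "B", List.of("A", "D"),
            "C", List.of("A", "D"), "D", List.of("B", "C", "E"),
            "E", List.of("D"));
        System.out.println(shortest(adj, "A", "E")); // prints 3
    }
}
```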
Path Length Example
• What is the shortest path from 71 to Palmer?
[Figure: the road map connecting Larchwood, 71, Woodland, Jonathon, and Palmer, annotated with path lengths.]
The Problem With Path Length
• Of course, not all roads are created equal.
• Which is closer, Colorado or West Virginia?
Path Length = 27.
Path Length = 30.
Colorado, here we come!
Weighted Path Length
• In order to represent things like distance (I-95 !=
Route 36) or “cost” of walking down a certain
path, we can assign weights to edges.
• Instead of counting each edge as “1”, we count it
by its weight:
[Figure: the road map connecting Larchwood, 71, Woodland, Jonathon, and Palmer, with each edge labeled by its length in miles (weights between 0.2 and 0.4).]
Shortest path length: 0.4 + 0.3 = 0.7 mi
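The weighted length of a path is just the sum of the weights of the edges it uses. The sketch below adds them up for a given path. The class WeightedPath and the nested-map representation are hypothetical, and the assignment of the 0.4 and 0.3 weights to particular streets is an assumption made only so the example runs.

```java
import java.util.List;
import java.util.Map;

// Hypothetical sketch: sum the edge weights along a path.
class WeightedPath {
    // weights.get(u).get(v) is the weight of the edge from u to v.
    // Throws if two consecutive vertices of the path are not joined by an edge.
    static double length(Map<String, Map<String, Double>> weights, List<String> path) {
        double total = 0.0;
        for (int i = 0; i + 1 < path.size(); i++) {
            String from = path.get(i);
            String to = path.get(i + 1);
            Map<String, Double> out = weights.get(from);
            Double w = (out == null) ? null : out.get(to);
            if (w == null)
                throw new IllegalArgumentException("no edge " + from + " -> " + to);
            total += w;   // count the edge by its weight instead of as 1
        }
        return total;
    }

    public static void main(String[] args) {
        // Assumed weights (in miles); which streets they join is illustrative only.
        Map<String, Map<String, Double>> weights = Map.of(
            "71", Map.of("Woodland", 0.4),
            "Woodland", Map.of("Palmer", 0.3));
        double miles = length(weights, List.of("71", "Woodland", "Palmer"));
        System.out.printf("%.1f mi%n", miles);
    }
}
```

Because the weights are floating-point values, compare results with a small tolerance rather than with ==.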
Weighted Path Length
• Edge weights can also be negative in some cases
(maybe a certain road bypasses traffic and saves you
driving time?)
• Finding the shortest path length is obviously an
important problem.
– If you’re UPS, you want your truck drivers to deliver
packages on time in as short a distance as possible (to
conserve fuel).
– If you are routing a packet, you want to select the fastest
route that can get it to its destination.
• Intuitively, how would you find the shortest weighted
path length between two vertices?
• We’ll give some formal strategies for this next time.
Traversing a Graph
• Very often, we will want to scan the vertices of a graph
(for example, to find the path length).
• There are three common ways of traversing a graph:
– Depth-first.
– Breadth-first.
– Best-first.
• There are also popular variations on best-first search,
such as A* search, which are used frequently in AI.
• A “root” (vertex to start at) must be selected in order
to give the traversal a place to begin.
Depth-First Search
• DFS generalizes the preorder traversal of a
tree. Because graphs may be cyclic, it requires
keeping track of which vertices were visited.
• The idea: when encountering an unvisited
vertex, traverse down it immediately.
• Only once that traversal finishes do you
traverse down the remaining edges of the
current vertex.
• This is usually done recursively.
DFS Example
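[Figure: a graph whose vertices are numbered 1 to 5 in the order DFS visits them, beginning at the start vertex 1.]
When we traverse 3, 3 becomes the
new current vertex. We then traverse its
edges (to 4) before returning and
finishing up with 2’s other edge (to 5).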
DFS Algorithm
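void dfs(Vertex v) {
    if (v == null)
        return;
    v.visited = true;    //Mark before recursing so a cycle cannot bring us back to v.
    visit(v);            //We can do anything with v here.
    for (Edge e : v.edges()) {
        Vertex next = e.getOtherVertex(v);
        if (!next.visited)
            dfs(next);   //Finish this whole branch before trying the next edge.
    }
}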
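The routine above relies on Vertex and Edge classes that the slides do not show. Below is a minimal runnable sketch with hypothetical versions of both, plus a helper that records the visit order. The graph follows the example described earlier (edges 1-2, 2-3, 3-4, 2-5), with one extra edge (4-2) added as an assumption to show that the visited flag handles a cycle.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: recursive depth-first search on a small graph.
class DfsDemo {
    static class Vertex {
        final int label;
        boolean visited = false;
        private final List<Edge> edges = new ArrayList<>();
        Vertex(int label) { this.label = label; }
        List<Edge> edges() { return edges; }
    }

    static class Edge {
        final Vertex a, b;
        Edge(Vertex a, Vertex b) { this.a = a; this.b = b; }
        // Returns the endpoint of this edge that is not v.
        Vertex getOtherVertex(Vertex v) { return v == a ? b : a; }
    }

    // Adds one undirected edge, registered at both endpoints.
    static void connect(Vertex a, Vertex b) {
        Edge e = new Edge(a, b);
        a.edges().add(e);
        b.edges().add(e);
    }

    static void dfs(Vertex v, List<Integer> order) {
        if (v == null) return;
        v.visited = true;
        order.add(v.label);                 // "visit" the vertex
        for (Edge e : v.edges()) {
            Vertex next = e.getOtherVertex(v);
            if (!next.visited) dfs(next, order);
        }
    }

    // Builds the example graph and returns the DFS visit order from vertex 1.
    static List<Integer> example() {
        Vertex[] v = new Vertex[6];
        for (int i = 1; i <= 5; i++) v[i] = new Vertex(i);
        connect(v[1], v[2]);
        connect(v[2], v[3]);
        connect(v[3], v[4]);
        connect(v[2], v[5]);
        connect(v[4], v[2]);                // assumed extra edge: creates the cycle 2-3-4-2
        List<Integer> order = new ArrayList<>();
        dfs(v[1], order);
        return order;
    }

    public static void main(String[] args) {
        System.out.println(example()); // prints [1, 2, 3, 4, 5]
    }
}
```

The visit order depends on the order in which edges were added to each vertex; a different insertion order gives a different but equally valid depth-first order.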
Breadth First Search
• Where depth-first search scanned down the
entire path before checking additional edges,
breadth-first search does the opposite.
• Idea: visit every vertex adjacent to the current one
before traversing into any of them.
• Whereas DFS used a stack to traverse (you did
realize it was using the system stack to keep
track of the history, right?), BFS uses a queue.
• Also, while DFS is usually written recursively, BFS is usually written iteratively.
BFS Example
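[Figure: a graph whose vertices are numbered 1 to 5 in the order BFS visits them, beginning at the start vertex 1.]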
All of 2’s adjacent vertices (3 and 4) are
labeled before we traverse into 3 and
check its adjacent vertices (5).
BFS Algorithm
import java.util.ArrayDeque;
import java.util.Queue;

void bfs(Vertex v) {
    if (v == null)
        return;
    //Queue is an interface in Java, so we instantiate an ArrayDeque.
    Queue<Vertex> vqueue = new ArrayDeque<>();
    vqueue.add(v);               //Start with the start vertex.
    v.visited = true;
    while (!vqueue.isEmpty()) {
        v = vqueue.remove();     //Dequeue the next element and store it in v.
        visit(v);                //We can do anything with v here.
        for (Edge e : v.edges()) {
            Vertex next = e.getOtherVertex(v);
            if (!next.visited) {
                vqueue.add(next);
                next.visited = true;   //Mark on enqueue so no vertex is queued twice.
            }
        }
    }
}
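As with DFS, the routine above depends on Vertex and Edge classes that the slides do not show. The sketch below supplies hypothetical versions and records the visit order. The graph follows the BFS example described earlier (edges 1-2, 2-3, 2-4, 3-5).

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

// Hypothetical sketch: iterative breadth-first search on a small graph.
class BfsDemo {
    static class Vertex {
        final int label;
        boolean visited = false;
        private final List<Edge> edges = new ArrayList<>();
        Vertex(int label) { this.label = label; }
        List<Edge> edges() { return edges; }
    }

    static class Edge {
        final Vertex a, b;
        Edge(Vertex a, Vertex b) { this.a = a; this.b = b; }
        // Returns the endpoint of this edge that is not v.
        Vertex getOtherVertex(Vertex v) { return v == a ? b : a; }
    }

    // Adds one undirected edge, registered at both endpoints.
    static void connect(Vertex a, Vertex b) {
        Edge e = new Edge(a, b);
        a.edges().add(e);
        b.edges().add(e);
    }

    static List<Integer> bfs(Vertex start) {
        List<Integer> order = new ArrayList<>();
        if (start == null) return order;
        Queue<Vertex> queue = new ArrayDeque<>();
        queue.add(start);
        start.visited = true;
        while (!queue.isEmpty()) {
            Vertex v = queue.remove();
            order.add(v.label);                // "visit" the vertex
            for (Edge e : v.edges()) {
                Vertex next = e.getOtherVertex(v);
                if (!next.visited) {
                    next.visited = true;       // mark on enqueue, not on dequeue
                    queue.add(next);
                }
            }
        }
        return order;
    }

    // Builds the example graph and returns the BFS visit order from vertex 1.
    static List<Integer> example() {
        Vertex[] v = new Vertex[6];
        for (int i = 1; i <= 5; i++) v[i] = new Vertex(i);
        connect(v[1], v[2]);
        connect(v[2], v[3]);
        connect(v[2], v[4]);
        connect(v[3], v[5]);
        return bfs(v[1]);
    }

    public static void main(String[] args) {
        System.out.println(example()); // prints [1, 2, 3, 4, 5]
    }
}
```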
Best First Search
• Best-first search uses a user-chosen heuristic function which ranks
nodes based on how “promising” they are in achieving a goal.
• The heuristic function may be based on the value or position of the
vertex or weight of the edges.
• For example, in a game of checkers, a move that results in jumping
an opponent’s piece may be ranked highly by the heuristic function,
since it makes progress towards attaining a goal (winning the
game).
• Best-first search always chooses the “best” next move at each step.
– What do we call those sorts of algorithms again?
• Whereas a stack is used in depth-first search and a queue is used in
breadth-first search, a priority queue can be used in best-first
search.
• The priority of a vertex is how “good” the heuristic ranks it.
• Other than that change, the algorithm is the same as breadth-first
search.
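Swapping the queue for a priority queue can be sketched as follows. This is one possible reading of the idea, not the lecture's own code: the class BestFirstDemo, the adjacency-list input, the convention that a lower heuristic score means "more promising", and the sample scores are all assumptions.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.PriorityQueue;
import java.util.Set;
import java.util.function.ToIntFunction;

// Hypothetical sketch: best-first search driven by a user-chosen heuristic.
class BestFirstDemo {
    // Visits vertices in order of heuristic score (lower = more promising)
    // and stops once the goal is visited. Returns the visit order.
    static List<String> bestFirst(Map<String, List<String>> adj, String start,
                                  String goal, ToIntFunction<String> heuristic) {
        PriorityQueue<String> frontier =
            new PriorityQueue<>(Comparator.comparingInt(heuristic));
        Set<String> seen = new HashSet<>();
        List<String> order = new ArrayList<>();
        frontier.add(start);
        seen.add(start);
        while (!frontier.isEmpty()) {
            String v = frontier.poll();          // always take the best-ranked vertex
            order.add(v);
            if (v.equals(goal)) break;
            for (String next : adj.getOrDefault(v, List.of())) {
                if (seen.add(next)) frontier.add(next);
            }
        }
        return order;
    }

    public static void main(String[] args) {
        // Made-up graph and scores: S leads to B and A; A leads straight to the goal G.
        Map<String, List<String>> adj = Map.of(
            "S", List.of("B", "A"), "A", List.of("G"),
            "B", List.of("C"), "C", List.of("G"));
        Map<String, Integer> score = Map.of("S", 3, "A", 1, "B", 2, "C", 1, "G", 0);
        System.out.println(bestFirst(adj, "S", "G", score::get)); // prints [S, A, G]
    }
}
```

A breadth-first scan of the same graph would visit B before A because B is listed first; the priority queue lets the heuristic override that order.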
A Classical Problem
• This is called the “7 Bridges of Königsberg”. You may have
seen it on IQ tests.
• Euler first solved it in 1736, which hopefully just means no
one thought it was important enough to look at earlier.
• The problem: find a route that allows you to cross each of
the 7 bridges exactly once, or demonstrate that none exists.
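Once the map is reduced to a graph, the problem depends only on vertex degrees: a connected graph has a route that crosses every edge exactly once only if zero or two of its vertices have odd degree. The sketch below counts the odd-degree vertices for the historical layout (four land masses, seven bridges). The class name and the land-mass labels A to D are assumptions for illustration, with A standing for the central island.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: degree check for the bridges graph.
class KonigsbergBridges {
    // Each row is one bridge joining two land masses.
    static final String[][] BRIDGES = {
        {"A", "B"}, {"A", "B"},   // two bridges from the island to one bank
        {"A", "C"}, {"A", "C"},   // two bridges from the island to the other bank
        {"A", "D"},               // island to the eastern land mass
        {"B", "D"}, {"C", "D"}    // each bank to the eastern land mass
    };

    // Counts the vertices that an odd number of edges touch.
    static int oddDegreeCount(String[][] edges) {
        Map<String, Integer> degree = new HashMap<>();
        for (String[] e : edges) {
            degree.merge(e[0], 1, Integer::sum);
            degree.merge(e[1], 1, Integer::sum);
        }
        int odd = 0;
        for (int d : degree.values()) {
            if (d % 2 != 0) odd++;
        }
        return odd;
    }

    public static void main(String[] args) {
        // All four land masses have odd degree (5, 3, 3, 3), so no such route exists.
        System.out.println(oddDegreeCount(BRIDGES)); // prints 4
    }
}
```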
A Bridge Too Far
• We discussed some basic graph theory today.
• Next time, we’ll cover algorithms for finding
the shortest path between two vertices and
an alternate representation of a graph.
• The lesson:
– Particularly in mathematics, it is possible to
simplify a problem by removing irrelevant
information. The clutter may make a problem seem
more difficult than it really is.