Distributed Monitoring and Management

What about the Network?
CS 525 Spring 2009
Advanced Distributed Systems
End-to-End Arguments in System Design
J.H. Saltzer, D.P. Reed and D.D. Clark
M.I.T. Laboratory for Computer Science
Presented by: Abdullah Al-Nayeem
Where to Place Functionalities?
• Example: Reliable file transfer
• Should reliability be implemented per-hop by the
communication subsystem?
• Or, end-to-end by host applications?
4/14/2009
Department of Computer Science, UIUC
Where to Place Functionalities?
• Possible failures in file transfer:
– Disk access failure (hardware)
– Packet drop or duplicated packet (communication)
– File system error (software)
• Communication subsystem cannot itself guarantee
reliability.
– Also increases network complexity
– More overheads for applications that do not require reliability.
• Application layer can provide full reliability, even without
any support from lower layers of the network.
– End-to-end checksum and retry
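A minimal sketch of end-to-end checksum and retry, assuming a hypothetical lossy channel (`unreliable_send`) that sometimes corrupts one byte. The names, the corruption model and the use of SHA-256 are illustrative assumptions, not taken from the paper; the sketch also assumes the checksum itself reaches the receiver intact.

```python
import hashlib
import random

def unreliable_send(data: bytes, corrupt_prob: float, rng: random.Random) -> bytes:
    """Hypothetical channel: with probability corrupt_prob, flips one byte in transit."""
    if data and rng.random() < corrupt_prob:
        i = rng.randrange(len(data))
        return data[:i] + bytes([data[i] ^ 0xFF]) + data[i + 1:]
    return data

def transfer(data: bytes, corrupt_prob: float = 0.3, max_tries: int = 50, seed: int = 0):
    """Resend the whole file until the receiver's checksum matches the sender's."""
    rng = random.Random(seed)
    expected = hashlib.sha256(data).hexdigest()       # computed by the sending application
    for attempt in range(1, max_tries + 1):
        received = unreliable_send(data, corrupt_prob, rng)
        # Checked by the receiving application, independent of the network.
        if hashlib.sha256(received).hexdigest() == expected:
            return received, attempt
    raise RuntimeError("transfer failed after max_tries attempts")
```

The check runs at the end hosts only, so it also covers errors that a per-hop mechanism could not see, such as corruption between the network interface and the application.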
End-to-End Argument (E2EA)
• The lower layers of the network are not the right
place to implement application-specific functions
– Move functions “up and out”
• “The function in question can completely and
correctly be implemented only with the
knowledge and help of the application standing
at the end points of the communication system.
Therefore, providing that questioned function as
a feature of the communication system itself is
not possible.”
Typical Examples
• Bit error recovery
• Security using encryption
• Duplicate message suppression
• Recovery from system crashes
• Delivery acknowledgement
Benefits of E2EA
• Core network can be simpler and faster
• Fewer assumptions required about the network
• More flexibility in developing new network
technologies and applications
– Helped in proliferation of the Internet
• Dumb networks, intelligent hosts
Extension of E2EA
• Lower layers may implement partial application-specific functions,
but only for performance improvements.
– Reducing retries in data transmissions
• Should the level of reliability at the network be
higher than the expected application reliability?
• What are the possible tradeoffs?
– Short-term performance vs. long-term flexibility
– Performance vs. cost
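A rough way to see why partial reliability in lower layers can still pay off: under the simplifying assumption that each hop independently delivers the whole file intact with the same probability, the number of end-to-end attempts follows a geometric distribution. The function name and the independence assumption are illustrative, not from the paper.

```python
def expected_attempts(per_hop_success: float, hops: int) -> float:
    """Expected number of end-to-end tries when each hop independently
    delivers the whole file intact with probability per_hop_success."""
    end_to_end_success = per_hop_success ** hops
    return 1.0 / end_to_end_success

# Compare a few per-hop reliability levels on a 10-hop path.
for q in (0.90, 0.99, 0.999):
    print(f"per-hop success {q}: {expected_attempts(q, hops=10):.2f} expected attempts")
```

Improving per-hop reliability reduces end-to-end retries, but the end-to-end check is still needed for correctness.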
Identifying the Ends
[Figure: two examples, Voice over IP (voice) and File Transfer (files)]
• VoIP: Human user is the end-point
• File Transfer: Application is the end-point
• Only the end-points know how to guarantee the
required reliability
Moving Away from E2EA
• Hosts are not always trustworthy
– Security attacks, e.g. denial of service
• E2EA does not guarantee congestion control
– Unfriendly host
• Communications are not always between two endpoints
– Multicast, broadcast
• How does the network handle these circumstances?
Other Issues
• ISP control, filtering, network monitoring
• Government interventions
• More subtle end points
– Anonymous users using third-party services
– Cloud computing entities (SaaS user, SaaS
provider, Cloud provider)
• Do these factors imply the end of E2EA?
Summary
• End-to-End argument is not an absolute, but a
design tool
• End-to-End argument can help in organizing
“layered” communication systems.
Consensus Routing: The Internet as a Distributed System
John P. John, Ethan Katz-Bassett, Arvind Krishnamurthy, Thomas Anderson (Dept. of Computer Science, Univ. of Washington, Seattle)
Arun Venkataramani (University of Massachusetts Amherst)
5th USENIX Symposium on Networked Systems Design and Implementation (NSDI), 2008
Presented by: Ahmed Khurshid
Motivation
• Internet routing protocols (both intra- and inter-domain)
usually favor responsiveness over consistency
– A new route is incorporated into the forwarding table before
it is propagated to neighbors
• Results in routing loops and blackholes
• Usually there is no extra effort to ensure consensus
– Solutions have been proposed for intra-domain
routing
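The two failure modes named above can be illustrated by following per-router next hops toward a destination. This is a sketch with hypothetical forwarding tables; the router names and entries are made up for illustration and do not reproduce any topology from the paper.

```python
def trace(next_hop: dict, source: str, destination: str) -> str:
    """Follow per-router next hops; report 'delivered', 'loop' or 'blackhole'."""
    visited = set()
    node = source
    while node != destination:
        if node in visited:
            return "loop"            # the packet came back to a router it already passed
        visited.add(node)
        if node not in next_hop:
            return "blackhole"       # no forwarding entry: the packet is dropped
        node = next_hop[node]
    return "delivered"

# Hypothetical next-hop tables toward destination "5".
consistent = {"2": "4", "3": "4", "4": "5"}
inconsistent = {"2": "3", "3": "2"}      # 2 and 3 each point at the other
missing_entry = {"2": "4"}               # 4 has no route installed yet

print(trace(consistent, "2", "5"))
print(trace(inconsistent, "2", "5"))
print(trace(missing_entry, "2", "5"))
```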
Motivation – Routing loop
[Figure: two example topologies with per-AS route labels (5: 1-5, 5: 4-5, 5: 3-4-5, 5: 4-5, 5: 2-4-5)]
– Link failure causing BGP loops at 2 and 3
– Policy change causing BGP loops at 2 and 3 when 4 withdraws a prefix from 2 and 3 but not 6
– Figure annotations: "2 prefers the path through 3"; "2 and 3 each prefer the other over 6"; Minimum Route Advertisement Interval (MRAI) timer
Motivation – Blackhole
[Figure: iBGP link recovery causing blackholes. Annotations: "AP is preferred over CD"; "CD recovered"]
Consensus Routing
• A consistency-first approach that cleanly separates
safety and liveness of routing
– Safety: All the routers use a consistent route towards a
destination (i.e. no loops)
– Liveness: Quick reaction to failures and policy changes
• Uses two simple ideas to ensure both consistent
behavior and quick reaction
1. Runs a distributed coordination algorithm to ensure
globally consistent view of routing state
2. Forwards packets using one of two logically distinct
modes
Stable Mode
• Unlike BGP, consensus routing does not immediately
incorporate a newly learned route into the forwarding table
• Periodically, all routers engage in a distributed coordination
algorithm that determines the most recent set of complete
updates
• The coordination is based on classical distributed snapshot
and consensus algorithms
– Chandy-Lamport snapshot algorithm
– Paxos
• Output of the coordination is used to compute a set of stable
forwarding tables (SFTs) that are guaranteed to be consistent
• SFTs replace traditional FIBs (Forwarding Information Base)
Stable Mode – Update Log
[Figure: AS hierarchy with Tier-1 ASes A, B, C; Tier-2 ASes D, E, F, G; Tier-3 (stub) ASes H, I, J, K serving users. Arrows mark route advertisements/withdrawals.]
Store updates into the update log without modifying the SFT
Stable Mode – Distributed Snapshot
[Figure: AS hierarchy with Tier-1 ASes A, B, C; Tier-2 ASes D, E, F, G; Tier-3 (stub) ASes H, I, J, K serving users. Arrows mark marker messages.]
Updates in the snapshot may be complete or incomplete
Stable Mode – Aggregation
[Figure: AS hierarchy with Tier-1 ASes A, B, C (labeled as consolidators); Tier-2 ASes D, E, F, G; Tier-3 (stub) ASes H, I, J, K serving users. Arrows mark snapshots.]
Tier-1 ASes are good candidates for being consolidators. Why?
• Better reachability
• Longevity
• Full mesh topology among the ASes
Stable Mode – Consensus
[Figure: AS hierarchy with Tier-1 ASes A, B, C; Tier-2 ASes D, E, F, G; Tier-3 (stub) ASes H, I, J, K serving users. Arrows mark Paxos messages.]
Consolidators run Paxos to agree upon a global view by extracting
incomplete updates from the reported snapshots
Stable Mode – Flood
[Figure: AS hierarchy with Tier-1 ASes A, B, C; Tier-2 ASes D, E, F, G; Tier-3 (stub) ASes H, I, J, K serving users. Arrows mark flooding messages.]
Message contains the set of incomplete updates (I) and the set of ASes (S)
that successfully responded to the snapshot
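A sketch of how the contents of the flooded message could be assembled from the snapshot reports. The function name and data layout are hypothetical, and taking I as the union of the reported incomplete sets is an assumption made here for illustration. The Paxos agreement among consolidators is not modeled.

```python
def build_flood_message(reports: dict):
    """reports maps an AS name to its reported set of incomplete triggers,
    or to None if that AS did not respond in this round.
    Returns (I, S): incomplete updates and the ASes that responded."""
    responded = {asn for asn, incomplete in reports.items() if incomplete is not None}
    incomplete_updates = set()
    for asn in responded:
        incomplete_updates |= reports[asn]      # assumed: I is the union of reported sets
    return incomplete_updates, responded

# Hypothetical round: H did not respond; triggers are (AS number, trigger number).
reports = {"D": {(7, 1)}, "E": set(), "F": {(7, 1), (9, 4)}, "H": None}
I, S = build_flood_message(reports)
```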
Stable Mode
• SFT Computation
– SFT is computed using the global set of
incomplete updates (I) and local logs
– Routes involving ASes not present in S are not
placed in the SFT
What happens to those ASes?
How does this strategy achieve consensus in
an asynchronous system?
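The SFT computation rule can be sketched as a filter over the local log. The `Update` record, the "latest complete update wins" rule and the omission of withdrawals and routing policy are simplifying assumptions for illustration, not the paper's exact procedure.

```python
from typing import NamedTuple

class Update(NamedTuple):
    prefix: str
    trigger: tuple      # (AS number, trigger number)
    as_path: tuple      # non-empty tuple of ASes on the route, next hop first

def compute_sft(log, incomplete, responded):
    """Build prefix -> next hop from a chronologically ordered local log."""
    sft = {}
    for update in log:
        if update.trigger in incomplete:
            continue                    # update is in I: not yet complete everywhere
        if not set(update.as_path) <= responded:
            continue                    # route involves an AS that is not in S
        sft[update.prefix] = update.as_path[0]   # later complete updates overwrite earlier ones
    return sft

# Hypothetical log at one router.
log = [
    Update("Y", (7, 1), ("B", "C", "D")),
    Update("Y", (7, 2), ("B", "C", "E")),   # still incomplete
    Update("Z", (9, 1), ("F", "G")),        # G did not respond
]
sft = compute_sft(log, incomplete={(7, 2)}, responded={"B", "C", "D", "E", "F"})
```

In this example only the older, complete route for Y is installed, and Z gets no stable entry.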
Router State
• Routing Information Base (RIB)
– Stores for each prefix the most recent
• Route update received from each neighbor
• Locally selected best route
• Route advertised to each neighbor
• History
– Stores for each prefix a chronological list of received and
selected routes in the RIB
• Stable Forwarding Table (SFT)
– Stores next hop interfaces corresponding to stable routes
Triggers
• Each update carries a trigger
• A trigger is a globally unique identifier for a set of
causally related events propagating through the network
– It is a two-tuple: (AS number, trigger number)
• Triggers ease the tracking of updates and reduce control
overhead in consensus routing
• A router ‘A’ stores all the received triggers in its local
History
• Triggers under processing are temporarily stored in a
local set I_A
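A small sketch of the two-tuple trigger. The class names and the per-AS counter are assumptions; the slide only states that a trigger is (AS number, trigger number) and globally unique.

```python
import itertools
from typing import NamedTuple

class Trigger(NamedTuple):
    as_number: int
    trigger_number: int

class TriggerSource:
    """Hands out triggers that are globally unique as long as AS numbers are unique."""
    def __init__(self, as_number: int):
        self.as_number = as_number
        self._counter = itertools.count(1)   # assumed: a simple per-AS sequence number

    def new_trigger(self) -> Trigger:
        return Trigger(self.as_number, next(self._counter))
```

Because triggers are plain tuples, they can be stored in sets and compared cheaply, which is what tracking complete and incomplete updates requires.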
Distributed Coordination
• During a snapshot, router ‘A’ saves the
sequence of triggers in its local History as H_A
• It prepares a set of incomplete triggers (I_A) that
contains
– All the triggers present in the local set I_A
– Triggers waiting in the outgoing queues
– Logged triggers received over incoming channels
(after the start of the current snapshot round)
• H_A and I_A are sent to the consolidators
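The report a router sends to the consolidators can be sketched as follows. The function and parameter names are hypothetical; the three sources of incomplete triggers are the ones listed above.

```python
def snapshot_report(history, in_progress, outgoing_queue, logged_incoming):
    """Return (saved history, incomplete triggers) for one snapshot round at a router."""
    saved_history = list(history)     # sequence of triggers in the local History
    incomplete = set(in_progress) | set(outgoing_queue) | set(logged_incoming)
    return saved_history, incomplete

# Hypothetical state at router A; triggers are (AS number, trigger number).
saved, incomplete = snapshot_report(
    history=[(7, 1), (7, 2), (9, 4)],
    in_progress=[(7, 2)],
    outgoing_queue=[(9, 4)],
    logged_incoming=[(3, 8)],
)
```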
View Change
[Figure: source X sends a packet to destination Y (prefix Y) across routers A, B, C, D, E. Routers that have computed the (k+1)th SFT use it; a router that hasn't finished computing the (k+1)th SFT yet uses the kth SFT.]
Entries for prefix Y:
kth SFT: A: B->C->D, B: C->D, C: D, D: Y
(k+1)th SFT: A: B->C->E, B: C->E, C: E, E: Y
Transient Mode
• Consensus routing switches to this mode
when
– The next-hop router along a stable route is
unreachable
– No stable route is available
• Uses several known schemes
– Route deflection
– Detour routing
– Backup routes
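The switch between the two modes can be sketched as a per-packet decision. The function name, the SFT layout (prefix to next hop) and the reachability set are assumptions for illustration.

```python
def forwarding_mode(sft: dict, prefix: str, reachable: set) -> str:
    """Decide how to forward a packet for `prefix` given the stable forwarding table."""
    next_hop = sft.get(prefix)
    if next_hop is None:
        return "transient"      # no stable route is available
    if next_hop not in reachable:
        return "transient"      # the stable next hop is unreachable
    return "stable"

# Hypothetical router state.
sft = {"Y": "B", "Z": "C"}
print(forwarding_mode(sft, "Y", reachable={"B"}))
print(forwarding_mode(sft, "Z", reachable={"B"}))
print(forwarding_mode(sft, "W", reachable={"B"}))
```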
Route Deflection
• After encountering a failed
link, deflect the packet to a
neighboring AS after
consulting the RIB
• If no neighbor can be
chosen, then deflect the
packet back to the sending
AS (backtracking)
– However, backtracking alone
is not sufficient to guarantee
reachability (see figure)
[Figure: Limitations of backtracking. Route labels: 1-5-D, 2-5-D, 3-5-D; 5-D; D]
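A sketch of the deflect-then-backtrack choice at a single AS. The RIB layout (neighbor to advertised path) and the "first usable neighbor" rule are assumptions; the slide does not specify how a neighbor is selected.

```python
def deflect(rib: dict, failed_neighbor: str, previous_hop: str) -> str:
    """Pick where to send a packet after the link to failed_neighbor went down.

    rib maps each neighbor to the path it advertised for the destination.
    """
    for neighbor in rib:
        if neighbor != failed_neighbor and neighbor != previous_hop:
            return neighbor          # deflect to another neighbor that has a route
    return previous_hop              # no candidate left: backtrack to the sender

# Hypothetical RIB at an AS whose link to neighbor "5" has failed.
rib = {"5": ("5", "D"), "2": ("2", "5", "D")}
print(deflect(rib, failed_neighbor="5", previous_hop="1"))
print(deflect({"5": ("5", "D")}, failed_neighbor="5", previous_hop="1"))
```

Note that the deflected route in the first call still passes through 5, which hints at why deflection and backtracking alone cannot guarantee reachability.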
Other Transient Schemes
• Detour Routing
– After encountering a failed link, select a
neighboring AS (arbitrarily) and tunnel transient
packets to it
– Tier-1 ASes are good choices in this selection
• Backup Routes
– Use pre-computed backup routes to forward
packets during failure
Evaluation
• Simulation Methodology
– CAIDA AS-level graphs gathered from RouteViews
BGP tables
• Includes 23,390 ASes and 46,095 links annotated with
inferred business relationships of the linked ASes
• Using XORP prototype to measure
implementation overhead
• Using PlanetLab nodes to measure the cost of
consensus
Link Failure
• In each experiment, one of the links of a multi-homed stub AS fails
Consensus routing provides significantly higher levels of
connectivity than BGP
Effect of Traffic Engineering
• Withdraw a subprefix from all but one of the providers (3 or
more) of a multi-homed AS
Consensus routing does not affect routing in case of policy
changes
Overhead
[Figures: control traffic required by consensus routing; delay incurred by consensus routing]
In terms of bandwidth and time, consensus routing incurs
little overhead
Discussion Points
• Selection of consolidators
– Will Tier-1 ASes (or other ASes) agree to perform
this additional duty?
• Slow ASes may face periods of disconnectivity
– How to handle this situation?
• What can we say about completeness and
accuracy of this strategy?
• Will ASes readily cooperate to handle
transient packets?
CAIDA Tools
Presented by: Abdullah Al-Nayeem
CAIDA
• The Cooperative Association for Internet Data
Analysis (CAIDA)
– San Diego Supercomputer Center (SDSC), UCSD
• CAIDA provides data, tools and analyses on
Internet traffic for better understanding of
– current and future network topology, routing,
security, performance and economic issues.
CAIDA Tools
• Measurement
– Tools for active or passive measurement of
Internet traffic and flow patterns
• Utilities
– Utilities to aid analysis of Internet traffic and flow
patterns
• Visualization
– Tools to visualize Internet data
Internet Measurement Infrastructure
• Archipelago (Ark): CAIDA’s next-generation
active measurement infrastructure
– An evolution of the skitter infrastructure
[Figure: 33 active monitors in different countries]
Scamper
• Measurement tool used at Ark monitors
• Teams of Scamper probers probe all routed /24's in a
short period of time:
– a random address in each /24 prefix is probed
approximately every 48 hours (one probing cycle)
– Supports ICMP-Paris, TCP, UDP traceroute
• Features:
– Measures forward IP paths
– Measures round-trip time
– Discovers maximum transmission unit (MTU) length
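Picking one random target per /24, as described above, can be sketched with Python's standard `ipaddress` module. This is not Scamper's code; the function name is hypothetical, and whether the real tool excludes the network and broadcast addresses is not stated here, so the sketch draws from all 256 addresses.

```python
import ipaddress
import random

def random_target(prefix: str, rng: random.Random) -> ipaddress.IPv4Address:
    """Return a uniformly random address inside the given /24 prefix."""
    network = ipaddress.IPv4Network(prefix)
    if network.prefixlen != 24:
        raise ValueError("expected a /24 prefix")
    offset = rng.randrange(network.num_addresses)    # 0..255 for a /24
    return network.network_address + offset

rng = random.Random(42)
target = random_target("192.0.2.0/24", rng)          # documentation prefix, for illustration
```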
Scamper Datasets
• IPv4 Routed /24 Topology Dataset
– Useful for understanding the topology of the Internet
• IPv4 Routed /24 AS Links Dataset
– Contains Autonomous System (AS) links derived
from the IP paths of the Topology Dataset
– RouteViews BGP data is used to map IP addresses to ASes
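Deriving AS links from an IP path can be sketched as a longest-prefix match followed by collapsing consecutive hops in the same AS. The prefix-to-AS table below is a hypothetical stand-in for BGP table data, using documentation prefixes and AS numbers; the real dataset's processing rules may differ.

```python
import ipaddress

def to_as(address: str, prefix_to_as: dict):
    """Longest-prefix match of an address against announced prefixes."""
    ip = ipaddress.ip_address(address)
    best = None
    for prefix, asn in prefix_to_as.items():
        net = ipaddress.ip_network(prefix)
        if ip in net and (best is None or net.prefixlen > best[0]):
            best = (net.prefixlen, asn)
    return None if best is None else best[1]

def as_links(ip_path, prefix_to_as):
    """Turn a traceroute-style IP path into a set of directed AS links."""
    as_path = [to_as(hop, prefix_to_as) for hop in ip_path]
    links = set()
    for a, b in zip(as_path, as_path[1:]):
        if a is not None and b is not None and a != b:
            links.add((a, b))        # consecutive hops in different ASes form a link
    return links

# Hypothetical mapping (documentation prefixes and AS numbers).
table = {"192.0.2.0/24": 64500, "198.51.100.0/24": 64501, "203.0.113.0/24": 64502}
path = ["192.0.2.1", "192.0.2.9", "198.51.100.1", "203.0.113.5"]
links = as_links(path, table)
```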
Visualization of IPv4 Internet Topology
• 1-17 Jan, 2008
• 4,853,991 IPv4 addresses
• 5,682,419 IP links
• 17,791 ASes
• Outdegree of an AS is the number of next-hop
ASes that were observed accepting traffic
from this AS
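The outdegree defined above can be computed directly from a list of observed directed AS links. The function name and the sample AS numbers are illustrative only.

```python
from collections import defaultdict

def outdegrees(as_links):
    """Count distinct next-hop ASes observed for each AS in (source, next_hop) pairs."""
    next_hops = defaultdict(set)
    for src, dst in as_links:
        next_hops[src].add(dst)      # duplicate observations of a link count once
    return {asn: len(dsts) for asn, dsts in next_hops.items()}

# Hypothetical observed links (documentation AS numbers).
observed = [(64500, 64501), (64500, 64502), (64500, 64501), (64501, 64502)]
degrees = outdegrees(observed)
```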
RRDTool
• Round Robin Database tool
– A system to store and display time-series data
– Network bandwidth, machine-room temperature, server
load average, etc.
• Features:
– Fixed-size archives hold an unbounded stream of data
– The oldest entries are overwritten when the archive is full
• Limitations:
– Can’t add data for past events
– Can’t add data twice at the same timestamp
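The fixed-size, overwrite-oldest behavior and the two limitations can be sketched with a bounded deque. This is not RRDtool's API or storage format (the real tool also consolidates data into coarser archives); the class and method names are hypothetical.

```python
from collections import deque

class RoundRobinArchive:
    """Fixed-size time-series store that overwrites its oldest rows."""
    def __init__(self, slots: int):
        self._rows = deque(maxlen=slots)     # the oldest row is dropped automatically when full
        self._last_timestamp = None

    def update(self, timestamp: int, value: float) -> None:
        # Rejects both past timestamps and a second value at the same timestamp.
        if self._last_timestamp is not None and timestamp <= self._last_timestamp:
            raise ValueError("timestamp must be later than the last update")
        self._rows.append((timestamp, value))
        self._last_timestamp = timestamp

    def rows(self):
        return list(self._rows)

archive = RoundRobinArchive(slots=3)
for t, load in [(100, 0.5), (200, 0.7), (300, 0.9), (400, 1.1)]:
    archive.update(t, load)
```

After the fourth update the archive still holds three rows, and the row for timestamp 100 has been overwritten.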
RRDTool (2)
• Example: Statistics for network interfaces
Beluga
• Provides a real-time graph of RTTs and packet
loss to an end host
[Figure: Stanford to m-root-server (Tokyo)]
Walrus
• Directed-graph visualization tool in 3D space
• A meaningful spanning tree is required, for
better visualization.
Thanks
Questions and Comments?