T-110.455 Network Application
Frameworks and XML
Distributed Hash Tables (DHTs) and
multi-homing
16.2.2005
Sasu Tarkoma
Based on slides by Pekka Nikander
Contents
Introduction
Layered model, routing table vs. address
Multi-homing
Overlay networks
Summary
Introduction
A name determines the node / service
An address gives its location
May change due to mobility
A path determines how to get there
Routing determines the path from an
address
Directories convert names to addresses
You can think of routing tables and
directories as distributed databases
Layered-model revisited
Figure: the layered model revisited. From top to bottom: Object API (finding, meta-data, invoking, syncing, mobility; Web Services), Presentation (XML presentation, DNS names), End-to-end (firewall bypass, congestion control; IP addresses), Routing (routing paths).
Topology in address vs. routing table
Figure: routing approaches on an axis from topology encoded in the address to topology kept in the routing table: pure source routing, reactive ad hoc (MANET) routing, proactive ad hoc (MANET) routing, ATM PNNI, original IP, CIDR routing, host-based hop-by-hop routing.
Identity/Locator split
New name space for IDs
Maybe based on DNS
Maybe a separate
namespace
Maybe IP addresses are
used for location
Good for hiding IP versions
Communication endpoints (sockets) bound to
identifiers
Figure: protocol stack with a new ID layer. The Process and Transport layers bind to an identifier at the ID layer; the IP layer below uses a locator; the Link layer is underneath.
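To make the indirection concrete, here is a minimal Python sketch; all names (ip_send, the locators table) are hypothetical stand-ins, not a real ID-layer API:

    # Sketch of the identity/locator split (hypothetical names throughout).
    # Endpoints are bound to a stable identifier; the ID layer resolves the
    # identifier to the current locator (IP address) at send time.
    locators = {"host-A": "192.0.2.5"}   # ID layer mapping: identifier -> locator

    def ip_send(locator, data):
        print(f"IP send to {locator}: {data!r}")   # stand-in for the IP layer

    def send(identifier, data):
        ip_send(locators[identifier], data)        # resolve on every send

    def move(identifier, new_locator):
        locators[identifier] = new_locator         # upper layers are unaffected

    send("host-A", b"hello")
    move("host-A", "198.51.100.7")                 # renumbering or mobility
    send("host-A", b"hello again")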
Multi-addressing
Multi-homing
Server has multiple addresses on different
networks for increased reliability
Client has multiple addresses
Multi-homing requires support for
address change
Topology change can cause
renumbering
From old prefix to new prefix
Changing the IP host addresses of each
device within the network
Related to multi-homing and must be
supported by mobility protocols
HIP, SCTP
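Renumbering itself is mechanical at the address level: the interface bits stay, the prefix bits change. A small sketch using Python's ipaddress module, with made-up example prefixes:

    import ipaddress

    def renumber(addr, old_net, new_net):
        """Move addr from old_net to new_net, keeping the host bits."""
        a = ipaddress.IPv6Address(addr)
        old = ipaddress.IPv6Network(old_net)
        new = ipaddress.IPv6Network(new_net)
        host_bits = int(a) & ((1 << (128 - old.prefixlen)) - 1)
        return ipaddress.IPv6Address(int(new.network_address) | host_bits)

    # every device keeps its interface identifier, only the prefix changes
    print(renumber("2001:db8:1::42", "2001:db8:1::/48", "2001:db8:2::/48"))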
Multihoming
Figure: multihoming and mobility scenarios, arranged by scope (a single subnet, parts of the topology, all hosts). End-host multihoming and end-host mobility give one host reliability and load sharing (Mobile IP, HIP, SCTP). SoHo and enterprise site multihoming create additional state; the challenge is prefixes (MULTI6 in the IETF covers IPv6 multihoming). Moving networks (NEMO) and moving ad hoc multi-homed networks combine the two.
Multi-homing Examples
Figure: two examples. Site multi-homing: a multi-homed site connects to the Internet via ISP1 and ISP2 (behind NAT1 and NAT2). End-host multi-homing: a wireless host reaches the Internet over both WLAN and GPRS.
Multi-homing
End-host multihoming
Transport-level (SCTP)
Locator / Identity split
Good when there is a connection
Needs mobility management for bootstrapping
Host Identity Protocol
Natural way to decouple topological
location/locations and identity
Needs directories to solve bootstrapping
problems
We need directories for coping with
mobility and multi-homing
Scalability, low latency updates
Overlays
Motivation
Terminology
Overlays
Clusters and wide-area
Applications for DHTs
Internet Indirection Infrastructure (i3)
Delegation Oriented Architecture (DOA)
Motivation for Overlays
Directories are needed
Name resolution
Mobility support
Required properties
Fast updates
Scalability
DNS has limitations
No support for fast updates (high update
latency)
Administrative overhead
One solution
Overlay networks
Terminology
Peer-to-peer (P2P)
Different from the client-server model
Each peer has both client and server features
Overlay networks
Routing systems that run on top of another
network, such as the Internet
Distributed Hash Tables (DHT)
An algorithm for creating efficient distributed
hash tables (lookup structures)
Used to implement overlay networks
Typical features of P2P / overlays
Scalability, resilience, high availability, and
tolerance of frequent peer connections and
disconnections
Peer-to-peer in more detail
A P2P system is distributed
No centralized control
Nodes are symmetric in functionality
A large fraction of the nodes is unreliable
Nodes come and go
P2P enabled by evolution in data
communications and technology
Current challenges:
Security, IPR issues
P2P systems are decentralized overlays
Evolution of P2P systems
Started from centralized servers
Napster: centralized directory,
a single point of failure
Second generation used flooding
Gnutella: a local directory at each peer;
high cost, worst-case O(N) messages
per lookup
Research systems use DHTs
Chord, Tapestry, CAN, ..
Decentralization, scalability
Challenges
Challenges in the wide-area
Scalability
Increasing number of users, requests, files,
traffic
Resilience: more components -> more
failures
Management: intermittent resource
availability -> complex management
Overlay Networks
Origin in Peer-to-Peer (P2P)
Builds upon Distributed Hash Tables
(DHTs)
Easy to deploy
No changes to routers or TCP/IP stack
Typically on application layer
Overlay properties
Resilience
Fault-tolerance
Scalability
Figure: overlay layering. Upper layers use DNS names and custom identifiers; the overlay layer routes on overlay addresses; beneath it sit the end-to-end layer (congestion control, IP addresses) and the routing layer (routing paths).
Cluster vs. Wide-area
Clusters are
single, secure, controlled administrative
domains
engineered to avoid network partitions
low-latency, high-throughput SANs
predictable behaviour, controlled environment
Wide-area
Heterogeneous networks
Unpredictable delays and packet drops
Multiple administrative domains
Network partitions possible
Wide-area requirements
Easy deployment
Scalability to millions of nodes and
billions of data items
Availability
Copes with routine faults
Self-configuring, adaptive to network
changes
Takes locality into account
Distributed Data Structures (DSS)
DHTs are an example of DSS
DHT algorithms are available for clusters
and wide-area environments
Cluster-based solutions
They are different!
Ninja
LH* and variants
Wide-area solutions
Chord, Tapestry, ..
Flat DHTs, peers are equal
Maintain a subset of peers in a routing table
Distributed Data Structures (DSS)
Ninja project (UCB)
New storage layer for cluster services
Partition conventional data structure across
nodes in a cluster
Replicate partitions with replica groups in
cluster
Availability
Sync replicas to disk (durability)
Other DSS for data / clusters
LH* Linear Hashing for Distributed Files
Redundant versions for high-availability
Cluster-based Distributed Hash Tables (DHT)
Directory for non-hierarchical data
Several different ways to implement
Each “brick” maintains a partial map
Overlay addresses used to direct to the
right “brick”
Resilience through parallel, unrelated
mappings
Figure: a cluster-based DHT. Clients interact with any service "front-end"; each service node links the DSS library, which offers a hash table API; a redundant, low-latency, high-throughput SAN connects the service nodes to the storage "bricks". A brick is a single-node, durable hash table, replicated within the cluster.
DHT interfaces
DHTs typically offer a simple interface
put(key, value)
get(key) --> value
delete(key)
Supports wide range of applications
Similar interface to UDP/IP
Send(IP address, data)
Receive(IP address) --> data
No restrictions are imposed on the
semantics of values and keys
An arbitrary data blob can be hashed to
a key
Key/value pairs are persistent and global
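A toy sketch of this interface in Python; the hashing-based placement across nodes is illustrative, not any particular DHT:

    import hashlib

    class ToyDHT:
        """put/get/delete over a fixed set of nodes; keys placed by hash."""
        def __init__(self, n_nodes=4):
            self.nodes = [{} for _ in range(n_nodes)]

        def _node(self, key):
            digest = hashlib.sha1(key.encode()).digest()
            return self.nodes[int.from_bytes(digest, "big") % len(self.nodes)]

        def put(self, key, value):
            self._node(key)[key] = value

        def get(self, key):
            return self._node(key).get(key)

        def delete(self, key):
            self._node(key).pop(key, None)

    dht = ToyDHT()
    dht.put("song.mp3", b"...data...")
    print(dht.get("song.mp3"))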
Figure: distributed applications call put(key, value) and get(key) --> value on the DHT, which balances keys and data across the participating nodes.
Some DHT applications
File sharing
Web caching
Censor-resistant data storage
Event notification
Naming systems
Query and indexing
Communication primitives
Backup storage
Web archive
Chord
Chord is an overlay algorithm from MIT
Chord is a lookup structure (a directory)
Stoica et al., SIGCOMM 2001
Resembles binary search
Uses consistent hashing to map keys to
nodes
Keys are hashed to m-bit identifiers
Nodes have m-bit identifiers
IP-address is hashed
SHA-1 is used as the baseline algorithm
Support for rapid joins and leaves
Churn
Maintains routing tables
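The key-to-identifier mapping is plain consistent hashing. A sketch, using SHA-1 as Chord does and m = 6 to match the example ring on the following slides:

    import hashlib

    def chord_id(data: bytes, m: int = 6) -> int:
        """Hash data (a key, or a node's IP address) to an m-bit identifier."""
        digest = hashlib.sha1(data).digest()        # SHA-1, Chord's baseline
        return int.from_bytes(digest, "big") % (2 ** m)

    print(chord_id(b"198.51.100.7"))   # node identifier from an IP address
    print(chord_id(b"song.mp3"))       # key identifier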
Chord routing I
A node has a well-determined place
within the ring
Identifiers are ordered on an identifier
circle modulo 2^m
A node has a predecessor and a
successor
A node stores the keys between its
predecessor and itself
The Chord ring
A (key, value) pair is stored on the
successor node of the key
A routing table (finger table) keeps track
of other nodes
Finger Table
Each node maintains a routing table with
at most m entries
The i:th entry of the table at node n
contains the identity of the first node, s,
that succeeds n by at least 2^(i-1) on the
identifier circle
s = successor(n + 2^(i-1))
The i:th finger of node n
Finger table of node N8 (x = 8):
Finger   Maps to          Real node
1,2,3    x+1, x+2, x+4    N14
4        x+8              N21
5        x+16             N32
6        x+32             N42
Figure: a Chord ring with m = 6 (identifiers 0 ... 2^m - 1) and nodes N1, N8, N14, N21, N32, N38, N42, N51, N56. The fingers of node N8 point to the successors of 8 + 2^(j-1) for j = 1, ..., m (+1, +2, +4 --> N14; +8 --> N21; +16 --> N32; +32 --> N42); N1 is N8's predecessor.
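The table can be reproduced directly from the definition s = successor(n + 2^(i-1)). A sketch over the example node set above (m = 6):

    M = 6
    NODES = sorted([1, 8, 14, 21, 32, 38, 42, 51, 56])  # the example ring

    def successor(k):
        """First node at or after identifier k, clockwise on the ring."""
        k %= 2 ** M
        return next((n for n in NODES if n >= k), NODES[0])  # wrap around

    def finger_table(n):
        """Finger i points to successor(n + 2^(i-1)), for i = 1..m."""
        return [successor(n + 2 ** (i - 1)) for i in range(1, M + 1)]

    print(finger_table(8))   # [14, 14, 14, 21, 32, 42], matching the table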
Chord routing II
Routing steps
check whether k is found between n and the
successor of n
if not, forward the request to the closest finger
preceding k
Each node knows a lot about nearby
nodes and less about nodes farther
away
The target node will eventually be found
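Putting the two routing steps together gives the lookup loop. A self-contained sketch on the same example ring:

    M = 6
    NODES = sorted([1, 8, 14, 21, 32, 38, 42, 51, 56])

    def successor(k):
        k %= 2 ** M
        return next((n for n in NODES if n >= k), NODES[0])

    def between(x, a, b):
        """True if x lies in the half-open ring interval (a, b]."""
        return a < x <= b if a < b else x > a or x <= b

    def lookup(k, n):
        """Route key k starting at node n; returns the node path taken."""
        path = [n]
        while not between(k, n, successor(n + 1)):
            fingers = [successor(n + 2 ** (i - 1)) for i in range(1, M + 1)]
            nxt = successor(n + 1)                  # fallback: immediate successor
            for f in reversed(fingers):
                if between(f, n, k) and f != k:     # closest finger preceding k
                    nxt = f
                    break
            n = nxt
            path.append(n)
        path.append(successor(n + 1))               # this node's successor owns k
        return path

    print(lookup(54, 8))   # [8, 42, 51, 56]: key 54 is stored on N56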
Chord lookup
Figure: a lookup on the same Chord ring (m = 6; nodes N1, N8, N14, N21, N32, N38, N42, N51, N56), following fingers clockwise towards the key's successor.
Chord Properties
Each node is responsible for K/N keys (K
is the number of keys, N is the number of
nodes)
When a node joins or leaves the
network only O(K/N) keys will be
relocated
Lookups take O(log N) messages
To re-establish routing invariants after
join/leave O(log^2 N) messages are
needed
Tapestry
DHT developed at UCB
Used in OceanStore
Secure, wide-area storage service
Tree-like geometry
Suffix-based hypercube
Zhao et al., UC Berkeley TR 2001
160-bit identifiers
Suffix routing from A to B
hop h shares a suffix of length h digits with B
1245 routes to 5432 via, e.g.,
***2 --> **32 --> *432 --> 5432
Tapestry Core API:
publishObject(ObjectID,[serverID])
routeMsgToObject(ObjectID)
routeMsgToNode(NodeID)
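An idealized sketch of suffix routing: every hop moves to a (hypothetical) neighbour whose identifier matches one more trailing digit of the destination:

    def suffix_route(src: str, dest: str):
        """Idealized Tapestry-style suffix routing over digit strings."""
        path, cur = [src], src
        for h in range(1, len(dest) + 1):
            if cur[-h:] == dest[-h:]:
                continue                      # suffix already matches this far
            cur = cur[:-h] + dest[-h:]        # assume such a neighbour exists
            path.append(cur)
        return path

    print(suffix_route("1245", "5432"))
    # ['1245', '1242', '1232', '1432', '5432']: hop h shares an h-digit suffix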
Pastry I
A DHT based on a circular flat identifier
space
Prefix-routing
Message is sent towards a node which is
numerically closest to the target node
Procedure is repeated until the node is found
Prefix match: number of identical digits before
the first differing digit
Prefix match increases with every hop
Similar performance to Chord
Pastry II
Routing a message
If the key is within the leaf set's range
--> deliver to the numerically closest
leaf node
else send to the node in the routing
table with the longest common prefix
(longer than the current node's)
else forward to a known node that is
numerically closer and has the same
prefix match as the current node
Used in event notification (Scribe)
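A sketch of the forwarding decision; a flat list of known nodes stands in for the real leaf set and routing table, which is a simplification:

    def shared_prefix(a: str, b: str) -> int:
        """Number of identical leading digits."""
        n = 0
        for x, y in zip(a, b):
            if x != y:
                break
            n += 1
        return n

    def next_hop(cur, key, known):
        """Pastry-style choice: longer prefix first, then numerically closer."""
        dist = lambda n: abs(int(n, 16) - int(key, 16))
        p = shared_prefix(cur, key)
        longer = [n for n in known if shared_prefix(n, key) > p]
        if longer:                                  # routing-table case
            return max(longer, key=lambda n: shared_prefix(n, key))
        closer = [n for n in known
                  if shared_prefix(n, key) == p and dist(n) < dist(cur)]
        return min(closer, key=dist) if closer else cur   # closer node, or deliver

    print(next_hop("65a1", "d46a", ["d13d", "d4c2", "9a3b"]))   # -> 'd4c2'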
Security Considerations
Malicious nodes
Attacker floods DHT with data
Attacker returns incorrect data
self-authenticating data
Attacker denies data exists or supplies
incorrect routing info
Basic solution: using redundancy
What if attackers have a quorum?
Need a way to control creation of node IDs
Solution: secure node identifiers
Use public keys
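The point of secure node identifiers is that they become self-certifying. A sketch with SHA-1 over placeholder key material:

    import hashlib

    def node_id(public_key: bytes) -> str:
        """Self-certifying node id: id = SHA-1(public key)."""
        return hashlib.sha1(public_key).hexdigest()

    def verify(claimed_id: str, public_key: bytes) -> bool:
        """Anyone can check that an identifier is bound to a key."""
        return claimed_id == node_id(public_key)

    pk = b"-----BEGIN PUBLIC KEY----- ..."     # placeholder key material
    nid = node_id(pk)
    print(verify(nid, pk))                     # True
    print(verify(nid, b"some other key"))      # False: id is bound to the key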
Applications for DHTs
DHTs are used as a basic building block
for an application-level infrastructure
Internet Indirection Infrastructure (i3)
New forwarding infrastructure based on Chord
DOA (Delegation Oriented Architecture)
New naming and addressing infrastructure
based on overlays
Internet Indirection
Infrastructure (i3)
A DHT-based overlay network
Based on Chord
Aims to provide a more flexible
communication model than current IP
addressing
Also a forwarding infrastructure
i3 packets are sent to identifiers
each identifier is routed to the i3 node
responsible for that identifier
the node maintains triggers that are installed
by receivers
when a matching trigger is found the packet is
forwarded to the receiver
i3 II
An i3 identifier may be bound to a host,
object, or a session
i3 has been extended with ROAM
Robust Overlay Architecture for Mobility
Allows end hosts to control the placement of
rendezvous-points (indirection points) for
efficient routing and handovers
Legacy application support
a user-level proxy encapsulates IP packets
into i3 packets
R inserts a trigger (id, R) and receives
all packets with identifier id.
Mobility is transparent to the sender
when the host changes its address from R1 to
R2, it updates its trigger from (id, R1) to (id, R2).
Source: http://i3.cs.berkeley.edu/
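A toy sketch of the trigger mechanism at a single i3 node (the printed forwarding is a stand-in for actual IP delivery):

    class I3Node:
        """One i3 node: stores triggers and forwards matching packets."""
        def __init__(self):
            self.triggers = {}                 # identifier -> receiver address

        def insert_trigger(self, ident, addr):
            self.triggers[ident] = addr        # installed (and refreshed) by R

        def send(self, ident, packet):
            addr = self.triggers.get(ident)
            if addr is not None:
                print(f"forward {packet!r} to {addr}")   # stand-in for IP send

    node = I3Node()
    node.insert_trigger("id-42", "R1")
    node.send("id-42", b"data")                # delivered to R1
    node.insert_trigger("id-42", "R2")         # host moved: trigger updated
    node.send("id-42", b"data")                # sender unchanged, now reaches R2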
Figure: a multicast tree using a hierarchy of triggers (source: http://i3.cs.berkeley.edu/).
Figure: anycast using the longest matching prefix rule (source: http://i3.cs.berkeley.edu/).
Figure: sender-driven service composition using a stack of identifiers (source: http://i3.cs.berkeley.edu/).
Figure: receiver-driven service composition using a stack of identifiers (source: http://i3.cs.berkeley.edu/).
DOA (Delegation Oriented
Architecture)
Proposed to circumvent the harmful side-effects of middleboxes
Middleboxes are used to improve
scalability, but they also constrain the
Internet architecture
Two changes
persistent host identifiers
a mechanism for resolving these
Each host has a unique 160-bit EID
(End-point Identifier)
Works in a similar fashion to HIP
The EID is a hash of a public key;
identity/locator split
DOA II
The DOA header is located between the
IP and TCP headers and carries source
and destination EIDs
The mapping service maps EIDs to IP
addresses
This allows the introduction of various
middleboxes to the routing of packets
Outsourcing of intermediaries
Ideally clients may select the most useful
network elements to use
Differences to i3
Not a forwarding infrastructure
i3 does not provide a primitive for specifying
intermediaries
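A sketch of the header layout described above, two 160-bit EIDs between the IP and transport headers; the field layout here is illustrative, not the actual DOA wire format:

    import hashlib
    import struct

    def eid(public_key: bytes) -> bytes:
        """160-bit EID as a hash of a public key (HIP-style identity)."""
        return hashlib.sha1(public_key).digest()

    def doa_header(src_eid: bytes, dst_eid: bytes) -> bytes:
        """Illustrative DOA header: source and destination EIDs back to back."""
        assert len(src_eid) == len(dst_eid) == 20      # 160 bits each
        return struct.pack("!20s20s", src_eid, dst_eid)

    hdr = doa_header(eid(b"client key"), eid(b"server key"))
    print(len(hdr), "bytes between the IP and TCP headers")   # 40 bytes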
Layered-model
Figure: the layered model with a host identity layer. From top: Object API, Presentation, Host Identity (public key; firewall bypass), End-to-end (congestion control, IP addresses), Routing (routing paths).
Summary
Mobility and multi-homing require
directories
Overlay networks have been proposed
Scalability, low-latency updates
Searching, storing, routing, notification,..
Lookup (Chord, Tapestry, Pastry),
coordination primitives (i3), middlebox
support (DOA)
Logarithmic scalability, decentralised,…
The identity/locator split and overlays for
directories seem good candidates for
solving many of the issues with mobility
and multi-homing