A Survey on Parallel Computing in
Heterogeneous Grid Environments
Takeshi Sekiya
Chikayama-Taura Laboratory M1
Nov 24, 2006
Parallel Computing
in Grid Environments
• More opportunities to use multi-cluster environments
– But schemes designed for stand-alone clusters cause problems in grid-like usage
• New mechanisms are needed
– Handling heterogeneity
– Firewall/NAT traversal
– Adaptation to dynamic environment
– Monitoring
[Figure labels: dynamic change of CPU/network load; heterogeneous hardware and software; firewall/NAT; maintenance; complex configuration; difficult to know what's happening; failure]
Heterogeneous Environments
• Heterogeneous machines
– Binaries are different
– Complex configuration is required when
hardware/software is different
• Heterogeneous networks
– Overheads of synchronization in parallel
application with different latency/bandwidth
– Firewalls/NATs
Firewall/NAT
• Firewalls/NATs hinder bi-directional
connectivity
• Bi-directional TCP/IP connectivity
needs to be provided to support a wide
spectrum of applications
Firewall or NAT
Solutions to the Internet
Asymmetric-Connectivity Problem
• MPI Environment on Grid with Virtual
Machines [Tachibana et al. 2006]
– Xen for VM and VPN for Virtual Network
– Low cost VM migration
• ViNe [Tsugawa et al. 2006]
– A host named Virtual Router
– Overlay-network based
• WOW [Ganguly et al. 2006]
Outline
• Introduction
• WOW
– IPOP: IP over P2P
– Routing IP on the P2P Overlay
– Connection Setup
– Joining an Existing Network
– NAT Traversal
– Experiments
• Summary
Objective and Approach
• The system is architected to …
– Adapt to heterogeneous environments
• Present to end-users a cluster-like environment
– Scale to a large number of nodes
– Facilitate the addition of nodes through self-organization of the virtual network
• Less manual configuration
• Approach with Virtualization
– Virtual Machines
• Homogeneous software
– Self-organizing overlay network
• All-to-all connectivity
Virtual Machine
• A homogeneous
software environment
• Offering opportunities
for load balancing and
fault tolerance
• Users can use preconfigured systems
– Linux distribution
– Libraries and software
Virtual Network
[Figure: layered stack, top to bottom: Virtual Grid Cluster; IPOP (IP over P2P); P2P overlay network; physical infrastructure with NATs and firewalls]
IPOP [Ganguly et al. 2006]
• Characteristics
– A virtual IP address space
– Self-organizing
• Architecture
– IP tunneling over P2P
– A virtualized network interface (tap)
captures virtual IP packets
– Brunet P2P overlay network
Capturing Virtual IP Packets
• The tap appears to applications as a network interface
• IPOP translates virtual IP addresses to
Brunet P2P network addresses
[Figure: Tunneling. application → tap (Ethernet frame carrying an IP packet) → IPOP → Brunet message carrying the IP packet → IPOP → tap (Ethernet frame carrying the IP packet) → application]
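The capture-and-tunnel path above can be sketched roughly as follows. This is a minimal illustration, not IPOP's actual code: the names `overlay_address`, `OverlayMessage`, `encapsulate` and `decapsulate`, the message layout, and the choice to derive the overlay address by SHA-1 hashing the virtual IP string are all assumptions made for the example.

```python
import hashlib
from dataclasses import dataclass


def overlay_address(virtual_ip: str) -> bytes:
    """Map a virtual IP to a 160-bit overlay address.

    Assumption: the address is the SHA-1 digest of the IP string.
    """
    return hashlib.sha1(virtual_ip.encode("ascii")).digest()


@dataclass
class OverlayMessage:
    """Hypothetical overlay message: destination address plus tunneled IP packet."""
    dest: bytes
    payload: bytes


def encapsulate(dest_virtual_ip: str, ip_packet: bytes) -> OverlayMessage:
    # Sender side: wrap the IP packet captured from the tap device.
    return OverlayMessage(dest=overlay_address(dest_virtual_ip), payload=ip_packet)


def decapsulate(msg: OverlayMessage) -> bytes:
    # Receiver side: recover the IP packet to write back to the local tap device.
    return msg.payload
```

The Ethernet framing handled by the tap device is left out; only the translation from a virtual IP to an overlay address and the wrapping of the packet are shown.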
Brunet P2P
• Ring-structured overlay
• Organized connections
– Near: with neighbors
– Far: across the ring
• 160 bit SHA-1 hash
address
• Greedy routing
• Each node has a constant
number of connections
– O(log²(n)) overlay hops
[Figure: ring of nodes n1 to n12, showing a multi-hop path from n1 to n7]
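Greedy routing on a ring can be sketched as below. This is an illustration under assumptions, not Brunet's implementation: the function names, the `connections` dictionary, and the toy ring of size 80 in the usage note are all hypothetical. Each hop forwards to the connected node whose address is closest to the destination.

```python
def ring_distance(a: int, b: int, size: int = 1 << 160) -> int:
    """Shortest distance between two addresses on a ring of the given size."""
    d = (a - b) % size
    return min(d, size - d)


def greedy_route(src, dest, connections, size=1 << 160):
    """Return the hop-by-hop path from src towards dest.

    connections maps each node address to the addresses it is connected to.
    Routing stops when no connected node is closer to dest than the current one.
    """
    path = [src]
    current = src
    while current != dest:
        nxt = min(connections[current], key=lambda n: ring_distance(n, dest, size))
        if ring_distance(nxt, dest, size) >= ring_distance(current, dest, size):
            break  # the current node is the closest reachable node
        path.append(nxt)
        current = nxt
    return path


def toy_ring():
    """Eight nodes at 0, 10, ..., 70 with near links, plus one far link 0 -> 40."""
    conns = {a: [(a - 10) % 80, (a + 10) % 80] for a in range(0, 80, 10)}
    conns[0].append(40)
    return conns
```

On the toy ring, `greedy_route(0, 50, toy_ring(), size=80)` takes the far connection to 40 and then one near hop to 50, instead of walking around the ring through near neighbors only.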
Connection Setup
Connection Protocol
• Node A wishes to connect to node B
1. A sends a CTM (Connect To Me) request to B over the P2P network
• The CTM request contains A’s URI
2. When B receives the CTM request, B sends a CTM reply to A
• The CTM reply contains B’s URI
URI (Uniform Resource Indicator)
ex.) brunet.tcp:192.0.0.1:1024
[Figure: A sends a CTM request to B; B returns a CTM reply to A]
Connection Setup
Linking Protocol
3. B sends a link request message to A over the physical network
4. When A receives the link request, A simply responds with a link reply message
5. Finally, a new connection is established between A and B
[Figure: B sends a link request to A; A returns a link reply; direct connection between A and B]
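Steps 1 to 5 of connection setup can be traced with a small in-memory simulation. This is only a sketch: the `Node` class, its method names, the log strings, and the second URI are hypothetical, and real messages would travel over the overlay (CTM) and the physical network (link) rather than being direct method calls.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    uri: str                                   # e.g. "brunet.tcp:192.0.0.1:1024"
    known_uris: dict = field(default_factory=dict)
    connections: set = field(default_factory=set)
    log: list = field(default_factory=list)

    def send_ctm_request(self, target: "Node") -> None:
        # Step 1: the CTM request goes over the overlay and carries our URI.
        self.log.append(f"CTM request -> {target.name}")
        target.on_ctm_request(self)

    def on_ctm_request(self, sender: "Node") -> None:
        # Step 2: remember the sender's URI and reply with our own.
        self.known_uris[sender.name] = sender.uri
        self.log.append(f"CTM reply -> {sender.name}")
        sender.on_ctm_reply(self)
        # Step 3: send a link request directly, using the URI just learned.
        self.log.append(f"link request -> {sender.name}")
        sender.on_link_request(self)

    def on_ctm_reply(self, sender: "Node") -> None:
        self.known_uris[sender.name] = sender.uri

    def on_link_request(self, sender: "Node") -> None:
        # Step 4: simply respond with a link reply.
        self.log.append(f"link reply -> {sender.name}")
        self.connections.add(sender.name)
        sender.on_link_reply(self)

    def on_link_reply(self, sender: "Node") -> None:
        # Step 5: the connection is now established on both sides.
        self.connections.add(sender.name)


a = Node("A", "brunet.tcp:192.0.0.1:1024")
b = Node("B", "brunet.tcp:192.0.0.2:1024")   # hypothetical second URI
a.send_ctm_request(b)
```

After the call, each node lists the other in `connections`, and the two logs show the CTM exchange followed by the link exchange.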
Linking Race Condition (1)
• A race condition may occur because the linking protocol can be initiated by both peers
[Figure: both peers exchange link request and link reply at the same time; both attempts succeed]
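Linking Race Condition (2)
• When a node receives a link request, it checks that there is no existing connection or active connection attempt to the sender; otherwise it answers with a link error
• When nodes receive a link error, they restart the protocol with random back-off
[Figure: a link request that arrives while active linking is on is answered with a link error; after a random back-off, the link request is sent again and answered with a link reply]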
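The check and the random back-off can be sketched as follows. The `Peer` class, the `backoff_delay` helper, and the exponential growth of the back-off window are assumptions for illustration; the slides only say that the back-off is random.

```python
import random


class Peer:
    def __init__(self, name):
        self.name = name
        self.connections = set()   # established links
        self.linking = set()       # peers we are actively trying to link to

    def start_linking(self, other):
        self.linking.add(other.name)

    def on_link_request(self, sender):
        """Answer an incoming link request with "link reply" or "link error"."""
        if sender.name in self.connections or sender.name in self.linking:
            return "link error"
        self.connections.add(sender.name)
        return "link reply"

    def on_response(self, other, response):
        self.linking.discard(other.name)
        if response == "link reply":
            self.connections.add(other.name)


def backoff_delay(attempt, rng, base=0.1):
    """Random delay before retrying (window doubling per attempt is an assumption)."""
    return rng.uniform(0, base * (2 ** attempt))


def simulate_race(seed=0):
    rng = random.Random(seed)
    a, b = Peer("A"), Peer("B")
    # Both peers start linking at once, so their link requests cross.
    a.start_linking(b)
    b.start_linking(a)
    resp_for_a = b.on_link_request(a)
    resp_for_b = a.on_link_request(b)
    a.on_response(b, resp_for_a)
    b.on_response(a, resp_for_b)
    # Both back off; the peer with the shorter delay retries first.
    delays = {a: backoff_delay(1, rng), b: backoff_delay(1, rng)}
    first = min(delays, key=delays.get)
    second = b if first is a else a
    first.start_linking(second)
    retry = second.on_link_request(first)
    first.on_response(second, retry)
    return resp_for_a, resp_for_b, retry, a, b
```

In this run both crossing requests are rejected with a link error, and the retry after the back-off succeeds, leaving exactly one connection between the two peers.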
Joining an Existing Network
Leaf Connection
• A new node N creates a leaf connection to an initial node I by directly using the linking protocol
• I acts as forwarding agent for N
[Figure: ring with initial node I, the leaf connection from new node N to I, and the correct position of the new node on the ring]
Joining an Existing Network
Send CTM request
• N sends a CTM request addressed to itself over the P2P network
– The CTM request contains N’s URI
• The CTM request is received by the right and left neighbors, since N is still not in the ring
[Figure: the CTM request travels from new node N through initial node I to left neighbor L and right neighbor R]
Joining an Existing Network
Send CTM reply
• L and R send CTM replies including their URIs to I
• I forwards the CTM replies to N
[Figure: CTM replies travel from left neighbor L and right neighbor R to initial node I, and from I to new node N]
Joining an Existing Network
Linking Protocol
• Start the linking protocol
• L and R send link request messages to N over the physical network
[Figure: link requests from left neighbor L and right neighbor R to new node N; initial node I is not involved]
Joining an Existing Network
Complete Joining
• N forms connections with its neighbors and is in the ring
• N then acquires “far” connections
[Figure: new node N sits on the ring between left neighbor L and right neighbor R; initial node I remains elsewhere on the ring]
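A CTM request addressed to N's own address ends up at the nodes nearest to that address, which become N's left and right neighbors. The sketch below shows one way to find them in a sorted list of ring addresses. The function names, the small integer addresses, and which side is called "left" are assumptions for illustration.

```python
import bisect


def ring_neighbors(ring, new_addr):
    """Return the (left, right) neighbors of new_addr in a sorted list of addresses."""
    i = bisect.bisect_left(ring, new_addr)
    left = ring[i - 1]            # wraps to the last node when i == 0
    right = ring[i % len(ring)]   # wraps to the first node when i == len(ring)
    return left, right


def join(ring, new_addr):
    """Insert new_addr into the ring and return the neighbors it links to."""
    left, right = ring_neighbors(ring, new_addr)
    bisect.insort(ring, new_addr)
    return left, right


ring = [10, 40, 70]
neighbors = join(ring, 55)   # the new node lands between 40 and 70
```

The wrap-around cases matter because the address space is a ring: an address below the smallest node or above the largest one still has both neighbors.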
Adaptive Shortcut Creation
• High latencies were observed in
experiments due to multi-hop overlay routing
• Shortcut creation
– Count IPOP packets to other nodes
– When the number of packets within an interval
exceeds a threshold, initiate connection setup
– Because maintaining connections incurs overhead,
drop connections that are no longer in use
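The counting scheme can be sketched as below. The `ShortcutManager` class, its parameters, and the rule that a shortcut is dropped after one full interval without traffic are assumptions for illustration; the slides do not give the actual threshold, interval, or drop criterion.

```python
from collections import Counter


class ShortcutManager:
    def __init__(self, threshold, interval):
        self.threshold = threshold   # packets per interval that trigger a shortcut
        self.interval = interval     # length of the counting interval in seconds
        self.window_start = 0.0
        self.counts = Counter()      # packets sent to each destination this interval
        self.shortcuts = set()       # destinations with a direct connection

    def on_packet(self, dest, now):
        """Count one packet to dest; return True if a shortcut is created now."""
        if now - self.window_start >= self.interval:
            self._end_interval(now)
        self.counts[dest] += 1
        if self.counts[dest] > self.threshold and dest not in self.shortcuts:
            self.shortcuts.add(dest)   # stands in for initiating connection setup
            return True
        return False

    def _end_interval(self, now):
        # Drop shortcuts that carried no traffic during the interval that just ended.
        self.shortcuts = {d for d in self.shortcuts if self.counts[d] > 0}
        self.counts.clear()
        self.window_start = now
```

With `ShortcutManager(threshold=3, interval=10.0)`, the fourth packet to the same destination within one interval creates the shortcut, and the shortcut disappears once a whole interval passes without traffic to that destination.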
NAT
• Private network: Host a (IP: 192.168.0.2)
• Global network: NAT (IP: 133.11.23.100), Host b (IP: 157.82.13.244)
• Outbound, Host a to NAT: Src: 192.168.0.2:5000, Dst: 157.82.13.244:80
• Outbound, NAT to Host b: Src: 133.11.23.100:6000, Dst: 157.82.13.244:80
• Inbound, Host b to NAT: Src: 157.82.13.244:80, Dst: 133.11.23.100:6000
• Inbound, NAT to Host a: Src: 157.82.13.244:80, Dst: 192.168.0.2:5000
NAT Table
192.168.0.2:5000 ⇔ 133.11.23.100:6000
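The address rewriting in this example can be reproduced with a small translation table. The `Nat` class is a hypothetical sketch: it allocates public ports sequentially starting at 6000 to match the slide, and it accepts inbound packets from any remote endpoint once a mapping exists, which real NATs may not do.

```python
class Nat:
    def __init__(self, public_ip, first_port=6000):
        self.public_ip = public_ip
        self.next_port = first_port
        self.out_map = {}   # (private_ip, private_port) -> public_port
        self.in_map = {}    # public_port -> (private_ip, private_port)

    def outbound(self, src, dst):
        """Rewrite the source of a packet leaving the private network."""
        if src not in self.out_map:
            self.out_map[src] = self.next_port
            self.in_map[self.next_port] = src
            self.next_port += 1
        return (self.public_ip, self.out_map[src]), dst

    def inbound(self, src, dst):
        """Rewrite the destination of a packet entering the private network."""
        ip, port = dst
        if ip != self.public_ip or port not in self.in_map:
            return None   # no table entry: the packet is dropped
        return src, self.in_map[port]


nat = Nat("133.11.23.100")
out = nat.outbound(("192.168.0.2", 5000), ("157.82.13.244", 80))
back = nat.inbound(("157.82.13.244", 80), ("133.11.23.100", 6000))
```

The last check in the usage is the important one for the connectivity problem: a packet sent to a port with no table entry never reaches the private host.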
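NAT Traversal
UDP Hole Punching
• Host A (IP: A) is behind a NAT with IP N; Host B (IP: B) is behind a NAT with IP M
• A to B: Src: A:a, Dst: M:m is rewritten by A’s NAT to Src: N:n, Dst: M:m
• B to A: Src: B:b, Dst: N:n is rewritten by B’s NAT to Src: M:m, Dst: N:n, then by A’s NAT to Src: M:m, Dst: A:a
NAT Table (A’s NAT)
A:a ⇔ N:n
NAT Table (B’s NAT)
M:m ⇔ B:b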
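The exchange can be simulated with two NATs that admit inbound packets only from endpoints the inside host has already sent to. This is a simplified sketch using the symbolic addresses from the slide: the `FilteringNat` class is hypothetical, each NAT has a single inside host with a fixed mapping, and both hosts are assumed to have already learned each other's public endpoints (N:n and M:m).

```python
class FilteringNat:
    """NAT that admits inbound packets only from endpoints already contacted."""

    def __init__(self, public_ip, public_port, inside):
        self.public = (public_ip, public_port)
        self.inside = inside     # (ip, port) of the single host behind this NAT
        self.allowed = set()     # remote endpoints the inside host has sent to

    def outbound(self, dst):
        self.allowed.add(dst)    # punch a hole for packets coming back from dst
        return self.public, dst  # rewritten (src, dst)

    def inbound(self, src):
        # Deliver to the inside host only if a hole exists for this source.
        return self.inside if src in self.allowed else None


def hole_punch():
    nat_a = FilteringNat("N", "n", inside=("A", "a"))
    nat_b = FilteringNat("M", "m", inside=("B", "b"))
    results = []
    # 1. A sends to M:m. B's NAT has no hole for N:n yet, so the packet is dropped,
    #    but A's NAT now has a hole for M:m.
    src, _ = nat_a.outbound(("M", "m"))
    results.append(nat_b.inbound(src))
    # 2. B sends to N:n. A's NAT sees source M:m, which it allows.
    src, _ = nat_b.outbound(("N", "n"))
    results.append(nat_a.inbound(src))
    # 3. A sends again. B's NAT now has a hole for N:n.
    src, _ = nat_a.outbound(("M", "m"))
    results.append(nat_b.inbound(src))
    return results
```

`hole_punch()` returns the delivery result of each packet: the first is lost, and the next two reach A and B, after which traffic flows in both directions.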
Experimental Setup
Hosts: 2.4GHz Xeon, Linux 2.4.20,
VMware GSX
Hosts: 2.0 GHz Xeon, Linux 2.4.20,
VMware GSX
Host: 1.3GHz P-III Linux 2.4.21
VMPlayer
Host: 1.7GHz P4,
Win XP SP2, VMPlayer
34 compute nodes, 118 P2P router nodes on PlanetLab
Experiment 1
Joining and Shortcut Connections
• Node A: IPOP node
• Node B: new joining node
– A and B are in different
network domains with NAT
– B sends ICMP packets to A
at 1-second intervals
• Within period 1 (about 3
seconds), B establishes
routes to other nodes
• Within period 2 (about
28 seconds), B establishes a
shortcut connection to A
Experiment 2
PVM parallel application: FastDNAml (1)
• Parallelization with PVM
based master-workers
model
• FastDNAml has a high
computation-to-communication ratio
• Dynamic task assignment
tolerates performance
heterogeneities among
computing nodes
Master
Task Pool
Workers
Experiment 2
PVM parallel application: FastDNAml (2)
Execution                                         Execution time (sec)   Parallel speedup
Sequential execution (Node #2)                    22272                  n/a
Parallel execution, 30 nodes, shortcuts disabled  2033                   11.0
Parallel execution, 30 nodes, shortcuts enabled   1642                   13.6
• The execution with shortcuts enabled is 24%
faster than that with shortcuts disabled
• The parallel speedup is 13.6x
– 23x is reported in previous work on a homogeneous
cluster
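The reported speedups and the 24% figure follow directly from the three execution times. The short calculation below reproduces them; the variable names are ours.

```python
sequential = 22272     # seconds, sequential execution
no_shortcuts = 2033    # seconds, 30 nodes, shortcuts disabled
shortcuts = 1642       # seconds, 30 nodes, shortcuts enabled

# Parallel speedup = sequential time / parallel time.
speedup_no_shortcuts = sequential / no_shortcuts
speedup_shortcuts = sequential / shortcuts

# "Faster" here is the ratio of the two parallel run times minus one.
gain = no_shortcuts / shortcuts - 1

# The same comparison expressed as a reduction in run time.
time_saved = (no_shortcuts - shortcuts) / no_shortcuts

print(round(speedup_no_shortcuts, 1), round(speedup_shortcuts, 1))
print(round(gain * 100), round(time_saved * 100))
```

Note that 24% is the gain in speed; measured as a reduction in run time, the same result is about 19%.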
Summary
• Introduced WOW
– Scalable, fault-resilient, and low-management
infrastructure
• Future work
– Research on middleware that is easy to
use for heterogeneous, adaptive Grid
environments