Transcript CCGrid 2002

Virtual Private Grid:
A Command Shell for Utilizing
Hundreds of Machines Efficiently
Kenji Kaneda
Kenjiro Taura
Akinori Yonezawa
University of Tokyo
Outline of Presentation
•
•
•
•
•
•
Introduction
Virtual Private Grid (VPG)
Implementation of VPG
Experiments
Summary
Demonstration
Outline of Presentation
•
•
•
•
•
•
Introduction
Virtual Private Grid (VPG)
Implementation of VPG
Experiments
Summary
Demonstration
Background
• People have many computational resources
– Super Computers
– Clusters of Workstation/PC
– etc.
• People want to run parallel programs
– e.g.) parameter-sweep-application
My Home
Univ.
Lab.
Ancient Internet
• No administrative restrictions
– e.g.) No firewall, DHCP client, private
IP, …
> rsh 133.11.12.200
ps
5762 pts/10 0:00 tcsh
– Each machine had a unique IP address
> rsh 133.11.12.201 ps
– Machines were able to communicate
directly with
5763 pts/10 0:00 tcsh
each other
> rsh 133.11.12.202 ps
• Job submission was very easy5764 pts/10 0:00
tcsh
133.11.12.200
133.11.12.201
133.11.12.202
Today's Internet
• Various administrative restrictions
– e.g.) firewall, DHCP client, private IP
– Machines have no unique name
– Machines cannot communicate directly with
each other
• Job submission becomes cumbersome
DHCP client
Firewall
private IP
Examples of Administrative
Restrictions (1/3)
• Firewall
– It restricts access from external
hostsps
> rsh host1
Error: cannot connect to host0
> rsh gateway0
> ssh gateway1
> rsh host1 ps
5764 pts/10 0:00 tcsh
host0
host1
Firewall
gateway0
gateway1
Examples of Administrative
Restrictions (2/3)
• DHCP client
– Its IP address changes dynamically
DHCP client
133.11.12.199
133.11.12.200
> rsh 133.11.12.200 ps
5764 pts/10 0:00 tcsh
> rsh 133.11.12.200 ps
Error: cannot connect to 133.11.12.200
> rsh 133.11.12.199 ps
5764 pts/10 0:00 tcsh
Examples of Administrative
Restrictions (3/3)
rsh 192.168.0.1 ps
• Firewall + Private IP >
Error: cannot connect to 192.168.0.1
– It restricts access from
external
hosts
> rsh
gateway0
> ssh gateway1
> rsh gateway2
> rsh 192.168.0.1 ps
5764 pts/10 0:00 tcsh
private IP
Firewall
192.168.0.1
gateway0
gateway1
gateway2
Utilizing Remote Machines is
Difficult
• User need to work around administrative
restrictions
– e.g.) connecting all the machines by keeping TCP
connections permanently
• Gateways keep a connection between them
• DHCP client initiates a connection to outside
• Private host initiates a connection to outside
• It is impossible to handle all the restrictions
manually
– as the number of machines increases
– e.g.) It is difficult to select only necessary
connections to create a single connected graph
Outline of Presentation
•
•
•
•
•
•
Introduction
Virtual Private Grid (VPG)
Implementation of VPG
Experiments
Summary
Demonstration
Virtual Private Grid (VPG)
• Command shell
– User can access remote machines directly
and transparently
• e.g.) command@host
– VPG automatically constructs a selfstabilizing spanning tree
• VPG requires only a small number of
connections
• New tree is constructed whenever network
topology changes
Features (1/4)
• Nicknaming
– VPG gives each machine a unique name
• Independent from a DNS name and IP address
host0
host3
host4
host5
host1
host2
host6
Features (2/4)
• Remote job submission
– e.g.) ps@host4
Standard input is
forwarded
'ps' is executed
host0
host3
Output is forwarded
host1
host2
host4
host5
host6
Features (3/4)
• Redirection from/to file on remote machine
– e.g.) cat@host4 < fileA@host0 > fileA@host6
• Pipe between remote machines
– e.g.) tar@host2 –c file | tar@host5 x
• Shell script
– e.g.) parameter-sweep-application
host1
host3
host4
host5
host0
host2
host6
Features (4/4)
• No modification of administrative policy
• VPG tolerates
– dynamic addition/removal of machines
– dynamic disconnection between machines
host1
host3
host4
host5
host0
host2
host6
Outline of Presentation
•
•
•
•
•
•
Introduction
Virtual Private Grid (VPG)
Implementation of VPG
Experiments
Summary
Demonstration
Overview of Implementation
1. Daemons start on all the available machines
– creating and keeping some TCP connections
– constructing a tree
2. Shell starts on a user's local host
– keeping track of network topology
– calculating a shortest-path to the target
Shell
Other Implementation Issues
(1/2)
• Daemon Configuration
– User must give following configuration
information manually
• Nickname
• Port number
• List of possible connections
Other Implementation Issues
(2/2)
• Security Issues
– Many firewalls allow only SSH to login
• VPG uses SSH connection if necessary
– Public key authentication with empty pass-phrase
– We are planning to authenticate
connections between daemons
• restricting access from malicious users
Tree Construction Algorithm
(1/3)
• We uses self-stabilizing spanning tree
algorithm [S. Kutten et al.]
– Each node knows only its possible connections
– Each node initiates some connections to its
neighbors
• Nodes gradually coalesce into large trees
• Eventually, all the nodes construct a single spanning tree
• Nodes construct a new tree whenever network topology
changes dynamically
possible connection
kept connection
Tree Construction Algorithm
(1/3)
• We uses self-stabilizing spanning tree
algorithm [S. Kutten et al.]
– Each node knows only its possible connections
– Each node initiates some connections to its
neighbors
• Nodes gradually coalesce into large trees
• Eventually, all the nodes construct a single spanning tree
• Nodes construct a new tree whenever network topology
changes dynamically
possible connection
kept connection
Tree Construction Algorithm
(1/3)
• We uses self-stabilizing spanning tree
algorithm [S. Kutten et al.]
– Each node knows only its possible connections
– Each node initiates some connections to its
neighbors
• Nodes gradually coalesce into large trees
• Eventually, all the nodes construct a single spanning tree
• Nodes construct a new tree whenever network topology
changes dynamically
possible connection
kept connection
Tree Construction Algorithm
(1/3)
• We uses self-stabilizing spanning tree
algorithm [S. Kutten et al.]
– Each node knows only its possible connections
– Each node initiates some connections to its
neighbors
• Nodes gradually coalesce into large trees
• Eventually, all the nodes construct a single spanning tree
• Nodes construct a new tree whenever network topology
changes dynamically
possible connection
kept connection
Tree Construction Algorithm
(2/3)
• How to merge trees
– Each tree has a unique priority
– Tree with higher priority overruns a tree with
lower priority
– Eventually, single tree with the highest priority is
constructed
0
17
11
1
13
12
7
6
2
3
10
8
9
16
4
5
15
14
Tree Construction Algorithm
(2/3)
• How to merge trees
– Each tree has a unique priority
– Tree with higher priority overruns a tree with
lower priority
– Eventually, single tree with the highest priority is
constructed
17
10
13
6
3
8
9
16
4
5
15
Tree Construction Algorithm
(2/3)
• How to merge trees
– Each tree has a unique priority
– Tree with higher priority overruns a tree with
lower priority
– Eventually, single tree with the highest priority is
constructed
10
17
16
15
Tree Construction Algorithm
(2/3)
• How to merge trees
– Each tree has a unique priority
– Tree with higher priority overruns a tree with
lower priority
– Eventually, single tree with the highest priority is
constructed
17
Tree Construction Algorithm
(3/3)
• Possible connections need not be very precise
– Even if possible connections are not listed, VPG
works correctly
– Even if blocked connections are listed, VPG works
correctly
– This information is currently required only for
performance
• We are planning to detect it dynamically
Tree Construction Algorithm
(3/3)
• Possible connections need not be very precise
– Even if possible connections are not listed, VPG
works correctly
– Even if blocked connections are listed, VPG works
correctly
– This information is currently required only for
performance
• We are planning to detect it dynamically
Tree Construction Algorithm
(3/3)
• Possible connections need not be very precise
– Even if possible connections are not listed, VPG
works correctly
– Even if blocked connections are listed, VPG works
correctly
– This information is currently required only for
performance
• We are planning to detect it dynamically
Tree Construction Algorithm
(3/3)
• Possible connections need not be very precise
– Even if possible connections are not listed, VPG
works correctly
– Even if blocked connections are listed, VPG works
correctly
– This information is currently required only for
performance
• We are planning to detect it dynamically
Tree Construction Algorithm
(3/3)
• Possible connections need not be very precise
– Even if possible connections are not listed, VPG
works correctly
– Even if blocked connections are listed, VPG works
correctly
– This information is currently required only for
performance
• We are planning to detect it dynamically
Outline of Presentation
•
•
•
•
•
•
Introduction
Virtual Private Grid (VPG)
Implementation of VPG
Experiments
Summary
Demonstration
Experimental Environment
• Network environment
– 3 subnets
– Various architectures
• Sparc, x86, MIPS, PowerPC
– Various operating systems
• Solaris, Linux, IRIX
• VPG ran on approximately
100 nodes
Comparison to Other Job
Submission Tools (1/2)
• Job submission tools
–
–
–
–
globus-job-run [I. Foster et.al.]
SSH
rsh
VPG
• Measuring overhead of job submission
– Turn around time of a light-weight job submission
– Multi-hops job submission
• e.g.) ssh hostA 'ssh hostB command'
Comparison to Other Job
Submission Tools (2/2)
• globus-job-run, SSH, and rsh require
authentication, whenever submitting a job
– Authentication of globus-job-run and SSH causes a
large overhead
• VPG does not require authentication, whenever
submitting a job
turn around time (s)
– VPG requires authentication only once when creating
7
a tree
6
5
4
3
2
1
0
globus-job-run
Secure Shell
rsh
VPG
1hop
2hop
3hop
Low Level Interface
• We derive a communication library out
of VPG implementation
– vpg_connect()
– vpg_accept()
– vpg_send()
– vpg_recv()
• We ran ray-tracing program on 32
nodes
Outline of Presentation
•
•
•
•
•
•
Introduction
Virtual Private Grid (VPG)
Implementation of VPG
Experiments
Summary
Demonstration
Related Work (1/2)
• Globus [I. Foster et.al.]
– No build-in function for working around
administrative restrictions
• Condor [M. Livny et.al.]
– No build-in function for working around
administrative restrictions
• SSH [http://www.ssh.com]
– Difficult to utilize hundreds of machines
Related Work (2/2)
• SOCKS [www.socks.nec.com]
– General framework to bypass a firewall
– No naming scheme for private IP nor DHCP client
– Difficult to manage machines over many subnets
• Virtual Private Network (VPN)
– Mechanism to connect multiple subnets via Internet
– It requires modification of administrative policy
Summary
• Virtual Private Grid
– utilizing many machines over multiple
subnets
• by working around administrative restrictions
automatically
– available at
• http://web.yl.is.s.u-tokyo.ac.jp/~kaneda/vpg
Future Work
• Simplification of daemon configuration
– e.g.) removal of a list of possible connections
– by modifying tree-construction algorithm
• Automatic resource selection
– Simple task mapping algorithm
• Location of input/output file
• Communication through pipes
Outline of Presentation
•
•
•
•
•
•
Introduction
Virtual Private Grid (VPG)
Implementation of VPG
Experiments
Summary
Demonstration
Demonstration
• Job submission through VPG
• Dynamic tree construction