Virtual Supercomputing - Department of Computing Science
Virtual Machines for HPC
Paul Lu, Cam Macdonell
Dept of Computing Science
The Problems
1. Making applications run faster
– Not discussed today
– Parallelism is not always the answer
2. Making it easier to use different clusters
– Packaging of applications, scripts, and libraries
– Dealing with differences in environment
3. Making it easier to manage your files
– Distributed file systems
Making Use of Clusters
[Figure: clusters with differing software stacks. Labels: GROMACS, BLAST, FFTW, Library X, Python 2.2, Python 2.3.5, Globus, Trellis, Red Hat Linux, Scientific Linux]
• Heterogeneity creates complexity
• How can a scientist make use of all these clusters, without becoming a computing scientist?
Shrink-Wrapped VMs
[Figure: a VM containing GROMACS, Trellis, and Linux. Label beneath the VM: Linux, Windows, Mac OS]
• Package once
– OS (e.g., Linux)
– Libraries
– Application(s)
• Run many places
– Busby
– Glacier
– Favourite workstation
HPC using VMs
[Figure: the same VM (GROMACS, Trellis, Linux) shown four times. Labels: Local (File Server, Laptop); Remote (Glacier; Busby, AICT)]
• Packaged once, run on many x86 clusters
• Using Trellis, data is automatically moved from local to remote, and back
GROMACS on VM and HW
Concluding Remarks
• Small performance hit with VMs
• Much easier to package and use
• Potentially, access to many more compute nodes
There is hope!
• Virtualization!
What is Computing Science?
• “So…you…like…write programs or something?”
• Can you fix my printer?
Scientific Computing
• Scientific applications are on the leading edge of computing
– Lots of resources
– Complex interactions
– Huge amounts of data
Fastest Supercomputer
• Fastest Supercomputer
– IBM BlueGene/L @ LLNL
• Previously fastest
– NEC Earth Simulator
• Are computers good at solving problems in natural science?
Computing in Canada
• Canada lacks world-class computing facilities
• We have to be able to aggregate resources from numerous institutions
• The CISS experiments explored aggregating computing resources
– 4000 CPUs, 19 administrative domains (ADs)
Aggregating is difficult
• Different administration domains
• Running GROMACS
– Requires FFTW
– Doesn’t like new compilers
– Files must be in certain locations
• And this is just for one application!
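The kinds of per-cluster differences listed above can be probed with a short script before a job is submitted. This is a minimal sketch, not part of the original talk. It assumes the FFTW library is installed under the name `fftw3` and that the compiler of interest is `gcc`. The path passed in is a hypothetical placeholder.

```python
import ctypes.util
import os
import shutil


def probe_environment(library="fftw3", compiler="gcc", required_paths=()):
    """Report whether a library, a compiler, and some paths are present.

    The default names are illustrative assumptions, not GROMACS
    requirements taken from the talk.
    """
    return {
        # find_library returns a library name or path, or None if not found
        "library": ctypes.util.find_library(library),
        # shutil.which returns the executable's path, or None if not on PATH
        "compiler": shutil.which(compiler),
        # map each required path to whether it exists on this machine
        "paths": {p: os.path.exists(p) for p in required_paths},
    }


if __name__ == "__main__":
    # "/hypothetical/input/dir" is a placeholder path for illustration only
    report = probe_environment(required_paths=["/hypothetical/input/dir"])
    for key, value in report.items():
        print(key, "->", value)
```

Running a check like this on each cluster shows how quickly the environments diverge, which is the problem the packaged VM is meant to avoid.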
Virtualization
• Is it appropriate for Scientific Computing?
– Performance has improved
– Pricing has improved (it’s become free)
Virtual Images
• Positives
– Completely portable
• Less administration
– Control entire environment within Virtual Image
• We can run any application in them
• We can bundle data control software within them
Virtual Images
• Negatives
– Large size
• GBs for virtual disks
– Performance Loss
• Virtualization is slower than running on hardware
VMware on Busby
• GROMACS test run on Busby1
[Figure: bar chart of run time in hours (axis from 0 to 2) for Hardware vs. VMware]
Future Directions
• Resolve performance anomaly
• More accurate timings of phases
• Run other applications
• Get all 4 nodes running concurrently
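Per-phase timing, as called for in the list above, could be collected with a small helper such as the following. This is a minimal sketch under stated assumptions: the phase names (`stage_in`, `compute`, `stage_out`) are hypothetical, the `time.sleep` calls stand in for real work, and `relative_overhead` is an illustrative helper, not a measurement from the talk.

```python
import time
from contextlib import contextmanager


class PhaseTimer:
    """Accumulate wall-clock seconds per named phase."""

    def __init__(self):
        self.seconds = {}

    @contextmanager
    def phase(self, name):
        start = time.perf_counter()
        try:
            yield
        finally:
            # record elapsed time even if the phase raises an exception
            elapsed = time.perf_counter() - start
            self.seconds[name] = self.seconds.get(name, 0.0) + elapsed


def relative_overhead(hardware_seconds, vm_seconds):
    """Fractional slowdown of the VM run relative to the hardware run."""
    return (vm_seconds - hardware_seconds) / hardware_seconds


if __name__ == "__main__":
    timer = PhaseTimer()
    # Hypothetical phases; replace the sleeps with the real steps of a run.
    with timer.phase("stage_in"):
        time.sleep(0.01)
    with timer.phase("compute"):
        time.sleep(0.02)
    with timer.phase("stage_out"):
        time.sleep(0.01)
    for name, secs in timer.seconds.items():
        print(f"{name}: {secs:.3f} s")
```

Timing the same phases once on hardware and once inside the VM, then comparing them with `relative_overhead`, would show which phase accounts for a slowdown.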