Beowulf Clusters


By Curtis Michels
What is a Beowulf Cluster?
Where did the name come from?
“I knew him of yore in his youthful days; his aged father was Ecgtheow named, to whom, at home, gave Hrethel the Geat his only daughter. Their offspring bold fares hither to seek the steadfast friend. And seamen, too, have said me this, — who carried my gifts to the Geatish court, thither for thanks, — he has thirty men's heft of grasp in the gripe of his hand, the bold-in-battle.”
Beowulf (description of Beowulf by Hrothgar, the Danish king)
Computer Cluster
What is a cluster?
- A group of computers connected to each other that perform parallel computing together.
- One type of cluster is a Commodity Cluster: a cluster that uses commercially available networks.
- There are four classes of Commodity Cluster:
  - Workstation Cluster: a cluster made up of workstations connected by a LAN.
  - Cluster Farm: a cluster that uses idle workstations within a local network to perform tasks.
  - Supercluster: a cluster of clusters within a local network.
  - Beowulf Cluster: described on the next slide.
Beowulf Cluster
What is a Beowulf Cluster?
- A class of Commodity Cluster.
- A cluster made up of commodity off-the-shelf (COTS) computers or components.
- A Beowulf cluster consists of one host and multiple clients (nodes) connected by an Ethernet network.
- It has become the most widely used parallel computer structure.
- Most applications are written in either Fortran or C.
[Picture: a Linux-based Beowulf Cluster, retrieved from http://www.copyright-free-images.com]
Advantages
1. Scalability
- All that is needed to increase the capabilities of the cluster is to add another computer.
- The cluster can be built in phases.
2. Convergence Architecture
- Over time, the Beowulf cluster has become a de facto standard.
- It is the most likely architecture to be chosen for parallel computing.
- Previously, the HPC industry constantly changed parallel architecture types, requiring software to be reworked across generations of parallel computers.
Advantages (continued)
3. Performance/Price
- Has a better performance-to-cost ratio than a single supercomputer.
- Because all components are COTS, with no custom parts, it is cheaper than a single supercomputer.
- A single supercomputer can cost millions of dollars.
- A Beowulf cluster with similar capability can cost thousands of dollars.
- With Linux, the performance-to-cost advantage is even higher.
4. Flexibility of configuration and upgrade
- The cluster can easily be configured for any application needing computing power.
- There is a wide variety of components to choose from when building a Beowulf Cluster.
- This flexibility makes it easier to upgrade the cluster when new technology comes out.
Advantages (continued)
5. Able to keep up with changes in technology
- Because nodes can easily be added to the system, the cluster can keep up with changes in technology.
- New technology can be integrated into the cluster as soon as it is available to consumers.
6. More reliable, with better fault tolerance
- A Beowulf cluster degrades gracefully in performance.
- As components fail, the number of available processors decreases, but the system continues to run.
- This is because a Beowulf cluster contains redundant hardware.
Advantages (continued)
7. Users have a higher level of control
- Installation of the system is easy.
- Administrators and users are able to control the structure, operation, and evolution of the system.
8. Maintenance is easier
- No special training is needed for maintenance, since the cluster is made of off-the-shelf components.
- Only basic computer maintenance knowledge is required.
- No special tools are needed.
Advantages (continued)
9. Development cost and time
- There is no need to design components, since they are all off-the-shelf parts.
- The components are cheaper than custom-made ones.
- Implementation time is shorter, since all the system designer has to do is pick the parts that give the system the desired capabilities.
- The system is quick to set up and configure, since software installed on one node can easily be copied to the other nodes (see the example below).
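For example, with the shared mpiuser login and the client address used later in the setup section, a program directory could be copied to a client over SSH; the directory name here is only illustrative:
$ scp -r /home/mpiuser/myprogram mpiuser@192.168.137.215:/home/mpiuser/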
Disadvantages
1. Programming is more difficult.
- Each processor has its own memory.
- Programs have to be written to take advantage of parallel processing.
- Bad code can make the system extremely slow, eliminating its advantages.
2. Fragmentation of program data.
- The program's data is split among the different systems.
- Once the calculation or simulation is finished, the data has to be recombined; when the data set gets large enough, recombining it takes the host too long.
3. Performance is limited by the speed of the network.
History of Cluster Computers
- In the 1950s, under IBM contracts with the United States Air Force, the SAGE system was developed; it was used by NORAD.
- This was the first computer cluster.
- It was based on MIT's Whirlwind computer architecture.
1970s
1. The microprocessor was developed with VLSI (very large scale integration) technology.
- This made it easier for multiple computers to be integrated with each other.
2. Ethernet was developed.
- The first widely used local area network technology.
- It created a standard for modestly priced interconnection devices and a data transport layer. (Sterling, p. 6)
3. Multitasking support was added to Unix.
1980s
1. 160 interconnected Apollo workstations were configured as a cluster to perform computational tasks for the NSA.
2. Task-management software for running a workstation farm was developed.
- UW-Madison developed the Condor software package.
3. PVM (Parallel Virtual Machine) was developed.
- A library of linkable functions that enabled computers on a network to pass messages, exchange data, and coordinate their efforts.
1990s
1. 1992: NASA Lewis Research Center
- Used a small cluster of IBM workstations to simulate the steady-state behavior of jet aircraft engines.
2. 1993: The NOW (Network of Workstations) project at UC Berkeley.
- The first of several clusters they developed, starting in 1993.
- One of those clusters was placed on the Top500 list of the world's most powerful computers.
1994
3. The first Beowulf Cluster
- Developed at the NASA Goddard Space Flight Center by Thomas Sterling and Donald Becker.
- It used an early release of Linux and PVM.
- It was made up of 16 computers:
  - Intel 100-MHz 80486-based
  - Connected by dual 10-Mbps Ethernet LANs
- The necessary LAN drivers for Linux had to be developed.
- Low-level cluster management tools were developed.
- This project demonstrated the performance-to-cost advantage that a Beowulf Cluster had for real-world scientific applications.
4. The first Message-Passing Interface (MPI) standard was adopted by the parallel computing community.
- This created a uniform set of message-passing semantics and syntax.
1996
5. DOE Los Alamos National Laboratory and the California Institute of Technology, with the NASA Jet Propulsion Laboratory:
- Demonstrated sustained performance of over 1 GFlops with a Beowulf system costing $50,000.
- They were awarded the Gordon Bell Prize for the price/performance of this accomplishment.
2000s
6. Compaq created a Beowulf Cluster capable of 30 TFlops.
- It received awards from both the DOE and the NSF.
- Fortran or C was used to write programs, using linkable libraries for message passing.
How to Set Up a Beowulf Cluster
- Requirements
- Software
- Setup and configuration of the cluster
Requirements
1. The host computer should be faster and have more memory than the clients.
2. The cluster works best if all computers use the same processor architecture (AMD, Intel, etc.).
3. A version of Linux or Windows.
4. Each computer needs an Ethernet card.
5. A network switch or router able to handle all the computers being used in the cluster.
Software
1. Condor (Red Hat or Debian)
- Supports distributed job streaming.
- Its management emphasizes capacity or throughput computing.
- Schedules independent jobs on cluster nodes to handle large user workloads.
- Offers many scheduling policy options.
2. PBS (Linux and Windows)
- A widely used system for distributing parallel user jobs across parallel Beowulf cluster resources.
- Provides administration tools for professional systems supervision.
Software (continued)
3. Maui (Linux)
- An advanced scheduler.
- Has policies and mechanisms for handling many user requests and resource states.
- Sits on top of other low-level management software.
4. Cluster Controller (Windows)
- Used at the Cornell Theory Center.
- Designed for Windows; it takes full advantage of the Windows environment.
Software (continued)
5. PVFS (Parallel Virtual File System) (Unix and Linux)
- Manages the secondary storage of a Beowulf Cluster.
- Provides parallel file management shared among the distributed nodes of the system.
- It delivers faster response and much higher effective disk bandwidth than NFS (Network File System).
6. Windows Server 2008 R2 HPC, Windows Server 2003 (Compute Cluster Edition)
- Able to automatically install nodes.
7. Open MPI (Linux)
- An open-source implementation of MPI-2.
- Can schedule processes.
Setup
Assumptions made for this setup:
- Two or more computers are connected via a TCP/IP network (I used two virtual machines).
- Each machine uses the same processor architecture.
- All machines have a common login name with the same password; for this example: mpiuser.
- Each machine shares the same /home folder or has the important folders synchronized (for this example, /home is shared).
- The computers have Debian 6.0.0 installed on them.
- Open MPI is the software used for the Beowulf cluster.
Setup (continued)
1. Install Linux on all machines (Debian is used for this example).
- The slave computers can have a minimal install.
- The master can have a full install.
2. Make sure that all the computers can communicate with each other (a quick check follows).
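For example, connectivity can be verified from the host with ping, using the client address that appears in the later steps:
$ ping 192.168.137.215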
Setup (continued)
Host
1. Install Open MPI.
- The packages needed are:
  - openmpi-bin
  - openmpi-common
  - libopenmpi1.3
  - libopenmpi-dev (not needed for clients)
$ apt-get install openmpi-bin openmpi-common libopenmpi1.3 libopenmpi-dev
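As a quick check (not part of the original slides), the installation can be verified by asking Open MPI for its version:
$ mpirun --version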
Setup (continued)
Host
2. Set up SSH (used to control the clients).
- The package openssh-client is needed:
$ apt-get install openssh-client
- Log in as mpiuser and create an SSH public/private key pair, protected by a password, in the file /home/mpiuser/.ssh/id_dsa, using:
$ ssh-keygen -t dsa
- Make each computer know that the user mpiuser is authorized to log in:
$ cp /home/mpiuser/.ssh/id_dsa.pub /home/mpiuser/.ssh/authorized_keys
- Fix the file permissions:
$ chmod 700 /home/mpiuser/.ssh
$ chmod 600 /home/mpiuser/.ssh/authorized_keys
- To test that the connection works:
$ ssh 192.168.137.215
Setup (continued)
Host
3. Configure Open MPI.
- Open MPI needs to be told which machines to run programs on, so a file storing that information has to be created. I created the file /home/mpiuser/.mpi_hostfile with the following contents:
# The hostfile for Open MPI
# The master node, 'slots=1' is used because it is a single-processor machine.
localhost slots=1
# The following slave nodes are single processor machines:
192.168.137.215
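If a slave had more than one processor core, its hostfile line could advertise additional slots. For example, a hypothetical quad-core client (this address does not appear in the setup above) would be listed as:
192.168.137.216 slots=4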
Setup (continued)
Host Machine
4. Make SSH not ask for a password.
- Open MPI uses SSH to connect to the slaves, so a password should not have to be entered:
$ eval `ssh-agent`
$ ssh-add ~/.ssh/id_dsa (tells the ssh-agent the password for the SSH key)
$ ssh 192.168.137.215 (to test)
Setup (continued)
Clients
1. Install Open MPI.
- The same as for the host, except do not install the package libopenmpi-dev.
2. Install the SSH server:
$ apt-get install openssh-server
Sample program for a Beowulf Cluster
A simple program that just sends random numbers to each node. This sample was written to use the setup above.
Sample code (testprogram.c)
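The code itself appeared in the original slides as images that did not survive the transcript. Below is a minimal sketch reconstructing testprogram.c from the description above and the sample output shown later; the polling loop, the one-second pause, and the random-letter choice are assumptions, not the original source.

/* testprogram.c
 * Reconstruction (not the original source): task 0 sends one random
 * character to every other task; each other task polls for the
 * message, printing "Waiting" until it arrives.
 */
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <unistd.h>
#include <mpi.h>

int main(int argc, char *argv[])
{
    int rank, size, tag = 1;
    int dest, flag, count;
    char outmsg, inmsg;
    MPI_Status status;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    if (rank == 0) {
        /* Master: send a random lowercase letter to each slave. */
        srand((unsigned) time(NULL));
        for (dest = 1; dest < size; dest++) {
            outmsg = (char) ('a' + rand() % 26);
            MPI_Send(&outmsg, 1, MPI_CHAR, dest, tag, MPI_COMM_WORLD);
            printf("Task %d: Sent message %d to task %d with tag %d\n",
                   rank, outmsg, dest, tag);
        }
    } else {
        /* Slave: poll until the master's message arrives, then report
         * how many characters were received and their value. */
        flag = 0;
        while (!flag) {
            printf("Waiting\n");
            MPI_Iprobe(0, tag, MPI_COMM_WORLD, &flag, &status);
            if (!flag)
                sleep(1); /* pause so the poll does not spin flat out */
        }
        MPI_Recv(&inmsg, 1, MPI_CHAR, 0, tag, MPI_COMM_WORLD, &status);
        MPI_Get_count(&status, MPI_CHAR, &count);
        printf("Task %d: Received %d char(s) (%d) from task %d with tag %d\n",
               rank, count, inmsg, status.MPI_SOURCE, status.MPI_TAG);
    }

    MPI_Finalize();
    return 0;
}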
The original code example was taken from http://techtinkering.com/2009/12/02/setting-up-a-beowulf-cluster-using-openmpi-on-linux/
Running program
1. The master node has to be running before the slaves are started.
2. Running a program on this Beowulf cluster isn't hard. Using testprogram.c:
- First, compile it:
$ mpicc testprogram.c
- To run the program with 20 processes on the local machine:
$ mpirun -np 20 ./a.out
- To run the program over the Beowulf Cluster (assuming .mpi_hostfile is in the current directory):
$ mpirun -np 5 --hostfile .mpi_hostfile ./a.out
- I used: $ mpirun -np 2 --hostfile .mpi_hostfile ./a.out
Output of sample program
Waiting
Waiting
Waiting
Waiting
Waiting
Waiting
Waiting
Waiting
Waiting
Waiting
Waiting
Task 1: Received 1 char(s) (103) from task 0 with tag 1
Task 0: Sent message 103 to task 1 with tag 1
Simulation of Nuclear Explosion
Los Alamos used a computer simulation to determine how the aging stockpile of nuclear weapons would behave, since testing is banned. (2009)
- The simulation revealed how individual atoms and molecules interact with each other.
- It had to run at a much higher resolution than what was used in the past.
- It was used to visualize a number of components within a data set: scalar fields, vector fields, cell-centered variables, vertex-centered variables, and polygon information.
- Cost: $35,000
- The cluster had one host and 15 clients.
- It was compared against an SGI Onyx2 supercomputer.
Beowulf Cluster used
1. Host hardware
- OS: Red Hat Linux 6.2
- 733-MHz processor
- Nvidia GeForce 2
- 2 GB of RAM
- 55-GB disk
2. Client hardware
- OS: Red Hat Linux 6.2
- 733-MHz processor
- Nvidia GeForce 2
- 1 GB of RAM
- 40-GB disk
3. Network hardware
- 100-Mbit Ethernet cards
- HP ProCurve 4000 switch
Beowulf Cluster in Space
- Singapore's first satellite, X-Sat, used a Beowulf cluster. (1995)
- The satellite was equipped with:
  - A 10-m resolution multispectral (color) camera to obtain pictures of Singapore and the region around it.
  - A radio link for an Australian-distributed sensor network.
  - A PPU (parallel processing unit).
- First use of Linux in space.
PPU components
The components in the PPU are:
- 20 × SA-1110 processors
- Peak performance: 4,000 MIPS (million instructions per second)
- Memory: 1,280 MB
- Size: 3,125 cm³
- Power consumption: 25 W
- Cost: $3,500
- Processing cost: 0.88 $/MIPS
- Processing volume: 0.78 cm³/MIPS
- Processing power: 6.25 mW/MIPS
- OS: Linux
Summary
1. Beowulf Clusters are made up of COTS computers and components.
2. Programs for Beowulf Clusters are written in either C or Fortran.
3. Beowulf Clusters are cheaper than traditional single-system supercomputers.
4. Their advantages over other supercomputers are scalability, convergence architecture, performance/price, flexibility of configuration and upgrade, a higher level of user control, easier maintenance, the ability to keep up with changes in technology, greater reliability and fault tolerance, and lower development cost and time.
5. Beowulf Clusters can be used for many applications.
References
Beowulf.org. n.d. <http://beowulf.org/index.html>.
Chiu, Steve. "Current issues in high performance computing I/O architectures and
systems." Journal of Supercomputing (2008): 105-107.
Cluster resources :: Products - Maui Cluster Scheduler. 2011. 23 October 2011.
<http://www.clusterresources.com/products/maui-cluster-scheduler.php>.
Condor High Throughput Computing. 12 October 2011. 23 October 2011.
<http://www.cs.wisc.edu/condor/>.
Heckendorn, Robert B. "Building a Beowulf: Leveraging Research and
Department Needs for Student Enrichment via Project Based Learning."
Computer Science Education December 2002: 255-273.
"Keener eyes for Beowulf." Mechanical Engineering June 2001: 78-79.
"Linux Clusters Serve Low End." InfoWorld 26.45 (2004): 21.
References (continued)
McLoughlin, Ian, Timo Bretschneider and Bharath Ramesh. "First Beowulf
Cluster in Space." Linux Journal September 2005: 34-38.
Open MPI: Open Source High Performance Computing. 3 October 2011. 23
October 2011. <http://www.open-mpi.org/>.
Parallel Virtual File System, version 2. n.d. 23 October 2011.
<http://www.pvfs.org/>.
Roach, Ronald. "Ball State creates supercomputer from old desktop
computers." Black Issues in Higher Education 19.5 (2002): 29.
Scyld Clusterware. n.d. 23 October 2011.
<http://www.penguincomputing.com/software/scyld_clusterware>.
References (continued)
Sterling, Thomas. Beowulf Cluster Computing with Linux. Massachusetts
Institute of Technology, 2002.
—. Beowulf Cluster Computing with Windows. Massachusetts Institute of
Technology, 2002.
Woodman, Lawrence. Setting up a Beowulf Cluster Using Open MPI on
Linux. 2 December 2009. 15 October 2011.
<http://techtinkering.com/2009/12/02/setting-up-a-beowulf-cluster-using-openmpi-on-linux/>.