Cluster Computing:
An Introduction
金仲達 (Chung-Ta King)
Department of Computer Science, National Tsing Hua University
[email protected]
Clusters Have Arrived
What is a Cluster?
A collection of independent computer systems
working together as if a single system
Coupled through a scalable, high bandwidth, low
latency interconnect
The nodes can exist in a single cabinet or be
separated and connected via a network
Faster, closer connection than a network (LAN)
Looser connection than a symmetric multiprocessor
Outline
Motivations of Cluster Computing
Cluster Classifications
Cluster Architecture & its Components
Cluster Middleware
Representative Cluster Systems
Task Forces on Cluster
Resources and Conclusions
Motivations of
Cluster Computing
How to Run Applications Faster?
There are three ways to improve performance:
Work harder
Work smarter
Get help
Computer analogy
Use faster hardware: e.g. reduce the time per instruction
(clock cycle)
Optimized algorithms and techniques
Multiple computers to solve problem
=> techniques of parallel processing are mature and can
be exploited commercially
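A standard way to quantify the limits of "get help" (background added here, not from the slides) is speedup and efficiency on p machines, together with Amdahl's bound when a fraction f of the work is inherently serial:

    S(p) = \frac{T(1)}{T(p)}, \qquad
    E(p) = \frac{S(p)}{p}, \qquad
    S(p) \le \frac{1}{f + (1 - f)/p}

For example, if 10% of a program is serial (f = 0.1), no number of cluster nodes can push the speedup past 10x.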
Motivation for Using Clusters
Performance of workstations and PCs is rapidly
improving
Communications bandwidth between computers is
increasing
Vast numbers of under-utilized workstations with
a huge number of unused processor cycles
Organizations are reluctant to buy large, high
performance computers, due to the high cost and
short useful life span
Motivation for Using Clusters
Workstation clusters are thus a cheap and readily
available approach to high performance computing
Clusters are easier to integrate into existing networks
Development tools for workstations are mature
Threads, PVM, MPI, DSM, C, C++, Java, etc.
Use of clusters as a distributed compute resource is
cost effective: incremental growth of the system!
Individual node performance can be improved by adding
additional resources (new memory blocks/disks)
New nodes can be added or nodes can be removed
Clusters of Clusters and Metacomputing
Key Benefits of Clusters
High performance: running cluster enabled programs
Scalability: add servers to the cluster, CPUs to SMP
nodes, or more clusters to the network as the need arises
High throughput
System availability (HA): offer inherent high system
availability due to the redundancy of hardware, operating
systems, and applications
Cost effectiveness
Why Cluster Now?
Hardware and Software Trends
Important advances have taken place in the last five years
Network performance increased with reduced cost
Workstation performance improved
Average number of transistors on a chip grows 40% per year
Clock frequency growth rate is about 30% per year
Expect 700-MHz processors with 100M transistors in early 2000
Availability of powerful and stable operating systems
(Linux, FreeBSD) with source code access
Why Clusters NOW?
Clusters gained momentum when three
technologies converged:
Very high performance microprocessors
workstation performance = yesterday's supercomputers
High-speed communication
Standard tools for parallel/ distributed
computing & their growing popularity
Time to market => performance
Internet services: huge demands for scalable,
available, dedicated internet servers
big I/O, big compute
Efficient Communication
The key enabling technology:
from killer micro to killer switch
Single chip building block for
scalable networks
high bandwidth
low latency
very reliable
Challenges for clusters
greater routing delay and less than
complete reliability
constraints on where the network
connects into the node
UNIX has a rigid device and
scheduling interface
Putting Them Together ...
Building block = complete computers
(HW & SW) shipped in 100,000s:
Killer micro, Killer DRAM, Killer disk,
Killer OS, Killer packaging, Killer investment
Leverage billion $ per year investment
Interconnecting building blocks => Killer Net
High bandwidth
Low latency
Reliable
Commodity (ATM, Gigabit Ethernet, Myrinet)
Windows of Opportunity
The resources available in the average cluster
offer a number of research opportunities, such as
Parallel processing: use multiple computers to build
MPP/DSM-like system for parallel computing
Network RAM: use the memory associated with each
workstation as an aggregate DRAM cache
Software RAID: use the arrays of workstation disks to
provide cheap, highly available, and scalable file
storage
Multipath communication: use the multiple networks
for parallel data transfer between nodes
Windows of Opportunity
Most high-end scalable WWW servers are clusters
end services (data, web, enhanced information services,
reliability)
Network mediation services also cluster-based
Inktomi traffic server, etc.
Clustered proxy caches, clustered firewalls, etc.
=> These object web applications are increasingly compute
intensive
=> These applications are an increasing part of
“scientific computing”
Classification of
Cluster Computers
Clusters Classification 1
Based on Focus (in Market)
High performance (HP) clusters
Grand challenge applications
High availability (HA) clusters
Mission critical applications
HA Clusters
Clusters Classification 2
Based on Workstation/PC Ownership
Dedicated clusters
Non-dedicated clusters
Adaptive parallel computing
Can be used for CPU cycle stealing
Clusters Classification 3
Based on Node Architecture
Clusters of PCs (CoPs)
Clusters of Workstations (COWs)
Clusters of SMPs (CLUMPs)
Clusters Classification 4
Based on Node Components Architecture & Configuration:
Homogeneous clusters
All nodes have similar configuration
Heterogeneous clusters
Nodes based on different processors and running
different OS
Clusters Classification 5
Based on Levels of Clustering:
Group clusters (# nodes: 2-99)
A set of dedicated/non-dedicated computers, mainly
connected by a SAN like Myrinet
Departmental clusters (# nodes: 99-999)
Organizational clusters (# nodes: many 100s)
Internet-wide clusters = Global clusters
(# nodes: 1000s to many millions)
Metacomputing
Clusters and Their
Commodity Components
Cluster Computer Architecture
Cluster Components...1a
Nodes
Multiple high performance components:
PCs
Workstations
SMPs (CLUMPS)
Distributed HPC systems leading to
Metacomputing
They can be based on different architectures and
running different OS
Cluster Components...1b
Processors
There are many (CISC/RISC/VLIW/Vector..)
Intel: Pentiums, Xeon, Merced…
Sun: SPARC, ULTRASPARC
HP PA
IBM RS6000/PowerPC
SGI MIPS
Digital Alphas
Integrating memory, processing and networking
into a single chip
IRAM (CPU & Mem): (http://iram.cs.berkeley.edu)
Alpha 21364 (CPU, Memory Controller, NI)
Cluster Components…2
OS
State of the art OS:
Tend to be modular: can easily be extended, and new
subsystems can be added without modifying the
underlying OS structure
Multithreading has added a new dimension to parallel
processing (see the sketch after the list below)
Popular OS used on nodes of clusters:
Linux (Beowulf)
Microsoft NT (Illinois HPVM)
SUN Solaris (Berkeley NOW)
IBM AIX (IBM SP2)
…..
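As a hedged illustration of the multithreading point above (plain POSIX threads, available on all the systems listed; the array size and thread count are made up), this C sketch splits an array sum across four threads:

    #include <pthread.h>
    #include <stdio.h>

    #define NTHREADS 4
    enum { N = 1000000 };

    /* Each worker sums one strip of the array. */
    typedef struct { const double *data; int lo, hi; double sum; } task_t;

    static void *worker(void *arg) {
        task_t *t = (task_t *)arg;
        t->sum = 0.0;
        for (int i = t->lo; i < t->hi; i++)
            t->sum += t->data[i];
        return NULL;
    }

    static double data[N];

    int main(void) {
        for (int i = 0; i < N; i++) data[i] = 1.0;

        pthread_t tid[NTHREADS];
        task_t tasks[NTHREADS];
        for (int k = 0; k < NTHREADS; k++) {
            tasks[k] = (task_t){ data, k * N / NTHREADS,
                                 (k + 1) * N / NTHREADS, 0.0 };
            pthread_create(&tid[k], NULL, worker, &tasks[k]);
        }

        double total = 0.0;
        for (int k = 0; k < NTHREADS; k++) {
            pthread_join(tid[k], NULL);   /* wait, then combine partial sums */
            total += tasks[k].sum;
        }
        printf("sum = %f\n", total);      /* expect 1000000.0 */
        return 0;
    }

Compiled with cc -pthread; on an SMP cluster node the four strips can run on different CPUs.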
Cluster Components…3
High Performance Networks
Ethernet (10Mbps)
Fast Ethernet (100Mbps)
Gigabit Ethernet (1Gbps)
SCI (Dolphin, ~12 usec MPI latency)
ATM
Myrinet (1.2Gbps)
Digital Memory Channel
FDDI
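What separates these networks is not just peak bandwidth but latency. A usual first-order cost model is T(m) = latency + m/bandwidth; the C sketch below compares a 4 KB transfer. Only the bandwidths come from the list above; the latency figures are rough assumptions for illustration.

    #include <stdio.h>

    /* First-order model of message transfer time:
       T(m) = latency + m / bandwidth. */
    struct net { const char *name; double latency_s; double bw_bytes_per_s; };

    int main(void) {
        struct net nets[] = {
            { "Fast Ethernet    (100 Mbps)", 100e-6, 100e6 / 8 },  /* assumed latency */
            { "Gigabit Ethernet (1 Gbps)",    50e-6, 1e9   / 8 },  /* assumed latency */
            { "Myrinet          (1.2 Gbps)",  12e-6, 1.2e9 / 8 },  /* assumed latency */
        };
        double m = 4096.0;  /* one 4 KB message */
        for (int i = 0; i < 3; i++) {
            double t = nets[i].latency_s + m / nets[i].bw_bytes_per_s;
            printf("%-30s %7.1f us\n", nets[i].name, t * 1e6);
        }
        return 0;
    }

As messages shrink, the latency term dominates; that is why low-latency interconnects like Myrinet and SCI matter to clusters even when raw bandwidths look similar.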
Cluster Components…4
Network Interfaces
Dedicated processing power and storage embedded in
the network interface
An I/O card today
Tomorrow on chip?
[Diagram: a Sun Ultra 170 node (processor, memory, cache) attached to the 160 MB/s Myricom network through a Myricom NIC on the node's 50 MB/s I/O bus (S-Bus).]
Cluster Components…4
Network Interfaces
Network interface card
Myrinet has a programmable NIC (processor and memory on the card)
User-level access support: VIA
Alpha 21364 processor integrates processing, memory
controller, and network interface into a single chip
Cluster Components…5
Communication Software
Traditional OS-supported facilities (but heavyweight
due to protocol processing)
Sockets (TCP/IP), Pipes, etc.
Lightweight protocols (user-level): minimal interface
into the OS
Users transmit directly into and receive directly from the
network without OS intervention
Communication protection domains established by
interface card and OS
Treat message loss as an infrequent case
Active Messages (Berkeley), Fast Messages (UI), ...
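To make the contrast concrete, here is a minimal sketch of the traditional, kernel-mediated path: a TCP client pushing one message through the sockets API (the address and port are illustrative). Each send() traps into the OS and runs the full TCP/IP stack, exactly the per-message cost that user-level schemes such as Active Messages avoid.

    #include <stdio.h>
    #include <string.h>
    #include <unistd.h>
    #include <netinet/in.h>
    #include <arpa/inet.h>
    #include <sys/socket.h>

    int main(void) {
        int fd = socket(AF_INET, SOCK_STREAM, 0);  /* already a system call */
        if (fd < 0) { perror("socket"); return 1; }

        struct sockaddr_in peer;
        memset(&peer, 0, sizeof peer);
        peer.sin_family = AF_INET;
        peer.sin_port = htons(5000);                        /* illustrative port */
        inet_pton(AF_INET, "192.168.1.10", &peer.sin_addr); /* illustrative node */

        /* Three-way handshake; connection state lives in the kernel. */
        if (connect(fd, (struct sockaddr *)&peer, sizeof peer) < 0) {
            perror("connect"); return 1;
        }

        /* Every send() crosses the user/kernel boundary and runs the
           whole protocol stack before bytes reach the wire. */
        const char msg[] = "hello, cluster";
        send(fd, msg, sizeof msg, 0);

        close(fd);
        return 0;
    }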
Cluster Components…6a
Cluster Middleware
Resides between OS and applications and offers
an infrastructure for supporting:
Single System Image (SSI)
System Availability (SA)
SSI makes collection of computers appear as a
single machine (globalized view of system
resources)
SA supports checkpointing and process migration, etc.
Cluster Components…6b
Middleware Components
Hardware
DEC Memory Channel, DSM (Alewife, DASH), SMP
techniques
OS/gluing layers
Solaris MC, Unixware, Glunix
Applications and Subsystems
System management and electronic forms
Runtime systems (software DSM, PFS etc.)
Resource management and scheduling (RMS):
CODINE, LSF, PBS, NQS, etc.
Cluster Components…7a
Programming Environments
Threads (PCs, SMPs, NOW, ..)
POSIX Threads
Java Threads
MPI
Linux, NT, on many Supercomputers
PVM
Software DSMs (Shmem)
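As a minimal, hedged sketch of the message-passing style that MPI (and similarly PVM) supports, each rank below computes a partial value and rank 0 collects the results; the computation itself is made up for illustration.

    #include <stdio.h>
    #include <mpi.h>

    int main(int argc, char **argv) {
        MPI_Init(&argc, &argv);

        int rank, size;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);  /* which process am I?  */
        MPI_Comm_size(MPI_COMM_WORLD, &size);  /* how many processes?  */

        int value = rank * rank;  /* each node computes its own piece */

        if (rank == 0) {
            int sum = value, v;
            for (int src = 1; src < size; src++) {
                MPI_Recv(&v, 1, MPI_INT, src, 0, MPI_COMM_WORLD,
                         MPI_STATUS_IGNORE);
                sum += v;
            }
            printf("sum of squares over %d ranks = %d\n", size, sum);
        } else {
            MPI_Send(&value, 1, MPI_INT, 0, 0, MPI_COMM_WORLD);
        }

        MPI_Finalize();
        return 0;
    }

Built and launched with the usual wrappers (e.g. mpicc prog.c, then mpirun -np 4 a.out), the same program runs unchanged on a cluster of workstations or an SMP.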
Cluster Components…7b
Development Tools?
Compilers
C/C++/Java
RAD (rapid application development tools):
GUI based tools for parallel processing modeling
Debuggers
Performance monitoring and analysis tools
Visualization tools
Cluster Components…8
Applications
Sequential
Parallel/distributed (cluster-aware applications)
Grand challenge applications
Weather forecasting
Quantum Chemistry
Molecular Biology Modeling
Engineering Analysis (CAD/CAM)
……………….
Web servers, data-mining
Cluster Middleware
and
Single System Image
Middleware Design Goals
Complete transparency
Let users see a single cluster system
Single entry point, ftp, telnet, software loading...
Scalable performance
Easy growth of cluster
no change of API and automatic load distribution
Enhanced availability
Automatic recovery from failures
Employ checkpointing and fault tolerant technologies
Handle consistency of data when replicated
Single System Image (SSI)
A single system image is the illusion, created by
software or hardware, that a collection of
computers appears as a single computing resource
Benefits:
Usage of system resources transparently
Improved reliability and higher availability
Simplified system management
Reduction in the risk of operator errors
User need not be aware of the underlying system
architecture to use these machines effectively
Desired SSI Services
Single entry point
telnet cluster.my_institute.edu
(rather than telnet node1.cluster.my_institute.edu; one
common approximation is round-robin DNS, sketched after this list)
Single file hierarchy: AFS, Solaris MC Proxy
Single control point: manage from single GUI
Single virtual networking
Single memory space - DSM
Single job management: Glunix, Codine, LSF
Single user interface: like workstation/PC
windowing environment
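One common way to approximate the single entry point above (an illustration, not a mechanism the slide prescribes) is round-robin DNS, where the cluster name resolves to a different node on each lookup. This hedged C sketch lists every address behind the illustrative name from the slide:

    #include <stdio.h>
    #include <string.h>
    #include <netdb.h>
    #include <sys/socket.h>
    #include <netinet/in.h>
    #include <arpa/inet.h>

    int main(void) {
        struct addrinfo hints, *res, *p;
        memset(&hints, 0, sizeof hints);
        hints.ai_family   = AF_INET;
        hints.ai_socktype = SOCK_STREAM;

        /* With round-robin DNS, this one name maps to several nodes. */
        if (getaddrinfo("cluster.my_institute.edu", "telnet",
                        &hints, &res) != 0) {
            fprintf(stderr, "lookup failed\n");
            return 1;
        }
        for (p = res; p != NULL; p = p->ai_next) {
            char buf[INET_ADDRSTRLEN];
            struct sockaddr_in *sa = (struct sockaddr_in *)p->ai_addr;
            inet_ntop(AF_INET, &sa->sin_addr, buf, sizeof buf);
            printf("%s\n", buf);  /* each address is one node behind the name */
        }
        freeaddrinfo(res);
        return 0;
    }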
SSI Levels
Single system support can exist at different levels
within a system; one can be built on top of another
Application and Subsystem Level
Operating System Kernel Level
Hardware Level
Availability Support Functions
Single I/O space (SIO)
Any node can access any peripheral or disk device
without knowledge of its physical location
Single process space (SPS)
Any process can create processes on any node, and they
can communicate through signals, pipes, etc., as if they
were on a single node
Checkpointing and process migration (see the sketch below)
Saves the process state and intermediate results in memory
or disk; process migration for load balancing
Reduction in the risk of operator errors
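A hedged sketch of what checkpointing can look like at the application level (the file name, interval, and loop are all illustrative; SSI middleware aims to do this transparently): the program periodically saves its state and, after a failure, resumes from the last checkpoint.

    #include <stdio.h>

    #define CKPT_FILE  "state.ckpt"  /* illustrative checkpoint file  */
    #define CKPT_EVERY 1000          /* illustrative interval (iters) */

    struct state { long iter; double acc; };

    static void save(const struct state *s) {
        FILE *f = fopen(CKPT_FILE, "wb");
        if (f) { fwrite(s, sizeof *s, 1, f); fclose(f); }
    }

    static int restore(struct state *s) {
        FILE *f = fopen(CKPT_FILE, "rb");
        if (!f) return 0;
        int ok = (fread(s, sizeof *s, 1, f) == 1);
        fclose(f);
        return ok;
    }

    int main(void) {
        struct state s = { 0, 0.0 };
        if (restore(&s))
            printf("resuming at iteration %ld\n", s.iter);

        for (; s.iter < 100000; s.iter++) {
            if (s.iter % CKPT_EVERY == 0)
                save(&s);  /* state is consistent: this iteration not yet done */
            s.acc += 1.0 / (double)(s.iter + 1);  /* the "work" */
        }
        printf("result = %f\n", s.acc);
        return 0;
    }

The same saved state is what a migration facility would ship to another node to continue the process there.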
Relationship among Middleware
Modules
Strategies for SSI
Build as a layer on top of existing OS (e.g. Glunix)
Benefits:
Makes the system quickly portable, tracks vendor
software upgrades, and reduces development time
New systems can be built quickly by mapping new
services onto the functionality provided by the layer
beneath, e.g. Glunix/Solaris-MC
Build SSI at the kernel level (True Cluster OS)
Good, but can't leverage OS improvements by the vendor
e.g. Unixware and Mosix (built using BSD Unix)
Representative Cluster Systems
Research Projects of Clusters
Beowulf: CalTech, JPL, and NASA
Condor: University of Wisconsin-Madison
DQS (Distributed Queuing System): Florida State U.
HPVM (High Performance Virtual Machine): UIUC & UCSB
Gardens: Queensland U. of Technology, AU
NOW (Network of Workstations): UC Berkeley
PRM (Prospero Resource Manager): USC
Commercial Cluster Software
Codine (Computing in Distributed Network
Environment): GENIAS GmbH, Germany
LoadLeveler: IBM Corp.
LSF (Load Sharing Facility): Platform Computing
NQE (Network Queuing Environment): Craysoft
RWPC: Real World Computing Partnership, Japan
Unixware: SCO
Solaris-MC: Sun Microsystems
Comparison of 4 Cluster Systems
Task Forces
on Cluster Computing
IEEE Task Force on Cluster
Computing (TFCC)
http://www.dgs.monash.edu.au/~rajkumar/tfcc/
http://www.dcs.port.ac.uk/~mab/tfcc/
TFCC Activities
Mailing list, workshops, conferences, tutorials,
web-resources etc.
Resources for introducing the subject at senior
undergraduate and graduate levels
Tutorials/workshops at IEEE Chapters
….. and so on.
Visit TFCC Page for more details:
http://www.dgs.monash.edu.au/~rajkumar/tfcc/
Efforts in Taiwan
PC Farm Project at Academia Sinica Computing
Center: http://www.pcf.sinica.edu.tw/
NCHC PC Cluster Project:
http://www.nchc.gov.tw/project/pccluster/
NCHC PC Cluster
A Beowulf class cluster
System Hardware
5 Fast Ethernet switching hubs
System Software
Conclusions
Clusters are promising and fun
Offer incremental growth and match funding patterns
New trends in hardware and software
technologies are likely to make clusters more
promising
Cluster-based HP and HA systems can be seen
everywhere!
The Future
Cluster systems using idle cycles from computers
will continue
Individual nodes will have multiple processors
Widespread usage of Fast and Gigabit Ethernet; they
will become the de facto networks for clusters
Cluster software will bypass the OS as much as possible
Unix-based OS are likely to be most popular, but
the steady improvement and acceptance of NT will
not be far behind
The Challenges
Programming
enable applications, reduce programming effort,
distributed object/component models?
Reliability (RAS)
reliability with scalability to 1000's
Heterogeneity
performance, configuration, architecture and interconnect
Resource Management (scheduling, perf. pred.)
System Administration/Management
Input/Output (both network and storage)
Pointers to Literature on
Cluster Computing
Reading Resources..1a
Internet & WWW
Computer architecture:
http://www.cs.wisc.edu/~arch/www/
PFS and parallel I/O:
http://www.cs.dartmouth.edu/pario/
Linux parallel processing:
http://yara.ecn.purdue.edu/~pplinux/Sites/
Distributed shared memory:
http://www.cs.umd.edu/~keleher/dsm.html
Reading Resources..1b
Internet & WWW
Solaris-MC:
http://www.sunlabs.com/research/solaris-mc
Microprocessors: recent advances
http://www.microprocessor.sscc.ru
Beowulf:
http://www.beowulf.org
Metacomputing
http://www.sis.port.ac.uk/~mab/Metacomputing/
Reading Resources..2
Books
In Search of Clusters
by G. Pfister, Prentice Hall (2nd ed.), 1998
High Performance Cluster Computing
Volume 1: Architectures and Systems
Volume 2: Programming and Applications
Edited by Rajkumar Buyya, Prentice Hall, NJ,
USA.
Scalable Parallel Computing
by K. Hwang & Z. Xu, McGraw-Hill, 1998
Reading Resources..3
Journals
“A Case for NOW (Networks of Workstations)”
by Anderson, Culler, and Patterson, IEEE Micro, Feb 1995
“Fault Tolerant COW with SSI”
by Kai Hwang, Chow, Wang, Jin, and Xu, IEEE Concurrency
“Cluster Computing: The Commodity Supercomputing”
by Mark Baker & Rajkumar Buyya, Journal of Software
Practice and Experience