October 26, 1998

How fast are fast computers?
Xing Cai
October 26, 1998
Overview
• Modern fast computers at a glance
• Fast computers & scientific computing
• A closer look at SC performance
• Current situation & future trends
• Concluding remarks
An indirect answer
The slowest fast computer is faster than
the fastest slow computer.
http://www.top500.org
• Performance ranking of the world’s 500 most powerful computers
• LINPACK benchmark (floating-point intensive)
• J. Dongarra, H. Meuer, E. Strohmaier
• Report every 6 months since June 1993
• A good corrective to theoretical peak performance
Performance scale: KFlops (10^3 Flops), MFlops (10^6), GFlops (10^9), TFlops (10^12)
http://www.top500.org, 6/98

Rank  Vendor/Type      Rmax (GFlops)  Rpeak (GFlops)  #Proc  Location        Field
1     Intel ASCI Red   1,338          1,830           9,152  Sandia Lab, US  Academic
2     SGI T3E 1200     891.5          1,296           1,080  Government, US  Classified
3     SGI T3E 900      634.2          1,123           1,248  Government, US  Classified
4     SGI T3E 900      450.5          756             840    UK MET          Research
5     SGI T3E          448.6          614.4           1,024  NASA            Research
6     Hitachi/Tsukuba  368.2          614.4           2,048  Univ. Tsukuba   Research
7     SGI T3E          342.8          470.4           784    MPG, Germany    Research
120   SGI Origin2000   40.25          49.92           128    UiB, Norway     Academic
135   SGI T3E          38.58          52.80           88     NTNU, Norway    Academic
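The gap between Rmax and Rpeak in the June 1998 list above can be expressed as a LINPACK efficiency, Rmax divided by Rpeak. The following is a minimal sketch of that ratio; the function name `linpack_efficiency` and the `SYSTEMS` list are illustrative and not part of the talk, and the figures are simply copied from the list.

```python
# Rmax and Rpeak in GFlops, copied from the June 1998 list above.
SYSTEMS = [
    (1, "Intel ASCI Red", 1338.0, 1830.0),
    (2, "SGI T3E 1200", 891.5, 1296.0),
    (3, "SGI T3E 900", 634.2, 1123.0),
    (4, "SGI T3E 900", 450.5, 756.0),
    (5, "SGI T3E", 448.6, 614.4),
    (6, "Hitachi/Tsukuba", 368.2, 614.4),
    (7, "SGI T3E", 342.8, 470.4),
    (120, "SGI Origin2000", 40.25, 49.92),
    (135, "SGI T3E", 38.58, 52.80),
]


def linpack_efficiency(rmax, rpeak):
    """Fraction of the theoretical peak reached on the LINPACK benchmark."""
    return rmax / rpeak


for rank, name, rmax, rpeak in SYSTEMS:
    percent = 100 * linpack_efficiency(rmax, rpeak)
    print(f"{rank:>4}  {name:<16} {percent:5.1f}% of peak")
```

Even on this benchmark, none of the listed systems reaches its theoretical peak.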
ASCI Red TFLOPS
85 cabinets, 9216 Intel Pentium Pro processors
http://www.sandia.gov/ASCI/Red/main.html
Some “high-end” computers
• SGI Cray T3E 1200
• SGI Cray Origin 2000
• Fujitsu VPP 700
• NEC SX-4
• IBM RS/6000 SP
Vendor overview
http://www.netlib.org/utk/people/JackDongarra/top500-698/
Scientific computing 50 years
ENIAC - world’s 1st electronic computer for scientific computing
Advance in hardware
• Rapid advance of microprocessor technology
• World’s most powerful computer
– ENIAC 330 Flops, 1946
– Digital Alpha-21164 processor 1.2 GFlops, 1997
• World’s most powerful computing site
– ONR 583.73 KFlops, 1956 http://www.cnct.com/~gunter
– NSA 4,088.76 GFlops, 1998-Oct-14
“If car industry had made equal progress, you could buy a
car for a few $, drive across US in a few minutes, and park
it in your pocket!”
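The hardware figures quoted above imply an average growth rate. Here is a minimal sketch of that arithmetic, assuming smooth exponential growth between the two quoted years; the function names are illustrative and not from the talk.

```python
import math


def annual_growth_factor(start_flops, end_flops, years):
    """Average yearly multiplication factor, assuming exponential growth."""
    return (end_flops / start_flops) ** (1.0 / years)


def doubling_time_years(start_flops, end_flops, years):
    """Years needed to double performance under the same assumption."""
    return years * math.log(2) / math.log(end_flops / start_flops)


# Single machine/processor: ENIAC (1946) to Alpha 21164 (1997).
print(annual_growth_factor(330.0, 1.2e9, 1997 - 1946))
print(doubling_time_years(330.0, 1.2e9, 1997 - 1946))

# Computing site: ONR (1956) to NSA (1998).
print(annual_growth_factor(583.73e3, 4088.76e9, 1998 - 1956))
print(doubling_time_years(583.73e3, 4088.76e9, 1998 - 1956))
```

Under this assumption the single-machine figures correspond to a doubling roughly every two to two and a half years, and the site figures to a doubling in a little under two years.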
Scientific computing today
Earth & environment
DNA modelling &
medical research
http://www.psc.edu/science/projects.html
Grand challenge
“Fundamental problem in science or
engineering, with potentially broad
economic, political and/or scientific
impact, that could be advanced by
applying high performance computing
resources.”
Keyword: simulation
Numerical simulation
3rd paradigm of science!
Phys. phenomenon → Math. model → Algorithm → Software → Hardware
Advance in numerics
• Solution of Poisson’s equation: $\nabla^2 u = f$
Linear system with sparse matrices: $Ax = b$
    Banded LU     $O(n^{7/3})$
    Jacobi        $O(n^2)$
    Conj. Grad.   $O(n^{3/2})$
    Multigrid     $O(n)$
• For “standard” size $n = 10^6$ (100x100x100)
– Multigrid: 14.42 seconds, 56 MBytes
– Banded LU: 232.96 days, 160 GBytes
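The operation counts and memory figures on this slide can be reproduced approximately with a few lines of arithmetic. This is a sketch under assumptions that are not stated in the talk: 8-byte double precision values, a half-bandwidth of 100x100 = 10^4 for the banded matrix with the full band stored, and 7 stored values per unknown for multigrid. The last number was chosen only because it reproduces the quoted 56 MBytes. All function names are illustrative.

```python
BYTES_PER_DOUBLE = 8  # assumption: double precision storage


def operation_scaling(n):
    """Leading-order operation counts from the slide, constants omitted."""
    return {
        "Banded LU": n ** (7 / 3),
        "Jacobi": n ** 2,
        "Conj. Grad.": n ** 1.5,
        "Multigrid": float(n),
    }


def banded_storage_bytes(n, half_bandwidth):
    """Assumes about 2 * half_bandwidth stored entries per matrix row."""
    return n * 2 * half_bandwidth * BYTES_PER_DOUBLE


def multigrid_storage_bytes(n, values_per_unknown=7):
    """Assumes a fixed number of stored values per unknown (hypothetical)."""
    return n * values_per_unknown * BYTES_PER_DOUBLE


n = 100 * 100 * 100
for method, count in operation_scaling(n).items():
    print(f"{method:<12} ~{count:.1e} operations")
print(banded_storage_bytes(n, 100 * 100) / 1e9, "GBytes for banded LU")
print(multigrid_storage_bytes(n) / 1e6, "MBytes for multigrid")
```

With these assumptions, the banded factorization needs about 160 GBytes and on the order of 10^14 operations, against 56 MBytes and on the order of 10^6 operations (times a constant) for multigrid.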
How fast (and big) should fast
computers be?
Global weather prediction
• Navier-Stokes on 3D grid for the earth
• 100 m cells, 100 levels - $5 \times 10^{12}$ cells
• 5 variables per cell - 200 TBytes
• 100 Flops/cell/minute
• Required performance: 8TFlops
There is never enough computing power?
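The weather-prediction estimate can be checked step by step. This is a minimal sketch, assuming an Earth surface area of about 5.1 x 10^14 m^2 and 8 bytes per stored variable; neither value is given on the slide, and the function names are illustrative.

```python
EARTH_SURFACE_M2 = 5.1e14  # assumption: approximate surface area of the Earth
BYTES_PER_VALUE = 8        # assumption: double precision variables


def cell_count(cell_edge_m, levels, surface_m2=EARTH_SURFACE_M2):
    """Number of grid cells for square surface cells stacked in levels."""
    return surface_m2 / cell_edge_m ** 2 * levels


def memory_bytes(cells, variables_per_cell):
    return cells * variables_per_cell * BYTES_PER_VALUE


def required_flops(cells, flops_per_cell_per_minute):
    """Sustained floating-point rate in Flops (operations per second)."""
    return cells * flops_per_cell_per_minute / 60.0


print(cell_count(100.0, 100))            # about 5e12 cells
print(memory_bytes(5e12, 5) / 1e12)      # TBytes
print(required_flops(5e12, 100) / 1e12)  # TFlops
```

Using the slide's rounded 5 x 10^12 cells, this gives 200 TBytes and a little over 8 TFlops, in line with the quoted figures.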
Electrical potential
depolarization in human heart
• Grid node spacing 2 nodes/mm
• Estimated 3D grid - 4,200,000 nodes
• Estimated CPU time - one processor
– CPU time per node: 3.3 seconds
– total: 4,200,000 x 3.3 s, about 160 days
• Elapsed physical time: 300 ms
We need parallel computing
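The 160-day estimate, and what parallel computing could do to it, can be sketched as follows. The parallel part assumes perfect speedup with no communication cost, which is an idealization and not a claim from the talk; the function names are illustrative.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def serial_cpu_days(nodes, cpu_seconds_per_node):
    """Total single-processor CPU time in days."""
    return nodes * cpu_seconds_per_node / SECONDS_PER_DAY


def ideal_parallel_days(serial_days, processors):
    """Elapsed time assuming perfect speedup (an idealization)."""
    return serial_days / processors


days = serial_cpu_days(4_200_000, 3.3)
print(f"one processor: {days:.1f} days")
for processors in (16, 128, 1024):
    print(f"{processors:>5} processors: {ideal_parallel_days(days, processors):.2f} days")
```

Even under this optimistic assumption, on the order of a hundred processors are needed to bring the run down to about a day.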
http://www.ifi.uio.no/~xingca/HEART/
Parallel computing
• We are approaching the limit of single
microprocessor performance
• We want to run larger simulations
• We want shorter simulation time
• More cost-effective computing
Oil reservoir simulation
Simulation of 1000 days of gas injection
• Single-processor workstation simulation
– one day for 80,000 unknowns
– 10 days for 800,000 unknowns
– 200 days for 32,000,000 unknowns (impossible)
• Efficient parallel computing
– 128 processor IBM SP
– 23 minutes for 32,000,000 unknowns (PETSc)
Importance of efficient parallel computing!
http://www.mcs.anl.gov/petsc/petsc.html
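Taking the quoted figures for the 32,000,000-unknown case at face value, the ratio between the two run times can be computed directly. This is a small sketch with illustrative names; it says nothing about how the original runs were set up.

```python
MINUTES_PER_DAY = 24 * 60


def speedup(serial_minutes, parallel_minutes):
    """Ratio of single-processor time to parallel time."""
    return serial_minutes / parallel_minutes


serial_minutes = 200 * MINUTES_PER_DAY   # 200 days on one workstation
overall = speedup(serial_minutes, 23)    # 23 minutes on the IBM SP
per_processor = overall / 128            # 128 processors

print(f"overall ratio: {overall:.0f}x")
print(f"ratio per processor: {per_processor:.1f}x")
```

The ratio is far larger than the processor count, so the gain presumably does not come from parallelism alone; faster processors and more efficient solver software would also contribute, which fits the slide's emphasis on efficient parallel computing.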
Main question
Actual performance of real-life SC
applications is well below the peak
performance. Why?
LINPACK benchmark revisited
• Direct solution of dense matrix systems
• Limited application in SC
• Simple data structure
• Close to artificial test problem
• Only a more realistic upper bound on achievable performance - 20% of reported performance can be expected
Characteristics of SC
• Data intensive computing
– 1 GFlops requires memory bandwidth of 24 GB/s (example: DAXPY)
– Memory hierarchy
• Complex data structure
– Sparse matrices
– Structured grid vs unstructured grid
– Adaptive grid refinement
• Communication & synchronization
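The DAXPY example can be made concrete. DAXPY updates a vector as y <- a*x + y; each element update loads x[i] and y[i] and stores y[i]. The sketch below assumes 8-byte doubles and no reuse of data from cache, and it counts one element update as one operation, which is the counting that reproduces the 24 GB/s figure. Counting the multiply and the add as two separate flops would halve the requirement per flop. The function names are illustrative.

```python
BYTES_PER_DOUBLE = 8  # assumption: double precision


def daxpy(a, x, y):
    """Plain-Python DAXPY: y <- a*x + y, updated in place and returned."""
    for i in range(len(x)):
        y[i] = a * x[i] + y[i]
    return y


def bytes_per_update():
    """Memory traffic per element: load x[i], load y[i], store y[i]."""
    return 3 * BYTES_PER_DOUBLE


def required_bandwidth(updates_per_second):
    """Bytes per second needed if nothing is served from cache."""
    return updates_per_second * bytes_per_update()


print(daxpy(2.0, [1.0, 2.0], [10.0, 20.0]))
print(required_bandwidth(1e9) / 1e9, "GB/s for 1e9 updates per second")
```

The point of the example is that the arithmetic is trivial compared with the data movement, so memory bandwidth rather than processor speed limits performance.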
Multigrid method
• Well suited to large sparse systems
– asymptotically optimal operation count
– fewer than 100 floating-point operations per unknown
• Complex data structure
• Relatively low performance
Stals & Rüde - Techniques for improving the data
locality of iterative methods
Architecture bottleneck
• Imbalance between processor speed
and memory access speed
– Processor speed annual increase >= 60%
– Memory access speed annual increase
5%-10%
• Inter-processor communication latency
& bandwidth
• Memory size
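The imbalance between the two growth rates compounds over time. The following is a minimal sketch of how the processor-to-memory speed ratio widens, assuming the quoted annual rates stay constant; the ten-year horizon and the function name are illustrative choices, not figures from the talk.

```python
def relative_gap(cpu_annual_increase, memory_annual_increase, years):
    """Factor by which processor speed outgrows memory access speed."""
    yearly_ratio = (1.0 + cpu_annual_increase) / (1.0 + memory_annual_increase)
    return yearly_ratio ** years


# Processor +60% per year against memory +10% or +5% per year.
print(relative_gap(0.60, 0.10, 10))
print(relative_gap(0.60, 0.05, 10))
```

Under these assumptions the gap grows by a factor of roughly 40 to 70 in a decade, which is why data locality matters more each year.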
SC software today
• Inefficient (not very cache-aware)
• Not very portable
• Not very easy to maintain
• Not very user-friendly
• Hard to program real-life applications
• Limited compiler parallelism
– Hard to program parallel codes
O-O numerical software
• Better representation of mathematics
• Manpower effective
• Stable code, easy maintenance
• Good flexibility & extensibility
• Structured & efficient parallelization
• Need care for efficiency
• Standard is not settled yet
Trend in architecture
http://www.netlib.org/utk/people/JackDongarra/top500-698/
Trend in CPU technology
http://www.netlib.org/utk/people/JackDongarra/top500-698/
Future trends
• Progress of semiconductor technology
– over $10^9$ transistors per chip in future
– increased on-chip parallelism
• Architecture changes are needed
• Impact on scientific computing
– Rüde: Technological trends and their impact on the future of supercomputers
• Different levels of parallelism
Metacomputing
• Demand for enormous computing power
– US Air Force battle simulation (8 US supercomputing centers)
– Unicore project (link supercomputers in Germany and US)
• Better utilization of idle computing power
• “Seamless web” - heterogeneous computing
• Need a balanced system connected by high-speed networks
• Need a scalable distributed operating system
Supercomputers in future
• ASCI Option White - IBM 10 TFlops
• 100 TFlops computers in near future
• Petaflops ($10^{15}$ Flops)
– 10,000-1,000,000 procs
– feasible and “affordable” in 2010?
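The processor range quoted for a Petaflops machine translates into a required per-processor rate. This is a small sketch of that division, assuming the work is spread evenly and the machine runs at its full aggregate rate; the function name is illustrative.

```python
PETAFLOPS = 10 ** 15  # floating-point operations per second


def flops_per_processor(total_flops, processors):
    """Per-processor rate needed for a given aggregate rate."""
    return total_flops / processors


for processors in (10_000, 100_000, 1_000_000):
    gflops = flops_per_processor(PETAFLOPS, processors) / 1e9
    print(f"{processors:>9} processors: {gflops:g} GFlops each")
```

At the low end of the range each processor would need 100 GFlops; at the high end 1 GFlops per processor suffices, at the price of coordinating a million of them.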
Some observations
• HPSC is a small but exciting field
• Supercomputers adopt commodity tech
• Affordable parallel systems available
– SMP, distributed shared memory
– cluster of shared memory machines
– parallel computing standard appearing
• Scientific software industry is still in its
early stage
Challenges for SC
• Numerics
– faster algorithms
– good data locality
– low communication requirement
• Software
– efficient (performance, manpower)
– high-level problem solving environment
• Hardware
– changes of architecture
Some citations
‘There’s a future for high-performance
parallel computing out there.’
Tony Hey, Univ. Southampton
‘Allow datastructures and algorithms to
guide us to the appropriate architecture.’
John Vrolyk, SGI senior vice president
‘Intentions of the scientific users strongly
differ from the industrial users.’
Ulrich Trottenberg, GMD
The whole picture
Government, Supercomputer Vendor, Scientific Computing, Industry, General Public?
We are in the same boat...
Concluding remarks
• Huge potential of scientific computing
• More real-life applications to come
• Growing demand for computing power
• Scientific computing needs advances in
– numerical algorithms
– software technology
– hardware
Quiz
What was the world’s fastest computer on June 2nd 1998?
‘It was a HP notebook used on Space shuttle “Discovery”
to compute orbital position. The speed was 17,500 mph.’
Jack Dongarra