WRF Software Development and Performance

High Performance Computing and
Atmospheric Modeling
John Michalakes
Mesoscale and Microscale Meteorology Division
National Center for Atmospheric Research
[email protected]
Colorado State University, November 26, 2007
Mesoscale & Microscale Meteorological Division / NCAR
Outline
• Part 1
– HPC and atmospheric simulation
– Characteristics of atmospheric models
• Part 2
– Weather Research and Forecast (WRF)
– Towards petascale
High Performance Computing and Weather
• The original “HPC application”
– John von Neumann, Jule Charney, Carl-Gustaf Rossby, and others: first computerized weather forecast, run on ENIAC in 1950 with 270 grid points at 700 km resolution
– Joint Numerical Weather Prediction Unit (JNWPU), 1954
– 50th anniversary of the JNWPU at U. Maryland, June 2004, attended by representatives of the National Weather Service, Air Force Weather Agency, and Navy Fleet Numerical
Grcar, Joseph, “John von Neumann and the Origins of Scientific Computing”, 2007
High Performance Computing and Weather
• Today:
– Key component of atmospheric research
– Higher resolutions: 10^9 grid points
– Speeds: 10^13 operations/second
– More complex physics, direct simulation, ensembles, coupled model simulations
– 5% of Top 500® systems used for weather and climate
[Figure: Precipitable water field from 5 day global WRF forecast, 20km resolution (100 million cells)]
[Figure: Number of Weather and Climate Systems in November Top500® Listings, 2001-2006, grouped by rank (top 10, 11-50, 51-100, 101-200, 201-300, 300-500). Source: Top 500 Supercomputing Sites http://www.top500.org, Copyright (c) 2000-2006 TOP500.Org]
[Figure: Weather and Climate Systems in November Top500® Listings, 2001-2006: Rmax (GF/s) and Number of Processors. Source: Top 500 Supercomputing Sites http://www.top500.org, Copyright (c) 2000-2006 TOP500.Org]
Challenges for Petascale Atmospheric Models
• Conventional wisdom revisited
(View from Berkeley)
– Transistors are cheap, power
expensive
– Flops are cheap, memory access
is expensive
– Performance will continue to
increase, but with parallelism, not
clock rate
Characteristics of Atmospheric Models
• Fundamentally CFD
– Numerical approximation of solutions to PDEs for primitive equations
of mass, momentum, thermodynamics
• However, additional constraints/characteristics
– Domains:
• Spherical (global models), subject to pole-problem
• Rectangular (limited area models), subject to lateral boundary conditions
– Boundaries:
• Physical: land, ocean, upper
• Regional models also have lateral boundaries
– Predominantly structured grids, small by CFD standards
• Cartesian coordinates (includes lat/lon, isotropic, reduced Cartesian)
• Others: icosahedral, hybrids (cubed-sphere, yin-yang, etc.)
• Grids range 10^5 – 10^9 cells (typical problem size today around 10^8)
[Figure: example grid types. Panel labels: Cubed-sphere, Composite (two panels), Lat/Lon, Icosahedral, Yin-yang, Channel with Polar Caps]
Characteristics of Atmospheric Models
Continued…
– Numerical considerations: the fastest modes (gravity waves, acoustic waves) are of little interest for the solution, but must be resolved without overly constraining the time step
• Time-split (explicit) finite difference approximation: Berkeley Dwarf 5 (structured grids)
• Elliptic solvers (implicit, spectral, semi-Lagrangian): Berkeley Dwarfs 1 & 3 (dense linear algebra, spectral)
– More than half of code, data, and computation is non-CFD (e.g. physics)
• Large amount of state per grid-cell (50-100 variables)
• 3000-5000 computations per cell-step
• Solution-induced load imbalance
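To make the time step constraint concrete, here is a minimal sketch of the CFL condition dt <= C * dx / speed. The grid spacing, wind speed, and Courant number are illustrative assumptions, not values from the talk; only the idea that fast acoustic modes force a much shorter step than advection comes from the slide.

```python
import math


def max_stable_dt(dx_m, speed_m_s, courant=1.0):
    """Largest time step (s) allowed by the CFL condition dt <= C * dx / speed."""
    if dx_m <= 0 or speed_m_s <= 0:
        raise ValueError("dx_m and speed_m_s must be positive")
    return courant * dx_m / speed_m_s


# All three values below are assumptions chosen for illustration.
DX_M = 5000.0            # horizontal grid spacing in metres
SOUND_SPEED_M_S = 340.0  # approximate speed of sound near the surface
WIND_SPEED_M_S = 50.0    # a strong advecting wind

dt_acoustic = max_stable_dt(DX_M, SOUND_SPEED_M_S)
dt_advective = max_stable_dt(DX_M, WIND_SPEED_M_S)

# In a time-split scheme the slow terms take the long step and only the
# fast (acoustic) terms are advanced on short substeps.
substeps = math.ceil(dt_advective / dt_acoustic)

print(f"acoustic limit:  {dt_acoustic:.1f} s")
print(f"advective limit: {dt_advective:.1f} s")
print(f"acoustic substeps per long step: {substeps}")
```

Under these assumptions a single explicit step would be limited by sound waves to a small fraction of what advection alone allows, which is the motivation for time-splitting or for implicit solvers.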
[Figure panels: processor load; convective precipitation]
Workload Characterization of Physics-Induced Load Imbalance in WRF: www.mmm.ucar.edu/wrf/WG2/lb_report.doc
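One common way to quantify load imbalance is to compare the slowest processor with the average. The sketch below uses that generic metric; it is not taken from the report cited above, and the per-processor timings are hypothetical.

```python
def load_imbalance(times):
    """Return max/mean - 1: zero when perfectly balanced, larger when skewed."""
    if not times:
        raise ValueError("need at least one timing")
    mean = sum(times) / len(times)
    return max(times) / mean - 1.0


def parallel_efficiency(times):
    """Fraction of processor time doing useful work if all wait for the slowest."""
    mean = sum(times) / len(times)
    return mean / max(times)


# Hypothetical seconds per time step on four processors; the last one
# covers a region with active convection and so does more physics work.
step_times = [1.0, 1.0, 1.0, 2.0]

print(f"imbalance:  {load_imbalance(step_times):.2f}")
print(f"efficiency: {parallel_efficiency(step_times):.3f}")
```

Because every processor must wait at the next halo exchange, one overloaded patch sets the pace for the whole step.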
Characteristics of Atmospheric Models
• Weather versus Climate simulation:
– Codes and techniques for weather and climate simulation are
basically identical
– However, they differ fundamentally in requirements for HPC
• Weather forecasting:
– Forecast length bounded by predictability to one or two weeks
– Unbounded in complexity. Can usefully employ high resolution,
high-cost physics to “scale its way out” of Amdahl limits
• Climate prediction:
– Effectively unbounded in time – multi-decades/centuries
– High resolution and sophisticated physics is desirable but
simulation speed in years/day is paramount
• Can giant supercomputers with thousands of processors be
useful for weather? For climate?
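The Amdahl limit and the "years/day" metric mentioned above can be written down in a few lines. This is a sketch of the standard formulas; the serial fraction and processor counts used in the example are assumptions for illustration only.

```python
def amdahl_speedup(serial_fraction, n_procs):
    """Amdahl's law: speedup on n_procs when serial_fraction cannot be parallelized."""
    if not 0.0 <= serial_fraction <= 1.0 or n_procs < 1:
        raise ValueError("serial_fraction must be in [0, 1] and n_procs >= 1")
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / n_procs)


def simulated_years_per_day(sim_to_wall_ratio, days_per_year=365.0):
    """Convert simulation speed (simulated time / wall-clock time) to years per day."""
    return sim_to_wall_ratio / days_per_year


# Assumed 1% serial work: adding processors quickly stops helping.
for procs in (100, 1000, 10000):
    print(procs, round(amdahl_speedup(0.01, procs), 1))

# A model running 1825x faster than real time simulates 5 years per day.
print(simulated_years_per_day(1825.0))
```

For a fixed-size climate problem the serial fraction caps the years/day that can be reached, whereas a weather run can grow the problem (finer resolution, costlier physics) so that the parallel part dominates.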
Weather as a Petascale Application
• Weather is a petascale application only to the extent that higher resolution can be usefully employed
• Exquisitely detailed but incorrect forecasts are not the answer.
• Instead, very high resolution research simulation to improve understanding that, in turn, improves skill of lower-resolution operational forecasts
– For example, cloud resolving (Δh ~ O(100 m)) simulations to understand and improve parameterizations of cloud dynamics
Summary (part 1)
• Atmospheric models are rooted in CFD but with
additional constraints, costs, requirements
• Weather and climate present different challenges for
HPC, esp. entering the petascale era; weather is
difficult, climate may be problematic
• Successfully enabling new science using HPC depends
on careful consideration of numerous, often conflicting
requirements in the design of atmospheric modeling
software
Weather Research and Forecast Model
WRF Overview
• Large collaborative effort to develop next-generation community model with direct path to operations (http://www.wrf-model.org)
– Limited area, high-resolution
– Structured (Cartesian) with mesh refinement (nesting)
– High-order explicit dynamics
– Software designed for HPC
– 4000+ registered users
• Applications
– Numerical Weather Prediction
– Atmospheric Research
– Coupled modeling systems
– Air quality research/prediction
– High resolution regional climate
[Figure: Hurricane Katrina. Observations (Radar) compared with WRF Simulated Reflectivity, 4km Vortex-following Moving Nest]
[Figure: 5 day global WRF forecast at 20km horizontal resolution, running at 4x real time on 128 processors of IBM Power5+ (blueice.ucar.edu)]
[Figure: WRF-CHEM 27km 36 hour NO+NO2 forecast, 29-31 January 2005. Courtesy Georg Grell, http://www-frd.fsl.noaa.gov/aq/wrf/]
[Figure: Precipitable water field from 2-year NCAR Climate-WRF simulation using 720 processors, IBM Power5]
WRF Software Framework
• Hierarchical design
• Multi-level parallelism
• Performance-portable...
[Diagram: three layers]
– Driver: Top-level Control, Memory Management, Nesting, Parallelism, External APIs
– Mediation: ARW solver, NMM solver, Physics Interfaces
– Model: Plug-compatible physics
[Figure: multi-level parallelism. Logical domain; 1 Patch, divided into multiple tiles; Inter-processor communication]
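The patch and tile hierarchy named on this slide can be sketched as a two-level index decomposition: patches are distributed across processes, and each patch is divided into tiles for threads. The function names, the 1-based inclusive ranges, and the choice to split tiles along j only are assumptions made for this illustration, not the WRF implementation.

```python
def split_range(start, end, parts):
    """Split the inclusive range [start, end] into `parts` contiguous chunks."""
    n = end - start + 1
    if parts < 1 or parts > n:
        raise ValueError("parts must be between 1 and the range length")
    base, extra = divmod(n, parts)
    chunks, lo = [], start
    for p in range(parts):
        size = base + (1 if p < extra else 0)  # first `extra` chunks get one more
        chunks.append((lo, lo + size - 1))
        lo += size
    return chunks


def decompose_domain(nx, ny, px, py):
    """Return px*py patches as ((i_start, i_end), (j_start, j_end)), row by row."""
    return [(i_rng, j_rng)
            for j_rng in split_range(1, ny, py)
            for i_rng in split_range(1, nx, px)]


def tile_patch(patch, n_tiles):
    """Divide one patch into n_tiles strips along j (a simplifying assumption)."""
    i_rng, (j0, j1) = patch
    return [(i_rng, j_rng) for j_rng in split_range(j0, j1, n_tiles)]


patches = decompose_domain(nx=100, ny=60, px=4, py=2)  # one patch per process
tiles = tile_patch(patches[0], n_tiles=3)              # one tile per thread
print(patches[0])
print(tiles)
```

Only patch edges need inter-processor communication; tiles within a patch share memory, which is why the two levels map naturally onto message passing and threading respectively.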
[Figure: Performance (v2.0.x). Gflop/sec and Simulation Speed (simulated time / wall clock). www.mmm.ucar.edu/wrf/WG2/bench]
WRF Supported Platforms
Vendor      Hardware                OS              Compiler
Apple       G5                      MacOS           IBM
Cray Inc.   X1, X1e                 UNICOS          Cray
Cray Inc.   XT3/XT4 (Opteron)       Linux           PGI
HP/Compaq   Alpha                   Tru64           Compaq
HP/Compaq   Itanium-2               Linux           Intel
HP/Compaq   Itanium-2               HPUX            HP
IBM         Power-3/4/5/5+          AIX             IBM
IBM         Blue Gene/L             Linux           IBM
IBM         Opteron                 Linux           Pathscale, PGI
NEC         SX-series               Unix            Vendor
SGI         Itanium-2               Linux           Intel
SGI         MIPS                    IRIX            SGI
Sun         UltraSPARC              Solaris         Sun
various     Xeon and Athlon,        Linux and       Intel, PGI
            Itanium-2 and Opteron   Windows CCS
[Callout on slide: Petascale precursor systems]
[Figure: University of São Paulo “Clothesline Computer”]
Towards Petascale
WRF Nature Run
New scientific insights and ultimately better prediction will be enabled by atmospheric modeling at Petascale
• SC07 Gordon Bell HPC Challenge Finalist
– Establish baseline for studying atmos. dynamics at very high resolution
– 5km hemispheric domain, first few hours of 90 day Nature Run
– Computational: 12 Tera-ops/time step (5800 ops per cell-step)
– Memory footprint: > 100 variables per cell (single prec.) times 2x10^9 cells
– Interprocessor communication:
• 160 MPI Send/Recv pairs per time step per processor
• Average 120 KB per exchange (one way)
– I/O
• Input: 200 GB/restart
• Output: 40 GB/hourly write
• Goals:
– Record parallelism and scaling on a petascale problem
– Record computational rate with output
– Most importantly: new scientific result and a step towards new understanding of predictability in the earth’s atmosphere
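The figures on this slide are mutually consistent, which a short back-of-envelope calculation shows. The inputs are the slide's numbers; treating single precision as 4 bytes, using decimal units (1 GB = 10^9 bytes, 1 MB = 1000 KB), and rounding the variable count to exactly 100 are assumptions of this sketch, so the memory figure is a lower bound.

```python
OPS_PER_STEP = 12e12        # 12 Tera-ops per time step
OPS_PER_CELL_STEP = 5800    # operations per cell per step
VARS_PER_CELL = 100         # slide says "> 100", so this is a lower bound
BYTES_PER_VAR = 4           # single precision (assumed 4 bytes)
CELLS = 2_000_000_000       # 2x10^9 cells
EXCHANGES_PER_STEP = 160    # MPI Send/Recv pairs per step per processor
KB_PER_EXCHANGE = 120       # average size, one way

# Cell count implied by the operation counts: about 2.07x10^9.
implied_cells = OPS_PER_STEP / OPS_PER_CELL_STEP

# Minimum memory footprint across the whole machine, in GB.
memory_gb = VARS_PER_CELL * BYTES_PER_VAR * CELLS / 1e9

# One-way data sent per processor per time step, in MB.
comm_mb_per_step = EXCHANGES_PER_STEP * KB_PER_EXCHANGE / 1000

print(f"implied cells:      {implied_cells:.3g}")
print(f"memory footprint:   at least {memory_gb:.0f} GB")
print(f"comm per processor: {comm_mb_per_step} MB per step")
```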
Floating Point Rate
Initial simulation results
[Figure: WRF Nature Run, 5km (idealized), N.H. and S.H. panels. For comparison: real conditions (July 22, 2007)]
• Capturing large scale structure already (Rossby Waves)
• Small scale features spinning up (next slide)
Initial simulation results: Kinetic Energy Spectrum
At 3:30 h into the simulation, the mesoscales are still spinning up and filling in the spectrum. Large scales were previously spun up on a coarser grid.
[Figure: kinetic energy spectrum with k^-3 and k^-5/3 reference slopes. Annotations: Large scales already present; Mesoscales spinning up; Scales not yet spun up]
New Directions
GPU and other non-traditional architectures
• Converted standalone WRF microphysics to CUDA
• Produces same output as original Fortran, within roundoff
• Time for original on host CPU (Xeon): 330 milliseconds
• Time on NVIDIA GPU:
– Theoretical peak: 0.3 milliseconds (1000x speedup)
– Measured on GPU: 26 milliseconds (12x speedup)
– Including data transfers: 37 milliseconds (9x speedup)
• Preliminary: the current implementation reaches only about 1% of peak on the GPU, yet still runs about 10x faster than the host CPU
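The speedups and the "1% of peak" figure follow directly from the timings on the slide. A small sketch of that arithmetic, with the timings taken from the slide and the helper name chosen here for illustration:

```python
CPU_MS = 330.0               # original Fortran on host Xeon
GPU_PEAK_MS = 0.3            # time if the GPU ran at theoretical peak
GPU_MS = 26.0                # measured kernel time on the GPU
GPU_WITH_TRANSFER_MS = 37.0  # measured, including host-device transfers


def speedup(baseline_ms, new_ms):
    """How many times faster new_ms is than baseline_ms."""
    return baseline_ms / new_ms


kernel_speedup = speedup(CPU_MS, GPU_MS)                   # about 12.7x
end_to_end_speedup = speedup(CPU_MS, GPU_WITH_TRANSFER_MS) # about 8.9x
peak_speedup = speedup(CPU_MS, GPU_PEAK_MS)                # about 1100x
fraction_of_peak = GPU_PEAK_MS / GPU_MS                    # about 1.2%
transfer_share = (GPU_WITH_TRANSFER_MS - GPU_MS) / GPU_WITH_TRANSFER_MS

print(f"kernel only:      {kernel_speedup:.1f}x")
print(f"with transfers:   {end_to_end_speedup:.1f}x")
print(f"theoretical peak: {peak_speedup:.0f}x")
print(f"fraction of peak: {fraction_of_peak:.1%}")
print(f"time in transfer: {transfer_share:.0%}")
```

On these numbers data transfer accounts for roughly 30% of the end-to-end GPU time, so keeping fields resident on the device between calls would matter as much as tuning the kernel.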
Summary
• Atmospheric modeling, one of the original
HPC-enabled applications, is moving to
exploit petascale computing
• As always, significant engineering and
scientific challenges and opportunities
WRF web page: http://www.wrf-model.org
My contact info: [email protected]
Thank you
WRF Nature Run
• Computational record for an atmospheric model?
– AFES on the Earth Simulator still holds the record at 27 TF/s
– 8.76 TF/s (7.47 TF/s with I/O) is a WRF record and perhaps a record for a model designed to run at high, non-hydrostatic resolution with scale-appropriate dynamics
• Parallelism and scaling?
– 15K processors at 7.8% peak (7.2% with I/O)
– We think yes.
• I/O performance at scale
– 6.4% penalty for I/O on Blue Gene; 14.75% on XT4
– Needs improvement but science enabled in meantime
• Science
– Important new steps towards understanding the behavior and
predictability of the atmosphere through frontier simulation
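As a cross-check, the quoted sustained rates with and without I/O imply a penalty close to the 14.75% reported for the XT4. The sketch below assumes the 8.76 and 7.47 TF/s figures are the XT4 runs; the slide does not state which machine they come from.

```python
def io_penalty(rate_without_io, rate_with_io):
    """Fractional loss in sustained rate caused by including I/O."""
    if rate_without_io <= 0:
        raise ValueError("rate_without_io must be positive")
    return (rate_without_io - rate_with_io) / rate_without_io


# Sustained rates in TF/s as quoted on the slide.
penalty = io_penalty(8.76, 7.47)
print(f"I/O penalty: {penalty:.1%}")
```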