Light-weight Parallel Python Tools Within the CESM Workflow

Sheri Mickelson (CISL/NCAR) & Alice Bertini (CSEG/CGD/NCAR)
Kevin Paul, Dave Brown, & John Dennis (CISL/NCAR)
CESM Workflow Refactor Project
Uses NCL, Matplotlib, XML, Python, and CESM scripts
[Workflow diagram]
Old workflow: model run → STOP → archive to HPSS → serial diagnostics and serial data compression / time-series generation → analysis.
New workflow: model run → st_archive to spinning disk → parallel data compression / time-series generation (PyReshaper) and parallel diagnostics (PyAverager) → analysis; archival to HPSS is optional.
CESM Component Diagnostic Packages
• 4 main packages (AMWG, OMWG, Land, and Ice Diagnostic Packages)
• Each package:
– Contains a top-level control csh script
– Calculates climatological average files with NCO
– Creates several hundred plots with NCL scripts
– Creates a web page to interface with the plots
– Can be run with limited task parallelization (Swift)
– Is mostly run serially – very slow at high resolutions, long time scales, and when using time-series files
History Time-Slice to Time-Series Conversion
[Diagram: five time-slice files (Slices 1–5), each containing Fields 1–3, are converted into three time-series files (Series 1–3), one per field.]
• This was one of the most expensive CMIP5 post-processing steps
• The current post-processing suite performs the conversion serially using NCO
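The conversion itself is a transpose of the data layout: per-time snapshots of all fields become per-field records over all times. A minimal pure-Python sketch of that reorganization (the real PyReshaper operates on NetCDF history files via PyNIO; the field names and values here are illustrative):

```python
# Pure-Python stand-in for the slice-to-series reshaping; the real PyReshaper
# works on NetCDF files, not in-memory dicts.
def slices_to_series(slices):
    """slices: one {field_name: value} dict per time step.
    Returns {field_name: [value_t0, value_t1, ...]} -- one series per field."""
    series = {}
    for snapshot in slices:                  # walk the time-slice files in order
        for field, value in snapshot.items():
            series.setdefault(field, []).append(value)
    return series

# Three "slices", each holding every field at one time step:
slices = [{"T": 280.0 + t, "U": 5.0 * t} for t in range(3)]
series = slices_to_series(slices)
# series["T"] == [280.0, 281.0, 282.0] -- one record per field across all times
```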
Solutions for CESM Post-Processing
Incorporate the processes within the workflow:
1. Automate the User environment setup
2. Control the post processing environments in XML
3. Automate the job submission (while still enabling stand-alone capabilities)
Add parallelization:
1. Parallelize the time slice to time series conversion
2. Parallelize the Diagnostic Packages
First Step, Set Up a Python Virtual Environment
The virtual environment is set up in the CESM source code tree once per installation:
1. Define the set of boot-strap Python modules required to create the virtual environment:
– Python >= 2.7.7
– NumPy >= 1.8.1
– SciPy >= 0.15.1
– mpi4py >= 1.3.1
– PyNIO >= 1.4.1
– Matplotlib >= 1.4.3
2. Activate the virtual environment
3. Install all the Python tools into the virtual environment directory with Makefile and setup.py files
4. Deactivate the virtual environment
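As a rough sketch, the create/activate/install/deactivate cycle looks like the following. The path is hypothetical, and the standard-library venv module stands in for the virtualenv/Python 2.7 setup described above:

```shell
# Hypothetical path; in CESM the environment lives in the source code tree.
VENV_DIR=/tmp/cesm_pp_env
python3 -m venv "$VENV_DIR"      # create the virtual environment once
. "$VENV_DIR/bin/activate"       # activate it
python --version                 # the Makefile / setup.py installs would run here
deactivate                       # deactivate when installation is done
```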
Second Step, Generate the Post-Processing Tools Specific to a Machine and Experiment
create_postprocessing:
• Parses/copies config files to the CESM case directory
• Auto-configures defaults based on CESM run information
• Creates machine batch files
Can be run from the experiment/case run script or as stand-alone processes
Third Step, Run the Post Processing
Machine Batch Scripts
Timeseries Batch Script
• Activates the virtual environment
• Calls the time-slice to time-series converter (PyReshaper) in parallel
• Deactivates the virtual environment
Diagnostics Batch Script
• Activates the virtual environment
• Can be submitted concurrently for each component
• Calls the climatology file generator tool (PyAverager) and NCL plotting tools in parallel
• Deactivates the virtual environment
PyAverager Details
A light-weight custom Python averaging tool
• Parallelizes over averages and variables
• Works on time-slice and time-series data
Types of averages it can compute:
• Temporal averaging
– Seasonal, yearly, annual, monthly (optionally weighted)
Looking to also compute:
• Zonal averaging
• Variance
• Averages across ensembles
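As an illustration of the weighted temporal averaging, a seasonal (DJF) mean weights each monthly mean by the number of days in that month. A minimal NumPy sketch with illustrative names, not the PyAverager API:

```python
import numpy as np

# Weighted seasonal mean: each monthly mean is weighted by its month length.
def weighted_seasonal_avg(monthly, month_lengths, months=(11, 0, 1)):
    """monthly: 12 monthly means; months: indices in the season (default DJF)."""
    vals = np.array([monthly[m] for m in months], dtype=float)
    w = np.array([month_lengths[m] for m in months], dtype=float)
    return float((vals * w).sum() / w.sum())

lengths = [31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31]  # non-leap year
djf = weighted_seasonal_avg(np.arange(12, dtype=float), lengths)
# (11*31 + 0*31 + 1*28) / (31+31+28) = 369/90 = 4.1
```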
Partitioning of the PyAverager Tasks
[Diagram: the list of averages to compute is distributed across groups of MPI ranks (inter-communicators 1–3). Within each group, ranks 1–3 read variables from the time-series files, accumulate the per-variable time averages in internal memory, and send them to rank 0, which writes the time-averaged climatology file.]
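The two-level decomposition in the diagram, averages split across rank groups and variables split across ranks within a group, can be sketched as a plain partitioning function. This is pure Python for illustration, not mpi4py, and not the actual PyAverager internals:

```python
# Sketch of the two-level decomposition: each average is handled by one rank
# group ("inter-communicator"), and the variables are split across the ranks
# in that group.
def partition(averages, variables, ngroups, ranks_per_group):
    """Return {(group, rank): [(avg, var), ...]} task assignments."""
    plan = {}
    for a, avg in enumerate(averages):
        group = a % ngroups                  # averages dealt out to groups
        for v, var in enumerate(variables):
            rank = v % ranks_per_group       # variables dealt out within a group
            plan.setdefault((group, rank), []).append((avg, var))
    return plan

plan = partition(["ANN", "DJF", "MAM"], ["Var 1", "Var 2"],
                 ngroups=3, ranks_per_group=2)
# ("ANN", "Var 1") lands on group 0, rank 0; ("MAM", "Var 2") on group 2, rank 1
```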
Time Averaging Options
• NCO (serial)
– Controlled by a top level csh script that calls NCO
operators to calculate averages.
• Swift (limited task parallel)
– Averages are calculated in parallel by calling the NCO operators
• PyAverager (task parallel)
– New method written in Python that task parallelizes over
variables and averages.
Each method was run on both time-slice and time-series files
Time Averaging Comparisons

Datasets used:
  Component   Res    Size (GB)   # of Vars
  CAM FV      1.0    28          139
  CAM SE      1.0    30          148
  CAM SE      0.25   1055        214
  CICE        1.0    8/4         137
  CICE        0.1    556/42      132
  CLM         1.0    10          310
  CLM         0.25   113         163
  POP         1.0    190         170
  POP         0.1    3113        87

Types of time averages computed:
• CAM & CLM
– Seasonal averages (ANN, DJF, MAM, JJA, SON)
– Monthly averages (one average per month)
– 17 averages total
• POP & CICE
– Yearly averages (one average per year)
– 10 averages total
* All datasets contain 10 years of both monthly time-slice and time-series files
Low Resolution Timings
Original method vs. Swift vs. PyAverager

Time-slice input (min):
         CAM FV   CAM SE   CICE   CLM    POP
  NCO    6        7        1      3      14
  SWIFT  5        5        0.4    1.2    7
  PyAve  0.7      1        0.2    0.25   3

Time-series input (min):
         CAM FV   CAM SE   CICE   CLM    POP
  NCO    111      118      51     295    80
  SWIFT  53       61       16     90     17
  PyAve  0.6      0.6      0.1    0.3    3
High Resolution Timings
Original method vs. Swift vs. PyAverager

Time-slice input (min):
         CICE   CAM    CLM    POP
  NCO    27     215    14     306
  SWIFT  6      102    7      92
  PyAve  1      16     4      21

Time-series input (min):
         CICE   CAM    CLM    POP
  NCO    88     861    1005   439
  SWIFT  16     203    177    109
  PyAve  0.2    5      0.7    12
PyReshaper Details
A light-weight custom Python tool that converts time-slice files to time-series files.
[Diagram: five time-slice files (Slices 1–5), each containing Fields 1–3, become three time-series files (Series 1–3), one per field.]
Task Parallelization Strategy
Each rank is responsible for writing one (or more) time-series variables to a file.
[Diagram: Rank 1 reads Field 1 from Slices 1–3 and writes Series 1; Rank 2 writes Field 2 into Series 2; Rank 3 writes Field 3 into Series 3.]
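This strategy amounts to dealing the time-series variables out to the ranks; each rank then reads its fields from every slice and writes its own series files. A sketch of the assignment step (illustrative only, not the actual PyReshaper internals):

```python
# Round-robin assignment of time-series variables to MPI ranks: each rank
# ends up responsible for writing whole variables (one or more series files).
def assign_fields(fields, nranks):
    """Return {rank: [field, ...]} with fields dealt out round-robin."""
    plan = {r: [] for r in range(nranks)}
    for i, field in enumerate(fields):
        plan[i % nranks].append(field)
    return plan

plan = assign_fields(["Field 1", "Field 2", "Field 3"], nranks=3)
# plan == {0: ["Field 1"], 1: ["Field 2"], 2: ["Field 3"]}
```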
Time-Slice to Time-Series Conversion
PyReshaper Timing Statistics

Existing method (NCO):
              Time (per MIP per year)   Average throughput (per run)
  f09 x g16   225 minutes               1.85 MB/sec
  ne120 x g16 478 minutes               4.85 MB/sec

New method (PyReshaper):
              Time (per MIP per year)   Average throughput (per run)
  f09 x g16   4 minutes                 104 MB/sec
  ne120 x g16 8 minutes                 290 MB/sec

• Times include the approximate full time to convert all component data to NetCDF4
• Conversions were run on Yellowstone using 4 nodes / 4 cores each (16 cores total)
• We can expect a 2X increase in throughput if we double core counts for low-resolution data
• We can expect a 3X increase in throughput if we double core counts for high-resolution data
PyReshaper Plots
[Plot: time to convert 10 years of CESM data from time slice to time series.]
PyReshaper v0.9.1 and PyAverager v0.1.0
available for download
• Download the packages from
https://www2.cisl.ucar.edu/tdd/asap/parallel-python-tools-postprocessing-climate-data
• Both packages depend on NumPy, mpi4py, and PyNIO
• Both contain READMEs and Doxygen documentation
CESM workflow refactor team
• Ben Andre
• Alice Bertini
• John Dennis
• Jim Edwards
• Mary Haley
• Jean-Francois Lamarque
• Michael Levy
• Sheri Mickelson
• Kevin Paul
• Sean Santos
• Jay Shollenberger
• Gary Strand
• Mariana Vertenstein
Questions?
https://www2.cisl.ucar.edu/tdd/asap/parallel-python-tools-post-processing-climate-data