Preparing input data for models
Jon Goodall
Hydroinformatics
Fall 2014
This work was funded by National Science
Foundation Grants EPS 1135482 and EPS
1208732
Data Required by Hydrologic Models
• Terrain
• Land use/land cover
• Soils
• Precipitation
• Streamflow
• Etc.
Each of these datasets has to be gathered, transformed, and
loaded into models. Many of them require GIS (terrain,
soils, land use); others are point observations, which we have
already covered in this class. For this task we will therefore
focus on gridded precipitation data.
Preparing Precipitation Data Required by a Model
• Task: Create one (or more) time series of 10+ years of
daily (or subdaily) precipitation for a specific region.
• Challenges:
– Getting the raw data
– Reading and extracting the subset of data required for
your model
– Writing data files in the format required by the model
• Result: Data prep can be a significant portion of the
time spent on modeling
• Goal: use scripts to automate these steps:
reproducible, reusable, sharable!
We will focus on a gridded
precipitation data product from NWS
• US National Weather Service
Advanced Hydrologic
Prediction Service (AHPS)
• 4x4km spatial resolution
• 24 hour accumulated
precipitation
• “Multisensor” dataset using
NEXRAD and gauged data
• http://water.weather.gov/precip/
Why focus on this data?
• You have already seen ways of dealing with point observational data, so I
am confident you can work with weather data collected at gauging
locations.
• Gridded data present new challenges including different file formats
(NetCDF, GRIB, HDF, etc.)
• There are many gridded weather data products that are helpful to
hydrologic studies including:
– Weather reanalysis products (e.g.,
http://www.esrl.noaa.gov/psd/data/gridded/reanalysis/)
– Satellite-derived precipitation (e.g., TRMM http://disc.sci.gsfc.nasa.gov/daac-bin/DataHoldingsPDISC.pl)
– IPCC GCM output data (e.g., http://www.ipcc-data.org/sim/gcm_monthly/ )
• The NWS AHPS product serves as an example and my goal is to provide
background information to help you access and use any gridded
weather/climate dataset.
Back to the AHPS Product - Data Access
http://water.weather.gov/precip/download.php
Data is available in shapefile or NetCDF format.
Data is available for 2004 to present on an FTP server.
FTP Server with daily data 2004 to present
http://water.weather.gov/precip/p_download_new/
Gridded data is often stored as a NetCDF file
• What is NetCDF?
– Network Common Data Form
– “NetCDF is a set of software libraries and self-describing, machine-independent data formats that support the creation, access, and
sharing of array-oriented scientific data.”
• Created by UCAR’s Unidata: http://www.unidata.ucar.edu
• There are interfaces for working with NetCDF in Matlab, R, Python,
etc. In Python alone, there are multiple options available:
https://www.unidata.ucar.edu/software/netcdf/software.html#Python
NetCDF files are binary (not ASCII)
• Why use binary?
– Smaller file size for a given amount of data.
• Problem:
– You can’t see the contents unless you convert a NetCDF file into an
ASCII representation called CDL (Common Data form Language).
• This is like the ESRI GRID format used for raster data in GIS. GRID is
a binary format, but you can convert the file to an ASCII GRID and
then view its contents with a text editor. For a given raster, the ASCII
GRID will be much larger than the binary GRID dataset. (see
http://en.wikipedia.org/wiki/Esri_grid for a nice explanation.)
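To make the size difference concrete, here is a minimal sketch that stores the same numbers once as packed 4-byte floats and once as comma-separated text. The sample values are made up for illustration (they are not from any dataset), and the exact ratio depends on how many digits the text version writes. The sketch uses Python 3 syntax and only the standard library.

```python
import struct

# Hypothetical sample: 1000 precipitation depths in mm (not real data).
values = [33.22, 2.18, 0.0, 12.5] * 250

# Binary: each value packed as a 4-byte float, similar to float32 in NetCDF.
binary_bytes = struct.pack("<%df" % len(values), *values)

# ASCII: each value written as text with six decimals, comma-separated.
ascii_bytes = ", ".join("%.6f" % v for v in values).encode("ascii")

print("binary:", len(binary_bytes), "bytes")
print("ASCII: ", len(ascii_bytes), "bytes")
```

The binary version always uses 4 bytes per value, while the text version grows with the number of digits and separators written.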
Example NetCDF file represented in ASCII text as CDL
File name: foo (the name after the netcdf keyword)
netcdf foo { // example netCDF specification in CDL
dimensions:
    lat = 10, lon = 5, time = unlimited;
variables:
    int lat(lat), lon(lon), time(time);
    float z(time,lat,lon), t(time,lat,lon);
    double p(time,lat,lon);
    int rh(time,lat,lon);
    lat:units = "degrees_north";
    lon:units = "degrees_east";
    time:units = "seconds";
    z:units = "meters";
    z:valid_range = 0., 5000.;
    p:_FillValue = -9999.;
    rh:_FillValue = -1;
data:
    lat = 0, 10, 20, 30, 40, 50, 60, 70, 80, 90;
    lon = -140, -118, -96, -84, -52;
}
dimensions: the dimensions of the arrays stored within the file.
variables: the variables stored within the file. Each variable can be a
multi-dimensional array, and each variable stores data of a single data
type. For example, lat is a one-dimensional array with 10 elements, all
of which are integers.
attributes (e.g., lat:units): these can describe a variable in the file
or the file itself.
data: the data itself, written as a list of values separated by commas.
A semicolon ends the values for a variable.
Source: http://www.unidata.ucar.edu/software/netcdf/docs/CDL.html#CDL
The Classic NetCDF Data Model
Source:
http://www.unidata.ucar.edu/software/netcdf/docs/html_tutorial/netcdf_data_model.html#classic_model
Converting between NetCDF and CDL
• The command-line tools “ncdump” and “ncgen” convert between
NetCDF and CDL: ncdump writes a NetCDF file out as CDL, and
ncgen builds a NetCDF file from CDL
• There is also an XML-based way of
representing the “metadata” in a NetCDF file
(NcML):
http://www.unidata.ucar.edu/software/thredds/current/netcdf-java/ncml/
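If you want to call ncdump and ncgen from a script rather than typing them at a shell, one possible approach is sketched below. It assumes the NetCDF command-line tools are installed and on your PATH. The helper names (ncdump_command, ncgen_command, dump_to_cdl) and the foo.nc / foo.cdl file names are made up for this example. The -h option tells ncdump to print only the header (dimensions, variables, attributes), and -o names the output file for ncgen. Python 3 syntax is used.

```python
import shutil
import subprocess

def ncdump_command(nc_path, header_only=False):
    """Build the ncdump argument list; -h limits output to the header."""
    cmd = ["ncdump"]
    if header_only:
        cmd.append("-h")
    cmd.append(nc_path)
    return cmd

def ncgen_command(cdl_path, nc_path):
    """Build the ncgen argument list that writes nc_path from cdl_path."""
    return ["ncgen", "-o", nc_path, cdl_path]

def dump_to_cdl(nc_path, cdl_path):
    """Run ncdump and save its text output; returns False if ncdump is missing."""
    if shutil.which("ncdump") is None:
        return False
    with open(cdl_path, "w") as out:
        subprocess.run(ncdump_command(nc_path), stdout=out, check=True)
    return True

# Show the equivalent shell commands without running them.
print(" ".join(ncdump_command("foo.nc", header_only=True)))
print(" ".join(ncgen_command("foo.cdl", "foo.nc")))
```

For a large gridded file, the header-only form is usually the quickest way to see what the file contains, because it skips the data section.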
Working with NetCDF files in Python
using the netCDF4 package
• We will use the netCDF4 package to work with netCDF
files: http://unidata.github.io/netcdf4-python/
• This package should be preinstalled with Canopy.
• To test if it is installed, type the following at the
command prompt in the Canopy Editor:
– import netCDF4
• If it is not installed (i.e., if you get an error when you
try to import netCDF4), then open the Canopy Package
Manager, search for netCDF4, and install both
lib_netcdf4 and netCDF4 (see next slide for details).
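The import test described above can also be written so that it reports the result instead of stopping with an error. This is only a sketch: the helper name is_installed is made up here, and the code uses Python 3 syntax.

```python
import importlib

def is_installed(module_name):
    """Return True if the named module can be imported, False otherwise."""
    try:
        importlib.import_module(module_name)
        return True
    except ImportError:
        return False

if is_installed("netCDF4"):
    print("netCDF4 is available")
else:
    print("netCDF4 is missing; install it with your package manager")
```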
Installing NetCDF Packages
These two packages
must be installed.
Reading a NetCDF File using
Python
• Download the NetCDF (.nc) and CDL (.cdl) files I put on Canvas.
• As an FYI:
– I obtained the NetCDF file from the NWS FTP site
(http://water.weather.gov/precip/p_download_new/ )
– YOU DON’T NEED TO DO THIS FOR THIS CLASS, BUT FOR REFERENCE I created the CDL file from
the NetCDF file with the following command:
ncdump nws_precip_conus_20141117.nc > nws_precip_conus_20141117.cdl
• NOTE:
– I installed the netcdf tools following this approach:
http://mazamascience.com/WorkingWithData/?p=1474
– On Windows you could do this:
http://www.unidata.ucar.edu/software/netcdf/docs/winbin.html
– On other operating systems: http://www.unidata.ucar.edu/packages/netcdf/INSTALL.html
Portion of the NWS AHPS file viewed as CDL
Alternative: Download file directly
from FTP server
• The file I put on Canvas is also available here:
http://water.weather.gov/precip/p_download_new/2014/11/17/nws_precip_conus_20141117.nc
• You can download the file using Python (the code
below downloads the file to your current working
directory)
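try:
    from urllib.request import urlretrieve  # Python 3
except ImportError:
    from urllib import urlretrieve  # Python 2
url = "http://water.weather.gov/precip/p_download_new/2014/11/17/nws_precip_conus_20141117.nc"
urlretrieve(url, "nws_precip_conus_20141117.nc")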
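The download URL above has the date in it twice: once as year/month/day folders and once in the file name. A small helper can build the URL for any date. This sketch assumes that every day on the server follows the same pattern as the 2014-11-17 file, which is not confirmed here. The names BASE_URL, precip_file_name, and precip_url are made up for this example, and the code uses Python 3 syntax.

```python
import datetime

# Assumed to be the common prefix for all daily files.
BASE_URL = "http://water.weather.gov/precip/p_download_new"

def precip_file_name(day):
    """File name for one day, e.g. nws_precip_conus_20141117.nc."""
    return "nws_precip_conus_%s.nc" % day.strftime("%Y%m%d")

def precip_url(day):
    """Full URL for one day, following the year/month/day folder layout."""
    return "%s/%s/%s" % (BASE_URL, day.strftime("%Y/%m/%d"), precip_file_name(day))

print(precip_url(datetime.date(2014, 11, 17)))
```

For November 17, 2014 this reproduces the URL given on this slide.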
In Class Example
• Open Canopy Editor and follow along as I
show how to read the netCDF file using
Python.
Opening the NetCDF file for Reading
#open a NetCDF file for reading
import netCDF4
ds = netCDF4.Dataset("nws_precip_conus_20141117.nc", "r")
Print Structure of Dataset
#print out the file’s metadata
print(ds)
RESPONSE:
<type 'netCDF4.Dataset'>
root group (NETCDF3_CLASSIC data model, file format UNDEFINED):
    dimensions(sizes): hrapy(813), hrapx(1051), latlong(4), dates(11)
    variables(dimensions): int32 amountofprecip(hrapy,hrapx), float32 lat(latlong), float32 lon(latlong), float32 true_lat(), float32 true_lon(), |S1 timeofdata(dates), |S1 timeofcreation(dates), float32 hrap_xor(), float32 hrap_yor()
    groups:
Examples of Variables
int32 amountofprecip(hrapy,hrapx)
This means there is a 2D array stored as the
variable ‘amountofprecip’ with dimensions ‘hrapy’
and ‘hrapx’.
hrapy(813), hrapx(1051)
This means that the dimension hrapy has 813
elements and hrapx has 1051 elements.
Attributes of amountofprecip Variable
print(ds.variables['amountofprecip'])
RESPONSE:
<type 'netCDF4.Variable'>
int32 amountofprecip(hrapy, hrapx)
long_name: 24-Hour Rainfall
units: 1/100 mm
grid: hrap_grid=polar stereograph projection
resolution: 4km*4km
dateofdata: 2014111712Z #when the observation was made
dateofcreation: 2014111721Z
location: continental united states
author: national weather service
rfcs genereated by program ROUT
comments: preliminary data...subject to change
unlimited dimensions:
current shape = (813, 1051)
filling off
Get a single attribute of the
amountofprecip variable
print(ds.variables['amountofprecip'].long_name)
RESPONSE:
24-Hour Rainfall
print(ds.variables['amountofprecip'].units)
RESPONSE:
1/100 mm
Get values of amountofprecip variable
precip = ds.variables['amountofprecip']
print(precip[0,0])
RESPONSE:
-1 #means no data (e.g., over the ocean)
precip[450,850] #this gets the precip value for cell y=450, x=850
RESPONSE:
3322 # units are 1/100 mm, so this is about 1.3 inches of
rain for this cell over the 24-hr period
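The unit conversion in the comment above can be written out as a small helper. This is a sketch, not part of the class code: the function names to_mm and to_inches are made up here, and -1 is treated as "no data" as noted above. Python 3 syntax is used.

```python
MM_PER_INCH = 25.4
NO_DATA = -1  # value used in the file where there is no data

def to_mm(raw_value):
    """Convert a raw cell value (1/100 mm) to mm; None if no data."""
    if raw_value == NO_DATA:
        return None
    return raw_value / 100.0

def to_inches(raw_value):
    """Convert a raw cell value (1/100 mm) to inches; None if no data."""
    mm = to_mm(raw_value)
    return None if mm is None else mm / MM_PER_INCH

for raw in (3322, -1):
    print(raw, to_mm(raw), to_inches(raw))
```

For the raw value 3322 this gives 33.22 mm, or about 1.31 inches, matching the "about 1.3 inches" noted above.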
Mapping between HRAP coordinates and lat/lon coordinates
Shapefile is available here:
http://water.weather.gov/precip/p_download_new/nws_precip_allpoint.tar.gz
Rainfall in Charlottesville
precip[512, 944]
RESPONSE:
218 (1/100 mm) = 0.09 inches
Over what duration?
• Recall that the dateofdata attribute = 2014111712Z
– or November 17, 2014 12:00 UTC
– or November 17, 2014 07:00 EST
NWS AHPS Documentation
From
http://water.weather.gov/precip/about.php:
"Observed" data is expressed as a 24-hour total
ending at 1200 Z (same as Greenwich Mean
Time, or GMT), with longer periods simply being
a summation of multiple 24-hour periods. 1200
GMT is used as the ending time for a 24-hour
total, because it is the end of the "hydrologic
day", a standard used in river modeling.
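Since each value is a 24-hour total ending at the dateofdata time, the accumulation window can be worked out from the dateofdata string. The sketch below assumes the string always has the form YYYYMMDDHH followed by Z, as in 2014111712Z, and uses a fixed UTC-5 offset for EST (no daylight saving handling). The function name observation_window is made up for this example. Python 3 syntax is used.

```python
from datetime import datetime, timedelta

EST_OFFSET = timedelta(hours=-5)  # EST is UTC minus 5 hours

def observation_window(date_of_data):
    """Return (start, end) in UTC for the 24-hour total ending at date_of_data."""
    end = datetime.strptime(date_of_data, "%Y%m%d%HZ")
    start = end - timedelta(hours=24)
    return start, end

start_utc, end_utc = observation_window("2014111712Z")
print("UTC:", start_utc, "to", end_utc)
print("EST:", start_utc + EST_OFFSET, "to", end_utc + EST_OFFSET)
```

For 2014111712Z the window runs from November 16, 2014 12:00 UTC to November 17, 2014 12:00 UTC, which is 07:00 EST on both days.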
Conclusion
• According to the NWS AHPS product, about 0.09 inches
(2.18 mm) of precip fell at the Charlottesville, VA Airport
during the period November 16, 2014 07:00 EST
to November 17, 2014 07:00 EST
• Checking our answer: Observed data at
Charlottesville Airport is available here:
http://w1.weather.gov/obhistory/KCHO.html
– According to this site, the airport received about 0.03
inches between November 16, 2014 07:00 EST and
November 17, 2014 06:53 EST.
Challenge Problem
• Create a Python script that:
1. Downloads a precip *.nc file from NWS
2. Extracts the precip for a given cell
3. Prints the precip value along with the date of
observation
I put my solution on the GitHub site:
https://github.com/goodalljl/hydroinformatics_class/blob/master/Class23_GetPrecipForCville.py
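For orientation, here is one way the three steps could be organized. This is a sketch only and is not the solution posted on GitHub. The function names download, read_cell, and format_report are made up for this example. It assumes the netCDF4 package is installed and that the file has the amountofprecip variable with a dateofdata attribute, as in the class file. Python 3 syntax is used.

```python
def download(url, path):
    """Step 1: save the file at url to path (needs network access)."""
    from urllib.request import urlretrieve
    urlretrieve(url, path)

def read_cell(path, y, x):
    """Step 2: return (raw value, dateofdata) for cell [y, x]."""
    import netCDF4  # third-party; imported here so the rest runs without it
    ds = netCDF4.Dataset(path, "r")
    try:
        var = ds.variables["amountofprecip"]
        return int(var[y, x]), str(var.dateofdata)
    finally:
        ds.close()

def format_report(raw_value, date_of_data):
    """Step 3: build the line to print; raw values are in 1/100 mm."""
    if raw_value < 0:
        return "%s: no data" % date_of_data
    return "%s: %.2f mm" % (date_of_data, raw_value / 100.0)

# Formatting demo using the Charlottesville value from the class example.
print(format_report(218, "2014111712Z"))
```

With the class file in your working directory, read_cell("nws_precip_conus_20141117.nc", 512, 944) should return the Charlottesville value and date shown earlier, and passing both to format_report gives the printed line.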
Next Class
• We will work in class to extend your Python script to
repeat for multiple days to derive a precipitation time
series for a location
• I will briefly introduce OPeNDAP
(http://en.wikipedia.org/wiki/OPeNDAP) as a means
for remote (server-side) subsetting of NetCDF files
• NOTE: There will not be a HW assignment for this
material. We want you to be spending your time on
your class projects instead.
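As a preview of the multi-day extension mentioned above, the loop over days can be kept separate from the code that reads each file. In this sketch, build_time_series takes any function that returns a value for a date. The stand-in reader below returns made-up numbers purely so the example runs; a real reader would download and read that day's file. All names are made up for this example, and the code uses Python 3 syntax.

```python
import datetime

def daily_dates(start, end):
    """Yield each date from start through end, inclusive."""
    day = start
    while day <= end:
        yield day
        day += datetime.timedelta(days=1)

def build_time_series(start, end, get_precip):
    """Return a list of (date, value); get_precip maps a date to a value."""
    return [(day, get_precip(day)) for day in daily_dates(start, end)]

def stand_in_reader(day):
    """Placeholder only: returns a made-up number, not real precipitation."""
    return float(day.day)

series = build_time_series(datetime.date(2014, 11, 15),
                           datetime.date(2014, 11, 17),
                           stand_in_reader)
for day, value in series:
    print(day.isoformat(), value)
```

Keeping the reader as a separate function means the same loop works whether the values come from downloaded files or from a remote subsetting service.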