2016-01-08_ESIP_Signell_CSW_Workflows

Catalog-driven workflows using CSW
Rich Signell, USGS, Woods Hole, MA, USA
Filipe Fernandes, SECOORA, Brazil
Kyle Wilcox, Axiom Data Science, Wickford, RI, USA
ESIP Winter Meeting, Washington, DC
2016-01-08
The 4th Network Layer: Data
Data
Web
TCP/IP
Ethernet
• “We need an end-to-end, layer-by-layer,
designed information technology … that are
composed of no more than a stack of protocols”
• “We need open standards… and above all, we
need to teach scientists to work in this new layer
of data”
From the essay: “I have seen the Paradigm Shift, and It Is Us”,
by John Wilbanks, in the book “The Fourth Paradigm”
US Integrated Ocean Observing System (IOOS®)
• Global Component
• Coastal Component
– 17 Federal Agencies
– 11 Regional Associations
IOOS Core Principles
• Adopt open standards & practices
• Avoid customer-specific stovepipes
• Standardized access services implemented at
data providers
[Diagram: Observations, Models, Data Provider, Web access service, Customer]
Numerical model Output
Time Series, Trajectories
Meteorology and Wave Buoy in the Gulf
of Maine. Image courtesy of NOAA.
Ocean Glider. Photo by Dave Fratantoni,
Woods Hole Oceanographic Institution
IOOS Data Infrastructure Diagram
[Diagram: IOOS data infrastructure, with these labeled components]
• Nonstandard model output data files: ROMS, NCOM, HYCOM, SELFE, ADCIRC, FVCOM (each with NcML)
• Nonstandard observed data files: buoy, gauge, ADCP, glider (with NcML)
• Other label: Rectilinear
• THREDDS Data Server, NetCDF-Java, Common Data Model
• Standardized (CF-1.6, SGRID-0.1, UGRID-0.9) virtual datasets: Grid, TimeSeries, Profile, Trajectory, TimeSeriesProfile, Sgrid, Ugrid
• Web Services: OPeNDAP, WCS, SOS, WMS, NetCDF Subset, ncISO
• Catalog Services: pycsw, Geoportal Server, GeoNetwork, CKAN; Catalog Search
• Library or Broker: NetCDF-Java, NetCDF4-Python, ERDDAP
• Clients: Matlab, IDV, Panoply, ArcGIS, EDC, Python, Web Portals
Interoperable Access in Python (Iris)
IOOS System Test
2015 Boston Light Swim
2015 Aug 15, 7:00 am start
8 mile swim
No wet suit
How cold will the water be?
NECOFS Massbay Forecast
Reproducible Jupyter Notebook
Go to https://github.com/ocefpaf/boston_light_swim and click on “launch binder” to run it in the cloud
Final Result
pycsw
Workflow for the USGS CMG Portal
Workflow (3/3)
Axiom Data Science
– Runs a CSW search (in a cron job) on the
modeling groups' pycsw services, filtering on
datasets that contain a project called
“CMG_Portal”
– Datasets that have valid WMS services are
added to the portal
See <https://github.com/USGS-CMG/usgs-cmgportal/wiki> for details of the workflow
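The harvest step described above can be sketched in Python. This is a minimal illustration under stated assumptions, not the portal's actual code: it builds a CSW 2.0.2 GetRecords request body that searches for the text “CMG_Portal”, then filters already-parsed records down to those that advertise a WMS link. The function names, the choice of the `csw:AnyText` queryable, and the record layout (a dict with a `references` list of `scheme`/`url` pairs) are hypothetical. Checking for a WMS link is only a stand-in for whatever "valid WMS" test the real workflow applies.

```python
import xml.etree.ElementTree as ET

CSW_NS = "http://www.opengis.net/cat/csw/2.0.2"
OGC_NS = "http://www.opengis.net/ogc"


def build_getrecords_body(project, max_records=100):
    """Return a CSW 2.0.2 GetRecords XML body matching `project` as free text.

    Assumption: the project name is findable through the csw:AnyText
    queryable; the real workflow may filter on a different property.
    """
    ET.register_namespace("csw", CSW_NS)
    ET.register_namespace("ogc", OGC_NS)
    root = ET.Element(
        f"{{{CSW_NS}}}GetRecords",
        {
            "service": "CSW",
            "version": "2.0.2",
            "resultType": "results",
            "maxRecords": str(max_records),
        },
    )
    query = ET.SubElement(root, f"{{{CSW_NS}}}Query", {"typeNames": "csw:Record"})
    ET.SubElement(query, f"{{{CSW_NS}}}ElementSetName").text = "full"
    constraint = ET.SubElement(query, f"{{{CSW_NS}}}Constraint", {"version": "1.1.0"})
    ogc_filter = ET.SubElement(constraint, f"{{{OGC_NS}}}Filter")
    like = ET.SubElement(
        ogc_filter,
        f"{{{OGC_NS}}}PropertyIsLike",
        {"wildCard": "*", "singleChar": "?", "escapeChar": "\\"},
    )
    ET.SubElement(like, f"{{{OGC_NS}}}PropertyName").text = "csw:AnyText"
    ET.SubElement(like, f"{{{OGC_NS}}}Literal").text = f"*{project}*"
    return ET.tostring(root, encoding="unicode")


def has_wms(record):
    """True if any reference in the (hypothetical) record dict looks like WMS."""
    return any(
        "wms" in (ref.get("scheme") or "").lower()
        for ref in record.get("references", [])
    )


def datasets_with_wms(records):
    """Keep only the records that advertise a WMS endpoint."""
    return [rec for rec in records if has_wms(rec)]
```

In a cron job, the request body would be POSTed to each pycsw endpoint and the response parsed into records before filtering; those steps are omitted here because they depend on the deployment.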
WMS-driven Model Viewing Portal
Interoperable access in Matlab (nctoolbox)
Catalog-driven dynamic portals
Benefits of catalog-driven applications
• Dynamically adapt to new or changing data
• Find the machine-to-machine issues
– Easy problems that can be fixed in minutes to days
– Harder problems to guide future work
• Fixes for your workflow benefit everyone
• Build success stories
• Create reproducible workflows that others can learn from,
expand on, or transform
• Standardized workflows help develop the 4th network layer
for data