Time Series Plot from the Data Portal


CUAHSI-Hydrologic Information
Systems
UCAR
• CUAHSI – Consortium of
Universities for the
Advancement of
Hydrologic Science, Inc
• Formed in 2001 as a
legal entity
• Program office in
Washington (5 staff)
• Supported by the
National Science
Foundation
Unidata
Atmospheric
Sciences
Earth
Sciences
Ocean
Sciences
CUAHSI
HIS
National Science Foundation
Geosciences Directorate
CUAHSI Member Institutions
115 Universities as of August 2006
CUAHSI Mission: To provide infrastructure
and services to advance the development of
hydrologic science and education
Hydrologic
Synthesis
Additional
Hypotheses
Multi-Disciplinary
Teams
Data
Hypotheses
Community
Support
Needs
Measurement
Technology
Technological
Advances
Hydrologic
Observatories
Tools
Models
Data
Community
Support
Hydrologic
Information
Systems
Exogenous Data
Common Vision: WATERS Network
Observatories/
Environmental
Field Facilities
Informatics
Server
domainDNS
Disk array
Sensors and Measurement Facility
Workstation
Synthesis
$Q(x) = \int_{CA} r(x)\,dx$
A combined CLEANER-CUAHSI effort
Definition
The CUAHSI Hydrologic Information
System (HIS) is a geographically
distributed network of hydrologic data
sources and functions that are
integrated using web services so that
they function as a connected whole.
Goals
• better Data Access
• support for Hydrologic
Observatories
• advancement of Hydrologic Science
• enabling Hydrologic Education
CUAHSI HIS Project Team
Project co-PI
Collaborator
CUAHSI Hydrologic Information System
1. Assemble
data from many
sources
Experiments
Monitoring
Information Sources
Remote sensing
GIS
Climate models
2. Integrate data
into a coherent
structure
3. Do science
Hydrologic Information Model
Modeling, Analysis
and Visualization
Hypothesis testing
Statistics Data Simulation
Assimilation
HIS User Assessment
• First survey done for HIS White Paper
(2003)
• HIS Symposium in March – 4 institutional
surveys and a survey of participants
• CUAHSI Web Surveyor – online
questionnaire (75 responses from 38
institutions)
• Summary paper
Value Score (counting 4 for first, 3 for
second, 2 for third and 1 for fourth).
Please rank these four HIS service categories for helping you.
Conclusion: Data services are the highest priority
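The value score used for this ranking question can be sketched as a weighted tally. This is only an illustration, assuming descending weights of 4, 3, 2 and 1; the category labels and the three responses below are hypothetical placeholders, not survey data.

```python
from collections import Counter

# Assumed weights: 4 points for a first-place rank down to 1 for fourth.
WEIGHTS = (4, 3, 2, 1)


def value_scores(rankings):
    """Sum weighted points per category over all respondents' rankings."""
    scores = Counter()
    for ranking in rankings:
        for place, category in enumerate(ranking):
            scores[category] += WEIGHTS[place]
    return dict(scores)


# Hypothetical responses: each list orders four placeholder categories
# from most helpful to least helpful.
responses = [
    ["data", "analysis", "modeling", "visualization"],
    ["data", "modeling", "analysis", "visualization"],
    ["analysis", "data", "visualization", "modeling"],
]
scores = value_scores(responses)
```

The category with the largest total is the one respondents valued most overall.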
% of time spent preparing data
Which operating systems do you use for your
research? If you use more than one operating
system, select all that apply.
Please indicate one dataset that you believe would
most benefit from increased ease of access through a
Hydrologic Information System (HIS).
EPA STORET Water Quality
USGS Streamflow
Remote Sensing data (e.g. LANDSAT, GOES, AVHRR)
NEXRAD Radar Precipitation
not applicable to my research
National Water Quality Assessment (NAWQA)
Conclusion: EPA
STORET Water
Quality, Streamflow
and Remote Sensing
Data are perceived to
be able to benefit from
improved access.
National Land Cover dataset (NLCD)
USGS Groundwater levels
Soils Data (STATSGO/SSURGO)
NCDC Precipitation
Climate Model Reanalysis data (e.g. NARR)
PRISM Precipitation data
NCDC Pan Evaporation
USGS National Geology data
National Hydrography Dataset
I am surprised USGS streamflow
is up there. Is this an indication of
importance over difficulty?
National Elevation Dataset and derivatives (EDNA)
SNOTEL
(Bar chart: percentage of responses per dataset; horizontal axis 0% to 18%.)
How we use software
(Austin Symposium)
Excel: 4.0
ArcGIS/ArcView: 3.9
FORTRAN: 3.3
C/C++: 3.2
Java: 2.7
MS Access: 2.6
Visual Basic: 2.5
Matlab: 2.4
SQL/Server: 2.4
Modflow: 2.3
Rating scale: 1 = Never use, 2 = Do not rely on, 3 = Use occasionally, 4 = Use often, 5 = Find indispensable
Value Score (counting 3 for first, 2
for second and 1 for third).
Which of the following data analysis difficulties are most
important for HIS to address?
Conclusion: High priorities
are:
- Data formats
- Metadata
- Irregular time steps
How we use software (Web
Surveyor)
• Programming (85% of respondents): Fortran,
C/C++, Visual Basic
• Data Management (93%): Excel, MS Access
• GIS (93%): ArcGIS
• Mathematics/Statistics (98%): Excel, Matlab,
SAS, variety of other systems
• Hydrologic models (80%): Modflow, HEC models
• A general, simple, standard, and open interface
that could connect with many systems is the only
way to accommodate all these
Water Data
Water quantity
and quality
Soil water
Meteorology
Remote sensing
Rainfall & Snow
Modeling
Water Data Web Sites
“Digital Watershed”
How can hydrologists integrate observed and
modeled data from various sources into a single
description of the environment?
A digital watershed is a synthesis of hydrologic observation
data, geospatial data, remote sensing data and weather
and climate data into a connected database for a hydrologic region
HDAS Web portal Interface
Information input, display, query and output services
Web services
interface
HTML -XML
WaterOneFlow
Web Services
e.g. USGS,
NCDC
WSDL - SOAP
3rd party
servers
Uploads
Downloads
Preliminary data exploration and discovery. See
what is available and perform exploratory analyses
Data access
through web
services
Data storage
through web
services
GIS
Matlab
IDL
Observatory
servers
SDSC HIS
servers
Splus, R
Excel
Programming
(Fortran, C, VB)
Applications and Services
Web application: Data Portal
Your application
• Excel, ArcGIS, Matlab
• Fortran, C/C++, Visual Basic
• Hydrologic model
• …………….
Your operating system
• Windows, Unix, Linux, Mac
Internet
Web Services
Library
CUAHSI Hydrologic Data Access System
http://river.sdsc.edu/HDAS
EPA
NCDC
NASA
NWS
Observatory Data
USGS
Arc Hydro
Server will be a
customization of
ArcGIS Server
9.2 for serving
water
observational
data
A common data window for accessing, viewing
and downloading hydrologic information
Utah State University Streamflow Analyst
Data Sources
NASA
Storet
Extract
Ameriflux
NCDC
Unidata
NWIS
NCAR
Transform
CUAHSI Web Services
Excel
Visual Basic
C/C++
ArcGIS
Load
Matlab
Applications
http://www.cuahsi.org/his/
Fortran
Access
Java
Some operational services
CUAHSI Hydrologic Information
System Levels
National HIS – San Diego Supercomputer Center
Map interface, observations catalogs and web services for
national data sources; integration of information from
workgroups
Workgroup HIS – research group or observatory
HIS Server
Map interface, observations catalogs and web services for
regional data sources; observations databases and web
services for individual investigator data
Personal HIS – an individual hydrologic scientist
Application templates and HydroObjects for direct ingestion of
data into analysis environments: Excel, ArcGIS, Matlab,
programming languages; MyDB for storage of analysis data
HIS Analyst
HIS Server
• Supports data discovery,
delivery and publication
– Data discovery – how do I
find the data I want?
• Map interface and
observations catalogs
• Metadata based Search
– Data delivery – how do I
acquire the data I want?
• Use web services or
retrieve from local
database
– Data Publication – how do I
publish my observation
data?
• Use Observations Data
Model
Observations Catalog
Specifies what variables are measured at each site, over what time interval,
and how many observations of each variable are available
HIS Server Architecture
• Map front end – ArcGIS
Server 9.2 (being
programmed by ESRI
Water Resources for
CUAHSI)
• Relational database –
SQL/Server 2005 or
Express
• Web services library –
VB.Net programs
accessed as a Web
Service Description
Language (WSDL)
National and Workgroup HIS
National HIS
National HIS has a polygon
in it marking the region of
coverage of a workgroup HIS
server
For HIS 1.0 the National and Workgroup HIS
servers will not be dynamically connected.
Workgroup HIS
Workgroup HIS has local
observations catalogs for
coverage of national data
sources in its region. These
local catalogs are partitioned
from the national observations
catalogs.
Hydrologic Science
It is as important to represent hydrologic environments precisely with
data as it is to represent hydrologic processes with equations
Physical laws and principles
(Mass, momentum, energy, chemistry)
Hydrologic Process Science
(Equations, simulation models, prediction)
Hydrologic conditions
(Fluxes, flows, concentrations)
Hydrologic Information Science
(Observations, data models, visualization)
Hydrologic environment
(Dynamic earth)
Continuous Space-Time Model –
NetCDF (Unidata)
Time, T
Coordinate
dimensions
{X}
D
Space, L
Variables, V
Variable dimensions
{Y}
Discrete Space-Time Data Model
ArcHydro
Time, TSDateTime
TSValue
Space, FeatureID
Variables, TSTypeID
HydroVolumes
Take a watershed and extrude it vertically into the atmosphere
and subsurface
A hydrovolume is “a volume in space through which water, energy
and mass flow, are stored internally, and transformed”
Watershed Hydrovolumes
Hydrovolume
Geovolume is the
portion of a hydrovolume
that contains solid
earth materials
USGS Gaging stations
Stream channel Hydrovolumes
Geospatial Time Series
Time Series
Properties
(Type)
Value
A Value-Time array
Time
Shape
A time series that knows what
geographic feature it describes
and what type of time series it is
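A geospatial time series as described here can be sketched as a value-time array that also carries a feature identifier and a type. This is a minimal illustration; the class name, field names and sample values are hypothetical and are not taken from Arc Hydro or any HIS library.

```python
from dataclasses import dataclass, field
from datetime import datetime


@dataclass
class GeospatialTimeSeries:
    feature_id: int   # geographic feature the series describes
    ts_type: str      # kind of series, e.g. "streamflow"
    times: list = field(default_factory=list)
    values: list = field(default_factory=list)

    def append(self, when, value):
        """Add one (time, value) pair to the value-time array."""
        self.times.append(when)
        self.values.append(value)

    def value_at(self, when):
        """Return the value recorded at exactly this time."""
        return self.values[self.times.index(when)]


# Hypothetical feature 101 with two made-up streamflow values.
series = GeospatialTimeSeries(feature_id=101, ts_type="streamflow")
series.append(datetime(2006, 8, 1), 12.5)
series.append(datetime(2006, 8, 2), 11.8)
```

Keeping the feature identifier and type on the series itself means any consumer can tell where the values apply and how to interpret them without a separate lookup.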
Terrain Data Models
Grid
TIN
Contour and flowline
Neuse Basin: Coastal aquifer system
Section line
Beaufort Aquifer
* From USGS, Water Resources Data Report of North Carolina for WY 2002
Neuse Groundwater
Geovolumes of hydrogeologic units
from US Geological survey (GMS)
Create a 3 dimensional
representation
Geovolume
Each cell in the 2D
representation is transformed
into a 3D object
Geovolume with model cells
The Demands
Numerical Models
Prediction
Air-Q
Sensor Arrays
HSPF
MM5
NCDC
METADATA
USGS
NWIS
NCEP
NWS
Data Centers
Drexel University, College of Engineering
NGDC
Individual
Samples
Hydrologic Metadata
Upper Hydrologic Ontology
We currently have
What we need is
Many More
ISO 19108 Temporal Objects
ARCHydro
Many More
ISO 19115 Geospatial
ISO 19103 Units/Conversion
USGS Hydrologic Unit Code
Hydrologic Processes
Sedimentation
Many More
Michael Piasecki is our
expert in this subject!
Many More
Ontology Examples
CUAHSI Observations Data Model
• A relational database
stored in Access,
PostgreSQL,
SQLServer, ….
• Stores observation
data made at points
• Access data through
web interfaces
• Fill using automated
data harvesting
Streamflow
Precipitation
& Climate
Water Quality
Groundwater
levels
Soil
moisture
data
Flux tower
data
Purposes
• Hydrologic Observations Data System to Enhance
– Retrieval
– Integrated Analysis
– Multiple Investigators
• Standard and Scalable Format for Sharing
• Ancillary information (metadata) to allow unambiguous
interpretation and use – incorporating uncertainty
• Traceable heritage from raw measurements to usable
information – quality control levels
Premise
• A relational database at the single observation level (atomic
model)
– Querying capability
– Cross dimension retrieval and analysis
Community Design Requirements
(from comments of 22 reviewers)
• Incorporate sufficient metadata to identify provenance
and give exact definition of data for unambiguous
interpretation
• Spatial location of measurements
• Scale of measurements
• Depth/Offset Information
• Censored data
• Classification of data type to guide appropriate
interpretation
– Continuous
– Indication of gaps
• Indicate data quality
Scale issues in the interpretation
of data
The scale triplet
a) Extent
b) Spacing
c) Support
From: Blöschl, G. (1996), Scale and Scaling in Hydrology, Habilitationsschrift, Wiener Mitteilungen Wasser Abwasser Gewässer, Wien, 346 p.
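For a time series, the scale triplet can be read as follows: extent is the total span covered by the samples, spacing is the interval between consecutive samples, and support is the averaging interval of each sample. The sketch below derives extent and a typical spacing from timestamps; using the median gap for spacing is an assumption made for this example, and support has to be supplied because it cannot be inferred from timestamps alone. The sample times are hypothetical.

```python
from datetime import datetime, timedelta
from statistics import median


def scale_triplet(times, support):
    """Return (extent, spacing, support) for a sorted list of sample times.

    extent  : total span covered by the samples
    spacing : median interval between consecutive samples (assumed choice)
    support : averaging interval of each sample (supplied, not derived)
    """
    gaps = [later - earlier for earlier, later in zip(times, times[1:])]
    return times[-1] - times[0], median(gaps), support


t0 = datetime(2006, 8, 1)
# Hypothetical samples at hours 0, 1, 2, 3 and 6 (one longer gap).
times = [t0 + timedelta(hours=h) for h in (0, 1, 2, 3, 6)]
extent, spacing, support = scale_triplet(times, timedelta(minutes=15))
```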
Hydrologic Observations Data Model
What are the basic attributes to be associated with each single observation and
how can these best be organized?
Data Source and
Network
Sites
Variables
Values
Metadata
Controlled
Vocabulary
Tables
e.g. mg/kg, cfs
e.g. depth
Streamflow
Depth of
snow
pack
Landuse, Vegetation
e.g. Non-detect,Estimated,
Windspeed, Precipitation
A data source operates an
observation network
A network is a set of observation
sites
A site is a point location
where one or more variables
are measured
A variable is a property
describing the flow or quality
of water
A value is an observation of a
variable at a particular time
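The site, variable and value relationships above can be sketched as a small relational schema. This is a heavily reduced illustration and not the actual ODM schema: the table layout, the `DataValues` and `DataValue` names, and all inserted rows are assumptions made for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Sites (
    SiteID INTEGER PRIMARY KEY, SiteCode TEXT, SiteName TEXT,
    Latitude REAL, Longitude REAL);
CREATE TABLE Variables (
    VariableID INTEGER PRIMARY KEY, VariableCode TEXT, VariableName TEXT);
-- One row per single observation (the "atomic" level).
CREATE TABLE DataValues (
    ValueID INTEGER PRIMARY KEY,
    SiteID INTEGER REFERENCES Sites(SiteID),
    VariableID INTEGER REFERENCES Variables(VariableID),
    DateTime TEXT, DataValue REAL);
""")

# Made-up site; the variable code and name repeat the slide's own example.
conn.execute("INSERT INTO Sites VALUES (1, 'S1', 'Example site', 40.0, -111.0)")
conn.execute("INSERT INTO Variables VALUES (1, '0060', 'discharge')")
conn.executemany(
    "INSERT INTO DataValues (SiteID, VariableID, DateTime, DataValue) "
    "VALUES (?, ?, ?, ?)",
    [(1, 1, "2006-08-01T00:00", 3.2), (1, 1, "2006-08-01T00:15", 3.4)],
)

# Each value is interpreted through its site and variable rows.
rows = conn.execute("""
    SELECT s.SiteCode, v.VariableName, d.DateTime, d.DataValue
    FROM DataValues d
    JOIN Sites s ON s.SiteID = d.SiteID
    JOIN Variables v ON v.VariableID = d.VariableID
    ORDER BY d.DateTime
""").fetchall()
```

Storing one row per observation is what allows querying across sites, variables and time in a single statement.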
Data Discovery
See http://www.cuahsi.org/his/documentation.html
Metadata provide information about the context of the observation.
Data Delivery
Ernest To
Center for Research in Water Resources
University of Texas at Austin
Independent of, but coupled to
Geographic Representation
Arc Hydro
HODM
(Diagram: the Hydrologic Observations Data Model coupled to the Arc Hydro geographic representation.)
Hydrologic Observations Data Model site attributes: SiteID, SiteCode, SiteName, Latitude, Longitude, …
CouplingTable: SiteID (GUID), HydroID (Integer)
Arc Hydro classes shown: Feature, MonitoringPoint, HydroPoint, HydroJunction, HydroEdge (Flowline, Shoreline), Watershed, Waterbody, HydroNetwork, ComplexEdgeFeature, SimpleJunctionFeature
Arc Hydro attributes shown: HydroID, HydroCode, FType, Name, JunctionID, AreaSqKm, ReachCode, LengthKm, LengthDown, FlowDir, EdgeType, Enabled, DrainID, NextDownID, DrainArea, AncillaryRole
NHDPlus as a starting point for
geographic representation
• Slope
• Elevation
• Mean annual flow
– Corresponding velocity
• Drainage area
• % of upstream
drainage area in
different land uses
• Stream order
Variable attributes
Units: cubic meters per second (m^3/s), dimensions L^3/T
VariableName, e.g. discharge
VariableCode, e.g. 0060
SampleMedium, e.g. water
ValueType, e.g. field observation, laboratory sample
IsRegular, e.g. Yes for regular or No for intermittent
TimeSupport (averaging interval for observation)
DataType, e.g. Continuous, Instantaneous, Categorical
GeneralCategory, e.g. Climate, Water Quality
NoDataValue, e.g. -9999
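The NoDataValue attribute matters because a sentinel such as -9999 will corrupt any statistic that treats it as a measurement. A minimal sketch, using made-up readings and a hypothetical helper name:

```python
NO_DATA_VALUE = -9999  # sentinel taken from the variable's NoDataValue


def mean_ignoring_no_data(values, no_data=NO_DATA_VALUE):
    """Average only the valid readings; return None if there are none."""
    valid = [v for v in values if v != no_data]
    return sum(valid) / len(valid) if valid else None


readings = [2.0, -9999, 4.0, 6.0, -9999]  # hypothetical series with gaps
naive_mean = sum(readings) / len(readings)    # polluted by the sentinel
clean_mean = mean_ignoring_no_data(readings)  # uses 2.0, 4.0 and 6.0 only
```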
Data Types
• Continuous (Frequent sampling - fine spacing)
• Instantaneous (Spot sampling - coarse spacing)
• Cumulative: $V(t) = \int_0^t Q(\tau)\,d\tau$
• Incremental: $\Delta V(t) = \int_t^{t+\Delta t} Q(\tau)\,d\tau$
• Average: $\bar{Q}(t) = \Delta V(t) / \Delta t$
• Maximum
• Minimum
• Constant over Interval
• Categorical
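The cumulative, incremental and average data types are related through integration of a rate over time. The sketch below approximates that relationship for regularly spaced discharge values, assuming the rate is constant within each interval (a rectangle rule chosen for this example); the function names and numbers are hypothetical.

```python
from itertools import accumulate


def incremental_volumes(q, dt):
    """Volume passed in each interval, holding Q constant over dt seconds."""
    return [qi * dt for qi in q]


def cumulative_volumes(q, dt):
    """Running total of the incremental volumes since the first interval."""
    return list(accumulate(incremental_volumes(q, dt)))


def interval_average(q, dt):
    """Average rate over the whole record: total volume / total time."""
    return sum(incremental_volumes(q, dt)) / (dt * len(q))


q = [2.0, 4.0, 6.0]  # hypothetical discharge in m^3/s
dt = 900             # interval length in seconds (15 minutes)
inc = incremental_volumes(q, dt)
cum = cumulative_volumes(q, dt)
avg = interval_average(q, dt)
```

Recording which of these types a series holds tells the user whether values may be summed, differenced or averaged.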
Groups and Derived From Associations
Stage and Streamflow Example
Daily Average Discharge Example
Daily Average Discharge Derived from 15 Minute Discharge Data
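Deriving a daily average series from 15 minute data can be sketched as grouping by calendar day. This is an illustration only: the records are synthetic, the helper name is hypothetical, and a real derivation would also need to handle gaps, time zones and the link back to the source series.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def daily_average(records):
    """Group (datetime, value) pairs by calendar day and average each day."""
    by_day = defaultdict(list)
    for when, value in records:
        by_day[when.date()].append(value)
    return {day: sum(vals) / len(vals) for day, vals in sorted(by_day.items())}


start = datetime(2006, 8, 1)
# Two synthetic days of 15 minute discharge: 96 values per day.
records = [
    (start + timedelta(minutes=15 * i), 10.0 if i < 96 else 20.0)
    for i in range(192)
]
daily = daily_average(records)
```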
Offset
OffsetValue
Distance from a datum or
control point at which an
observation was made
OffsetType defines the type of
offset, e.g. distance below
water level, distance above
ground surface, or distance
from bank of river
Water Chemistry from a profile in a lake
Methods and Samples
Method specifies the method whereby an observation is
measured, e.g. Streamflow using a V notch weir, TDS
using a Hydrolab, sample collected in auto-sampler
SampleID is used for observations based on the
laboratory analysis of a physical sample and identifies
the sample from which the observation was derived.
This keys to a unique LabSampleID (e.g. bottle number)
and name and description of the analytical method used
by a processing lab.
Accuracy and Precision
ObsAccuracyStdDev
Numeric value that expresses
measurement accuracy as the
standard deviation of each
specific observation
Observation Series
A Series is the set of all observations of a particular variable at
one place, i.e. with a unique SiteID. The ObservationSeriesCatalog is
programmatically generated so that a user can get simple descriptive
information about the variables observed at a location.
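Generating such a catalog can be sketched as one grouped query over the observation values, giving the first date, last date and count for each site and variable. The table layout, column names and rows below are assumptions for illustration and do not reproduce the real ObservationSeriesCatalog.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE DataValues "
    "(SiteID INTEGER, VariableCode TEXT, DateTime TEXT, DataValue REAL)"
)
# Made-up observations: two variables at site 1, one variable at site 2.
conn.executemany("INSERT INTO DataValues VALUES (?, ?, ?, ?)", [
    (1, "Q", "2006-08-01", 3.2),
    (1, "Q", "2006-08-02", 3.0),
    (1, "T", "2006-08-01", 14.1),
    (2, "Q", "2006-08-03", 7.5),
])

# One catalog row per (site, variable) series.
catalog = conn.execute("""
    SELECT SiteID, VariableCode, MIN(DateTime), MAX(DateTime), COUNT(*)
    FROM DataValues
    GROUP BY SiteID, VariableCode
    ORDER BY SiteID, VariableCode
""").fetchall()
```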
Data Quality
Qualifier Code and Description provide qualifying information
about the observations, e.g. Estimated, Provisional, Derived,
Holding time for analysis exceeded
QualityControlLevel records the level of quality control that the
data has been subjected to.
- Level 0. Raw Data
- Level 1. Quality Controlled Data
- Level 2. Derived Products
- Level 3. Interpreted Products
- Level 4. Knowledge Products
15 min
Precipitation from
NCDC
Irregularly sampled groundwater level
How Excel connects to ODM
Excel
• Obtains inputs for CUAHSI web methods from relevant cells.
• Available web methods are GetSiteInfo, GetVariableInfo and GetValues.
HydroObjects
• Parses user inputs into a standardized CUAHSI web method request.
CUAHSI Web service
• Converts the standardized request to a SQL query.
Observations Data Model
• Receives the SQL query and returns a response.
CUAHSI Web service
• Converts the response to standardized XML.
HydroObjects
• Converts the XML to a VB object.
Excel
• Imports the VB object and graphs it.
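The round trip described above (request, SQL query, XML response, client-side object) can be sketched end to end in a few lines. This is not the CUAHSI implementation: the function names, the table layout and the XML element names are invented for this example and do not follow the actual web service response format.

```python
import sqlite3
import xml.etree.ElementTree as ET


def handle_get_values(conn, site_code, variable_code, start, end):
    """Service side: turn a request into a SQL query and return XML text."""
    rows = conn.execute(
        "SELECT DateTime, DataValue FROM DataValues "
        "WHERE SiteCode = ? AND VariableCode = ? "
        "AND DateTime BETWEEN ? AND ? ORDER BY DateTime",
        (site_code, variable_code, start, end),
    ).fetchall()
    root = ET.Element("values", site=site_code, variable=variable_code)
    for when, value in rows:
        ET.SubElement(root, "value", dateTime=when).text = str(value)
    return ET.tostring(root, encoding="unicode")


def parse_values(xml_text):
    """Client side: turn the XML response into (dateTime, float) pairs."""
    root = ET.fromstring(xml_text)
    return [(v.get("dateTime"), float(v.text)) for v in root.findall("value")]


# Made-up observations database standing in for the data model.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE DataValues "
    "(SiteCode TEXT, VariableCode TEXT, DateTime TEXT, DataValue REAL)"
)
conn.executemany("INSERT INTO DataValues VALUES (?, ?, ?, ?)", [
    ("S1", "Q", "2006-08-01", 3.2),
    ("S1", "Q", "2006-08-02", 3.0),
    ("S1", "Q", "2006-09-01", 2.1),
])
xml_text = handle_get_values(conn, "S1", "Q", "2006-08-01", "2006-08-31")
series = parse_values(xml_text)
```

Because the client only ever sees the XML, the same service could feed Excel, Matlab or any other environment that can parse it.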
Example: Matlab use of CUAHSI Web Services
% Create the NWIS class from the WSDL and an instance of the class.
createClassFromWsdl('http://river.sdsc.edu/NWISTS/nwis.asmx?WSDL');
svsNWIS = NWIS;
xmlSites=GetSites(svsNWIS); % Could parse to identify sites to work with.
% Here specify a SiteID to use
SiteID='10109000';
% Call the GetSiteInfo function
xmlSiteInfo=GetSiteInfo(svsNWIS,SiteID)
% Parse the XML that is returned to learn the variables recorded there
structSiteInfo=parse_xml(xmlSiteInfo)
% … (non trivial)
% Call the GetVariableInfo function to get details about each variable
xmlVarInfo=GetVariableInfo(svsNWIS,varcodes(i));
structVarInfo=parse_xml(xmlVarInfo);
% Parse to write results to html file for display
% … (non trivial)
NWIS Site Information Generated using Web Services in
Matlab
Retrieve Data using GetValues
xmlVals=GetValues(svsNWIS,SiteID,varcodes(1),D1,D2);
% Parse the xml string that is returned into matrices and plot
strValues=parse_xml(xmlVals);
% … (non trivial)
plot(dn,Q);datetick;
(Plot: Q versus date, 1950 to 2010; vertical axis 0 to 2000.)
Conclusions
• HIS = a geographically distributed system of
web-connected data and functions
• Hydrologic Data Access System is a significant
technological innovation
• Emerging understanding of digital watershed
structure and functions
• Beginnings of hydrologic information science
and shared data models with neighboring
sciences
• Web services provide access to HIS capability
from within a user's preferred analysis
environment