Transcript CUAHSI_HIS

CUAHSI, WATERS and HIS
by Richard P. Hooper, David G. Tarboton and David R. Maidment
The Need: Hydrologic Information Science
It is as important to represent hydrologic environments precisely with data as it is to represent hydrologic processes with equations.
Physical laws and principles
(Mass, momentum, energy, chemistry)
Hydrologic Process Science
(Equations, simulation models, prediction)
Hydrologic conditions
(Fluxes, flows, concentrations)
Hydrologic Information Science
(Observations, data models, visualization)
Hydrologic environment
(Dynamic earth)
Abstractions in Modeling
[Figure labels: Real World; Physical World; "Digital Environment"; Conceptual Frameworks (theory/process knowledge, perceptions of this place, intuition); Model Representations (mathematical formulae, solution techniques); Data Representation (geographically referenced mapping); Hypothesis Testing; Measurements]
[Disciplines and data shown for a glaciated valley: Hydrologist, Geomorphologist, Aquatic Ecologist, Biogeochemist; water quantity and quality, meteorology, remote sensing, vegetation survey, DNA sequences]
[Questions posed in the figure: Snowmelt processes? Groundwater contribution? DOC quality? Oligotrophic? Hyporheic exchange? Carbon source? Redox zones? Q, gradient, roughness? Substrate size, stability? Thalweg? Well sorted? Mineralogy? Chemistry? Other labels: perifluvial, backwater habitat, benthic community]
Abstractions in Modeling
How do different disciplines view the same place?
“Digital Environment”
Digital Environment
• Use GIS to explicitly map conceptual
model to real digital representation
– What do data represent to scientist?
• Assess utility of data to support multiple
conceptual models
• Pilot Projects:
– WATERS Test beds: Digital watersheds
– Critical Zone Observatories
Advancement of water science is critically
dependent on integration of water information
Databases: Structured data sets to facilitate data integrity and effective sharing and analysis.
- Standards
- Metadata
- Unambiguous interpretation
Analysis: Tools to provide windows into the database to support visualization, queries, analysis, and data driven discovery.
Models: Numerical implementations of hydrologic theory to integrate process understanding, test hypotheses and provide hydrologic forecasts.
[Diagram labels: Databases; Analysis; Models; ODM; Web Services]
Water Data
Water quantity
and quality
Soil water
Meteorology
Remote sensing
Rainfall & Snow
Modeling
CUAHSI Observations Data Model
• A relational database at the single observation level (atomic model)
• Stores observation data made at points
• Metadata for unambiguous interpretation
• Traceable heritage from raw measurements to usable information
• Standard format for data sharing
• Cross dimension retrieval and analysis
[Figure labels: Streamflow; Precipitation & Climate; Water Quality; Groundwater levels; Soil moisture data; Flux tower data]
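The idea of a relational database at the single observation level can be sketched with a few tables. The schema below is a heavily simplified, hypothetical illustration: the table and column names are assumptions made for this example and are not the actual ODM definition. The single row uses the discharge value quoted later in this deck (206 cfs, 13 August 2006, Neuse River near Clayton, NC).

```python
import sqlite3

# Hypothetical, simplified schema: one row in data_values per observation.
# Table and column names are illustrative only, not the real ODM schema.
SCHEMA = """
CREATE TABLE sites (
    site_id   INTEGER PRIMARY KEY,
    site_name TEXT NOT NULL
);
CREATE TABLE variables (
    variable_id   INTEGER PRIMARY KEY,
    variable_name TEXT NOT NULL,
    units         TEXT NOT NULL
);
CREATE TABLE data_values (
    value_id     INTEGER PRIMARY KEY,
    site_id      INTEGER NOT NULL REFERENCES sites(site_id),
    variable_id  INTEGER NOT NULL REFERENCES variables(variable_id),
    date_time    TEXT NOT NULL,
    data_value   REAL NOT NULL,
    qualifier    TEXT,
    offset_value REAL
);
"""


def build_demo_db():
    """Create an in-memory database holding a single observation."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO sites VALUES (1, 'Neuse River near Clayton, NC')")
    conn.execute("INSERT INTO variables VALUES (1, 'Discharge', 'cfs')")
    conn.execute(
        "INSERT INTO data_values (site_id, variable_id, date_time, data_value) "
        "VALUES (1, 1, '2006-08-13', 206.0)"
    )
    conn.commit()
    return conn


def describe_values(conn):
    """Join each observation to its site and variable metadata."""
    query = """
        SELECT s.site_name, v.variable_name, d.data_value, v.units, d.date_time
        FROM data_values AS d
        JOIN sites AS s ON s.site_id = d.site_id
        JOIN variables AS v ON v.variable_id = d.variable_id
        ORDER BY d.value_id
    """
    return conn.execute(query).fetchall()
```

Because every value carries its own site and variable keys, a query can cut across sites, variables, or time without knowing how the data were originally loaded.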
CUAHSI Observations Data Model
http://www.cuahsi.org/his/odm.html
Stage and Streamflow Example
ODM to Datacube
• A data cube is a database specifically for
data mining (OLAP)
– Organizes data along dimensions such
as time, site, or variable type
– Easy to group, filter, and aggregate
data in a variety of ways
– Simple aggregations such as sum, min,
or max can be pre-computed for speed
– Additional calculations such as median
can be computed dynamically
• SQL Server Analysis Services (SSAS)
provides the OLAP engine
• SQL Server Business Intelligence
Development Studio is used to define and
tune
• Excel and other client tools enable simple
browsing
Slide from Catharine van Ingen, Microsoft Research
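The data cube idea on this slide can be illustrated without an OLAP engine. The following is a minimal sketch in plain Python, not SSAS: the observations are made up, and the function names (`build_cube`, `roll_up`, `dynamic_median`) are hypothetical. It pre-computes sum, min, max and count per cell, and computes the median from raw values at query time.

```python
from collections import defaultdict
from statistics import median

# Made-up observations for illustration: (site, variable, year, value).
OBSERVATIONS = [
    ("A", "discharge", 2005, 10.0),
    ("A", "discharge", 2005, 30.0),
    ("A", "discharge", 2006, 20.0),
    ("B", "discharge", 2005, 5.0),
    ("B", "turbidity", 2006, 7.0),
    ("B", "turbidity", 2006, 9.0),
]

DIMENSIONS = ("site", "variable", "year")


def build_cube(observations):
    """Group values by (site, variable, year) and pre-compute cheap aggregates."""
    cells = defaultdict(list)
    for site, variable, year, value in observations:
        cells[(site, variable, year)].append(value)
    aggregates = {
        key: {"sum": sum(vals), "min": min(vals), "max": max(vals), "count": len(vals)}
        for key, vals in cells.items()
    }
    return dict(cells), aggregates


def _matches(key, fixed):
    """True if the cell key agrees with every fixed dimension value."""
    coords = dict(zip(DIMENSIONS, key))
    return all(coords[name] == wanted for name, wanted in fixed.items())


def roll_up(aggregates, **fixed):
    """Combine pre-computed cell aggregates (assumes at least one matching cell)."""
    matching = [agg for key, agg in aggregates.items() if _matches(key, fixed)]
    return {
        "sum": sum(a["sum"] for a in matching),
        "min": min(a["min"] for a in matching),
        "max": max(a["max"] for a in matching),
        "count": sum(a["count"] for a in matching),
    }


def dynamic_median(cells, **fixed):
    """Medians cannot be merged from per-cell medians, so use the raw values."""
    values = [v for key, vals in cells.items() if _matches(key, fixed) for v in vals]
    return median(values)
```

For example, `roll_up(aggregates, site="A")` merges the stored cell summaries for site A, while `dynamic_median(cells, site="A")` has to revisit the underlying values.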
ODM and HIS in an Observatory Setting
Integration of Sensor Data With HIS
[Diagram labels: Sensors; Telemetry Network; Base Station Computer(s); Data Processing Applications; Observations Database (ODM); Workgroup HIS Server; Workgroup HIS Tools; Internet]
Sensors, data collection, and telemetry network
Programmer interaction through web services
Data discovery, visualization, analysis, and modeling through Internet enabled applications
Integrated Monitoring System
[Diagram labels: Sensors (Streamflow, Water Quality, Climate); Wet Chemistry Measurements; Telemetry Network; Sensor Bayes Network; Constituent Bayes Net; Nutrient Estimates; Central Observations Database; Exogenous Variables (GIS, Land Use, Management)]
Bayesian Networks to control monitoring system, triggering sampling for storm events and base flow
Site specific correlations between sensor signals and other water quality variables
Bayesian Networks to construct water quality measures from surrogate sensor signals to provide high frequency estimates of water quality and loading
CUAHSI HIS ODM – central storage and management of observations data
End result: high frequency estimates of nutrient concentrations and loadings
[Plot: Little Bear River at Mendon Road (4905000); Total Suspended Solids (mg/L) versus Turbidity (NTU); y = 2.3761x, R2 = 0.6993]
[Plot: Residue Total Nonfiltrable; mg/L versus Date (1980 to 2000)]
[Plot: Little Bear River Near Avon (4905700); Total Suspended Solids (mg/L); y = 2.6882x + 1.8492, R2 = 0.8641]
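The site specific correlations on this slide can be turned into a small calculation. The sketch below applies only the two linear fits printed on the plots; it is not the Bayesian network the slide describes. It assumes that x is turbidity (NTU) and y is total suspended solids (mg/L), which is inferred from the axis labels. The function names and the load conversion from cfs and mg/L to kg/day are additions for illustration.

```python
# Regression coefficients as printed on the slide's scatter plots.
# Assumption: x is turbidity (NTU), y is total suspended solids (mg/L).
SURROGATE_FITS = {
    "4905000": {"name": "Little Bear River at Mendon Road", "slope": 2.3761, "intercept": 0.0},
    "4905700": {"name": "Little Bear River Near Avon", "slope": 2.6882, "intercept": 1.8492},
}

CFS_TO_LITERS_PER_SECOND = 28.316846592  # one cubic foot expressed in liters
SECONDS_PER_DAY = 86400
MG_PER_KG = 1_000_000


def estimate_tss(site_code, turbidity_ntu):
    """Estimate total suspended solids (mg/L) from a turbidity reading."""
    fit = SURROGATE_FITS[site_code]
    return fit["slope"] * turbidity_ntu + fit["intercept"]


def tss_load_kg_per_day(tss_mg_per_l, discharge_cfs):
    """Convert a concentration and a discharge into a daily load in kg/day."""
    liters_per_day = discharge_cfs * CFS_TO_LITERS_PER_SECOND * SECONDS_PER_DAY
    return tss_mg_per_l * liters_per_day / MG_PER_KG
```

Applied to every turbidity reading from a sensor, a fit of this kind yields a concentration estimate at the sensor's sampling frequency, which is the sense in which the surrogate provides high frequency estimates.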
Managing Data Within ODM - ODM Tools
• Load – import existing
data directly to ODM
• Query and export –
export data series and
metadata
• Visualize – plot and
summarize data
series
• Edit – delete, modify,
adjust, interpolate,
average, etc.
Linking GIS and Water Resources
GIS
Water
Resources
Hydrologic Information System
GIS – the water environment
Water Resources – the water itself
Point Observations Information Model
http://www.cuahsi.org/his/webservices.html
Data Source: USGS
Network: Streamflow gages
Sites: Neuse River near Clayton, NC
Variables: Discharge, stage (Daily or instantaneous)
Values: 206 cfs, 13 August 2006 {Value, Time, Qualifier, Offset}
Web service methods: GetSites, GetSiteInfo, GetVariables, GetVariableInfo, GetValues
• A data source operates an observation network
• A network is a set of observation sites
• A site is a point location where one or more variables are measured
• A variable is a property describing the flow or quality of water
• A value is an observation of a variable at a particular time
• A qualifier is a symbol that provides additional information about the value
• An offset allows specification of measurements at various depths in water
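The definitions above describe a simple containment hierarchy, which can be sketched with nested data classes. The class and field names below are assumptions chosen to mirror the slide's vocabulary; they are not taken from the HIS software. The example instance reuses the slide's own USGS streamflow example.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Value:
    """One observation of a variable at a particular time."""
    value: float
    time: str
    qualifier: Optional[str] = None  # symbol giving extra information about the value
    offset: Optional[float] = None   # e.g. depth in water at which it was measured


@dataclass
class Variable:
    """A property describing the flow or quality of water."""
    name: str
    units: str
    values: List[Value] = field(default_factory=list)


@dataclass
class Site:
    """A point location where one or more variables are measured."""
    name: str
    variables: List[Variable] = field(default_factory=list)


@dataclass
class Network:
    """A set of observation sites."""
    name: str
    sites: List[Site] = field(default_factory=list)


@dataclass
class DataSource:
    """An organization that operates an observation network."""
    name: str
    networks: List[Network] = field(default_factory=list)


def build_example():
    """Build the slide's example: 206 cfs on 13 August 2006."""
    discharge = Variable("Discharge", "cfs", [Value(206.0, "2006-08-13")])
    site = Site("Neuse River near Clayton, NC", [discharge])
    return DataSource("USGS", [Network("Streamflow gages", [site])])


def count_values(source):
    """Count every observation held under a data source."""
    return sum(
        len(variable.values)
        for network in source.networks
        for site in network.sites
        for variable in site.variables
    )
```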
WaterML and WaterOneFlow
[Diagram: a Client sends Locations, Variable Codes and Date Ranges through GetSiteInfo, GetVariableInfo and GetValues to the WaterOneFlow Web Service, which draws on Data Repositories (STORET, NAM, NWIS) and returns Data as WaterML; steps labeled EXTRACT, TRANSFORM, LOAD]
WaterML is an XML language for communicating water data
WaterOneFlow is a set of web services based on WaterML
WaterOneFlow
• Set of query functions
• Returns data in WaterML
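To show what consuming an XML response from such a service might look like on the client side, here is a minimal sketch. The XML below is a simplified, WaterML-like document written for this example: its element and attribute names are assumptions and do not reproduce the real WaterML schema. The `parse_values` helper is likewise hypothetical.

```python
import xml.etree.ElementTree as ET

# Simplified, WaterML-like response. Element and attribute names are
# illustrative assumptions, not the actual WaterML schema.
SAMPLE_RESPONSE = """
<timeSeriesResponse>
  <timeSeries>
    <siteName>Neuse River near Clayton, NC</siteName>
    <variableName>Discharge</variableName>
    <units>cfs</units>
    <values>
      <value dateTime="2006-08-13">206</value>
    </values>
  </timeSeries>
</timeSeriesResponse>
"""


def parse_values(xml_text):
    """Flatten each <value> element into a record with its site and variable."""
    root = ET.fromstring(xml_text.strip())
    records = []
    for series in root.iter("timeSeries"):
        site = series.findtext("siteName")
        variable = series.findtext("variableName")
        units = series.findtext("units")
        for node in series.iter("value"):
            records.append(
                {
                    "site": site,
                    "variable": variable,
                    "units": units,
                    "time": node.get("dateTime"),
                    "value": float(node.text),
                }
            )
    return records
```

Because every repository's data arrive in one XML format, the same parsing code can serve all of them, which is the point of a common language for communicating water data.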
WATERS Network Information System
Utah State
University
HIS
Servers
National HIS Server at
San Diego Supercomputer Center
Texas A&M
Corpus Christi
NSF has funded work at 11 testbed sites,
each with its own science agenda. A CUAHSI
Hydrologic Information Server is installed at each site.
Multiscale Information System
• Global data
• National data
• State data
• Project in region ….
• Principal investigator
data
Corpus Christi Bay WATERS Testbed site
NCDC station
TCEQ stations
TCOON stations
Hypoxic Regions
Montagna stations
USGS gages
SERF stations
National Datasets (National HIS)
USGS
NCDC
Regional Datasets (Testbed HIS)
TCOON
Dr. Paul Montagna
TCEQ
SERF
Hydrologic Information Server
• Supports data discovery,
delivery and publication
– Data discovery – how do I
find the data I want?
• Map interface and
observations catalogs
• Metadata based Search
– Data delivery – how do I
acquire the data I want?
• Use web services or
retrieve from local
database
– Data Publication – how do I
publish my observation
data?
• Use Observations Data
Model
Observation Stations Map for the US
Ameriflux Towers (NASA & DOE)
NOAA Automated Surface
Observing System
USGS National Water Information System
NOAA Climate Reference Network
http://river.sdsc.edu/DASH
Observations Catalog
Specifies what variables are measured at each site, over what time interval,
and how many observations of each variable are available
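A catalog of this kind can be derived directly from the observations themselves. The sketch below is an illustration under stated assumptions: the site codes, variable names and values are made up, dates are ISO strings so that string comparison matches date order, and `build_catalog` is a hypothetical helper rather than part of the HIS software.

```python
# Made-up observations for illustration: (site, variable, date, value).
OBSERVATIONS = [
    ("site_1", "discharge", "2006-08-13", 120.0),
    ("site_1", "discharge", "2006-08-14", 115.0),
    ("site_1", "stage", "2006-08-13", 2.1),
    ("site_2", "discharge", "2006-08-12", 55.0),
    ("site_1", "discharge", "2006-08-12", 130.0),
]


def build_catalog(observations):
    """Summarize, per (site, variable), the time interval and observation count."""
    catalog = {}
    for site, variable, date, _value in observations:
        key = (site, variable)
        if key not in catalog:
            catalog[key] = {"begin": date, "end": date, "count": 1}
        else:
            entry = catalog[key]
            # ISO-formatted dates sort correctly as plain strings.
            entry["begin"] = min(entry["begin"], date)
            entry["end"] = max(entry["end"], date)
            entry["count"] += 1
    return catalog
```

A client can consult such a summary to decide whether a site is worth querying before requesting any values.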
Hydrologic Information Server
WaterOneFlow services
DASH – data access system for hydrology
GetSites
GetSiteInfo
GetVariables
GetVariableInfo
GetValues
ArcGIS Server
Observations Data
Geospatial Data
Microsoft SQLServer Relational Database
Data Heterogeneity
• Syntactic mediation
– Heterogeneity of format
– Use WaterML to get data
into the same format
• Semantic mediation
– Heterogeneity of meaning
– Each water data source
uses its own vocabulary
– Match these up with a
common controlled
vocabulary
– Make standard scientific
data queries and have
these automatically
translated into specific
queries on each data
source
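Semantic mediation as described above amounts to a lookup from a common controlled vocabulary to each source's own terms. The following is a minimal sketch under stated assumptions: the source names, the concept names and the local variable codes are all hypothetical, and `translate_query` is an illustrative helper, not the mediator used in HIS.

```python
# Hypothetical mapping: controlled-vocabulary concept -> each source's own code.
# Source names and codes are invented for illustration.
CONTROLLED_TO_SOURCE = {
    "source_a": {"streamflow": "Q_CFS", "water_temperature": "TEMP_C"},
    "source_b": {"streamflow": "flow_rate"},
}


def translate_query(concept, site, start, end):
    """Turn one standard query into a specific query for each source that
    measures the requested concept."""
    queries = []
    for source, vocabulary in sorted(CONTROLLED_TO_SOURCE.items()):
        local_code = vocabulary.get(concept)
        if local_code is None:
            continue  # this source has no term for the concept
        queries.append(
            {
                "source": source,
                "variable": local_code,
                "site": site,
                "start": start,
                "end": end,
            }
        )
    return queries
```

A single call such as `translate_query("streamflow", "S1", "2006-01-01", "2006-12-31")` yields one request per source, each phrased in that source's vocabulary.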
Objective
• Search multiple heterogeneous data sources simultaneously regardless of semantic or structural differences between them
What we are doing now …..
[Diagram: separate request and return exchanges with each data source: NWIS, NAWQA, NAM-12, NARR]
Michael Piasecki
Drexel University
What we would like to do …..
[Diagram labels: generic request; Semantic Mediator; GetValues; NWIS; NAWQA; NARR; HODM]
Michael Piasecki
Drexel University
HydroSeek: http://www.hydroseek.org