Katrina Project

Cyberinfrastructure for Environmental Observing Systems
Chaitan Baru
Director, Science R&D
San Diego Supercomputer Center
India-US Indoflux Workshop, July 12-16, 2006, Chennai, India
“CYBERINFRASTRUCTURE”
What do we mean?
• Technologies to bring remote resources together
• A broad, systemic, strategic conceptualization
• e-science
• Social aspect: bringing multidisciplinary groups together
• Implies global (international) system for collaboration
Components of Cyberinfrastructure (Web Services)-enabled science & engineering
– High-performance computing for modeling, simulation, data processing/mining
– Humans
– Individual & Group Interfaces & Visualization
– Collaboration Services
– Instruments for observation and characterization
– Global Connectivity
– Physical World
– Facilities for activation, manipulation and construction
– Knowledge management institutions for collection building and curation of data, information, literature, digital objects
Source: Dan Atkins
http://www.communitytechnology.org/nsf_ci_report/
Environmental Observing Systems
• A major area of emphasis for NSF, and other agencies (e.g. GEOSS)
– LTER, www.lternet.org
– NEON, www.neoninc.org
– ORION, www.orionprogram.org
– Waters Network
• CUAHSI HIS, www.cuahsi.org/his
• CLEANER, cleaner.ncsa.uiuc.edu
– EarthScope, www.earthscope.org
– And many other efforts… (e.g. NEES, www.neesinc.org)
• Number of established efforts at other federal and state agencies
– USGS, EPA, DOE, …
Example of a Cyberinfrastructure Project
GEON: Geosciences Network
(www.geongrid.org)
• Funded by NSF IT Research program (~$11.5M)
• Multi-institution collaboration between IT and Earth
Science researchers
• GEON Cyberinfrastructure provides:
– Authenticated access to data and Web services
– Registration of data sets and tools, with metadata
– Search for data, tools, and services, using ontologies
– Scientific workflow environment
– Data and map integration capability
– Scientific data visualization and GIS mapping
Key Informatics Areas
• Portals
– Authenticated, role-based access to cyber resources: data, tools,
models, model outputs, collaboration spaces, …
• Data Integration
– Search, discover and integrate data from heterogeneous
information sources (“mediation” and “semantic integration”)
• Modeling and simulation environments based on
“scientific workflow” software
– Users can “program” and steer computations at a higher level of
programming abstraction
– Share models (not only data), and support generation and sharing
of provenance information
• Geospatial information and Geographic Information
Systems (GIS)
– Spatial statistics, spatiotemporal data mining
• Visualization of 2D, 2.5D, 3D, 4D data, and
multidimensional information spaces
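To make the "scientific workflow" and provenance bullets concrete, here is a minimal sketch in Python. It is not how GEON or any particular workflow system does it; the `Workflow` and `ProvenanceRecord` classes, the `digest` helper, and the two example steps are all hypothetical names invented for illustration. The idea shown is only that a workflow chains steps and records, for each step, what went in and what came out.

```python
import hashlib
import json
from dataclasses import dataclass, field
from typing import Any, Callable, List


@dataclass
class ProvenanceRecord:
    """One entry per executed step (hypothetical structure)."""
    step: str
    input_digest: str
    output_digest: str


def digest(value: Any) -> str:
    """Short, stable hash of a JSON-serializable value."""
    blob = json.dumps(value, sort_keys=True).encode("utf-8")
    return hashlib.sha256(blob).hexdigest()[:12]


@dataclass
class Workflow:
    steps: List[Callable[[Any], Any]] = field(default_factory=list)
    provenance: List[ProvenanceRecord] = field(default_factory=list)

    def add(self, step: Callable[[Any], Any]) -> "Workflow":
        self.steps.append(step)
        return self  # returning self allows chained .add() calls

    def run(self, data: Any) -> Any:
        for step in self.steps:
            result = step(data)
            # Record which step ran and fingerprints of its input and output.
            self.provenance.append(
                ProvenanceRecord(step.__name__, digest(data), digest(result))
            )
            data = result
        return data


def drop_missing(readings):
    """Remove missing (None) readings."""
    return [r for r in readings if r is not None]


def mean(readings):
    """Average of the remaining readings."""
    return sum(readings) / len(readings)


wf = Workflow().add(drop_missing).add(mean)
result = wf.run([2.0, None, 4.0])
for record in wf.provenance:
    print(record)
```

Because each record stores the digest of its input and output, the output digest of one step matches the input digest of the next, which is what lets a shared provenance trail be checked later.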
GEON: International Component
• India
– Collaboration with University of Hyderabad
• Profs. K.V. Subbarao & Arun Agarwal
• Deploying a GEON Node at UofHyd and an India-based portal
– Conducted GEON Cyberinfrastructure Workshop, Oct. 2005
– Recently announced as a Knowledge Networked R&D Center by Indo-US
Science and Technology Forum
• Will partner with institutions like NGRI, INCOIS, Wadia Institute of Himalayan
Geology (WIHG), Birbal Sahni Institute of Paleo-Botany (BSIPB)
• China
– Collaboration with Chinese Academy of Sciences, Beijing
• Dr. Yaolin Shi, Director, Chinese Geodynamics Lab; Dr. Baopin Yan, Director, CNIC
– GEON Cyberinfrastructure Workshop, July 20-23, 2006, Beijing.
– Deploy GEON node & a Linux cluster for developing parallel geodynamics
codes
• Japan
– Collaboration with AIST, Tokyo
• Dr. Satoshi Sekiguchi
– Initiating a GEOGrid in Japan. Inauguration in early October, 2006
– Will make various remote sensing data available via GEON.
LiDAR Data Processing
• Meeting in August with USGS EROS Data Center to make continental-scale datasets open to NEON, GEON, and hazards user communities
• Current implementation
– 32 IBM P690 1.7GHz processors, 128GB, 8TB SAN
– ~2TB point cloud data, ~6B rows in database
– ~20TB orthophotos
• Migrating to…
– 16-way Linux cluster, 64-bit Intel processors to support…
– Central warehouse and replicas for failover and load balancing, and
– On-demand access & analysis of data
[Figure: LiDAR workflow stages: Survey; Process & Classify; Point Cloud (x, y, zn, …); Interpolate/Grid; Analyze/Interpret. Image credits: R. Haugerud, U.S.G.S; D. Harding, NASA]
Courtesy: Chris Crosby &
Prof. Ramon Arrowsmith, Arizona State
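As an illustration of the "Interpolate/Grid" stage, the sketch below bins an (x, y, z) point cloud into square cells and averages the elevations in each cell. This is a deliberately simple stand-in, assumed for illustration only; the slides do not say which gridding method the GEON LiDAR pipeline uses. The function name `grid_mean_elevation`, the `cell_size` parameter, and the sample points are all hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

Point = Tuple[float, float, float]  # (x, y, z)
Cell = Tuple[int, int]              # (column index, row index)


def grid_mean_elevation(points: Iterable[Point], cell_size: float) -> Dict[Cell, float]:
    """Average the z values of all points that fall in each grid cell."""
    sums: Dict[Cell, float] = defaultdict(float)
    counts: Dict[Cell, int] = defaultdict(int)
    for x, y, z in points:
        # Floor division maps a coordinate to the index of its cell.
        cell = (int(x // cell_size), int(y // cell_size))
        sums[cell] += z
        counts[cell] += 1
    return {cell: sums[cell] / counts[cell] for cell in sums}


# Three made-up points: two share a cell, one falls in the next cell over.
cloud = [(0.2, 0.3, 10.0), (0.8, 0.9, 12.0), (1.5, 0.1, 20.0)]
dem = grid_mean_elevation(cloud, cell_size=1.0)
print(dem)
```

Cells with no points are simply absent from the result; a production pipeline would need an interpolation rule for those gaps, which this sketch does not attempt.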
NEON Infrastructure Overview
NEON Sensornet-level Cyberdashboard
[Figure: map of the 20 numbered NEON domains]
1 Northeast
2 Mid Atlantic
3 Southeast
4 Atlantic Neotropical
5 Great Lakes
6 Prairie Peninsula
7 Appalachians / Cumberland Plateau
8 Ozarks Complex
9 Northern Plains
10 Central Plains
11 Southern Plains
12 Northern Rockies
13 Southern Rockies / Colorado Plateau
14 Desert Southwest
15 Great Basin
16 Pacific Northwest
17 Pacific Southwest
18 Tundra
19 Taiga
20 Pacific Neotropical
Sensornet Software Stack
• Layers: Portal; Workflows; Admin and Control; Services; Analysis and Visualization; Data Management; Real-time Distributed Instrument Control; Sensors
• Technologies named in the diagram: GridSphere; Kepler, Custom portlets; Pub/Sub (Apache AXIS WS); Data Turbine, EmStar, Antelope, SRB, Surge, ESS2; TinyDB, ESS2
Courtesy: Tony Fountain, Neil Cotofana, SDSC
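The stack names a Pub/Sub service layer. The sketch below shows the publish/subscribe pattern in its simplest in-process form, so the idea is visible without any middleware. It does not use Apache AXIS or any of the tools named on the slide; the `Broker` class, the topic name `soil/temperature`, the message shape, and the 30.0 threshold are all hypothetical.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, Dict, List

Message = Dict[str, float]
Handler = Callable[[Message], None]


class Broker:
    """Minimal in-memory publish/subscribe broker (illustrative only)."""

    def __init__(self) -> None:
        self._subscribers: DefaultDict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Handler) -> None:
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, message: Message) -> int:
        """Deliver the message to every subscriber; return how many received it."""
        handlers = self._subscribers.get(topic, [])
        for handler in handlers:
            handler(message)
        return len(handlers)


broker = Broker()
archive: List[Message] = []  # stands in for a data-management consumer
alerts: List[Message] = []   # stands in for an analysis consumer


def alert_if_hot(msg: Message) -> None:
    # Hypothetical threshold, chosen only for the example.
    if msg["value"] > 30.0:
        alerts.append(msg)


broker.subscribe("soil/temperature", archive.append)
broker.subscribe("soil/temperature", alert_if_hot)

broker.publish("soil/temperature", {"value": 21.5})
delivered = broker.publish("soil/temperature", {"value": 34.0})
print(len(archive), len(alerts), delivered)
```

The point of the pattern is that the sensor side only publishes to a topic; archiving, analysis, and visualization consumers can be added or removed without changing the publisher.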
Definition of standard interfaces
E.g. NEON Site SensorNet Logical Infrastructure
[Figure: logical infrastructure diagram. Labeled elements: User (BioPDA); User (web); Portal Server; Analysis / Vis. Server; Data Replication / Archiving Server; Well-endowed Node; MicroNet Node; Sensor MicroNet (e.g. Soil); Sensor MicroNet (e.g. Climate); Sensor. Software components shown on the servers and nodes: Portal; Workflows; Services; Admin / Ctl.; Analysis / Vis.; Data Mgmt.; RTD Instr. Ctl.]
Courtesy: Tony Fountain, Neil Cotofana, SDSC
Opportunities
– Exploit national high-speed networking, e.g. Garuda, to
ensure easy and efficient access to online (cyber)
resources
– Leverage GEON cyberinfrastructure in collaboration with
UofHyd/UCSD
– Keep in step with NEON cyberinfrastructure
• Provide well-defined interfaces to metadata, data, and
instruments
– Create a true collaboration between scientists and
computer scientists and IT researchers
– Promote e-science
• Engage the next generation of scientists
– Lobby for NSF-India office (similar to NSF Beijing)
Thanks!
Data, Informatics and Cyberinfrastructure
[Figure: layered stack with accompanying questions]
Stack labels:
• Applications: Medical informatics, Biosciences, Ecoinformatics,…
• integration
• Visualization
• Data Mining, Simulation Modeling, Analysis, Data Fusion
• Knowledge-Based Integration
• Advanced Query Processing
• Grid Storage
• Filesystems, Database Systems
• High speed networking
• Networked Storage (SAN)
• Storage hardware
• HPC
• sensornets
• instruments
Questions posed alongside the stack:
• How do we represent data, information and knowledge to the user?
• How do we detect trends and relationships in data?
• How do we obtain usable information from data?
• How do we collect, access and organize data?
• How do we configure computer architectures to optimally support data-oriented computing?
• How do we combine data, knowledge and information management with simulation and modeling?
NEON Visualization and Forecasting Facility
[Figure: facility architecture diagram. Labeled elements: NEON Data Replica; Visualization Displays (“Synthesis Center”); Internet 2; Data from NEON District-level PoP’s; Processing nodes; Compute Cluster w/ disk; Geographically Remote archival copy; SAN; Raw data, Derived products; NEON Data Center; Access to TeraGrid; Archival Storage]
NEON “Cyberdashboard” (Portal)
– NEON Portal provides
• Authenticated access to NEON sensor
data, metadata, and derived products;
Web services; sensor command and
control
• Access to NEON forecasts and models via
scientific workflow environments
• Data and map integration capability
• Visualization and GIS mapping
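A portal that exposes both data access and sensor command and control needs some rule for who may do what. The sketch below shows one simple way to express that as a role-to-permission table, in line with the role-based access mentioned under Key Informatics Areas. The role names, action names, and the `is_allowed` function are hypothetical; the slides do not describe how the NEON portal actually implements authorization.

```python
from typing import Dict, FrozenSet

# Hypothetical roles and the actions each one may perform.
ROLE_PERMISSIONS: Dict[str, FrozenSet[str]] = {
    "public": frozenset({"read_metadata"}),
    "researcher": frozenset({"read_metadata", "read_data", "run_workflow"}),
    "site_operator": frozenset(
        {"read_metadata", "read_data", "run_workflow", "control_sensor"}
    ),
}


def is_allowed(role: str, action: str) -> bool:
    """Return True if the role grants the action; unknown roles get nothing."""
    return action in ROLE_PERMISSIONS.get(role, frozenset())


print(is_allowed("researcher", "read_data"))
print(is_allowed("researcher", "control_sensor"))
```

Denying unknown roles by default keeps sensor command and control closed unless a role explicitly grants it.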
A Proposed Open Systems Sensor
Network Software Stack
• Layers: Portal; Workflows; Admin and Control; Services; Analysis and Visualization; Data Management; Real-time Distributed Instrument Control; Sensors
Courtesy: Tony Fountain, Neil Cotofana, SDSC