The LEAD Effort at Unidata
The Unidata Seminar will start at 1:30 PM MST
Tom Baltzer, Brian Kelly, Doug Lindholm, Anne Wilson
December 14, 2005
LEAD is funded by the National Science
Foundation under the following
Cooperative Agreements:
ATM-0331594
ATM-0331591
ATM-0331574
ATM-0331480
ATM-0331579
ATM-0331586
ATM-0331587
ATM-0331578
Outline
1. Setting the Stage: Introduction to LEAD and
Unidata’s LEAD Efforts: Anne
2. Application of current technology on the LEAD
testbeds: Tom
3. The LEAD Hardware at Unidata: Brian
4. The THREDDS Data Repository: Doug
Setting the Stage: Introduction
to LEAD and Unidata’s LEAD
Efforts
Anne Wilson
Current IT Barriers to Mesoscale
Weather Research and Education
• Data and tools useable mainly by experts
• Researchers and educators constrained by
hardware limitations
• Rigid, brittle technology can’t accommodate
mesoscale weather research requirements:
– real time, on demand, dynamic data processing
and sensor steering
A Solution: Linked
Environments for Atmospheric
Discovery (LEAD)
• Funded by NSF Large Information Technology
Research (ITR) award
• Produce a web service based, scalable
framework for handling meteorological data and
model output:
– Identifying, accessing, preparing, assimilating,
predicting, managing, analyzing, mining, visualizing
– Independent of data format and physical location
• Dynamically adaptive workflows and steering of
sensors
The LEAD Vision
• Data access via querying and browsing
• Analysis and forecast tools that can be
composed into workflows
• Workflows and sensors that respond to the
weather
• Support users ranging from grade 6 to
experienced researchers
LEAD Objectives
• Lower the barrier for entry and increase
the sophistication of problems that can be
addressed by complex end-to-end weather
analysis and forecasting/simulation tools
• Improve our understanding of and ability to
detect, analyze and predict mesoscale
atmospheric phenomena by interacting
with weather in a dynamically adaptive
manner
• Result: Paradigm change in how
experiments are conceived and performed
LEAD Challenges
• Challenge: Disparate, high volume data sets
– Requirements: Efficient transmission, remote subsetting and aggregation, reliable, robust storage, format independence
• Challenge: Huge computational demands, e.g. ensemble forecasting
– Requirements: Distributed, load balanced computations
• Challenge: Use of existing complex numerical models and data assimilation systems
– Requirements: Make existing tools work in web service environment
• Challenge: Lack of controlled vocabulary
– Requirements: Ontology, dictionary
• Challenge: Support for grades 6 – 12, college, graduate, and advanced research
– Requirements: Robust security, user aids, education modules, meaningful responses
Multidisciplinary Effort
• Meteorology
• Computer Science and Information
Technology
• Education and Outreach
LEAD Institutions
> 100 scientists, students, technical staff
LEAD Thrust Groups
• Data*
• Orchestration
• Portal
• Meteorology
• Grid and Web Services Test Bed*
• Education and Outreach Test Bed
*Major Unidata areas
LEAD Data Subsystem
[Diagram labels: LEAD Portal; Ontology Service; Query Service; Dictionary; Resource Catalog; myLEAD Catalog; LEAD Data Repository (LDR); Public Data (e.g. IDD data)]
Unidata Technology Used in LEAD
• LDM/IDD Data Delivery: near real time data delivery
• THREDDS: catalogs of data and their associated
metadata
• Common Data Model (CDM): single interface to
multiple data formats
• THREDDS Data Server (TDS): integrated OPeNDAP
and http data access
• Integrated Data Viewer (IDV): visualization
• THREDDS Data Repository (TDR): data storage
framework
• Decoders
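As a rough illustration of how a THREDDS catalog ties dataset names to OPeNDAP access, the sketch below parses a small catalog and builds one access URL per dataset. The catalog contents, the dataset names and paths, the host name, and the helper `opendap_urls` are all invented for this example; only the general catalog layout (a `service` element with a `base`, and `dataset` elements with a `urlPath`) follows the THREDDS catalog convention.

```python
import xml.etree.ElementTree as ET

# Hypothetical catalog for illustration only; names and paths are made up.
CATALOG_XML = """
<catalog xmlns="http://www.unidata.ucar.edu/namespaces/thredds/InvCatalog/v1.0"
         name="Example LEAD testbed catalog">
  <service name="odap" serviceType="OPENDAP" base="/thredds/dodsC/"/>
  <dataset name="Model output">
    <dataset name="WRF regional forecast" urlPath="lead/wrf/example_run.nc"/>
    <dataset name="NAM forecast" urlPath="lead/nam/example_run.grib"/>
  </dataset>
</catalog>
"""

THREDDS_NS = "http://www.unidata.ucar.edu/namespaces/thredds/InvCatalog/v1.0"
NS = {"t": THREDDS_NS}


def opendap_urls(catalog_xml, host):
    """Return {dataset name: OPeNDAP URL} for every dataset that has a urlPath."""
    root = ET.fromstring(catalog_xml.strip())
    # The service element supplies the URL prefix shared by all datasets.
    service = root.find("t:service[@serviceType='OPENDAP']", NS)
    base = service.get("base")
    urls = {}
    # Container datasets have no urlPath and are skipped.
    for ds in root.iter("{%s}dataset" % THREDDS_NS):
        path = ds.get("urlPath")
        if path is not None:
            urls[ds.get("name")] = host + base + path
    return urls


if __name__ == "__main__":
    for name, url in opendap_urls(CATALOG_XML, "http://example.org").items():
        print(name, "->", url)
```

A client such as the IDV would do considerably more (nested services, inherited metadata); this only shows the discover-then-access idea.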
Unidata and LEAD
• Unidata also brings:
– Experience with atmospheric data
– Community of users
– Robust, fielded software
Recent LEAD-Related Efforts
Goal: Support both LEAD and our community
2. Application of current technology on our
LEAD testbed: Tom
3. Structure of the LEAD testbed: Brian
4. THREDDS Data Repository: Doug
Application of Current
Technologies on the LEAD
Testbed Systems
Tom Baltzer
Acronyms for LEAD Tools
ADAS - ARPS Data Assimilation System
(Center for Advanced Prediction of Storms at OU)
ADaM - Algorithm Development and Mining
(University of Alabama at Huntsville)
IDV – Integrated Data Viewer
(Unidata)
LDM/IDD – Local Data Manager/Internet Data Distribution
(Unidata)
OPeNDAP – Open-source Project for a Network Data Access Protocol
(OPeNDAP.org)
THREDDS – Thematic Real-time Environmental Distributed Data Services
TDS - THREDDS Data Server
TDR – THREDDS Data Repository
(Unidata)
WRF – The Weather Research and Forecasting Model
(ARW Core - NCAR)
Also: WS-Eta – Workstation Eta Model
LEAD Testbed Systems
• Testbed systems at several LEAD locations to
provide:
– Data
• Near Real-Time data ingest, storage and access
• LEAD Data Product storage and access
– Data Processing
• High Performance Computing
• Grid and Web Services
• Allow each institution to develop methods by
which their capabilities fit into LEAD effort
• Single Web Portal system at Indiana Univ. to
bring it all together and provide User Interface
LEAD Grid
[Map of LEAD sites: MU, HU, CSU, Unidata, UI, IU, UNC, OU, UAH]
Legend:
• Core Academic Partner
• Core Academic Partner + Grid Test Bed
• Core Academic Partner + Education Test Bed
• Core Academic Partner + Grid Test Bed + Education Test Bed
Data Aspects of LEAD
Testbeds
LEAD Testbed Systems
• UPC Technologies being leveraged to
facilitate LEAD needs
– LDM/IDD
– THREDDS
– IDV
– NetCDF Decoders
– OPeNDAP (Unidata supported)
Typical LEAD Testbed
(Current Source Data Configuration)
[Diagram labels: LEAD Grid System; Forecast Model Output; Weather station observations; Aircraft data; Radar data; IDD; Decoders; THREDDS Catalog; OPeNDAP; GridFTP; Testbed System]
Typical LEAD “Data” Testbed
(Future Source Data Configuration)
[Diagram labels: LEAD Grid System; Forecast Model Output; Weather station observations; Aircraft data; Radar data; IDD; Decoders; THREDDS Catalog; OPeNDAP; TDS & TDR; GridFTP; Testbed System]
Note: UPC plans ~ 6 month store
LEAD Processing on the
Unidata Testbed System
UPC Processing Testbed
(Current Configuration)
• WRF being steered by Chiz’s GEMPAK precipitation locator
[Diagram labels, Unidata LEAD Test Bed: NCEP NAM (Eta) Forecast; Precipitation Locator; Center Lat/Lon; WRF; WS-Eta; Regional Forecasts; THREDDS Catalog; OPeNDAP Access]
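The slides do not say how the GEMPAK precipitation locator picks its center point, so the following is only a sketch of the steering idea: take a forecast precipitation grid, find the cell with the largest value, and center a regional domain on it. The selection rule (grid maximum), the function names, the 5-degree half-width, and all data values are assumptions made for illustration.

```python
def locate_max_precip(precip, lats, lons):
    """Return (lat, lon, value) of the grid cell with the most precipitation.

    precip is a list of rows; precip[j][i] belongs to lats[j] and lons[i].
    """
    best = None
    for j, row in enumerate(precip):
        for i, value in enumerate(row):
            if best is None or value > best[2]:
                best = (lats[j], lons[i], value)
    return best


def domain_around(center_lat, center_lon, half_width_deg):
    """Bounding box (south, north, west, east) centered on the given point."""
    return (center_lat - half_width_deg, center_lat + half_width_deg,
            center_lon - half_width_deg, center_lon + half_width_deg)


# Toy 3x4 forecast precipitation field in mm; every value here is invented.
lats = [35.0, 40.0, 45.0]
lons = [-105.0, -100.0, -95.0, -90.0]
precip = [
    [0.0, 1.5, 0.2, 0.0],
    [0.3, 4.0, 12.5, 2.0],
    [0.0, 0.8, 3.1, 0.4],
]

center_lat, center_lon, amount = locate_max_precip(precip, lats, lons)
box = domain_around(center_lat, center_lon, half_width_deg=5.0)

if __name__ == "__main__":
    print("center:", center_lat, center_lon, "max precip:", amount)
    print("regional domain (S, N, W, E):", box)
```

In a setup like the one on the slide, the center latitude and longitude would then be handed to the regional model configuration (WRF or WS-Eta) before each run.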
Next Steps
[Diagram labels, Unidata LEAD Test Bed: Millersville; ADaM Precip Locator; CAPS ADAS Assimilation; NCEP NAM (Eta) Forecast; Precipitation Locator; WRF; WS-Eta; Regional Forecasts; THREDDS Catalog; OPeNDAP Access]
Longer Term
IDD Datasets
• Radar
• Surface & Upper air
• Satellite
• NCEP NAM
[Diagram labels, Unidata LEAD Test Bed: NCEP NAM (Eta) Forecast; ADaM Precipitation Locator; ADAS; Center Lat/Lon; WRF; WS-Eta; Regional Forecasts; THREDDS Catalog; OPeNDAP Access]
Ultimately
IDD Datasets
• Radar
• Surface & Upper air
• Satellite
• NCEP NAM
[Diagram labels, Unidata LEAD Test Bed: LEAD Grid System; NCEP NAM (Eta) Forecast; ADaM Precipitation Locator Web Service; ADAS Web Service; WRF Web Service; Center Lat/Lon; WS-Eta; Regional Forecasts; THREDDS Catalog; OPeNDAP Access]
Objectives for UPC Testbed
• Testing ground for integrating new UPC
and LEAD technologies
• Determining ways to bring LEAD
Technologies to the Unidata Community
• “Operational” environment for LEAD
• Processing cluster
• Data Storage
– ~6 months of IDD data
– LEAD product data
The LEAD Hardware
at Unidata
Brian Kelly
Existing LEAD Infrastructure
Lead3
Lead1
HTTP Server
THREDDS Server
OPeNDAP Server
LDM Node
NFS Server
Cluster Node
GRID Server
Development Tools
NFS Server
Cluster Node
Lead4
TDS
LDM Node
NFS Server
Cluster Node
Lead2
GRID Server
NFS Server
Cluster Node
Cluster Monitoring
LeadStor
8 TB of Disk
NFS Server
Portal Servers for Web,
TDS, Grid and
LDM Services
UCAR/Unidata LEAD
Infrastructure
~30 GFLOP
Processing Cluster
40 TB Storage
Cluster
HTTP, TDS and
Grid Server
LDM Server
Test Server
Processing Cluster
Head Node
Storage Cluster
Gateway
Gigabit Network for
NFS Storage Access
LEAD Portal Systems
LEAD Processing Cluster
Beowulf Cluster
Connected by a
Gigabit Fibre Network
Each Node contains Two Athlon 2400+ CPUs
Cluster Uses OSCAR with the MPICH MPD
Eight Nodes Provide ~30 GFLOPS
LEAD Storage Cluster
LEAD Storage
Head Node
LEAD Storage
Gigabit Network
LEAD Storage Nodes
● One (1) Guanghsing GHI-583 5U Case
– 24 hot-swappable SATA trays
– 1000W 2+2 power supply
● One (1) Tyan Thunder K8SD Pro Motherboard
– Dual Opteron CPUs
– Four 64-bit 133/100 MHz PCI-X Slots
– Two Gigabit Ethernet ports
● One (1) AMD Opteron 242 Processor
– 1.6 GHz CPU
● Three (3) Broadcom RAIDCore BC4853
– Eight SATA ports
– Controller spanning
– Advanced RAID
● Twenty-Four (24) Seagate Barracuda ST3400832AS
– 7200 RPM 400GB SATA Drives
LEAD Storage Node
Twenty-Four (24) 400 GB Drives
Divided into Two (2) Eleven-Column RAID 5 Arrays and Two Hot Spares
Form Two (2) 4 TB LUNs Using bcraid
Each Node Publishes the Two LUNs over iSCSI
LEAD Storage Gateway
● Mounts Each Node's Two (2) 4 TB LUNs Published via iSCSI
● Builds Two (2) 20 TB 6-Column RAID 5 Meta-devices using mdadm
● Divides Each Meta-device into Volumes using LVM
● Each Volume is Formatted with an XFS Filesystem
● Each Filesystem is Published with NFS
Result: 40 TB of mid-performance double-redundant storage
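The capacity figures on the storage slides can be checked with a few lines of arithmetic. The sketch below assumes decimal units (1 TB = 1000 GB) and the usual RAID 5 rule that one column's worth of space goes to parity; the count of six storage nodes is inferred from the six-column meta-devices rather than stated on the slides.

```python
DRIVE_GB = 400            # Seagate 400 GB drives
DRIVES_PER_NODE = 24
ARRAYS_PER_NODE = 2
COLUMNS_PER_ARRAY = 11    # drives in each node-level RAID 5 array
GATEWAY_COLUMNS = 6       # LUNs in each gateway-level RAID 5 meta-device
META_DEVICES = 2
STORAGE_NODES = 6         # assumption: one LUN per node in each meta-device


def raid5_usable(columns, column_size):
    """RAID 5 spends one column on parity, so usable space is (n - 1) * size."""
    return (columns - 1) * column_size


hot_spares = DRIVES_PER_NODE - ARRAYS_PER_NODE * COLUMNS_PER_ARRAY
lun_gb = raid5_usable(COLUMNS_PER_ARRAY, DRIVE_GB)   # per node-level array
meta_gb = raid5_usable(GATEWAY_COLUMNS, lun_gb)      # per gateway meta-device
total_gb = META_DEVICES * meta_gb
raw_gb = STORAGE_NODES * DRIVES_PER_NODE * DRIVE_GB

if __name__ == "__main__":
    print("hot spares per node:", hot_spares)
    print("LUN size (TB):", lun_gb / 1000)
    print("meta-device size (TB):", meta_gb / 1000)
    print("usable total (TB):", total_gb / 1000)
    print("raw disk (TB):", raw_gb / 1000)
```

The "double-redundant" label on the slide corresponds to the two RAID 5 layers: one across drives inside each node, and one across nodes at the gateway.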
THREDDS Data Repository
(TDR)
Doug Lindholm
LEAD Architecture
Data Storage Perspective
[Diagram, built up over several slides]
• LEAD Data Grid sites: Unidata, NCSA, OU, UAH, IU
• “Atomic” Capabilities: Storage Locator, Data Mover, ID Generator, Name Resolver, Metadata Generator, Metadata Crosswalk, Cataloger (myLEAD)
• Application Services: Forecast Model (WRF), Data Assimilation (ADAS), Data Mining (ADaM), Visualization (IDV)
• Portal
• User
LEAD Architecture
Data Storage Perspective
[Diagram: final build, with a THREDDS Data Repository box and a Data Repository label shown alongside the “Atomic” Capabilities]
• LEAD Data Grid sites: Unidata, NCSA, OU, UAH, IU
• “Atomic” Capabilities: Storage Locator, Data Mover, ID Generator, Name Resolver, Metadata Generator, Metadata Crosswalk, Cataloger (myLEAD)
• Application Services: Forecast Model (WRF), Data Assimilation (ADAS), Data Mining (ADaM), Visualization (IDV)
• Portal
• User
THREDDS Data Repository
Component Architecture
Data Storage
locateStorage()
moveData()
generateUniqueID()
mapIDToURL()
generateMetadata()
translateMetadata()
catalogMetadata()
THREDDS Data Repository
putData()
discoverData()
getData()
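The slide lists operation names only, so the sketch below is one guess at how the three top-level operations might be composed from the seven component operations. Everything in it is hypothetical: the class `ToyRepository`, the in-memory dictionaries, the ID format, and the metadata fields are invented for illustration and are not the TDR's actual API or behavior. The method names mirror the slide in Python naming style.

```python
import itertools


class ToyRepository:
    """In-memory stand-in for the component architecture (illustration only)."""

    def __init__(self, base_url):
        self._base_url = base_url
        self._ids = itertools.count(1)
        self._storage = {}   # URL -> stored bytes
        self._urls = {}      # unique ID -> URL
        self._catalog = {}   # unique ID -> metadata dict

    # Component operations: names from the slide, bodies invented.
    def locate_storage(self, name):
        return f"{self._base_url}/{name}"

    def move_data(self, data, url):
        self._storage[url] = bytes(data)

    def generate_unique_id(self):
        return f"tdr-{next(self._ids):06d}"

    def map_id_to_url(self, uid):
        return self._urls[uid]

    def generate_metadata(self, name, data):
        return {"name": name, "size_bytes": len(data)}

    def translate_metadata(self, metadata):
        # Placeholder crosswalk into a made-up target schema.
        return {"title": metadata["name"], "extent": metadata["size_bytes"]}

    def catalog_metadata(self, uid, metadata):
        self._catalog[uid] = metadata

    # Top-level operations: one possible composition of the parts above.
    def put_data(self, name, data):
        url = self.locate_storage(name)
        self.move_data(data, url)
        uid = self.generate_unique_id()
        self._urls[uid] = url
        native = self.generate_metadata(name, data)
        self.catalog_metadata(uid, self.translate_metadata(native))
        return uid

    def discover_data(self, title_contains):
        return [uid for uid, meta in self._catalog.items()
                if title_contains in meta["title"]]

    def get_data(self, uid):
        return self._storage[self.map_id_to_url(uid)]


if __name__ == "__main__":
    repo = ToyRepository("file:///tmp/tdr")
    uid = repo.put_data("wrf_run.nc", b"example bytes")
    print(uid, repo.discover_data("wrf"), repo.get_data(uid))
```

Keeping each component behind its own method is what would let a configuration swap one piece, for example a different cataloger or storage locator, without touching the put, discover, and get paths.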
THREDDS Data Repository
Component Architecture: LEAD Configuration
[Diagram: same components and operations as the preceding Component Architecture slide]
THREDDS Data Repository
Component Architecture: Alternate Configuration
[Diagram: same components and operations as the preceding Component Architecture slide]
Unidata Architecture
[Diagram, built up over eight slides]
• Components: Internet Data Distribution (IDD); Local Data Manager (LDM); Data Storage; Common Data Model (CDM); THREDDS Catalog; THREDDS Data Server (TDS); THREDDS Data Repository (TDR); Locally Generated Data; THREDDS Client API; E-mail; Application (e.g. IDV); Service
• Connection labels: access, discover, store, notify
Questions?