vandy_ibpgrid_05

Download Report

Transcript vandy_ibpgrid_05

Global Terascale Data Management
Scott Atchley, Micah Beck, Terry Moore
Vanderbilt ACCRE Workshop
7 April 2005
Talk Outline
»
»
»
»
What is Terascale Data Management?
Accessing Globally Distributed Data
What is Logistical Networking?
Performing Globally Distributed Computation
Terascale Data Management
» Scientific Data Sets are Growing Rapidly
• Growing from 100s of GBs to 10s of TBs
• Terascale Supernova Initiative (ORNL)
• Princeton Plasma Physics Lab fusion simulation
» Distributed resources
• Simulate on supercomputer at ORNL or NERSC
• Analyze/Visualize at ORNL, NCSU, PPPL, etc.
» Distributed Collaborators
• TSI has members at ORNL, NCSU, SUNYSB,
UCSD, FSU, FAU, UC-Davis, others
Terascale Data Management
» Scientists use NetCDF or HDF to manage dataset
• Scientist has logical view of data as variables,
dimensions, metadata, etc.
• NetCDF or HDF manages serialization of data to
local file
• Scientist can query for specific variables,
metadata, etc.
• Requires local file access to use (local disk, NFS,
NAS, SAN, etc.)
• Datasets can be too large for the scientist to
store locally - if so, can’t browse
NetCDF
(Common Data Format)
Scientist
generates data
Variable1
Variable2
Variable3
NetCDF file
stored on disk
Scientist uses
NetCDF to
organize data
Header
Accessing Globally Distributed Data
» Collaboration requires shared file system
• Nearly impossible over the wide-area network
» Other methods include HTTP, FTP, GridFTP, scp
• Can be painfully slow to transfer large data
• Can be cumbersome to set up/manage accounts
» How to manage replication?
• More copies can improve performance, but…
• If more than one copy exists, how to let others
know?
• How to choose which replica to use?
What is Logistical Networking?
» An architecture for globally distributed, sharable
resources
• Storage
• Computation
» A globally deployed infrastructure
• Over 400 storage servers in 30 countries
• Serving 30 TB of storage
» Open-source client tools and libraries
• Linux, Solaris, MacOSX, Windows, AIX, others
• Some Java tools
Logistical Networking
Applications
Logistical Runtime System
LBone
exNode
IBP
Physical Layer
» Modeled on Internet Protocol
» IBP provides generic storage
and generic computation
• Weak semantics
• Highly scalable
» L-Bone provides resource
discovery
» exNode provides data
aggregation (length and
replication) and annotation
» LoRS provides fault-tolerance,
high performance, security
» Multiple applications available
Current Infrastructure Deployment
The public deployment includes 400 IBP depots
in 30 countries serving 30 TB storage (leverages PlanetLab).
Private deployments for DOE, Brazilian and Czech backbones.
Available LoRS Client Tools
Binaries for Windows and MacOSX. Source for Linux, Solaris, AIX, others.
LoDN - Web-based File System
Store files into the Logistical Network using Java upload/download tools.
Manages exNode “warming” (lease renewal and migration). Provides
account (single user or group) as well as “world” permissions.
NetCDF/L
» Modified NetCDF that stores data in logistical
network (lors://)
» Uses libxio (Unix IO wrapper)
» Ported NetCDf with 13 lines of code
» NetCDF 3.6 provides for >2 GB files (64-bit offset)
» LoRS parameters available via environment
variables
ncview using NetCDF/L
Libxio
» Unix IO wrapper (open(), read(), write(), etc.)
» Developed in Czech Republic for Distributed Data
Storage (DiDaS) project
» Port any Unix IO app using 12 lines
#ifdef HAVE_LIBXIO
#define open
#define close
...
#endif
xio_open
xio_close
» DiDaS ported transcode (video transcoding) and
mplayer (video playback) apps using libxio
High Performance Streaming for
Large Scale Simulations
» Viraj Bhat, Scott Klasky, Scott Atchley, Micah Beck, Doug
McCune, Manish Parashar, High Performance Threaded
Data Streaming for Large Scale Simulations, in the
proceedings of 5th IEEE/ACM International Workshop on
Grid Computing, Pittsburgh, PA, Nov 2004.
» Streamed data from NERSC to PPPL using LN
» Sent data from previous timestep while computation
proceeds on next timestep
» Automatically adjusts to network latencies
• Adds threads when needed
» Imposed <3% overhead (as compared to no IO)
» Writing to GPFS at NERSC imposed 3-10% overhead
(depending on block size)
Failsafe Mechanisms for Data Streaming in
Energy Fusion Simulation
GPFS
file sys
exnodercv
depots on simulation end
depots close by
Replication
Re-fetch failed
transfers from
GPFS/depots
PPPL depots
Write to GPFS or
Simulation Depot
Supercomputer
nodes
Buffer Overflow
Signal
Postprocessing
routines
Network Failure
Data Flow
LN in the Ctr for Plasma Edge Simulation
» Implementing workflow automation
• Reliable communication between stages
• Exposed mapping of parallel files
» Collaborative use of detailed simulation traces
• Terabytes per time step
• Accessed globally
• Distributed postprocessing and redistribution
» Vizualization of partial datasets
• Heirarchy: Workstation/cluster/disk cache/tape
• Access prediction, prestaging is vital
Performing Globally Distributed
Computation
» Adding generic, restricted computing within the
depot
» Side-effect-free programming only (no system
calls or external libraries except malloc allowed)
» Uses IBP capabilities to pass arguments
» Mobile code “oplets” with restricted environment
• C/compiler based
• Java byte code based
» Test applications include
• Text mining: parallel grep
• Medical vizualization: brain fiber tracing
Grid Computing for Distributed Viz: DTMRI Brain Scans (Jian Huang)
1. Send
data
2. Request
processing
3. Return
results
Transgrep
»
»
»
»
»
Stores 1.3 GB (web server) log file
Data is striped and replicated on dozens of depots
Transgrep breaks job into uniform blocks (10 MB)
Has depots search within blocks
Depots return matches as well as partial first and
last lines
» Client sorts results and searches lines that overlap
block boundaries
» 15-20 times speed up versus local search
» Automatically handles slow or failed depots
Czech Republic LN Infrastructure
Brazil RNP LN Infrastructure
Proposed OneTenn Infrastructure
Logistical Networking First Steps
»
»
»
»
»
Try LoDN by browsing as a guest (Java)
Download the LoRS tools and try them out
Download and use NetCDF/L
Use libxio to port your Unix IO apps to use LN
Run your own IBP depots (publicly or privately)
More information
http://loci.cs.utk.edu
[email protected]