Report on the 2002 IEEE/NASA Mass Storage Conference

Where ? College Park, Maryland, USA
When ? 15-18 April 2002
Who ? Jean-Philippe Baud and Fabien Collin from CERN-IT/DS
Why ? IT-DS is deeply involved in data storage (HW & SW aspects)
URL : www.storageconference.org
Agenda

Tutorials (4 presentations)
Network storage (6)
Hierarchical Storage Management (6)
Storage Indexing (3) and Miscellaneous (3)
Short papers (7)
Vendor presentations (9)
Panel : Storage 10 year prospective
Conclusions and own impressions

Tutorials : Day 1 (1-2/4)

Perpendicular recording : Seagate Research
Current longitudinal recording = ~ 100 Gbit/in2
Perpendicular recording can achieve 1 Tbit/in2
Superparamagnetic behaviour not avoided but delayed
With perpendicular recording and heat-assisted magnetic recording (HAMR), can go to 10 Tbit/in2
Another 5x gain to 50 Tbit/in2 possible with HAMR + patterned media (challenge is finding a low-cost means of making the media)
Material for PASTA secondary storage group
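
The density roadmap quoted in this tutorial is a chain of multiplicative gains. The following is a small sketch of that arithmetic, using only the figures noted above; the labels and variable names are illustrative and do not come from the presentation.

```python
# Areal densities quoted in the tutorial notes, in Gbit/in2.
roadmap = [
    ("longitudinal (current)", 100),
    ("perpendicular", 1_000),
    ("perpendicular + HAMR", 10_000),
    ("HAMR + patterned media", 50_000),
]


def step_gains(points):
    """Return (label, gain factor) for each step relative to the previous one."""
    return [
        (name, density / prev_density)
        for (_, prev_density), (name, density) in zip(points, points[1:])
    ]


gains = step_gains(roadmap)
total_gain = roadmap[-1][1] / roadmap[0][1]

for name, gain in gains:
    print(f"{name}: x{gain:g}")
print(f"overall: x{total_gain:g}")
```

The overall factor is simply the product of the individual steps, so the whole roadmap depends on each technology delivering its share.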
Virtualisation : StorageTek
Lots of hype about this subject nowadays
Abstraction of storage that separates the host view from the storage subsystem implementation
Gave a few examples

Tutorials : Day 1 (3-4/4)

iSCSI : Julian Satran, IBM Research Lab
Yet another iSCSI presentation from Julian

Object Storage Devices : Tom Ruwart, Ciprico
File system storage component moved inside the device itself
No longer a block interface but an object interface : being standardized by the SCSI T10 committee
Not affected by technology shifts
Decouples physical storage technology from file system and applications
Ultimate virtualization technology
SCSI Standard
Material for PASTA secondary storage group
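
To illustrate the difference between a block interface and an object interface, here is a toy sketch. The class and method names are hypothetical and do not follow the T10 OSD command set; the only point is that block allocation happens inside the device, so the host never sees block addresses.

```python
class ToyObjectDevice:
    """Hypothetical object storage device: the host addresses objects by id,
    and the device decides internally where the bytes live."""

    BLOCK_SIZE = 4  # deliberately tiny so the example is easy to follow

    def __init__(self, n_blocks):
        self._blocks = [b""] * n_blocks     # physical blocks, private to the device
        self._free = list(range(n_blocks))  # free-block list, also private
        self._layout = {}                   # object id -> list of block numbers

    def write_object(self, object_id, data):
        """Store data under object_id (an overwrite is a delete followed by a write)."""
        self.delete_object(object_id)
        needed = -(-len(data) // self.BLOCK_SIZE)  # ceiling division
        if needed > len(self._free):
            raise IOError("device full")
        blocks = [self._free.pop(0) for _ in range(needed)]
        for i, block_no in enumerate(blocks):
            start = i * self.BLOCK_SIZE
            self._blocks[block_no] = data[start:start + self.BLOCK_SIZE]
        self._layout[object_id] = blocks

    def read_object(self, object_id):
        """Reassemble the object from the blocks the device chose for it."""
        return b"".join(self._blocks[b] for b in self._layout[object_id])

    def delete_object(self, object_id):
        """Release the blocks of an object, if it exists."""
        for block_no in self._layout.pop(object_id, []):
            self._blocks[block_no] = b""
            self._free.append(block_no)


dev = ToyObjectDevice(n_blocks=8)
dev.write_object("run42", b"hello world")
print(dev.read_object("run42"))
```

Because the layout table is private to the device, the physical medium could change without the host interface changing, which is the decoupling the tutorial stressed.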

Network storage session

iSCSI performance : IBM Almaden
Latency within 5 % of FC
Throughput only 50 MB/s using the iSCSI software implementation :
Initiator CPU bottleneck due to interrupt overhead
TCP copy and checksum overhead
Improvements :
Jumbo frames : fewer interrupts, but not standard and not in 10GigE
Zero-copy transmit TCP/IP stack. Zero-copy receive not possible
TCP/IP offload engine (TOE) : TCP/IP stack on the adapter. Zero-copy receive not possible unless hints are given by the application
Most promising : specialized iSCSI HBA. Commodity Ethernet adapters cannot be used, but the existing Ethernet infrastructure can
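
A rough way to see why jumbo frames reduce the interrupt load is to count frames per second at the measured throughput. This is a back-of-the-envelope sketch: only the 50 MB/s figure comes from the talk. The frame sizes (1500 and 9000 bytes), the one-interrupt-per-frame model and the neglect of protocol headers are assumptions made here for illustration.

```python
def frames_per_second(throughput_bytes_per_s, frame_payload_bytes):
    """Frames needed per second to carry the given throughput."""
    return throughput_bytes_per_s / frame_payload_bytes


THROUGHPUT = 50 * 10**6  # 50 MB/s, the figure reported in the talk
STANDARD_MTU = 1500      # assumption: standard Ethernet MTU, headers ignored
JUMBO_MTU = 9000         # assumption: a commonly used jumbo frame size

standard = frames_per_second(THROUGHPUT, STANDARD_MTU)
jumbo = frames_per_second(THROUGHPUT, JUMBO_MTU)

print(f"standard frames: {standard:,.0f} frames/s")
print(f"jumbo frames:    {jumbo:,.0f} frames/s")
print(f"reduction factor: {standard / jumbo:.1f}")
```

Real adapters often coalesce interrupts, so the absolute numbers would be lower in practice; the ratio between the two cases is the useful part of the estimate.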
Network storage session

DirectNFS : Calsoftinc (xNFS for HPLabs?)
A mixed NAS+SAN approach
Metadata caching, cache coherency through leases
Used FiST, a stackable file system generator
Could use FiST for Castor FS proof of concept ?

Single distributed file system using NFS : HPLabs
The poor man's cluster server : NFS on top of NFS, hence the name NFS^2 : HPLabs really likes NFS…

Locating logical volumes in a large network (HPLabs)
Using DNS and modified BGP routing protocols
Developed for SSPs ?
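
The DirectNFS notes mention metadata caching with cache coherency through leases. As a minimal sketch of the client side of that idea (not the DirectNFS implementation, whose details were not recorded here), a cached entry can be trusted only until its lease expires. The class, the `fetch` callback and the 30 second lease are hypothetical.

```python
class LeaseCache:
    """Hypothetical client-side metadata cache with time-based leases."""

    def __init__(self, fetch, lease_seconds):
        self._fetch = fetch          # callable: path -> metadata from the server
        self._lease = lease_seconds  # how long a cached entry may be trusted
        self._entries = {}           # path -> (metadata, expiry time)

    def get(self, path, now):
        """Return cached metadata while the lease is valid, otherwise refetch."""
        entry = self._entries.get(path)
        if entry is not None and now < entry[1]:
            return entry[0]
        metadata = self._fetch(path)
        self._entries[path] = (metadata, now + self._lease)
        return metadata


server = {"/data/f1": {"size": 100}}
calls = []


def fetch(path):
    """Stand-in for a metadata request to the server; records each call."""
    calls.append(path)
    return dict(server[path])


cache = LeaseCache(fetch, lease_seconds=30)
first = cache.get("/data/f1", now=0)    # miss: goes to the server
second = cache.get("/data/f1", now=10)  # hit: lease still valid
server["/data/f1"]["size"] = 200        # metadata changes on the server
third = cache.get("/data/f1", now=40)   # lease expired: refetch sees the change
print(first, second, third, len(calls))
```

Time is passed in explicitly to keep the example deterministic. A complete lease protocol also involves the server, which must wait for or revoke outstanding leases before applying a conflicting update; that half is not shown.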
Network storage session

HyperSCSI : Singapore University
An iSCSI clone without IP, using raw Ethernet frames instead : performance oriented. Can be tunneled over IP to link several sites
Interesting university work but unlikely to go beyond that (not standard, no product)

Point in time copy : IBM Haifa
A review of snapshot techniques in existing high-end disk subsystem products (IBM ESS, STK SVA, EMC Timefinder, Hitachi)
Looking for improvements : OSD in particular

HSM session (1-3/6)

RAIT : Yet another STK presentation on RAIT
Performance & reliability (drive and media failures)
Said STK will announce a COTS device

Conceptual study of intelligent data archives of the future : NASA
Huge data sets, physically distributed data
MSS need to be intelligent
Blabla on requirements. Boring…

Storage Issues at NCSA : How to get file systems going wide and fast within and out of large scale Linux cluster systems : NCSA
Title totally misleading… This is in fact the history and inventory of NCSA storage HW/SW over the last 15 years
Evaluating GPFS with Myrinet. Performance issues.

HSM session (4-5/6)

Data placement on tertiary storage (Purdue Univ)
Research work on grouping data objects on tertiary storage according to their relationships and the access probability of the related objects

Storage resource managers (Arie Shoshani LBNL)
SRMs are middleware for Grid storage
Common interface to different MSS in a Grid environment
Implementation working for HPSS. Enstore/Castor in progress
Jean-Philippe had a long chat with Arie and his developers

HSM session (6/6)

HPSS and SAN (IBM Global Services)
Current situation :
SAN is used between movers and devices but LAN is used between movers and HPSS clients
Future situation (2003) :
A SAN-enabled HPSS client will be developed to allow direct transfers over the SAN between these clients and devices
However, because today's SANs lack real security features, only trusted SAN-enabled HPSS clients will benefit from this. Untrusted clients (even the ones on the SAN) will use the LAN for data transfers to/from the movers.

Storage Indexing Session (1-2/3)

Intra file security (UCSC)
File encryption is usually an all-or-nothing operation
This work provides the ability to encrypt only parts of a file (example : satellite images)

Efficient storage & management of environmental information (Rutgers Univ)
Building a data warehouse for environmental data

Storage Indexing Session (3/3)

Indexing and selection in huge data sets (CERN LHCb, Sébastien Ponce)
Multi-dimensional selections (up to 30 variables), among 10^10 events of 100 KB each (1 PB/year)
Using tags
A tag contains a subset of the data item it represents and a "pointer" to this item
The subset contains a few values that are available as fast selection criteria
A tag is a small structured entity that can easily be stored in a relational database
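
The tag idea can be sketched with a small relational table: each row holds a pointer to the full event plus a few values used for selection, so a query touches only the tags and never the 100 KB events. The schema, the column names (`energy`, `n_tracks`) and the sample rows are invented for illustration and are not taken from the LHCb work.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tags ("
    " event_ref TEXT PRIMARY KEY,"  # the "pointer" to the full event
    " energy REAL,"                 # hypothetical selection variable
    " n_tracks INTEGER)"            # hypothetical selection variable
)
conn.executemany(
    "INSERT INTO tags VALUES (?, ?, ?)",
    [
        ("file1:0", 12.5, 3),
        ("file1:1", 48.0, 7),
        ("file2:0", 51.2, 2),
        ("file2:1", 75.9, 9),
    ],
)


def select_events(connection, min_energy, min_tracks):
    """Return pointers to the events whose tag values pass both cuts."""
    rows = connection.execute(
        "SELECT event_ref FROM tags"
        " WHERE energy >= ? AND n_tracks >= ? ORDER BY event_ref",
        (min_energy, min_tracks),
    )
    return [ref for (ref,) in rows]


selected = select_events(conn, min_energy=40.0, min_tracks=5)
print(selected)

# Data volume quoted in the talk: 10^10 events of 100 KB each.
total_bytes = 10**10 * 100 * 10**3
print(total_bytes == 10**15)  # 10^15 bytes, i.e. 1 PB
```

Only the events returned by the query then need to be fetched from mass storage, which is where the saving comes from.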
Potpourri Session (1-2/3)

The one terabyte tape cartridge and beyond (STK)
Discussing the design and implementation of a multi-TB 3480 form factor cartridge using MP media and linear recording
Medium has primary impact on areal density growth
Limit for MP tape is 10 Gbit/in2
Head technology is in good shape
Tape not near any fundamental limits at this time, 5 to 10 TB tape with 150/560 MB/s conceivable.

Efficient RAID scheduling on smart disks (Univ of Minnesota)
Parity computations (XOR) implemented inside disks (no longer done by HW RAID controllers or by software). Three new SCSI commands.
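
For reference, the XOR parity computation that the smart disks would take over looks like this. It is a generic illustration of RAID parity, not the scheduling scheme of the paper, and it makes no attempt to model the three SCSI commands, which these notes do not name.

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


data = [b"disk", b"more", b"data"]
parity = xor_blocks(data)

# Lose the second block and rebuild it from the survivors plus parity.
rebuilt = xor_blocks([data[0], data[2], parity])
print(rebuilt)

# Small-write update: new parity = old parity XOR old data XOR new data.
new_block = b"MORE"
new_parity = xor_blocks([parity, data[1], new_block])
print(new_parity == xor_blocks([data[0], new_block, data[2]]))
```

The small-write case shows why doing the XOR inside the disks is attractive: the old data and old parity are already on the drives that need them.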
Potpourri Session (3/3)

Experimentally evaluating in-place delta reconstruction
Software distribution to a huge number of clients over low-bandwidth (and usually wireless) links using compressed deltas against the previous version.
In-place decompression : no scratch space needed
Large scale, highly mobile applications on inexpensive hardware
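
The core difficulty of rebuilding a new version inside the buffer that holds the old one is that a copy command must not read a region that an earlier command has already overwritten. The sketch below applies a delta in place and checks for that conflict. The command format is hypothetical, compression is left out, and nothing here is claimed to match the paper's actual encoding or ordering algorithm.

```python
def apply_in_place(buf, commands, new_length):
    """Apply delta commands directly to buf, a bytearray holding the old version.

    Commands are ("copy", src, dst, length) or ("add", dst, data). They must be
    ordered so that no copy reads a region an earlier command has written.
    """
    if new_length > len(buf):
        buf.extend(bytes(new_length - len(buf)))
    written = []  # (start, end) intervals already overwritten
    for cmd in commands:
        if cmd[0] == "copy":
            _, src, dst, length = cmd
            for start, end in written:
                if src < end and start < src + length:
                    raise ValueError("write-before-read conflict")
            # The right-hand slice is materialised first, so a copy whose
            # source and destination overlap is still safe.
            buf[dst:dst + length] = buf[src:src + length]
        else:
            _, dst, data = cmd
            length = len(data)
            buf[dst:dst + length] = data
        written.append((dst, dst + length))
    del buf[new_length:]  # shrink if the new version is shorter
    return buf


old = bytearray(b"ABCDEFGH")
# Target version: b"CDEFxyAB". The copy that reads old[0:2] is scheduled first,
# before the copy that overwrites that region.
ordered = [("copy", 0, 6, 2), ("copy", 2, 0, 4), ("add", 4, b"xy")]
result = apply_in_place(old, ordered, new_length=8)
print(bytes(result))
```

Applying the same commands in the naive left-to-right order of the output would overwrite `AB` before it is read, which is exactly the case the conflict check rejects.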

Short papers

Building a massive distributed storage infrastructure at Indiana University : HPSS and IBM hardware
High density holographic storage
Storage stability of MP media (Fuji)
iSCSI initiator implementation (IBM)
Efficiently scheduling tape-resident jobs
Java and Real Time Storage Applications (STK)
Performance analysis and testing of SANs

Vendor presentations

Sony DIR-2000 : 600 GB - 128 MB/s
SN 6000 : STK marketing blurb
BlueARC : NAS box offering CIFS/NFS/FTP/HTTP interfaces implemented in silicon
Brocade : Marketing blurb about SAN security
EMC : Avalon HSM software suite.
SANavigator : Management and monitoring software suite for SANs
Ampex :
Current : DST drive (300 GB helical - 20 MB/s) and library
Working on : 1 TB DST M cart, still helical ($0.30/GB) - 40 MB/s

Panel: A 10 year prospective

Magnetic disk : density up to 50 Tbit/in2
Magnetic tape : Tape still there in the future, still wins on volumetric efficiency and $/GB
Optical disk : Marketing guy (UDO = 30 GB in 2003, 120 GB in 2007)
Networked Storage : More intelligence in devices (OSD), more bandwidth, more automation
Storage security : Cost will increase A LOT
Conclusions and own impressions

Tutorials were really good and varied
A few good technology presentations for PASTA and technical ideas
Not much marketing hype except in the vendor presentations
This conference is a MUST GO for DS group (the only one ?)
The Washington area was far too hot…
URL : www.storageconference.org (Adobe Acrobat 5 required)