An End-to-End Approach to
Globally Scalable Network
Storage
Presented in cs294-4 P2P Systems
by
Sailesh Krishnamurthy
15 October 2003
Logistical Networking


Models sync/async aspects of communication
Single “fabric” that unifies:


Data Storage
Data Transportation

Internet scalability goals

Claim: end-to-end design principles vital
Background: SAN vs NAS

Current trends in storage networking


SAN - Storage Area Network
NAS - Network Attached Storage
More on NAS v. SAN

NAS



Wires: TCP/IP
Protocol: NFS, CIFS
SAN


Wires: Fibre Channel
Protocol: Encapsulated SCSI
Where is the File System?
Traditional networks

General goals




Minimize delay
Minimize probability of corruption
Maximize probability of delivery
Assumptions in traditional storage nets

When storage is closely coupled, delay and
probability of corruption can be low while
availability is high.
SANs cannot scale to SWANs


In the SWAN, resources can be
intermittently unavailable
So we need e2e strategies



Simple retries
Redundant data accesses spread across the network
High-latency archival backups
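
The first two strategies above can be sketched as follows. This is only an illustration, assuming each replica is reachable through a zero-argument callable that raises OSError when the store is unavailable; the name fetch_with_retries and its parameters are hypothetical and not part of any system described in the talk.

```python
import time
from typing import Callable, Optional, Sequence


def fetch_with_retries(
    replicas: Sequence[Callable[[], bytes]],
    attempts: int = 3,
    delay: float = 0.0,
) -> bytes:
    """Try every replica in turn; repeat the whole pass up to `attempts` times.

    `replicas` is a hypothetical list of callables, one per copy of the data.
    """
    last_error: Optional[OSError] = None
    for attempt in range(attempts):
        for fetch in replicas:
            try:
                return fetch()          # first replica that answers wins
            except OSError as err:      # treated as "intermittently unavailable"
                last_error = err
        if attempt < attempts - 1:
            time.sleep(delay)           # simple retry: wait, then try again
    raise OSError("all replicas failed") from last_error
```

The endpoint, not the storage network, decides how many copies to try and how long to wait, which is the sense in which the strategy is end-to-end.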
Correctness in the wild



SANs are “controlled” environments and
correctness is not an issue
In the SWAN, data storage may not be
reliable.
Data accuracy must be checked by
producers and consumers - at endpoints
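
A minimal sketch of checking data accuracy at the endpoints, assuming the producer can hand the consumer a digest over some trusted path. SHA-256 and the function names seal and verify are illustrative choices, not something the talk specifies.

```python
import hashlib
from typing import Tuple


def seal(data: bytes) -> Tuple[bytes, str]:
    """Producer side: return the data together with its SHA-256 digest."""
    return data, hashlib.sha256(data).hexdigest()


def verify(data: bytes, digest: str) -> bool:
    """Consumer side: recompute the digest and compare with the producer's."""
    return hashlib.sha256(data).hexdigest() == digest
```

If verify returns False, the consumer knows the store (or the path to it) altered the data and can fall back to another copy.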
SWAN Security


SWANs are not physically localized
SAN security assumptions don’t hold


Again, e2e approaches are required
DoS is a gotcha for SWANs


Can’t be prevented with e2e strategies
Imitate techniques for handling DoS in IP
Unbounded Size/Duration


Since a single store may not have all the
resources all the time, the endpoint has
to manage the distribution of data
Unbounded-duration allocation hurts
resource sharing.

Should this be managed at endpoints?
Logistical Networking

Storage Networking


IP networks: interconnection fabric of storage pool
Logistical Networking


Storage part of the networking infrastructure
Shared resource fabric exposing storage resources


Similar to how the Internet exposes bandwidth resources
Storage Stack


Bottom-up, layered e2e design approach
Internet Backplane Protocol (IBP)
Storage Stack
Applications
exNode tools and services
exNode: a data structure for aggregating network storage
IBP: allocation and management of storage on network storage depots
Local Access
Physical
IBP - Internet Backplane Protocol


First layer of stack that’s globally accessible
Abstracts access layer resources (file/block
storage services)

Expose underlying storage resources to maximize
freedom at higher levels


Enable scalable internet style resource sharing


Implement only indispensable & common functions
Mask peculiarities of access layer resource
Abstract service based on data blocks that
are managed as “byte arrays”
IP vs IBP

IP vs link layer

Aggregation of link layer packets masks packet size limits
Simple fault detection: faulty datagrams dropped
Global addressing masks differences between LANs
IP property: any participant of a routed IP network can use any
link layer connection

IBP "byte array" independence

Aggregation of access layer blocks masks fixed block size
Simple fault detection: faulty byte arrays dropped
Global addressing (IP) masks differences between access layers
IBP property: any participant of an IBP network can use any
access layer storage resource
Issues with IBP

DoS vulnerability is much worse

In IP:

DoS attacks require constant sending of data
Does not profit the attacker in any way

In IBP:

Once data block is allocated it remains used
Using remote storage does benefit the attacker

Strong semantics (reliability) of traditional
storage/SAN are difficult to implement in the
SWAN
IBP Solutions

Time-limited storage allocations


When a lease expires, the storage can be
reused for some other user
Soft storage semantics in IBP


IBP is a “best-effort” service
Allocated storage can be revoked at any time
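
The two ideas on this slide can be illustrated with a toy in-memory depot. This is a sketch under stated assumptions, not the IBP client API: the class ToyDepot, its method names, and the explicit now argument (used instead of a real clock so the behaviour is easy to follow) are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Allocation:
    size: int            # bytes reserved
    expires_at: float    # lease end; the space is reusable from this time on
    data: bytes = b""


class ToyDepot:
    """Hypothetical depot with time-limited, revocable allocations."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self.allocations: Dict[int, Allocation] = {}
        self._next_handle = 0

    def _reclaim(self, now: float) -> None:
        # Drop every allocation whose lease has expired.
        for handle in [h for h, a in self.allocations.items() if a.expires_at <= now]:
            del self.allocations[handle]

    def free_space(self, now: float) -> int:
        self._reclaim(now)
        return self.capacity - sum(a.size for a in self.allocations.values())

    def allocate(self, size: int, duration: float, now: float) -> Optional[int]:
        if size > self.free_space(now):
            return None                       # best effort: the depot may refuse
        handle = self._next_handle
        self._next_handle += 1
        self.allocations[handle] = Allocation(size, now + duration)
        return handle

    def revoke(self, handle: int) -> None:
        # Soft semantics: the depot may take storage back at any time.
        self.allocations.pop(handle, None)

    def store(self, handle: int, data: bytes, now: float) -> bool:
        self._reclaim(now)
        alloc = self.allocations.get(handle)
        if alloc is None or len(data) > alloc.size:
            return False
        alloc.data = data
        return True

    def load(self, handle: int, now: float) -> Optional[bytes]:
        self._reclaim(now)
        alloc = self.allocations.get(handle)
        return None if alloc is None else alloc.data
```

Because load can return None after expiry or revocation, any stronger guarantee has to be built by the client, for example by renewing leases or keeping copies on several depots.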
Storage Management: IBP_Allocate, IBP_Manage
Data Transfer: IBP_Store, IBP_Load, IBP_Copy, IBP_mcopy
Depot Management: IBP_Status
exNode - Flexible Aggregation
of Network Storage

Implement abstractions
with strong properties

Higher-layer construct
Aggregates primitive IBP
byte-arrays
Need to maintain state
that represents the aggregation

exNode aggregates IBP
byte-arrays as the Unix
inode aggregates disk blocks
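
The inode analogy can be made concrete with a small sketch. It assumes an exNode is a list of mappings from logical file extents to byte arrays held on depots; the field names and the dictionary standing in for the depots are hypothetical, and the real exNode format is not shown in the talk.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Mapping:
    offset: int   # logical offset of this extent within the file
    length: int   # number of bytes covered by the extent
    depot: str    # hypothetical depot name
    handle: str   # hypothetical key of the byte array on that depot


@dataclass
class ExNode:
    """Aggregates byte arrays into one logical file, as an inode does blocks."""

    mappings: List[Mapping] = field(default_factory=list)

    def size(self) -> int:
        return max((m.offset + m.length for m in self.mappings), default=0)

    def read(self, depots: Dict[str, Dict[str, bytes]]) -> bytes:
        out = bytearray(self.size())
        for m in self.mappings:
            chunk = depots[m.depot][m.handle]
            if len(chunk) < m.length:
                raise ValueError("byte array shorter than its mapping")
            out[m.offset:m.offset + m.length] = chunk[:m.length]
        return bytes(out)
```

Unlike an inode, the mappings may point at different machines, so the state that represents the aggregation lives with the client rather than inside any one store.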
e2e services for storage

exNode can hold additional metadata for
services:



Redundancy
Framing of data into segments with checksums
The exNode is analogous to the state of a TCP
connection; the data on disk is the analogue of a TCP
stream
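
A sketch of how the two services could work together, assuming the per-segment checksums are kept in the exNode metadata and each replica is a list of segment payloads fetched from some depot. CRC-32 and the names frame and reassemble are illustrative assumptions only.

```python
import zlib
from typing import List, Tuple


def frame(data: bytes, segment_size: int) -> Tuple[List[int], List[bytes]]:
    """Split data into segments and compute one CRC-32 per segment."""
    segments = [data[i:i + segment_size] for i in range(0, len(data), segment_size)]
    checksums = [zlib.crc32(segment) for segment in segments]
    return checksums, segments


def reassemble(checksums: List[int], replicas: List[List[bytes]]) -> bytes:
    """For each segment, use the first replica whose payload matches its checksum."""
    out = bytearray()
    for index, expected in enumerate(checksums):
        for replica in replicas:
            payload = replica[index]
            if zlib.crc32(payload) == expected:
                out += payload
                break
        else:
            raise ValueError(f"no intact copy of segment {index}")
    return bytes(out)
```

Framing lets the reader repair a single damaged segment from another copy instead of refetching the whole object, much as TCP retransmits only the segments that were lost.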
Relation to p2p systems

Paper compares with Napster/Gnutella



In file sharing, all allocations are at endpoints,
which leads to large data transfers
Appropriate comparison is OceanStore?
My view



exNode infrastructure is a way to create storage
services from smaller blocks
Can be useful in an OceanStore-like setting
Can alleviate some SAN shortcomings?