An End-to-End Approach to Globally Scalable Network Storage
Presented in cs294-4 P2P Systems
by
Sailesh Krishnamurthy
15 October 2003
Logistical Networking
Models synchronous/asynchronous aspects of communication
Single “fabric” that unifies:
Data Storage
Data Transportation
Internet scalability goals
Claim: end-to-end design principles vital
Background: SAN vs NAS
Current trends in storage networking
SAN - Storage Area Network
NAS - Network Attached Storage
More on NAS v. SAN
NAS
Wires: TCP/IP
Protocol: NFS, CIFS
SAN
Wires: Fibre Channel
Protocol: Encapsulated SCSI
Where is the File System?
Traditional networks
General goals
Minimize delay
Minimize probability of corruption
Maximize probability of delivery
Assumptions in traditional storage nets
When storage is closely coupled, delay and
probability of corruption can be low while
availability is high.
SANs cannot scale to SWANs
In the SWAN, resources can be
intermittently unavailable
So we need e2e strategies
Simple retries
Redundant data accesses spread across the network
High-latency archival backups
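The first two strategies above (simple retries and redundant accesses) can be sketched roughly as follows. This is an illustrative sketch only: `fetch_with_retries` and the idea of representing each replica as a zero-argument callable are assumptions made for the example, not anything defined in the talk or in IBP.

```python
import time
from typing import Callable, Optional, Sequence


def fetch_with_retries(replicas: Sequence[Callable[[], bytes]],
                       attempts_per_replica: int = 3,
                       delay_s: float = 0.0) -> bytes:
    """Try each replica in turn, retrying a few times before moving on.

    Each replica is a hypothetical callable that returns the data or
    raises OSError when the resource is (perhaps temporarily) unavailable.
    """
    last_error: Optional[Exception] = None
    for fetch in replicas:
        for _ in range(attempts_per_replica):
            try:
                return fetch()
            except OSError as err:
                last_error = err
                time.sleep(delay_s)  # simple fixed pause between retries
    raise OSError("all replicas failed") from last_error
```

The endpoint, not the storage fabric, decides how many times to retry and which copy to try next.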
Correctness in the wild
SANs are “controlled” environments and
correctness is not an issue
In the SWAN, data storage may not be
reliable.
Data accuracy must be checked by
producers and consumers - at endpoints
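One way to picture an endpoint check is a digest computed by the producer and re-checked by the consumer. The sketch below assumes SHA-256 and two hypothetical helpers, `publish` and `verify`; the talk does not say which checksum or API is used.

```python
import hashlib
from typing import Tuple


def publish(data: bytes) -> Tuple[bytes, str]:
    """Producer side: return the data together with its SHA-256 digest."""
    return data, hashlib.sha256(data).hexdigest()


def verify(data: bytes, expected_digest: str) -> bool:
    """Consumer side: recompute the digest and compare it to the expected one."""
    return hashlib.sha256(data).hexdigest() == expected_digest
```

The storage in between is never trusted; only the two endpoints decide whether the data is accurate.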
SWAN Security
SWANs are not physically localized
SAN security assumptions don’t hold
Again, e2e approaches are required
DoS is a gotcha for SWANs
Can’t be prevented with e2e strategies
Imitate techniques for handling DoS in IP
Unbounded Size/Duration
Since a single store may not have all the
resources all the time, the endpoint has
to manage distribution of data
Unbounded duration allocation hurts
resource sharing.
Should this be managed at endpoints?
Logistical Networking
Storage Networking
IP networks: interconnection fabric of storage pool
Logistical Networking
Storage part of the networking infrastructure
Shared resource fabric exposing storage resources
Similar to how internet exposes bandwidth resources
Storage Stack
Bottom-up, layered e2e design approach
Internet Backplane Protocol (IBP)
Storage Stack
Applications
exNode tools and services
exNode
A data structure for aggregating network storage
IBP
Allocation and management of storage on network
storage depots
Local Access
Physical
IBP - Internet Backplane Protocol
First layer of stack that’s globally accessible
Abstracts access layer resources (file/block
storage services)
Expose underlying storage resources to maximize
freedom at higher levels
Enable scalable internet style resource sharing
Implement only indispensable & common functions
Mask peculiarities of access layer resource
Abstract service based on data blocks that
are managed as “byte arrays”
IP vs IBP
IP vs link layer
Aggregation of link layer packets masks packet size limits
Simple fault detection: faulty datagrams dropped
Global addressing masks differences between LANs
IP Property: any participant of a routed IP network can use any link layer connection
IBP “byte array” independence
Aggregation of access layer blocks masks fixed block size
Simple fault detection: faulty byte arrays dropped
Global addressing (IP) masks differences between access layers
IBP Property: any participant of an IBP network can use any access layer storage resource
Issues with IBP
DoS vulnerability is much worse
In IP:
DoS attacks require constant sending of data
Does not profit the attacker in any way
In IBP:
Once data block is allocated it remains used
Using remote storage does benefit the attacker
Strong semantics (reliability) of traditional
storage/SAN are difficult to implement in the
SWAN
IBP Solutions
Time-limited storage allocations
When a lease expires, the storage can be
reused for some other user
Soft storage semantics in IBP
IBP is a “best-effort” service
Allocated storage can be revoked at any time
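A toy model of time-limited, best-effort allocations might look like the sketch below. `ToyDepot` and its methods are hypothetical names invented for illustration and are not the real IBP client API; time is passed in explicitly as `now` so the behaviour is easy to follow.

```python
import itertools
from typing import Dict, Optional, Tuple


class ToyDepot:
    """In-memory stand-in for a storage depot with leased allocations."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        # handle -> (size, expiry time, stored bytes)
        self._allocs: Dict[int, Tuple[int, float, bytearray]] = {}
        self._ids = itertools.count(1)

    def _reap(self, now: float) -> None:
        """Drop every allocation whose lease has expired."""
        expired = [h for h, (_, exp, _) in self._allocs.items() if exp <= now]
        for handle in expired:
            del self._allocs[handle]

    def allocate(self, size: int, duration: float, now: float) -> Optional[int]:
        """Grant a lease if space is free; return None otherwise (best effort)."""
        self._reap(now)
        used = sum(s for s, _, _ in self._allocs.values())
        if used + size > self.capacity:
            return None
        handle = next(self._ids)
        self._allocs[handle] = (size, now + duration, bytearray())
        return handle

    def store(self, handle: int, data: bytes, now: float) -> bool:
        """Append data to a live allocation; fail if expired or too large."""
        self._reap(now)
        alloc = self._allocs.get(handle)
        if alloc is None or len(alloc[2]) + len(data) > alloc[0]:
            return False
        alloc[2].extend(data)
        return True

    def load(self, handle: int, now: float) -> Optional[bytes]:
        """Return the stored bytes, or None once the lease has expired."""
        self._reap(now)
        alloc = self._allocs.get(handle)
        return bytes(alloc[2]) if alloc is not None else None
```

Because every lease eventually expires, an attacker who fills the depot holds the space only for the lease duration, and callers must be prepared for `load` to return nothing.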
Storage Management: IBP_Allocate, IBP_Manage
Data Transfer: IBP_Store, IBP_Load, IBP_Copy, IBP_mcopy
Depot Management: IBP_Status
exNode - Flexible Aggregation of Network Storage
Implement abstractions with strong properties
Higher layer construct
Aggregates primitive IBP byte-arrays
Need to maintain state that represents the aggregation
exNode aggregates IBP byte-arrays as the Unix inode aggregates disk blocks
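The inode analogy can be made concrete with a small data structure that maps logical file extents to byte arrays held on depots. The class and field names below (`Mapping`, `ExNode`, `depot`, `handle`) are assumptions for this sketch and do not reflect the actual exNode format.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Mapping:
    offset: int   # logical offset of this extent within the file
    length: int   # number of bytes covered by the extent
    depot: str    # hypothetical depot name
    handle: str   # hypothetical identifier of the byte array on that depot


@dataclass
class ExNode:
    mappings: List[Mapping] = field(default_factory=list)

    def add(self, mapping: Mapping) -> None:
        self.mappings.append(mapping)

    def lookup(self, offset: int) -> List[Mapping]:
        """Return every mapping (including replicas) that covers the offset."""
        return [m for m in self.mappings
                if m.offset <= offset < m.offset + m.length]

    def size(self) -> int:
        """Logical file size: the furthest byte covered by any mapping."""
        return max((m.offset + m.length for m in self.mappings), default=0)
```

Unlike an inode, several mappings may cover the same extent, which is how the aggregation can express replicas spread across depots.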
e2e services for storage
exNode can hold additional metadata for
services:
Redundancy
Framing of data into segments w/ checksums
exNode is analogous to the state of a TCP
connection; the data on disk is the analogue
of a TCP stream
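Framing data into checksummed segments could be sketched as below. CRC32 (via `zlib.crc32`) is an assumption chosen for brevity, and `frame` and `reassemble` are hypothetical helpers; the talk does not specify the checksum or the segment layout.

```python
import zlib
from typing import List, Sequence, Tuple


def frame(data: bytes, segment_size: int) -> List[Tuple[bytes, int]]:
    """Split data into fixed-size segments, each paired with its CRC32."""
    segments = [data[i:i + segment_size]
                for i in range(0, len(data), segment_size)]
    return [(segment, zlib.crc32(segment)) for segment in segments]


def reassemble(frames: Sequence[Tuple[bytes, int]]) -> bytes:
    """Verify each segment at the receiving endpoint, then join them."""
    out = bytearray()
    for index, (segment, checksum) in enumerate(frames):
        if zlib.crc32(segment) != checksum:
            raise ValueError(f"segment {index} failed its checksum")
        out.extend(segment)
    return bytes(out)
```

Checking per segment means a consumer that detects a bad segment only needs to re-fetch that segment, possibly from a redundant copy, rather than the whole object.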
Relation to p2p systems
Paper compares with Napster/Gnutella
In file sharing all allocations are at endpoints,
which leads to large data transfers
Appropriate comparison is OceanStore?
My view
exNode infrastructure is a way to create storage
services from smaller blocks
Can be useful in an OceanStore-like setting
Can alleviate some SAN shortcomings?