Object-Based Network Storage Systems
Shang Rong Tsai
DSLab
Institute of Computer and Communication
Department of Electrical Engineering
National Cheng-Kung University
Outline
• Background and Introduction
• Network Storage/SAN/NAS
• Object-Based Storage
• Epilogue
The Epoch of Data Explosion
• Electronic data are growing continuously
• Data and storage management are becoming more and more important
  – Continuous service and data availability
  – Data backup and restore
  – Storage and data sharing
  – Efficiency in data and storage management
    • Expecting to use less manpower
    • Flexibility in system configuration
    • Efficiently expanding storage capacity
Network Storage Systems
• Storage is moving onto networks for sharing and efficient management
• Such demands pushed the emergence of the SAN (Storage Area Network)
• NAS (Network Attached Storage) has been in use for a longer time
• Object-Based Storage System
  – Both SAN and NAS have their own merits and drawbacks.
  – SAN supports data access at the block level, which is not good for applications that need to share data
  – NAS imposes a bottleneck on its servers and scales poorly
  – The object-based approach potentially eliminates these drawbacks
• Network storage technologies offer a new platform for networking and storage people to play a new game.
Conceptual Model of SAN
[Diagram: clients connect to the hosts (Linux, Windows, SunOS servers) over a client network; the hosts connect to the storage devices (disks and a tape) over the SAN.]
What is a SAN
• A SAN is a high-speed network (traditionally Fibre Channel) that connects storage to servers (hosts). The network essentially replaces the storage bus of traditional shared-bus storage systems, thus enlarging the possible sharing scale, enabling failover, etc.
• Access to SAN storage is at the block level
Why SANs (1/2)
• From the communication aspect, a SAN can bypass possible communication bottlenecks. It enables communication between
  – Server-to-server
  – Server-to-storage (the typical model)
  – Storage-to-storage (e.g. for backup without the servers' intervention)
Why SANs (2/2)
• Improved application availability: storage is accessible through multiple data paths for better reliability, availability, and serviceability.
• Higher application performance: storage processing is off-loaded from servers
• Virtualized storage: storage on the SAN can be flexibly configured as logical volumes of any size, making it easy to share storage space at configuration time
• Data backup to remote sites: enabled for disaster protection
• Simplified centralized management: a single point of management for all storage on the SAN
Storage Virtualization (1/2)
• The key technology of the SAN
• SNIA defines storage virtualization as:
  "The act of integrating one or more (back end) services or functions with additional (front end) functionality for the purpose of providing useful abstractions. Typically, virtualization hides some of the back-end complexity, or adds or integrates new functionality with existing back end services. Examples of virtualization are the aggregation of multiple instances of a service into one virtualized service, or to add security to an otherwise insecure service. Virtualization can be nested or applied to multiple layers of a system."
Storage Virtualization (2/2)
• More practically, storage virtualization is the aggregation of physical storage from multiple network storage devices into a single logical storage device that is managed and used by a central host.
• The Logical Volume Manager (LVM), which has been in use for many years, is essentially this concept of storage virtualization.
• With storage virtualization we can easily and flexibly configure the size of a logical disk.
• For some applications or sites, the amount of storage required grows at unprecedented rates. Consider, for example, what happens when a disk partition becomes full.
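As a rough illustration of this aggregation idea, the Python sketch below concatenates several physical extents into one logical address space and translates a logical byte offset into a (device, offset) pair. The device names and sizes are invented for the example; real volume managers such as LVM add striping, metadata, and online resizing on top of this.

    # Minimal sketch (not real LVM code) of the core idea behind storage
    # virtualization: physical extents are concatenated into one logical
    # address space, and each logical offset is translated to a
    # (device, offset) pair before the I/O is issued to the underlying disk.
    class LogicalVolume:
        def __init__(self, extents):
            # extents: ordered list of (device_name, size_in_bytes)
            self.extents = extents

        def map(self, logical_offset):
            """Translate a logical byte offset to (device, physical offset)."""
            remaining = logical_offset
            for device, size in self.extents:
                if remaining < size:
                    return device, remaining
                remaining -= size
            raise ValueError("offset beyond the end of the logical volume")

    # Example: a 3 TB logical disk built from three 1 TB physical disks.
    lv = LogicalVolume([("sda", 10**12), ("sdb", 10**12), ("sdc", 10**12)])
    print(lv.map(15 * 10**11))   # -> ('sdb', 500000000000)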
SAN Connectivity
• Traditionally, SANs used Fibre Channel technology to implement the storage network
• Fibre Channel SANs support high-bandwidth storage traffic at 200 MB/s, with enhancements to 10 Gb/s expected in the near future; these will mostly be used for inter-switch links (ISLs) between switches.
• The iSCSI SAN (SCSI over TCP/IP) is a relatively new approach to storage networks.
• Fibre Channel over IP (FCIP)
iSCSI
• iSCSI stands for Internet Small Computer System Interface
• iSCSI is a protocol for encapsulating SCSI commands on a TCP/IP network
• The iSCSI protocol enables universal access to storage devices and Storage Area Networks over a TCP/IP network
SCSI Protocol Layers
• SCSI command layer
  – Generic commands (for all devices)
  – Device-specific commands
• Transport layer
• Physical layer (connectivity layer)
SCSI Protocol Layer
[Diagram of the SCSI protocol stack: a SCSI application (e.g. a file system) issues commands to the SCSI command layer, which comprises the SCSI generic (primary) commands plus SCSI block commands, SCSI stream commands, and commands for other device types. The SCSI transport layer carries these commands over parallel SCSI, SCSI over Fibre Channel, or iSCSI over TCP/IP; the corresponding physical layers are the parallel SCSI bus, Fibre Channel, and layer-2 Ethernet.]
iSCSI PDU
[Diagram of the iSCSI PDU format, taken from SNIA.]
Overview of iSCSI
• iSCSI provides initiators and targets with unique names and a
discovery method.
• The iSCSI protocol establishes communication sessions between
initiators and targets, and provides methods for them to authenticate
one another.
• An iSCSI session may contain one or more TCP connections and
provides recovery in case of connection failures.
• SCSI CDBs (Command Descriptor Blocks) are passed from the SCSI command layer to the iSCSI transport layer. The iSCSI transport layer encapsulates each SCSI CDB into an iSCSI Protocol Data Unit (PDU) and forwards it to the Transmission Control Protocol (TCP) layer.
• iSCSI provides the SCSI command layer with a reliable transport.
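To make the layering concrete, the toy Python sketch below builds a real SCSI READ(10) CDB, wraps it in a simplified header, and sends it to a target over TCP. Only the CDB layout is the genuine SCSI format; the "PDU" header, the helper names, and the target address are illustrative and are not the actual iSCSI Basic Header Segment or login sequence defined in the iSCSI RFCs.

    import socket
    import struct

    def build_read10_cdb(lba, num_blocks):
        # SCSI READ(10): opcode 0x28, flags, 4-byte LBA, group, 2-byte length, control
        return struct.pack(">BBIBHB", 0x28, 0, lba, 0, num_blocks, 0)

    def encapsulate(cdb, task_tag):
        # Simplified, made-up "PDU": 1-byte opcode, 4-byte task tag, 2-byte CDB length
        header = struct.pack(">BIH", 0x01, task_tag, len(cdb))
        return header + cdb

    # Hypothetical target address; a real initiator would first log in and
    # negotiate session parameters before sending SCSI commands.
    pdu = encapsulate(build_read10_cdb(lba=2048, num_blocks=8), task_tag=1)
    with socket.create_connection(("192.0.2.10", 3260)) as sock:  # 3260 = iSCSI port
        sock.sendall(pdu)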
What is NAS
• Network Attached Storage (NAS) is basically a LAN-attached file server that provides shared file access using a file-sharing protocol such as the Network File System (NFS) or the Common Internet File System (CIFS)
• Access to NAS is at the file level.
Why NAS
• NAS technology has been used for decades to share files, thereby saving storage space and keeping data consistent (in contrast to copying files, e.g. with FTP).
• Data sharing in NAS is at the file level, which matches the semantics that applications work with. In contrast, sharing in a SAN is at the block level, which makes read/write data sharing very difficult (if not impossible) for applications. (Applications recognize files, not blocks.)
What is OSD (from Erik Riedel, Seagate)
• The Object-based Storage Device interface standard is focused on moving chosen low-level storage, space management, and security functions into storage devices (disks, subsystems, appliances) to enable the creation of scalable, self-managed, protected and heterogeneous shared storage for storage networks.
What is an Object-Based Storage
• Storage devices that operate at the object level
• Traditional storage devices (such as DAS and SAN) operate at the block level
• Objects are typically files, which match the semantic level at which applications manipulate data.
• In traditional systems, files are mapped to blocks before they are stored.
Why Object-Based Storage (1/2)
• Drawbacks of NAS
  – Most of the processing for file access is done on the file server => poor scalability
  – It is difficult to distribute users' files across multiple servers while preserving a single global file namespace.
    • We may delegate the management of a subset of the whole file namespace to each file server; however, the load distributed to each server may be very uneven.
Why Object-Based Storage (2/2)
• Drawbacks of SAN
  – SANs operate at the block level, so sharing data between applications still needs upper-layer software for
    • File read/write sharing
    • Record/file locking
• Object-Based Storage can potentially eliminate the drawbacks of both NAS and SAN
The Value of Objects (from SNIA)
• Better security via capabilities
– Each object can have its own security domain
– All I/O is authorized by the device
• Easier to share data
– Files and records can be stored as objects
– Low-level metadata managed by device
• Opportunities for intelligence
– Attribute-based learning for resource allocation
• Better caching, pre-fetching and staging of data
– Self-configuring storage w/ continuous reorganization
• Layout objects to best serve client requests
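To illustrate "better security via capabilities", here is a minimal Python sketch: a security manager issues a keyed-MAC capability naming an object and the rights granted, and the OSD verifies the capability before serving any I/O on that object. The key handling and field layout are made up for the example and do not follow the actual T10 credential format.

    import hashlib
    import hmac

    SHARED_KEY = b"key shared by the security manager and the OSD"  # illustrative

    def make_capability(object_id, rights):
        """Security manager: issue a signed capability for one object."""
        msg = f"{object_id}:{rights}".encode()
        tag = hmac.new(SHARED_KEY, msg, hashlib.sha256).hexdigest()
        return {"object_id": object_id, "rights": rights, "tag": tag}

    def osd_authorize(cap, object_id, requested):
        """OSD: verify the capability before performing the requested I/O."""
        msg = f"{cap['object_id']}:{cap['rights']}".encode()
        expected = hmac.new(SHARED_KEY, msg, hashlib.sha256).hexdigest()
        return (hmac.compare_digest(expected, cap["tag"])
                and cap["object_id"] == object_id
                and requested in cap["rights"])

    cap = make_capability(object_id=42, rights="r")
    print(osd_authorize(cap, 42, "r"))   # True
    print(osd_authorize(cap, 42, "w"))   # False: write access was never granted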
Two Basic Components in File Service
• Directory Service
  – Providing the global file name space visible to users and applications
  – Mapping a file pathname to a unique file id
  – May need the flat file system to access directories
• Flat File System
  – Given the file id, returning the file contents
  – Storing file attributes
  – Managing file allocation on disks
Typical Handling in File Access
• File pathname lookup: open /a/b/c
  – Symbolic file name => file id (inode in Unix)
  – For each file path component, get its file id and read its contents, until the file id of the target is obtained
• Get the attributes of the file (including the location of the file on disks)
• Ready for R/W
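A minimal Python sketch of this lookup loop follows, using a toy in-memory flat file system (file id to contents and attributes) with directories stored as ordinary files that map names to file ids; the ids and data are invented for illustration.

    # Toy flat file system: file id -> object. Directories map names to file ids.
    FILES = {
        1: {"type": "dir",  "entries": {"a": 2}},                    # root "/"
        2: {"type": "dir",  "entries": {"b": 3}},                    # /a
        3: {"type": "dir",  "entries": {"c": 4}},                    # /a/b
        4: {"type": "file", "data": b"hello", "blocks": [17, 42]},   # /a/b/c
    }

    def lookup(path, root_id=1):
        """Resolve a pathname to a file id, one component at a time."""
        fid = root_id
        for component in path.strip("/").split("/"):
            entry = FILES[fid]                  # read the directory's contents
            if entry["type"] != "dir":
                raise NotADirectoryError(component)
            fid = entry["entries"][component]   # symbolic name -> next file id
        return fid

    fid = lookup("/a/b/c")
    print(fid, FILES[fid]["blocks"])   # 4 [17, 42]: attributes give the disk location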
How to Restructure the File System to Fit the Object-Based Storage Architecture?
• We may (as Lustre did) partition the functions into
  – MDS (Meta Data Server)
  – OSD (Object Storage Device)
• What is the total system architecture?
• Which data and management functions are delegated to the MDS?
• Which data and management functions are delegated to the OSDs?
• Important observation: typically, about 90% of file processing is done by the OSDs and 10% by the MDS => this balances the loads
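The Python sketch below shows how a client might split one file read under this partitioning: a single metadata round trip to the MDS to open the file and learn its object layout, then data transfers that go directly to the OSDs. The class and method names are illustrative and are not Lustre's actual interfaces.

    # Illustrative MDS/OSD split; all names and return values are made up.
    class MetaDataServer:
        def open(self, path):
            # Returns the object id and the OSDs holding the object's stripes.
            return {"object_id": 0x1234, "osds": ["osd1", "osd2"]}

    class ObjectStorageDevice:
        def __init__(self, name):
            self.name = name
            self.objects = {0x1234: b"stripe held by " + name.encode()}

        def read(self, object_id, offset, length):
            return self.objects[object_id][offset:offset + length]

    mds = MetaDataServer()
    osds = {name: ObjectStorageDevice(name) for name in ("osd1", "osd2")}

    layout = mds.open("/a/b/c")            # metadata path: one MDS round trip
    data = b"".join(                       # data path: straight to the OSDs
        osds[name].read(layout["object_id"], 0, 64) for name in layout["osds"]
    )
    print(data)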
Traditional Structure vs. Object-Based Structure
[Diagram taken from the SNIA/T10 standard document.]
Lustre
• A system developed at CMU; a very early working system demonstrating the concept of object-based storage systems.
• Modifies the Linux kernel
• A small number of MDSs
• A large number of OSDs
• Mainly targets applications in large-scale computing
  – Large numbers of users
  – File service for high-performance computing
• Good scalability
Lustre Architecture (from CFS Inc.)
[Diagram: Lustre clients (about 1,000 running Lustre Lite, scaling up to the tens of thousands) connect over networks such as QSW Net and GigE to a pair of metadata servers (MDS 1 active, MDS 2 failover, running Linux) and to Lustre Object Storage Targets (OST 1 through OST 7), which are implemented as OST servers with disk arrays or as third-party OST appliances, some of them backed by a SAN.]
Lustre Components and Functions (1/2)
• Basic components
  – Client filesystems
    • Interface to the local file system (VFS) on the clients
    • Act as MDS clients
    • Act as OST clients
  – Metadata Servers
    • All metadata operations (creating new directories, files, or symbolic links, and acquiring or updating inodes) are handled by the MDS.
  – Object Storage Targets
    • All file-I/O-related operations are directed to the OSTs.
Lustre Components and Functions (2/2)
• The role of the client filesystem is to provide a directory tree, subdivided into filesets, which provides cluster-wide Unix file-sharing semantics.
• The client filesystem interacts with the metadata servers for metadata handling, i.e. for the acquisition and update of inodes and directory information.
• File I/O, including the allocation of blocks, striping, and security enforcement, is contained in the protocol between the client filesystem and the object storage targets.
• A third protocol exists between the OST and the MDS, largely for pre-allocation and recovery purposes.
SNIA T10 Specification
• A specification developed for Object-Based Disks
• It defines a SCSI command set that provides efficient peer-to-peer operation of input/output logical units that manage the allocation, placement, and accessing of variable-size data-storage containers, called objects.
Some Operations Defined in T10
• Format OSD: define the OSD structure on the device
• Create Object Group: define a set in which to create objects
• Create Object: create an object, returning the object id (fid)
• Read (fid, starting byte, length)
• Write (fid, starting byte, length)
• Get attributes of an object
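To show how these operations fit together, here is a toy in-memory object store in Python exposing create, write, read, and get-attributes at the object level. It mirrors the operations listed above but is only a sketch, not the real T10 SCSI command encoding.

    import itertools

    class ToyOSD:
        """In-memory stand-in for an object-based disk."""
        def __init__(self):
            self._next_fid = itertools.count(1)
            self._objects = {}     # fid -> bytearray of object data
            self._attrs = {}       # fid -> dict of object attributes

        def create_object(self):
            fid = next(self._next_fid)
            self._objects[fid] = bytearray()
            self._attrs[fid] = {"length": 0}
            return fid                                # create returns the object id

        def write(self, fid, start, data):
            obj = self._objects[fid]
            if len(obj) < start + len(data):
                obj.extend(b"\0" * (start + len(data) - len(obj)))
            obj[start:start + len(data)] = data
            self._attrs[fid]["length"] = len(obj)

        def read(self, fid, start, length):
            return bytes(self._objects[fid][start:start + length])

        def get_attributes(self, fid):
            return dict(self._attrs[fid])

    osd = ToyOSD()
    fid = osd.create_object()
    osd.write(fid, 0, b"object-based storage")
    print(osd.read(fid, 0, 6))         # b'object'
    print(osd.get_attributes(fid))     # {'length': 20}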
Possible Intelligent Functions
• OSDs serve and manage data at the object level (objects most often representing files). This enables OSDs to enforce management intelligently on a per-object basis
  – Automatic replication/backup
  – QoS
  – Caching/prefetching
  – Optimal layout
  – Preprocessing/post-processing (such as data compression/decompression and data filtering)
How to Support the Intelligent Functions?
• Extensible file attributes would be a good way
• Users can define their own file attributes. A special 'code' attribute can be assigned to a particular file; setting the code attribute on a file in effect installs the code on the OSD to execute the intelligent functions within the OSD. (See the sketch after this slide.)
• How is the code attribute set in practice?
  – Consideration of platform (OSD) dependence
  – Sharing the code on the OSD
• How do we command the OSDs to execute the intelligent functions automatically?
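A small Python sketch of the 'code' attribute idea: the client installs a callable as an object attribute, and the object applies it on every write (here, compression). The attribute name and the hook point are hypothetical; the slides deliberately leave the real mechanism, and its platform-dependence and code-sharing issues, open.

    import zlib

    class SmartObject:
        """Toy object that applies an installed 'code' attribute on every write."""
        def __init__(self):
            self.data = b""
            self.attrs = {}

        def set_attribute(self, name, value):
            self.attrs[name] = value

        def write(self, data):
            hook = self.attrs.get("code")      # the installed intelligent function
            if hook is not None:
                data = hook(data)              # e.g. compress before storing
            self.data = data
            self.attrs["length"] = len(self.data)

    obj = SmartObject()
    obj.set_attribute("code", zlib.compress)   # "installing the code" on the OSD
    obj.write(b"a" * 1000)
    print(obj.attrs["length"])                 # far smaller than 1000 bytes stored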
Epilogue
• Network storage technologies are receiving more attention from the IT industry for data sharing and effective management of huge, dynamically growing amounts of data.
• SAN and NAS each have their roles to play and are currently the major approaches to network storage.
• SAN suffers from difficulty in direct data sharing.
• NAS suffers from difficulty in system scalability.
• Object-Based Storage emerges as a new approach to solve both problems.
• Object-Based Storage offers new opportunities for intelligent storage devices, which potentially gives rise to many research topics.