Transcript chapter 10

Storage Virtualization
Team 3
Jennifer Brola-Richards
Mohib Fanek
Kathy Larson
Donovan Miles
Vishu Reddy
Fran Trees

Storage Virtualization
Storage Evolution and Fundamental Concepts
What, Where and How of Storage Virtualization?
Case Study
Research Topics in Storage Virtualization
What are innovations and fundamental concepts associated with storage?
Storage Virtualization Deep Dive
What is storage virtualization and why storage virtualization?
What are potential topics of research and dissertation?
Summary and Verbal Quiz
What is storage virtualization?
Storage Virtualization is the next frontier in Storage Advances
that aims to provide a layer of abstraction to reduce complexity.
Storage Networking Industry Association (SNIA) defines
Storage Virtualization as:
1. The act of abstracting, hiding, or isolating the internal
functions of a storage (sub)system or service from
applications, host computers, or general network resources,
for the purpose of enabling application and network-independent
management of storage or data.
2. The application of virtualization to storage services or
devices for the purpose of aggregating functions or devices,
hiding complexity, or adding new capabilities to lower
level storage resources.
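The "aggregating devices" and "hiding complexity" parts of the SNIA definition can be illustrated with a minimal sketch. It assumes two in-memory devices concatenated into one virtual volume; the class names `PhysicalDevice` and `VirtualVolume` are hypothetical and do not come from SNIA or any product.

```python
class PhysicalDevice:
    """A hypothetical fixed-size block device held in memory."""

    def __init__(self, name, num_blocks, block_size=512):
        self.name = name
        self.block_size = block_size
        self.blocks = [bytes(block_size) for _ in range(num_blocks)]

    def read(self, lba):
        return self.blocks[lba]

    def write(self, lba, data):
        if len(data) != self.block_size:
            raise ValueError("data must be exactly one block")
        self.blocks[lba] = bytes(data)


class VirtualVolume:
    """Presents several devices as one contiguous block address space."""

    def __init__(self, devices):
        self.devices = list(devices)

    @property
    def num_blocks(self):
        return sum(len(device.blocks) for device in self.devices)

    def _locate(self, virtual_lba):
        """Translate a virtual block address to (device, physical address)."""
        if virtual_lba < 0:
            raise IndexError("virtual block address out of range")
        offset = virtual_lba
        for device in self.devices:
            if offset < len(device.blocks):
                return device, offset
            offset -= len(device.blocks)
        raise IndexError("virtual block address out of range")

    def read(self, virtual_lba):
        device, lba = self._locate(virtual_lba)
        return device.read(lba)

    def write(self, virtual_lba, data):
        device, lba = self._locate(virtual_lba)
        device.write(lba, data)


disk_a = PhysicalDevice("disk_a", num_blocks=4)
disk_b = PhysicalDevice("disk_b", num_blocks=8)
volume = VirtualVolume([disk_a, disk_b])

payload = b"x" * 512
volume.write(5, payload)  # virtual block 5 lands on disk_b, physical block 1
```

The host only sees `volume` and its 12 blocks; which physical device holds a given block is hidden behind `_locate`.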
Why storage virtualization?
Storage virtualization aims to provide a layer of
abstraction to manage storage and reduce complexity:
• Provide continuous availability despite exponential growth (e.g. Facebook: over 55 billion page views a month, 41 million active users (1)).
• Effectively group and manage heterogeneous storage devices and servers (e.g. an estimated 450,000 Google servers (2)).
• Allocate and manage storage in accordance with the Quality of Service (QoS) associated with the data (e.g. Gartner estimates that the average data center doubles its storage every 18 to 24 months).
• Mergers and acquisitions (e.g. Microsoft and Yahoo!).
• Multiple storage software platforms (e.g. IBM, EMC, HP, ...).
Sources: (1) Lucas Nealan, php|works, Atlanta, September 13, 2007. (2) Wikipedia.
What are the innovations and fundamentals associated with storage?
Client-side storage innovations: a variety of storage devices
that are smaller, higher in capacity, and cheaper have
helped end users cope with increasing storage requirements.
What are the innovations and fundamentals associated with storage?
Server-side storage innovations: a combination of
storage device, storage interface, and storage software
innovations has helped enterprises cope with the exponential
growth of data storage requirements.
Storage devices have evolved from tapes to hard drives to
RAID hard drives, increasing capacity and resiliency.
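As one illustration of how RAID adds resiliency, the sketch below shows XOR parity, the idea behind parity-based RAID levels. It is a simplified assumption-laden example (three tiny data blocks, one parity block, no striping or rotation), not a description of any particular RAID implementation.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equally sized byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three data blocks, each imagined to live on a different drive.
data_blocks = [b"AAAA", b"BBBB", b"CCCC"]

# The parity block is stored on a fourth drive.
parity = xor_blocks(data_blocks)

# Simulate losing the second drive: XOR of the survivors and the
# parity block reproduces the missing block.
survivors = [data_blocks[0], data_blocks[2], parity]
rebuilt = xor_blocks(survivors)
```

Because XOR is its own inverse, any single missing block can be recomputed from the remaining blocks plus parity.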
What are the innovations and fundamentals associated with storage?
Storage interfaces have evolved from SCSI to
iSCSI, Fibre Channel (FCP), and InfiniBand to interconnect
devices and transport data faster.
Figure labels: SCSI, iSCSI, FCP, InfiniBand
What are the innovations and fundamentals associated with storage?
Storage access: file-level access takes center stage
alongside conventional block-level access.
Block-level access: block addresses are used to read and write
data [Read/Write, Block #] on the storage media.
Figure: Sample conventional Block Allocation Map
File-level access: files are accessed through "semantic"
instructions [example: Open, Close]. Data inside files is
accessed by byte ranges within the file (example: the first 10
bytes of a file). GFS (Google File System) is an example of
a large-scale distributed file system.
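The difference between the two access styles can be sketched as follows. This is a toy model under stated assumptions: `BlockStore` and `TinyFileSystem` are hypothetical names, blocks are 16 bytes, and the allocation map is a plain dictionary rather than an on-disk structure.

```python
BLOCK_SIZE = 16


class BlockStore:
    """Block-level access: callers name a block number, nothing else."""

    def __init__(self, num_blocks):
        self.blocks = [bytes(BLOCK_SIZE)] * num_blocks

    def read_block(self, block_no):
        return self.blocks[block_no]

    def write_block(self, block_no, data):
        # Pad short writes with zero bytes so every block has the same size.
        self.blocks[block_no] = data.ljust(BLOCK_SIZE, b"\0")[:BLOCK_SIZE]


class TinyFileSystem:
    """File-level access layered on top of the block store."""

    def __init__(self, store):
        self.store = store
        self.allocation = {}  # file name -> list of block numbers
        self.sizes = {}       # file name -> length in bytes
        self.next_free = 0    # naive allocator: blocks are never reused

    def write_file(self, name, data):
        blocks = []
        for start in range(0, len(data), BLOCK_SIZE):
            self.store.write_block(self.next_free, data[start:start + BLOCK_SIZE])
            blocks.append(self.next_free)
            self.next_free += 1
        self.allocation[name] = blocks
        self.sizes[name] = len(data)

    def read_range(self, name, offset, length):
        """Return up to `length` bytes starting at byte `offset` of the file."""
        content = b"".join(self.store.read_block(b) for b in self.allocation[name])
        end = min(offset + length, self.sizes[name])
        return content[offset:end]


store = BlockStore(num_blocks=8)
fs = TinyFileSystem(store)
fs.write_file("notes.txt", b"storage virtualization hides complexity")

first_ten = fs.read_range("notes.txt", 0, 10)  # file-level: name + byte range
raw_block = store.read_block(1)                # block-level: block number only
```

The file-level caller never learns which blocks hold its bytes; the block-level caller gets exactly one block and has no notion of a file.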
What are the innovations and fundamentals associated with storage?
Metadata is data about data. In the context of storage,
metadata may describe an individual datum or content
item, or a collection of data that includes multiple content
items.
Examples include file size, who created the file, attributes
such as read-only, free block bitmaps, and control data.
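A few of these metadata examples can be inspected with the Python standard library. The sketch below assumes a temporary file created only for illustration; note that `st_uid` reports the owning user, which is not necessarily the user who created the file, and the free block bitmap is a hand-made list rather than a real file system structure.

```python
import os
import stat
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "example.txt")
    with open(path, "wb") as handle:
        handle.write(b"hello storage")

    os.chmod(path, stat.S_IRUSR)  # mark the file read-only for its owner
    info = os.stat(path)          # ask the file system for the file's metadata

    file_size = info.st_size      # size in bytes
    owner_id = info.st_uid        # numeric id of the owning user
    read_only = not (info.st_mode & stat.S_IWUSR)

    # Restore write permission so the temporary directory can be cleaned up.
    os.chmod(path, stat.S_IRUSR | stat.S_IWUSR)

# A toy free block bitmap: True means the block is free.
free_bitmap = [True] * 8
free_bitmap[2] = False
free_bitmap[5] = False
free_blocks = [index for index, is_free in enumerate(free_bitmap) if is_free]
```

None of these values is part of the file's contents; they all describe the file or the device that holds it.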
What are the innovations and fundamentals associated with storage?
Storage software has evolved from simple backup and restore to advanced
storage networks and storage management software functions.
(A) Simple Direct Attached Storage (DAS)
(B) Storage Area Network (SAN)
(C) Network Attached Storage (NAS)
What are the innovations and fundamentals associated with storage?
SAN and NAS: Key Differences
                              NAS                  SAN
Access Methods                File access          Disk block access
Access Medium                 Ethernet             Fibre Channel
Architecture                  Decentralized        Centralized
Transport Protocol            Layer over TCP/IP    SCSI/FC and SCSI/IP
Efficiency                    Less                 More
Sharing and Access Control    Good                 Poor
Typical Applications          Web                  Database
Typical Clients               Workstations         Database servers

Taxonomy, Configuration, Challenges of CAS
What and Where can Storage be Virtualized?
SNIA Storage Model: Potential Areas of Virtualization
1. Storage Level Virtualization
2. Host Level Virtualization *
3. File Level Virtualization
4. Block Virtualization
5. Device Virtualization **
6. Network Virtualization
Source: The Storage Networking Tutorials, SNIAVIRT- Page 20
http://www.snia.org/education/tutorials/
* Host aka Server
** Device = aggregation of Host and Network (Meta Data)
What and Where can Storage be Virtualized?
Storage Virtualization: Innovations and Trends
1. Storage Device Level Virtualization
2. Host Level Virtualization
   Historical: Mainframe
   Recent development example: VMware
3. File Level Virtualization
   Historical: Mainframe
   Recent development example: NAS
4. Block Virtualization (sub-technique)
5. Device Virtualization (sub-technique)
6. Network Virtualization (sub-technique)
Historical: RAID Level, SCSI Interface
Recent development examples: Fibre Channel
Major innovations continue to emerge even in historical areas of
storage virtualization.
Symmetrical (aka in-band) and asymmetrical (aka out-of-band)
virtualization are emerging as key areas of abstraction and virtualization.
How is storage virtualized at the enterprise level?
Currently, networks are virtualized using Metadata or
Storage Volume Controllers (SVC). There are two types of network
virtualization:
• Out-of-band: Metadata or Storage Volume Controllers are placed outside the path of data flow.
• In-band: Metadata or Storage Volume Controllers are placed in the path of data flow.
Source: IBM Redbook Page 8
http://www.redbooks.ibm.com/redbooks/pdfs/sg246210.pdf
How is storage virtualized at the enterprise level?
In-Band Virtualization
1. Metadata or Storage Volume Controllers (SVC) are placed in-band, that is, in the path of data flow.
2. The SVC controls who can access the storage devices, how storage can be accessed, how storage is allocated, etc.
3. SVCs are managed through storage management software.
4. The key challenge is potential I/O bottlenecks.
Source: IBM Redbook Page 10
http://www.redbooks.ibm.com/redbooks/pdfs/sg246210.pdf
How is storage virtualized at the enterprise level?
Out-of-Band Network Virtualization
1. Metadata or Storage Volume Controllers (SVC) are placed out-of-band, that is, outside the path of data flow.
2. The SVC controls who can access the storage devices, how storage can be accessed, how storage is allocated, etc.
3. The storage pool sends metadata to the SVC.
4. The host sends metadata to the SVC.
Source: IBM Redbook Page 12
http://www.redbooks.ibm.com/redbooks/pdfs/sg246210.pdf
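The contrast between in-band and out-of-band placement can be sketched in a few lines. This is a toy model, not IBM's SVC design: the classes `StoragePool`, `MappingTable`, `InBandController`, and `OutOfBandController` are hypothetical, and the byte counter exists only to show where data flows.

```python
class StoragePool:
    """Hypothetical pool of devices, each a dict of block number -> bytes."""

    def __init__(self):
        self.devices = {"array1": {}, "array2": {}}

    def read(self, device, block):
        return self.devices[device][block]

    def write(self, device, block, data):
        self.devices[device][block] = data


class MappingTable:
    """Metadata: maps (virtual volume, virtual block) to (device, block)."""

    def __init__(self):
        self.table = {}

    def map(self, volume, vblock, device, block):
        self.table[(volume, vblock)] = (device, block)

    def resolve(self, volume, vblock):
        return self.table[(volume, vblock)]


class InBandController:
    """Sits in the data path: every byte of I/O flows through it."""

    def __init__(self, pool, mapping):
        self.pool = pool
        self.mapping = mapping
        self.bytes_forwarded = 0

    def read(self, volume, vblock):
        device, block = self.mapping.resolve(volume, vblock)
        data = self.pool.read(device, block)
        self.bytes_forwarded += len(data)  # the controller carries the payload
        return data


class OutOfBandController:
    """Sits outside the data path: it answers metadata lookups only."""

    def __init__(self, mapping):
        self.mapping = mapping
        self.lookups = 0
        self.bytes_forwarded = 0  # never changes: no payload passes through

    def locate(self, volume, vblock):
        self.lookups += 1
        return self.mapping.resolve(volume, vblock)


pool = StoragePool()
mapping = MappingTable()
pool.write("array2", 7, b"payload")
mapping.map("vol0", 0, "array2", 7)

# In-band: the host asks the controller for the data itself.
in_band = InBandController(pool, mapping)
in_band_data = in_band.read("vol0", 0)

# Out-of-band: the host asks where the data lives, then reads it directly.
out_of_band = OutOfBandController(mapping)
device, block = out_of_band.locate("vol0", 0)
out_of_band_data = pool.read(device, block)
```

Both paths return the same bytes, but only the in-band controller handles the payload, which is why it is the one exposed to I/O bottlenecks.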

Types of virtualization and case study
How is storage virtualized at enterprise level?
[Figure: high-level diagram of a typical primary/secondary site data replication setup with storage virtualization (D. Miles, 06/09/07).
Primary site: environments PROD, DEV, QA, SIT; applications App1, App2.
Secondary site: environment Prod; applications App1, App2.
Labeled components: (xxx) blade server(s), (xxx) pSeries server(s), (xxx) xSeries server, virtualization engines, blade SAN fabric, SAN Fabric A and B directors, Type 1 SAN storage (52 TB), Type 2 SAN storage (40 TB; 26 TB each), library with LTO3 drives, (2) Cisco 6509 switches, 3Com and Network Appliances devices, DWDM, Ethernet, monitors, VPN comm-link for remote support, management VLAN (QA/DEV storage, library, director: 950; PROD blades + blade fabric: 955).]
The Study
1. Shows that commingling data and metadata on a
single logical device means that there is no way to
achieve different service level objectives for data and
metadata in the same file system without moving
file-system-specific knowledge into the logical disk layers.
2. Shows that the standard assumptions underlying the
organization of data and metadata in file systems are
no longer valid in a virtualized storage environment and
hence fail to realize the full benefits of storage
virtualization.
3. Proposes a different file system organization of data
and metadata designed to exploit the power of
virtualized storage.
Service Level requirements within a single file system
• Organization A: Needs No Encryption
• Organization B: Needs Encryption
– Stores medical records
– Security requirements for file data are extremely high.
– Performs nightly indexing operation on file systems
– All directory information and file access times must be read to determine the “changed” state of data
– Business requirement that all file data be encrypted at rest.
– File metadata has no security requirement
In the Unix Fast File System (FFS), a logical disk is divided into
collections of blocks called cylinder groups, each of which
stores both file data blocks and file metadata.
Results

• Clean logical separation between data and metadata
• Allows file system features to use virtualization features and achieve different SLOs
• Redesign changes
◦ Code change
◦ Packing the relocated cylinder group headers in the first few metadata cylinder groups ensures each header is located at a fixed, predictable offset from the front of the block device
◦ User-configurable block address before which no data is stored and after which no metadata is stored
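The user-configurable boundary can be pictured with a small sketch. It assumes a single block address that splits the device into a metadata region and a data region, with a per-region policy table; `Layout`, `SLO_BY_REGION`, and `slo_for_block` are hypothetical names, and the encryption flags simply mirror the encrypt-data-only requirement from the earlier slide. It is not the study's actual code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Layout:
    """Block address space split by one configurable boundary."""

    total_blocks: int
    boundary: int  # metadata lives below this address, file data at or above

    def region(self, block):
        if not 0 <= block < self.total_blocks:
            raise ValueError("block address out of range")
        return "metadata" if block < self.boundary else "data"


# Hypothetical per-region service level objectives.
SLO_BY_REGION = {
    "metadata": {"encrypt": False},  # metadata has no security requirement
    "data": {"encrypt": True},       # file data must be encrypted at rest
}


def slo_for_block(layout, block):
    """Pick the policy from the block address alone, with no file system knowledge."""
    return SLO_BY_REGION[layout.region(block)]


layout = Layout(total_blocks=1000, boundary=100)
metadata_policy = slo_for_block(layout, 0)
data_policy = slo_for_block(layout, 500)
```

Because the region follows from the address alone, a virtualization layer below the file system could apply different policies to each region without understanding file system structures.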
• 5-7% gains on the new file system layout
• 31-44% for the file lookup and file delete benchmarks, which result in little or no file data I/O; here the advantage of data-only encryption becomes obvious
Future Work
• Differing SLOs for granular metadata
• Completely separate fixed and dynamic metadata
• Separate file data from user-defined file attribute data
What are potential topics of research and dissertation?

• Bayesian analysis for resource management
• Bayesian analysis for diagnostics
• Trusted domains for security
• Storage virtualization and metadata standards
• Algorithm advances for block, device, and other component virtualization techniques
Storage Basics
1. What type of storage is found in your workstation?
2. What type of storage systems may be found in a large
enterprise?
3. How is data accessed from storage?
4. Network Attached Storage (NAS) is well suited for what type
of applications?
5. Storage Area Network (SAN) is well suited for what type of
applications?
Storage Virtualization
1. What is Storage Virtualization?
2. Where and What can be virtualized in storage?
3. How is storage virtualized at a network level?
4. How is storage virtualization currently implemented?
5. What are the potential research topics in storage virtualization?