Transcript - VMware Labs
Storage Virtualization
Discussion led by Larry Rudolph
MIT IAP 2010 The Science of Virtualization
© 2009 VMware Inc. All rights reserved
Virtualize Storage device
[Diagram: Guest OS with a guest disk and a guest storage device driver; the device is emulated, and device emulation maps the virtual disk to a file on the physical disk, reached through the host device driver and the real disk.]
Device is emulated
• Virtual Disk is emulated as a single file on the real file system
• Device emulation maps the virtual disk to a file on the physical disk
Simple
• Use direct file access
• Guest issues block reads & writes
• Guest also issues some other random stuff, but that is just detail (see the sketch below)
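A minimal sketch (Python, invented names such as FileBackedDisk) of the simple model above: the emulated device turns guest block reads and writes into reads and writes at offset block * BLOCK_SIZE inside one backing file on the host file system.

import os

BLOCK_SIZE = 512  # assumed virtual-disk block size

class FileBackedDisk:
    """Emulated disk: guest block N lives at byte offset N * BLOCK_SIZE
    of a single file on the host file system."""

    def __init__(self, path, num_blocks):
        self.fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
        self.num_blocks = num_blocks

    def read_block(self, block):
        assert 0 <= block < self.num_blocks
        os.lseek(self.fd, block * BLOCK_SIZE, os.SEEK_SET)
        # Blocks never written come back short; present them as zeros.
        return os.read(self.fd, BLOCK_SIZE).ljust(BLOCK_SIZE, b"\x00")

    def write_block(self, block, data):
        assert 0 <= block < self.num_blocks and len(data) == BLOCK_SIZE
        os.lseek(self.fd, block * BLOCK_SIZE, os.SEEK_SET)
        os.write(self.fd, data)

# The VMM's device emulation would call these for each guest READ/WRITE.
disk = FileBackedDisk("guest-disk.img", num_blocks=1024)
disk.write_block(7, b"A" * BLOCK_SIZE)
assert disk.read_block(7)[0:1] == b"A"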
Details, details, details
2
VM’s file system size?
Does each virtual disk have a fixed size?
• If it is a file on the host file system, then how do we do that?
Should all the space be preallocated? (see the sketch below)
What does the Guest OS tell the user?
• How much space is left on the disk?
Guest OS makes use of a buffer cache
• Cache of blocks already read, from read-ahead
• Host OS also maintains a block cache
• Silly to have multiple copies
• What if there are several VMs?
• Host block cache may thrash
3
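A small sketch of the preallocation question, assuming POSIX sparse-file behaviour and an invented file name: the backing file can be created at the guest's full fixed size without consuming that much host space, so the guest sees a fixed-size disk while the host allocates lazily.

import os

GUEST_DISK_BYTES = 1 << 30          # guest is told it has a fixed 1 GiB disk

# Sparse backing file: logical size set up front, host blocks allocated on write.
with open("sparse-disk.img", "wb") as f:
    f.truncate(GUEST_DISK_BYTES)    # guest-visible capacity, no data written
    f.seek(100 * 4096)
    f.write(b"x" * 4096)            # only this region consumes host space

st = os.stat("sparse-disk.img")
print("guest sees    :", st.st_size, "bytes")           # the full 1 GiB
print("host allocates:", st.st_blocks * 512, "bytes")   # far less (POSIX st_blocks)
# Preallocating instead (writing zeros up front) trades host space for
# predictable capacity: the guest can never run out because the host already did.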
Guest OS does disk I/O scheduling
Guest thinks it has a physical disk attached
Guest schedules I/O to optimize arm movement
• Is this a problem? Does it matter?
• Guest does not know the truth
Guest issues a command to virtual disk via disk driver
Captured by VMM’s emulated device
Command (translated &) issued to the real device
Completion returns back up to the Guest OS, which issues the next command (see the sketch below)
• But by then it might be too late
4
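A toy sketch of that command round-trip (all names invented): the VMM captures a guest command, translates the virtual block number, issues it to the real device, and only then completes it back to the guest, whose arm-movement optimization may no longer mean anything.

class FakeRealDisk:
    """Stands in for the physical device behind the VMM."""
    def __init__(self):
        self.blocks = {}
    def read(self, block):
        return self.blocks.get(block, b"\x00" * 512)
    def write(self, block, data):
        self.blocks[block] = data

def handle_guest_command(cmd, virt_to_phys, real_disk):
    """One guest disk command, as seen by the VMM's emulated device."""
    phys = virt_to_phys(cmd["block"])            # translate virtual -> physical
    if cmd["op"] == "read":
        result = real_disk.read(phys)            # issue to the real device
    else:
        result = real_disk.write(phys, cmd["data"])
    return result                                # completion returns to the guest,
                                                 # which only now issues its next,
                                                 # carefully "optimized" command

real = FakeRealDisk()
handle_guest_command({"op": "write", "block": 3, "data": b"hello"},
                     virt_to_phys=lambda b: b + 4096, real_disk=real)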
Optimization: simple sharing
Assume all n VMs running same version of Windows XP
• Do not need to keep n copies of Windows read-only code
• Can use virtualization to save disk space
• How? Virtualization is a level of indirection; just like shared pages
• How can this be done efficiently?
What about other blocks?
• COW – copy on write (see the sketch below)
Can / should storage be shared?
• Can host “see” files in the guest?
• What if guest is not running?
• Think about how virus protection works
• Can guest “see” files on the host?
• What happened to Isolation?
5
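One way to picture the shared read-only image plus COW, as a minimal sketch with invented names: every VM's virtual disk reads fall through to one shared base image, and writes land in that VM's private overlay.

class CowDisk:
    """Copy-on-write virtual disk: reads fall through to a shared base,
    writes go to a private per-VM overlay."""
    def __init__(self, base):
        self.base = base          # dict: block -> data, shared, read-only
        self.overlay = {}         # dict: block -> data, private to this VM

    def read(self, block):
        return self.overlay.get(block, self.base.get(block, b"\x00"))

    def write(self, block, data):
        self.overlay[block] = data  # the shared base copy is never modified

# One Windows XP base image, shared by every VM; only the diffs are stored.
base_image = {0: b"boot", 1: b"ntoskrnl"}
vm1, vm2 = CowDisk(base_image), CowDisk(base_image)
vm1.write(1, b"patched")
assert vm2.read(1) == b"ntoskrnl"   # vm2 still sees the shared block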
Consider VMs in a data center
Two broad models of storage:
Local disks vs Storage Area Network (SAN)
Google, for example, uses the local disk model
• Query sent to lots of servers, each searches its local disk
• To access data on a remote disk, ask its server
What happens when (virtualized) servers move from one physical machine to another?
• Matters if there is a real connection between the two
6
SANs: Storage Area Network
Not as simple as it seems
Servers and disks connected to a network
SANs need not be commodities – they can provide many services
• These services are physically based
Logical Unit Number – LUN
• Think file system or file volume
• SANs deal with these things
Protocols: Fibre Channel & SCSI
What services can storage arrays provide?
• RAID (e.g. use 9 disks to provide fault tolerance)
• Lots of other stuff, but at LUN granularity
7
Quality of Service – Multiqueue
Many VMs sharing one physical connection
If VMM has only a single queue, then one VM can
dominate leaving another VM starved for service
• Want to provide quality of service to all VMs
Potential solution (see the sketch below)
• Multiple queues
• VMM does appropriate queue popping to maintain QoS
• Best effort vs Fairness in scheduling
Resource Sharing
• Bandwidth, time, ports, …
8
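A sketch of the multiple-queue idea (shares, names and the weighting scheme are assumptions): the VMM keeps one queue per VM and pops them in proportion to configured shares, so one busy VM cannot starve the rest.

from collections import deque

class MultiQueueScheduler:
    """One I/O queue per VM; dispatch in proportion to each VM's shares."""
    def __init__(self, shares):
        self.shares = shares                       # vm -> weight
        self.queues = {vm: deque() for vm in shares}
        self.credit = dict.fromkeys(shares, 0)

    def submit(self, vm, io):
        self.queues[vm].append(io)

    def next_io(self):
        # Give every backlogged VM credit proportional to its shares, then
        # serve the one with the most accumulated credit (simple weighted
        # round-robin; real schedulers also worry about latency, batching, ...).
        for vm, w in self.shares.items():
            if self.queues[vm]:
                self.credit[vm] += w
        ready = [vm for vm in self.queues if self.queues[vm]]
        if not ready:
            return None
        vm = max(ready, key=lambda v: self.credit[v])
        self.credit[vm] -= sum(self.shares.values())
        return self.queues[vm].popleft()

sched = MultiQueueScheduler({"vmA": 3, "vmB": 1})
for i in range(8):
    sched.submit("vmA", f"A{i}")
    sched.submit("vmB", f"B{i}")
print([sched.next_io() for _ in range(8)])   # roughly three A's served per B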
De-duplication
Why store multiple copies of the same block? Just use pointers
Good idea, let’s do it in S/W (see the sketch below)
Details?
• Deletion?
• Migration?
• Backup?
9
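A minimal software dedup sketch; the hash choice and class names are assumptions: identical blocks are stored once, found by content fingerprint, and reference-counted so deletion only frees a block when its last user is gone.

import hashlib

class DedupStore:
    """Content-addressed block store with reference counting."""
    def __init__(self):
        self.blocks = {}     # fingerprint -> data
        self.refs = {}       # fingerprint -> reference count

    def put(self, data):
        fp = hashlib.sha256(data).hexdigest()   # the fingerprint is the "pointer"
        if fp not in self.blocks:
            self.blocks[fp] = data              # first copy: actually store it
        self.refs[fp] = self.refs.get(fp, 0) + 1
        return fp

    def get(self, fp):
        return self.blocks[fp]

    def delete(self, fp):
        # Deletion: only reclaim space when nobody references the block.
        self.refs[fp] -= 1
        if self.refs[fp] == 0:
            del self.blocks[fp], self.refs[fp]

store = DedupStore()
a = store.put(b"same bytes")
b = store.put(b"same bytes")      # no new space consumed
assert a == b and len(store.blocks) == 1
store.delete(a)                   # block survives: b still references it
assert b in store.blocks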
Snapshots (versioning)
Application consistent versioning
• VMM must flush all block caches / buffers
• Guest OS must flush all block caches / buffers
• Guest Application must flush all block caches / buffers
• Everyone must be told when it is safe to proceed
• Who does the new mapping?
10
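A sketch of the coordination above, with stand-in layer objects and a plausible top-down flush order (application, then guest OS, then VMM) that the slide leaves open: flush everything, install the new block mapping, then tell everyone it is safe to proceed.

def take_consistent_snapshot(layers, remap):
    """layers: ordered [guest_app, guest_os, vmm], each with flush()/resume();
    remap(): installs the new virtual-disk block mapping (the snapshot point)."""
    for layer in layers:              # application first, then guest OS, then VMM
        layer.flush()
    remap()                           # who does this? here, the caller supplies it
    for layer in reversed(layers):
        layer.resume()                # tell everyone it is safe to proceed

class Layer:
    def __init__(self, name):
        self.name = name
    def flush(self):
        print("flush ", self.name)
    def resume(self):
        print("resume", self.name)

take_consistent_snapshot(
    [Layer("guest application"), Layer("guest OS"), Layer("VMM")],
    remap=lambda: print("install new block mapping"))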
Atomicity – locking (reservations)
There are times when it is desirable to lock the whole file system.
• Why?
SANs provide locking on a LUN basis
11
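Purely illustrative, in-memory sketch of a reservation: a lock record claimed with an atomic test & set (the primitive the VMFS API list on the summary slide asks arrays to support); SCSI reservations play the same role at LUN granularity.

import threading

class OnDiskLock:
    """Pretend lock record; compare-and-set stands in for the array's
    atomic test & set on a disk block."""
    def __init__(self):
        self.owner = None
        self._mutex = threading.Lock()   # models the array's atomicity

    def test_and_set(self, expected, new):
        with self._mutex:
            if self.owner == expected:
                self.owner = new
                return True
            return False

lock = OnDiskLock()
assert lock.test_and_set(None, "hostA")       # hostA acquires the lock
assert not lock.test_and_set(None, "hostB")   # hostB must wait / retry
assert lock.test_and_set("hostA", None)       # hostA releases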
Replication
Want to maintain two copies of the file system
• Geographic separation in case of disaster
• What other reasons?
SANs provide this on a LUN basis
• Multicast the write block messages
What are the problems with replication at the file system level?
12
Disaster Recovery
Replication
• Asynchronous
• Synchronous (higher latency)
Fail-over Mechanism
13
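A toy contrast of the two modes (names invented, in-memory dictionaries standing in for sites): a synchronous write does not complete until both copies have the data; an asynchronous write completes locally and lets the remote copy lag, which is what fail-over has to cope with.

import queue

local, remote = {}, {}
replication_stream = queue.Queue()      # writes in flight to the remote site

def write_sync(block, data):
    local[block] = data
    remote[block] = data                # complete only once both copies have it
                                        # (higher latency per write)

def write_async(block, data):
    local[block] = data                 # complete immediately
    replication_stream.put((block, data))   # remote copy applied later;
                                            # recent writes can be lost in a disaster

def apply_replication_stream():         # e.g. runs continuously at the remote site
    while not replication_stream.empty():
        block, data = replication_stream.get()
        remote[block] = data

write_async(1, b"new data")
print(1 in remote)                      # False until the stream catches up
apply_replication_stream()
print(1 in remote)                      # True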
Thin Provisioning
Allocate blocks on demand
• Map virtual block to physical block on some disk in storage array
• Permits over-commitment of storage
• When get close to real capacity, can add more physical disks to storage array
Good idea, let’s also do it in S/W (sketched below)
14
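A sketch of allocate-on-demand mapping done in software (invented names): a virtual block gets a physical block only on first write, so the sum of virtual disk sizes can exceed the physical pool until the pool actually fills and more disks must be added.

class ThinPool:
    """Physical pool of `capacity` blocks, handed out on first write."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.next_free = 0
        self.map = {}                 # (disk_id, virt_block) -> phys_block

    def write(self, disk_id, virt_block):
        key = (disk_id, virt_block)
        if key not in self.map:
            if self.next_free >= self.capacity:
                raise RuntimeError("pool exhausted: add physical disks")
            self.map[key] = self.next_free   # allocate on demand
            self.next_free += 1
        return self.map[key]

# Over-commit: two "large" virtual disks share an 8-block physical pool.
pool = ThinPool(capacity=8)
for vb in range(3):
    pool.write("diskA", vb)
    pool.write("diskB", vb)
print(pool.next_free, "of", pool.capacity, "physical blocks used")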
Predictive read and other (common case) optimizations
There are non-volatile buffers in front of the disk
Can do read-ahead to avoid latency costs (see the sketch below)
• E.g. during boot-up, easy to predict which blocks read next
• Buffer is finite, cannot read everything into it
• Assume some logical flow of reads
The pattern of reads is at the file-system level, not the LUN level
15
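A minimal read-ahead sketch; the sequential-detection policy, window and buffer size are assumptions: on a sequential pattern, prefetch the next few blocks into a bounded LRU buffer so later reads avoid disk latency.

from collections import OrderedDict

class ReadAheadCache:
    def __init__(self, backend_read, capacity=64, window=4):
        self.read_block = backend_read       # function: block -> data
        self.capacity = capacity             # the buffer is finite
        self.window = window                 # how far ahead to prefetch
        self.buf = OrderedDict()             # block -> data, in LRU order

    def _insert(self, block, data):
        self.buf[block] = data
        self.buf.move_to_end(block)
        if len(self.buf) > self.capacity:
            self.buf.popitem(last=False)     # evict the least recently used block

    def read(self, block, last_block=None):
        if block not in self.buf:
            self._insert(block, self.read_block(block))
        data = self.buf[block]
        if last_block is not None and block == last_block + 1:
            # Looks sequential (e.g. boot-time reads): prefetch a few blocks ahead.
            for b in range(block + 1, block + 1 + self.window):
                if b not in self.buf:
                    self._insert(b, self.read_block(b))
        return data

cache = ReadAheadCache(backend_read=lambda b: b"block%d" % b)
cache.read(10)
cache.read(11, last_block=10)    # triggers prefetch of blocks 12..15
print(12 in cache.buf)           # True: the next read is already buffered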
vMotion – migrate VM and its storage
Can migrate VM from server to server while it is still
running
Want to do the same for the (virtual) file system
Want to migrate live
• Eager or lazy? (see the sketch below)
16
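A sketch of the eager-vs-lazy choice (invented names): eager copies every block before switching over; lazy switches over immediately and pulls blocks from the source on first access, with new writes going straight to the destination.

def migrate_eager(src, dst):
    """Copy everything, then switch the VM over (long copy, but simple)."""
    for block, data in src.items():
        dst[block] = data

class LazyMigratedDisk:
    """Switch over first; fetch blocks from the source on first access."""
    def __init__(self, src, dst):
        self.src, self.dst = src, dst

    def read(self, block):
        if block not in self.dst:                 # not migrated yet
            self.dst[block] = self.src.get(block, b"\x00")
        return self.dst[block]

    def write(self, block, data):
        self.dst[block] = data                    # new writes go to the destination

src = {0: b"a", 1: b"b", 2: b"c"}
dst = {}
disk = LazyMigratedDisk(src, dst)
disk.read(1)                     # pulled across on demand
print(sorted(dst))               # [1] so far; a background copier drains the rest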
Summary
Storage is a big part of the data center (server, network, & storage)
Today, the physical side deals in terms of LUNs, the virtual side in files
• Agree on a standard set of APIs
VMFS (virtual machine file system)
• Provides APIs to be supported by physical SANs
• Block copy
• Zeroing
• Atomic test & set
• Thin provisioning
• Block release
• …
17
Storage virtualization for virtual machines
Christos Karamanolis, VMware, Inc.
© 2009 VMware Inc. All rights reserved
VMware Products
“Hosted” Products
• VMware Workstation (WS), VMware
Player – free
• ACE
• VMware GSX Server, VMware Server –
free
“Bare metal” Product
• VMware ESX Server
• Virtual Center
• Services: HA, DRS, Backup
19
ESX Server Architecture
Platform for running
virtual machines (VMs)
Virtual Machine Monitor
(VMM)
VMkernel takes full
control of physical
hardware
• Resource management
• High-performance network
and storage stacks
• Drivers for physical hardware
Service console
• For booting machine and
external interface
20
vSphere Storage Virtualization overview
21
Storage virtualization: Virtual Disks
• Virtual Disks are presented as
SCSI disks to VM
• Guest OS sees h/w Host Bus
Adapter (HBA)
• Buslogic or LSIlogic
• Proprietary “PVSCSI” adapter (optimized for virtualization)
• HBA functionality emulated by
VMM
• Virtual Disks stored on either
• Locally attached disks
• Storage Area Network (SAN)
• Network Attached Storage (NFS,
iSCSI)
22
VMkernel: virtual disks -> physical storage
[Diagram: SCSI commands from the virtual machine enter the ESX VMkernel through the virtual SCSI layer (vSCSI); the virtual-disks support layer (COW) maps virtual disks to physical storage via VMFS, raw LUNs, or NFS; below sit the core SCSI layer (LUN disk scheduling, rescan, multipathing) and the device drivers (FC, iSCSI).]
Support for virtual disks
• Process SCSI cmds from VM HBA
• Hot-add virtual disks to VMs
• Snapshots (COW): versioning, rollback
Map virtual disk I/O to physical storage I/O
• Handle physical heterogeneity
• Different phys storage type per virtual disk
Advanced storage features
• Disk scheduling
• Multipathing, LUN discovery
• Pluggable architecture for 3rd parties
Device drivers for real h/w
23
VMware File System (VMFS)
Efficient FS for virtual disks on SANs
• Performance close to native
Physical storage (LUN) management
• VMFS file systems automatically recognized when LUNs discovered
• Lazy provisioning of virtual disks
Clustered: sharing across ESX hosts
• Per-file locking
• Uses only the SAN, not the IP network
Supports mobility & HA of VMs
Scalability: 100s of VMs on 10s of hosts sharing a VMFS file system
[Diagram: a VMFS volume on the SAN holding virtual-disk files such as 1.vmdk, 2.vmdk, 2.vmdk.redo, 3.vmdk, shared by multiple ESX hosts.]
24
VMFS performance – sequential read (old graphs)
[Graph: sequential-read workload throughput (MBps, 0–180) vs. block size (4k, 8k, 16k, 32k, 64k), physical vs. virtual.]
ESX 2.5 on HP Proliant DL 580, 4 processors, 16 GB mem.
VM/OS: Windows 2003, Ent. Ed., uni-processor, 3.6 GB mem, 4 GB virtual disk on one 5-disk RAID5 LUN, on Clariion CX500
Testware: IOmeter, 8 outstanding IOs
25
VMFS performance – random read (old graphs)
[Graph: random-read workload throughput (MBps, 0–30) vs. block size (4k, 8k, 16k, 32k, 64k), physical vs. virtual.]
ESX 2.5 on HP Proliant DL 580, 4 processors, 16 GB mem.
VM/OS: Windows 2003, Ent. Ed., uni-processor, 3.6 GB mem, 4 GB virtual disk on one 5-disk RAID5 LUN, on Clariion CX500
Testware: IOmeter, 8 outstanding IOs
26
Raw access to LUNs
Clustering applications in VMs
• Sharing between VMs using SCSI reservations
• Sharing between VMs and physical machines
In-band control of array features from VMs
• Some disk arrays provide snapshots, mirroring, etc. that are controllable via special SCSI commands
• Similar for access to tape drives
• We provide a pass-through mode that allows the VM to control this functionality
[Diagram: a VMFS volume with 1.vmdk alongside a raw device mapping (4.rdm) to a LUN on the SAN.]
27
Lower layers of storage stack
Disk scheduling
• Control share of disk bandwidth to each VM
Multipathing (see the sketch after this slide)
• Failover to a new path when the existing storage path fails
• Transparently provides reliable connection to
storage for all VMs
• Improves portability & ease of management
• No multipathing software required in Guest OS
LUN rescanning and VMFS file system
discovery
• Dynamically add/remove LUNs on ESX
Storage Device Drivers
• May need to copy/map guest data to low memory
because of driver DMA restrictions
[Diagram: the VMkernel storage stack again (vSCSI, virtual-disks support with COW, VMFS / Raw LUN / NFS, core SCSI with LUN disk scheduling, rescan and multipathing, FC and iSCSI device drivers), highlighting the lower layers.]
28
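A sketch of the failover behaviour described above, with simulated paths and errors: issue the I/O on the active path and, if that path fails, retry transparently on an alternate so the VMs above never notice.

class PathFailed(Exception):
    pass

class MultipathDevice:
    """Try the active path first; on failure, fail over to the next path."""
    def __init__(self, paths):
        self.paths = paths          # list of callables: io -> result
        self.active = 0

    def submit(self, io):
        for _ in range(len(self.paths)):
            try:
                return self.paths[self.active](io)
            except PathFailed:
                # Existing path failed: transparently switch to another path.
                self.active = (self.active + 1) % len(self.paths)
        raise PathFailed("all paths to the LUN are down")

def dead_path(io):
    raise PathFailed()

def live_path(io):
    return f"done: {io}"

dev = MultipathDevice([dead_path, live_path])
print(dev.submit("read block 7"))    # served via the surviving path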
Storage Solutions
29
Enabling solutions
ESX’s storage virtualization platform
• VM state encapsulation
• Generic snapshot/rollback tool
• Highly available access to phys storage
• Safe sharing of the phys storage
Enabler for virtualization solutions
• Ease of management in the data center
• VM High-Availability and mobility
• Protection of VMs: application + data
30
Future directions: IO resource control
Typical usage scenario: single-application VM
• Leverage knowledge of
application/workload?
• Provisioning, resource management,
metadata, format
Advanced IO scheduling
• Adapt to dynamic nature of SAN, workloads
• Combine with control of CPU, mem, net
• Local resource control vs. global goals
• Target high-level ‘business’ QoS goals
31
Storage VMotion
• Migrate VM disks
• Non-disruptive while the VM is running
• VM granularity, LUN independent
• Can be combined with VMotion
• Uses:
• Upgrade to new arrays
• Migrate to a different class of storage (up/down tier)
• Change VM -> VMFS volume mappings
[Diagram: VM disks moving from LUNs A1 and A2 on Array A (off lease) to LUNs B1 and B2 on Array B (new).]
32
Business continuity
Virtual machine (configuration, disks) encapsulates all
state needed for recovery
Safe VM state sharing on the SAN allows immediate recovery when physical hardware fails -> high availability
• Fail over VM upon failure; fully automated
• No h/w dependencies, no re-installation, no idle h/w
Extend this model to non-shared and even remote storage
33
Consolidated Backup (VCB) Framework
Backup framework for VMs
Proxy backup server
• May be in VM or not
• Copy virtual disk off SAN/NAS
Disk consistency
• Delta disks for consistent disk
image
• Opt. Application quiescing (VSS)
3rd party backup software
• Number of vendors
Restore virtual disks from backup and start VM
• Use VMware Converter
34
Disaster Recovery (DR)
Site Recovery Manager (SRM)
• Automates disaster recovery
workflows:
• Setup, testing, failover, failback
• Central management of recovery
plans from Virtual Center
• Manual recovery processes ->
automated recovery plans
• Uses 3rd-party storage replication
• Array-based replication
• Setup replication at VMFS volume
granularity (group of LUNs)
35
Breaking virtual disk encapsulation?
Share/clone/manage VM state
• encapsulation vs. manageability
Opaque virtual disks
• Pros: encapsulation, isolation, generic versioning and
rollback, simple to manage
• Cons: too coarse granularity, proliferation of VM versions,
redundancy, no sharing
Explicit support for VM “templates”, share VM
boot virtual disks
VM state-aware FS that allows transparent
sharing (NSDI-06)
36
Storage trends and challenges
37
Storage Trends: Converged Data Center Networks
Driving Forces
• need to reduce costs
• need to simplify and
consolidate management
• need to support next-gen
data center architectures
Industry Responses
• converged fabrics
• data center ethernet
• FCoE
• new data center network
topologies
• new rack architectures
• new management paradigms
38
Storage Trends - Information Growth
Driving Forces
• 60% raw growth
• Shift to rich content vs.
transactions
• Consumed globally and via
mobile
• Longer retention periods
• More use of “archival”
information in business
processes
• Desire to move beyond
tape
Industry Responses
• Increased focus on policy
• new governance functions
• Categorization software
• ILM, archiving, etc.
• Larger disk drives
• 1TB -> 2TB -> 3TB -> 4TB
• Data deduplication
• removing redundancy
• “Content” clouds
• moving information closer to users
39
Storage vision
Today:
• Manual configuration and management of physical storage based on vendors’ tools
• Multiple tiers of magnetic media
• Manual mapping of workload/app to storage
• Manual performance optimization
• Disk and tape co-existing
Tomorrow:
• Zero physical storage management
• Manage apps / workloads, not media
• Automatic, dynamic storage tiering
• Scale-out (distributed) storage architectures based on commodity h/w
• Policy-based automated placement and movement of data
40
Towards the vision
41
Storage awareness and integration
[Diagram: VMware Infrastructure – the Virtual Datacenter OS from VMware – with vServices (vCompute, vStorage, vNetwork) layered over infrastructure from storage partners and exposed through vCloud.]
Storage operations
• VMFS
Storage management
• Storage VMotion
• Storage Virtual Appliances
• vStorage API’s
• VM storage management
• Linked Clones
• Thin Provisioning
• VMDirectPath
42
Consolidated I/O Fabrics
Common Storage and Network
Built on low-latency, lossless 10GbE
FCoE, iSCSI, NAS
Both cost effective bandwidth and
increased bandwidth per VM
Simplified infrastructure and
management
• Simplify physical infrastructure
• Unify LAN, SAN and VMotion networks
• Scale VM performance to 10GE
[Diagram: today’s separate LAN, SAN A / SAN B, and VMotion networks, consolidated onto the converged 10GbE fabric.]
43
Scale-out storage architectures
[Diagram: ESX hosts as clients of a scale-out storage cluster built from white-box storage h/w, connected over a fabric.]
• Virtual Name Space (VNS): hides object placement details from the clients (ESX hosts); single-pane mgmt
• Asymmetric access protocol, e.g. pNFS
• Specify per-object policy; per-object policy enforcement
• Storage cluster: object mobility, availability, reliability, snapshots, thin provisioning, dedup, etc.
• Storage containers: provide different tiers of storage with different capabilities
• Automated, adaptive storage provisioning and tiering
44