Storage Area Network Usage
A UNIX SysAdmin’s View of How A SAN Works
Disk Storage
Embedded
Internal Disks within the System Chassis
Directly Attached
External Chassis of Disks connected to a Server via a Cable
Directly Attached Shared
External Chassis connected to more than one Server via a Cable
Networked Storage
NAS
SAN
others
Disk Storage – 2000-2004
Type      Bus Speed   Distance    Cable Pins
ATA       100 MB/s    18 inches   40
SCSI      320 MB/s    12 m        68 or 80
FC        400 MB/s    10 km       4
SATA-II   300 MB/s    6 m         22
SAS       300 MB/s    10 m        22
Deficiencies of Direct Connect Storage
Single System Bears Entire Cost of Storage
Small Server in an EMC Shop
Large Server cannot easily share its unused storage
Manageability
Fragmented and Isolated
Scalability
Limited
What happens when you run out of peripheral bus slots?
Availability
“SCSI Bus Reset”
Failover is a complicated add-on, if available at all
DASD
Direct Access Storage Device
They still call it this in an IBM Mainframe Shop
Basic Limits of Disk Storage Recognized
Latency
Rotation Speed of the disk
Seek Time
Radial Movement of the Read/Write Heads
Buffer Sizes
Stop sending me data, I can’t write fast enough!
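To put rough numbers on rotation and seek, here is a back-of-the-envelope Python sketch; the RPM and seek-time figures are illustrative assumptions, not numbers from the talk:

```python
# Back-of-the-envelope disk latency (a sketch; the RPM and seek-time
# figures below are illustrative assumptions).

def avg_rotational_latency_ms(rpm: float) -> float:
    """Average rotational latency: half a revolution, in milliseconds."""
    return (60_000.0 / rpm) / 2

AVG_SEEK_MS = 3.5  # assumed average seek time for a fast drive

for rpm in (7_200, 10_000, 15_000):
    rot = avg_rotational_latency_ms(rpm)
    print(f"{rpm:>6} RPM: {rot:.2f} ms rotation + {AVG_SEEK_MS} ms seek "
          f"= {rot + AVG_SEEK_MS:.2f} ms per random I/O")
```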
SCSI
SCSI – Small Computer System Interface
From Shugart’s 1979 SASI implementation
SASI: Shugart Associates System Interface
Both Hardware and I/O Protocol Standards
Both have evolved over time
Hardware is source of most limitations
I/O Protocol has long-term potential
SCSI - Pro
Device Independence
Mix and match device types on the bus
Disk, Tape, Scanners, etc…
Overlapping I/O Capability
Multiple read & write commands can be outstanding simultaneously
Ubiquitous
SCSI - Con
Distance vs. Speed
Double the Signaling Rate
Speed: 40, 80, 160, 320 MBps
Halve the Cable Length Limits
Device Count: 16 Maximum
Low voltage Differential Ultra3 SCSI can support only 16 devices on a 12 meter cable at 160 MBps
Server Access to Data Resources
Hardware changes are disruptive
SCSI – Overcoming the Con
New Hardware & Signaling Platforms
SCSI-3 Introduces Serial SCSI Support
Fibre Channel
Serial Storage Architecture (SSA)
Primarily an IBM implementation
FireWire (IEEE 1394 – Apple fixes SCSI)
Attractive in consumer market
Retains SCSI I/O Protocol
Scaling SCSI Devices
Increase Controller Count within Server
Increasing Burden To CPU
Device Overhead
Bus Controllers can be saturated
You can run out of slots
Many Queues, Many Devices
Queuing Theory 101 (check-out line) - undesirable
Scaling SCSI Devices
Use Dedicated External Device Controller
Hides Individual Devices
Provide One Large Virtual Resource
Offloads Device Overhead
One Queue, Many Devices - good (see the simulation sketch below)
Cost and Benefit
Still borne by one system
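The check-out line point can be made concrete with a toy simulation comparing several private queues against one shared queue at the same total load; this is a minimal Python sketch with assumed arrival and service rates:

```python
import random

# Toy queueing simulation (assumed rates, not from the talk): eight
# controllers each with a private queue vs. one shared queue feeding
# eight devices, both at the same 80% utilization.
random.seed(1)

def mean_wait(num_queues: int, servers_per_queue: int,
              n_jobs: int = 50_000) -> float:
    """Mean wait when jobs are dealt round-robin across num_queues."""
    free = [[0.0] * servers_per_queue for _ in range(num_queues)]
    t = total_wait = 0.0
    for job in range(n_jobs):
        t += random.expovariate(1.0)  # Poisson arrivals, 1 job/time unit
        svc = random.expovariate(1.25 / (num_queues * servers_per_queue))
        q = free[job % num_queues]                 # blind round-robin
        s = min(range(len(q)), key=q.__getitem__)  # earliest-free server
        start = max(t, q[s])
        total_wait += start - t
        q[s] = start + svc
    return total_wait / n_jobs

print("8 private queues:", mean_wait(num_queues=8, servers_per_queue=1))
print("1 shared queue  :", mean_wait(num_queues=1, servers_per_queue=8))
```

The shared queue wins because a job never waits behind a busy device while another device sits idle.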
RAID
Redundant Array of Inexpensive Disks
Combine multiple disks into a single virtual device
How this is implemented determines different strengths
Storage Capacity
Speed
Fast Read or Fast Write
Resilience in the face of device failure
RAID Functions
Striping
Write consecutive logical byte/blocks on consecutive physical disks
Mirroring
Write the same block on two or more physical disks
Parity Calculation
Given N disks, N-1 consecutive blocks are data blocks; the Nth block is for parity
When any of the N-1 data blocks are altered, N-2 XOR calculations are performed on these N-1 blocks
The Data Block(s) and Parity Block are written
Destroy one of these N blocks, and that block can be reconstructed using N-2 XOR calculations on the remaining N-1 blocks
Destroy two or more blocks – reconstruction is not possible
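A minimal sketch of the XOR arithmetic just described, assuming N = 5 disks and tiny 4-byte blocks (both invented for the example):

```python
# Parity for N = 5: four data blocks plus one parity block (block size
# and contents are made up for illustration).

def xor_blocks(blocks):
    """XOR equal-length blocks together: N-2 XORs for N-1 input blocks."""
    out = bytearray(blocks[0])
    for block in blocks[1:]:
        for i, byte in enumerate(block):
            out[i] ^= byte
    return bytes(out)

data = [b"\x11\x22\x33\x44", b"\xaa\xbb\xcc\xdd",
        b"\x01\x02\x03\x04", b"\xf0\xe0\xd0\xc0"]  # the N-1 data blocks
parity = xor_blocks(data)                          # the Nth (parity) block

# "Destroy" data block 2, then rebuild it from the remaining N-1 blocks:
survivors = data[:2] + data[3:] + [parity]
assert xor_blocks(survivors) == data[2]
```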
RAID Function – Pro & Con
Striping
Pro: Increases Spindle Count for Increased Throughput
Con: Does not provide redundancy
Mirroring
Pro: Provides Redundancy without Parity Calculation
Con: Requires at least 100% disk resource overhead
Parity Calculation
Pro: Cuts Disk Resource Overhead to 1/N
Con: Parity calculation is expensive
N-2 XOR calculations are required
If all N-1 data blocks are not in cache, they must be read
RAID Types
RAID 0
Stripe with No Parity
RAID 1
Mirror two or more disks
RAID 0+1
Stripe on Inside, Mirror on Outside
RAID 1+0
Mirrors on Inside, Stripe on Outside
RAID 3
Synchronous, Subdivided Block Access; Dedicated Parity Drive
RAID 4
Independent, Whole Block Access; Dedicated Parity Drive
RAID 5
Like RAID 4, but Parity striped across multiple drives
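Usable capacity for n disks of size s follows directly from the definitions above; a small Python sketch for quick comparison (the disk count and size below are illustrative):

```python
# Usable capacity for n disks of s_gb each, per the RAID levels above.
def usable_gb(level: str, n: int, s_gb: float) -> float:
    return {
        "RAID 0":   n * s_gb,        # stripe only, no redundancy
        "RAID 1":   s_gb,            # n-way mirror of one disk
        "RAID 5":   (n - 1) * s_gb,  # one disk's worth of parity
        "RAID 1+0": (n // 2) * s_gb, # mirrored pairs, then striped
    }[level]

for level in ("RAID 0", "RAID 1", "RAID 5", "RAID 1+0"):
    print(f"{level:8} 8 x 73 GB -> {usable_gb(level, 8, 73):6.0f} GB usable")
```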
[Diagram slides: disk layouts for RAID 0, RAID 1, RAID 3, RAID 5, RAID 1+0, and RAID 0+1]
Breaking the Direct Connection
Now you have high performance RAID
The storage bottleneck has been reduced
You’ve invested $$$ to do it
How do you extend this advantage to N servers without spending N x $$$?
How about using existing networks?
How to Provide Data Over IP
NFS (or CIFS) over a TCP/IP Network
This is Network Attached Storage (NAS)
Overcomes some distance problems
Full Filesystem Semantics are Lacking
…such as file locking
Speed and Latency are problems
Security and Integrity are problems as well
IP encapsulation of I/O Protocols
Not yet established in the marketplace
Current speed & security issues
NAS and SAN
NAS – Network Attached Storage
File-oriented access
Multiple Clients, Shared Access to Data
SAN – Storage Area Network
Block-oriented access
Single Server, Exclusive Access to Data
NAS: Network Attached Storage
File Objects and Filesystems
OS Dependent
OS Access & Authentication
Possible Multiple Writers
Require locking protocols
Network Protocol: e.g., IP
“Front-end” Network
SAN: Storage Area Network
Block Oriented Access To Data
Device-like Object is presented
Unique Writer
I/O Protocol: SCSI, HIPPI, IPI
“Back-end” Network
A Storage Area Network
Storage
StorageWorks MA8000 (24), EVA (2)
HDS is 2nd Approved Storage Vendor
9980 Enterprise Storage Array – EMC class storage
Switches
Brocade 12000 (8), 3800 (20), & 2800 (34)
3900s are being deployed – 32-port
UNIX Servers on the SAN
Solaris (56), IRIX (5), HP-UX (5), Tru64 (1)
Storage Volume Connected to UNIX Servers
13000 GB as of May, 2003
Windows Servers
Windows 2000 (74), NT 4.0 (16)
SAN Implementations
FibreChannel
FC Signalling Carrying SCSI Commands & Data
Non-Ethernet Network Infrastructure
iSCSI
SCSI Encapsulated By IP
Ethernet Infrastructure
FCIP – FibreChannel over IP
FibreChannel Encapsulated by IP
Extending FibreChannel over WAN Distances
Future Bridge between Ethernet & FibreChannel
iFCP - another gateway implementation
[Diagram slides: NAS & SAN in the Data Center; FCIP in the Data Center]
FibreChannel
How SCSI Limitations are Addressed
Speed
Distance
Device Count
Access
FibreChannel – Speed
266 Mbps – ten years ago
1063 Mbps – common in 1998
2125 Mbps – available today
4 Gbps – near future products
Backward compatible to 1 & 2 Gbps
10 Gbps – 2005?
Not backward Compatible with 1/2/4Gbps
But 10 Gig Ethernet will compete
Remember FDDI & ATM
Why I/O Protocols are Coming to IP
IP Networking is ubiquitous
Gigabit Ethernet is here
10 Gbps Ethernet is just becoming available
Don’t have to invest in a second network
Just upgrade the one you have
IP & Ethernet software is well understood
Existing talent pool for vendors to leverage
Developers, not end-user Network Engineers
FibreChannel – Distance
1063 Mbps
175 m (62.5 um – multi-mode)
500 m (50.0 um – multi-mode)
10 km (9 um – single-mode)
2125 Mbps
500 m (50.0 um – multi-mode)
2 km (9 um – single-mode)
FibreChannel – A Network
Layer 1 – Physical (Media: fiber, copper)
Fibre: 62.5, 50.0, & 9.0 um
Copper: Cat6, Twinax, Coax, other
Layer 2 – Data Link (Network Interface & MAC)
WWPN: World Wide Port Name
WWNN: World Wide Node Name
In a single port node, usually WWPN = WWNN
64-bit device address
Comparable to 48-bit Ethernet device addresses
Layer 3 – Network (IP & SCSI)
24-bit fabric address
Comparable to an IP address
FibreChannel Terminology: Port Types
N_Port
Node port – Computer, Disk, or Storage Node
F_Port
Fabric port – Found only on a Switch
E_Port
Expansion Port – Switch to Switch port
NL_Port
Node port with Arbitrated Loop Capabilities
FL_Port
Fabric port with Arbitrated Loop Capabilities
G_Port
Generic Switch Port: Can act as any of F_Port, E_Port, or FL_Port
FibreChannel - Topology
Point-to-Point
Arbitrated Loop
Fabric
FibreChannel – Point-to-point
Direct Connection of Server and Storage Node
Two N_Ports and One Link
FibreChannel - Arbitrated Loop
Up to 126 Devices in a Loop via NL_Ports
Token-access, Polled Environment (like FDDI)
Wait For Access Increases with Device Count
FibreChannel - Fabric
Arbitrary Topology
Requires At Least One Switch
Up to 15 million ports can be concurrently logged in with the 24-bit address ID
Dedicated Circuits between Servers & Storage via Switches
Interoperability Issues Increase With Scale
FibreChannel – Device Count
126 devices in Arbitrated Loop
15 Million in a fabric (24-bit addresses)
Bits 0-7: Port or Arbitrated Loop address
Bits 8-15: Area, identifies FL_Port
Bits 16-23: Domain, address of switch
239 of 256 domain addresses are available
256 x 256 x 239 = 15,663,104
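The bit layout above maps directly to code; a small sketch (the sample address value is invented for illustration):

```python
# Split a 24-bit FibreChannel fabric address into its three fields.

def decode_fc_address(addr: int) -> tuple[int, int, int]:
    port   = addr & 0xFF           # bits 0-7:  port / Arbitrated Loop addr
    area   = (addr >> 8) & 0xFF    # bits 8-15: area, identifies FL_Port
    domain = (addr >> 16) & 0xFF   # bits 16-23: domain, address of switch
    return domain, area, port

domain, area, port = decode_fc_address(0x6B1D2A)   # made-up sample address
print(f"domain={domain:#04x} area={area:#04x} port={port:#04x}")
print("addressable ports:", 239 * 256 * 256)       # = 15,663,104
```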
FibreChannel Definitions
WWPN
Zone & Zoning
LUN
LUN Masking
FibreChannel - WWPN
World-Wide Port Name
A unique 64-bit hardware address for each FibreChannel Device
Analogous to a 48-bit Ethernet hardware address
WWNN – World-Wide Node Name
FibreChannel – Zone & Zoning
Switch-Based Access Control
Analogous to an Ethernet Broadcast Domain
Soft Zone
Zoning based on WWPN of Nodes Connected
Preferred
Hard Zone
Zoning Based on Port Number on Switch
to which the Nodes are Connected
FibreChannel - LUN
Logical Unit Number
Storage Node Allocates Storage and Assigns a LUN
Appears to the server as a unique device (disk)
FibreChannel – LUN Masking
Storage Node Based Access Control List (ACL)
LUNs and Visible Server Connections (WWPN) are allowed to see each other through the ACL.
LUNs are Masked from Servers not in the ACL
LUN Security
Host Software
HBA-based
firmware or driver configuration
Zoning
LUN Masking
LUN Security
Host-based & HBA
Both these methods rely on correct security implemented at the edges
Most difficult to manage due to large numbers and types of servers
Storage Managers may not be Server Managers
Don’t trust the consumer to manage resources
Trusting the fox to guard the hen house
LUN Security
Zoning
An access control list
Establishes a conduit
A circuit will be constructed through this conduit
Allows only selected Servers to see a Storage Node
Lessons learned
Implement in parallel with LUN Masking
Segregate OS types into different Zones
Always Promptly Remove Entries For Retired Servers
LUN Security
LUN Masking
The Storage Node’s Access Control List
Sees the Server’s WWPN
Masks all LUNs not allocated to that server
Allows the Server to see only its assigned LUNs
Implement in parallel with Fabric Zoning
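Conceptually, the masking check is just an ACL lookup keyed by the server’s WWPN; a toy model in Python (the WWPNs and LUN numbers are invented for illustration):

```python
# Toy model of LUN masking: the storage node keeps an ACL keyed by
# server WWPN (all identifiers below are made up).

ACL = {
    "10:00:00:00:c9:12:34:01": {0, 1, 2},  # server A's allocated LUNs
    "10:00:00:00:c9:12:34:02": {3},        # server B's allocated LUNs
}

ALL_LUNS = range(8)  # LUNs this storage node has carved out

def visible_luns(wwpn: str):
    """Return only the LUNs allocated to this WWPN; all others are masked."""
    allowed = ACL.get(wwpn, set())   # unknown WWPN -> sees nothing
    return [lun for lun in ALL_LUNS if lun in allowed]

print(visible_luns("10:00:00:00:c9:12:34:02"))  # -> [3]
print(visible_luns("10:00:00:00:c9:ff:ff:ff"))  # not in ACL -> []
```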
LUN - Persistent Binding
Persistent Binding of LUNs to Server Device IDs
Permanently assign a System SCSI ID to a LUN.
Ensures the Device ID Remains Consistent Across Reconfiguration Reboots
Different HBAs use different binding methods & syntax
Tape Drive Device Changes have been a repeated source of NetBackup Media Server Failure
SAN Performance
Storage Configuration
Fabric Configuration
Server Configuration
SAN - Storage Configuration
More Spindles are Better
Faster Disks are Better
RAID 1+0 vs. RAID 5
“RAID 5 performs poorly compared to RAID 0+1 when both are implemented with software RAID” – Allan Packer, Sun Microsystems, 2002
Where does RAID 5 underperform RAID 1+0?
Random Write
Limit Partition Numbers Within RAIDsets
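The random-write gap comes from the classic RAID 5 read-modify-write penalty: each small write must read the old data and old parity, then write both back, i.e. four disk I/Os where a mirror needs two. A sketch of that arithmetic (the standard textbook model, not measured data):

```python
# Small-random-write penalty: physical disk I/Os per logical write.
def physical_ios(level: str, logical_writes: int) -> int:
    per_write = {"RAID 1+0": 2,  # write both halves of the mirror
                 "RAID 5":   4}  # read old data+parity, write new data+parity
    return per_write[level] * logical_writes

for level in ("RAID 1+0", "RAID 5"):
    print(f"{level:8}: {physical_ios(level, 1000)} disk I/Os "
          f"per 1000 random writes")
```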
SAN - Fabric Configuration
Common Switch for Server & Storage
Multiple “hops” reduce performance
Increases Reliability
Large Port-count switches
32 ports or more
16-port switches create larger fabrics simply to carry their own overhead
SAN - Server Configuration
Choose The Highest Performance HBA Available
PCI: 64-bit is better than 32-bit
PCI: 66 MHz is better than 33 MHz
Place in the Highest Performance Slot
Choose the widest, fastest slot in the system
Choose an Underutilized Controller
Size LUNs by RAIDset disk size
BAD: LUN sizes smaller than underlying disk size
SAN Resilience
At Least Two Fabrics
Dual Path Server Connections
Each Server N_Port is Connected to a Different Fabric
Circuit Failover upon Switch Failure
Automatic Traffic Rerouting
Hot-Plugable Disks & Power Supplies
SAN Resilience – Dual Path
Multiple FibreChannel Ports within Server
Active/Passive Links
Most GPRD SAN disruptions have affected single-attached servers
SAN – Good Housekeeping
Stay Current With OS Drivers & HBA Firmware
Before You Buy a Server’s HBA
Is it supported by the switch & storage vendors?
Coordinate Firmware Upgrades
Storage & Other Server Admin Teams Using SAN
Monitor Disk I/O Statistics
Be Proactive; Identify and Eliminate I/O Problems
SAN Backups
Why We Should
Offload Front-end IP Network
Most Servers are still connected to 100baseT IP
1 or 2 Gbps FC Links Increase Throughput
Shrink Backup Times
Why We Don’t
Cost
NetBackup Media Server License: starts at $5K list
Backup Futures
Incremental Backups
No longer stored on tape
Use “near-line” cheap disk arrays
Several vendors are under current evaluation
Still over IP
1 Gbps Ethernet is commonly available on new servers
10 Gbps Ethernet needed in core
Questions