FCTC_Zamer_iscsi
IP Storage Tutorial
Presented 17 October 2001 by
Marc Staimer, President & CDS – Dragon Slayer Consulting
Ahmad Zamer, Sr. Product Line Marketing – Intel
John Hufferd, Sr. Technical Staff – IBM SSD
Joe Gervais, Director Product Marketing – Alacritech
Tutorial Introduction
Marc Staimer, CDS – Dragon Slayer Consulting
The Purpose of this Tutorial
IP Storage as “block” vs. “file” storage
NAS will be discussed peripherally
To provide details about IP Storage
To provide factual information
To clarify issues
To facilitate understanding
Key point
This will be pragmatic education, not cheerleading
IP Networked Storage
iSCSI – New Possibilities
Ahmad Zamer
October 2001
Overview
Introduction
Benefits of IP Storage
IP Storage technologies
iSCSI
Conclusions
Introduction
“Ethernet wins. Again. In time… Ethernet will eventually triumph
over all other storage networking technologies, including Fibre
Channel”
Source: March 2001 Forrester Research
“If we were starting with a clean piece of paper … we would
probably use gigabit Ethernet and IP”
Source: Bill Miller CTO StorageNetworks, Industry Standard
“... 76% of senior IT executives believe IP will make it easier to
implement large-scale storage networks”
Source: Enterprise Storage Group 9/11/2000
“75% perceive iSCSI as the IP storage standard”
Source: Marc Staimer, Dragon Slayer Consulting – May 2001
Network Storage Models
Direct Attached Storage
•High Cost of Ownership
•Inflexible
Network Attached Storage
•Transmission optimized for file
transactions
•Storage traffic travels across the LAN
Storage Area Network
•Transmission optimized for database
transactions
•Separate LAN and SAN
•Increases Data availability
•Flexible and scalable
Moving from Dedicated to Networked Storage
Benefits of IP Storage
Brings the SAN concept to Ethernet networks
Lower total cost of ownership
Creates a single integrated network
Makes remote data replication possible
Improves enterprise network management
Provides a higher degree of interoperability
Advantages of IP Storage
Storage access over distance
Transparent to Applications
Leverage Benefits of IP
iSCSI
IT Skills
Ethernet & SCSI Infrastructure
Network Management
R&D Investment
Universal Access to Storage
[Diagram labels: IP network, storage router, GE link, FC or SCSI storage. Storage appears local to servers.]
Key Business Trends Favor IP Storage
[Four charts covering 2000 to 2003, each comparing IP Storage switches with FC switches: Network Performance, Overall System Cost, Trained Staff Available, Total Cost of Ownership. Data labels on the Network Performance chart: 0.85Gbps, 1Gbps, 1.7Gbps, 10Gbps, 10Gbps, 40Gbps, 100Gbps.]
IP Storage Standards
IETF IP Storage (IPS) Working Group
iSCSI
FCIP
iFCP
iSNS
Storage Networking Industry Association (SNIA)
SNIA IP Storage Forum
IP Storage Technologies
What are the technologies? (iSCSI, iFCP, FCIP)
iSCSI
iSCSI is a TCP/IP-based protocol for establishing and managing connections between IP-based storage devices, hosts and clients
FCIP
FCIP is a TCP/IP-based tunneling protocol for connecting geographically distributed Fibre Channel SANs transparently to both FC and IP
iFCP
iFCP is a TCP/IP-based protocol for interconnecting Fibre Channel storage devices or Fibre Channel SANs using an IP infrastructure in place of Fibre Channel switching and routing elements
IP Storage: iSCSI, FCIP, iFCP
Protocol | End Devices | Fabric Services*
iSCSI | iSCSI/IP | Internet Protocol
FCIP | Fibre Channel | Fibre Channel
iFCP | Fibre Channel | Internet Protocol
* Fabric Services include routing, device discovery, management, authentication, inter-switch communication
iSCSI, iFCP and FCIP Protocol Stacks
Common upper layers: Applications | Operating System | Standard SCSI Command Set
iSCSI stack: New Serial SCSI | iSCSI | TCP | IP
iFCP stack: FCP (FC-4) | iFCP | TCP | IP
FCIP stack: FCP (FC-4) | FC Lower Layers | FCIP | TCP | IP
iFCP
iFCP
iFCP is a gateway-to-gateway protocol for implementing a Fibre Channel fabric over a TCP/IP transport
Traffic between Fibre Channel devices is routed and switched by the TCP/IP network
The iFCP layer maps Fibre Channel frames to a predetermined TCP connection for transport
FC messaging and routing services are terminated at the gateways, so the fabrics are not merged with one another
Dynamically creates IP tunnels for FC frames
Frame layout: Ethernet Header | IP | TCP | iFCP | FCP | SCSI Data … | CRC Checksum
iFCP Approach
[Diagram: FC servers, FC tape libraries and FC JBODs attach to iFCP gateways; the gateways and an iSNS server connect across an IP network, which carries device-to-device sessions.]
iFCP provides F port to F port connectivity only
IP services at individual device level
IETF standards for routing, naming, security, QoS, CoS, discovery (iSNS)
FCIP
FCIP
FCIP encapsulates FC frames within TCP/IP, allowing islands of FC SANs to be interconnected over an IP-based network
TCP/IP is used as the underlying transport to provide congestion control and in-order delivery of FC frames
All classes of FC frames are treated the same, as datagrams
End-station addressing, address resolution, message routing, and other elements of the FC network architecture remain unchanged
IP is introduced exclusively as a transport protocol for an inter-network bridging function
IP is unaware of the Fibre Channel payload and the FC fabric is unaware of IP
Frame layout: Ethernet Header | IP | TCP | FCIP | FCP | SCSI Data … | CRC Checksum
FCIP Approach—IP Tunneling
[Diagram: two Fibre Channel SANs, each with FC servers, FC switches, FC tape libraries and FC JBODs, joined across an IP network by an FCIP tunnel session.]
FCIP provides E port to E port connectivity
IP services available at aggregated FC SAN level
iSCSI
iSCSI
iSCSI is a SCSI transport protocol for
mapping of block-oriented storage data
over TCP/IP networks
The iSCSI protocol enables universal
access to storage devices and Storage
Area Networks (SANs) over standard
TCP/IP networks
Frame layout: Ethernet Header | IP | TCP | iSCSI | SCSI Data … | CRC Checksum
iSCSI, iFCP, FCIP
FCIP frame: Ethernet Header | IP | TCP | FCIP | FCP | SCSI Data … | CRC Checksum
iFCP frame: Ethernet Header | IP | TCP | iFCP | FCP | SCSI Data … | CRC Checksum
iSCSI frame: Ethernet Header | IP | TCP | iSCSI | SCSI Data … | CRC Checksum
iSCSI – Cont.
iSCSI (Internet SCSI) specifies a way to “encapsulate” SCSI commands in a TCP/IP network connection:
IP Header | TCP Header | iSCSI Header | SCSI commands and data
IP Header: contains “routing” information so that the message can find its way through the network
TCP Header: provides information necessary to guarantee delivery
iSCSI Header: explains how to extract SCSI commands and data
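As a rough illustration of the encapsulation idea on this slide, the Python sketch below prepends a small header to a SCSI command so a receiver can extract it again. The 8-byte header layout, the names `SKETCH_HEADER`, `wrap` and `unwrap`, and the opcode value are assumptions made for this example; they are not the iSCSI header format.

```python
import struct
from typing import Tuple

# Hypothetical, simplified 8-byte header: opcode (1 byte), 3 pad bytes,
# payload length (4 bytes, network byte order). NOT the real iSCSI layout.
SKETCH_HEADER = struct.Struct("!B3xI")


def wrap(opcode: int, scsi_payload: bytes) -> bytes:
    """Prepend the sketch header so a receiver can locate the payload."""
    return SKETCH_HEADER.pack(opcode, len(scsi_payload)) + scsi_payload


def unwrap(message: bytes) -> Tuple[int, bytes]:
    """Read the header, then slice out exactly the bytes it describes."""
    opcode, length = SKETCH_HEADER.unpack_from(message, 0)
    start = SKETCH_HEADER.size
    return opcode, message[start:start + length]


# A 10-byte command descriptor block; 0x28 is the SCSI READ(10) opcode.
cdb = bytes([0x28]) + bytes(9)
message = wrap(0x01, cdb)          # 0x01 is an arbitrary opcode for the sketch
opcode, payload = unwrap(message)
```

In a real stack the result of `wrap` would be handed to TCP, which in turn adds its own header, followed by IP and Ethernet.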
iSCSI Deployment
iSCSI Implementations
[Diagram labels: iSCSI client, IP network, native iSCSI device, iSCSI server, iSCSI gateway, FC switch, disk.]
Storage Consolidation
[Diagram labels: NT servers, LAN, SAN, switches, RAID, RAID (Email), Mission-Critical RAID (Oracle, ERP DB), tape drives, tape library.]
Server and LAN bottlenecks
Single points of failure
Poor scalability (management overhead, resource inefficiencies)
Tape Drives => Tape Library
Departmental => Application-centric disk arrays
iSCSI Architecture
Overview
Architectural Model
Features Beyond Parallel (//) SCSI
Issues Beyond Parallel (//) SCSI
iSCSI - Layered Model
[Layered diagram: an initiator I/O system and a target I/O system. SCSI application layer: SCSI application and SCSI device server exchange SCSI CDBs through the SCSI application protocol. iSCSI protocol layer: iSCSI protocol services exchange iSCSI PDUs within an iSCSI session. TCP/IP: TCP segments in IP datagrams. Ethernet data link + physical: Ethernet frames.]
Replaces shared bus with switched fabric
Transparently encapsulates SCSI CDBs
Unlimited target and initiator connectivity
iSCSI Sessions
[Diagram: an iSCSI initiator on an iSCSI host holds iSCSI sessions with iSCSI targets on an iSCSI device; each session runs over TCP connections.]
Session between initiator and target
One or more TCP connections per session
Login phase begins each connection
Deliver SCSI commands in order
Recover from lost connections
iSCSI Encapsulation
[Diagram: data servers (SCSI initiator, iSCSI initiator) send frames across the IP network to an iSCSI target, which forwards them to the SCSI target and LUNs on a Fibre Channel SAN.]
IP-side frame: Ethernet Header | IP | TCP | iSCSI | SCSI | DATA | CRC
FC-side frame: FC Header | SCSI | DATA | CRC
iSCSI Packet Order
[Diagram: end users on an external network reach the data servers (SCSI initiator, iSCSI initiator); packets numbered 1, 2, 3 cross the IP network to the iSCSI target, then the SCSI target and LUNs on the Fibre Channel SAN.]
iSCSI Packet
Frame layout: Ethernet Header | IP | TCP | iSCSI | SCSI Data … | CRC Checksum
iSCSI Packet
Ethernet frame: Preamble (8) | Destination Address (6) | Source Address (6) | Type (2) | Data: IP, TCP, iSCSI (46–1500 bytes) | FCS (4 octets)
Well-known ports: 21 FTP, 23 Telnet, 25 SMTP, 80 HTTP, 5003 iSCSI
TCP header: Source Port | Destination Port | Sequence Number | Acknowledgment Number | Offset, Reserved, flags U A P R S F | Window | Checksum | Urgent Pointer | Options and Padding
iSCSI encapsulated: Opcode | Opcode-specific fields | Length of data (after 40-byte header) | LUN or opcode-specific fields | Initiator Task Tag | Opcode-specific fields | Data field …
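The TCP header fields listed on this slide can be pulled out of a segment with a few lines of Python. This is a minimal sketch that assumes the standard 20-byte TCP header and ignores checksum verification; the function name `parse_tcp_header`, the dictionary keys, and the sample segment are made up for the example, and port 5003 is simply the value the slide shows for iSCSI.

```python
import struct

# Fixed part of the TCP header: 20 bytes, network byte order.
TCP_HEADER = struct.Struct("!HHIIHHHH")
# Flag letters as on the slide, listed from bit 0 upward:
# FIN, SYN, RST, PSH, ACK, URG.
FLAG_NAMES = ["F", "S", "R", "P", "A", "U"]


def parse_tcp_header(segment: bytes) -> dict:
    """Split a TCP segment into header fields and payload."""
    (src, dst, seq, ack, offset_flags,
     window, checksum, urgent) = TCP_HEADER.unpack_from(segment, 0)
    data_offset = (offset_flags >> 12) * 4   # header length in bytes
    flags = [name for bit, name in enumerate(FLAG_NAMES)
             if offset_flags & (1 << bit)]
    return {
        "src_port": src, "dst_port": dst,
        "seq": seq, "ack": ack,
        "flags": flags, "window": window,
        "checksum": checksum, "urgent": urgent,
        "payload": segment[data_offset:],    # where the iSCSI bytes would start
    }


# Build a sample segment: 5 header words (20 bytes), PSH and ACK set.
sample = TCP_HEADER.pack(49152, 5003, 1, 0, (5 << 12) | 0x18,
                         65535, 0, 0) + b"iSCSI bytes"
parsed = parse_tcp_header(sample)
```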
iSCSI Commands
SCSI Commands
Command phase
Optional data phase
Response phase
iSCSI Commands
Binds command phase with associated data into an iSCSI Protocol Data Unit (PDU)
iSCSI Architecture Features Beyond // SCSI
Sessions
Comprises one or more TCP connections used for failover and/or link aggregation
Device sharing
Any host on the network can potentially use the same iSCSI device
Device scalability
Hosts can connect to an effectively limitless number of iSCSI devices
iSCSI Architecture Issues Beyond // SCSI
Naming, addressing and discovering
Security & Data Integrity
Ordering and numbering
Error handling/recovery
Networking Overhead
iSCSI Architecture Issues
Naming, Addressing & Discovery
// SCSI uses a simple NAD scheme:
Devices discovered by polling the bus
Devices given unique id between 0 and 15
iSCSI requires:
Internet addressing
Location independent naming
operation beyond firewalls
multiple addresses to one target
multiple targets behind one address
3rd party commands
Scalable discovery (poll the Internet??)
iSCSI Storage Device Discovery Process
1) Host driver requests available iSCSI targets
from the SCSI router
2) SCSI router sends available iSCSI target
names to host
3) Host logs into iSCSI targets that were received
4) SCSI router accepts the login and sends target
identifiers to Host (numbers)
5) Host queries targets for device information
6) Targets respond with device information
7) Host creates table of internal devices (/dev/…)
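The seven discovery steps above can be walked through with a small in-memory simulation. Everything here is hypothetical and for illustration only: the `SketchRouter` class, its methods, the target names and the `/dev/sketch_...` naming scheme are not part of any real driver or of the iSCSI drafts.

```python
class SketchRouter:
    """Stands in for the SCSI router; all names and data are made up."""

    def __init__(self, targets):
        self._targets = targets            # target name -> device descriptions

    def list_targets(self):
        """Steps 1-2: report the available target names."""
        return sorted(self._targets)

    def login(self, name):
        """Steps 3-4: accept a login and hand back a numeric identifier."""
        return self.list_targets().index(name)

    def inquire(self, target_id):
        """Steps 5-6: answer a query with device information."""
        return self._targets[self.list_targets()[target_id]]


def discover(router):
    """Step 7: build the host's table of internal devices."""
    table = {}
    for name in router.list_targets():
        target_id = router.login(name)
        for lun, description in enumerate(router.inquire(target_id)):
            table[f"/dev/sketch_t{target_id}l{lun}"] = description
    return table


router = SketchRouter({"tape0": ["tape"], "array0": ["disk", "disk"]})
device_table = discover(router)
```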
iSCSI Sequence
[Sequence diagram between initiator (iSCSI driver) and target over a single TCP session. Callout: this device has already initialized onto the Fibre Channel.]
1. Establish normal TCP session (TCP port 5003)
2. Initiator: 0X03 Command, Login (Send Targets)
3. Target: 0X43 Login Response, Reject, Login Status 1. In text area, list of accessible target names. Keeps TCP session up.
4. Initiator: 0X03 Command, Login (list of target names sent)
5. Target: 0X43 Login Response (response with target drive mapping)
iSCSI Architecture Issues: Security Levels
0: None – ok in controlled environments
1: Initiator and target authentication
Prevents unauthorized access
2: Digests for header and data integrity
Protects against man-in-the-middle, insertion, modification and deletion
3: Encryption (IPsec)
Protects against eavesdropping
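To make the digest idea at level 2 concrete, here is a minimal Python sketch that appends a 4-byte checksum to a message and verifies it on receipt. It uses `zlib.crc32` purely as a stand-in; the slides do not say which digest algorithm iSCSI uses, and the function names are invented for the example.

```python
import zlib


def add_digest(data: bytes) -> bytes:
    """Append a 4-byte CRC-32 of the data (stand-in digest)."""
    return data + zlib.crc32(data).to_bytes(4, "big")


def check_digest(message: bytes) -> bytes:
    """Return the data if the trailing digest matches, else raise."""
    data, digest = message[:-4], message[-4:]
    if zlib.crc32(data).to_bytes(4, "big") != digest:
        raise ValueError("digest mismatch: data changed in transit")
    return data


protected = add_digest(b"WRITE block 42")
recovered = check_digest(protected)

# Flip one bit of the first byte to simulate modification in transit.
tampered = bytes([protected[0] ^ 0x01]) + protected[1:]
try:
    check_digest(tampered)
    tamper_detected = False
except ValueError:
    tamper_detected = True
```

An unkeyed checksum like this only detects changes; authentication (level 1) and encryption (level 3) are still separate concerns.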
iSCSI Architecture Issues Ordering & Numbering
Unlike // SCSI, iSCSI PDUs may
Arrive out of order (by taking different routes)
Not arrive at all
iSCSI requires
Command numbering
Ordered delivery over multiple connections
Status numbering
Detection of failed connections
Data sequencing
Detection of missing data PDUs
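Command numbering is what lets a target restore order when commands arrive over several connections. The Python sketch below shows one way this could work, assuming each command carries a sequence number that increases by one; the class name `InOrderDelivery` and its methods are hypothetical, not taken from the iSCSI drafts.

```python
import heapq


class InOrderDelivery:
    """Release commands strictly in sequence-number order."""

    def __init__(self, first_cmd_sn=1):
        self._expected = first_cmd_sn
        self._pending = []                 # min-heap of (cmd_sn, command)

    def receive(self, cmd_sn, command):
        """Accept a command from any connection; return those now deliverable."""
        if cmd_sn >= self._expected:       # anything older is a duplicate
            heapq.heappush(self._pending, (cmd_sn, command))
        ready = []
        while self._pending and self._pending[0][0] <= self._expected:
            sn, cmd = heapq.heappop(self._pending)
            if sn == self._expected:       # skip duplicates already delivered
                ready.append(cmd)
                self._expected += 1
        return ready

    def waiting_for(self):
        """Sequence number whose absence is currently holding up delivery."""
        return self._expected


delivery = InOrderDelivery()
first = delivery.receive(2, "WRITE b")     # arrived early on connection B
second = delivery.receive(1, "WRITE a")    # arrives later on connection A
```

Here `first` is empty because command 1 has not arrived yet, and `second` releases both commands in order.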
iSCSI Architecture Issues Error Handling & Recovery
// SCSI errors incur costly recovery:
Aborted commands; target, bus and host resets
OK, because bus errors are infrequent
iSCSI errors will be more frequent
Link failures
TCP failures
Bad “middle box” (firewall, router)
Does the Internet have a “reset” option??
iSCSI Architecture Issues Networking Overhead
Software iSCSI can achieve near GbE wire
speed – but at 100% CPU
Traditional TCP stacks are expensive
multiple memory copies
too many interrupts
checksum calculations
We need TCP offload engines (TOE)
iSCSI - TCP Offload
Frame layout: Ethernet Header | IP | TCP | iSCSI | SCSI Data | CRC
Ethernet frame requires additional CPU processing
Headers must be stripped
Packets ordered
Data copied into memory buffers
CRC checked
iSCSI Architecture Issues Networking
TOE
The challenge rests on the TOE vendor
Interrupt host on command boundaries
Offer zero-copy from NIC to app
Eliminate TCP reassembly buffer
Provides true zero-copy
Requires RDMA or synchronization
Proposed IETF solutions for framing
WARP - an RDMA mechanism
Markers – a synchronization mechanism
What’s Next for iSCSI
CRC
SLP (Service Location Protocol)
Authentication
Encryption
Conclusions
Conclusions
IP-based storage will proliferate
Benefits are strong
Significant players
Clear need
Standards will be established
Work with industry leaders
Backup
iSNS
iSNS (Internet Storage Name Server)
Provides registration and discovery of SCSI devices and Fibre Channel-based devices
In IP-based storage like iSCSI, end devices register with iSNS
In iFCP, Fibre Channel-based storage end devices register with iSNS through an iFCP gateway
iSNS Operation
[Diagram: an iSNS server and two FC networks joined across an IP network. FC network 1 holds Server_1 (N_port ID #24) behind the local iFCP portal; FC network 2 holds Server_2 (N_port ID #24) behind the remote iFCP portal. Portal IP addresses shown: 10.1.2.3 and 10.1.2.4.]
Problem: Two identical N_port IDs
Solution: Create new ID (based on IP address + N_port ID) = 2422
Tracing an iSCSI Block I/O
[Diagram: path of a block I/O from a server to an iSCSI appliance.]
Server: Database Application | Operating System (Database System, Raw Partition Manager, File System, Volume Manager) | SCSI Device Driver | iSCSI Device Driver Layer | TCP/IP stack | Network Interface Card
iSCSI Appliance: Network Interface Card | TCP/IP stack | iSCSI Device Driver Layer | SCSI Device Driver | RAID Host Bus Adapter | Storage I/O Bus | iSCSI Appliance Storage
Callouts: (1) Application file I/O requests; (2) Block I/O / data / storage location; device-specific requests to TCP/IP network
Challenge 1 - TCP Overhead
Consider a SCSI WRITE command. How many times do you think
the data is copied before eventually reaching the target HBA?
[Diagram: a Linux host system (Application, File System, Buffer Cache, SCSI Subsystem, iSCSI Host Driver, TCP/IP, Ethernet Driver) connected over Ethernet to a Linux target system (Ethernet Driver, TCP/IP, iSCSI Target Driver, Bridging Software, Block Device Driver, HBA); the copies are numbered 1 to 4.]
Host: Application –copy-> Buffer Cache –copy-> TCP/IP –DMA-> Ether (2 copies, 1 DMA)
Target: Ether –DMA-> Ring Buffer –copy-> TCP/IP –copy-> Bridge –DMA-> HBA (2 copies, 2 DMA)
TCP Overhead (2)
TCP Processing
Every TCP connection that is part of an iSCSI session has
processing overhead potential
Connection setup / teardown
TCP state machine:
Acknowledge, Timeout, Retransmission
Window management
Congestion Control
TCP segmentation
IP fragmentation
Checksum calculations
Partial or Complete TCP Offload mechanisms are
assumed to be required to make iSCSI performance
comparable to FC
Challenge #2 – Framing
Message Boundaries (The Framing - HW-Issue)
iSCSI messages have no alignment relationship with TCP
segments
And TCP does not have a “built in mechanism” for
signaling message boundaries.
The IETF considered leveraging the urgent pointer for some time
So how can an iSCSI adapter determine where a message
begins and ends??
By reading the length field in the iSCSI header
Determines where in byte stream current message ends and
next begins
NIC must stay “in sync” with beginning of byte stream
Works well in a perfect world (Maybe a SAN or LAN ????)
In a MAN/WAN we have issues
IP fragments leading to out-of-order packet delivery and/or packet loss
Any “middle box” may fragment an IP packet, sending each fragment along potentially different routes
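The "read the length field" approach described on this slide can be sketched in Python as a receiver that cuts a TCP byte stream into messages. The 4-byte length prefix and the names `frame` and `StreamFramer` are assumptions for the example, not the iSCSI header; the sketch also assumes TCP has already delivered the bytes in order, which is the "perfect world" case.

```python
import struct

LENGTH_PREFIX = struct.Struct("!I")    # hypothetical 4-byte length field


def frame(payload: bytes) -> bytes:
    """Sender side: prefix each message with its length."""
    return LENGTH_PREFIX.pack(len(payload)) + payload


class StreamFramer:
    """Receiver side: stay in sync with the byte stream across segments."""

    def __init__(self):
        self._buffer = bytearray()

    def feed(self, chunk: bytes) -> list:
        """Add received bytes; return every message that is now complete."""
        self._buffer += chunk
        messages = []
        while len(self._buffer) >= LENGTH_PREFIX.size:
            (length,) = LENGTH_PREFIX.unpack_from(self._buffer, 0)
            end = LENGTH_PREFIX.size + length
            if len(self._buffer) < end:
                break                      # message continues in a later segment
            messages.append(bytes(self._buffer[LENGTH_PREFIX.size:end]))
            del self._buffer[:end]
        return messages


# Message boundaries do not line up with the 3-byte "segments" used here.
stream = frame(b"abc") + frame(b"de")
framer = StreamFramer()
received = []
for i in range(0, len(stream), 3):
    received.extend(framer.feed(stream[i:i + 3]))
```

If a segment carrying a length field goes missing, the receiver can no longer tell where the next message starts, which is the problem the following slides discuss.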
Framing (2)
Message Boundaries Continued
THE SCENARIO:
An iSCSI header is not received when expected because the TCP
segment that it was part of was delivered out of order
THE ISSUE:
The receiver does not know where to put the trailing data packets
until the packet with the header arrives
The different options?
Drop all packets until the header arrives
They will be retransmitted
Buffer packets until the header arrives. Then “re-assemble.”
On a 1 Gbit WAN link, 16 MB of buffer memory is required per TCP connection
On a 10 Gbit WAN link, 125 MB of buffer memory is required per TCP connection
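The slide does not state how its buffer figures were derived. One plausible reading, assumed here, is that the receiver must hold one bandwidth-delay product (link rate times round-trip time) per connection. The Python sketch below computes that product and the round-trip time each slide figure would imply; the function names are made up, and megabytes are taken as decimal (10^6 bytes).

```python
def reassembly_buffer_bytes(link_bits_per_s: float, round_trip_s: float) -> float:
    """Bandwidth-delay product in bytes: data in flight during one round trip."""
    return link_bits_per_s * round_trip_s / 8


def implied_round_trip_s(link_bits_per_s: float, buffer_bytes: float) -> float:
    """Round-trip time that makes the given buffer equal one bandwidth-delay product."""
    return buffer_bytes * 8 / link_bits_per_s


# Round-trip times implied by the slide's figures (assumption: decimal MB).
rtt_1g = implied_round_trip_s(1e9, 16e6)       # about 0.128 s
rtt_10g = implied_round_trip_s(10e9, 125e6)    # about 0.100 s

# Forward direction: a 10 Gbit/s link with a 100 ms round trip.
buffer_10g = reassembly_buffer_bytes(10e9, 0.100)   # about 125e6 bytes
```

Under this reading the two figures correspond to slightly different round-trip times (roughly 128 ms and 100 ms), so they should be taken as order-of-magnitude estimates.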
Framing (3)
Message Boundaries Continued
THE BAD NEWS:
Dropping packets greatly impacts performance and
significantly increases network congestion
Local buffering is expensive and NIC logic is complex
Intro – SAN View
Storage Management & Apps
Hosts
Infrastructure
Targets
SAN Components
Server Platforms:
Fibre Channel Host Bus Adapters
IP Storage NICs (SNICs)
SAN Software
Storage Platforms:
RAID subsystems
JBOD
Tape subsystems
SAN Interconnect:
Fibre Channel hubs and switches
IP Storage switches
SAN-to-SCSI bridges
MAN and WAN gateways
SAN, NAS, iSCSI Comparison
[Diagram comparing the I/O stacks of DAS, SAN, iSCSI appliance, iSCSI gateway and NAS.]
Computer system, all cases: Application | File System
DAS: Volume Manager | SCSI Device Driver | SCSI Bus Adapter; block I/O over SCSI
SAN: Volume Manager | SCSI Device Driver | Fibre Channel HBA; block I/O over FC
iSCSI appliance and iSCSI gateway: Volume Manager | SCSI Device Driver | iSCSI Driver | TCP/IP stack | NIC; block I/O over IP
NAS: I/O Redirector | NFS/CIFS | TCP/IP stack | NIC; file I/O over IP
Storage side, iSCSI appliance and gateway: NIC | TCP/IP stack | iSCSI layer | Bus Adapter (the gateway connects on to an FC switch)
Storage side, NAS: NIC | TCP/IP stack | File System | Device driver; block I/O to the disks
Potential Outcomes and Success Probability
I/O Adapters “Data Movers”
Intel and other vendors will have ONE Ethernet wire for ALL storage & LAN traffic
[Diagram: I/O block data and LAN data share a single GbE port.]
Storage Functions/Applications
Current Functions/Applications
Storage Consolidation
Tape Backup
Clustering
Replication
Disaster Recovery
New Capabilities with IP Storage
SAN Extension
QoS
Security
LAN-free Tape Backup
[Diagram labels: users, servers, RAID, SAN switch, SAN bridge, tape subsystem.]
SAN Advantages for LAN-free Tape Backup:
Removes backup traffic from the LAN
Tape becomes SAN shared resource
High performance SAN infrastructure
SCSI attached via SAN bridge
Remote Backup Application
[Diagram: NT servers and iSCSI servers with HBAs on a LAN, RAID (Email), Mission-Critical RAID (Oracle, ERP DB), further RAID, and two tape libraries. Link legend: GE, 10GE (iSCSI, iFCP); Fibre Channel; SCSI.]
Backup Server:
• Veritas Shared Storage Option
• Tivoli Storage Manager
Allows customers to move archiving off-site for higher disaster protection
Server Clustering
[Diagram labels: users, servers joined by a heartbeat link, SAN switch, RAID.]
SAN Advantages for server clustering:
Server access to common storage resources
Failure of a single server still provides data access
Scalable to > 30 servers in a cluster
Simplified storage resource management
SAN Extension: Replication over WAN
[Diagram: NT servers and iSCSI servers with HBAs on a LAN, RAID (Email), further RAID and tape libraries at two sites joined by an IP WAN link (OC-3, T1, etc). Link legend: GE, 10GE (iSCSI, iFCP); Fibre Channel; SCSI.]
Unified management of data center and WAN storage routers
Not vulnerable to disruption at a local SAN
Leverage current infrastructure
Expandable to iSCSI devices
TCP/IP Layers
OSI Model | TCP/IP layers | TCP/IP Protocols
7, 6, 5 | Process layer | FTP, Telnet, HTTP, SNMP, TFTP
4 | Host-to-host layer | TCP (connection oriented), UDP (connectionless)
3 | Internet layer | IP
2, 1 | Network access layer | LAN/WAN: Ethernet, token ring, ATM, Frame Relay, FDDI