Performance assessment of distributed SAN systems


Performance assessment of distributed SAN systems
TERENA Networking Conference, Poznań, 2005
Bartosz Belter
Artur Binczewski
Wojbor Bogacki
Maciej Brzeźniak
[email protected]
[email protected]
[email protected]
[email protected]
Agenda
Introduction
Storage Networking challenges
IP Storage – a new approach to building distributed SANs
IP Storage – experiments in Polish NREN PIONIER
Storage Networking
Storage Networking definition from SNIA
The practice of creating, installing, administering, or using networks whose
primary purpose is the transfer of data between computer systems and
storage elements and among storage elements.
A Storage Area Network is a high-speed, special-purpose network (or subnetwork) that interconnects different kinds of data storage devices with associated data servers. Usually SANs are based on Fibre Channel or SCSI technology.
Storage Networking – the importance
Explosion of Storage Data:
Data Warehousing
statistics,
charts,
reporting
Internet
web hosting
e-commerce
e-business
Customer Relationship Management
Currently focused on application aspects:
Local and remote mirroring, backups and disaster recovery
Remote data replication
Local and remote storage access
Are separate SANs enough for high performance computing?
How can remote, separate HPC centers be integrated into a single, distributed, scalable high performance system?
HPC centers use different technologies, not all of which are applicable in a backbone network
traditional Storage Networking introduces an additional limitation: the maximum distance over which data can be transferred
Traditional Storage Networking technology: SCSI vs. Fibre Channel (FC)

Maximum cable length:
  SCSI: 25 meters if no more than 2 devices are used, otherwise 12 meters
  FC: 30 meters device to device (copper), 10,000 meters device to device (optical)
Maximum speed:
  SCSI: 2.560 Gbps
  FC: up to 2.125 Gbps (10 Gbps in the near future)
Maximum number of devices:
  SCSI: 16
  FC: 126
IP Storage
IP Storage is a new approach that extends existing Storage Area Networks using the IP protocol, usually over Gigabit Ethernet.
According to SNIA, IP Storage is:
Computer systems and storage elements that are connected via Internet Protocol (IP)
The transport of storage traffic over an IP network
IP Storage traffic carries the traditional block I/O using SCSI protocols supported by most
open systems
According to SNIA, IP Storage is not:
File-level transfer of data (e.g. NAS)
Object-level access (e.g. HTTP, FTP)
IP Storage protocols
Internet Small Computer Systems Interface (iSCSI)
iSCSI is a protocol that enables the transfer of data-block traffic over an IP network instead of a direct SCSI-compatible bus. It uses a TCP layer and, unlike other network storage protocols, requires only an Ethernet interface to operate.
Internet Fibre Channel Protocol (iFCP)
iFCP is a new standard for extending Fibre Channel storage networks across the Internet. It provides a mechanism for delivering storage data to and from Fibre Channel storage devices over SAN infrastructure, or even over the Internet, using TCP/IP.
Fibre Channel Over IP (FCIP)
FCIP describes mechanisms that allow the interconnection of islands of Fibre Channel
storage area networks over IP-based networks to form a unified storage area network in
a single Fibre Channel fabric. FCIP relies on IP-based network services to provide the
connectivity between the storage area network islands over local area networks,
metropolitan area networks, or wide area networks.
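All three protocols share one core idea: block I/O requests ride on a TCP/IP stream instead of a dedicated SCSI bus or FC link. The toy Python sketch below illustrates only that idea; the request layout, address and port are invented for illustration and bear no relation to the real iSCSI or FCIP PDU formats defined by the IETF (RFC 3720, RFC 3821).

import socket
import struct

BLOCK_SIZE = 512  # bytes per logical block, a typical disk sector size

def read_blocks(sock, lba, count):
    """Send a toy read request (LBA + block count) and collect the reply."""
    sock.sendall(struct.pack("!QI", lba, count))  # 8-byte LBA, 4-byte count
    expected = count * BLOCK_SIZE
    chunks = []
    while expected > 0:
        chunk = sock.recv(min(65536, expected))
        if not chunk:
            raise ConnectionError("peer closed mid-transfer")
        chunks.append(chunk)
        expected -= len(chunk)
    return b"".join(chunks)

# Usage against a matching toy target (address and port are hypothetical):
# with socket.create_connection(("10.0.0.1", 3260)) as s:
#     data = read_blocks(s, lba=0, count=8)   # read the first 4 KB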
The experiment
Tests were performed in the Polish Optical Internet PIONIER:
the testbed interconnects 9 HPC centers
maximum distance: over 1500 km
no QoS was provided for FCIP traffic across the WAN infrastructure; FCIP was tested on the production network
IP Storage vendor solutions used in tests:
CNT UltraNet Edge 3000
Cisco MDS 9216 and 8-port IP Storage Services Module
The main goals of the experiment:
to build a distributed data architecture based on the new IP Storage technology
to verify the IP Storage protocols (iSCSI and FCIP) in a live network environment
to evaluate the performance of IP Storage vendor solutions connected via Gigabit Ethernet
Testbed description
Hardware:
PC
Processor: Pentium 4 3.0 GHz
Memory: 512 MB
Hard Disc: Seagate Barracuda 7200.7 SATA
Western Digital Raptor WD740GD
Gigabit Ethernet Controller
Fibre Channel interface QLA 2340
IP Storage element
Cisco - MDS 9216
CNT - UltraNet Edge 3000
RAID 0 includes two storage arrays
[Testbed diagram: PC with FC interface -> IP Storage element -> Gigabit Ethernet switch -> Gigabit Ethernet -> Gigabit Ethernet switch -> IP Storage element -> RAID 0]
Testing methodology
Benchmark software:
Windows 2000
HD Tach
SiSoftware Sandra
SuSE Linux 9.1 and 9.2
Bonnie
IOZone
HDParm
IOMeter
MySQL database benchmark
Performance Benchmark from Tivoli SANergy
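For intuition, here is a minimal sketch (not part of the benchmark suite actually used) of the kind of sequential-read throughput measurement that tools such as HD Tach, HDParm or IOZone perform against the SAN-attached volume; the device path is hypothetical.

import os
import time

def sequential_read_mb_s(path, block_size=131072, total_bytes=700 * 2**20):
    """Read total_bytes sequentially in block_size chunks and return MB/s.

    Note: without O_DIRECT the OS page cache can inflate repeated runs.
    """
    fd = os.open(path, os.O_RDONLY)
    try:
        done = 0
        start = time.time()
        while done < total_bytes:
            buf = os.read(fd, block_size)
            if not buf:          # end of file/device reached early
                break
            done += len(buf)
        elapsed = time.time() - start
    finally:
        os.close(fd)
    return (done / 2**20) / elapsed

# Example (hypothetical path for the remote RAID 0 volume):
# print(sequential_read_mb_s("/dev/sdb"))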
Test results
Performance Benchmark from Tivoli SANergy
Reading performance
[Chart: reading throughput (MB/s) for FCIP and iSCSI vs. test site: Poznań (0 km), Zielona Góra (161 km), Wrocław (390 km), Opole (500 km), Katowice (650 km), Bielsko-Biała (740 km), Kraków (850 km), Radom (1090 km), Białystok (1540 km)]
as expected, the overall performance decreases; it has a linear relationship with the distance
interconnecting distant HPC centers is possible even over 1500 km! (but the overall performance drops to about half)
Test results
Performance Benchmark from Tivoli SANergy
Reading/Writing performance (Write Acceleration option)
[Charts: reading and writing throughput (MB/s) for FCIP and FCIP with Write Acceleration vs. test site: Poznań (0 km), Wrocław (390 km), Bielsko-Biała (740 km), Białystok (1540 km)]
some vendors introduce their own improvements to the protocols – Cisco implements the "Write Acceleration" (WA) feature
WA did not affect the reading performance
WA gives interesting results for writing performance – in Białystok (1540 km) the writing performance doubles in comparison to standard FCIP transmission
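A back-of-the-envelope model of why WA helps: a standard SCSI write over FCIP costs two WAN round trips (command/XFER_RDY, then data/status), while with WA the local gateway answers XFER_RDY itself, leaving one. The sketch below assumes ~5 µs/km of one-way fibre latency and the I/O size and line rate shown; it is illustrative, not a reproduction of the measured numbers.

RTT_S_PER_KM = 2 * 5e-6   # ~5 us/km one way in fibre, doubled for round trip

def write_time_s(distance_km, io_bytes, rate_bytes_s, write_acceleration):
    """Seconds for one SCSI write: WAN round trips plus transfer time."""
    rtt = distance_km * RTT_S_PER_KM
    round_trips = 1 if write_acceleration else 2
    return round_trips * rtt + io_bytes / rate_bytes_s

# Białystok (1540 km), 64 KB writes at ~100 MB/s:
for wa in (False, True):
    t = write_time_s(1540, 64 * 1024, 100e6, wa)
    print(f"Write Acceleration={wa}: {t * 1000:.1f} ms per write")
# -> roughly 31 ms vs 16 ms, consistent with the observed doubling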
Test results
IOMeter: Reading - CPU load
[Chart: CPU load (%) during reading for FCIP and iSCSI vs. test site, from Poznań (0 km) to Białystok (1540 km)]
the iSCSI software driver introduces a higher CPU load than FCIP (which is handled in hardware)
Test results
Copying 700 MB of raw data
[Chart: copy time (sec) for FCIP and iSCSI vs. test site, from Poznań (0 km) to Białystok (1540 km)]
copy time shows a good linear relationship with the distance
Test results – MySQL benchmark
MySQL – a popular open source relational database
benchw – a simple benchmark for relational databases (http://benchw.sourceforge.net)
DB tables:
fact01: 1.02 GB – 10 million records
dim0: 0.24 MB – 10k records
dim1: 0.24 MB – 10k records
dim2: 1.40 MB – 10k records
Query types:
loading data into the database: all tables
Q0: select from 2 tables, 2 conditions (dim0 & fact01; "=", "<>", numbers)
generating indexes for the tables: all tables
DB & DB filesystem recreated each time
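As an illustration, a Q0-style query could look like the hedged sketch below; the column names (d0key, measure) and connection parameters are hypothetical, and benchw's real schema may differ (see the URL above). It uses the MySQLdb driver contemporary with these tests.

import MySQLdb  # the MySQL-python driver

# Hypothetical Q0: a join of dim0 and fact01 with "=" and "<>" conditions
Q0 = """
SELECT SUM(fact01.measure)
FROM dim0, fact01
WHERE dim0.d0key = fact01.d0key   -- "=" condition linking the two tables
  AND dim0.d0key <> 42            -- "<>" condition on a numeric column
"""

conn = MySQLdb.connect(host="localhost", user="bench", passwd="", db="benchw")
cur = conn.cursor()
cur.execute(Q0)
print(cur.fetchone())
conn.close()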
Test results
MySQL database benchmark
Loading data into the database server
[Chart: load time (sec) for FCIP and iSCSI vs. test site, from Poznań (0 km) to Białystok (1540 km)]
loading data into the database performs a sequential read of the input file and inserts the data into the DB structures
the operation's performance scales linearly with the distance
Test results
MySQL database benchmark
Query no 0
[Chart: query Q0 time (sec) for FCIP and iSCSI vs. test site, from Poznań (0 km) to Białystok (1540 km)]
the operation reads from two database tables only
even an uncomplicated query shows a decrease in performance between the local and remote measurements
Test results
MySQL database benchmark
Index generation
[Chart: index generation time (sec) for FCIP and iSCSI vs. test site, from Poznań (0 km) to Białystok (1540 km)]
the operation reads from all tables stored in the database and writes a small amount of data (the generated indexes)
a more complicated request shows a significant decrease in performance between the local and remote measurements
Test results
dd command – FCIP vs. iSCSI
[Chart: dd copy time (sec) for block sizes of 4096, 16384, 32768 and 131072 bytes, FCIP vs. iSCSI, vs. test site, from Poznań (0 km) to Białystok (1540 km)]
with block sizes of 4 kB, 16 kB or 32 kB there are no significant differences between the iSCSI and FCIP protocols
the greater the block size, the better the performance, but ...
a block size that is too large decreases the overall performance (block size > RAID chunk size)
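A minimal sketch of the dd-style measurement behind these curves: write a fixed amount of data with varying block sizes and time each run. The target path is hypothetical; the real tests used dd itself.

import os
import time

TOTAL = 700 * 2**20   # 700 MB, as in the raw-copy test

def timed_write_s(path, block_size):
    """Write TOTAL bytes in block_size chunks and return elapsed seconds."""
    buf = os.urandom(block_size)
    fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o644)
    try:
        start = time.time()
        for _ in range(TOTAL // block_size):
            os.write(fd, buf)
        os.fsync(fd)   # make sure the data actually reached the device
        return time.time() - start
    finally:
        os.close(fd)

for bs in (4096, 16384, 32768, 131072):
    print(f"block size {bs:>6}: {timed_write_s('/mnt/san/testfile', bs):.1f} s")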
IP Storage – tuning up the transmission
configured TCP parameters:
TCP Maximum Window Size (default: 64 Kbytes, maximum: 32 Mbytes)
MWS > B x D, where B is the end-to-end bandwidth and D is the round-trip time
example: Gigabit Ethernet network, RTT = 10 ms:
MWS > 1000 x 10^6 bit/s x 10 x 10^-3 s = 10^7 bits, i.e. MWS > ~1.2 Mbytes (see the sketch after this list)
TCP Selective Acknowledgement (SACK)
TCP SACK helps TCP connections that span long distances recover from any frame loss that may occur
MTU set to 2148 bytes on IP Storage devices
for the iSCSI protocol – a hardware TCP Offload Engine was not tested
for the FCIP protocol – FCIP compression was not tested
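The window-size rule of thumb above, restated as a small sketch (the bandwidth-delay product):

def min_window_bytes(bandwidth_bps, rtt_s):
    """Bandwidth-delay product: bytes in flight needed to keep the pipe full."""
    return bandwidth_bps * rtt_s / 8

# The slide's example: Gigabit Ethernet, RTT = 10 ms
print(f"MWS > {min_window_bytes(1e9, 10e-3) / 2**20:.2f} Mbytes")  # -> ~1.19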
IP Storage – conclusions
As expected, the overall performance decreases; it has a linear relationship with the distance (latency)
Assuming a linear characteristic, it is possible and easy to predict how the overall performance decreases as the distance (latency) increases (see the sketch after these conclusions):
for each 100 km of distance -> performance decreases by about 4 MB/s
for every 1 ms of latency -> performance decreases by about 3 MB/s
Interconnecting distant HPC centers is possible even over 1500 km! (but the overall performance drops to about half)
The Write Acceleration feature considerably increases the writing performance
The iSCSI software driver used in the tests could significantly affect iSCSI performance, especially over short distances
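A minimal sketch of the rules of thumb above as a predictive model; the ~5 µs/km one-way fibre latency used to convert distance into RTT is an assumption.

PER_100_KM_MB_S = 4.0   # observed drop per 100 km of distance
PER_MS_MB_S = 3.0       # observed drop per 1 ms of round-trip latency

def drop_by_distance(distance_km):
    """Expected throughput penalty (MB/s) vs. the local measurement."""
    return PER_100_KM_MB_S * distance_km / 100.0

def drop_by_latency(rtt_ms):
    return PER_MS_MB_S * rtt_ms

# ~5 us/km each way means 100 km of fibre adds about 1 ms of RTT, so the
# two rules of thumb give figures of the same order:
for site, km in [("Zielona Góra", 161), ("Wrocław", 390), ("Białystok", 1540)]:
    print(f"{site}: -{drop_by_distance(km):.0f} MB/s by distance, "
          f"-{drop_by_latency(km * 0.01):.0f} MB/s by latency")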
Interoperability
Even though the IP Storage protocols have been published by the IETF, interoperability is still an important issue!
Thank you!