EDC CR1 Storage Architecture
August 2003
Ken Gacke
Systems Engineer
(605) 594-6846
[email protected]
U.S. Department of the Interior
U.S. Geological Survey
Contractor for the USGS
at the EROS Data Center
Storage Architecture Decisions

Evaluated and recommended through engineering white papers and weighted decision matrices
Requirements Factors
  Reliability – Data Preservation
  Performance – Data Access
  Cost – $/GB, Engineering Support, O&M
  Scalability – Data Growth, Multi-mission, etc.
  Compatibility with current Architecture
Program/Project selects best solution

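A weighted decision matrix of the kind this slide mentions can be sketched as follows. Only the five factor names come from the slide. The weights, the candidate names ("Option A", "Option B"), and the 1 to 5 scores are hypothetical values chosen to illustrate the arithmetic, not figures from the EDC evaluation.

```python
# Hypothetical weights (they sum to 1.0); only the factor names come from the slide.
WEIGHTS = {
    "reliability": 0.30,
    "performance": 0.25,
    "cost": 0.20,
    "scalability": 0.15,
    "compatibility": 0.10,
}

# Hypothetical 1-5 scores (5 is best) for two illustrative candidates.
SCORES = {
    "Option A": {"reliability": 4, "performance": 5, "cost": 2,
                 "scalability": 3, "compatibility": 4},
    "Option B": {"reliability": 3, "performance": 3, "cost": 5,
                 "scalability": 2, "compatibility": 5},
}


def weighted_score(scores, weights=WEIGHTS):
    """Return the sum of score * weight over all factors."""
    return sum(scores[factor] * weight for factor, weight in weights.items())


def rank(candidates, weights=WEIGHTS):
    """Return (name, weighted score) pairs, highest score first."""
    totals = {name: weighted_score(s, weights) for name, s in candidates.items()}
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    for name, total in rank(SCORES):
        print(f"{name}: {total:.2f}")
```

Changing the weights changes the ranking, which is why the final selection is left to the Program/Project.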
Storage Technologies

Online Storage Characteristics
  Immediate Data Access
  Server Limitations
    Number of I/O slots
    System Bandwidth
  Cost is Linear
    High Performance RAID -- $30/GB using 146GB drives
    Low Cost RAID -- $5/GB using ATA or IDE Drives
    Non RAID -- Less than $5/GB using 146GB drives
  Facility Costs
    Disk drives are always powered up
    Increased cooling requirements
  Life cycle of 3 to 4 years

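The "Cost is Linear" point can be made concrete with a short calculation. This is a minimal sketch that uses the two fixed $/GB figures quoted on the slide and assumes 1 TB = 1000 GB. It leaves out the Non RAID case (only an upper bound is given) and the facility costs, which the slide does not quantify.

```python
# $/GB figures quoted on the slide.
COST_PER_GB = {
    "High Performance RAID": 30.0,
    "Low Cost RAID": 5.0,
}


def online_cost(capacity_gb, cost_per_gb):
    """Linear model: hardware cost grows in direct proportion to capacity."""
    return capacity_gb * cost_per_gb


if __name__ == "__main__":
    for tb in (1, 5, 10):
        gb = tb * 1000  # assumption: 1 TB = 1000 GB
        for kind, rate in COST_PER_GB.items():
            print(f"{tb} TB {kind}: ${online_cost(gb, rate):,.0f}")
```

Under this model, doubling the capacity doubles the hardware cost for either RAID class.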
Storage Technologies

Online Storage
  Direct Attach Storage (DAS)
    Storage directly attached to server
  Network Attach Storage (NAS)
    TCP/IP access to storage typically with CIFS and NFS access
  Storage Area Network (SAN)
    Dedicated high speed network connecting storage devices
    Storage devices disassociated from server

Storage Technologies

Direct Attach Online Storage
  Disk is direct attached to single server
  System Configuration
    SCSI or Fibre Channel
      RAID Fibre Channel devices are typically SAN ready
    Just a Bunch of Disks (JBOD)
    Redundant Array of Independent Disks (RAID)
  High Performance on the local server
  Manageability
    Simple Configuration
    Resource reallocation requires physical move of controllers and disk

Storage Technologies

Direct Attach Online Storage
  Advantages
    High performance on local server
    Good for image processing and database applications
  Disadvantages
    Data sharing limited to slower network performance
    Difficult to reallocate resources to other servers

Storage Technologies
Direct Attached
[Diagram labels: 100Mb Network (FTP/NFS); Host A, Host B, Host C; File System (x3); 100MB FC]

Storage Technologies

NAS Online Storage
  Disk attached on server accessible over TCP/IP Network
  System Configuration
    Fibre Channel RAID Configurations
    Switched Network Environment
  Performance
    Network Switches and/or dedicated network topologies
  Reliability
    NAS Server performs a single function thereby reducing faults
    RAID, Mirror, Snapshot capabilities
  Easy to Manage

Storage Technologies

Network Attach Online Storage
  Advantages
    Easy to share files among servers
    Network Storage supports NFS and CIFS
    Servers can use existing network infrastructure
    Good for small file sharing such as office automation
    Availability of fault protection such as snapshot and mirroring
  Disadvantages
    Slower performance due to TCP/IP overhead
    Increases network load
    Backup/Restore to tape may be difficult and/or slow
    Does not integrate with nearline storage

Storage Technologies
Network Attached
[Diagram labels: 1Gb Network (NFS/CIFS); Host A, Host B, Host C; NAS Server; File System (x3); Share Files]

Storage Technologies

SAN Online Storage
  Disk attached within Fabric Network
  System Configuration
    Fibre Channel
    RAID Configurations
  Scalable High Performance
  High Reliability with redundant paths
  Manageability
    Configuration becomes more complex
    Logical reallocation of resources

Storage Technologies
Redundancy SAN Configuration
[Diagram labels: 100Mb Network; Host A, Host B, Host C (DMF); Fibre Switch (x2)]

Storage Technologies

SAN Online Storage Architecture
  Disk Farm
    Multiple servers share large disk farm
    Each server mounts unique file systems
  Clustered File Systems
    Multiple servers share a single file system
    Software Required – Vendor solutions include
      SGI CXFS
      ADIC StorNext File System
      Tivoli SANergy

Storage Technologies
Disk Farm SAN Configuration
[Diagram labels: 100Mb Network; Host A, Host B, Host C; Fibre Switch; Logical reallocation of disk]

Storage Technologies
Cluster SAN Configuration
[Diagram labels: 100Mb Network; Host A, Host B, Host C; Fibre Switch; Clustered File System; CXFS (x2)]

Storage Technologies

SAN Risks
  Cost is higher than DAS/NAS
  Technology Maturity
    Solutions are typically vendor specific
    Application software dependencies
  Infrastructure Support
    Complexity of Architecture
    Management of SAN Resources
    Sharing of storage resources across multiple Programs/Projects

Storage Technologies

SAN Benefits
  Administration flexibility
    Logically move disk space among servers
    Large capacity drives can be sliced into smaller file systems
    Scales better than direct attach
    Integrate within nearline configuration
  Data Reliability
    Storage disassociated from the server
    Fault Tolerant with Redundant Paths
  Increase Resource Utilization
    Reduce the number of FTP network transfers
    Logically allocate space among servers

Storage Technologies
SAN with Nearline Configuration
[Diagram labels: 1Gb Network; Host A, Host B, Host C; Fibre Switch; Clustered File System; CXFS; DMF/CXFS; Tape Library]

Online/Nearline Cost Comparison
Use of Existing Infrastructure (CR1 Silo)
[Chart: 5yr Cost (1000s), axis 0 to 4000, by capacity (5TB, 10TB, 20TB, 40TB, 80TB). Series: Perf RAID, Bulk RAID, PH 9840C, PH 9940B]

Storage Technologies

Bulk RAID Storage Considerations
  Manageability
    Server connectivity constraints
    Many “islands” of storage
    Multiple storage management utilities
    Multiple vendor maintenance contracts
  Data Reliability
    Loss of online file system requires full restore from backup
      On average, could restore one to two terabytes per day
  Performance
    Multiple user access will reduce performance
  Life Cycle
    Disk storage life cycle shorter than tape technologies

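The restore estimate on this slide translates directly into outage time for a lost file system. The sketch below assumes a steady rate between the slide's one and two terabytes per day. The 10 TB file system size in the example is hypothetical.

```python
def restore_days(filesystem_tb, rate_tb_per_day):
    """Days needed to restore a file system at a steady restore rate."""
    return filesystem_tb / rate_tb_per_day


def restore_window(filesystem_tb, slow_rate=1.0, fast_rate=2.0):
    """Return (best case, worst case) restore time in days.

    The default rates are the one and two terabytes per day quoted on the slide.
    """
    return (restore_days(filesystem_tb, fast_rate),
            restore_days(filesystem_tb, slow_rate))


if __name__ == "__main__":
    size_tb = 10  # hypothetical file system size
    best, worst = restore_window(size_tb)
    print(f"{size_tb} TB restore: {best:.0f} to {worst:.0f} days")
```

At these rates a hypothetical 10 TB bulk RAID file system would be unavailable for roughly 5 to 10 days after a loss.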
Storage Technologies

SAN Nearline Storage
  Data Access
    Data stored on infinite file system
      Immediate access to data residing on disk cache
      Delayed access for data retrieved from tape
    Access via LAN using FTP/NFS
    Access via SAN Clustered File System
      SGI DMF/CXFS Server
      SGI, SUN, Linux, NT clients

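The disk cache and tape access behavior described on this slide can be illustrated with a toy model. This is not the DMF interface. The class, its method names, and the file name are hypothetical. The model captures only one idea: a read is served immediately when the file is resident on disk cache, and otherwise requires a recall from tape first.

```python
class NearlineStore:
    """Toy model of a disk cache in front of a tape library (hypothetical)."""

    def __init__(self):
        self.tape = {}        # every archived file has a copy on tape
        self.disk_cache = {}  # subset of files currently resident on disk

    def archive(self, name, data):
        """Store a file: it lands on disk cache and is copied to tape."""
        self.disk_cache[name] = data
        self.tape[name] = data

    def release(self, name):
        """Free disk cache space; the tape copy remains."""
        self.disk_cache.pop(name, None)

    def read(self, name):
        """Return (data, source); 'disk' is immediate, 'tape' implies a delay."""
        if name in self.disk_cache:
            return self.disk_cache[name], "disk"
        data = self.tape[name]        # recall from tape (slow in practice)
        self.disk_cache[name] = data  # file is resident again after the recall
        return data, "tape"


if __name__ == "__main__":
    store = NearlineStore()
    store.archive("scene_001.dat", b"pixels")  # hypothetical file name
    print(store.read("scene_001.dat")[1])      # served from disk cache
    store.release("scene_001.dat")
    print(store.read("scene_001.dat")[1])      # recalled from tape
```

Because released files still appear in the namespace, the file system looks effectively unlimited to its users, and only the access latency differs.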
Storage Technologies

SAN Cluster Proposal
  Mass Storage System & Product Distribution System (PDS)
  Limit Exposure to Risk
    Servers are homogeneous
    Implement with single dataset
    Data is file oriented
    Data is currently transferred via FTP
  Anticipated Benefits
    Improved performance
    Reduce total disk capacity requirements
    Experience for future storage solutions

Current DMF/SAN Configuration
[Diagram labels: Product Distribution; CXFS; SAN Storage; DMF Server]
Disk Cache
  /dmf/edc   68GB
  /dmf/doqq  547GB
  /dmf/guo   50GB
  /dmf/pds   223GB
  /dmf/pdsc  547GB
Tape Drives
  8x9840
  2x9940

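For reference, the five disk cache sizes listed on this slide can be totaled. This sketch assumes the listed sizes are the complete set of DMF disk cache file systems.

```python
# File system sizes in GB, as listed on the slide.
DISK_CACHE_GB = {
    "/dmf/edc": 68,
    "/dmf/doqq": 547,
    "/dmf/guo": 50,
    "/dmf/pds": 223,
    "/dmf/pdsc": 547,
}

total_gb = sum(DISK_CACHE_GB.values())

if __name__ == "__main__":
    print(f"Total disk cache: {total_gb} GB")  # 1435 GB
```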
CR1 Mass Storage System
Nearline Data Storage
[Chart: Terabytes Stored (axis 0 to 48), Dec-93 through Dec-02]

CR1 Mass Storage System
Nearline Data Storage by Data Type
[Chart: Terabytes Stored (axis 0 to 24), Dec-93 through Dec-03. Series: General, Archive, Ortho, PDS]

CR1 Mass Storage System
Nearline Data Storage
[Chart: Terabytes Stored (axis 0 to 100), Dec-93 through Dec-04. Series: General, Archive, Ortho, PDS, Total]

CR1 Mass Storage
Nearline Monthly Average Data Archive/Retrieve
[Chart: Terabyte Per Month (axis 0 to 9), 1993 through 2003. Series: Data Archived, Data Retrieved]

CR1 Mass Storage
Nearline Average Transfer Rate
[Chart: MB/Sec (axis 0 to 8), 1994 through 2003. Series: Data Archived, Data Retrieved]

CR1 Mass Storage
Largest Single Day Data Transfers
[Chart: Gigabyte (axis 0 to 1400) for 1996, 1999, 2002, 2003. Series: Data Archived, Data Retrieved. Annotation: Av 12.1MB/sec]
Description
  1996 – 3490, pre DOQQ
  1999 – D-3, DOQQ
  2002 – 9840, DOQQ
  2003 – 9840/9940, UA/AVHRR

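To relate the "Av 12.1MB/sec" annotation to the gigabyte scale of this chart, the rate can be converted to a daily volume. This is a minimal sketch under two assumptions that the slide does not state: the rate is an average sustained over a full 24-hour day, and units are decimal (1 GB = 1000 MB).

```python
SECONDS_PER_DAY = 24 * 60 * 60


def daily_volume_gb(rate_mb_per_sec):
    """Convert a sustained MB/sec rate to GB moved per day (decimal units)."""
    return rate_mb_per_sec * SECONDS_PER_DAY / 1000


if __name__ == "__main__":
    print(f"{daily_volume_gb(12.1):.0f} GB per day")
```

Under those assumptions, 12.1 MB/sec corresponds to a little over 1000 GB in one day, which falls within the chart's 0 to 1400 gigabyte axis.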
CR1 DMF FY04 Budget
Description                        Estimated Cost
StorageTek Maintenance                 $41,000.00
SGI Maintenance (O300, DMF/SAN)        $22,000.00
Sun Maintenance                         $1,300.00
ITS Charges (Labor, Legato)            $20,000.00
Infrastructure Upgrades                $41,700.00
Project Staff                          $64,000.00
Total                                 $190,000.00

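The budget line items can be checked against the stated total with a short script. The pairing of descriptions with amounts follows the order in which they appear on the slide, which is an assumption about the original table layout.

```python
# FY04 line items in dollars, paired in slide order (assumed layout).
BUDGET = {
    "StorageTek Maintenance": 41_000,
    "SGI Maintenance (O300, DMF/SAN)": 22_000,
    "Sun Maintenance": 1_300,
    "ITS Charges (Labor, Legato)": 20_000,
    "Infrastructure Upgrades": 41_700,
    "Project Staff": 64_000,
}

STATED_TOTAL = 190_000

computed_total = sum(BUDGET.values())

if __name__ == "__main__":
    print(f"Computed total: ${computed_total:,}")
    print("Matches stated total:", computed_total == STATED_TOTAL)
    for item, cost in BUDGET.items():
        print(f"{item}: {cost / computed_total:.1%} of budget")
```

The six items sum to the $190,000 total shown on the slide.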
Storage Technologies

Multi Tiered Storage Vision
  Online
    Supported Configurations
      DAS – Local processing such as image processing
      NAS – Data sharing such as office automation
      SAN – Production processing such as product generation
    Data accessed frequently
  Nearline
    Integrated within SAN
    Scalable for large datasets and less frequently accessed data
    Multiple Copies and/or Offsite Storage

Storage Technologies

SAN – Final Thoughts
  SAN Technology Maturity
    SAN solution should be from a single vendor
  Program/Project SAN solution benefits
    + Decrease storage requirements
    + Increase performance
    + Increase reliability
    + Increase flexibility of resource allocations
    - Increase cost (hardware/software)
    - Increase configuration complexity