EDC CR1 Storage Architecture
August 2003
Ken Gacke
Systems Engineer
(605) 594-6846
[email protected]
U.S. Department of the Interior
U.S. Geological Survey
Contractor for the USGS
at the EROS Data Center
Storage Architecture Decisions
Evaluated and recommended through engineering white papers and weighted decision matrices
Requirements Factors
  Reliability – Data Preservation
  Performance – Data Access
  Cost – $/GB, Engineering Support, O&M
  Scalability – Data Growth, Multi-mission, etc.
  Compatibility with current Architecture
Program/Project selects best solution
Storage Technologies
Online Storage Characteristics
  Immediate data access
  Server limitations
    Number of I/O slots
    System bandwidth
  Cost is linear
    High-performance RAID – $30/GB using 146GB drives
    Low-cost RAID – $5/GB using ATA or IDE drives
    Non-RAID – less than $5/GB using 146GB drives
  Facility costs
    Disk drives are always powered up
    Increased cooling requirements
  Life cycle of 3 to 4 years
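The "cost is linear" point can be sketched with the per-GB figures quoted above; this simple capacity-times-rate model is an illustration only (the non-RAID rate is an assumed value under the slide's "less than $5/GB"), not the actual EDC cost model:

```python
# Per-GB rates from the slide; "non_raid" is an assumed figure
# (the slide only says "less than $5/GB").
COST_PER_GB = {
    "perf_raid": 30.0,   # high-performance RAID, 146GB FC drives
    "bulk_raid": 5.0,    # low-cost RAID, ATA/IDE drives
    "non_raid": 4.0,     # assumed illustrative value
}

def online_cost(capacity_gb, tier):
    """Linear cost model: doubling capacity doubles cost."""
    return capacity_gb * COST_PER_GB[tier]

# 10TB of high-performance RAID vs low-cost RAID
print(online_cost(10_000, "perf_raid"))  # 300000.0
print(online_cost(10_000, "bulk_raid"))  # 50000.0
```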
Storage Technologies
Online Storage
Direct Attached Storage (DAS)
  Storage directly attached to server
Network Attached Storage (NAS)
  TCP/IP access to storage, typically with CIFS and NFS access
Storage Area Network (SAN)
  Dedicated high-speed network connecting storage devices
  Storage devices disassociated from server
Storage Technologies
Direct Attached Online Storage
Disk is directly attached to a single server
System Configuration
  SCSI or Fibre Channel (RAID Fibre Channel devices are typically SAN-ready)
  Just a Bunch of Disks (JBOD)
  Redundant Array of Independent Disks (RAID)
High performance on the local server
Manageability
  Simple configuration
  Resource reallocation requires a physical move of controllers and disk
Storage Technologies
Direct Attached Online Storage
Advantages
  High performance on the local server
  Good for image processing and database applications
Disadvantages
  Data sharing limited to slower network performance
  Difficult to reallocate resources to other servers
Storage Technologies
Direct Attached
[Diagram: Hosts A, B, and C each own a dedicated file system over 100MB/s Fibre Channel; data sharing between hosts occurs only over a 100Mb network (FTP/NFS)]
Storage Technologies
NAS Online Storage
Disk attached to a server, accessible over a TCP/IP network
System Configuration
  Fibre Channel RAID configurations
  Switched network environment
Performance
  Network switches and/or dedicated network topologies
Reliability
  NAS server performs a single function, thereby reducing faults
  RAID, mirror, and snapshot capabilities
Easy to manage
Storage Technologies
Network Attached Online Storage
Advantages
  Easy to share files among servers
  Network storage supports NFS and CIFS
  Servers can use existing network infrastructure
  Good for small file sharing such as office automation
  Availability of fault protection such as snapshot and mirroring
Disadvantages
  Slower performance due to TCP/IP overhead
  Increases network load
  Backup/Restore to tape may be difficult and/or slow
  Does not integrate with nearline storage
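The TCP/IP-overhead disadvantage can be made concrete with a rough wire-time comparison across the link speeds shown in the deck's diagrams (100Mb network, 1Gb network, ~100MB/s Fibre Channel). The efficiency factors here are assumptions for illustration, not measured values:

```python
# Rough transfer-time estimate for a 50GB dataset. The 60% effective
# throughput for TCP/IP (NFS/FTP) and 90% for Fibre Channel are assumed
# illustrative factors, not benchmarks.
def transfer_hours(size_gb, link_mbit_per_s, efficiency=0.6):
    effective_mb_per_s = (link_mbit_per_s / 8) * efficiency
    return (size_gb * 1024) / effective_mb_per_s / 3600

print(round(transfer_hours(50, 100), 2))       # 100Mb Ethernet (NFS/FTP)
print(round(transfer_hours(50, 1000), 2))      # 1Gb Ethernet (NAS)
print(round(transfer_hours(50, 800, 0.9), 2))  # ~100MB/s Fibre Channel
```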
Storage Technologies
Network Attached
[Diagram: Hosts A, B, and C share files from a NAS server over a 1Gb network (NFS/CIFS); the file systems reside on the NAS server]
Storage Technologies
SAN Online Storage
Disk attached within a fabric network
System Configuration
  Fibre Channel
  RAID configurations
Scalable high performance
High reliability with redundant paths
Manageability
  Configuration becomes more complex
  Logical reallocation of resources
Storage Technologies
Redundancy SAN Configuration
[Diagram: Hosts A, B, and C (DMF) reach SAN storage through dual Fibre Channel switches, providing redundant paths; hosts interconnect over a 100Mb network]
Storage Technologies
SAN Online Storage Architecture
Disk Farm
  Multiple servers share a large disk farm
  Each server mounts unique file systems
Clustered File Systems
  Multiple servers share a single file system
  Software required – vendor solutions include:
    SGI CXFS
    ADIC StorNext File System
    Tivoli SANergy
Storage Technologies
Disk Farm SAN Configuration
[Diagram: Hosts A, B, and C share a large disk farm through a Fibre Channel switch; disk can be logically reallocated among hosts; hosts interconnect over a 100Mb network]
Storage Technologies
Cluster SAN Configuration
[Diagram: Hosts A, B, and C run CXFS and share a single clustered file system through a Fibre Channel switch; hosts interconnect over a 100Mb network]
Storage Technologies
SAN Risks
  Cost is higher than DAS/NAS
  Technology Maturity
    Solutions are typically vendor specific
    Application software dependencies
  Infrastructure Support
    Complexity of architecture
    Management of SAN resources
    Sharing of storage resources across multiple Programs/Projects
Storage Technologies
SAN Benefits
  Administration flexibility
    Logically move disk space among servers
    Large-capacity drives can be sliced into smaller file systems
    Scales better than direct attached storage
    Integrates within a nearline configuration
  Data Reliability
    Storage disassociated from the server
    Fault tolerant with redundant paths
  Increased Resource Utilization
    Reduces the number of FTP network transfers
    Logically allocate space among servers
Storage Technologies
SAN with Nearline Configuration
[Diagram: Hosts A, B (CXFS), and C (DMF/CXFS) share a clustered file system through a Fibre Channel switch; a tape library behind the DMF server provides nearline storage; hosts interconnect over a 1Gb network]
Online/Nearline Cost Comparison
Use of Existing Infrastructure (CR1 Silo)
[Chart: 5-year cost (in $1000s, 0 to 4000) versus capacity (5TB, 10TB, 20TB, 40TB, 80TB) for Performance RAID, Bulk RAID, PH 9840C, and PH 9940B]
Storage Technologies
Bulk RAID Storage Considerations
Manageability
  Server connectivity constraints
  Many “islands” of storage
  Multiple storage management utilities
  Multiple vendor maintenance contracts
Data Reliability
  Loss of an online file system requires a full restore from backup
  On average, could restore one to two terabytes per day
Performance
  Multiple user access will reduce performance
Life Cycle
  Disk storage life cycle is shorter than tape technologies
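The restore-rate estimate above translates directly into outage time; this arithmetic sketch (illustrative only) turns the quoted one-to-two-terabytes-per-day rate into restore days, and notes that 1TB/day corresponds to roughly the 12.1MB/sec peak rate cited later in the deck:

```python
# Turn the slide's quoted restore rate (1-2 TB/day) into an outage
# estimate for a given file system size; purely illustrative arithmetic.
def restore_days(capacity_tb, tb_per_day):
    return capacity_tb / tb_per_day

# A hypothetical 20TB bulk-RAID file system at the quoted rates:
print(restore_days(20, 1.0))  # 20.0 days (worst case)
print(restore_days(20, 2.0))  # 10.0 days (best case)

# Sanity check: 1TB/day sustained is about 12.1 MB/sec
print(round(1 * 1024**2 / 86400, 1))  # 12.1
```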
Storage Technologies
SAN Nearline Storage
Data Access
  Data stored on a virtually infinite file system
  Immediate access to data residing on disk cache
  Delayed access for data retrieved from tape
  Access via LAN using FTP/NFS
  Access via SAN Clustered File System
    SGI DMF/CXFS server
    SGI, Sun, Linux, NT clients
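The access pattern above, immediate for cache-resident data and delayed for tape-resident data, can be sketched as a simple dispatch. The function, paths, and sets below are hypothetical illustrations, not DMF's actual API:

```python
# Minimal sketch of the nearline access pattern: files on the DMF disk
# cache are served immediately; files migrated to tape incur a recall
# delay. Paths, sets, and the access() helper are hypothetical.
DISK_CACHE = {"/dmf/pds/scene1.tif"}   # resident on disk cache
TAPE_ONLY = {"/dmf/doqq/quad42.tif"}   # migrated to 9840/9940 tape

def access(path):
    if path in DISK_CACHE:
        return "immediate: read from disk cache"
    if path in TAPE_ONLY:
        return "delayed: recall from tape, then read"
    raise FileNotFoundError(path)

print(access("/dmf/pds/scene1.tif"))
print(access("/dmf/doqq/quad42.tif"))
```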
Storage Technologies
SAN Cluster Proposal
Mass Storage System & Product Distribution System (PDS)
Limit Exposure to Risk
  Servers are homogeneous
  Implement with a single dataset
  Data is file oriented
  Data is currently transferred via FTP
Anticipated Benefits
  Improved performance
  Reduced total disk capacity requirements
  Experience for future storage solutions
Current DMF/SAN Configuration
[Diagram: Product Distribution hosts access SAN storage via CXFS; the DMF server manages the tape drives (8x9840, 2x9940)]
SAN Storage Disk Cache:
  /dmf/edc 68GB
  /dmf/doqq 547GB
  /dmf/guo 50GB
  /dmf/pds 223GB
  /dmf/pdsc 547GB
CR1 Mass Storage System
Nearline Data Storage
[Chart: Terabytes stored, Dec-93 through Dec-02; y-axis 0 to 48TB]
CR1 Mass Storage System
Nearline Data Storage by Data Type
[Chart: Terabytes stored by data type (General, Archive, Ortho, PDS), Dec-93 through Dec-03; y-axis 0 to 24TB]
CR1 Mass Storage System
Nearline Data Storage
[Chart: Terabytes stored (General, Archive, Ortho, PDS, Total), Dec-93 through projected Dec-04; y-axis 0 to 100TB]
CR1 Mass Storage
Nearline Monthly Average Data Archive/Retrieve
[Chart: Monthly average data archived and retrieved, 1993 through 2003; y-axis 0 to 9 terabytes per month]
CR1 Mass Storage
Nearline Average Transfer Rate
[Chart: Average transfer rate for data archived and retrieved, 1994 through 2003; y-axis 0 to 8 MB/sec]
CR1 Mass Storage
Largest Single Day Data Transfers
[Chart: Gigabytes archived and retrieved on the largest single day, for 1996, 1999, 2002, and 2003; y-axis 0 to 1400GB; best average rate 12.1MB/sec]
Description:
  1996 – 3490, pre DOQQ
  1999 – D-3, DOQQ
  2002 – 9840, DOQQ
  2003 – 9840/9940, UA/AVHRR
CR1 DMF FY04 Budget
Description                        Estimated Cost
StorageTek Maintenance             $41,000.00
SGI Maintenance (O300, DMF/SAN)    $22,000.00
Sun Maintenance                    $1,300.00
ITS Charges (Labor, Legato)        $20,000.00
Infrastructure Upgrades            $41,700.00
Project Staff                      $64,000.00
Total                              $190,000.00
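As a cross-check, the budget line items above do sum to the quoted $190,000 total (figures taken directly from the slide):

```python
# FY04 budget line items as quoted on the slide.
budget = {
    "StorageTek Maintenance": 41_000,
    "SGI Maintenance (O300, DMF/SAN)": 22_000,
    "Sun Maintenance": 1_300,
    "ITS Charges (Labor, Legato)": 20_000,
    "Infrastructure Upgrades": 41_700,
    "Project Staff": 64_000,
}
total = sum(budget.values())
print(total)  # 190000
```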
Storage Technologies
Multi-Tiered Storage Vision
Online
  Supported Configurations
    DAS – Local processing such as image processing
    NAS – Data sharing such as office automation
    SAN – Production processing such as product generation
  Data accessed frequently
Nearline
  Integrated within SAN
  Scalable for large datasets and less frequently accessed data
  Multiple copies and/or offsite storage
Storage Technologies
SAN – Final Thoughts
SAN Technology Maturity
  SAN solution should be from a single vendor
Program/Project SAN solution benefits
  + Decreased storage requirements
  + Increased performance
  + Increased reliability
  + Increased flexibility of resource allocation
  - Increased cost (hardware/software)
  - Increased configuration complexity