DB-05: Database Storage Considerations

Download Report

Transcript DB-05: Database Storage Considerations

DB-05:
Database Storage
Considerations
Adam Backman
White Star Software
[email protected]
Outline
Why is DB storage so important?
 Hardware options

 RAID
vs. JBOD
 Network Storage (SAN, NAS, …)

Database setup options
 Block
size
 Cluster size (DB and BI)
 Splitting Data into areas
DB-05 Database Storage Considerations
© 2005 White Star Software
Why is DB Storage So
Important?
Everything starts from the disk
 Disks need to be reliable and fast
 All physical disks are only capable of doing
approximately 100 operations per second
 Salespeople are weasels

DB-05 Database Storage Considerations
© 2005 White Star Software
Who cares about 100 I/O
operations per second?
Example:
10000 read operations
90% buffer hit rate
1 disk would take 10 seconds to complete
10 disks would take 1 second to complete
Which of these would your users choose?
DB-05 Database Storage Considerations
© 2005 White Star Software
Stop I/O before it gets to the
disks

Application code
 Use
indexes
 Watch out for can-find
 Bulk updates vs. many small updates

Use memory not disk
 Add
system memory (no swapping or paging)
 Increase buffers (-B, -Bt, -Bp)
 Add cache to disk controllers
DB-05 Database Storage Considerations
© 2005 White Star Software
A little more about buffer hit rate
Using same example: 10000 Logical reads
90% buffer hit rate = 1000 Physical reads
95% buffer hit rate = 500 Physical reads
96% buffer hit rate = 400 Physical reads
See the effect buffer efficiency has on disks!
DB-05 Database Storage Considerations
© 2005 White Star Software
What causes disk I/O?
Operating system (swapping and paging)
 Progress

 Database
(DB and BI)
 Application (code and temp files)

Other applications
DB-05 Database Storage Considerations
© 2005 White Star Software
Hardware Options

RAID vs. JBOD
 Buying
disks
 What is variance
 Simplicity and cost vs. control and budget

Network vs. Locally Attached Storage
 Definitions
 Configuration
 Pros
and Cons
DB-05 Database Storage Considerations
© 2005 White Star Software
Buying Disks

Buy small disks
Each disk regardless of it’s size is capable of
doing the same number of I/Os per second

Buy fast disks
Slow disk = slow performance
Buy reliable disks
 Buy many disks

The inner portion of the disk is 20% faster than
the outer portion of the disk
DB-05 Database Storage Considerations
© 2005 White Star Software
Balancing Disk I/O
Balancing disk I/O is the process of making
sure you are using all of the available disk
resources (file systems, disks and
controllers) are working equally as hard at
load. This is also called eliminating
variance. A well tuned system will have less
than a 15% variance across all disks.
DB-05 Database Storage Considerations
© 2005 White Star Software
How can I eliminate variance?
Warning: This is a religious argument.
Everyone has an opinion and everyone is
right.
There are two schools of thought.
 The system should take care of this.
 I will take care of this.
DB-05 Database Storage Considerations
© 2005 White Star Software
The system will take care of this.
Disk I/O can be effectively distributed by the
operating system with the use of smart I/O
systems. Most notably RAID.
Basic Definition of RAID: Redundant Array
of Independent Disks
DB-05 Database Storage Considerations
© 2005 White Star Software
What RAID really means
RAID has many levels. I will only cover a
few.
 RAID 0: This level is also called striping.
 RAID 1: This is referred to as mirroring.
 RAID 5: Most popular RAID level
 RAID 10: This is mirroring and striping.
Also known as RAID 0 + 1
DB-05 Database Storage Considerations
© 2005 White Star Software
Network vs. Local Storage
Network storage options
 Benefits of network storage
 Drawbacks of network storage

DB-05 Database Storage Considerations
© 2005 White Star Software
Network Storage Options

SAN (Storage Area Network)
A dedicated storage network supporting multiple
computers

NAS (Network Attached Storage)
A dedicated storage device on an existing network
including a SAN

iSCSI
A storage networking protocol allows storage traffic to
travel through general-purpose networks instead of the
specialized Fibre Channel SAN
DB-05 Database Storage Considerations
© 2005 White Star Software
Benefits of Network Storage
Centralized data store - buy a SAN, hook up
many servers
 Easier to maintain (backup one place)

DB-05 Database Storage Considerations
© 2005 White Star Software
Drawbacks to Network Storage

More complex performance monitoring
Many servers all making requests at the same time make
it difficult to resolve conflicts

Overall more complex
More things to break

Single point of failure
Many vendors support array to array mirroring to resolve
this issue

Generally more expensive
Unless you have a large number of servers using the
network storage
DB-05 Database Storage Considerations
© 2005 White Star Software
Creative Hardware Solution:
Solid State Disks

PROs
 Reliable
 Fast

CONs
 Expensive
 Really

expensive
Conclusion: Good for logs (AI and BI) and
limited portions (hot spots) of the database
DB-05 Database Storage Considerations
© 2005 White Star Software
I will take care of this
Manual spread of data across disks
 Better control as you can see where the I/O
is going
 More attention by system administrator is
needed
 Administrator must think about reliability as
well

DB-05 Database Storage Considerations
© 2005 White Star Software
Database Setup Options

Database Stuff
 Storage
areas
 Blocksize and Records per block
 Type II areas

BI cluster size
 Effect
of BI cluster size
 Why are APWs affected by BI Cluster size
DB-05 Database Storage Considerations
© 2005 White Star Software
Storage Areas

Benefits
 Greater
control of location of data
 Minimize downtime for utilities
 Stripe some, leave some on straight disks

Drawbacks
 More
things to break
 More complex to monitor
DB-05 Database Storage Considerations
© 2005 White Star Software
Storage Areas - Control
A Storage Area can hold 1 or more data
objects (index, table, schema, …)
 Separate schema from data if possible
 Isolate the logs (AI and BI)
 Try to keep the number of areas
manageable, only add more areas for valid
business reasons

DB-05 Database Storage Considerations
© 2005 White Star Software
Database Block Size
8k for most (4k for NT)
 More in synch with the operating system
 More stuff per read
 Especially good for index reads
 Watch the number of records per block so
you don’t waste block space

DB-05 Database Storage Considerations
© 2005 White Star Software
Records per Block
Can be set to a binary number between 1
and 256
 If it is set too low you will waste space at
the end of each block
 If you set it too high you run the risk of
fragmenting records (most true for records
that grow over time)
 More important to get right for type II areas

DB-05 Database Storage Considerations
© 2005 White Star Software
Setting Records per Block
First do a database analysis.
proutil <dbname> -C dbanalys > filename
Then take the mean record size of the table(s)
you are working with and add 20 bytes for
record and block overhead.
Divide the result above into your block size.
Last take a binary number just above the
result above.
DB-05 Database Storage Considerations
© 2005 White Star Software
Example: Setting Records/Block
Table 1 has a mean record size of 120 bytes
 Add 20 bytes of overhead (140)
 Divide block size (8192) by result (140)

This (58.51) is how many mean-sized records
will fit in a single database block

Take the next higher binary number (64)
Are there exceptions, you bet!
DB-05 Database Storage Considerations
© 2005 White Star Software
Type II Areas
The table and index blocks from individual
objects are clustered together
 The cluster size is “tunable” to 8, 64 or 512
block clusters
 Benefits: Potential performance benefits
 Drawbacks: Yet another thing to try to get
right, potential wasted space

DB-05 Database Storage Considerations
© 2005 White Star Software
How Type II Areas Work
The data blocks are clustered together and in turn
these clusters are chained together
 Scan type operations need only find the first block
and the rest of the blocks will follow
Example: When you do an index rebuild, the first
step is to delete all of the index blocks. In a type 1
area all of the blocks in the area are scanned and
the index blocks are deleted. In a type II area the
first cluster is put on the free chain and no other
updates are needed.

DB-05 Database Storage Considerations
© 2005 White Star Software
Why are APWs Affected by BI
Cluster Size
The BI cluster size determines the length of
the checkpoint
 The APW efficiency determines the
efficiency of the checkpoint
 Generally, I start with 1 apw and the
“correct” BI cluster size and then add 1 apw
if I see buffers flushed at checkpoint during
my “important” processing times.

DB-05 Database Storage Considerations
© 2005 White Star Software
BI Cluster Size
The default value of 512 KB is wrong for
most people
 Monitor the DB during your high update
portion of the day
 Set the cluster size high enough to keep
checkpoint lengths greater than 2 minutes
 Generally, 1 MB to 8 MB for most sites

DB-05 Database Storage Considerations
© 2005 White Star Software
Conclusion

Look at your storage




No RAID 5
RAID or JBOD
To network or not to network
Get your DB setup right




Block size (remember records per block)
Rational split of your data into storage areas
Type II areas (where needed)
Get your BI cluster size right
DB-05 Database Storage Considerations
© 2005 White Star Software
Use Both Methods
Use mirroring to protect your data
 Use stripping to get your data onto more
physical drives
 Use storage areas and extents eliminate
variance
 Use storage areas and extents to further
spread your data

DB-05 Database Storage Considerations
© 2005 White Star Software
Questions
DB-05 Database Storage Considerations
© 2005 White Star Software