DB-05: Database Storage Considerations
Download
Report
Transcript DB-05: Database Storage Considerations
DB-05:
Database Storage
Considerations
Adam Backman
White Star Software
[email protected]
Outline
Why is DB storage so important?
Hardware options
RAID
vs. JBOD
Network Storage (SAN, NAS, …)
Database setup options
Block
size
Cluster size (DB and BI)
Splitting Data into areas
DB-05 Database Storage Considerations
© 2005 White Star Software
Why is DB Storage So
Important?
Everything starts from the disk
Disks need to be reliable and fast
All physical disks are only capable of doing
approximately 100 operations per second
Salespeople are weasels
DB-05 Database Storage Considerations
© 2005 White Star Software
Who cares about 100 I/O
operations per second?
Example:
10000 read operations
90% buffer hit rate
1 disk would take 10 seconds to complete
10 disks would take 1 second to complete
Which of these would your users choose?
DB-05 Database Storage Considerations
© 2005 White Star Software
Stop I/O before it gets to the
disks
Application code
Use
indexes
Watch out for can-find
Bulk updates vs. many small updates
Use memory not disk
Add
system memory (no swapping or paging)
Increase buffers (-B, -Bt, -Bp)
Add cache to disk controllers
DB-05 Database Storage Considerations
© 2005 White Star Software
A little more about buffer hit rate
Using same example: 10000 Logical reads
90% buffer hit rate = 1000 Physical reads
95% buffer hit rate = 500 Physical reads
96% buffer hit rate = 400 Physical reads
See the effect buffer efficiency has on disks!
DB-05 Database Storage Considerations
© 2005 White Star Software
What causes disk I/O?
Operating system (swapping and paging)
Progress
Database
(DB and BI)
Application (code and temp files)
Other applications
DB-05 Database Storage Considerations
© 2005 White Star Software
Hardware Options
RAID vs. JBOD
Buying
disks
What is variance
Simplicity and cost vs. control and budget
Network vs. Locally Attached Storage
Definitions
Configuration
Pros
and Cons
DB-05 Database Storage Considerations
© 2005 White Star Software
Buying Disks
Buy small disks
Each disk regardless of it’s size is capable of
doing the same number of I/Os per second
Buy fast disks
Slow disk = slow performance
Buy reliable disks
Buy many disks
The inner portion of the disk is 20% faster than
the outer portion of the disk
DB-05 Database Storage Considerations
© 2005 White Star Software
Balancing Disk I/O
Balancing disk I/O is the process of making
sure you are using all of the available disk
resources (file systems, disks and
controllers) are working equally as hard at
load. This is also called eliminating
variance. A well tuned system will have less
than a 15% variance across all disks.
DB-05 Database Storage Considerations
© 2005 White Star Software
How can I eliminate variance?
Warning: This is a religious argument.
Everyone has an opinion and everyone is
right.
There are two schools of thought.
The system should take care of this.
I will take care of this.
DB-05 Database Storage Considerations
© 2005 White Star Software
The system will take care of this.
Disk I/O can be effectively distributed by the
operating system with the use of smart I/O
systems. Most notably RAID.
Basic Definition of RAID: Redundant Array
of Independent Disks
DB-05 Database Storage Considerations
© 2005 White Star Software
What RAID really means
RAID has many levels. I will only cover a
few.
RAID 0: This level is also called striping.
RAID 1: This is referred to as mirroring.
RAID 5: Most popular RAID level
RAID 10: This is mirroring and striping.
Also known as RAID 0 + 1
DB-05 Database Storage Considerations
© 2005 White Star Software
Network vs. Local Storage
Network storage options
Benefits of network storage
Drawbacks of network storage
DB-05 Database Storage Considerations
© 2005 White Star Software
Network Storage Options
SAN (Storage Area Network)
A dedicated storage network supporting multiple
computers
NAS (Network Attached Storage)
A dedicated storage device on an existing network
including a SAN
iSCSI
A storage networking protocol allows storage traffic to
travel through general-purpose networks instead of the
specialized Fibre Channel SAN
DB-05 Database Storage Considerations
© 2005 White Star Software
Benefits of Network Storage
Centralized data store - buy a SAN, hook up
many servers
Easier to maintain (backup one place)
DB-05 Database Storage Considerations
© 2005 White Star Software
Drawbacks to Network Storage
More complex performance monitoring
Many servers all making requests at the same time make
it difficult to resolve conflicts
Overall more complex
More things to break
Single point of failure
Many vendors support array to array mirroring to resolve
this issue
Generally more expensive
Unless you have a large number of servers using the
network storage
DB-05 Database Storage Considerations
© 2005 White Star Software
Creative Hardware Solution:
Solid State Disks
PROs
Reliable
Fast
CONs
Expensive
Really
expensive
Conclusion: Good for logs (AI and BI) and
limited portions (hot spots) of the database
DB-05 Database Storage Considerations
© 2005 White Star Software
I will take care of this
Manual spread of data across disks
Better control as you can see where the I/O
is going
More attention by system administrator is
needed
Administrator must think about reliability as
well
DB-05 Database Storage Considerations
© 2005 White Star Software
Database Setup Options
Database Stuff
Storage
areas
Blocksize and Records per block
Type II areas
BI cluster size
Effect
of BI cluster size
Why are APWs affected by BI Cluster size
DB-05 Database Storage Considerations
© 2005 White Star Software
Storage Areas
Benefits
Greater
control of location of data
Minimize downtime for utilities
Stripe some, leave some on straight disks
Drawbacks
More
things to break
More complex to monitor
DB-05 Database Storage Considerations
© 2005 White Star Software
Storage Areas - Control
A Storage Area can hold 1 or more data
objects (index, table, schema, …)
Separate schema from data if possible
Isolate the logs (AI and BI)
Try to keep the number of areas
manageable, only add more areas for valid
business reasons
DB-05 Database Storage Considerations
© 2005 White Star Software
Database Block Size
8k for most (4k for NT)
More in synch with the operating system
More stuff per read
Especially good for index reads
Watch the number of records per block so
you don’t waste block space
DB-05 Database Storage Considerations
© 2005 White Star Software
Records per Block
Can be set to a binary number between 1
and 256
If it is set too low you will waste space at
the end of each block
If you set it too high you run the risk of
fragmenting records (most true for records
that grow over time)
More important to get right for type II areas
DB-05 Database Storage Considerations
© 2005 White Star Software
Setting Records per Block
First do a database analysis.
proutil <dbname> -C dbanalys > filename
Then take the mean record size of the table(s)
you are working with and add 20 bytes for
record and block overhead.
Divide the result above into your block size.
Last take a binary number just above the
result above.
DB-05 Database Storage Considerations
© 2005 White Star Software
Example: Setting Records/Block
Table 1 has a mean record size of 120 bytes
Add 20 bytes of overhead (140)
Divide block size (8192) by result (140)
This (58.51) is how many mean-sized records
will fit in a single database block
Take the next higher binary number (64)
Are there exceptions, you bet!
DB-05 Database Storage Considerations
© 2005 White Star Software
Type II Areas
The table and index blocks from individual
objects are clustered together
The cluster size is “tunable” to 8, 64 or 512
block clusters
Benefits: Potential performance benefits
Drawbacks: Yet another thing to try to get
right, potential wasted space
DB-05 Database Storage Considerations
© 2005 White Star Software
How Type II Areas Work
The data blocks are clustered together and in turn
these clusters are chained together
Scan type operations need only find the first block
and the rest of the blocks will follow
Example: When you do an index rebuild, the first
step is to delete all of the index blocks. In a type 1
area all of the blocks in the area are scanned and
the index blocks are deleted. In a type II area the
first cluster is put on the free chain and no other
updates are needed.
DB-05 Database Storage Considerations
© 2005 White Star Software
Why are APWs Affected by BI
Cluster Size
The BI cluster size determines the length of
the checkpoint
The APW efficiency determines the
efficiency of the checkpoint
Generally, I start with 1 apw and the
“correct” BI cluster size and then add 1 apw
if I see buffers flushed at checkpoint during
my “important” processing times.
DB-05 Database Storage Considerations
© 2005 White Star Software
BI Cluster Size
The default value of 512 KB is wrong for
most people
Monitor the DB during your high update
portion of the day
Set the cluster size high enough to keep
checkpoint lengths greater than 2 minutes
Generally, 1 MB to 8 MB for most sites
DB-05 Database Storage Considerations
© 2005 White Star Software
Conclusion
Look at your storage
No RAID 5
RAID or JBOD
To network or not to network
Get your DB setup right
Block size (remember records per block)
Rational split of your data into storage areas
Type II areas (where needed)
Get your BI cluster size right
DB-05 Database Storage Considerations
© 2005 White Star Software
Use Both Methods
Use mirroring to protect your data
Use stripping to get your data onto more
physical drives
Use storage areas and extents eliminate
variance
Use storage areas and extents to further
spread your data
DB-05 Database Storage Considerations
© 2005 White Star Software
Questions
DB-05 Database Storage Considerations
© 2005 White Star Software