Top 10 Database tuning tips

Download Report

Transcript Top 10 Database tuning tips

Top 10 Oracle database
tuning tips and techniques
Guy Harrison
The top 10
Be methodical and empirical
Optimize your database design
Index Wisely
Write efficient code
Optimize the optimizer
Tune SQL and PL/SQL
Monitor and Manage contention
Optimize memory to reduce IO
Tune IO last, but TUNE it well
2
Dell Software Group
Be methodical and
empirical
Dell - Restricted - Confidential
Hint 1: Methodical and Empirical tuning
4
Methodical:
• Have a plan
• Focus on root causes, not symptoms
• Hypothesis and experiment, not trial and error
error
Empirical
• Measure performance
• Compare before and after
• What is your control group?
• Beware of “old DBA tales” and “silver bullets”
bullets”
Dell Software Group
Hint 1: Methodical and Empirical tuning
5
Performance
Troubleshooting
• Identify the element of your workload which is
contributing most to service time
• Workload could be SQL or logical transaction
• Element is typically CPU time + Oracle wait time
time
• Find a way to reduce that element
Tuning by layers
• As above, but prioritize higher layers of the stack (which
(which have flow on effects into lower layers)
• Distinguish between symptoms and causes
Dell Software Group
Basics of a database request
6
Dell Software Group
The 4 layers in database performance
7
Dell Software Group
What’s wrong with the
donkey?
8
Dell Software Group
Measurement
• Standard SQL views
• AWR reports
• DB control/Cloud control
• Toad/Spotlight other 3rd party tools
• A single query can tell you a lot
http://guyharrison.net/OPSGSample
s/Ch03/timeModelSimple.sql
9
Dell Software Group
Database design
Dell - Restricted - Confidential
Database design
Third Normal form
(3NF)
3NF discretion
Denormalization
11
Dell Software Group
• The key, the whole key and nothing but the key
• OR – Don’t repeat yourself (DRY)
• Datatypes (VARCHARs vs LOBS, etc)
• Subtypes
• NULLS
• Replicate column values to avoid joins
• Summary tables and materialized views
• Vertical partitioning
3NF is mostly Don’t Repeat Yourself (DRY)
• We don’t want to repeat the student name
every time they take a new test
• We don’t want to repeat the test_name for
every student that takes it
• We don’t want the “repeating group” of
answers
– Also, we might not know how many questions
there will be in future tests
12
Dell Software Group
3NF version
• Students take tests
• Tests have questions
• Students taking tests provide answers
13
Dell Software Group
Don’t go to far!
• In this example the designer pulled address into a
separate table and city, country into another table.
• It’s correct in theory, but it means that every time
we want an employees address we have to join three
tables.
• It would have been better to have just one table
14
Dell Software Group
Logical to Physical: Subtypes
“Customers are people too”
15
Dell Software Group
Subtypes
• It’s usually better to put all the data in one table, or have two tables for each subtype
• Having a “supertype” table and two “subtype” tables means you always have to join
tables
Probably don’t
want to do this
16
Dell Software Group
Denormalization
17
Dell Software Group
Index Wisely
Dell - Restricted - Confidential
Indexes
Indexes exist mainly to improve performance
• Choosing the best set of indexes is crucial
• Create the best set of Concatenated (multi-column) indexes that will support
support your queries.
• Indexes slow down transactions, so don’t create too many
Think of indexes on a table like indexes in a book
• You use the index when you want to look up one thing or a few things
things
• You don’t use the index to read the whole book or read a whole chapter
chapter
• You do use an index when you want to go to a specific page
19
Dell Software Group
Concatenated index effectiveness
• The more columns in the
index the less IO it takes to
resolve a query using all the
columns
last,first,birthyear,id 3
last,first,BirthYear 4
• The order of the columns
affects how the index can be
used
last+first name 6
last name
63
None
1459
0
200
400
600
800 1000 1200 1400 1600
Logical IO
20
Dell Software Group
• If the index has all the
columns, don’t need to touch
the table at all
SELECT cust_id
FROM sh.customers c
WHERE cust_first_name = 'Connor'
AND cust_last_name = 'Bishop'
AND cust_year_of_birth = 1976;
Index overhead
7
16,316
Number of indexes
6
14,285
5
• This chart shows how inserts
slow down as more and more
indexes are added
12,727
4
10,719
3
• Deleting a row with lot’s of
indexes is particularly
expensive
8,691
2
6,671
1,191
1 (PK only)
0
2,000
4,000
6,000
8,000 10,000 12,000 14,000 16,000 18,000
Logical reads required
21
Dell Software Group
• Indexes speed up queries,
but make DML (insert,
update, Delete) slower
Index or FTS?
1000
Elasped Time (s)
100
10
Break even points
Full Scan no caching
Index sorted data, no caching
Index unsorted, cached data
Full Table scan, cached data
1
0
10
20
30
40
50
Pct of table accessed
22
Dell Software Group
60
70
80
90
100
Application code
Dell - Restricted - Confidential
Database coding guidelines
SQL
Execution
Transaction
design
Network
overhead
24
Dell Software Group
• Don’t call the database unless you
have to – cache data instead.
• Reduce “hard” parsing by using bind
variables
• Minimize lock duration using optimistic
and Pessimistic locking strategies
• Array fetch and Insert
• Stored procedures
Parse overhead
• It’s easy enough in most programming languages to create a new SQL
every time you execute the query:
25
Dell Software Group
Better to use the same SQL with different
arguments:
• Prepare the statement once, then execute many
times with different arguments
• Using bind variables
26
Dell Software Group
Reduction in parse (SQL Compile) time
Bind Variables
No Bind variables
0
200
400
600
Parse Time
800
Other
1,000 executions of the code on preceding two
slides in Oracle
27
Dell Software Group
1000
1200
1400
Designing transactions
• There are two ways to design transaction locking
• Pessimistic works best if you think someone else is going to “grab” your row before
you’re finished
• Optimistic works best if you thing no one else will “grab” the row
Dell Software Group
Duration of lock
Duration of lock
28
Reduce Network traffic
• “Round trips” to the database can add a lot of overhead
• Two ways to reduce round trips:
– Use the “Array” interface in your program code
– Use stored procedures for complex interactions with the database
29
Dell Software Group
Array fetch performance
40,000
35,000
30,000
25,000
20,000
15,000
10,000
5,000
0
0
20
40
60
80
Array fetch size
Logical Reads
30
Dell Software Group
100
Network round trips
120
140
Network – stored procedures
• A stored procedure is code stored in the
database
• If you have a transaction that goes “back and
forth” to the database, consider a stored
procedure
• ESPECIALLY if you are working across a slow
network
31
Dell Software Group
Optimize the optimizer
Dell - Restricted - Confidential
Table and index
Structure
Cardinality
Estimates
Object Statistics
DB parameters
And config
Optimizer inputs
IO and CPU
Estimates
• Remember: Garbage In : Garbage
Out
• The optimizer can only do a good
job if statistics are up to date
33
Dell Software Group
System Statistics
Cost estimate
Histograms
• Indexes are only good for getting
small amounts of the table
20,000
18,000
Number of rows
16,000
• So it might be a good idea to use
an index to get “New Zealand”
customers, but not “United
States”
14,000
12,000
10,000
8,000
6,000
4,000
2,000
Saudi Arabia
South Africa
Turkey
New Zealand
Denmark
Argentina
Singapore
Japan
Poland
China
Australia
Brazil
Canada
Spain
France
United Kingdom
Italy
Germany
United States of America
0
34
Dell Software Group
• A histogram allows the optimizer
to understand how the data is
distributed and make the best
decision
• Create histograms to correct bad
plans on skewed tables
Without a histogram
Number of rows the
estimated
Real number of
rows
35
Dell Software Group
Optimizer chooses
not to use an index
With a histogram
Optimizer estimate
is correct
Optimizer chooses
to use an index
36
Dell Software Group
DBMS_STATS
GATHER_INDEX_STATS , GATHER_SCHEMA_STATS,
GATHER_TABLE_STATS
Basic statistics
method_opt
=> 'FOR ALL INDEXED COLUMNS SIZE AUTO'
Histograms
method_opt
=>
Multi-column extended statistics
'FOR ALL COLUMNS FOR COLUMNS (col1, col2)'
method_opt
=> 'FOR ALL COLUMNS FOR COLUMNS
(function(column)))'
Expression extended statistics
DBMS_STATS.gather_system_stats (
gathering_mode => 'NOWORKLOAD');
Non-workload system statistics
DBMS_STATS.gather_system_stats
(gathering_mode
=> 'INTERVAL',
interval
=> 60);
Workload system statistics
37
Dell Software Group
Tune SQL and
PL/SQL
Dell - Restricted - Confidential
Find them and tune them
39
Find SQL
•
•
•
•
Mine V$SQL
ASH, AWR tables
Database control, grid/cloud control
Toad, Spotlight
Tune SQL
•
•
•
•
•
SQL Profiles
SQL Baselines
Rewrites
Hints (be careful)
SQL Optimizer
Dell Software Group
V$SQL is your friend
• Also, V$SQL_PLAN,
V$SQL_PLAN_STATISTICS
• Tkprof, session trace
40
Dell Software Group
41
Dell Software Group
Now let’s look at the
donkey
42
Dell Software Group
Monitor and manage
contention
Dell - Restricted - Confidential
Contention – the proverbial bottleneck
Application
Demand for DB
services
44
Dell Software Group
Contention for limited or
serialized resources causes
waits and or queuing
Apparent
demand at lower
layers is reduced
Types of contention
Locks
Latches and
Mutex
Buffer
contention
Latches are very light
weight locks that protect
memory instead of tables
When sessions have to
wait for a memory
“buffer” to become
available
Usually locking problems
are due to application
locks (remember
optimistic locking)?
Sometimes internal locks
can cause problems.
45
Dell Software Group
Latches and mutex contention
• Latches are like locks, but
instead of protecting table rows,
they protect memory (buffers)
user
user
user
• If two sessions try to access the
same area of memory, then one
will wait
Buffers
• Instead of “sleeping” (like a lock)
they waiting session will “spin”
on the CPU for a very short time
• Latch problems may indicate
“hot” blocks
• They might cause CPU drain
(because of spinning)
46
Dell Software Group
Database
files
Free buffer waits
Write to
buffers
User
Buffers
Buffer
Waits
Read
from
• When buffers are modified they are
called “dirty” – DBWR writes to disk
buffers
Read
from
disk
Database
Writer
Write
dirty
blocks
to disk
Database
files
47
Dell Software Group
• Database buffers improve performance
by caching data in memory
• When all the blocks are “dirty” then
sessions have to wait for the buffers to
be written before new data can be read
• This might mean that your DBWR can’t
write to disk fast enough
Specific contention scenarios
Lock/Latch
Possible cause
Library cache mutex
Hard parsing no bind variables. Try
CURSOR_SHARING=SIMILAR
Library cache pin
PL/SQL package parsing and invalidations
Shared pool latch
Hard parsing and possibly excessive SGA resizing
Cache buffers chains
Hot blocks, very high logical read rates
Free Buffer waits
DBWR failing to “keep up” with block changes
Flashback buf free
Insufficient IO bandwidth for flashback area
Buffer busy
Hot blocks – partitioning might help
48
Dell Software Group
Optimize memory to
reduce IO
Dell - Restricted - Confidential
Memory is primarily used to avoid
IO
• Buffer cache avoids IO to table/index
tablespaces
Oracle Session
50
Buffer pools
Program Global
Area (PGA)
Sort area
Hash Area
Data (USER)
tablespace
Temporary
tablespace
Dell Software Group
• PGA avoids IO to temporary tablespace
Temp tablespace IO can easily overwhelm
table/index IO
Time
Multi-pass
Disk Sort
Single Pass
Disk Sort
Memory Sort
Available Memory
More Memory
Table/Index IO
51
Dell Software Group
CPU Time
Less Memory
Temp Segment IO
Automatic Memory Management
• Introduced in 11g, AMM manages allocations between and within
PGA and SGA
52
Dell Software Group
AMM can (rarely?) lead to thrashing or starvation
• ALWAYS set minimum values for key components of the SGA
53
Dell Software Group
Tune IO last, but tune
it well
Dell - Restricted - Confidential
IO Tuning
• Disk IO is the slowest part of the database system, so it’s critical to
performance
• FIRST:
– Tune SQL and application
– Remove contention
– Allocate memory
• Only when you’ve done that will your IO demand be realistic. Then
you can tune your IO
55
Dell Software Group
Basics of IO tuning (magnetic disks)
• The amount of IO you can support depends on the number of disks you have
• Provide enough disks to support the amount of IO you need
• even if that means the disks are not filled with data
• Magnetics disks can do between 75-150 IO per second (IOPS) – do the math!
SAS 15,000 rpm
SAS 10,000 rpm
SATA 10,000 rpm
SATA 7,200 rpm
0
20
40
60
80
100
IO/ps
56
Dell Software Group
120
140
160
180
200
Disk Drive latency
100
90
80
Response Time (ms)
• Latency is the time taken to perform a single
70
IO
• Most disk drives can return an IO in 5-10 ms60
– Faster if the disk has a memory cache
• Disk latency increases with:
– Throughput (best at 50-75% of maximum)
– Disk fill (best at 50% capacity)
• To get best performance disks should be
– Sparsely populated
– Under only moderate load
• RAID 0+1 is the preferred configuration
50
40
30
20
10
0
0
100
200
300
IO/second
57
Dell Software Group
400
500
The more that things change....
58
Dell Software Group
Disks are not getting faster
59
Dell Software Group
SSD - Cheaper by the IO
SSD DDR-RAM
15
SSD PCI flash
25
SSD SATA Flash
80
Magnetic Disk
4,000
0
60
Dell Software Group
1,000
2,000 3,000
Seek time (us)
4,000
5,000
But not by the
GB
12
10
10
10
2.9
2.3
2.2
7.4
8
1.7
$$/GB
$$/GB
1.3
6
1
2011
1
5.3
2013
2012
2014
4
3.2
0.35
2.9
0.28
2.2
2.3
1.7
0.21
2
Dell Software Group
1.3
0.17
1
0.35
0.28
0.21
0.17
0.13
0.13
0
0.12011
2012
2013
2014
2015
HDD
61
2015
MLC SDD
SLC SSD
Tiered storage management
Main Memory
DDR SSD
$/GB
Fast Disk (SAS, RAID 0+1)
Slow Disk (SATA, RAID 5)
Tape, Flat Files, Hadoop
62
Dell Software Group
$/IOP
Flash SSD
Random reads – FusionIO
CPU
Table on SSD
Other
121
DB File IO
Flash cache IO
SAS disk, flash
cache
583
SAS disk, no
flash cache
2,211
0
63
Dell Software Group
500
1000
1500
Elapsed time (s)
2000
2500
Full table scans
Table on SSD
72
Flash cache doesn’t accelerate
Full table scans b/c scans use
direct path reads and flash
cache only accelerates
buffered reads
SAS disk, flash cache
CPU
398
Other
DB File IO
Flash Cache IO
SAS disk, no flash cache
418
0
64
Dell Software Group
100
200
300
Elasped time (s)
400
500
Disk Sorts – temp tablespace SSD vs HDD
4000
3500
Multi-pass
Disk Sort
2500
2000
1500
Single Pass
Disk Sort
1000
500
0
300
65
Dell Software Group
250
200
150
Sort Area Size
SAS based TTS
100
SSD based TTS
50
0
Elapsed time (s)
3000
Concurrent redo workload (x10)
Flash based redo log
1,637
331
1,681
CPU
Other
Log File IO
SAS based redo log
1,605
0
66
Dell Software Group
500
1,000
397
1,500
1,944
2,000 2,500 3,000
Elapsed time (s)
3,500
4,000
4,500
Tune RAC
Dell - Restricted - Confidential
RAC will scale well,
providing that….
a) The time taken to
transfer a block
across
Load is reasonably
well balanced
the
interconnect
across the
instances
in theiscluster
much less than the
time taken to read
The overhead of maintaining cluster
from disk
consistency does not dominate
overall response time
68
Dell Software Group
Toad: your
companion
in the Big
Data
revolution
69
Dell Software Group
Thank you
Dell - Restricted - Confidential