05_waits_intro

Download Report

Transcript 05_waits_intro

Part II : Waits Events
and the
Geeks who love them
Kyle Hailey
http://perfvision.com
#.2
WaitEvents
Events
Wait
Copyright 2006 Kyle Hailey
#.3
And the Geeks Who Love Them
Copyright 2006 Kyle Hailey
#.4
In this Presentation:
 Introduction to Waits
 Tuning Methodology
 Plan
of Action
 Statspacks, AWR or OEM for Collection Data
 Based on Waits
 Using Waits to Solve Bottlenecks
Copyright 2006 Kyle Hailey
#.5
Database is Hung!
Everybody blames the database
Yet 9 out of 10 dba’s agree it’s not the
database
How do you prove it to management?
On the off chance it’s the database, what do
we do?
Copyright 2006 Kyle Hailey
Database:
Guilty until proven innocent
*$%@!!
Copyright 2006 Kyle Hailey
#.6
#.7
Oracle’s Defense
 After years of false accusations
 Oracle took action and created a defense
system:
WAIT EVENTS
To the rescue
 Oracle is the best instrumented database on
the market which can save time and money
on development and tuning
Copyright 2006 Kyle Hailey
Oracle Instrumentation
CPU
Redo
Locks
Lib
Cache
Buffer
Cache
Network
IO
Copyright 2006 Kyle Hailey
#.8
Waits
 Introduced in v7
 Revolutionized tuning
 Changed
from Ratio Guesswork to empirical
measure of time lost to bottlenecks
 10g added the crucial addition ASH
 Not




only identifies bottlenecks but
Who (session, service, package, procedure)
Where (CPU, Wait)
When (time)
What (SQL statement)
Copyright 2006 Kyle Hailey
#.9
#.10
Tuning Methodology
1. Machine

Run queue (CPU)



Check other applications
reduce CPU usage or add CPUs
Paging

Reduce memory usage or add memory
2. Oracle
Waits + CPU > Available CPU



We are going to
concentrate here
CPU 100%
 Tune SQL
on WAITS
Else low waits, available CPU then

Tune waits

It’s the application
Copyright 2006 Kyle Hailey
#.11
Dependable Tuning Strategy
Determine AAS :
 Run Statspack or AWR Report
 Top

5 Timed Events
~50 lines down from top
 Need


Available CPU
Elapsed Time
CPU_COUNT
 ASH Report : ashrpt.sql
 OEM 10g
 Performance
Page does everything
If there is a wait bottleneck tune the wait
Copyright 2006 Kyle Hailey
#.12
Tuning Methodology Graphics
Relax, it’s the
application
Get to Work!
Copyright 2006 Kyle Hailey
#.13
Waits beyond OEM
 OEM identifies Wait problems
 Provides solutions with ADDM sometimes
 But


What do you do when ADDM isn’t sufficient?
What do you do if you don’t have OEM 10g?
 Waits



Need to know about waits
How they work
How to analyze them
Copyright 2006 Kyle Hailey
Wait Areas
Buffer Cache
I/O
Locks
Waits
Library Cache
Redo
SQL*Net
We’ll discuss Waits in these logical database areas
Copyright 2006 Kyle Hailey
#.14
Wait Tree
IO
Buffer Cache
Write IO
Read IO
Rollback
Buffer Busy
Free lists
Cache Latches
IO Read
Library Cache
Library Cache
Waits
#.15
Shared Pool
Lock
TX Row Lock
Redo
TX ITL Lock
SQL Net
HW Lock
Log File
Log Buffer
Log File Sync
Copyright 2006 Kyle Hailey
v$active_session_history
 When ADDM fails or we don’t have ADDM we can
collect the necessary information from
 v$active_session_history
Session (user, service, client, package, procedure, etc)
 SQL statement
 For IO related waits


CURRENT_OBJ# ,CURRENT_FILE# ,CURRENT_BLOCK#
Blocking_Session
 P1
 P2
 P3

Copyright 2006 Kyle Hailey
#.16
#.17
What are P1,P2,P3 ?
 Each Wait has a 3 parameters P1,P2,P3
 Give detailed information
 Meaning different for each wait
 Meaning definitions in V$event_name
colSelect
parameter1 for a10
col parameter2name,
for a10
col parameter3parameter1,
for a10
parameter2,
select parameter1
,parameter2 , parameter3
parameter3
from v$event_name
from v$event_name;
where
name = '&1';
Copyright 2006 Kyle Hailey
Wait Arguments Example
#.18
select parameter1 ,parameter2 , parameter3
from v$event_name;
NAME
PARAMETER1
PARAMETER2
PARAMETER3
------------------------------ ----------- --------------- --------------latch: cache buffers chains
free buffer waits
buffer busy waits
latch: redo copy
log buffer space
switch logfile command
log file sync
db file sequential read
enq: TM - contention
undo segment extension
enq: TX - row lock contention
row cache lock
library cache pin
library cache load lock
pipe put
address
file#
file#
address
buffer#
file#
name|mode
segment#
name|mode
cache id
handle address
object address
handle address
number
block#
block#
number
tries
set-id#
class#
tries
block#
object #
blocks
table/partition
usn<<16 | slot
mode
pin address
lock address
record length
sequence
request
100*mode+namesp
100*mask+namesp
timeout
Copyright 2006 Kyle Hailey
#.19
Wait Analysis requires p1,p2,p3
 Of the top 30 wait events 8 can be solved
without ASH
free buffer waits
log buffer space
log file switch (archiving needed)
log file switch (checkpoint incomplete)
log file switch completion
log file sync
switch logfile command
write complete waits
 The rest need
 Sql_id
and/or P1,P2,P3
Copyright 2006 Kyle Hailey
#.20
Difficult Waits
These 4 waits have multiple causes
 Latches

p2 = latch # (p1= address, p3= tries)
 Locks

p1 = lock type and mode ( p2 = id1, p3= id2)
 Buffer Busy
p3 = block class#, p1= file, p2=block
 (in 9i p3 was the bbw type)

 Row Cache Lock

p1 = cache id (p2 = mode, p3=request)
Copyright 2006 Kyle Hailey
#.21
Wait Analysis
 Find SQL waiting
 Most
often the tuning answer lies in looking at what
the application is doing, and changing it
 Find extended wait information
 Parameter1,
Parameter2, Parameter3
 Sometimes the wait events that are found are
not in the documentation and it takes some
educated guesswork to figure out the problem
#.22
Waits we will Ignore
One thing that makes waits difficult is knowing
which ones to look at and which ones to
ignore.
 Background
 Idle
 Resource Manager
Copyright 2006 Kyle Hailey
Background Processes
PMON
SMON
SGA
Log
Library Buffer
Buffer Cache
Cache
Buffer
Cache
Log Buffer
DBWR
LGWR
User1
User2
User3
REDO Log Files
Data Files
Copyright 2006 Kyle Hailey
#.23
#.24
Background & Foreground
 Background Processes





DBWR
LGWR
PMON
SMON
Etc
 Foreground Processes




SQL*Plus
Pro*C
SQL*Forms
Oracle applications
Only interested in Foreground waits
Copyright 2006 Kyle Hailey
#.25
Background Waits
 ASH
 Avoid
Background waits in ASH with
Select …from v$active_session_history
where SESSION_TYPE='FOREGROUND'
 V$session_wait joined to v$session
select
from
…
v$session
s,
v$session_wait w
where w.sid=s.sid
and s.type='USER'
Copyright 2006 Kyle Hailey
Idle Waits
 Filtered Out of ASH by default
 10g
 where
wait_class != ‘Idle’
 Create a list
Select name from v$event_name where
wait_class=‘Idle’;
 9i
 Create



a list with
Documentation
List created from 10g
Stats$idle_events from statspack
SQL*Net message from client
Copyright 2006 Kyle Hailey
#.26
#.27
Parallel Query Waits
Filter Out
 Parallel Query Wait events are unusable
 Save
waits are both idle and waits
 Parallel Query Waits start with ‘PX’ or ‘KX’
 PX
Deq: Par Recov Reply
 PX Deq: Parse Reply
Copyright 2006 Kyle Hailey
#.28
Resource Manager Waits
 Resource manager throttles user
 Creates
wait
 Obfuscates problems
 10g
select name from v$event_name where
wait_class='Scheduler';
Copyright 2006 Kyle Hailey
#.29
RAC Waits
RAC waits are certainly interesting but will be covered
outside of this presentation.




You are on your own
Check documentation
If you are not using RAC then no worries
10g
Select event from v$event_name where
wait_class=‘Cluster’;
 9i

RAC and OPS waits usually contain the word “global”
Copyright 2006 Kyle Hailey
#.30
Latches
 Protect areas of memory from concurrent use
 Light weight locks
 Bit
in memory
 Atomic processor call
 Fast and cheap
 Gone if memory is lost
 Often used in cache coherency management
 Changes
to a data block
 Exclusive Generally
 Sharing
reading has been introduced for some latches
Copyright 2006 Kyle Hailey
#.31
Finding Latches
 “latch free”

Covers many latches, find the problem latch by
1. select name from v$latchname where latch# = p1;
OR
2.
Find highest sleeps in Statspack latch section
 In 10g, important latches have a wait event



latch: cache buffers chains
latch: shared pool
latch: library cache
Copyright 2006 Kyle Hailey
#.32
Enqueues aka Locks
 “Enqueue” wait – covers all locks pre 10
 Protect data against concurrent changes
 Lock info written into data structures
 Block
headers
 Data blocks
 Written in cache structures
 Shareable in compatible modes
Copyright 2006 Kyle Hailey
#.33
Locks 10g
 10g breaks all Enqueues out
enq:
 enq:
 enq:
 enq:
 enq:
 enq:

HW
TM
TX
TX
TX
UL
-
contention
contention
allocate ITL entry
index contention
row lock contention
contention
Copyright 2006 Kyle Hailey
Configuration
Application
Configuration
Concurrency
Application
Application
Row Cache Lock
 Need p1 to see the cache type
SQL> select cache#, parameter from v$rowcache;
CACHE#
---------1
4
2
0
5
6
7
3
8
17
12
PARAMETER
-------------------------------dc_free_extents
dc_used_extents
dc_segments
dc_tablespaces
dc_tablespace_quotas
dc_files
dc_users
dc_rollback_segments
dc_objects
dc_global_oids
dc_constraints
Copyright 2006 Kyle Hailey
#.34
#.35
Row Cache Lock
 Statspack
^LDictionary Cache Stats for DB: ORA9 Instance: ora9 Snaps: 1 -2
->"Pct Misses" should be very low (< 2% in most cases)
->"Cache Usage" is the number of cache entries being used
->"Pct SGA"
is the ratio of usage to allocated size for that cache
Get
Pct
Scan
Pct
Mod
Final
Cache
Requests
Miss
Reqs Miss
Reqs
Usage
----------------- --------- ------ ------- ----- -------- ---------dc_object_ids
45
0.0
0
0
958
dc_objects
89
0.0
0
0
1,129
dc_segments
69
0.0
0
0
807
dc_tablespaces
12
0.0
0
0
13
dc_usernames
22
0.0
0
0
19
dc_sequences
120,003
0.0
0
120,003
5
Copyright 2006 Kyle Hailey
#.36
Additional Support
 AWR Tables – on disk for 7 days by default

DBA_HIST_ACTIVE_SESS_HISTORY


DBA_HIST_SEG_STAT


Important for getting avg wait times
DBA_HIST_SQLSTAT


Good for ITL and buffer busy wait
DBA_HIST_SYSTEM_EVENT


1 in 10 ASH samples
sql execution deltas
DBA_HIST_SYSMETRIC_SUMMARY

Statistics avg, max, min
 Metric Tables – in memory deltas

V$EVENTMETRIC
Copyright 2006 Kyle Hailey
#.37
All Events over 7 days
select count(*), event from
( select event from DBA_HIST_ACTIVE_SESS_HISTORY
where sample_time < ( select min(sample_time) from
v$active_session_history)
union all
select event from v$active_session_history
)
group by event
order by event
/
Copyright 2006 Kyle Hailey
Example ASH Query
#.38
Select ash.p1,
ash.p2,
CURRENT_OBJ#||' '||o.object_name objn,
o.object_type otype,
CURRENT_FILE# filen,
CURRENT_BLOCK# blockn,
ash.SQL_ID,
w.class ||' '||to_char(ash.p3) block_type
from v$active_session_history ash,
( select rownum class#, class from v$waitstat ) w,
all_objects o
where event='buffer busy waits'
and w.class#(+)=ash.p3
and o.object_id (+)= ash.CURRENT_OBJ#
and ash.session_state='WAITING'
and ash.sample_time > sysdate - &1/(60*24)
Order by sample_time
P1
P2 OBJN
OTYPE FILEN BLOCKN SQL_ID
BLOCK_TYPE
-- ------ --------------------- ----- ----- ------ ------------- ------1 112796 66053 BBW_INDEX_VAL_I INDEX
1 112796 6avm49ys4k7t6 data block 1
1 112401 66053 BBW_INDEX_VAL_I INDEX
1 112401 5wqps1quuxqr4 data block 1
1 112796 66053 BBW_INDEX_VAL_I INDEX
1 112796 5wqps1quuxqr4 data block 1
1 113523 66053 BBW_INDEX_VAL_I INDEX
1 113523 5wqps1quuxqr4 data block 1
Copyright 2006 Kyle Hailey
#.39
Average Wait Times Historic
select
btime,
(time_ms_end-time_ms_beg)/nullif(count_end-count_beg,0) avg_ms
from (
select
to_char(s.BEGIN_INTERVAL_TIME,'DD-MON-YY HH24:MI') btime,
total_waits count_end,
time_waited_micro/1000 time_ms_end,
Lag (e.time_waited_micro/1000)
OVER( PARTITION BY e.event_name ORDER BY s.snap_id) time_ms_beg,
Lag (e.total_waits)
OVER( PARTITION BY e.event_name ORDER BY s.snap_id) count_beg
from
DBA_HIST_SYSTEM_EVENT e,
BTIME
AVG_MS
DBA_HIST_SNAPSHOT s
-------------------- -----------where
08-JAN-08 01:00
1.017
s.snap_id=e.snap_id
08-JAN-08 02:00
.720
and e.event_name= '&1'
08-JAN-08 03:00
.621
order by begin_interval_time
08-JAN-08 04:00
1.747
)
08-JAN-08 05:00
1.046
order by btime;
08-JAN-08 06:00
Copyright 2006 Kyle Hailey
1.444
Avg Wait times now
#.40
select
en.name,
(time_waited)/nullif(wait_count,0) avg_ms,
wait_count
from
v$eventmetric e,
v$event_name en
where
e.event# = en.event#
and en.name like '%&1%‘;
NAME
------------------------db file sequential read
db file scattered read
db file parallel write
AVG_MS WAIT_COUNT
---------- ---------.658863707
6420
.549427419
186
.089073438
64
Copyright 2006 Kyle Hailey
#.41
Object Translation
 Object ID
 File # and Block #
#.42
Wait interface Weaknesses
 Logons
EM 10g shows these on perf page
 Time model helps







V$SYS_TIME_MODEL
 connection management call elapsed time
I’ve had problems
Paging/Memory issues
CPU starvation
Null Events
Bugs – read external table reports CPU

http://blog.tanelpoder.com/
Copyright 2006 Kyle Hailey
#.43
Dependable Tuning Strategy
 Run Statspack/AWR report
 Top

5 Timed Events
~50 lines down from top
 Need


Available CPU
Elapsed Time
CPU_COUNT
 OEM 10g
 Performance
Page does everything !
 OEM doesn’t solve the problem
 Query
v$active_session_history directly
Copyright 2006 Kyle Hailey
Summary
 Waits make Tuning Easy
Check Machine Health
 Tune Waits
 Tune CPU



Tune SQL
Change Application Architecture
 Use
OEM10g
 Statspack/AWR,
 S/ASH

 Ignore Background, Idle, Resmgr, PQO
 Use ASH if OEM fails
 See http://perfvision.com for more info
Copyright 2006 Kyle Hailey
#.44