02_AAS - Kyle Hailey
Download
Report
Transcript 02_AAS - Kyle Hailey
Average Active Sessions
(AAS)
The Golden Metric ?
Kyle Hailey
http://perfvision.com
#.1
In this Session
1. AAS
Single Metric
Shows DB Performance
Two methods to calculate
Sampling
Time statistics
2. Yardstick
Max CPU
CPU Count
To measure AAS against
3. Subcomponents
4/9/2016
CPU
Waits
Time series
Copyright 2006 Kyle Hailey
#.2
2
Database Performance
How quick can you find
Bottleneck
in DB
If DB is idle
Current DB Load
what is DB Load ?
What do you use?
Statspack/AWR
V$active_session_history
Alerts
4/9/2016
what the
*!####!*!*? is
the database
doing ?!
what do you alert on ?
#.3
3
Active Session
Database Machine
Shadow
SGA
Shadow
Process
Process
Shadow
Shadow
Process
Process
SQL*
Plus
Shadow
Process
SQL*
Plus
Application
Copyright 2006 Kyle Hailey
#.4
Active Session
Shadow
Process
idle
Database (shadow process)
Query 1 idle
Active
Typing
waiting
typing
Query 2 idle
Query 3
Active
Active
waiting
idle
Time
typing
waiting
Have a coffee
SQL*Plus (ie application)
Copyright 2006 Kyle Hailey
#.5
Active Session
Database
User 1
User 2
User 3
User 4
=
4
3
2
1
Active Sessions
Graph represents # of sessions active, but also represents
amount of time active in the database
#.6
Active Session
4
3
2
1
Active Sessions
For Every Active Session there is a user (or application) waiting
1
2
3
4
Users Waiting
Copyright 2006 Kyle Hailey
#.7
Measuring Active Sessions
Sampling Every Second
...
User 1
User 2
User 3
If happens a lot or for long … we’ll catch it, guaranteed
Fast query run often
Fast query run rarely
Slow query
Copyright 2006 Kyle Hailey
#.8
Sampling is like taking Pictures
Copyright 2006 Kyle Hailey
#.9
Active Session
Database
idle
Query 1 idle
Query 3
Query 2 idle
idle
To execute a query , some CPU will be used but we also might spend time
waiting for IO or on waiting for concurrency resources like latches
idle
Query 1 idle
Query 3
Query 2 idle
CPU
IO
Database
idle
Wait
Activity can thus further be broken down into the type of activity: CPU, IO or WAIT
Work
Latency
Contention
Copyright 2006 Kyle Hailey
#.10
Graphical ASH
Session 1
Session 2
Session 3
Session 4
TIME
#.11
Graph of User States
Copyright 2006 Kyle Hailey
#.12
One Second Graph
Copyright 2006 Kyle Hailey
#.13
15 Second Averages
Copyright 2006 Kyle Hailey
#.14
Maximum CPU Line
Copyright 2006 Kyle Hailey
#.15
Idle Users
Copyright 2006 Kyle Hailey
#.16
OEM Perf Page
Copyright 2006 Kyle Hailey
#.17
AAS in OEM
AAS
LOAD
#.18
The Power ASH gives AAS
DB Home
Performance
Based on TIME
AAS=db
time/elapsed time
(events Statistics)
Top Activity
AAS=count active
users /samples
Based on
ASH
Copyright 2006 Kyle Hailey
#.19
DB TIME = area under the curve
Height = # of Sessions
Width = seconds
Area under curve = DB Time
DB Time =
Σ
n
active sessions(ti) * Δt
0
DB Time = sum of active time in database
Copyright 2006 Kyle Hailey
#.20
AAS Sources
AAS = DB Time/Elapsed Time
1. Manually from
v$sysstat (9i : v$system_event )
2. Statspack
Need several calculations
3. AWR
One calculation
4. OEM 10g
4/9/2016
Directly displayed
Copyright 2006 Kyle Hailey
#.21
21
1. Manually
AAS = DB TIME / Elapsed Time
DB Time (DBT) = Time Spent in Database
DB TIME (10g) = select value from v$sysstat
‘DB time’;
time’
where name = ‘DB
DB TIME (9i) = Select sum(time_waited) from v$system_event
where event not in ( ... idle events …);
+
Select value from v$sysstat
where name = ‘CPU used by this session’;
still need to take delta values
Note: stats$idle_event : 70
v$event_name.wait_class=‘Idle’ :62
Copyright 2006 Kyle Hailey
#.22
2. Statspack AAS
Look for
Elapsed
Time
Top 5 Timed Events
4/9/2016
Start at line 52 of about 1300
Copyright 2006 Kyle Hailey
#.23
23
2. Statspack AAS
Elapsed Time
STATSPACK report for
DB Name
DB Id
Instance Inst Num Release
RAC
Host
------- ----------- -------- -------- ---------- --- ------LABSF03 1420044432
labsf03
1 10.1.0.2.0 NO labsfr
Snap Id
Snap Time
Sessions Curs/Sess
--------- ------------------ -------- --------Begin Snap:
1 03-Apr-06 12:34:06
18
5.6
End Snap:
2 03-Apr-06 12:34:36
18
4.8
Elapsed:
1.00 (mins)
Look at Top 5 Timed Events
Top 5 Timed Events
~~~~~~~~~~~~~~~~~~
Event
--------------------buffer busy waits
CPU time
free buffer waits
write complete waits
log buffer space
4/9/2016
Waits
--------2,748
1,588
10
306
Copyright 2006 Kyle Hailey
Time (s)
--------250
32
15
8
5
% Total
Call Time
------78.72
10.16
4.63
2.51
1.51
#.24
24
2. Statspack AAS
Top 5 Timed Events
DBTIME= CPU + WAITS
CPU = 32
WAITS = 250+15+8+5 = 278 secs
DBTIME=320
Elapsed Time = 60 secs
320 secs / 60 secs
Event
Time (s)
--------------------buffer busy waits
250
CPU time
32
free buffer waits
15
write complete waits
8
log buffer space
5
AAS = 5.1
4/9/2016
Copyright 2006 Kyle Hailey
#.25
25
3. AWR Report
AAS = DB Time/Elapsed Time
23.56/59.66 = 0.39
AAS= 0.39
4/9/2016
#.26
26
4. OEM 10g
•OEM 10g
AAS = ~0.75
4/9/2016
#.27
27
Got AAS, Now What ?
We Need one more item:
CPU Count
# of CPUs available on System
Shared
with other applications
Need to track CPU used on the system as well
On
dual & quad cores, lower the CPU count
Represents max active sessions that can do work
4/9/2016
Copyright 2006 Kyle Hailey
#.28
28
CPU Count
# of CPUs available in
Statspack 10g
AWR report
OEM 10g
Statspack 9i # of CPUs missing # of CPUs
SQLPLUS> show parameters cpu_count
NAME
VALUE
------------------ ---------cpu_count
2
4/9/2016
Copyright 2006 Kyle Hailey
#.29
29
What’s the DB Doing?!
It’s 2am … your manager calls
Whip out the stethoscope:
AAS
what the
*!####!*!*? is the
database doing ?!
4/9/2016
Copyright 2006 Kyle Hailey
#.30
30
AAS Formulas
Use CPU count as yardstick:
AAS < 1
Database is not blocked
AAS ~= 0
Database basically idle
Problems are in the APP not DB
Ideal world –
one database
solution
track CPU at OS
AAS < # of CPUs
CPU available
Are any single sessions 100% active?
AAS > # of CPUs
Could have performance problems
AAS >> # of CPUS
AAS > 1
still want to know
if a single user is
100% active
There is a bottleneck
4/9/2016
Copyright 2006 Kyle Hailey
#.31
31
Available CPU vs AAS
Statspack
DBTIME= CPU + WAITS
CPU = 32
WAITS = 250+15+8+5 = 278 secs
DBTIME=320
Elapsed Time = 60 secs
320 secs / 60 secs
AAS = 5.1
# of CPU = 2
AAS far above
available CPU
=> problem
AAS = 5.1
AWR Report
AAS = 0.39
# of CPU = 2
AAS < 1 , database is fine
AAS = 0.75
# of CPU = 2
AAS < 1 , database is fine
4/9/2016
Copyright 2006 Kyle Hailey
#.32
32
Going Farther with AAS
AAS can tell you a lot
But it’s components tell you much more
To go farther need the components of AAS
1. CPU
2. Wait
3. Value over time
4/9/2016
Only OEM 10g shows the value over time, Statspack and AWR
are aggregated over the snapshot period
Copyright 2006 Kyle Hailey
#.33
33
EM DB Home Page
Copyright 2006 Kyle Hailey
#.34
OEM 10g Perf Pages
DB Home
AAS Point in Time
Performance
AAS over Time
Copyright 2006 Kyle Hailey
#.35
AAS Components : OEM 10g
OEM 10g
Performance Page
Real CPU available:
Max CPU - non instance CPU
4/9/2016
Available CPU
AAS:
CPU + WAIT
#.36
36
OEM 10g
Relax
Get to Work!
Looks
OK
But …
4/9/2016
#.37
37
Idle Database – Perf Page
Value of proving the database is
Idle
It’s the Databases Fault
How many times do you hear that?
Database Idle
No load on database
Database “performance” is fine
Under utilized
Problem lies elsewhere
Saved me time and stress many
times
And weeks of debate about where the
problem is coming from
Copyright 2006 Kyle Hailey
#.38
More than AAS and #CPU
Knowing your DB Profile
Copyright 2006 Kyle Hailey
#.39
When to Tune
1. Machine
CPU
a)
Response times skewed
100% CPU might be fine
Users wait in queue (run queue) => machine
underpowered
Memory
b)
Paging
Wait times skewed (ex : latch free)
Erratic response times ( ex : ls )
Host
2. Oracle
Oracle Load
(AAS)
Waits > CPU ?
1)
tune waits
AAS >
#CPU
CPU > 100% ?
2)
tune top CPU SQL
AAS > 1
Waits >
CPU
Top Session
Top Wait
SQL Detail
Wait Detail
Else
3)
CPU
Memory
CPU >
Waits
Top SQL
It’s the application
Object Detail
Session Detail
File Detail
Copyright 2006 Kyle Hailey
#.40
Limited Analysis
What if you find a problem ?
Of the 800 waits which in order to solve most need to know
What SQL
Which sessions
Values of P1, P2 and P3
Statspack and AWR fail
AAS = DB TIME / Elapsed Time
But there is another way …
4/9/2016
#.41
41
AAS based on ASH
ASH - Active Session History
v$active_session_history
Samples
sessions once a second
AAS = count(*) / elapsed_seconds
A
statistical approximation, but surprisingly close
ASH data source empowers drilldowns
Top
Sql
Top Waits
Details p1,p2,p3 and more
4/9/2016
Copyright 2006 Kyle Hailey
#.42
42
ASH
vs
Statistics
Statistics
are
more expensive
have lag time
lack clear identification of culprits
Copyright 2006 Kyle Hailey
#.43
Statistic Lag Time
Samples (ASH)
Slight Lags
Statistics
Copyright 2006 Kyle Hailey
#.44
CPU Lag Problem
ASH is the only way to see CPU usage realtime
V$sysstat reports CPU but
is
only updated at the end of the call.
Long calls look deceiving like no CPU is being used
Time Model also reports CPU
Updated
quicker
Copyright 2006 Kyle Hailey
#.45
CPU in ASH vs Stats
Copyright 2006 Kyle Hailey
#.46
2. OEM 10g : Top Activity
•Top Activity
•Based on ASH
•Enables Drilldowns
•Top SQL
•Top Session
•Drill into a session
• Stats
• Raw waits
• Open cursors
• General info
•Drill into a SQL
• Stats and text
• Users executing
• Explain plan
• Tuning options
4/9/2016
#.47
47
Top Activity : Based on ASH
missing
Thanks
To
ASH
Copyright 2006 Kyle Hailey
#.48
AAS – %Session Time Issue
Whenever AAS > 1
I want to know if any
one session is 100% active
Unfortunately OEM isn’t set up for this
(ASHMON, my free tool, is )
Copyright 2006 Kyle Hailey
#.49
Top Activity: ASH Sessions
Many Users Active
On Performance Page, no way to tell how many users
But Top Activity Page fixes that
Copyright 2006 Kyle Hailey
#.50
Top Activity: ASH Sessions
Two Users Active
Shown in % DB Time
Missing % Session Time
Copyright 2006 Kyle Hailey
#.51
OEM 10g Perf Pages
DB Home
Top Activity
Performance
SQL
Top Activity
Session
SQL
Session
Copyright 2006 Kyle Hailey
#.52
Session : ASH Activity
Copyright 2006 Kyle Hailey
#.53
SQL : ASH Activity
Copyright 2006 Kyle Hailey
#.54
AAS from ASH
1. ASHRPT
Based entirely on
v$active_session_history
@?/rdbms/admin/ashrpt.sql
Exec ASH_REPORT_TEXT/HTML
select * from table
(dbms_workload_repository.ash_report_text(
(select dbid from v$database),
1,
sysdate – 1/24,
sysdate )) ;
4/9/2016
Copyright 2006 Kyle Hailey
#.55
55
1. ASHRPT
ASH Report For TESTDB/testdb
DB Name
DB Id
Instance
Inst Num Release
RAC Host
------------ ----------- ------------ -------- ----------- --- -----------TESTDB
2371570538 testdb
1 10.2.0.1.0 NO sdbe604a
CPUs
SGA Size
Buffer Cache
Shared Pool
ASH Buffer Size
---- ------------------ ------------------ ------------------ -----------------2
1,000M (100%)
468M (46.8%)
112M (11.2%)
4.0M (0.4%)
Analysis Begin Time:
21-Apr-06 12:00:01
Analysis End Time:
21-Apr-06 12:05:01
Elapsed Time:
5.0 (mins)
Sample Count:
3,716
Average Active Sessions:
12.39
Avg. Active Session per CPU:
6.19
Report Target:
None specified
Top User Events
DB/Inst: TESTDB/testdb (Apr 21 12:00 to 12:05)
Avg Active
Event
Event Class
% Activity
Sessions
----------------------------------- --------------- ---------- ---------CPU + Wait for CPU
CPU
67.98
8.42
enq: TX - row lock contention
Application
23.98
2.97
buffer busy waits
Concurrency
4.66
0.58
latch: cache buffers chains
Concurrency
2.26
0.28
4/9/2016
Copyright 2006 Kyle Hailey
#.56
56
1. ASH RPT
1) General info
2) Top User Events ***
3) Top Background Events
4) Top Event P1/P2/P3 Values
5) Top Service/Module
6) Top Client IDs
7) Top SQL Command Types
8) Top SQL Statements ***
4/9/2016
9) Top SQL using literals
10) Top Sessions ***
11) Top Blocking Sessions
12) Top Sessions running PQs
13) Top DB Objects
14) Top DB Files
15) Top Latches
16) Activity Over Time ***
Copyright 2006 Kyle Hailey
#.57
57
1. ASHRPT over Time
Waits over Time
Not
in AAS
Difficult but better
than nothing
Compare to …
4/9/2016
#.58
58
3. Custom Scripts
Hate Graphics ?
Query v$active_session_history directly
Join to dba_hist_active_sess_history for week of data
act.sql
Like
top 5 timed events
Aveact.sql
Charts
with text AAS by hour (15 minute, minute , etc)
Aveactn.sql
Ditto,
with top 2 wait events per bucket
Following Scripts Available on
http://perfvision.com/ashscripts.php
4/9/2016
#.59
59
3. Custom Scripts
@act
Analysis Begin Time :
Analysis End
Time :
Start time, mins ago:
Request Duration
:
Collections
:
Data Values
:
Elapsed Time: 15 mins
2007-07-24 11:04:48
2007-07-24 11:19:45
15
15
528
3327
WAIT_EVENT
CNT
% Active Ave_Act_Sess
-------------------------------------- ---------- -----------latch free
10
.3
.02
log buffer space
13
.39
.02
buffer busy waits
14
.42
.03
db file scattered read
15
.45
.03
library cache pin
78
2.34
.15
log file sync
213
6.40
.40
ON CPU
726
21.82
1.38
enqueue
855
25.70
1.62
db file sequential read
1399
42.05
2.65
-----------sum
6.30
4/9/2016
Copyright 2006 Kyle Hailey
#.60
60
3. Custom Scripts
@aveact
TM
NPTS AVEACT GRAPH
---------------- ------ ------- ---------------------06-AUG 13:00:00
270
.33 +2
06-AUG 14:00:00
1040
2.24 ++--------2--06-AUG 15:00:00
623
6.67 ++++------2---------06-AUG 16:00:00
1088
2.59 ++--------2---06-AUG 17:00:00
1104
1.26 ++----2
06-AUG 18:00:00
1093
1.38 +++---2
06-AUG 19:00:00
1012
1.74 ++------- 2
06-AUG 20:00:00
1131
.99 +---2
06-AUG 21:00:00
1111
1.22 ++----2
06-AUG 22:00:00
1010
1.66 ++------ 2
06-AUG 23:00:00
1120
1.08 +---2
07-AUG 00:00:00
1024
.83 +--2
07-AUG 01:00:00
1006
1.74 ++------- 2
07-AUG 02:00:00
1090
2.47 ++--------2---07-AUG 03:00:00
687
6.59 +++-------2---------07-AUG 04:00:00
1004
1.95 ++++++--- 2
07-AUG 05:00:00
1104
3.08 +++++-----2-----07-AUG 06:00:00
1122
1.91 +++++++-- 2
4/9/2016
Copyright 2006 Kyle Hailey
CPU WAITS
---- ----29
59
341 1984
438 3718
335 2486
349 1043
663
842
373 1388
304
820
344 1012
414 1259
298
913
273
576
319 1428
347 2345
382 4142
1299
659
1170 2226
1582
558
#.61
61
Aveact.sql
Copyright 2006 Kyle Hailey
#.62
Aveact.sql
“-” = WAIT
“+” = CPU
which waits ? -> aveactn.sql
Copyright 2006 Kyle Hailey
#.63
Aveactn.sql
Copyright 2006 Kyle Hailey
#.64
Summary
AAS: Two Sources
v$system_event & v$sysstat
1.
Performance page in OEM
Statspack, awr report
Indirect – based on time converted to AAS
Accurate
Lags (especially CPU)
Limits analysis
Top activity in OEM
ashrpt
Direct measure of AAS
Real time
Approximation
***Allows drilldowns***
Tuning:
CPU
Memory
Oracle Load
(AAS)
AAS >
#CPU
AAS >
1
v$active_session_history
2.
Waits >
CPU
Top Session
Top Wait
SQL Detail
Wait Detail
Object Detail
ADDM
CPU >
Waits
Top SQL
Session Detail
File Detail
SQL Tuning
Advisor
Machine first – CPU, Memory
Oracle
4/9/2016
Host
AAS ~= 0 Idle
AAS > 1 : possible sessions blocked
AAS > # CPU : bottleneck
Copyright 2006 Kyle Hailey
#.65
65
Q1
What is the easiest way to find the load on the database
b. Average Active Sessions takes into
account all the major criteria in Oracle
a. top 5 timed eventsperformance - cpu usage, wait time, elapsed
time and CPU count
b. Average Active Sessions
others:
c. The transaction rate
d. The commit rate a. lacks the elapsed time
c & d - the amount of load created by
commits and transactions varies drastically
between database to database and even
between different loads on a database during
the day
Copyright 2006 Kyle Hailey
#.66
Q2
Average active sessions, the measurement in the
performance chart in OEM 10g can be calculated by which
formula(s)
a. db time / elapsed time
b. wait time / elapsed time
answer
c. count of rows in v$active_session_history over an
a,c,d
interval/ number of samples in interval
c, d are equivalent because ASH samples
d. count of rows in v$active_session_history / seconds of
once a second
elapsed time
others
b doesn't work because it is missing CPU
time
Copyright 2006 Kyle Hailey
#.67
Q3
If Average Active Session is less than one, we can say
a. the database is using less than 100% CPU on machine
b. the database is using 100% CPU
c. No session is completely blocked
d. Database performance should be acceptable
answer
a, c and d
Copyright 2006 Kyle Hailey
#.68
Q4
An Average Active Session value of zero
means
a. the database is down
b. the database is hung
c. the database is idle
d. the database is overloaded
answer
c
Copyright 2006 Kyle Hailey
#.69
Q5
CPU is updated in v$sysstat
a. when call finishes
b. every second
c. every five seconds
d. in real time
a - which can be a problem in
monitoring tools if the call last a long
time showing no CPU usage until the
call finishes, then reporting an
unrealistic spike of CPU when the
call finishes
Copyright 2006 Kyle Hailey
#.70