stats_gathering_oak_121202
Guiding Practices
for
Gathering Database Statistics
Martin Widlake
Database Architecture, Performance & Training
Ora600 Limited
http://mwidlake.wordpress.com
www.ora600.org.uk
ORA600 Ltd
1/45
Abstract
• Guiding practices for Database Statistics gathering
• The Cost Based Optimizer continues to improve and stats
gathering is now more efficient than ever - but it still seems to
be that most Oracle sites struggle with performance issues due
to poor stats. It's like the annoying, embarrassing rash that
simply won't go away. I will cover the options available and
general principles for sorting out the stats issue, which should
lead to more stable and good performance, i.e. a more
comfortable life. This should calm the annoying rash and give
you some potential treatments should it flare up again.
Who am I and why am I doing this Talk?
• 20+ years of Oracle experience, mostly as a developer, development
DBA, Architect, Performance guy.
• Tested the CBO under V7.3 and became a cautious advocate of it
in V8. Been fighting the issues since!
• I keep getting pulled into designing “better” methods of gathering
stats for clients and, frankly, I’d rather do other things {thus the
presentations and blog posting telling everyone what I know}.
• I am of the opinion that, overall, the CBO now gets 99% of SQL
execution plans good enough, if the stats are good.
• Stats gathering can be quite interesting. Honest!
These slides will be on the
UKOUG web site
I am going to talk around some slides (the
ones with pictures on and key points) and skip
over some - as we all get tired of reading
powerpoint slides in presentations.
The others are there to fill in the chat.
Ask questions, Email me
[email protected]
mwidlake.wordpress.com
Quick Quiz
• What is the most common version of Oracle you currently use? (8, 9,
10.1, 10.2, 11.1, 11.2)
• What is the latest version you use in production?
• Who relies on the Automated Stats Collection job on their database?
(If “Yes”, have you altered the schedule?)
• Who has intermittent performance issues when code goes bad either
“over night” or after stats are collected?
• Who has a site-crafted stats gathering regime?
• (If your site wrote its own, did it take 2X, 4X, 8X or more of the effort
you expected to get it right?)
Possibly the Single Most Common
Cause of Poor Database Performance
Poor or missing object statistics are probably the
single most common and easily fixed cause of poor
database performance.
Most issues with individual SQL statements
performing poorly are fixed by gathering accurate
statistics on the tables involved.
The worst of all situations is to have statistics on a
few tables. The Cost Based Optimiser is invoked
and has to use very poor defaults for everything
else.
In my opinion, the introduction of the automated stats
gathering process with Oracle 10g was probably the
single greatest performance enhancement by Oracle Corp
in the last 15 years. And I don’t really like it.
Automatic Stats Collection
[Diagram: flow of the Auto Stats Job]
• Preparation: FLUSH_DATABASE_MONITORING_INFO
• Feeding the job: Global and Table Prefs (stale_pct, est_pct, met_op, degree...),
DBA_TAB_MODIFICATIONS (10% by default), SYS.COL_USAGE$,
Data Dictionary Information, Existing Statistics
• OBJ_FILTER_LIST (stale and empty)
• Runs gathers in the scheduled window
GATHER_DATABASE_STATS_JOB_PROC
From the 10g Tuning guide: The GATHER_DATABASE_STATS_JOB_PROC procedure collects
statistics on database objects when the object has no previously gathered
statistics or the existing statistics are stale because the underlying object
has been modified significantly (more than 10% of the rows). The
DBMS_STATS.GATHER_DATABASE_STATS_JOB_PROC is an
internal procedure, but it operates in a very similar fashion to the
DBMS_STATS.GATHER_DATABASE_STATS procedure using the
GATHER AUTO option. The primary difference is that the
DBMS_STATS.GATHER_DATABASE_STATS_JOB_PROC procedure
prioritizes the database objects that require statistics, so that those
objects which most need updated statistics are processed first. This
ensures that the most-needed statistics are gathered before the
maintenance window closes.
That last sentence is the only major change in the 11g documentation.
Automated DBMS_STATS Job
• If it works for you, then fine, leave it be and work on something else. If
it almost works for you, fix the exceptions, leave the main job alone
and work on something else.
• If you have a VLDB (or you downloaded this as you had an issue with
stats gathering) it is almost certainly not good enough for you.
• It is an attempt at a single solution to work for every situation and it
does not. Even Oracle Corp have admitted it simply does not
work for VLDBs; it chokes on large objects.
• Turn it off (maybe leave it running for DICTIONARY stats) and write the
replacement. You will write something that does a lot of what this job
does. Your replacement will almost certainly be more complex than
you initially plan. Sorry.
• There is no one single solution to stats gathering that is
right for any large, complex system. {Sorry again}
DBA_TAB_MODIFICATIONS
• All inserts, updates, deletes and truncate operations on monitored
tables are flushed to this table. So V10 upwards, that is everything.
• Under V9 flushed every 3 hours, under V10.1 every 15 minutes,
under V10.2/V11 it is not automatically flushed.
• It is flushed to by schema- or database-level DBMS_STATS GATHER calls or
by calling DBMS_STATS.FLUSH_DATABASE_MONITORING_INFO
• It seems generally accurate but I have witnessed it missing the odd
“insert-into-select” statement. And it does not see direct-insert activity.
{It does not capture direct inserts, appends, things that avoid the
SQL layer.}
• Increments, including over database restarts.
• Row is deleted when stats are gathered on the Table OR PARTITION
(and only at the correct level).
show_tab_mods
-- show_tab_mods
-- Martin Widlake 11-nov-07
-- quick check on recent table changes
-- NB flush if want up-to-date info (15 min interval)
set lines 100 pause on
col obj_name form a50
col no_ins form 9999,999
col no_upd form 9999,999
col no_del form 999,999
select table_owner||'.'||table_name||'-'||partition_name obj_name
,inserts no_ins
,updates no_upd
,deletes no_del
,substr(truncated,1,1) T
-- hh24:mi:ss gives the time; 'hh:mm:ss' prints the month in the minutes
-- position, as seen in the sample output
,to_char(timestamp,'YYMMDD hh24:mi:ss') last_flush
from dba_tab_modifications
where timestamp > sysdate -31
and table_owner not like 'SYS%'
order by timestamp desc
/
clear colu

OBJ_NAME                                   NO_INS    NO_UPD NO_DEL T LAST_FLUSH
--------------------------------------- --------- --------- ------ - ----------------
MIDDLEOFFICE.POSITIONLIQUIDATIONAUDIT       9,588         0      0 N 080602 10:06:03
MIDDLEOFFICE.POSITIONKEEPINGGROUPS              2         1      0 N 080602 10:06:03
GATEWAY.TREE_RELATIONS                         44         0     43 N 080602 10:06:03
COMMHOME.COMM_TRAD_PCTTRAD_BAND                 4         0      4 N 080602 10:06:03
GATEWAY.INSTRUMENT_INFO                         2         1      0 N 080602 10:06:03
GATEWAY.TRADINGPERIODPROFILEDATA                1         0      1 N 080602 10:06:03
BOBJ_LOGIN.ORDERS-ORDERS_12345678           1,066     9,381      0 N 080531 06:05:03
DATAFIX.TRADEORD_VOLT                           2         0      0 N 080531 06:05:03
BOBJ_LOGIN.ORDERS_HISTORY-OH_54545454       9,379         0      0 N 080531 06:05:03
DATAFIX.MAXORDERS                               2         0      0 N 080531 06:05:03
BOBJ_LOGIN.ORDERAUDIT-ORDERAUDIT23        1375610         0      0 N 080530 10:05:02
COMMHOME.COMM_ON_DEPOSIT_H                      4         0      0 N 080530 10:05:02
GATEWAY.BIN$TnRrf2V98FrgQwrckArwWg==$0     95,449         0      0 N 080530 10:05:02
GATEWAY.BIN$TnRrf2V38FrgQwrckArwWg==$0        882         0      0 N 080530 10:05:02
GATEWAY.DEALINGRECNMSAI                       290     5,052      0 N 080530 10:05:02
GATEWAY.BIN$TnRkBv02UezgQwrckApR7A==$0     16,913         0      0 N 080530 10:05:02
-- mdw 11/05/03
-- mdw 17/01/08 Modified to look at dba_tab_modifications
set pause on pages 24 lines 110 pause 'Any Key>'
colu anlyzd_rows form 99999,999,999
colu tot_rows form 99999,999,999
colu tab_name form a30
colu chngs form 99,999,999
colu pct_c form 999.999
select dbta.owner||'.'||dbta.table_name tab_name
,dbta.num_rows anlyzd_rows
,to_char(dbta.last_analyzed,'yymmdd hh24:mi:ss') last_anlzd
,nvl(dbta.num_rows,0)+nvl(dtm.inserts,0)
-nvl(dtm.deletes,0) tot_rows
,nvl(dtm.inserts,0)+nvl(dtm.deletes,0)+nvl(dtm.updates,0) chngs
,(nvl(dtm.inserts,0)+nvl(dtm.deletes,0)+nvl(dtm.updates,0))
/greatest(nvl(dbta.num_rows,0),1) pct_c
from dba_tables dbta
left outer join dba_tab_modifications dtm
on dbta.owner = dtm.table_owner
and dbta.table_name = dtm.table_name
and dtm.partition_name is null
where dbta.table_name like upper(nvl('&Tab_name','WHOOPS'))
/
clear colu

TAB_NAME                        ANLYZD_ROWS LAST_ANLZD       TOT_ROWS    CHNGS   PCT_C
------------------------------ ------------ --------------- ---------- -------- ------
GATEWAY.ACCESSGROUPS                      4 080212 16:03:12          4        0   .000
MIDDLEOFFICE.ACCOUNTAUDIT         4,725,464 080512 22:37:16  4,898,302  173,738   .037
GATEWAY.ACCOUNTBEHAVIOURS                14 080212 16:03:18         14        0   .000
GATEWAY.ENBZHEERHWV                 149,136 080522 22:06:39    150,922    6,156   .041
DATAFIX.VRHERHHHEHH_08_RBK           17,650 080425 22:00:04     17,650        0   .000
DATAFIX.AFEOUNSFEWS_190505           42,757 071105 22:01:00     42,757        0   .000
DATAFIX.ACSFFWEGGEE_310108              182 080131 22:00:03        182        0   .000
MM_AUDIT.EFEFEFGHOME_AUDIT          513,230 080509 22:07:34    526,192   12,962   .025
DATAFIX.AEGFWSCEFOE_TEMP             10,000 071105 22:00:52     10,000        0   .000
GATEWAY.AEFSSTEFFFE                   2,083 080212 16:03:36      2,083        0   .000
MM_AUDIT.ACCOUNSGSEGEGGEIT            2,083 071105 22:00:21      2,083        0   .000
MIDDLEOFFICE.ASGESGEPOTALLOCAT      123,944 080602 22:00:40    123,944        0   .000
GATEWAY.ENBZHEERHWV                  39,136 080522 22:06:39     44,922    6,156   .131
DATAFIX.RWERNGNONNGNVZBYNO           17,650 080425 22:00:04     17,650        0   .000
DATAFIX.WEGEBTRUUUXTDWTTHH           32,707 071105 22:01:00     32,707        0   .000
DATAFIX.ACGERTIIJJYKYNJNIY              189 080131 22:00:03        189        0   .000
MM_AUDIT.KUKLYUTDJJ4YUNYJJ          313,230 080509 22:07:34    426,192  112,962   .251
DATAFIX.AERGERHGGRE_TEMP             10,000 071105 22:00:52     10,000        0   .000
GATEWAY.AEFSSTEFFFE                   2,083 080212 16:03:36     15,231        0   .000
SYS.COL_USAGE$
• Every time a SQL statement is parsed, information about
columns referenced in table joins and where predicates
is stored in the internal table SYS.COL_USAGE$
• This is what DBMS_STATS uses to help decide which of
the indexed columns to gather stats on when
method_opt “for all indexed columns” is used.
• It might also play a part in deciding which columns to
gather histograms on, as I have tested adding very
skewed columns to a table and the automatic stats
collection does not gather histograms and neither does
a specific call to gather_stats with method_opt=“auto”.
• It can also be useful for identifying whether an index is
missing, or whether an index is even likely to be used
-- chk_col_usage
-- this is a rip-off of Tim Gorman's script to look at the column usage info
-- that, in 9i, 10g and 11 beta at least, is not revealed in a DB view. Gits
col owner form a22 wrap
col tab_name form a30 wrap
col column_name form a30 wrap
col equal_preds form 9999,999
col eqi_joins form 9999,999
col noneqi_jns form 9999,999
col range_prds form 9999,999
col like_prds form 9999,999
col null_prds form 9999,999
select oo.name owner
, o.name tab_name
, c.name column_name
, u.equality_preds equal_preds
, u.equijoin_preds eqi_joins
, u.nonequijoin_preds noneqi_jns
, u.range_preds range_prds
, u.like_preds like_prds
, u.null_preds null_prds
, u.timestamp ts
from sys.col_usage$ u
, sys.obj$ o
, sys.user$ oo
, sys.col$ c
where o.obj# = u.obj#
and oo.user# = o.owner#
and c.obj# = u.obj#
and c.col# = u.intcol#
and o.name like upper(nvl('&tab_name','%'))||'%'
and oo.name like upper(nvl('&tab_own','%'))||'%'
order by 1,2,3
/
clear colu

OWNER    TAB_NAME                 COLUMN_NAME
EQUAL_PREDS EQI_JOINS NONEQI_JNS RANGE_PRDS LIKE_PRDS NULL_PRDS TS
----------- --------- ---------- ---------- --------- --------- --------------------
COMMHOME COMM_TRAD_PCTTRAD_H      TREENODEID
          2         0          0          0         0         0 02 JUN 2011 12:44:22
COMMHOME COMM_TRAD_PIPREFUND      CC
          0         3          0          0         0         0 02 JUN 2011 10:59:13
COMMHOME COMM_TRAD_PIPREFUND      HOMEID
      4,581        10          0          0         0         0 15 JUN 2011 20:38:44
COMMHOME COMM_TRAD_PIPREFUND      ISLEAF
        163         0          0          0         0         0 15 JUN 2011 20:38:44
COMMHOME COMM_TRAD_PIPREFUND      TREENODEID
        177       169          0          0         0         0 15 JUN 2011 20:38:44
SYS      TS$                      FLAGS
      3,102         0          0          0         0         0 14 JUN 2011 23:33:37
SYS      TS$                      NAME
      2,708     1,954          5          0       392         0 15 JUN 2011 12:22:23
SYS      TS$                      ONLINE$
      3,553         0          0          0         0         0 15 JUN 2011 06:38:16
SYS      TS$                      TS#
      1,132     8,555          0         86         0         0 15 JUN 2011 18:08:32
GATEWAY  TRADINGPERIODPROFILEDATA TRADINGCLOSE
          0         0          0          1         0         0 29 MAR 2011 05:23:45
GATEWAY  TRADINGPERIODPROFILES    INSTGROUPID
         46       237          0          0         0         0 12 JUN 2011 11:09:17
GATEWAY  TRADINGPERIODPROFILES    PROFILENAME
          0        67          0          0         1         0 12 JUN 2011 00:53:36
GATEWAY  TRADINGPERIODPROFILES    SOURCEID
         44       234          0          0         0         0 12 JUN 2011 11:09:17
Statistics Hierarchy
There is more than one type of “stats” that Oracle can gather and
which have different impacts and are best gathered in different ways.
[Diagram: pyramid of the statistics types, with an “Increasing Impact” arrow;
the DICTIONARY and OBJECT layers are marked “Auto Gather”]
• SYSTEM STATISTICS
Gather “once” – how fast the hardware is.
Multi-Block read : Single-Block + speed of your CPU
• FIXED OBJECT STATISTICS
Gather “once” – how big your memory objects are.
Areas of memory, number of users, size of caches
X$ sys-only “objects”
• DICTIONARY STATISTICS
Gather regularly, probably via auto stats job
Do not enhance or do one-offs
Essentially “normal” stats for sys.obj$-type things
• OBJECT STATISTICS
Gather regularly, via auto job and enhancements
Tables, Indexes, Columns
What DBAs/Developers mean by “Stats”
System Statistics
• In effect, these stats are just the CPU speed and the relative speeds of
Single-Block Reads (SBR) and Multi-Block Reads (MBR).
• The actual speeds of single- and multi-block reads are recorded, in
milliseconds, but it is the ratio between them that counts.
• If Multi-Block reads are found to be the same or faster than Single-Block
reads, Oracle 10 ignores the data collected and does not store it.
• The CBO converts all IO and CPU cost into units of single-block reads.
That is what the COST is in explain plan. It is also what you see in
AWR.
• Gathering System Statistics may:
• Push Oracle towards or away from high-CPU actions like sorts.
• Alter the likelihood of full table scans and fast full index scans as
Oracle better understands the cost of the multi-block actions.
System Statistics
• V10 and 11 come with a default set of system statistics. You can
GATHER_SYSTEM_STATS with a fake workload or based on activity
on your system over a period of time.
• I advise the latter – but ensure your system has a representative
workload.
• You only need to gather the System Statistics “once” (but ensure you
do so with an “average load”).
• Re-gathering is only required if your storage changes (eg add more
spindles), if there is a major system change or the server(s) you use
change significantly in CPU utilisation
• Gathering at day and night and storing/swapping system stats is often
suggested – but seems to be a bit of an urban myth.
• You may wish to gather system stats 4 or 5 times and
DBMS_STATS.SET_SYSTEM_STATS to the average.
• NB Not RAC aware – system stats gathered on one node apply to all
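A rough sketch of the workload approach described above, assuming you have the privileges to call DBMS_STATS at system level and to read SYS.AUX_STATS$. The 60-minute interval and the read-time figures are made-up placeholders for illustration, not recommendations.

```sql
-- Capture workload system statistics over the next 60 minutes.
-- (The interval is an assumption; choose a period with representative load.)
BEGIN
  DBMS_STATS.GATHER_SYSTEM_STATS(gathering_mode => 'INTERVAL', interval => 60);
END;
/

-- Look at what is currently stored (SREADTIM, MREADTIM, CPUSPEED and so on).
SELECT pname, pval1
FROM   sys.aux_stats$
WHERE  sname = 'SYSSTATS_MAIN';

-- After 4 or 5 captures, set the averaged figures by hand.
-- (4.5 and 9.0 milliseconds are illustrative values only.)
BEGIN
  DBMS_STATS.SET_SYSTEM_STATS(pname => 'SREADTIM', pvalue => 4.5);
  DBMS_STATS.SET_SYSTEM_STATS(pname => 'MREADTIM', pvalue => 9.0);
END;
/
```

Keeping the individual captures somewhere before averaging makes it easier to spot a run that was taken under an unrepresentative load.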
Fixed Object Statistics
• These are statistics on the in-memory “dynamic performance”
objects, the x$ and similar tables (what V$ views sit on).
• Need to gather “once” and only re-gather if something significant
changes such as allocating much more memory to the instance or the
number of user sessions greatly increasing.
• Gathering Fixed Object stats will aid internal SQL, checking session
details, looking at memory constructs. I have seen a small
improvement in parse speed. Certain dictionary queries run faster.
• Re-gather after upgrade etc.
• If they have never been gathered the impact can be significant. I have
yet to personally see a major change as a result of re-gathering (I just
do so once a year on “just in case” principles).
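For reference, a minimal sketch of the gather itself, run while the instance is under a typical load. The check query assumes DBA_TAB_STATISTICS reports the X$ structures with an OBJECT_TYPE of 'FIXED TABLE'.

```sql
-- Gather statistics on the in-memory fixed (X$) objects.
BEGIN
  DBMS_STATS.GATHER_FIXED_OBJECTS_STATS;
END;
/

-- How many fixed objects now have statistics, and when were they last gathered?
SELECT COUNT(*)           AS fixed_objs_with_stats,
       MAX(last_analyzed) AS latest_gather
FROM   dba_tab_statistics
WHERE  object_type = 'FIXED TABLE'
AND    last_analyzed IS NOT NULL;
```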
Dictionary Statistics
• Statistics on the internal tables owned by SYS and other internal
Oracle users. SYS.OBJ$, SYS.TAB$, SYS.TS$, those tables.
• Are gathered as part of the default statistics gathering job.
• In effect just like gathering schema statistics on the SYS, SYSTEM,
OUTLN and other users. Supported (and recommended {*}) from V10.
Under V9, stay with the RBO.
• Can take several hours to gather on a database with tens of thousands
of objects or more.
• Can significantly aid parsing and other internal SQL, as well as DBA
scripts running on the DBA views and also the underlying tables.
Dictionary Statistics
• Gather them regularly. Weekly to monthly.
• If you use the default automatic statistics gathering job, it is collecting
dictionary statistics for you and is fine.
• If you disable the automatic statistics gathering for your schema stats
either:
• Leave it running for ONLY Dictionary statistics:
DBMS_STATS.SET_PARAM('AUTOSTATS_TARGET','ORACLE')
• Or organise regular dictionary stats gathering by your own methods.
• Not gathering Dictionary Stats, especially on very large/complex
databases, could lead to very poor dictionary/parse performance.
• To be honest, with 10.1/10.2 at least, even gathering Dictionary Stats
for systems with massive numbers of segments can fail to resolve
some slow Dictionary performance
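The two options above could look something like this under V10. This is a sketch only; run it as a suitably privileged user, and note that SET_PARAM and GET_PARAM are the V10 calls (V11 moves to preferences).

```sql
-- Option 1: keep the automatic job, but only for Oracle-owned objects.
BEGIN
  DBMS_STATS.SET_PARAM('AUTOSTATS_TARGET', 'ORACLE');
END;
/

-- Confirm the setting.
SELECT DBMS_STATS.GET_PARAM('AUTOSTATS_TARGET') AS autostats_target
FROM   dual;

-- Option 2: gather the dictionary statistics yourself, on your own schedule.
BEGIN
  DBMS_STATS.GATHER_DICTIONARY_STATS;
END;
/
```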
That was the pre-amble.
Getting the System, Fixed Object
and Dictionary stats gathered gives
you a solid base to tackle
Object Statistics
Why are Stats Key to Performance?
• Used by the Cost Based Optimiser (CBO).
• CBO examines the SQL statement and works out the various ways
in which it could satisfy the query (up to about 2,000 plans under 10).
• For each step the CBO works out the cost, which is expected IO
plus CPU (if turned on) and the cardinality, the number of records
that step will return.
• The cardinality is passed back to the next step and can be a
multiplier of that step’s cost.
• The costs are added up and Oracle then picks the plan with the
lowest overall cost(*) and runs that plan
(*) This is a slight lie, but the principle is true
Why are Stats Key to Performance?
• The CBO is very logical, it uses just the figures it is presented with
and simple calculations to make its decision. No magic involved.
• If those figures are wrong, ie the statistics are not representative,
then the costs calculated will be incorrect.
• A small error can cascade up the code and cause a large difference
to other steps and cause the plan to change.
• With edge cases, a small difference often results in a different plan
being chosen.
• That different plan is often sub-optimal, sometimes seriously so,
occasionally hundreds or thousands of times slower.
• Cost/Cardinality being too low, often 1, is the most
common cause of poor performance, due to prompting
nested loop plans and incorrect driving tables
Automated Stats Gathering
• If the automated job is working for you, leave it alone.
• If you turned off the automated job and “wrote your own”
under V10 or before and now are on 11 – consider going
back to the automated job. It is faster and more accurate.
• If you turned off the automated job and did not do
anything about your Dictionary stats, that is bad.
• Under V11 you can control %stale and defaults like
method_opt at table level. Consider doing that.
• If you decided to “roll your own” I would advise you leave
the automated job running but just for Oracle’s objects:
dbms_stats.set_global_prefs('AUTOSTATS_TARGET','ORACLE')
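As a sketch of the V11 options above: the first block limits the automated job to Oracle's objects, the second leaves the job alone and tunes one troublesome table instead. APP_OWNER and BIG_TABLE are hypothetical names, and the 5% staleness and the METHOD_OPT value are only examples.

```sql
-- Roll-your-own sites: keep the automated job for Oracle's objects only.
BEGIN
  DBMS_STATS.SET_GLOBAL_PREFS('AUTOSTATS_TARGET', 'ORACLE');
END;
/

-- Alternatively, stay with the automated job and set table-level preferences.
-- (APP_OWNER.BIG_TABLE and the values used are assumptions for illustration.)
BEGIN
  DBMS_STATS.SET_TABLE_PREFS('APP_OWNER', 'BIG_TABLE', 'STALE_PERCENT', '5');
  DBMS_STATS.SET_TABLE_PREFS('APP_OWNER', 'BIG_TABLE', 'METHOD_OPT',
                             'FOR ALL COLUMNS SIZE 1');
END;
/

-- Check what a gather on that table would now use.
SELECT DBMS_STATS.GET_PREFS('STALE_PERCENT', 'APP_OWNER', 'BIG_TABLE') AS stale_pct
FROM   dual;
```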
Stats Gathering
• Version 9, swap to DBMS_STATS and write your own
• Version 10, use the automated job (at least for dictionary
stats) and write your own version/exceptions for your tables.
• V11.1 – Test, do not trust me, but I would still say go auto.
• V11.2 – Use the automated job and if you must intervene,
use default sample size so you get one pass NDV.
• V11 – look at rolling up stats for partitions, subpartitions –
but read up on it extensively. You have to ensure you
gather all partitions or sub-partitions.
• With all versions of Oracle you will have exceptions,
usually overnight batch or partitioned tables. If you are the
DBA, this is part of your job.
New Oracle 11 NDV
• One of the most demanding parts of generating object
statistics is gathering Number of Distinct Values
• Oracle 11 introduced the single-pass NDV function. It
scans the data once, uses much less memory, is faster
and more accurate.
• You have to use:
ESTIMATE_PERCENT=DBMS_STATS.AUTO_SAMPLE_SIZE
• You cannot use BLOCK sampling.
Amit Poddar via JL
http://jonathanlewis.files.wordpress.com/2011/12/one-pass-distinct-samplingpresentation.pdf
http://structureddata.org/2007/09/17/oracle-11g-enhancements-to-dbms_stats/
https://blogs.oracle.com/optimizer/entry/improvement_of_auto_sampling_statistics_gathering_feature_in_oracle_database_11g
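A minimal sketch of a V11 gather that should qualify for the one-pass NDV method, following the two conditions on the slide. APP_OWNER and BIG_TABLE are hypothetical names.

```sql
-- Let Oracle choose the sample size and stay with row (not block) sampling.
BEGIN
  DBMS_STATS.GATHER_TABLE_STATS(
    ownname          => 'APP_OWNER',
    tabname          => 'BIG_TABLE',
    estimate_percent => DBMS_STATS.AUTO_SAMPLE_SIZE,
    block_sample     => FALSE);
END;
/

-- Inspect the distinct-value counts and the sample size that was used.
SELECT column_name, num_distinct, sample_size
FROM   dba_tab_col_statistics
WHERE  owner      = 'APP_OWNER'
AND    table_name = 'BIG_TABLE'
ORDER  BY column_name;
```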
The Automated Job may Choke
select substr(operation,1,30) operation
,to_char(start_time,'DD-MON-YYYY HH24:MI:SS.FF') START_tme
,to_char(END_TIME,'DD-MON-YYYY HH24:MI:SS.FF') end_tme
from sys.WRI$_OPTSTAT_OPR
order by start_time desc

OPERATION                      START_TME                    END_TME
------------------------------ ---------------------------- --------------------
gather_database_stats(auto)    02-JUN-2011 22:00:02.979206  02-JUN-2011 23:04:37
gather_database_stats(auto)    31-MAY-2011 06:00:02.811976  31-MAY-2011 06:56:21
gather_database_stats(auto)    30-MAY-2011 22:00:01.976379  30-MAY-2011 22:19:31
gather_database_stats(auto)    29-MAY-2011 22:00:01.416256  29-MAY-2011 23:36:14
gather_database_stats(auto)    28-MAY-2011 22:00:02.243542  29-MAY-2011 00:12:18
gather_database_stats(auto)    27-MAY-2011 22:00:03.588237  27-MAY-2011 23:14:24
gather_database_stats(auto)    26-MAY-2011 22:00:01.602425  26-MAY-2011 23:14:05
gather_dictionary_stats        24-MAY-2011 11:42:31.771667  24-MAY-2011 11:42:35
gather_dictionary_stats        24-MAY-2011 11:42:11.396340  24-MAY-2011 11:42:15
gather_database_stats(auto)    24-MAY-2011 06:00:02.905945  24-MAY-2011 06:20:38
gather_database_stats(auto)    23-MAY-2011 22:00:01.732964  23-MAY-2011 22:58:35
gather_database_stats(auto)    22-MAY-2011 22:00:01.421518  23-MAY-2011 06:00:05
gather_database_stats(auto)    21-MAY-2011 22:00:01.942455  22-MAY-2011 06:00:01
gather_database_stats(auto)    20-MAY-2011 22:00:03.066981  21-MAY-2011 06:00:01
gather_database_stats(auto)    19-MAY-2011 22:00:02.571718  19-MAY-2011 23:04:27
gather_database_stats(auto)    17-MAY-2011 06:00:01.462810  17-MAY-2011 08:02:34
gather_database_stats(auto)    16-MAY-2011 22:00:01.096761  16-MAY-2011 23:14:56
Fixing Choked Stats
• Once the automated job chokes, it will continue to choke. Every night.
This is because it tries the same thing each night.
• The longer weekend run should sort things out – until it chokes.
• Identify the table (Look for the gather statement interactively during
the window, pull off the list of tables to gather options=list_stale,
check for large objects with 10% difference...)
• Do a manual gather with something like: block_sample=>true,
estimate_percent=>0.1, degree=>8, method_opt=>'for all columns size 1',
cascade=>false, no_invalidate=>false, granularity=>'global'
• Once that is run you can afford to do a larger sample size and do the
indexes. Do the PK first.
• You probably need to lock the stats on the table and treat it as an
exception.
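Put together, the quick manual gather and the lock could be sketched like this. APP_OWNER and BIG_TABLE stand in for whichever table is choking the job; the parameter values are the ones suggested on the slide.

```sql
BEGIN
  -- Fast, rough gather: tiny block sample, no histograms, no indexes.
  DBMS_STATS.GATHER_TABLE_STATS(
    ownname          => 'APP_OWNER',   -- hypothetical owner
    tabname          => 'BIG_TABLE',   -- hypothetical table
    estimate_percent => 0.1,
    block_sample     => TRUE,
    method_opt       => 'FOR ALL COLUMNS SIZE 1',
    degree           => 8,
    granularity      => 'GLOBAL',
    cascade          => FALSE,
    no_invalidate    => FALSE);

  -- Lock the stats so the automated job skips the table from now on.
  DBMS_STATS.LOCK_TABLE_STATS(ownname => 'APP_OWNER', tabname => 'BIG_TABLE');
END;
/
```

Once the stats are locked, your own exception handling has to pass force => TRUE (or unlock first) when it gathers on that table.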
The Biggest “Wrong Stats” Issues
The below are the worst “stats” causes of performance issues, in order,
in my opinion based on experience
1. The stats say a segment is empty and it is not.
2. Your WHERE predicates are out of range for the known column
values.
3. The stats say a segment holds more than 10 times less data than it does.
The more orders of magnitude out, the worse.
4. Histograms. (There and not needed, or needed and not there. Ouch)
5. The correlation between columns is not understood by the
optimiser e.g. That values in tab1.x “line up” with those in tab2.y
6. Edge cases that experts get excited about but 99% of us never see
Stats Issues are a VLDB thing?
• I can only go on what I have seen and, to a less trustworthy level
(*ironic given the sources), what I have heard...
• OLTP systems with a need for the fastest absolute response time to
small data queries are NOT troubled by object stats.
• Edge cases that balance on correlation or swapping to nested loop
from hash or using Cartesian join are specific to OLTP and a set of
requirements where stats gathering is, well, redundant.
• Where pain occurs is when a plan that hashes several segments
together {often including partition exclusion} swaps to either a nested
loop or Cartesian merge join that is not suitable
• When it comes down to it, the plan for large volumes of data is right
for all volumes of data. If it does it in 30 seconds inefficiently, doing it
in 33 seconds efficiently is spot-on good enough.
Single Values Expected Outside of range
[Charts: expected number of values (axis 0 to 100) against column value
(axis 0 to 400) for a single-value predicate, with 10,000 values in the
known range. Panels: Oracle 10 with no Histograms; Oracle 10 with
Histograms; Oracle 11 with no Histograms; Oracle 11 with Histograms
(two alternative behaviours shown).]
Range Values Expected Outside of range
[Charts: expected number of values (axis 0 to 100) against column value
for a range predicate, with 10,000 values in the known range and the
200 to 300 region marked. Panels: Oracle 10 with no Histograms;
Oracle 10 with Histograms; Oracle 11 with no Histograms; Oracle 11 with
Histograms.]
Histograms and Data Ranges
• Oracle 10 deals with column values being “out of range” by
decreasing cardinality over that range
• Eg if the low_value is 200 and the high_value is 300 and there are 1000
rows, that is 10 rows for 225.
• For 350, it is 50% outside the range, so 10 rows is reduced by 50% to 5
rows. At value 400 it drops to 1 row. This is fine if you gather at 10%.
• Histograms massively alter this “out of range” half-life. I have seen
massive issues with dates. A table covering 5 years of data, with
histograms on the date, can reduce a value only a week out of range
to less than 1% of the average value.
• This is a big issue on large tables that have stats gathered only
occasionally (as not changed by a big percentage).
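The arithmetic in the 200 to 300 example can be reproduced with a small query. This is only an illustration of the linear decay as described on the slide (10 rows per value inside the range, reduced in proportion to how far outside the range the value sits, never below 1). It is not Oracle's internal code, and it does not model the histogram case.

```sql
-- Figures from the slide: low_value 200, high_value 300, 10 rows per value.
WITH col_stats AS (
  SELECT 200 AS low_value, 300 AS high_value, 10 AS rows_per_value FROM dual
),
preds AS (
  SELECT 225 AS val FROM dual UNION ALL   -- inside the known range
  SELECT 350 FROM dual UNION ALL          -- 50% of the range width outside
  SELECT 400 FROM dual                    -- 100% of the range width outside
)
SELECT p.val,
       GREATEST(1,
         ROUND(s.rows_per_value *
               (1 - GREATEST(0, s.low_value - p.val, p.val - s.high_value)
                    / (s.high_value - s.low_value)))) AS est_rows
FROM   preds p CROSS JOIN col_stats s
ORDER  BY p.val;
```

With those inputs the estimates come out as 10, 5 and 1 rows, matching the figures quoted above.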
Histograms and Partitions
• This Out of Range issue is thrown into sharp relief with Partitions.
• If the partitions are daily or weekly and have histograms on them, SQL
statements selecting data for the latest hour can become “out of range”
sooner than you can believe.
• Spot this by the cardinality being 1 rather than several hundred or
thousand.
• Can happen even without histograms, especially with daily partitions
• Solutions? Use dynamic sampling on latest daily or weekly partitions
(your in-house code either does not collect or deletes such stats),
insert half-day stats or gather very aggressively.
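A sketch of the dynamic sampling route, assuming a hypothetical daily-partitioned table APP_OWNER.ORDERS with a partition ORDERS_20110615 and a date column ORDER_DATE. The sampling level of 4 is just an example; test which level gives sensible cardinalities on your data.

```sql
-- Remove the stats on the newest partition so the optimiser has to sample it.
-- (Table, partition and column names are assumptions for illustration.)
BEGIN
  DBMS_STATS.DELETE_TABLE_STATS(
    ownname  => 'APP_OWNER',
    tabname  => 'ORDERS',
    partname => 'ORDERS_20110615');
END;
/

-- Or ask for sampling explicitly in the statements that hit the latest data.
SELECT /*+ dynamic_sampling(o 4) */ COUNT(*)
FROM   app_owner.orders o
WHERE  o.order_date >= SYSDATE - 1/24;
```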
Replacing the Automatic Stats Job
The CBO is Complex.
Stats are in effect a constantly
evolving part of your code base.
A simple approach will only work
on a simple system.
(And I really am Sorry!)
Replacing the AUTO stats job
• Don’t
• Tweak the current job – make it run at different times, alter the %stale
and defaults at table levels.
• Lock the stats on the tables that give you issues and treat these as
your exceptions.
• If you MUST replace the Auto job it will hurt:
• Initial estimate for “simple” replacement will be a week or two.
• If you plan and estimate it, that will come out as four weeks.
• It will take you 2 months.
• You will emulate most of what the AUTO job does.
• Your solution will probably need to look at segment size and
DBA_TAB_MODIFICATIONS, and have several control tables that allow
you to specify METHOD_OPT, SAMPLE SIZE and GRANULARITY at
table level...
Auto Stats Replacement
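[Diagram: components of a replacement stats job]
• OBJ_CTL control table: OWNER, TAB_NAME, PART_NAME, DFLT_STALE,
DFLT_SAMPLE, DFLT_METHOD_OPT, IDX_PCT
• STATS_CTL control table
• Your Stats Job, working from: DBA_TAB_MODS, DBA_SEGMENTS,
LIST_STALE/EMPTY, COPY RULES
• SAVE_STATS and STATS_LOG
• Gathering approach: BLOCK SAMPLE, TAB THEN INDEX, ORDER BY SIZE ASC,
ALL COLUMNS SIZE 1, PARALLEL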
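To make the control-table idea concrete, here is one possible shape for OBJ_CTL and a query that uses it to pick candidates. The column names come from the diagram; the data types, the fallback of 10% and the join logic are my assumptions for illustration, not the design actually used on any site.

```sql
-- Per-object overrides; a NULL part_name means the setting is for the table.
CREATE TABLE obj_ctl (
  owner            VARCHAR2(30)  NOT NULL,
  tab_name         VARCHAR2(30)  NOT NULL,
  part_name        VARCHAR2(30),
  dflt_stale       NUMBER,         -- % change before a regather is due
  dflt_sample      NUMBER,         -- estimate_percent to use
  dflt_method_opt  VARCHAR2(200),
  idx_pct          NUMBER          -- sample size for the indexes
);

-- Tables whose recorded changes exceed their own threshold (10% if none set),
-- smallest first so that the quick wins are gathered early in the window.
SELECT dtm.table_owner,
       dtm.table_name,
       dtm.inserts + dtm.updates + dtm.deletes AS chngs,
       dbta.num_rows
FROM   dba_tab_modifications dtm
JOIN   dba_tables dbta
       ON  dbta.owner      = dtm.table_owner
       AND dbta.table_name = dtm.table_name
LEFT JOIN obj_ctl oc
       ON  oc.owner     = dtm.table_owner
       AND oc.tab_name  = dtm.table_name
       AND oc.part_name IS NULL
WHERE  dtm.partition_name IS NULL
AND    dtm.inserts + dtm.updates + dtm.deletes
         >= GREATEST(NVL(dbta.num_rows, 0), 1) * NVL(oc.dflt_stale, 10) / 100
ORDER  BY dbta.num_rows;
```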
Block or Row Sample
• When you state an ESTIMATE_PERCENT for a table gather statement,
you can also state if it is block or row sample. It defaults to Row.
• Row sample selects the percentage of rows scattered across all
blocks. If you have eg 16k block size and over 100 rows per block, a
1% sample size will visit every block.
• Block sample selects whole blocks, which greatly reduces the
physical IO
• Block sample size gives low column cardinalities and is susceptible to
getting high-low values that are not close to the true edges.
• The under-sampling of columns varies depending on spread, but my
tests show that 5% block sample size is about as good as 0.8% row
sample size, but still 10 times faster. It is fine for 99% of cases.
• Oracle V11 up – the new NDV and speed of stats pushes me back
towards AUTO SAMPLE SIZE
Breaking news!
There appears to be some issue with block sampling on
version 10.2.
You can use the SAMPLE command in normal SQL
select statements and that is what Oracle does to gather
stats. However, the BLOCK SAMPLE seems to vary the
actual number of blocks it checks for a given sample
size.
I will investigate when I have time.
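You can try the two sampling methods yourself with the SAMPLE clause in ordinary SQL. APP_OWNER.BIG_TABLE is a hypothetical table; run each query a few times and compare the counts and the elapsed times.

```sql
-- Row sample: about 1% of the rows, scattered across all blocks.
SELECT COUNT(*) AS sampled_rows
FROM   app_owner.big_table SAMPLE (1);

-- Block sample: all rows from about 5% of the blocks.
SELECT COUNT(*) AS sampled_rows
FROM   app_owner.big_table SAMPLE BLOCK (5);
```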
Write Something to Gather Stats on
all Segments
• You need to gather stats on each segment that needs stats. I strongly
suggest any in-house code works segment-by-segment such that all
tables, partitions and index segments are processed “as one”
• You could use GATHER_SCHEMA_STATS and LIST the objects into an
array and process. This gets objects Oracle would deem in need of
stats:
o STALE lists those objects that have stats but have changed by 10%
o EMPTY lists those objects with no current stats
o AUTO is supposed to list both but is buggy in 10.2.0.3
• Alternatively, run through all segments in the schemas you are
interested in and use DBA_TAB_MODIFICATIONS directly. Slower (*)
but much more control
• I used the above on 10.1 but on current 10.2 system, the data
dictionary is too slow. We decided to revert to
GATHER_SCHEMA_STATS
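The LIST approach could be sketched as below. With a LIST option nothing is gathered; the call only fills the array. APP_OWNER is a hypothetical schema, and the loop simply prints the objects where real code would decide how to gather each one.

```sql
SET SERVEROUTPUT ON
DECLARE
  l_objects DBMS_STATS.ObjectTab;   -- filled by the LIST STALE call
BEGIN
  -- Make sure DBA_TAB_MODIFICATIONS is up to date before asking.
  DBMS_STATS.FLUSH_DATABASE_MONITORING_INFO;

  DBMS_STATS.GATHER_SCHEMA_STATS(
    ownname => 'APP_OWNER',         -- hypothetical schema
    options => 'LIST STALE',
    objlist => l_objects);

  FOR i IN 1 .. l_objects.COUNT LOOP
    DBMS_OUTPUT.PUT_LINE(l_objects(i).objtype || ' ' ||
                         l_objects(i).ownname || '.' ||
                         l_objects(i).objname || ' ' ||
                         l_objects(i).partname);
  END LOOP;
END;
/
```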
Do not write code to roll back stats.
• Oracle keeps all stats changed for 31 days, by default.
• You can alter this with
DBMS_STATS.ALTER_STATS_HISTORY_RETENTION, check it with
DBMS_STATS.GET_STATS_HISTORY_RETENTION (usually 31).
• If the stats history is causing you issues use PURGE_STATS to get rid
of them but, be warned, once gone they are gone. But then, how often
do you RESTORE stats?
• Unless you REALLY want to be able to identify sets of stats, just use
Oracle’s in-built feature where it stores previous stats for 31 days and
you can recover them.
• This hidden feature can create a LOT of data, especially if you have
lots of partitions with lots of column stats. It goes into SYSTEM TS.
http://mwidlake.wordpress.com/2009/08/03/why-is-my-system-tablespace-so-big/
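A short sketch of using the built-in history rather than your own rollback code. APP_OWNER and BIG_TABLE are hypothetical, and restoring to "one day ago" is only an example point in time.

```sql
-- How long is history kept, and how far back can we currently restore?
SELECT DBMS_STATS.GET_STATS_HISTORY_RETENTION    AS retention_days,
       DBMS_STATS.GET_STATS_HISTORY_AVAILABILITY AS oldest_restorable
FROM   dual;

-- When were the stats on the table changed?
SELECT table_name, partition_name, stats_update_time
FROM   dba_tab_stats_history
WHERE  owner      = 'APP_OWNER'
AND    table_name = 'BIG_TABLE'
ORDER  BY stats_update_time DESC;

-- Put the table's stats back to how they were a day ago.
BEGIN
  DBMS_STATS.RESTORE_TABLE_STATS(
    ownname         => 'APP_OWNER',
    tabname         => 'BIG_TABLE',
    as_of_timestamp => SYSTIMESTAMP - INTERVAL '1' DAY);
END;
/
```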
Rolling Back Statistics
• Most people are aware of the potential to have a user_statistics_table.
This is a table you create that you can put stats into using
EXPORT_XXXX_STATS and retrieve them with IMPORT_XXXX_STATS
• You use the CREATE_STAT_TABLE procedure to create the table and give a
STATID to sets of stats you wish to export and import.
• One little Gotcha. The documentation is not clear: if you state a user
statistics table in GATHER_XXXX_STATS then the new stats are NOT
placed into the stats table, they are put in the dictionary and the OLD
values are put in the stats table.
• Rolling back stats works for system, fixed_object, database and object
stats. I’ve tested it, it works – at least under 10.1 and 10.2.
• Oracle 11 allows you to create Pending Stats which is very nice. It
beats the manual version I developed for V10 in...ohh... 2005?
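For completeness, the user statistics table route looks roughly like this. MY_STATS, the STATID of BEFORE_RELEASE, and APP_OWNER.BIG_TABLE are all made-up names for illustration.

```sql
-- One-off: create the user statistics table.
BEGIN
  DBMS_STATS.CREATE_STAT_TABLE(ownname => 'APP_OWNER', stattab => 'MY_STATS');
END;
/

-- Save the current stats for the table (and its indexes) under a label.
BEGIN
  DBMS_STATS.EXPORT_TABLE_STATS(
    ownname => 'APP_OWNER',
    tabname => 'BIG_TABLE',
    stattab => 'MY_STATS',
    statid  => 'BEFORE_RELEASE',
    cascade => TRUE);
END;
/

-- Later, if the new stats cause trouble, put the saved set back.
BEGIN
  DBMS_STATS.IMPORT_TABLE_STATS(
    ownname => 'APP_OWNER',
    tabname => 'BIG_TABLE',
    stattab => 'MY_STATS',
    statid  => 'BEFORE_RELEASE',
    cascade => TRUE);
END;
/
```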
Rolling Back
• Gathering System, Fixed Object and Dictionary statistics are “system
wide” changes. Do you have a suitable system to test this on?
• All have equivalent DBMS_STATS.RESTORE_XXXX_STATS
commands that, when I tested on 10.2.0.3, worked correctly (including
blanking stats that were previously null).
• You could save the stats being replaced in your own statistics table
and recover from there. The only advantage is it is easier to
interrogate stats you saved into your own stats table.
• Deleting System Stats reinstates the default values seen after install.
• Deleting Fixed Object stats deletes the stats.
• Deleting Dictionary stats is not a very good idea as it works.
• I said you could restore Dictionary stats but, even though I tested it, I
am not going to promise that all stats will be the same after you
restore...
Tables Cascade or Not?
• If you gather statistics on a Table or Table Partition, the default is to
CASCADE to relevant index segments.
• The default will gather between 1000 and 2000 index blocks. I find this
to be overkill.
• By NOT cascading to index partitions but doing the indexes
specifically and setting a sample size derived from the index size, you
can use smaller sample sizes.
• However, this is more code to write and test. It may be more pragmatic
to just cascade, though each index segment may end up taking as
long as the table segment to be gathered.
• Consider global indexes and partitioned global indexes.
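One way the "no cascade, do the indexes specifically" idea could be sketched. The target of 500 blocks per index is an assumption, not a figure from the talk, and APP_OWNER and ORDERS are placeholder names. For simplicity this samples each index as a whole, not partition by partition.

```sql
DECLARE
  c_target_blocks CONSTANT NUMBER := 500;  -- assumed target, tune to taste
  v_pct NUMBER;
BEGIN
  -- Table first, without cascading to its indexes.
  DBMS_STATS.GATHER_TABLE_STATS(
    ownname => 'APP_OWNER',
    tabname => 'ORDERS',
    cascade => FALSE);

  -- Then each index, with a sample percentage derived from its size.
  FOR ix IN (SELECT i.index_name, SUM(s.blocks) AS blocks
             FROM   dba_indexes i
             JOIN   dba_segments s
                    ON  s.owner = i.owner
                    AND s.segment_name = i.index_name
             WHERE  i.owner       = 'APP_OWNER'
             AND    i.table_owner = 'APP_OWNER'
             AND    i.table_name  = 'ORDERS'
             AND    i.index_type <> 'LOB'
             AND    s.segment_type LIKE 'INDEX%'
             GROUP  BY i.index_name)
  LOOP
    -- estimate_percent must lie between 0.000001 and 100.
    v_pct := LEAST(100, GREATEST(0.000001, 100 * c_target_blocks / ix.blocks));

    DBMS_STATS.GATHER_INDEX_STATS(
      ownname          => 'APP_OWNER',
      indname          => ix.index_name,
      estimate_percent => v_pct);
  END LOOP;
END;
/
```

This is exactly the "more code to write and test" trade-off: a plain cascade is one parameter, this is a loop you have to maintain.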
DBMS_STATS Package
• Many of us still refer to “analyze the table” but of course we all now
mean “gather statistics with DBMS_STATS”. Don’t we?
• Don’t use ANALYZE any more, especially not on production systems.
• DBMS_STATS gathers better statistics, including histograms, and
cascades down to partitions and sub-partitions as needed.
• Can be run for specific segments (table, index, partition thereof etc),
for a schema or for the database and can be set to only gather “stale”
objects.
• Can use Parallel to make up for the slower running, can set sample
size by row and block (see more later)
• In these slides, any procedure or function is in DBMS_STATS unless
otherwise specified.
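To make the scope options concrete, a short sketch of a schema-level "stale only" gather and a single-partition gather. The schema, table and partition names and the degree of 4 are assumptions for illustration.

```sql
BEGIN
  -- Whole schema, but only objects whose stats are flagged as stale.
  DBMS_STATS.GATHER_SCHEMA_STATS(
    ownname => 'APP_OWNER',
    options => 'GATHER STALE',
    degree  => 4,
    cascade => TRUE);

  -- One specific partition of one table, in parallel.
  DBMS_STATS.GATHER_TABLE_STATS(
    ownname     => 'APP_OWNER',
    tabname     => 'ORDERS',
    partname    => 'ORDERS_2012_12',
    granularity => 'PARTITION',
    degree      => 4);
END;
/
```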
Stability and Performance Dichotomy
• To get the best performance, you want the stats to be as
up-to-date and accurate as possible.
• When the stats change, it is like having a miniature code
release. The processing of the application can change –
after all, that is the idea.
Please note, the actual result of the data processing does
not change, the functionality is preserved, but the time
taken can change, the idea being it changes for the better.
• How many people here would allow a code release on
their business critical systems at any time?
• Most people are not actually concerned with good
performance, they are concerned with poor
performance.
• Code can suddenly start running slowly “over night”
when stats are gathered. Just one very bad statement
can actually make the system unusable.
Why are Bad Stats worse than No Stats?
• As an example, stats showing a table or partition is empty will cause
the CBO to think it is a cost of 1 to scan and only one record will be
found. Nested loops to access it and Cartesian Merge plans will occur.
• More later, but stats saying there is no data later than a month ago will
cause the CBO to again assume there is as little as one record to find.
• If there are no stats, Oracle will use dynamic sampling or use defaults,
which are usually better than bad stats
• Spotting Bad Stats is not as easy as spotting missing Stats, especially
as Oracle stores high/low values and histograms in internal formats.
• Old stats (a specific type of bad stats) can cause a plan to flip to
another plan without notice, with NO CHANGE.
Intervention methods for stats
• Stated low percentage stats gathering for large objects – use BLOCK
SAMPLING and do tables and indexes specifically.
• Set statistics manually or COPY_TABLE_STATS. The tricky part is the
column stats and any histograms you have to have.
• Delete and lock stats and allow Dynamic Sampling to occur.
• Gather stats at lowest level (sub-partition or Partition) and allow them
to be calculated at higher levels via Incremental stats. Made possible
via synopses (but needs care to implement)
• The overall aim is always the same. You do not need accurate stats,
you need stats that are good enough to give execution plans that will
work for large data volumes
When to Gather
• Most sites have stats gathering over night, even if they replace the
auto stats gathering job. This may well not be the best time.
• You may be better off collecting stats in the late evening or early
morning. You can use PARALLEL to get the job done more quickly.
• You are very unlikely to benefit from gathering all stats in one window.
• If you are processing data into a table, gather the stats at the right
time(s). This is probably after the load, or it could be several times in
the process.
• So long as only specific gathering is done, there is great benefit to be
had gathering stats as your batch process proceeds.
When to Gather-Batch Processing
• I’m soooooo tired of this conversation and I have had it so many
times... Batch processing means you need to think about stats
• You have a table that is empty. You stuff it full of data. You use it to
load data into your live database. Then you may or may not truncate it.
Ask yourself, when is the volume of data significant and when do you
gather stats.
• If it is a global temporary table you may need to set some stats.
• If I was to be given a pound for each time I had seen that the batch
processing had no concept of stats gathering I would have £17.50.
• So long as only specific gathering is done, there is great benefit to be
had gathering stats as your batch process proceeds.
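A sketch of what "specific gathering as the batch proceeds" might look like. The table names, the 10% sample and the row and block counts are invented for illustration; use figures that match your own volumes.

```sql
BEGIN
  -- After the staging table has been loaded, and before it is used
  -- to populate the live tables.
  DBMS_STATS.GATHER_TABLE_STATS(
    ownname          => 'APP_OWNER',
    tabname          => 'STG_ORDERS',
    estimate_percent => 10,
    cascade          => TRUE);

  -- For a global temporary table, set representative figures by hand.
  DBMS_STATS.SET_TABLE_STATS(
    ownname => 'APP_OWNER',
    tabname => 'GTT_ORDER_WORK',
    numrows => 100000,
    numblks => 2000);
END;
/
```

Bear in mind that gathering stats commits the current transaction, so place the call where a commit in the batch is acceptable.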
Incremental Stats
• I’m sorry but I ripped it out. There is nothing between “you can do it”
and “here is how it works” that is not 1 min or 30 mins.
• Do NOT attempt before V11, even though it is back-ported to 10.2.0.4
and .5. I’ve tried, it was hard. It was not nice.
• You gather stats at the lowest level, sub-partition or partition at a
reasonable level. These are summed up. Synopses are used to allow
NDV calculations. Better than the crap I came up with..
• YOU HAVE TO BE RIGOROUS in ensuring all stats are gathered at the
lowest levels and issues do not occur, but it works well if controlled.
• No one seems to mention this, but, it all falls apart when you add new
columns to a table. It won’t handle this as far as I have seen and those
tables with stats that no longer update? Urrrrgghhhh.
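For reference, a minimal sketch of switching on incremental stats for one partitioned table in 11g. APP_OWNER and ORDERS are placeholders. This shows only the switch, not the care needed to run it rigorously.

```sql
BEGIN
  -- Turn on incremental stats (synopses) for one partitioned table.
  DBMS_STATS.SET_TABLE_PREFS(
    ownname => 'APP_OWNER',
    tabname => 'ORDERS',
    pname   => 'INCREMENTAL',
    pvalue  => 'TRUE');

  -- Incremental maintenance expects AUTO_SAMPLE_SIZE and AUTO granularity.
  DBMS_STATS.GATHER_TABLE_STATS(
    ownname          => 'APP_OWNER',
    tabname          => 'ORDERS',
    estimate_percent => DBMS_STATS.AUTO_SAMPLE_SIZE,
    granularity      => 'AUTO');
END;
/

-- Check the preference actually in force for the table.
SELECT DBMS_STATS.GET_PREFS('INCREMENTAL', 'APP_OWNER', 'ORDERS') AS incremental
FROM   dual;
```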
One Extreme - Dynamic Sampling
Delete stats from a table and lock the table.
• If there are no stats for a table, they are gathered from the table as it is
now, thus avoiding issues with the table having recently changed.
• You set the level of DYNAMIC SAMPLING at the instance level. In 10.2
it defaults to 2, which gathers stats on any segment lacking them.
• At level 3, guesses for filter predicates are checked and at level 4
correlation between columns is checked. These can greatly improve
performance.
• Only the data sampled can be considered and this is going to either be
much less than that considered by a GATHER statement or else will
take a long time to sample.
• Levels above 4 increase the number of blocks assessed by the
dynamic sample, up to 10 which is “every block”. Don’t do that.
Dynamic Sampling
• Dynamic sampling can be set at the database level
(optimizer_dynamic_sampling) or in hints.
• I would suggest setting the database level to 3 or 4 on a
Data Warehouse, but I’ve not had much success getting Live sites to
do this.
• Dynamic sampling does extend parse times (sometimes to several
seconds, potentially) so not suitable for sites with high SQL statement
turnover (ie no use of binds). Which is a shame as binds cause other
issues, with histograms and correlation!
• Some people argue that, on DWs, you should delete all user-table (and
index) stats, lock the tables and use dynamic sampling alone.
• I repeat, Not suitable for OLTP-like SQL activity.
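The two ways of setting the level mentioned above, sketched with a placeholder table and made-up columns (STATUS, REGION).

```sql
-- Session level; the same parameter can be set system-wide.
ALTER SESSION SET optimizer_dynamic_sampling = 4;

-- Statement level, for one table alias, via a hint.
SELECT /*+ dynamic_sampling(o 4) */ COUNT(*)
FROM   app_owner.orders o
WHERE  o.status = 'OPEN'
AND    o.region = 'UK';
```

The hint is the lower-risk option on a site that will not change the database-wide setting, as only the statements you choose pay the extra parse time.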
Other Extreme - Freezing Stats
• DBMS_STATS.LOCK_TABLE_STATS and LOCK_SCHEMA_STATS
allow you to lock the stats. Can’t lock at partition or index.
• The automated stats collection will leave those segments (and their
dependents) alone unless you use the ‘force=true’ parameter.
• Individual gather/set/delete/import stats statements on a locked
table will cause an error.
• Locking populated stats will help preserve an execution plan, which
gives stability but prevents improvements. However, you can’t stop
time.
• If you delete the stats first and lock them, they stay empty, allowing
dynamic sampling to prevail.
• If you lock the stats, people or processes gathering stats THAT
SHOULD NOT BE will fail. Can re-enable via the force parameter
Locking Stats
• If you lock the stats on a table then Oracle will not collect them
anymore.
• If you lock a table with empty stats, the CBO will dynamically sample
the table when the SQL statement is parsed. This is fine for long
running SQL like on Data Warehouses and utterly unacceptable for
OLTP systems where you need the answer in 100ms.
• If you have gathered stats on an empty table and lock the stats you
are in a world of pain as the CBO will think there are no rows and base
its plan on that.
• If you gather stats when you know that they will be good, and then
lock them, then your execution plans will probably be good.
• If you have a complex situation but know what stats are needed, you
can SET those stats and lock the table. No more stats will be
gathered. Everything will be fine until the world moves on.
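A sketch of the locking variations described on the last two slides. APP_OWNER, ORDERS and STG_ORDERS are placeholder names.

```sql
BEGIN
  -- Gather when the data is known to be representative, then freeze.
  DBMS_STATS.GATHER_TABLE_STATS(ownname => 'APP_OWNER', tabname => 'ORDERS');
  DBMS_STATS.LOCK_TABLE_STATS(ownname => 'APP_OWNER', tabname => 'ORDERS');

  -- Or: delete first and then lock, so the stats stay empty and
  -- dynamic sampling is used. The delete must come before the lock.
  DBMS_STATS.DELETE_TABLE_STATS(ownname => 'APP_OWNER', tabname => 'STG_ORDERS');
  DBMS_STATS.LOCK_TABLE_STATS(ownname => 'APP_OWNER', tabname => 'STG_ORDERS');

  -- A deliberate re-gather on a locked table needs force => TRUE;
  -- without it the call raises an error.
  DBMS_STATS.GATHER_TABLE_STATS(
    ownname => 'APP_OWNER',
    tabname => 'ORDERS',
    force   => TRUE);
END;
/

-- Which tables are locked? STATTYPE_LOCKED is null when unlocked.
SELECT table_name, stattype_locked
FROM   dba_tab_statistics
WHERE  owner = 'APP_OWNER'
AND    stattype_locked IS NOT NULL;

BEGIN
  -- When the world has moved on, let normal gathering resume.
  DBMS_STATS.UNLOCK_TABLE_STATS(ownname => 'APP_OWNER', tabname => 'ORDERS');
END;
/
```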
Histograms
• Histograms and bind variables do not mix well on Oracle 10 due to
bind peeking. Oracle 11 is supposed to fix this.
• In essence, if the first bind values seen by the parse are not typical (or
match a low-cardinality plan) the plan chosen can perform very poorly
for larger returned data sets.
• These bad plans can get stuck in the SGA. E.g. a statement takes 11
minutes to run and is kicked off every 10 minutes, so you can’t get
the code out of the SGA.
• Histograms increase the time taken to gather stats significantly.
Oracle 10.1 and 10.2 massively over-gather histograms.
• If you write your own code you can stop histograms being gathered.
• But Histograms are really good for some code.
• If only there was a “parse” hint
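A sketch of how your own gather code can stop histograms, or allow one only where it helps. The table and the skewed column STATUS are hypothetical. The two calls are alternatives, not steps: run one or the other.

```sql
BEGIN
  -- Alternative 1: SIZE 1 means one bucket per column, i.e. no histograms.
  DBMS_STATS.GATHER_TABLE_STATS(
    ownname    => 'APP_OWNER',
    tabname    => 'ORDERS',
    method_opt => 'FOR ALL COLUMNS SIZE 1');

  -- Alternative 2: no histograms except on the one column that needs one.
  DBMS_STATS.GATHER_TABLE_STATS(
    ownname    => 'APP_OWNER',
    tabname    => 'ORDERS',
    method_opt => 'FOR ALL COLUMNS SIZE 1 FOR COLUMNS SIZE 254 STATUS');
END;
/

-- See which columns currently have histograms.
SELECT column_name, histogram, num_buckets
FROM   dba_tab_col_statistics
WHERE  owner = 'APP_OWNER'
AND    table_name = 'ORDERS'
AND    histogram <> 'NONE';
```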
Further Information
• PL/SQL Packages and Types Reference, DBMS_STATS
• Blogs - Jonathan Lewis (generally), Doug Burns (recent series on
partitions and stats), Christian Antognini, follow links to others
• My Blog, I intend to do some more posts on stats gathering.
• Email me and ask, I’m happy to answer general questions if I can.
• Your system.
Table Sample Size
• Generally speaking, the larger the table the smaller the sample size
you need for “good enough” statistics.
• I usually decide on a percentage sample size derived from the number
of blocks in the table, aiming for 1000 to 10,000 blocks.
• AUTO_SAMPLE_SIZE samples 0.01% then 0.1% (or similar) and sees if the
stats change significantly, then increases the percentage size until the
stats are stable or a compute is quicker.
• AUTO_SAMPLE_SIZE sounds clever but works out, in practice, to usually be
inefficient. It just does not cope with lots of large segments. If you think
about it, it almost guarantees gathering at a sample size larger than
you need before stopping.
• May want to gather the table only with no CASCADE to indexes as
Oracle over-samples indexes with the cascade option (I hardly ever
use CASCADE in replacements for the auto job)
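The "percentage derived from the number of blocks" idea can be sketched as below. The target of 5000 blocks is an assumed value inside the 1000 to 10,000 range above, and APP_OWNER and ORDERS are placeholder names. For a table of 1,000,000 blocks this works out as 100 * 5000 / 1,000,000 = 0.5%.

```sql
DECLARE
  c_target_blocks CONSTANT NUMBER := 5000;  -- assumed target
  v_blocks NUMBER;
  v_pct    NUMBER;
BEGIN
  -- Size of the table, summed over all its partitions if it has any.
  SELECT SUM(blocks)
  INTO   v_blocks
  FROM   dba_segments
  WHERE  owner = 'APP_OWNER'
  AND    segment_name = 'ORDERS'
  AND    segment_type LIKE 'TABLE%';

  -- Small tables end up at 100%; estimate_percent must lie between
  -- 0.000001 and 100.
  v_pct := LEAST(100, GREATEST(0.000001, 100 * c_target_blocks / v_blocks));

  DBMS_STATS.GATHER_TABLE_STATS(
    ownname          => 'APP_OWNER',
    tabname          => 'ORDERS',
    estimate_percent => v_pct,
    block_sample     => TRUE,    -- sample blocks, not rows
    cascade          => FALSE);  -- indexes handled separately
END;
/
```

Because the percentage falls as the table grows, the gather time stays roughly flat however big the segment gets, which is the point of the approach.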
I over-ran, didn’t I?
And I kept it to 45 slides.
Oh Well, next SIG...