Tracking Index Usage in SQL 2012 and 2014

Download Report

Transcript Tracking Index Usage in SQL 2012 and 2014

Tracking Index Usage in
SQL 2012 and 2014
Obligatory “Who Am I” slide

Ken Schwartz

Doing Database-ish stuff for 15+ years

Data Architect at CommerceHub

Particular areas of interest are query tuning and index tuning

Twitter: nope, I don’t tweet

Blog: nope, I don’t blog

Facebook: nope, not that either

Feel free to email me: [email protected]
Goal: Develop a way to track index
usage over a period of time.

Typical question for DBA to address: What indexes do we have that we
don’t really need?

Obvious answer: Any index that we update, but never read from

Fuzzier answer: Any index that we update WAAAAAY more than we
read from
 How

much more is “WAAAAAY” more?
To answer, we need to find information on index usage. This is easy –
do a quick AltaVista search

Hmm…AltaVista no longer exists

Now what do we do?

Lycos search!!!
sys.dm_db_index_usage_stats

DMV to track index usage supplied by Microsoft. THANKS!

Here’s the code I turn to…I stole it from the internet and
tweaked it about 5 years ago
 Just
run this in whatever database you want to see
index usage

For ease of use, I turned this into a proc, “ix_rw_ratio”
(index_read_write_ratio) – just pass in name of database
you want info for – can run from any database

Hmmm…not very exciting. Well, server just restarted
when I booted up, so DMV’s were reset – because that’s
how DMV’s work
sys.dm_db_index_usage_stats
(continued)

Let’s run some queries and recheck the DMV info

Run some for scans, some for seeks

And now recheck

OK…so everything is exactly as we expected

Why are you making me watch this, Ken?
Accumulating Index Usage Stats

OK, finally…the point of this all…

Easy enough:
 Create
a table to log index usage
 Create
a proc that hits the DMV, and for each index,
upsert the usage in the log table
 Create
SQL Agent Job that calls the proc
 Schedule
it to run every…day…week…whatever
 Except…ummm…has
last run?
there been a server restart since
sys.dm_os_sys_info





Microsoft to the rescue again. This DMV gives us exactly
what we need
SELECT sqlserver_start_time FROM sys.dm_os_sys_info
Now make sure your proc checks if there’s been a server
restart since last run and if so, create a new record for all
indexes (if not, just update existing records)
When querying the logging table, aggregate as needed
So far, really nothing new – this is pretty much the classic
solution
I repeat myself when under stress…

OK…so everything is exactly as we expected

Why are you making me watch this, Ken?

Let’s look at one more example now…look at what
happens after we run some index maintenance (which we
all do, right? Well, that’s a whole ‘nother debate there)
Whaaaa????

What happened to the index usage info?

“That never happened before” – at least, not prior to
2012

What does BOL say? From
https://msdn.microsoft.com/enus/library/ms188755.aspx
(2014 page on
sys.index_usage_stats): “The counters are initialized to
empty whenever the SQL Server (MSSQLSERVER) service is
started. In addition, whenever a database is detached or
is shut down (for example, because AUTO_CLOSE is set to
ON), all rows associated with the database are removed.”

Same exact entry for 2012
Quick! Someone tell Microsoft so they
can fix this bug

First reported 4/27/2012 on connect.microsoft.com by
Joe Sack (sql skills)

91 to 0 votes to fix

So, they must working pretty hard to take care of this,
right?

“Status: Closed as won’t fix”
So…back to index usage logging

Our index maintenance job is set to do an index rebuild when
fragmentation hits a certain level (20%? 30%?...whatever)

As soon as that happens, we have lost all our DMV numbers for
that index. Just like there was a restart, except...


Harder to track, since it is index by index

Harder to track, since there is no internal info tracking when
an index was rebuilt (or, if there is, I couldn’t find it)
Now a heavily used index – especially if on a table with a lot of
churn – can look like it was hardly ever used. So we risk
dropping exactly the indexes we need most.

Yay!
High Level Steps for the Fix (not “The Fixx”)

Make sure your periodic index maintenance rebuild job logs its actions

Create a table to hold this info

Create a table to hold index usage stats

Create a staging table to hold stats since last time the stats were
gathered (i.e., delta)

The proc that runs as part of the maintenance job has to update usage
info

The proc will run BEFORE the index maintenance step

The proc will capture the usage delta for each index

For each delta, add to the appropriate record, except…

If we’ve had a restart, just add a new record for each index
(just like old style)
A picture is worth 1000 words
SQL Agent Job

Step 1: Track Index Usage

Step 2: Index Maintenance (Skip/Reorg/Rebuild)
A picture is worth 1000 words

Step 1: Track Index Usage

SQL Agent Job
Staging Table Swaps Old Data For New Data
Pre
IX1

During
Prior
Current
145
151
IX1


Prior
Current
151
???
IX1
Prior
Current
151
172
Unless there was a rebuild of index 1 since last time
Pre
IX1
After
During
Prior
Current
145
151
IX1
After
Prior
Current
0
???
Either way, the delta (= usage since last time) = 21
Step 2: Index Maintenance (Skip/Reorg/Rebuild)
IX1
Prior
Current
0
21
More Details…the tables
CREATE TABLE dbo.index_maintenance_log
 This is updated as part of the job that actually does the
rebuild/reorganize


action_taken is one of skip/rebuild/reorganize

maintenance_job_time is when the nightly/weekly/etc job
started

start_date is when the rebuild/reorganize started

end_date is when the rebuild/reorganize completed
The Tables (continued)

CREATE TABLE [dbo].[index_usage_stats]

This holds the info on usage – the info we use to evaluate
whether or not to drop indexes

Pretty straightforward – just capture everything the DMV’s
offer
Additionally, two more columns:


Start_time (which hold when the server last restarted)

Capture_time (which is when the records was
created/updated)
The Tables (continued)

CREATE TABLE [dbo].[index_usage_stats_staging]

The “_user” and “_system” columns all get defaults of 0

As the “staging” implies, this is where we hold info on the
usage, from last run until this run

For a given index, we will either add the usage number, or
add a new row…more later

The table is not truncated/refilled with each run, but
instead the values get over-written each run
So, what does the proc do – a bit more detail?

Make sure that the staging table has rows for all indexes/heaps that have
usage

Add records to the usage stats table for any index that had no previous usage
(including new indexes)

Has the server been restarted since our last capture?

Yes: Add a new record for every index that has had any usage since
restart – pulling the info directly from the DMV
(dm_db_index_usage_stats)

No:

Reset staging table’s “prior_run” numbers to its “current_run”
numbers

For any indexes that have been rebuilt since our last run, over-write
the new “prior_run” values with “0”

Set the “current_run” numbers to those from DMV

Update usage stats table by getting the delta from the staging table,
and add the delta to the existing value
So, let’s see it work…

OK, let’s load up some demos here. Hang on to your hats.
What does the proc look like? Let’s see
the code already…

Ok. Here it is…
Questions?
OK…now my turn

This presentation made frequent use of sys.dm_db_index_usage_stats. But
that is not the only stats-related DMV. There is also
sys.dm_db_index_operational_stats

Can anyone tell me the difference between them?

Usage_stats tells us how many times an index shows up in the execution plan of
code that gets executed

Operational_stats tell us how many times operations on the index were executed

So, for example, suppose we have an index that shows up in a query plan as
the inner part of a loop join. In usage stats, the number will bump up by 1
when the SQL is run

But in operational stats, the number will increment by the amount of
iterations in the loop.

There is one catch…umm…one cache???