11g Hang Detection

Jeremiah Wilton
http://www.ora-600.net
11g Diagnosability Infrastructure

ADR / IPS (adrci, DIAG, dia0)

Prelim connection (excellent)

Hang detection (cyclic, acyclic)

Test case builder

Health checks (on-demand, reactive)

Interfaces to DB/Grid Control

Advisers (SQL Repair)
Cyclic Hangs

Sessions deadlocked through chain of resources

Generates ORA-60, like normal deadlocks

Demo...
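
The difference between a cyclic and an acyclic hang can be sketched as a walk along a wait chain. This is only an illustration, not how DIA0 does it: the `waits_for` map, the session ids and the function name are all hypothetical, and each session is assumed to wait on at most one blocker.

```python
from typing import Dict, List, Tuple


def find_wait_chain(waits_for: Dict[int, int], start: int) -> Tuple[List[int], bool]:
    """Follow blockers from `start`.

    `waits_for` maps a waiting session id to the session id blocking it
    (hypothetical input; an assumption for this sketch).
    Returns (chain, is_cyclic): the sessions visited in order, and whether
    the chain loops back onto a session already in the chain.
    """
    chain = [start]
    seen = {start}
    current = start
    while current in waits_for:
        current = waits_for[current]
        if current in seen:
            # The chain closed on itself: a deadlock-style (cyclic) hang.
            return chain, True
        chain.append(current)
        seen.add(current)
    # The chain ended at a session that waits on nobody: acyclic.
    return chain, False


if __name__ == "__main__":
    # Three sessions blocking each other in a ring.
    print(find_wait_chain({101: 102, 102: 103, 103: 101}, 101))
    # A chain that ends at a final holder that is not waiting on anyone.
    print(find_wait_chain({201: 202, 202: 203}, 201))
```

In the acyclic case the last session in the chain is the one worth looking at first, since everything behind it is queued on it.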
Acyclic Hangs

11g harder to hang

Chains of resource holding

Steady (seconds_in_wait increasing)

Takes min 300 seconds

RAC only

Quietly generates a trace file (no alert log message)

Trace file: <SID>_DIA0_<PID>_<n>.trc (e.g. test01_DIA0_3867_1.trc)

Message in DIA0 trace: “About to produce dumps”

Demo
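
Since nothing is written to the alert log, one way to notice a detected hang is to watch the trace directory yourself. A minimal sketch, assuming the <SID>_DIA0_<PID>_<n>.trc naming and the “About to produce dumps” message shown on the slide; the function name and the directory you pass in are hypothetical, and the match is case-insensitive because the case of the file name may differ on your system.

```python
import re
from pathlib import Path
from typing import List

# File name pattern from the slide: <SID>_DIA0_<PID>_<n>.trc
DIA0_TRACE = re.compile(
    r"^(?P<sid>.+)_dia0_(?P<pid>\d+)_(?P<n>\d+)\.trc$", re.IGNORECASE
)
# Marker text quoted on the slide.
MARKER = "About to produce dumps"


def find_hang_traces(trace_dir: str) -> List[str]:
    """Return names of DIA0 trace files in `trace_dir` that contain MARKER."""
    hits = []
    for path in sorted(Path(trace_dir).iterdir()):
        if not path.is_file():
            continue
        if not DIA0_TRACE.match(path.name):
            continue
        # Trace files can contain odd bytes; replace rather than fail.
        text = path.read_text(errors="replace")
        if MARKER in text:
            hits.append(path.name)
    return hits
```

Run it from cron or a monitoring agent against the instance's trace directory and alert on any non-empty result.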
My hopes for hang detection

Amazon's model

Customizable / scriptable

When to dump

What to dump

Integrate with ADR
Real-world hangs

Hang with LC Pin pile-up (typical)

Hang with BCT writer I/O hanging (bug?)

Database completely disabled but failure does not qualify as a hang
Prelim connection

Can connect to a totally hung instance

Use your own or attach to an existing process

Hanganalyze, systemstate, errorstack

Run queries on x$ structs - results to screen or trace

Pull ASH data

Can you do prelim from non-SQL*Plus?
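
The prelim steps above can be scripted so the commands are ready before the hang happens. A minimal sketch that only builds the text to feed to SQL*Plus and does not run anything; the function name is hypothetical, and the dump levels (hanganalyze 3, systemstate 266) are common choices assumed here, not values from the slides.

```python
from typing import List, Optional

# Command line for a prelim connection as SYSDBA.
PRELIM_ARGV: List[str] = ["sqlplus", "-prelim", "/ as sysdba"]


def build_prelim_script(
    attach_ospid: Optional[int] = None,
    hanganalyze_level: int = 3,      # assumed level, adjust to taste
    systemstate_level: int = 266,    # assumed level, adjust to taste
) -> str:
    """Build the oradebug commands to paste or pipe into a prelim session."""
    lines = []
    if attach_ospid is None:
        # Use your own process.
        lines.append("oradebug setmypid")
    else:
        # Attach to an existing process by its OS pid.
        lines.append(f"oradebug setospid {attach_ospid}")
    lines.append("oradebug unlimit")
    lines.append(f"oradebug hanganalyze {hanganalyze_level}")
    lines.append(f"oradebug dump systemstate {systemstate_level}")
    # Print where the dumps were written.
    lines.append("oradebug tracefile_name")
    lines.append("exit")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(" ".join(PRELIM_ARGV))
    print(build_prelim_script())
```

Keeping the script as plain text means a human can review it before anything touches a hung instance.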
Other data sources when you can log in

ASH – your first stop for performance

Hanganalyze

Errorstack
Talk back

We need to report typical events to Oracle

Support is a sufficient route (enhancement request)

Bring them good data

Tell me or use your sales channels to get it on the agenda