11g Hang Detection
Jeremiah Wilton
http://www.ora-600.net
11g Diagnosability Infrastructure
ADR / IPS (adrci, DIAG, dia0)
Prelim connection (excellent)
Hang detection (cyclic, acyclic)
Test case builder
Health checks (on-demand, reactive)
Interfaces to DB/Grid Control
Advisers (SQL Repair)
Cyclic Hangs
Sessions deadlocked through a chain of resources
Generates ORA-00060, like a normal deadlock
Demo...
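As a rough illustration of what "deadlocked through a chain of resources" means, here is a minimal sketch of cycle detection in a wait-for graph. It is a toy model, not Oracle's implementation: the function name `find_deadlock_cycle` and the `waits_for` mapping (waiting session id to blocking session id) are hypothetical, and each session is assumed to wait on at most one blocker.

```python
def find_deadlock_cycle(waits_for):
    """Return the sessions that form a wait cycle, or None if there is none.

    waits_for maps a waiting session id to the session id blocking it
    (hypothetical structure; one blocker per waiter in this toy model).
    """
    for start in waits_for:
        seen = []
        current = start
        # Follow the blocker chain until it ends or revisits a session.
        while current in waits_for and current not in seen:
            seen.append(current)
            current = waits_for[current]
        if current in seen:
            # The cycle starts where the chain re-entered itself.
            return seen[seen.index(current):]
    return None


if __name__ == "__main__":
    # Three sessions each waiting on the next: a cyclic hang.
    print(find_deadlock_cycle({101: 102, 102: 103, 103: 101}))
    # A chain that ends at a session that is not waiting: no cycle.
    print(find_deadlock_cycle({201: 202, 202: 203}))
```

A chain that ends at a non-waiting session returns `None`; that is the acyclic shape covered on the next slide.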
Acyclic Hangs
11g harder to hang
Chains of resource holding
Steady (seconds_in_wait increasing)
Takes a minimum of 300 seconds
RAC only
Quietly generates a trace file (no alert log message)
<SID>_DIA0_<PID>_<n>.trc
test01_DIA0_3867_1.trc
Message in DIA0 trace “About to produce dumps”
Demo
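Because detection is silent in the alert log, one option is to watch the trace directory yourself. The sketch below assumes only what the slide states: the `<SID>_DIA0_<PID>_<n>.trc` naming pattern and the "About to produce dumps" message. The function names and the directory argument are hypothetical, and the match is case-insensitive because file name case may differ between systems.

```python
import re
from pathlib import Path

# Pattern taken from the slide: <SID>_DIA0_<PID>_<n>.trc (case-insensitive).
TRACE_NAME = re.compile(
    r"^(?P<sid>.+)_dia0_(?P<pid>\d+)_(?P<seq>\d+)\.trc$", re.IGNORECASE
)
# Message the slide reports seeing in the DIA0 trace.
DUMP_MARKER = "About to produce dumps"


def parse_trace_name(filename):
    """Split a DIA0 trace file name into sid, pid and sequence, or return None."""
    match = TRACE_NAME.match(filename)
    if match is None:
        return None
    return {
        "sid": match.group("sid"),
        "pid": int(match.group("pid")),
        "seq": int(match.group("seq")),
    }


def find_hang_traces(trace_dir):
    """Return names of DIA0 trace files in trace_dir that contain the marker."""
    hits = []
    for path in sorted(Path(trace_dir).iterdir()):
        if not path.is_file() or parse_trace_name(path.name) is None:
            continue
        if DUMP_MARKER in path.read_text(errors="replace"):
            hits.append(path.name)
    return hits


if __name__ == "__main__":
    print(parse_trace_name("test01_DIA0_3867_1.trc"))
```

Pointing `find_hang_traces` at the instance's trace directory from a scheduled job would give the alerting that the alert log does not.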
My hopes for hang detection
Amazon's model
Customizable / scriptable
When to dump
What to dump
Integrate with ADR
Real-world hangs
Hang with library cache pin pile-up (typical)
Hang with block change tracking (BCT) writer I/O hanging (bug?)
Database completely disabled, but the failure does not qualify as a hang
Prelim connection
Can connect to a totally hung instance
Use your own or attach to an existing process
Hanganalyze, systemstate, errorstack
Run queries on x$ structs - results to screen or trace
Pull ASH data
Can you do prelim from non-SQL*Plus?
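The prelim workflow can be scripted. The sketch below only assembles the text of a session; it does not connect to anything. The helper name `build_prelim_session` is hypothetical, and the dump levels (3 for hanganalyze and errorstack, 266 for systemstate) are commonly used values assumed here, not taken from the slides. Check the levels for your version with Oracle Support before running them against production.

```python
def build_prelim_session(ospid=None, hanganalyze_level=3, systemstate_level=266):
    """Return (shell_command, oradebug_lines) for a prelim diagnostic session.

    ospid: OS process id of an existing Oracle process to attach to;
           None means use your own process (oradebug setmypid).
    The levels are assumed defaults, not values from the presentation.
    """
    shell_command = 'sqlplus -prelim "/ as sysdba"'
    if ospid is None:
        lines = ["oradebug setmypid"]
    else:
        lines = [f"oradebug setospid {ospid}"]
    lines += [
        "oradebug unlimit",
        f"oradebug hanganalyze {hanganalyze_level}",
        f"oradebug dump systemstate {systemstate_level}",
    ]
    if ospid is not None:
        # An errorstack is only interesting for the attached target process.
        lines.append("oradebug dump errorstack 3")
    lines.append("oradebug tracefile_name")
    return shell_command, lines


if __name__ == "__main__":
    command, script = build_prelim_session(ospid=3867)
    print(command)
    print("\n".join(script))
```

Keeping the shell command separate from the oradebug lines makes it easy to paste the lines into an already open prelim session.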
Other data sources (when you can log in)
ASH – your first stop for performance
Hanganalyze
Errorstack
Talk back
We need to report typical events to Oracle
Support is a sufficient route (enhancement request)
Bring them good data
Tell me or use your sales channels to get it on the agenda