Monday, March 19, 2012

ORADEBUG How To


What is ORADEBUG:

ORADEBUG is a debugger tool, can be used for (tracing any session, dump DB memory structure, suspend/resume a session and the most useful use is to analyze the hang when an instance is hanging by creating a report shows the blocked/blocker sessions).

How to use ORADEBUG:

If you are lazy to do it yourself, there is a script can do it for you: 👍
https://www.dropbox.com/s/qrpbsr12avc675j/oradebug.sh?dl=0

If you want to do it yourself then follow these steps:

Login to SQLPLUS AS SYSDBA or in case you cannot login in a normal way (e.g. database is hanging and it's not accepting new sessions) you can use the prelim option which will allow you to connect to SGA without opening a session. Note: prelim option cannot be used in an already connected session.

# sqlplus /nolog
SQL> set _prelim on
SQL> conn / as sysdba

or:
# sqlplus -prelim "/ as sysdba"


Second set your PID:

SQL> oradebug setmypid

To perform a DB cluster wide analysis:

SQL> oradebug setinst all

To make the tracefile size unlimited:

SQL> oradebug unlimit

To Run the Hang Analysis: <The most useful use for ORADEBUG>

SQL> oradebug -g def hanganalyze <level#>

-> This will create a logfile and will give you it's path.

-Available levels are:
10     Dump all processes (IGN state)
5      Level 4 + Dump all processes involved in wait chains (NLEAF state)
4*    Level 3 + Dump leaf nodes (blockers) in wait chains (LEAF,LEAF_NW,IGN_DMP state)
3      Level 2 + Dump only processes thought to be in a hang (IN_HANG state)
1-2  Only HANGANALYZE basic output, no process dump at all

*Level 4 and above are resource intensive + large output, which can impact the instance performance.
and thus Oracle recommends to use level 3: 
SQL> ORADEBUG -g def hanganalyze 3


How to read HANGANALYZE trace:

Generally: Read the log and search for these keywords "final blocker" if not found search for "blocked by" & "blocking".

Note: Open chains consist of processes waiting for a resource held by a blocker. These are not true hangs and can be resolved by killing the holding session and allowing the blocked processes to proceed. This can be useful when a process is holding a latch, a frequent example is the library latch, and other processes are waiting for the latch.


Check section "State of ALL nodes" At the end of the HANGANALYZE log :

The data under this line means:
([nodenum]/cnode/sid/sess_srno/session/ospid/state/[adjlist]):

"cnode" column means: cluster node number
"sid" is the Oracle session ID!
"sess_srno" column means: SERIAL#
"ospid" The operating system process ID
"state" columns have the following values:

A session with status "LEAF" or "LEAF_NW" means a session blocking others "Good candidate for kill" specially "LEAF".
Session with status "NLEAF" means waiting session and appear to be hanging. 
Session with status "IN_HANG" means there is a problem!
Session with status "IGN" or "IGN_DMP" means the session is idle.


Caution: Do not kill critical processes like SMON or PMON or the instance will get terminated.

Other Commands:

To Check the current trace file name:

SQL> oradebug tracefile_name

Flush any pending writes to the trace file and close it:

SQL> oradebug flush
SQL> oradebug close_trace

To trace a specific session with it's OS PID:

SQL> oradebug setospid

To trace a specific session with it's Oracle PID:

SQL> oradebug setorapid

I've created a script that can make this job easy:
https://www.dropbox.com/s/qrpbsr12avc675j/oradebug.sh?dl=0

Using ORADEBUG to freeze/suspend a session:

First, you have to get the PID for that session and set ORADEBUG as it:
SQL> oradebug setospid 12518

*12518 is the PID for the session you want to kill

Second Freeze/Hang the session even it running in a middle of something it will freeze:
SQL> oradebug suspend

To UnFreeze/Resume the session to continue its work:
SQL> oradebug resume


Trace Oracle Clusterware:

Trace CRS events:

SQL> oradebug dump crs 3

Trace CSS behaviour:

SQL> oradebug dump css 3

Trace OCR:

SQL> oradebug dump ocr 3

Using ORADEBUG to poke/wake up the SMON or PMON:

In case the SMON process is not timely cleaning up temporary segments, you can poke it by waking it up using oradebug:

First: you need to know the PID of SMON process:
SQL> select pid from v$process p, v$bgprocess b where b.paddr = p.addr and name='SMON';
       PID
----------
13
Second: Wake up the SMON using its pid:
SQL> oradebug wakeup 13

Same thing for PMON, if the PMON is not timely cleaning up the killed sessions traces & locks, you can poke it using oradebug:

First: Get the PMON's PID using this statement:
SQL> select pid from v$process p, v$bgprocess b where b.paddr = p.addr and name='PMON';
       PID
----------
2
Second: Wake up the PMON using its PID:
SQL> oradebug wakeup 2

Remember:
-You can execute oradebug from more than one sqlplus sessions.

-ORADEBUG utility is poorly documented by Oracle, because it's dangerous and can do a potential damage to the database, this is why it should be used under Oracle support supervision.

Reference: http://mgogala.byethost5.com/oradebug.pdf

No comments:

Post a Comment