Friday, February 14, 2014

Oracle Database Monitoring Script

The backbone monitoring solution for Oracle databases in my environment is dbalarm.sh script. I've developed it many years back and it proved itself successful on each complex/critical environment I've used on.

This script will report/monitor the following:
  > Report ORA- errors and TNS- errors in ALL databases ALERT LOGs.
  > Report ORA- errors and TNS- errors in ALL Listeners ALERT LOGs.
  > Monitor CPU utilization. (along with detailed analysis of top oracle sessions).
  > Monitor TABLESPACES utilization.
  > Monitor FLASH RECOVERY AREA (FRA) utilization.
  > Monitor ASM Disk Groups utilization.
  > Monitor mounted FILESYSTEMs.
  > Monitor database BLOCKING LOCKS.

This script is coded to send you the new errors that appear since the last script execution,
in other words, it will not report already reported errors unless they get logged again in the log files, and this is one of the key strengths in this script.


How dbalarm script works?

This script is very smart and very easy to use, just follow these three steps:

Step 1:
Download the script from this link:

Step 2:
Open the script and change the E-mail address to your email address in the line# 22
MAIL_LIST="youremail@yourcompany.com"

Note: sendmail service should be configured on your server to allow emails to be sent out from the machine.

Step 3:
By Oracle user:
In the crontab, schedule the script to run at least every 5 minutes:
# crontab -e
#Add this line:
*/5 * * * * /home/oracle/dbalarm.sh
Note: /home/oracle/dbalarm.sh is the full path that points to dbalarm script where /home/oracle is the Oracle user home directory.

In case you will schedule this script to run from root user crontab:
# crontab -e
#Add this line to schedule the run of dbalarm.sh script every 5minutes:
*/5 * * * * su - oracle -c /home/oracle/dbalarm.sh
Now the only thing remaining is to set back and relax and the script will report you all errors and all breached predefined threshold.

One thing more, you can adjust the threshold inside the script as per your preferences by altering the below red colored values in the THRESHOLDS section inside the script:

# #########################
# THRESHOLDS:
# #########################
# Modify the THRESHOLDS to the value you prefer:

FSTHRESHOLD=95 # THRESHOLD FOR FILESYSTEM %USED [OS]
CPUTHRESHOLD=95 # THRESHOLD FOR CPU %UTILIZATION [OS]
TBSTHRESHOLD=95 # THRESHOLD FOR TABLESPACE %USED [DB]
FRATHRESHOLD=95 # THRESHOLD FOR FRA %USED [DB]
ASMTHRESHOLD=95 # THRESHOLD FOR ASM DISK GROUPS [DB]
BLOCKTHRESHOLD=1         # THRESHOLD FOR BLOCKED SESSIONS#[DB]

Not only this! you can control the script features by enabling/disadling them in this section by using Y to enable or N to disable the feature:

# Enable/Disable Checking Listeners: [Default Enabled]
CHKLISTENER=Y

# Enable/Disable Database Down Alert: [Default Enabled]
CHKOFFLINEDB=Y

# Enable/Disable Goldengate Alert: [Default Disabled]
CHKGOLDENGATE=N

# Break down to DB Active sessions when CPU hit the threshold: [RECOMMENDED TO SET =N on very busy environments]
CPUDIGMORE=Y            # RUN more checks on DB side     [DB]

Also the script gives you the option to exclude specific database, tablespace, ASM Diskgroup, filesystm from having the scrip run against:
Here are the controls which you can edit:

# #######################################
# Excluded INSTANCES:
# #######################################
# Here you can mention the instances dbalarm will IGNORE and will NOT run against:
# Use pipe "|" as a separator between each instance name.
# e.g. Excluding: -MGMTDB, ASM instances:

EXL_DB="\-MGMTDB|ASM"                                                 #Excluded INSTANCES [Will not get reported offline].

# #########################
# Excluded TABLESPACES:
# #########################
# Here you can exclude one or more tablespace if you don't want to be alerted when they hit the threshold:
# e.g. to exclude "UNDOTBS1" modify the following variable in this fashion without removing "donotremove" value:
# EXL_TBS="donotremove|UNDOTBS1"
EXL_TBS="donotremove"

# #########################
# Excluded ASM Diskgroups:
# #########################
# Here you can exclude one or more ASM Disk Groups if you don't want to be alerted when they hit the threshold:
# e.g. to exclude "FRA" DISKGROUP modify the following variable in this fashion without removing "donotremove" value:
# EXL_DISK_GROUP="donotremove|FRA"
EXL_DISK_GROUP="donotremove"

# ################################
# Excluded FILESYSTEM/MOUNT POINTS:
# ################################
# Here you can exclude specific filesystems/mount points from being reported by dbalarm:
# e.g. Excluding: /dev/mapper, /dev/asm mount points:

EXL_FS="\/dev\/mapper\/|\/dev\/asm\/"                          #Excluded mount points [Will be skipped during the check].


Moreover, you can exclude specific ORA- TNS- errors in case they can be tolerated, I've already excluded some of minor errors to not get bothered about them:

# #########################
# Excluded ERRORS:
# #########################
# Here you can exclude the errors that you don't want to be alerted when they appear in the logs:
# Use pipe "|" between each error.

EXL_ALERT_ERR="ORA-2396|TNS-00507|TNS-12502|TNS-12560|TNS-12537|TNS-00505"              #Excluded ALERTLOG ERRORS [Will not get reported].
EXL_LSNR_ERR="TNS-00507|TNS-12502|TNS-12560|TNS-12537|TNS-00505"                        #Excluded LISTENER ERRORS [Will not get reported].
EXL_GG_ERR="donotremove"             #Excluded GoldenGate ERRORS [Will not get reported].


This script has been tested on Linux and SUN environments.

DISCLAIMER: THIS SCRIPT IS DISTRIBUTED IN THE HOPE THAT IT WILL BE USEFUL, BUT WITHOUT ANY WARRANTY. IT IS PROVIDED "AS IS".

If you're looking for a script to check the database health check on a daily basis, please follow this link: 
http://dba-tips.blogspot.ae/2015/05/oracle-database-health-check-script.html

More and more of smart and "easy to use" scripts for database administration tasks can be found in DBA Bundle in this link:
http://dba-tips.blogspot.ae/2014/02/oracle-database-administration-scripts.html

In case the download link is not working, please find below the full code:



4 comments:

hari said...

Hi GIThub,
I think this script DBAALARM is for Linux only.Is it compatible with AIX also?
Hari

Balajee said...

Excellent work. This is working fine

Thanks,
Balajee

Anonymous said...

Hi, I'm using your script. But ORA errors that get through the mail can not read.For example: ORA-01110: ТЮИК ДЮММШУ 228.

Mahmmoud ADEL said...

Hi Hari,
Actually I didn't test this script on an AIX environment, so I cannot give you an answer, but I've a doubt that all the functions inside the script will work there.