Wednesday, November 1, 2017

Shell Script To Monitor/Check The Lag/Sync Status Between Primary And Standby Oracle Database

Script Description:
This Linux shell script runs from the PRIMARY DB server to monitor and checks the lag/synchronization status between Primary and Standby Oracle DBs and reports the lag [if happened] to the user.

Download the script:
https://www.dropbox.com/s/a9k3rn928vj3ooc/check_standby_lag.sh?dl=0

How to use the script:
 
First: On the Primary DB: Create a DBA user to be used inside the script to check the dataguard status:
e.g. This is just n example, you can create the user with a proper name and password.
 
SQL> create user DBA_USER identified by "ABC123" profile;
SQL> grant dba to
DBA_USER;
 

Second: Unfortunately, this script will not be able to know what is the primary, standby database names itself; so you must manually modify the variables' values in orange color in the Variables section to match your environment: [Those variables are at the begging of the script]

# ######################################
# Variables MUST be modified by the user: [Otherwise the script will not work]
# ######################################

# Here you replace youremail@yourcompany.com with your Email address:
EMAIL="youremail@yourcompany.com"

# Replace ${ORACLE_SID} with the Primary DB instance SID:
ORACLE_SID=${ORACLE_SID}

# Replace STANDBY_TNS_ENTRY with the Standby Instance TNS entry you configured in the primary site tnsnames.ora file: e.g. orcl_standby
DRDBNAME=STANDBY_DB

# Replace ${ORACLE_HOME} with the ORACLE_HOME path on the primary server:
ORACLE_HOME=${ORACLE_HOME}

# Log Directory Location:
LOG_DIR='/tmp'
 
# Here you replace DBA_USER with a real user having DBA privilege:
ID=DBA_USER

# Here you replace ABC123 with the DBA user password on the standby DB:
CRD='ABC123'

# Replace "5" with the number of LAGGED ARCHIVELOGS if reached an Email alert will be sent to the receiver:
LAGTHRESHOLD=5


You can schedule this script in the crontab to run e.g. every 15 minutes to get you alerted if the number of lagged archive logs cross the predefined threshold.

Reminder: This script runs from the primary DB server NOT from the standby server.
Hope you will find it useful.

This script is part of DBA Bundle scripts: http://dba-tips.blogspot.ae/2014/02/oracle-database-administration-scripts.html

#!/bin/bash
# #########################################################################################
# This script MUST run from the Primary DB server.
# It checks the LAG between Primary & Standby database
# To be run by ORACLE user
# # # #
# Author: Mahmmoud ADEL # # # # ###
# Created: 29-10-2015 # # # # #
# Modified: 21-04-2021 allowed a DBA user to be used for SQLPLUS login instead of SYSDBA
# 21-04-2021 Simplified the Variables section for the end user.
#
# #########################################################################################
# ######################################
# Variables MUST be modified by the user: [Otherwise the script will not work]
# ######################################
# Here you replace youremail@yourcompany.com with your Email address:
EMAIL="youremail@yourcompany.com"
# Replace ${ORACLE_SID} with the Primary DB instance SID:
ORACLE_SID=${ORACLE_SID}
# Replace STANDBY_TNS_ENTRY with the Standby Instance TNS entry you configured in the primary site tnsnames.ora file:
DRDBNAME=STANDBY_DB
# Replace ${ORACLE_HOME} with the ORACLE_HOME path on the primary server:
ORACLE_HOME=${ORACLE_HOME}
# Log Directory Location:
LOG_DIR='/tmp'
# Here you replace DBA_USER with a real user having DBA privlege:
ID=DBA_USER
# Here you replace ABC123 with the DBA user password on the standby DB:
CRD='ABC123'
# Replace "5" with the number of LAGGED ARCHIVELOGS if reached an Email alert will be sent to the receiver:
LAGTHRESHOLD=5
export EMAIL
export ORACLE_SID
export DRDBNAME
export ORACLE_HOME
export LOG_DIR
export ID
export CRD
export LAGTHRESHOLD
# #############################################
# Other variables will be picked automatically:
# #############################################
SCRIPT_NAME="check_standby_lag.sh"
export SCRIPT_NAME
SRV_NAME=`uname -n`
export SRV_NAME
LNXVER=`cat /etc/redhat-release | grep -o '[0-9]'|head -1`
export LNXVER
MAIL_LIST="-r ${SRV_NAME} ${EMAIL}"
export MAIL_LIST
# Neutralize login.sql file:
# #########################
# Existance of login.sql file under current working directory eliminates many functions during the execution of this script:
if [ -f ./login.sql ]
then
mv ./login.sql ./login.sql_NeutralizedBy${SCRIPT_NAME}
fi
# #########################################
# Script part to execute On the Primary:
# #########################################
# Check the current Redolog sequence number:
PRDBNAME_RAW=$(${ORACLE_HOME}/bin/sqlplus -S "/ as sysdba" << EOF
select name from v\$database;
exit;
EOF
)
PRDBNAME=`echo ${PRDBNAME_RAW} | awk '{print $NF}'`
PRSEQ_RAW=$(${ORACLE_HOME}/bin/sqlplus -S "/ as sysdba" << EOF
select max(sequence#) from v\$archived_log;
exit;
EOF
)
PRSEQ=`echo ${PRSEQ_RAW} | awk '{print $NF}'`
export PRSEQ
# #########################################
# Script part to execute On the STANDBY:
# #########################################
# Get the last applied Archive Sequence number from the Standby DB:
DRSEQ_RAW=$(${ORACLE_HOME}/bin/sqlplus -S "/ as sysdba" << EOF
conn ${ID}/"${CRD}"@${DRDBNAME}
select max(sequence#) from v\$archived_log where applied='YES';
exit;
EOF
)
DRSEQ=`echo ${DRSEQ_RAW} | awk '{print $NF}'`
export DRSEQ
# Compare Both PRSEQ & DRSEQ to detect the lag:
# ############################################
LAG=$((${PRSEQ}-${DRSEQ}))
export LAG
if [ ${LAG} -ge ${LAGTHRESHOLD} ]
then
${ORACLE_HOME}/bin/sqlplus -S "/ as sysdba" << EOF
set linesize 1000 pages 100
spool ${LOG_DIR}/DR_LAST_APPLIED_SEQ.log
PROMPT Current Log Sequence on the Primary DB:
PROMPT ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
archive log list
PROMPT
PROMPT Last Applied Log Sequence# on the Standby DB:
PROMPT ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
conn ${ID}/"${CRD}"@${DRDBNAME}
set linesize 1000 pages 100
select THREAD#,max(SEQUENCE#) from V\$ARCHIVED_LOG where APPLIED='YES' group by THREAD#;
exit;
EOF
# Send Email with LAG details:
echo "Sending an Email alert ..."
mail -s "ALARM: DR DB [${DRDBNAME}] is LAGGING ${LAG} sequences behind Primary DB [${PRDBNAME}] on Server [${SRV_NAME}]" ${MAIL_LIST} < ${LOG_DIR}/DR_LAST_APPLIED_SEQ.log
fi
echo
echo Primary DB Sequence is: ${PRSEQ}
echo Standby DB Sequence is: ${DRSEQ}
echo Number of Lagged Archives Between Primary and Standby is: ${LAG}
echo
# De-Neutralize login.sql file:
# ############################
# If login.sql was renamed during the execution of the script revert it back to its original name:
if [ -f ./login.sql_NeutralizedBy${SCRIPT_NAME} ]
then
mv ./login.sql_NeutralizedBy${SCRIPT_NAME} ./login.sql
fi
# #############
# END OF SCRIPT
# #############
# REPORT BUGS to: mahmmoudadel@hotmail.com
# DOWNLOAD THE LATEST VERSION OF DATABASE ADMINISTRATION BUNDLE FROM:
# http://dba-tips.blogspot.com/2014/02/oracle-database-administration-scripts.html
# DISCLAIMER: THIS SCRIPT IS DISTRIBUTED IN THE HOPE THAT IT WILL BE USEFUL, BUT WITHOUT ANY WARRANTY. IT IS PROVIDED "AS IS".

6 comments:

  1. I am getting this error:
    ./check_standby_lag.sh: line 102: selected.-selected.: syntax error: invalid arithmetic operator (error token is ".-selected.")
    ./check_standby_lag.sh: line 105: [: -ge: unary operator expected

    ReplyDelete
    Replies
    1. The script is failing when it's trying to execute the following query on DR from the primary DB:

      conn SYS/"${CRD}"@${DRDBNAME} AS SYSDBA
      select max(sequence#) from v$archived_log where applied='YES';

      ${CRD} is SYS password
      ${DRDBNAME} is the Standby tns name

      Make sure that you can run the above query after replacing the variables against the standby DB from the primary DB.

      Delete
  2. Any fix for the script? It works manually executing, but fails on crontab job. The same error.

    ReplyDelete
  3. Am trying to modify this script to check for RAC instance by using thread#1 , 2 etc but somehow if part it is not going.

    if [ ${LAG} -ge ${LAGTHRESHOLD} ] | if [ ${LAG1} -ge ${LAGTHRESHOLD} ]; then
    ${ORACLE_HOME}/bin/sqlplus -S "/ as sysdba" << EOF

    ReplyDelete
    Replies
    1. I've just modified the script to make it easy to use.
      Please note in the new script, you don't need to use SYS account, just user a user granted DBA role privilege.

      I've modified the article as well, kindly read it one more time for better understanding of the instructions, and let me know if you're still facing any issues.

      Delete
  4. the script is not working for RAC.

    ReplyDelete