Monday, July 12, 2021

Step By Step How To Add a New Node To a 19c Cluster

Some bad news before we start: in 19c versions older than 19.8 you will not be able to add a new node to the cluster using this procedure due to bug 30195027 [Doc ID 30195027.8].
The workaround is to upgrade your GRID_HOME to 19.8 or above using an RU patch; I'll shed some light on a similar GRID_HOME upgrade task using a zero downtime technique in a future post.


In this demo:

The cluster already has one available node named clsn1, and we are adding one extra node named clsn2.
orcl       refers to the cluster database name.
orcl2      refers to the new instance name on the new node.
clsn2-vip  refers to the virtual hostname (VIP) of the new node.

Step 1: Prerequisites: [On the new node]

Note: Please follow the prerequisite instructions strictly; failing to do so will lead to strange, hard-to-diagnose errors during the process of adding the node to the cluster!

- Install the same OS and kernel version as the existing nodes in the cluster.

- Install Oracle's required packages: [Applicable for Linux 7]
  # wget http://public-yum.oracle.com/public-yum-ol7.repo
  # yum install -y oracle-database-preinstall-19c gcc gcc-c++ glibc-devel glibc-headers elfutils-libelf-devel gcc gcc-c++ kmod-libs kmod unixODBC unixODBC-devel dtrace-modules-headers
  # yum install -y fontconfig-devel libXrender-devel librdmacm-devel python-configshell targetcli compat-libstdc++-33
  # yum install -y oracleasm-support
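
  A quick sanity check that the key packages landed (package names as used above):
  # rpm -q oracle-database-preinstall-19c oracleasm-support unixODBC unixODBC-devel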
 

- Provision the same ASM disks that are already provisioned on the other nodes of the cluster to the new node.

- Create the same Groups & Oracle users with the same IDs as the existing nodes in the cluster:
  # groupadd -g 54322 dba
    groupadd -g 54324 backupdba
    groupadd -g 54325 dgdba
    groupadd -g 54326 kmdba
    groupadd -g 54327 asmdba
    groupadd -g 54328 asmoper
    groupadd -g 54329 asmadmin
    groupadd -g 54330 racdba

Note: The example below adds or modifies the oracle user. I'm doing this for the oracle user only because I'm using the same OS user "oracle" as the owner of both the GRID and ORACLE DB homes.

  # useradd oracle -u 54321 -g oinstall -G dba,oper,asmdba,backupdba,dgdba,kmdba,racdba,asmadmin,asmoper
  # usermod oracle -u 54321 -g oinstall -G dba,oper,asmdba,backupdba,dgdba,kmdba,racdba,asmadmin,asmoper


- Copy the oracle user's .bash_profile from any of the other nodes in the cluster, replacing the node name and instance name with the right values (see the sketch below).
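
  As a rough sketch, the copied .bash_profile on clsn2 would end up containing something like the following (all paths are placeholders; keep the exact values used on your existing nodes):

  export ORACLE_BASE=/u01/app/oracle                           # placeholder - match existing nodes
  export ORACLE_HOME=/u01/app/oracle/product/19.0.0/dbhome_1   # placeholder - match existing nodes
  export GRID_HOME=/u01/app/19.0.0/grid                        # placeholder - match existing nodes
  export ORACLE_SID=orcl2                                      # the new instance name on this node
  export PATH=$ORACLE_HOME/bin:$GRID_HOME/bin:$PATH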

- Set up the same network configuration comparing these files with other nodes in the cluster: /etc/hosts, /etc/resolv.conf
- Protect resolv.conf file from getting changed: # chattr +i /etc/resolv.conf
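
  For illustration only, a minimal /etc/hosts for a two-node cluster could look like the sketch below (all IPs and domain names are placeholders, and the SCAN is normally resolved by DNS rather than /etc/hosts):

  # Public
  192.168.1.11   clsn1.localdomain       clsn1
  192.168.1.12   clsn2.localdomain       clsn2
  # Virtual (VIP)
  192.168.1.21   clsn1-vip.localdomain   clsn1-vip
  192.168.1.22   clsn2-vip.localdomain   clsn2-vip
  # Private interconnect
  10.0.0.11      clsn1-priv.localdomain  clsn1-priv
  10.0.0.12      clsn2-priv.localdomain  clsn2-priv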

- Adjust the MTU for the loopback device: add MTU=16436 to /etc/sysconfig/network-scripts/ifcfg-lo and execute: # ifconfig lo mtu 16436

- Set NOZEROCONF=yes in /etc/sysconfig/network.

- Set up the same OS settings as the other nodes in the cluster: /etc/sysctl.conf, /etc/security/limits.conf
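
  One way to compare these files against an existing node, rather than eyeballing them, is a remote diff (assuming SSH access to clsn1 is already available; otherwise copy the files over first):

  # diff <(ssh root@clsn1 cat /etc/sysctl.conf) /etc/sysctl.conf
  # diff <(ssh root@clsn1 cat /etc/security/limits.conf) /etc/security/limits.conf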

- Stop/Disable the Avahi daemon:
  # systemctl stop avahi-daemon; systemctl disable avahi-daemon; systemctl status avahi-daemon

- Disable SELinux & Firewall:
  Set SELINUX=disabled in /etc/selinux/config, then:
  # systemctl stop firewalld; systemctl disable firewalld; systemctl status firewalld
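
  Note that SELINUX=disabled only takes effect after a reboot; to switch to permissive mode immediately and double-check both settings, something like:

  # setenforce 0                       # permissive until the next reboot picks up the config change
  # getenforce
  # systemctl is-enabled firewalld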

- Set the server timezone to match the other nodes in the cluster: check /etc/localtime
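
  For example, with timedatectl (the zone below is only an example; use whatever the existing nodes report):

  # timedatectl | grep "Time zone"
  # timedatectl set-timezone Asia/Riyadh     # example zone only - match the existing nodes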

- The NTP configuration should be the same as on the other nodes [if it's being used]: /etc/chrony.conf

- Configure passwordless SSH:

  Generate a new key on the new node: # ssh-keygen -t rsa
  Copy the key to the other nodes in the cluster: e.g. # ssh-copy-id oracle@clsn1
    (or manually: # cat ~/.ssh/id_rsa.pub | ssh oracle@clsn1 "mkdir -p ~/.ssh && chmod 700 ~/.ssh && cat >> ~/.ssh/authorized_keys && chmod 600 ~/.ssh/authorized_keys")
  Test the connectivity from the new node to the other nodes in the cluster: # ssh oracle@clsn1
  Copy the keys from each active node in the cluster to the new node: # ssh-copy-id oracle@clsn2
  Test the connectivity from the other nodes to the new node: # ssh oracle@clsn2
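
  A quick sanity check that the equivalency really is passwordless in both directions (run from each node as oracle; add your fully qualified hostnames too if you use them):

  # for h in clsn1 clsn2; do ssh -o BatchMode=yes oracle@$h hostname || echo "FAILED: $h"; done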

- Change the ownership of the filesystem where Oracle installation files will be installed:

  # chown oracle:oinstall /u01
  # chmod 774 /u01
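
  The GRID and ORACLE homes on the new node must use exactly the same paths as on the existing nodes; pre-creating the base directories is harmless (paths below are placeholders only):

  # mkdir -p /u01/app/19.0.0/grid /u01/app/oracle/product/19.0.0/dbhome_1   # placeholder paths - match existing nodes
  # chown -R oracle:oinstall /u01/app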


- Scan ASM disks: [As root on New node]
Note: The same disks that are provisioned on the other cluster nodes should be provisioned to the new node as well before doing the scan.

  # oracleasm scandisks
  # oracleasm listdisks


Note: Make sure all ASM disks can be listed, comparing the result with the other existing nodes.
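
  One way to compare the disk list with an existing node directly (assuming root SSH between the nodes; otherwise just compare the two outputs by eye):

  # diff <(ssh root@clsn1 oracleasm listdisks) <(oracleasm listdisks)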

- Compare the configuration between an existing node and the new node: [As GRID Owner - On any of the existing nodes in the cluster]

  # $GRID_HOME/bin/cluvfy comp peer -refnode <existing_node_in_the_cluster> -n <new_node> -orainv orainventory_group -osdba osdba_group -verbose
e.g.
  # $GRID_HOME/bin/cluvfy comp peer -refnode clsn1 -n clsn2 -orainv oinstall -osdba dba -verbose
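
  It is also worth running the dedicated pre-nodeadd stage check, which can generate fixup scripts for anything it flags:

  # $GRID_HOME/bin/cluvfy stage -pre nodeadd -n clsn2 -fixup -verbose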
 

- Workaround for the INS-06006 passwordless SSH bug: [As grid owner]

  # echo "export SSH_AUTH_SOCK=0" >> ~/.bashrc
  # export SSH_AUTH_SOCK=0


Step 2: Clone GRID_HOME to the new node and Add it to the cluster: [As GRID Owner - On any of the existing nodes in the cluster]

# export IGNORE_PREADDNODE_CHECKS=Y
# $GRID_HOME/addnode/addnode.sh -silent "CLUSTER_NEW_NODES={clsn2}" "CLUSTER_NEW_VIRTUAL_HOSTNAMES={clsn2-vip}" -ignorePrereq -ignoreSysPrereqs

   Monitor the operation's log for errors:
    # tail -f /u01/oraInventory/logs/addNodeActions`date -u +"%Y-%m-%d_%H-%M"`*.log


At the End execute root.sh [As root - On the New Node]
# $GRID_HOME/root.sh

    Troubleshooting:
    In case of error: scp: /u01/grid/12.2.0.3/gpnp/profiles/peer/profile.xml: No such file or directory
    Solution: On the RAC node you issued addnode.sh from, copy profile.xml:
      # cp -p $GRID_HOME/gpnp/clsn1/profiles/peer/profile.xml  $GRID_HOME/gpnp/profiles/peer/profile.xml
    Then run root.sh again on the new node.
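
  Once root.sh completes cleanly, confirm that the new node has joined the cluster, e.g. from any node:

  # $GRID_HOME/bin/olsnodes -s -t
  # $GRID_HOME/bin/crsctl check cluster -all
  # $GRID_HOME/bin/crsctl stat res -t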

Step 3: Clone the ORACLE HOME to the new node: [As oracle - On any of the existing nodes in the cluster]

# $ORACLE_HOME/addnode/addnode.sh -silent "CLUSTER_NEW_NODES={clsn2}" -ignorePrereqFailure -ignoreSysPrereqs

At the End execute root.sh [As root - On the New Node]
# $ORACLE_HOME/root.sh

Step 4: Start ACFS, if it exists: [As root - on the New Node]

# $GRID_HOME/bin/srvctl start filesystem -device <volume_device_name> -node clsn2
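
  If you are not sure of the volume device name, it can be pulled from the existing cluster configuration first, for example:

  # $GRID_HOME/bin/srvctl config filesystem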

Step 5: Check the cluster integrity:

# cluvfy stage -post nodeadd -n clsn2 -verbose

Step 6: Add DB instance to the new node: [As oracle - on the New Node]

# dbca -silent -ignorePrereqFailure -addInstance -nodeName clsn2 -gdbName orcl -instanceName orcl2 -sysDBAUserName sys -sysDBAPassword oracle#123#

-gdbName          Provide the global database name; use the same value as DB_UNIQUE_NAME if that parameter is set.
-instanceName     The name of the instance on the new node.


Note: This will add a new instance on the new node, create an extra redo log thread, and create one more UNDO tablespace dedicated to the new instance.
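
After dbca finishes, it's worth confirming that the new instance is registered and running:

  # srvctl status database -d orcl
  # srvctl config database -d orcl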

Add the new instance as a preferred instance to the DB Services: [Do this for each service]

# srvctl modify service -d orcl -s reporting_svc -n -i orcl1,orcl2
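
Then confirm the service now lists both instances as preferred:

  # srvctl config service -d orcl -s reporting_svc
  # srvctl status service -d orcl -s reporting_svc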

References: 

https://docs.oracle.com/en/database/oracle/oracle-database/19/cwadd/adding-and-deleting-cluster-nodes.html#GUID-929C0CD9-9B67-45D6-B864-5ED3B47FE458
 
