Wednesday, March 2, 2022

ora_lms Consumes Lots of Memory and May Lead to Memory Starvation in RAC 19c

Problem:

In a 19c RAC database, the ora_lms processes consume too much memory, leading to memory starvation and heavy swapping, which ends with nodes being evicted from the cluster.

Facts:

- Don't try to kill the ora_lms process from the OS (thinking it will restart itself); this will terminate the whole DB instance.

- The number of LMS processes is determined by the number of CPUs being used in the server. 

For example:

    n = "number of CPUs reported by the OS and used for CPU_COUNT by default"
         n < 4         => 1 LMS process will be started
         4 <= n < 16   => 2 LMS processes will be started
         n >= 16       => 2 LMS + (1 LMS process for every 32 CPUs) will be started.

Reference: Doc ID 1392248.1
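
The rule above can be sketched as a small shell function. This is only an illustration: the function name lms_count is hypothetical, and reading "1 LMS process for every 32 CPUs" as the integer division n / 32 is an assumption, so check the result against the referenced Doc ID before relying on it.

```shell
#!/bin/sh
# Estimate the default number of LMS processes for a given CPU count.
# Assumption: "1 LMS process for every 32 CPUs" means integer division n / 32.
lms_count() {
    n=$1
    if [ "$n" -lt 4 ]; then
        echo 1
    elif [ "$n" -lt 16 ]; then
        echo 2
    else
        echo $(( 2 + n / 32 ))
    fi
}

# Usage on a live node (nproc prints the number of available CPUs):
#   lms_count "$(nproc)"
```

Comparing this estimate with the number of ora_lms processes actually running on the node is a quick sanity check.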

Analysis:

This issue is Bug 31969719, which is currently being worked on by Oracle's Development team.

Solution:

Implement HugePages at the OS level on all RAC nodes.

How to implement HugePages:

1- Calculate the number of huge pages using this formula:

vm.nr_hugepages = [Total SGA_MAX_SIZE of all instances except ASM (KB) / HugePage size you are going to set (KB)] + 6

e.g. If SGA_MAX_SIZE is 10G, the number of huge pages is calculated as follows:
vm.nr_hugepages = [(10 x 1024 x 1024) / 2048] + 6 = 5126
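
The same arithmetic as a shell helper, as a minimal sketch. The function name calc_hugepages is hypothetical, and rounding the division up when the SGA is not an exact multiple of the page size is an assumption added here, not part of the formula above.

```shell
#!/bin/sh
# Compute vm.nr_hugepages from the total SGA size and the huge page size (both in KB).
# Assumption: the division is rounded up so a partial page still gets a full huge page.
# The 6 spare pages come from the formula in this post.
calc_hugepages() {
    sga_kb=$1
    page_kb=${2:-2048}   # default to 2 MB huge pages
    echo $(( (sga_kb + page_kb - 1) / page_kb + 6 ))
}

# Usage: a 10G SGA with 2048 KB huge pages
#   calc_hugepages $(( 10 * 1024 * 1024 )) 2048
# The huge page size of the running kernel can be read with:
#   grep Hugepagesize /proc/meminfo
```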

2- Add vm.nr_hugepages to /etc/sysctl.conf:

# vi /etc/sysctl.conf
vm.nr_hugepages=5126

3- Update the memlock value for the database software owner in limits.conf [in KB]:

Make sure the memlock size in KB is equal to or higher than [Number of HugePages x HugePage Size (KB)] = 5126 x 2048 = 10498048

# vi /etc/security/limits.conf
oracle soft memlock 10498048
oracle hard memlock 10498048

Log out, log back in as the oracle OS user, and verify the setting with this command:

# ulimit -l
10498048
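
The memlock check can also be scripted. The sketch below uses two hypothetical helper names (required_memlock_kb and memlock_ok) and treats a limit of "unlimited" as sufficient, which is an assumption about how you want to handle that case.

```shell
#!/bin/sh
# Required memlock in KB = number of huge pages x huge page size (KB).
required_memlock_kb() {
    echo $(( $1 * ${2:-2048} ))
}

# Succeed when the current limit (as printed by `ulimit -l`) covers the requirement.
# Assumption: "unlimited" is accepted as large enough.
memlock_ok() {
    required=$1
    current=$2
    [ "$current" = "unlimited" ] && return 0
    [ "$current" -ge "$required" ]
}

# Usage as the oracle OS user:
#   memlock_ok "$(required_memlock_kb 5126 2048)" "$(ulimit -l)" && echo "memlock OK"
```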

4- Force the DB instance to use HugePages at startup:

SQL> alter system set use_large_pages=only scope=spfile sid='*';

5- Enable HugePages and disable transparent_hugepage in the bootloader:

Edit the bootloader entry to set the number of HugePages along with the HugePage size.
For Linux 7 and higher, use the grubby command to modify the bootloader:

[Get the default kernel boot path]
# grubby --default-kernel
/boot/vmlinuz-4.1.12-94.3.9.el7uek.x86_64

[Use the path from the previous output to update the default kernel with the HugePages settings]

# grubby --args="transparent_hugepage=never hugepagesz=2M hugepages=5126 default_hugepagesz=2M" --update-kernel /boot/vmlinuz-4.1.12-94.3.9.el7uek.x86_64

6- Restart all RAC nodes one by one, and check that the following message in the alert log shows no errors when the instance starts:

 Supported system pagesize(s):
  PAGESIZE  AVAILABLE_PAGES  EXPECTED_PAGES  ALLOCATED_PAGES  ERROR(s)
        4K       Configured              11              11        NONE
     2048K            5126            5120            5120       NONE
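
Besides the alert log, the OS side can be checked after each reboot. The helpers below are a sketch with hypothetical names (meminfo_field and thp_mode); they only parse text in the format of /proc/meminfo and /sys/kernel/mm/transparent_hugepage/enabled and do not change any setting.

```shell
#!/bin/sh
# Print the value of one field (e.g. HugePages_Total) from
# /proc/meminfo-formatted text read on stdin.
meminfo_field() {
    awk -v key="$1:" '$1 == key { print $2 }'
}

# Print the active transparent hugepage mode, i.e. the bracketed word in
# text such as "always madvise [never]" read on stdin.
thp_mode() {
    sed -n 's/.*\[\([^]]*\)\].*/\1/p'
}

# Usage on a live node:
#   meminfo_field HugePages_Total < /proc/meminfo
#   meminfo_field HugePages_Free  < /proc/meminfo
#   thp_mode < /sys/kernel/mm/transparent_hugepage/enabled
```

With the values used in this post, HugePages_Total should read 5126 and the transparent hugepage mode should read never.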

