Thursday, June 18, 2020

18c Remarkable New Features

18c is the new name of 12.2.0.2, In 2018 Oracle started to name the new DB versions represent the year the product was released on. I've already covered the top remarkable new features of 12.1 and 12.2.0.1 in a previous post: http://dba-tips.blogspot.com/2019/11/a-summary-of-remarkable-new-features-in.html

In this post I'll cover the top new features of 18c which are not much!

Miscellaneous Features:

Shadow lost write protection feature introduced to minimize data loss and the time required to repair a database due to lost Writes.

To Enable:
  -- First: Create A shadow bigfile tablespace to store only system change numbers (SCNs) for tracked data files it will allocate 2% of the protected data:
  SQL> CREATE BIGFILE TABLESPACE SHADOW_WRITE DATAFILE '/oradata/shadow_lwp1.dbf' SIZE 1G LOST WRITE PROTECTION;

  -- Second: Enable the scope of protection:
     [Database wise]
      SQL> ALTER DATABASE ENABLE LOST WRITE PROTECTION;
     [Tablespace wise]
      SQL> ALTER TABLESPACE USERS ENABLE LOST WRITE PROTECTION;
     [Datafile wise]
    SQL> ALTER DATABASE DATAFILE '/oradata/dfile1.dbf' ENABLE LOST WRITE PROTECTION;


To Disable: [Use same above command but with DISABLE keyword]
      SQL> ALTER DATABASE DISABLE LOST WRITE PROTECTION;
     [Tablespace wise]
      SQL> ALTER TABLESPACE USERS DISABLE LOST WRITE PROTECTION;
     [Datafile wise]
    SQL> ALTER DATABASE DATAFILE '/oradata/dfile1.dbf' DISABLE LOST WRITE PROTECTION;


SQL statement level KILL instead of killing the whole session:
Starting from 18c you can cancel the current running SQL Statement, leaving its session remain connected:
  SQL> ALTER SYSTEM CANCEL SQL '<SID>, <SERIAL#>, <SQLID>';
i.e.
  SQL> ALTER SYSTEM CANCEL SQL '20, 51142, 8vu7s907prbgr';


ORACLE HOME read-only mode:
Starting from 18c Oracle Home can be configured in a read-only mode, thus preventing creation or modification of files inside the Oracle home directory. A read-only Oracle home can be used as a software image that can be shared across multiple database servers.
How to implement this feature:
While installing a new Oracle Database software, choose software-only option, then configure it as a read-only Oracle home before you create the listener and the database.

Password File is now under ORACLE_BASE:
New location for Password file under $ORACLE_BASE/dbs instead of $ORACLE_HOME/dbs

Schema only accounts:
In Oracle 18c there is a new account type called schema only account, where it can hold objects but no one can log in to it:

      SQL> create test no authentication;
or:
      SQL> alter user test no authentication;


New Initialization Parameters:

OPTIMIZER_IGNORE_HINTS [Default is FALSE]
This will force the Optimizer to ignore all impeded hints.

OPTIMIZER_IGNORE_PARALLEL_HINTS [Default is FALSE]
Will force the Optimizer to ignore PARALLEL hints.

FORWARD_LISTENER       
Forward all the incoming connection from REMOTE_LISTENER to a particular listener.

References:
https://apex.oracle.com/database-features

Stay tuned for the 19c new features post.

Thursday, June 11, 2020

Re-execution of Duplicate command on 19c DB fails with RMAN-06054: media recovery requesting unknown archived log

Problem:

On a 19.5 database, I was trying to duplicate a database from an RMAN backup, first time the duplicate failed after restoring 90% of the datafiles due to space limitation on the underlying datafiles location, I re-ran the same duplicate command for the second time without deleting the already restored datafiles to see if the 19c RMAN will be able to continue the previous duplicate without problems using this command:

I've restarted the auxiliary DB again in mount mode and re-ran the duplicate command:
run {
ALLOCATE AUXILIARY CHANNEL ch1 DEVICE TYPE DISK;
ALLOCATE AUXILIARY CHANNEL ch2 DEVICE TYPE DISK;
ALLOCATE AUXILIARY CHANNEL ch3 DEVICE TYPE DISK;
ALLOCATE AUXILIARY CHANNEL ch4 DEVICE TYPE DISK;
ALLOCATE AUXILIARY CHANNEL ch5 DEVICE TYPE DISK;
ALLOCATE AUXILIARY CHANNEL ch6 DEVICE TYPE DISK;
ALLOCATE AUXILIARY CHANNEL ch7 DEVICE TYPE DISK;
ALLOCATE AUXILIARY CHANNEL ch8 DEVICE TYPE DISK;
duplicate database to "AWSNFR"  backup location '/bkp' nofilenamecheck
UNTIL TIME "TO_DATE('31/05/2020 15:20:52', 'DD/MM/YYYY HH24:MI:SS')";
}


But the second run of the duplicate command failed with the following error after restoring the rest of datafiles, requesting a very old archivelog for recovering the database:

.....
Starting restore at 07-Jun-2020 11:04:26
creating datafile file number=19 name=/awsnfr/AWSNFRDB1/datafiles/p_fz_data.283.1039812437
creating datafile file number=20 name=/awsnfr/AWSNFRDB1/datafiles/p_fz_data.284.1039812457
creating datafile file number=21 name=/awsnfr/AWSNFRDB1/datafiles/p_fz_data.285.1039812475
creating datafile file number=24 name=/awsnfr/AWSNFRDB1/datafiles/p_fz_data.288.1039812529
creating datafile file number=25 name=/awsnfr/AWSNFRDB1/datafiles/p_fz_data.289.1039812549
creating datafile file number=27 name=/awsnfr/AWSNFRDB1/datafiles/p_fz_data.291.1039812587
creating datafile file number=28 name=/awsnfr/AWSNFRDB1/datafiles/p_fz_data.292.1039812605
creating datafile file number=42 name=/awsnfr/AWSNFRDB1/datafiles/p_fz_data.306.1039812873
creating datafile file number=43 name=/awsnfr/AWSNFRDB1/datafiles/p_fz_data.307.1039812891
creating datafile file number=57 name=/awsnfr/AWSNFRDB1/datafiles/p_fz_data.321.1039813155
creating datafile file number=58 name=/awsnfr/AWSNFRDB1/datafiles/p_fz_data.322.1039813173

......

Finished restore at 07-Jun-2020 12:05:28
.....
starting media recovery

unable to find archived log
archived log thread=1 sequence=7


RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03002: failure of Duplicate Db command at 06/07/2020 12:05:31
RMAN-05501: aborting duplication of target database
RMAN-03015: error occurred in stored script Memory Script
RMAN-06054: media recovery requesting unknown archived log for thread 1 with sequence 7 and starting SCN of 2470350


Analysis:

You can notice from the above log, when duplicate command ran again, it started with re-creating (not restoring) some blank datafiles (in green color).

After the failure of the duplicate command, I ran the following command and figured out that those datafiles (in green color) are the only ones that having very low checkpoint_time and checkpoint_change# than the other datafiles:

col name for a60
col CREATION_TIME for a20
col checkpoint_time for a20
select FILE#||'|'||name name,fuzzy, creation_change#,
to_char(creation_time, 'DD-MON-YYYY HH24:MI:SS') as creation_time,
to_char(checkpoint_time, 'DD-MON-YYYY HH24:MI:SS') as checkpoint_time,
to_char(checkpoint_change#, '999999999999999999999') as checkpoint_change#
from v$datafile_header;


Solution:

I decided to restore those datafiles manually from RMAN:

RMAN> restore datafile 19,20,21,24,25,27,28,42,43,57,58;

RMAN> recover database;
This will recover the database starting from the right sequence till it fail when no more archivelogs are available inside the backup.

Then, because I was running an RMAN duplicate on a different server, I need to complete the rest of in-complete duplicate steps manually:

1- Generate a create controlfile script:
RMAN> alter database backup controlfile to trace as '/home/oracle/control.trc';
 

2- Edit the controlfile trace by replacing the NORESETLOGS keyword with RESETLOGS then delete the rest of script contents :

# vi /home/oracle/control.trc

STARTUP NOMOUNT
CREATE CONTROLFILE REUSE DATABASE "
AWSNFR" RESETLOGS 
....
CHARACTER SET AL32UTF8
;


Then remove all the lines below CHARACTER SET AL32UTF8 ; lines:

Make sure that all datafiles, redo logs are pointing to the right exist paths on the target server.

3- Shutdown the database and execute the controlfile creation script "which we modified above" to overwrite the current clone controlfile:
RMAN> SHUTDOWN IMMEDIATE
RMAN> @/home/oracle/control.trc


4- Open the database with resetlogs option:  
RMAN> alter database open resetlogs;

Voilà! Database opened.


Add a tempfile to the Temporary tablespace:

RMAN> alter tablespace temp add tempfile;


Conclusion:
In Oracle 19c, RMAN can continue the duplicate from where it stopped, it will catch from where it stopped in the previous execution, but the catch here, it will not be able to re-restore the datafiles that were in the middle restoration during the last duplicate execution, so it will try to create them as blank datafiles, during the recovery it will try to recover these datafiles from the time they were created on the source DB! Of course most probably those archivelogs are since the creation of the source database itself, and they are not available anymore. The DBA has to correct this behavior by restore those datafiles manually from the same RMAN backup the duplicate was using. Still, the re-execution of the duplicate operation on 19c is not a fail proof!

Wednesday, June 3, 2020

19c Grid Infrastructure Installation when running root.sh with PRKH-1010 : Unable to communicate with CRS services.

Problem:

While executing root.sh during a 19c Grid Infrastructure on a two RAC nodes I experienced this error:

2020/05/18 10:22:47 CLSRSC-594: Executing installation step 18 of 19: 'ConfigNode'.
2020/05/18 10:22:59 Oracle Clusterware infrastructure error in CLSECHO (OS PID 27328): ADR home path /u01/oracle/diag/crs/clssrv1-vip2/crs does not exist; ADR initialization will try to create it
CLSRSC-594: Executing installation step 19 of 19: 'PostConfig'.
PRKH-1010 : Unable to communicate with CRS services.

PRKH-3003 : An attempt to communicate with the CSS daemon failed

2020/05/18 10:23:31 CLSRSC-180: An error occurred while executing the command 'srvctl start listener -l LISTENER'
Died at /u01/grid/12.2.0.3/crs/install/crsutils.pm line 12160. 


Analysis:

When checked Server static host name it wasn't set properly:

 # cat /etc/hostname
localhost.localdomain

# hostnamectl | grep hostname
Static hostname: localhost.localdomain
Transient hostname: clssrv1-vip2


# hostnamectl --static
localhost.localdomain 


Fix:

Set the server static host name properly using this Linux command:

# hostnamectl set-hostname clssrv1

Now the static name is showing the correct value of server name:

# hostnamectl
   Static hostname: clssrv1         
         Icon name: computer-server
           Chassis: server
        Machine ID: r93o4kde7b4d8o333jr4406686a4e
           Boot ID: 6b34567b7a3577767a1e3768306d954
  Operating System: Oracle Linux Server 7.4
       CPE OS Name: cpe:/o:oracle:linux:7:4:server
            Kernel: Linux 4.1.12-94.3.9.el7uek.x86_64
      Architecture: x86-64


Now re-execute the root.sh script. If it fails again remove the Grid Infrastructure installation and re-install it again.