Data Guard 19c in AWS with far sync

By Brian Fitzgerald

Introduction

This article describes an Oracle Data Guard 19c setup with far sync. Far sync can improve commit response time in a Maximum Availability Data Guard network. The Data Guard configuration runs on EC2 across two AWS regions, with database storage on ASM. The far sync instance is created with RMAN, and the physical standby and far sync are added in a single broker configuration step. Some observations on network latency and switchover timing are included.

License

Data Guard is a feature of Oracle Database Enterprise Edition itself and does not require separate licensing. Far sync, however, requires an Active Data Guard license.

By using Amazon Elastic Compute Cloud (EC2), you can control your license costs by configuring only the CPUs that you need.

Environment overview

A system overview is described in this table:

Description Value
Cloud AWS
Image ID ami-2051294a
Red Hat version 7.2
EC2 InstanceType m3.medium
Memory 3.75 GB
CPU 1
Swap 2 GB
Grid software owner grid
Grid Infrastructure Version 19.3.0
Database Storage ASM
Oracle software owner oracle
Oracle Database Version 19.3.0
Oracle Instance Type Restart

The AWS instance type was initially m3.large (7.5 GB, 2 CPU), and was downsized after the grid and oracle home installations were complete. For additional information on the grid install, please refer to grid 19c install with ASM filter driver. For the network description, please refer to Data Guard network in AWS. The breakdown by region, availability zone, host, and role is:

description primary far sync standby
Region N. Virginia N. Virginia N. California
availability zone us-east-1c us-east-1d us-west-1b
ip address 172.31.86.22 172.31.28.23 172.32.10.34
hostname -s ip-172-31-86-22 ip-172-31-28-23 ip-172-32-10-34
db_unique_name NY FS SF

Network latency

We can measure the network latency from the primary to the far sync and to the standby. Start the qperf server on the far sync:

[ec2-user@ip-172-31-28-23 ~]$ qperf

Start the qperf server on the standby:

[ec2-user@ip-172-32-10-34 ~]$ qperf

Measure bandwidth and latency between two N. Virginia availability zones:

[ec2-user@ip-172-31-86-22 ~]$ qperf 172.31.28.23 tcp_bw tcp_lat
tcp_bw:
    bw  =  92.6 MB/sec
tcp_lat:
    latency  =  254 us

Measure bandwidth and latency between regions N. Virginia and N. California:

[ec2-user@ip-172-31-86-22 ~]$ qperf 172.32.10.34 tcp_bw tcp_lat
tcp_bw:
    bw  =  18.4 MB/sec
tcp_lat:
    latency  =  33.3 ms

Network bandwidth is approximately 5x higher, and latency more than 100x lower, between in-region availability zones than across regions. This asymmetry motivates the far sync: it can improve performance whenever the network latency to the standby is higher than the latency to the far sync.
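As a back-of-envelope illustration (an assumption about transport behavior, not a measurement from this configuration): with SYNC redo transport, each commit waits for at least one network round trip to the SYNC destination plus the remote log write. Plugging in the qperf latencies:

```shell
# Rough commit-latency floor for a SYNC destination:
# one round trip (2 x one-way latency) plus an assumed 1 ms remote log write.
sync_commit_floor_ms() {  # $1 = one-way latency in ms, $2 = remote write in ms
  awk -v l="$1" -v w="$2" 'BEGIN { printf "%.1f\n", 2 * l + w }'
}

sync_commit_floor_ms 0.254 1.0   # SYNC to the in-region far sync
sync_commit_floor_ms 33.3  1.0   # SYNC direct to the cross-region standby
```

Under these assumptions the floor is roughly 1.5 ms via the far sync versus 67.6 ms direct to the standby, which is exactly the gap a far sync exploits.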

Database configuration

The initial primary instance configuration was:

*.archive_lag_target=900
*.audit_file_dest='/u01/app/oracle/admin/NY/adump'
*.audit_trail='db'
*.compatible='19.3.0'
*.control_files='+RECO01/NY/CONTROLFILE/current.486.1018151799','+DATA01/NY/CONTROLFILE/current.273.1018151799'
*.db_block_size=8192
*.db_create_file_dest='+DATA01'
*.db_create_online_log_dest_1='+RECO01'
*.db_create_online_log_dest_2='+DATA01'
*.db_name='ORCL'
*.db_recovery_file_dest='+RECO01'
*.db_recovery_file_dest_size=4000m
*.db_unique_name='NY'
*.dg_broker_config_file1='+DATA01/NY/dr1orcl.dat'
*.dg_broker_config_file2='+RECO01/NY/dr2orcl.dat'
*.diagnostic_dest='/u01/app/oracle'
*.dispatchers='(PROTOCOL=TCP) (SERVICE=ORCLXDB)'
*.local_listener='LISTENER_NY'
*.log_archive_format='%t_%s_%r.dbf'
*.nls_language='AMERICAN'
*.nls_territory='AMERICA'
*.open_cursors=300
*.pga_aggregate_target=360m
*.processes=300
*.remote_login_passwordfile='EXCLUSIVE'
*.sga_target=1080m
*.standby_file_management='AUTO'
*.undo_tablespace='UNDOTBS1'

If not already done, make these changes:

SQL> alter system set standby_file_management=AUTO;

System altered.

SQL> alter system set dg_broker_config_file1='+DATA01/NY/dr1orcl.dat';

System altered.

SQL> alter system set dg_broker_config_file2='+RECO01/NY/dr2orcl.dat';

System altered.

Optional settings

  • compatible. Must be set to the same value on the primary, the far sync, and the standby.
  • db_create_online_log_dest_n. Points logfile members to specific disk groups. Simplifies alter database add standby logfile syntax.
  • archive_lag_target. Time-boxes each archive log. Adjust to manage control file contention.

Alter database

Make sure these alter database changes are in place:

SQL> alter database force logging;

Database altered.

Make these changes with the database mounted:

SQL> shutdown immediate
Database closed.
Database dismounted.
ORACLE instance shut down.
SQL> startup mount
ORACLE instance started.

Total System Global Area 1140849904 bytes
Fixed Size                  8895728 bytes
Variable Size             301989888 bytes
Database Buffers          822083584 bytes
Redo Buffers                7880704 bytes
Database mounted.
SQL> alter database archivelog;

Database altered.

SQL> alter database flashback on;

Database altered.

SQL> alter database open;

Database altered.

Standby logs

Identify the online redo log size. Check that all online redo logs are the same size. Check that this query returns exactly one row:

SQL> select count(*)numlogs, bytes/1024/1024 mb from v$log group by bytes;

   NUMLOGS         MB
---------- ----------
         3        200

If all online logs are not the same size, correct that condition before proceeding.

If standby redo logs exist, check that they are all the same size, and the same size as the online redo logs. Check that this query returns exactly one row:

SQL> select count(*)numlogs, bytes/1024/1024 mb from v$standby_log group by bytes;

   NUMLOGS         MB
---------- ----------
         4        200

Correct discrepant conditions before proceeding.
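The two checks above can be combined into one query; expect exactly one row per log type, with matching MB values:

```sql
-- Combined size check for online and standby redo logs.
select 'online' log_type, count(*) numlogs, bytes/1024/1024 mb
  from v$log group by bytes
union all
select 'standby', count(*), bytes/1024/1024
  from v$standby_log group by bytes;
```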

Create standby logs

Create standby logs, if needed. You can set:

SQL> alter system set db_create_online_log_dest_1 = '+DATA01';

System altered.

SQL> alter system set db_create_online_log_dest_2 = '+RECO01';

System altered.

And then run, for example:

SQL> alter database add standby logfile size 200m;

as many times as needed to get the desired number of standby logs. The optimal number of standby logs is usually greater than the number of online logs. If a high apply backlog is expected, then increase this number further.
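The repeated statement lends itself to a small helper. This is a sketch (gen_standby_log_ddl is not from the article): it emits one ADD STANDBY LOGFILE statement per log, and the output can be piped into sqlplus.

```shell
# Emit N "add standby logfile" statements; pipe the output into sqlplus.
gen_standby_log_ddl() {
  local count=$1 size=$2 i
  for i in $(seq 1 "$count"); do
    echo "alter database add standby logfile size ${size};"
  done
}

# Example: four 200m standby logs (online log groups + 1 is a common
# rule of thumb), e.g.:
#   gen_standby_log_ddl 4 200m | sqlplus -s / as sysdba
gen_standby_log_ddl 4 200m
```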

Static listeners

On all hosts, primary, far sync, and standby, in the grid account, in $ORACLE_HOME/network/admin, edit listener.ora.

Primary:

SID_LIST_LISTENER =
  (SID_LIST =
    (SID_DESC =
      (ORACLE_HOME = /u01/app/oracle/product/19.3.0/dbhome_1)
      (SID_NAME = NY)
    )
  )

Far sync:

SID_LIST_LISTENER =
  (SID_LIST =
    (SID_DESC =
      (ORACLE_HOME = /u01/app/oracle/product/19.3.0/dbhome_1)
      (SID_NAME = FS)
    )
  )

Standby:

SID_LIST_LISTENER =
  (SID_LIST =
    (SID_DESC =
      (ORACLE_HOME = /u01/app/oracle/product/19.3.0/dbhome_1)
      (SID_NAME = SF)
    )
  )

Reload the listener. For example:

[grid@ip-172-31-86-22 ~]$ lsnrctl reload

LSNRCTL for Linux: Version 19.0.0.0.0 - Production on 02-SEP-2019 12:28:09

Copyright (c) 1991, 2019, Oracle. All rights reserved.

Connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=ip-172-31-86-22.ec2.internal)(PORT=1521)))
The command completed successfully

Startup the far sync

On the far sync, configure the instance and start it in NOMOUNT mode.

Far sync audit directory

At the far sync:

[oracle@ip-172-31-28-23 ~]$ mkdir -p /u01/app/oracle/admin/FS/adump

Far sync oratab and environment

[oracle@ip-172-31-28-23 ~]$ echo FS:/u01/app/oracle/product/19.3.0/dbhome_1:N >> /etc/oratab
[oracle@ip-172-31-28-23 ~]$ . oraenv
ORACLE_SID = [FS] ? FS
The Oracle base remains unchanged with value /u01/app/oracle

Far sync orapwd

Create orapwFS on the far sync:

[oracle@ip-172-31-28-23 ~]$ alias oh
alias oh='cd $ORACLE_HOME'
[oracle@ip-172-31-28-23 ~]$ oh
[oracle@ip-172-31-28-23 dbhome_1]$ cd dbs
[oracle@ip-172-31-28-23 dbs]$ orapwd file=orapwFS entries=10 password=zystm.22

Far sync startup

Create a temporary, minimal pfile on the far sync. Set the compatible setting to match the primary.

[oracle@ip-172-31-28-23 ops]$ cat > /tmp/initFS.ora
db_name = ORCL
compatible = '19.3.0'
^D
[oracle@ip-172-31-28-23 dbs]$ sqlplus / as sysdba

SQL*Plus: Release 19.0.0.0.0 - Production on Sat Aug 31 22:01:58 2019
Version 19.3.0.0.0

Copyright (c) 1982, 2019, Oracle.  All rights reserved.

Connected to an idle instance.

SQL> startup nomount pfile='/tmp/initFS.ora'
ORACLE instance started.

Total System Global Area  243268216 bytes
Fixed Size                  8895096 bytes
Variable Size             180355072 bytes
Database Buffers           50331648 bytes
Redo Buffers                3686400 bytes
SQL> Disconnected from Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production
Version 19.3.0.0.0

Startup the standby

On the standby, configure the instance and start it in NOMOUNT mode.

Standby audit directory

At the standby:

[oracle@ip-172-32-10-34 ~]$ mkdir -p /u01/app/oracle/admin/SF/adump

Standby oratab and environment

[oracle@ip-172-32-10-34 ~]$ echo SF:/u01/app/oracle/product/19.3.0/dbhome_1:N >> /etc/oratab
[oracle@ip-172-32-10-34 ~]$ . oraenv
ORACLE_SID = [SF] ? SF
The Oracle base remains unchanged with value /u01/app/oracle

Standby orapwd

Create orapwSF on the standby:
[oracle@ip-172-32-10-34 ~]$ oh
[oracle@ip-172-32-10-34 dbhome_1]$ cd dbs
[oracle@ip-172-32-10-34 dbs]$ orapwd file=orapwSF entries=10 password=zystm.22

Standby startup

Create a temporary, minimal pfile on the standby. Set the compatible setting to match the primary.

[oracle@ip-172-32-10-34 ops]$ cat > /tmp/initSF.ora
db_name = ORCL
compatible = '19.3.0'
^D
[oracle@ip-172-32-10-34 dbs]$ sqlplus / as sysdba

SQL*Plus: Release 19.0.0.0.0 - Production on Sat Aug 31 22:01:58 2019
Version 19.3.0.0.0

Copyright (c) 1982, 2019, Oracle.  All rights reserved.

Connected to an idle instance.

SQL> startup nomount pfile='/tmp/initSF.ora'
ORACLE instance started.

Total System Global Area  243268216 bytes
Fixed Size                  8895096 bytes
Variable Size             180355072 bytes
Database Buffers           50331648 bytes
Redo Buffers                3686400 bytes
SQL> Disconnected from Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production
Version 19.3.0.0.0

tnsnames.ora

On all hosts, in the oracle account, in $ORACLE_HOME/network/admin/tnsnames.ora, add these entries:

NY =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = 172.31.86.22)(PORT = 1521))
    (CONNECT_DATA =
      (SID = NY)
    )
  )

FS =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = 172.31.28.23)(PORT = 1521))
    (CONNECT_DATA =
      (SID = FS)
    )
  )

SF =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = 172.32.10.34)(PORT = 1521))
    (CONNECT_DATA =
      (SID = SF)
    )
  )

Connectivity checklist

Check that you have configured all of these:

  • VPC enable DNS hostnames
  • DNS resolution across the peering connection
  • Route tables across the peering connection
  • Security groups
  • Static listener
  • orapwd
  • tnsnames.ora
  • ORACLE_HOME environment variable

Test connectivity

From the primary, test connectivity to all three instances:

sqlplus sys/zystm.22@NY as sysdba
sqlplus sys/zystm.22@FS as sysdba
sqlplus sys/zystm.22@SF as sysdba

From the far sync and the standby, repeat these checks.

Duplicate primary to far sync

Now we are going to kick off an RMAN script. RMAN will transfer the spfile, the password file, and the control file.

[oracle@ip-172-31-86-22 ops]$ cat dup.db.farsync.rcv
DUPLICATE TARGET DATABASE
  FOR FARSYNC
  FROM ACTIVE DATABASE
  SPFILE
    SET db_unique_name='FS'
    SET dg_broker_config_file1='+DATA01/FS/dr1orcl.dat'
    SET dg_broker_config_file2='+RECO01/FS/dr2orcl.dat'
    SET audit_file_dest='/u01/app/oracle/admin/FS/adump'
  NOFILENAMECHECK;
[oracle@ip-172-31-86-22 ops]$ rman target sys/zystm.22@NY auxiliary sys/zystm.22@FS

Recovery Manager: Release 19.0.0.0.0 - Production on Thu Sep 5 04:53:02 2019
Version 19.3.0.0.0

Copyright (c) 1982, 2019, Oracle and/or its affiliates.  All rights reserved.

connected to target database: ORCL (DBID=1545932534)
connected to auxiliary database: ORCL (not mounted)

RMAN> @ dup.db.farsync.rcv

RMAN> DUPLICATE TARGET DATABASE
2>   FOR FARSYNC
3>   FROM ACTIVE DATABASE
4>   SPFILE
5>     SET db_unique_name='FS'
6>     SET dg_broker_config_file1='+DATA01/FS/dr1orcl.dat'
7>     SET dg_broker_config_file2='+RECO01/FS/dr2orcl.dat'
8>     SET audit_file_dest='/u01/app/oracle/admin/FS/adump'
9>   NOFILENAMECHECK;
Starting Duplicate Db at 05-SEP-19
using target database control file instead of recovery catalog
allocated channel: ORA_AUX_DISK_1
channel ORA_AUX_DISK_1: SID=39 device type=DISK

contents of Memory Script:
{
   backup as copy reuse
   passwordfile auxiliary format  '/u01/app/oracle/product/19.3.0/dbhome_1/dbs/orapwORCL'   ;
   restore clone from service  'NY' spfile to
 '/u01/app/oracle/product/19.3.0/dbhome_1/dbs/spfileORCL.ora';
   sql clone "alter system set spfile= ''/u01/app/oracle/product/19.3.0/dbhome_1/dbs/spfileORCL.ora''";
}
executing Memory Script

Starting backup at 05-SEP-19
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=88 device type=DISK
Finished backup at 05-SEP-19

Starting restore at 05-SEP-19
using channel ORA_AUX_DISK_1

channel ORA_AUX_DISK_1: starting datafile backup set restore
channel ORA_AUX_DISK_1: using network backup set from service NY
channel ORA_AUX_DISK_1: restoring SPFILE
output file name=/u01/app/oracle/product/19.3.0/dbhome_1/dbs/spfileORCL.ora
channel ORA_AUX_DISK_1: restore complete, elapsed time: 00:00:01
Finished restore at 05-SEP-19

sql statement: alter system set spfile= ''/u01/app/oracle/product/19.3.0/dbhome_1/dbs/spfileORCL.ora''

contents of Memory Script:
{
   sql clone "alter system set  db_unique_name =
 ''FS'' comment=
 '''' scope=spfile";
   sql clone "alter system set  dg_broker_config_file1 =
 ''+DATA01/FS/dr1orcl.dat'' comment=
 '''' scope=spfile";
   sql clone "alter system set  dg_broker_config_file2 =
 ''+RECO01/FS/dr2orcl.dat'' comment=
 '''' scope=spfile";
   sql clone "alter system set  audit_file_dest =
 ''/u01/app/oracle/admin/FS/adump'' comment=
 '''' scope=spfile";
   shutdown clone immediate;
   startup clone nomount;
}
executing Memory Script

sql statement: alter system set  db_unique_name =  ''FS'' comment= '''' scope=spfile

sql statement: alter system set  dg_broker_config_file1 =  ''+DATA01/FS/dr1orcl.dat'' comment= '''' scope=spfile

sql statement: alter system set  dg_broker_config_file2 =  ''+RECO01/FS/dr2orcl.dat'' comment= '''' scope=spfile

sql statement: alter system set  audit_file_dest =  ''/u01/app/oracle/admin/FS/adump'' comment= '''' scope=spfile

Oracle instance shut down

connected to auxiliary database (not started)
Oracle instance started

Total System Global Area    1140849904 bytes

Fixed Size                     8895728 bytes
Variable Size                301989888 bytes
Database Buffers             822083584 bytes
Redo Buffers                   7880704 bytes
duplicating Online logs to Oracle Managed File (OMF) location
duplicating Datafiles to Oracle Managed File (OMF) location

contents of Memory Script:
{
   sql clone "alter system set  control_files =
  ''+RECO01/FS/CONTROLFILE/current.257.1018155321'', ''+DATA01/FS/CONTROLFILE/current.258.1018155321'' comment=
 ''Set by RMAN'' scope=spfile";
   restore clone from service  'NY' farsync controlfile;
}
executing Memory Script

sql statement: alter system set  control_files =   ''+RECO01/FS/CONTROLFILE/current.257.1018155321'', ''+DATA01/FS/CONTROLFILE/current.258.1018155321'' comment= ''Set by RMAN'' scope=spfile

Starting restore at 05-SEP-19
allocated channel: ORA_AUX_DISK_1
channel ORA_AUX_DISK_1: SID=46 device type=DISK

channel ORA_AUX_DISK_1: starting datafile backup set restore
channel ORA_AUX_DISK_1: using network backup set from service NY
channel ORA_AUX_DISK_1: restoring control file
channel ORA_AUX_DISK_1: restore complete, elapsed time: 00:00:01
output file name=+RECO01/FS/CONTROLFILE/current.383.1018155325
output file name=+DATA01/FS/CONTROLFILE/current.265.1018155325
Finished restore at 05-SEP-19

contents of Memory Script:
{
   sql clone 'alter database mount';
}
executing Memory Script

sql statement: alter database mount

contents of Memory Script:
{
   sql 'alter system archive log current';
}
executing Memory Script

sql statement: alter system archive log current
Finished Duplicate Db at 05-SEP-19

RMAN> **end-of-file**

Note that the far sync has no online logs, and that its standby logs match the primary's standby logs in count and size.

SQL> select count(*)numlogs, bytes/1024/1024 mb from v$log group by bytes;

no rows selected

SQL> select count(*)numlogs, bytes/1024/1024 mb from v$standby_log group by bytes;

   NUMLOGS         MB
---------- ----------
         4        200

Notice these points about the operation:

  • All file transfers were done with RMAN. No ssh connection is needed.
  • There is no ALTER DATABASE CREATE FAR SYNC INSTANCE CONTROLFILE step. RMAN handled the control file by backing it up at the primary and transferring it to the far sync.
  • Although the primary spfile is in ASM, the far sync spfile ends up on the file system in directory $ORACLE_HOME/dbs.
  • The password file that you created at the far sync was overwritten by a backup of the primary's password file, written as orapwORCL.
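To confirm which password file the far sync instance is actually using, the V$PASSWORDFILE_INFO view (available in recent releases) can be queried:

```sql
-- Shows the password file path the instance opened at startup.
select file_name from v$passwordfile_info;
```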

Duplicate primary to standby

Allocate an appropriate number of channels to reduce the time needed to duplicate the database.

[oracle@ip-172-31-86-22 ops]$ cat dup.db.standby.rcv
run {

allocate channel ch01 device type disk;
allocate channel ch02 device type disk;
allocate auxiliary channel aux01 device type disk;
allocate auxiliary channel aux02 device type disk;

DUPLICATE TARGET DATABASE
  FOR STANDBY
  FROM ACTIVE DATABASE
  DORECOVER
  SPFILE
    SET db_unique_name='SF'
    SET dg_broker_config_file1='+DATA01/SF/dr1orcl.dat'
    SET dg_broker_config_file2='+RECO01/SF/dr2orcl.dat'
    SET audit_file_dest='/u01/app/oracle/admin/SF/adump'
  NOFILENAMECHECK;

}
[oracle@ip-172-31-86-22 ops]$ vi dup.db.standby.rcv
[oracle@ip-172-31-86-22 ops]$ rman target sys/zystm.22@NY auxiliary sys/zystm.22@SF

Recovery Manager: Release 19.0.0.0.0 - Production on Thu Sep 5 05:04:52 2019
Version 19.3.0.0.0

Copyright (c) 1982, 2019, Oracle and/or its affiliates.  All rights reserved.

connected to target database: ORCL (DBID=1545932534)
connected to auxiliary database: ORCL (not mounted)

RMAN> @ dup.db.standby.rcv

RMAN> run {
2>
3> allocate channel ch01 device type disk;
4> allocate channel ch02 device type disk;
5> allocate auxiliary channel aux01 device type disk;
6> allocate auxiliary channel aux02 device type disk;
7>
8> DUPLICATE TARGET DATABASE
9>   FOR STANDBY
10>   FROM ACTIVE DATABASE
11>   DORECOVER
12>   SPFILE
13>     SET db_unique_name='SF'
14>     SET dg_broker_config_file1='+DATA01/SF/dr1orcl.dat'
15>     SET dg_broker_config_file2='+RECO01/SF/dr2orcl.dat'
16>     SET audit_file_dest='/u01/app/oracle/admin/SF/adump'
17>   NOFILENAMECHECK;
18>
19> }
using target database control file instead of recovery catalog
allocated channel: ch01
channel ch01: SID=84 device type=DISK

allocated channel: ch02
channel ch02: SID=85 device type=DISK

allocated channel: aux01
channel aux01: SID=40 device type=DISK

allocated channel: aux02
channel aux02: SID=41 device type=DISK

Starting Duplicate Db at 05-SEP-19
current log archived

contents of Memory Script:
{
   backup as copy reuse
   passwordfile auxiliary format  '/u01/app/oracle/product/19.3.0/dbhome_1/dbs/orapwORCL'   ;
   restore clone from service  'NY' spfile to
 '/u01/app/oracle/product/19.3.0/dbhome_1/dbs/spfileORCL.ora';
   sql clone "alter system set spfile= ''/u01/app/oracle/product/19.3.0/dbhome_1/dbs/spfileORCL.ora''";
}
executing Memory Script

Starting backup at 05-SEP-19
Finished backup at 05-SEP-19

Starting restore at 05-SEP-19

channel aux01: starting datafile backup set restore
channel aux01: using network backup set from service NY
channel aux01: restoring SPFILE
output file name=/u01/app/oracle/product/19.3.0/dbhome_1/dbs/spfileORCL.ora
channel aux01: restore complete, elapsed time: 00:00:02
Finished restore at 05-SEP-19

sql statement: alter system set spfile= ''/u01/app/oracle/product/19.3.0/dbhome_1/dbs/spfileORCL.ora''

contents of Memory Script:
{
   sql clone "alter system set  db_unique_name =
 ''SF'' comment=
 '''' scope=spfile";
   sql clone "alter system set  dg_broker_config_file1 =
 ''+DATA01/SF/dr1orcl.dat'' comment=
 '''' scope=spfile";
   sql clone "alter system set  dg_broker_config_file2 =
 ''+RECO01/SF/dr2orcl.dat'' comment=
 '''' scope=spfile";
   sql clone "alter system set  audit_file_dest =
 ''/u01/app/oracle/admin/SF/adump'' comment=
 '''' scope=spfile";
   shutdown clone immediate;
   startup clone nomount;
}
executing Memory Script

sql statement: alter system set  db_unique_name =  ''SF'' comment= '''' scope=spfile

sql statement: alter system set  dg_broker_config_file1 =  ''+DATA01/SF/dr1orcl.dat'' comment= '''' scope=spfile

sql statement: alter system set  dg_broker_config_file2 =  ''+RECO01/SF/dr2orcl.dat'' comment= '''' scope=spfile

sql statement: alter system set  audit_file_dest =  ''/u01/app/oracle/admin/SF/adump'' comment= '''' scope=spfile

Oracle instance shut down

connected to auxiliary database (not started)
Oracle instance started

Total System Global Area    1140849904 bytes

Fixed Size                     8895728 bytes
Variable Size                301989888 bytes
Database Buffers             822083584 bytes
Redo Buffers                   7880704 bytes
allocated channel: aux01
channel aux01: SID=44 device type=DISK
allocated channel: aux02
channel aux02: SID=45 device type=DISK
duplicating Online logs to Oracle Managed File (OMF) location
duplicating Datafiles to Oracle Managed File (OMF) location

contents of Memory Script:
{
   sql clone "alter system set  control_files =
  ''+RECO01/SF/CONTROLFILE/current.257.1018156101'', ''+DATA01/SF/CONTROLFILE/current.258.1018156101'' comment=
 ''Set by RMAN'' scope=spfile";
   restore clone from service  'NY' standby controlfile;
}
executing Memory Script

sql statement: alter system set  control_files =   ''+RECO01/SF/CONTROLFILE/current.257.1018156101'', ''+DATA01/SF/CONTROLFILE/current.258.1018156101'' comment= ''Set by RMAN'' scope=spfile

Starting restore at 05-SEP-19

channel aux01: starting datafile backup set restore
channel aux01: using network backup set from service NY
channel aux01: restoring control file
channel aux01: restore complete, elapsed time: 00:00:04
output file name=+RECO01/SF/CONTROLFILE/current.262.1018156111
output file name=+DATA01/SF/CONTROLFILE/current.267.1018156111
Finished restore at 05-SEP-19

contents of Memory Script:
{
   sql clone 'alter database mount standby database';
}
executing Memory Script

sql statement: alter database mount standby database

contents of Memory Script:
{
   set newname for clone tempfile  1 to new;
   switch clone tempfile all;
   set newname for clone datafile  1 to new;
   set newname for clone datafile  3 to new;
   set newname for clone datafile  4 to new;
   set newname for clone datafile  7 to new;
   restore
   from  nonsparse   from service
 'NY'   clone database
   ;
   sql 'alter system archive log current';
}
executing Memory Script

executing command: SET NEWNAME

renamed tempfile 1 to +DATA01 in control file

executing command: SET NEWNAME

executing command: SET NEWNAME

executing command: SET NEWNAME

executing command: SET NEWNAME

Starting restore at 05-SEP-19

channel aux01: starting datafile backup set restore
channel aux01: using network backup set from service NY
channel aux01: specifying datafile(s) to restore from backup set
channel aux01: restoring datafile 00001 to +DATA01
channel aux02: starting datafile backup set restore
channel aux02: using network backup set from service NY
channel aux02: specifying datafile(s) to restore from backup set
channel aux02: restoring datafile 00003 to +DATA01
channel aux02: restore complete, elapsed time: 00:00:38
channel aux02: starting datafile backup set restore
channel aux02: using network backup set from service NY
channel aux02: specifying datafile(s) to restore from backup set
channel aux02: restoring datafile 00004 to +DATA01
channel aux01: restore complete, elapsed time: 00:00:44
channel aux01: starting datafile backup set restore
channel aux01: using network backup set from service NY
channel aux01: specifying datafile(s) to restore from backup set
channel aux01: restoring datafile 00007 to +DATA01
channel aux01: restore complete, elapsed time: 00:00:05
channel aux02: restore complete, elapsed time: 00:00:12
Finished restore at 05-SEP-19

sql statement: alter system archive log current
current log archived

contents of Memory Script:
{
   restore clone force from service  'NY'
           archivelog from scn  2246774;
   switch clone datafile all;
}
executing Memory Script

Starting restore at 05-SEP-19

channel aux01: starting archived log restore to default destination
channel aux01: using network backup set from service NY
channel aux01: restoring archived log
archived log thread=1 sequence=9
channel aux02: starting archived log restore to default destination
channel aux02: using network backup set from service NY
channel aux02: restoring archived log
archived log thread=1 sequence=10
channel aux01: restore complete, elapsed time: 00:00:02
channel aux02: restore complete, elapsed time: 00:00:02
Finished restore at 05-SEP-19

datafile 1 switched to datafile copy
input datafile copy RECID=5 STAMP=1018156184 file name=+DATA01/SF/DATAFILE/system.266.1018156127
datafile 3 switched to datafile copy
input datafile copy RECID=6 STAMP=1018156184 file name=+DATA01/SF/DATAFILE/sysaux.265.1018156129
datafile 4 switched to datafile copy
input datafile copy RECID=7 STAMP=1018156184 file name=+DATA01/SF/DATAFILE/undotbs1.264.1018156167
datafile 7 switched to datafile copy
input datafile copy RECID=8 STAMP=1018156184 file name=+DATA01/SF/DATAFILE/users.269.1018156173

contents of Memory Script:
{
   set until scn  2247713;
   recover
   standby
   clone database
    delete archivelog
   ;
}
executing Memory Script

executing command: SET until clause

Starting recover at 05-SEP-19

starting media recovery

archived log for thread 1 with sequence 9 is already on disk as file +RECO01/SF/ARCHIVELOG/2019_09_05/thread_1_seq_9.261.1018156181
archived log for thread 1 with sequence 10 is already on disk as file +RECO01/SF/ARCHIVELOG/2019_09_05/thread_1_seq_10.259.1018156183
archived log file name=+RECO01/SF/ARCHIVELOG/2019_09_05/thread_1_seq_9.261.1018156181 thread=1 sequence=9
archived log file name=+RECO01/SF/ARCHIVELOG/2019_09_05/thread_1_seq_10.259.1018156183 thread=1 sequence=10
media recovery complete, elapsed time: 00:00:01
Finished recover at 05-SEP-19

contents of Memory Script:
{
   delete clone force archivelog all;
}
executing Memory Script

deleted archived log
archived log file name=+RECO01/SF/ARCHIVELOG/2019_09_05/thread_1_seq_9.261.1018156181 RECID=1 STAMP=1018156181
Deleted 1 objects

deleted archived log
archived log file name=+RECO01/SF/ARCHIVELOG/2019_09_05/thread_1_seq_10.259.1018156183 RECID=2 STAMP=1018156182
Deleted 1 objects

Finished Duplicate Db at 05-SEP-19
released channel: ch01
released channel: ch02
released channel: aux01
released channel: aux02

RMAN> **end-of-file**

RMAN>

Recovery Manager complete.

Notice that the online logs and the standby logs match the primary's in count and size:

SQL> select count(*)numlogs, bytes/1024/1024 mb from v$log group by bytes;

   NUMLOGS         MB
---------- ----------
         3        200

SQL> select count(*)numlogs, bytes/1024/1024 mb from v$standby_log group by bytes;

   NUMLOGS         MB
---------- ----------
         4        200

Configure restart

All new instances should be mounted at this stage. Register each database with Oracle Restart. (At the primary, this step might already be done.)

[oracle@ip-172-31-86-22 ~]$ srvctl add database -database NY -role PRIMARY -stopoption IMMEDIATE -instance ORCL -oraclehome /u01/app/oracle/product/19.3.0/dbhome_1 -spfile +DATA01/ORCL/PARAMETERFILE/spfile.266.1017440879 -diskgroup DATA01,RECO01

At the far sync:

[oracle@ip-172-31-28-23 ~]$ srvctl add database -database FS -role physical_standby -startoption MOUNT -stopoption ABORT -instance FS -oraclehome /u01/app/oracle/product/19.3.0/dbhome_1 -spfile /u01/app/oracle/product/19.3.0/dbhome_1/dbs/spfileFS.ora -diskgroup DATA01,RECO01

To activate the change, I found it necessary to issue srvctl start, which does not seem right; "srvctl enable database" should suffice. (With Oracle Restart, "srvctl enable instance" is not available.) Unless you activate the database this way, it will not start automatically upon host reboot.

[oracle@ip-172-31-28-23 ~]$ srvctl start database -database FS

At the standby:

[oracle@ip-172-32-10-34 ~]$ srvctl add database -database SF -role physical_standby -startoption MOUNT -stopoption ABORT -instance SF -oraclehome /u01/app/oracle/product/19.3.0/dbhome_1 -spfile /u01/app/oracle/product/19.3.0/dbhome_1/dbs/spfileSF.ora -diskgroup DATA01,RECO01
[oracle@ip-172-32-10-34 ~]$ srvctl start database -database SF

Reboot all hosts and check that the database instances start automatically, in the proper startup mode.

Validate static listeners

Validating the static listeners is critical. For each database or far sync (NY, FS, SF):

  • Connect over the network with sqlplus
  • Shutdown (shutdown abort for standbys and far syncs)
  • Startup (startup mount for standbys and far syncs)

Example:

[oracle@ip-172-31-28-23 ops]$ sqlplus sys/zystm.22@FS as sysdba

SQL*Plus: Release 19.0.0.0.0 - Production on Fri Sep 13 11:36:37 2019
Version 19.3.0.0.0

Copyright (c) 1982, 2019, Oracle.  All rights reserved.


Connected to:
Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production
Version 19.3.0.0.0

SQL> shutdown abort
ORACLE instance shut down.
SQL> startup mount
ORACLE instance started.

Total System Global Area 1140849904 bytes
Fixed Size                  8895728 bytes
Variable Size             301989888 bytes
Database Buffers          822083584 bytes
Redo Buffers                7880704 bytes
Database mounted.

This is an example of a failed static listener check:

[oracle@ip-172-31-28-23 ops]$ sqlplus sys/zystm.22@FS as sysdba

SQL*Plus: Release 19.0.0.0.0 - Production on Fri Sep 13 11:36:37 2019
Version 19.3.0.0.0

Copyright (c) 1982, 2019, Oracle.  All rights reserved.


Connected to:
Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production
Version 19.3.0.0.0

SQL> shutdown abort
ORACLE instance shut down.
SQL> startup mount
ORACLE instance started.

Total System Global Area 1140849904 bytes
Fixed Size                  8895728 bytes
Variable Size             301989888 bytes
Database Buffers          822083584 bytes
Redo Buffers                7880704 bytes
Database mounted.
SQL> shutdown abort
ORACLE instance shut down.
ERROR:
ORA-12505: TNS:listener does not currently know of SID given in connect
descriptor


SQL> startup mount
SP2-0640: Not connected

Resolve all failures before proceeding.

Broker start

At this stage, the primary database is open and the far sync and standby instances are mounted. At the primary, far sync, and standby, start the Data Guard broker:

SQL> alter system set dg_broker_start = true;

System altered.

Create the broker configuration

Connect to the Data Guard broker:

[oracle@ip-172-31-86-22 broker]$ dgmgrl sys/zystm.22@NY
DGMGRL for Linux: Release 19.0.0.0.0 - Production on Mon Sep 2 13:59:25 2019
Version 19.3.0.0.0

Copyright (c) 1982, 2019, Oracle and/or its affiliates.  All rights reserved.

Welcome to DGMGRL, type "help" for information.
Connected to "NY"
Connected as SYSDBA.

Create the Data Guard broker configuration. One way to do it is to create and test the standby first, and add the far sync later. However, in this example, we add the far sync and the standby in one fell swoop.

When creating new broker objects, quote your identifiers if you want to preserve name case.

DGMGRL> create configuration 'ORCL_CONFIG' as primary database is 'NY' connect identifier is 'NY';
Configuration "ORCL_CONFIG" created with primary database "NY"
DGMGRL> add far_sync 'FS' as connect identifier is 'FS';
far sync instance "FS" added
DGMGRL> add database 'SF' as connect identifier is 'SF' maintained as physical;
Database "SF" added

Add the redo routes:

DGMGRL> edit database NY set property RedoRoutes = '(LOCAL : FS SYNC)';
Property "redoroutes" updated
DGMGRL> edit database SF set property RedoRoutes = '(LOCAL : FS SYNC)';
Property "redoroutes" updated
DGMGRL> edit far_sync FS set property RedoRoutes = '(NY : SF ASYNC)(SF : NY ASYNC)';
Property "redoroutes" updated

Enable the configuration

DGMGRL> enable configuration
Enabled.

A point of interest is the fal_server configuration symmetry:

parameter   primary      far sync     standby
fal_server  'FS','SF'*   'NY','SF'    'FS','NY'

* after switchover

The Data Guard broker sets fal_server at the far sync and standby when you enable the configuration. If you perform a switchover, the broker sets fal_server at the new standby and clears fal_server at the old standby.

Switchover test

Test switchover to SF. The output becomes:

DGMGRL> switchover to SF
Performing switchover NOW, please wait...
Operation requires a connection to database "SF"
Connecting ...
Connected to "SF"
Connected as SYSDBA.
New primary database "SF" is opening...
Oracle Clusterware is restarting database "NY" ...
Connected to an idle instance.
Connected to an idle instance.
Connected to an idle instance.
Connected to an idle instance.
Connected to "NY"
Connected to "NY"
Switchover succeeded, new primary is "sf"

Test switchover to NY. The output is now:

DGMGRL> switchover to NY
Performing switchover NOW, please wait...
New primary database "NY" is opening...
Oracle Clusterware is restarting database "SF" ...
Connected to an idle instance.
Connected to an idle instance.
Connected to "SF"
Connected to "SF"
Switchover succeeded, new primary is "ny"

Setting the protection mode

Maximum Performance

In Maximum Performance protection mode:

Transactions commit as soon as all redo data generated by those transactions has been written to the online redo log.

Maximum Availability

In Maximum Availability mode:

Under normal operations, transactions do not commit until all redo data needed to recover those transactions has been written to the online redo log AND based on user configuration, one of the following is true:

    • redo has been received at the standby, I/O to the standby redo log has been initiated, and acknowledgement sent back to primary
    • redo has been received and written to standby redo log at the standby and acknowledgement sent back to primary

If the primary does not receive acknowledgement from at least one synchronized standby, then it operates as if it were in maximum performance mode to preserve primary database availability until it is again able to write its redo stream to a synchronized standby database.

With far sync, replace “standby” in the preceding description with “far sync”: transactions do not commit until the redo has been written to a standby redo log at the far sync. Because the far sync is in the same region as the primary, commit performance of a Maximum Availability system is expected to be better with a far sync.
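As a rough illustration of why the far sync placement matters, a SYNC commit waits for one network round trip plus a remote standby redo log write. The round-trip and write times below are assumptions chosen for illustration, not measurements from this setup:

```python
# Rough model of SYNC commit overhead in Maximum Availability mode.
# All numbers here are illustrative assumptions, not measurements.
RTT_SAME_REGION_MS = 0.6     # assumed primary -> far sync round trip (same region)
RTT_CROSS_REGION_MS = 62.0   # assumed primary -> standby round trip (cross region)

def sync_commit_overhead(rtt_ms, standby_write_ms=1.0):
    """A SYNC commit waits one network round trip plus the remote
    standby redo log write before the acknowledgement arrives."""
    return rtt_ms + standby_write_ms

with_far_sync = sync_commit_overhead(RTT_SAME_REGION_MS)
without_far_sync = sync_commit_overhead(RTT_CROSS_REGION_MS)
print(f"SYNC to far sync:  ~{with_far_sync:.1f} ms per commit")   # ~1.6 ms
print(f"SYNC cross-region: ~{without_far_sync:.1f} ms per commit") # ~63.0 ms
```

Under these assumptions, the far sync route cuts the per-commit network penalty by more than an order of magnitude; the actual gain depends on your measured inter-region latency.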

Set protection mode to Maximum Availability.

DGMGRL> edit configuration set protection mode as MaxAvailability;
Succeeded.

Wait a few minutes and check:

DGMGRL> show configuration

Configuration - ORCL_CONFIG

  Protection Mode: MaxAvailability
  Members:
  NY - Primary database
    FS - Far sync instance
      SF - Physical standby database

Fast-Start Failover:  Disabled

Configuration Status:
SUCCESS   (status updated 51 seconds ago)

Test switchover and switch back:

DGMGRL> switchover to SF
Performing switchover NOW, please wait...
Operation requires a connection to database "SF"
Connecting ...
Connected to "SF"
Connected as SYSDBA.
New primary database "SF" is opening...
Oracle Clusterware is restarting database "NY" ...
Connected to an idle instance.
Connected to an idle instance.
Connected to "NY"
Connected to "NY"
Switchover succeeded, new primary is "sf"
DGMGRL> switchover to NY
Performing switchover NOW, please wait...
Operation requires a connection to database "NY"
Connecting ...
Connected to "NY"
Connected as SYSDBA.
New primary database "NY" is opening...
Oracle Clusterware is restarting database "SF" ...
Connected to an idle instance.
Connected to an idle instance.
Connected to an idle instance.
Connected to "SF"
Connected to "SF"
Switchover succeeded, new primary is "ny"

Notice that, at the far sync, one or more standby logs are assigned:

SQL> select group#, bytes/1024/1024 mb, thread#, sequence#, status from v$standby_log;

    GROUP#         MB    THREAD#  SEQUENCE# STATUS
---------- ---------- ---------- ---------- ----------
         1        200          1        572 ACTIVE
         2        200          1          0 UNASSIGNED
         3        200          1          0 UNASSIGNED
         4        200          0          0 UNASSIGNED

and that, at the physical standby, one or more standby logs are assigned:

SQL> select group#, bytes/1024/1024 mb, thread#, sequence#, status from v$standby_log;

    GROUP#         MB    THREAD#  SEQUENCE# STATUS
---------- ---------- ---------- ---------- ----------
         4        200          1          0 UNASSIGNED
         5        200          1        572 ACTIVE
         6        200          0          0 UNASSIGNED
         7        200          0          0 UNASSIGNED

Disabling far sync

You can disable far sync. First route redo directly between the databases, then disable the far sync instance:

DGMGRL> edit database NY set property RedoRoutes = '(LOCAL : SF SYNC)';
Property "redoroutes" updated
DGMGRL> edit database SF set property RedoRoutes = '(LOCAL : NY SYNC)';
Property "redoroutes" updated
DGMGRL> edit far_sync FS set property RedoRoutes = '';
Property "redoroutes" updated
DGMGRL> disable far_sync FS
Disabled.

The configuration display looks like this:

DGMGRL> show configuration

Configuration - ORCL_CONFIG

  Protection Mode: MaxAvailability
  Members:
  NY - Primary database
    SF - Physical standby database

  Members Not Receiving Redo:
  FS - Far sync instance (disabled)
    ORA-16749: The member was disabled manually.

Fast-Start Failover:  Disabled

Configuration Status:
SUCCESS   (status updated 24 seconds ago)

Re-enabling far sync

You can re-enable far sync. Touch the objects in the reverse of the order used to disable them: the far sync instance first, then the far sync redo routes, then the database redo routes.

DGMGRL> enable far_sync FS
Enabled.
DGMGRL> edit far_sync FS set property RedoRoutes = '(NY : SF ASYNC)(SF : NY ASYNC)';
Property "redoroutes" updated
DGMGRL> edit database SF set property RedoRoutes = '(LOCAL : FS SYNC)';
Property "redoroutes" updated
DGMGRL> edit database NY set property RedoRoutes = '(LOCAL : FS SYNC)';
Property "redoroutes" updated

Check:

DGMGRL> show configuration

Configuration - ORCL_CONFIG

  Protection Mode: MaxAvailability
  Members:
  NY - Primary database
    FS - Far sync instance
      SF - Physical standby database

Fast-Start Failover:  Disabled

Configuration Status:
SUCCESS   (status updated 39 seconds ago)

Removing the far sync

You can remove a disabled far sync from the configuration.

DGMGRL> remove far_sync FS
Removed far sync instance "fs" from the configuration

Switchover performance

We want to know how long it takes from starting the switchover until the new primary is available; in other words, the elapsed time from when the switchover command is issued until the new primary is open. The times appear in the alert log. For example:

2019-09-05T14:44:04.271609-04:00
SWITCHOVER VERIFY BEGIN
...
2019-09-05T14:44:59.392512-04:00
TMI: adbdrv open database END 2019-09-05 14:44:59.392313
Starting background process CJQ0
Completed: ALTER DATABASE OPEN
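The elapsed time can be computed from two alert-log timestamps. A small sketch, using the timestamp format shown above:

```python
from datetime import datetime

# Parse two 19c alert-log timestamps (ISO format with UTC offset)
# and return the elapsed time in seconds.
FMT = "%Y-%m-%dT%H:%M:%S.%f%z"

def elapsed_seconds(start_ts, end_ts):
    return (datetime.strptime(end_ts, FMT)
            - datetime.strptime(start_ts, FMT)).total_seconds()

print(elapsed_seconds("2019-09-05T14:44:04.271609-04:00",
                      "2019-09-05T14:44:59.392512-04:00"))   # 55.120903
```

This matches the 55.1-second figure in the timing table below for that switchover.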

We can compare database opening time with far sync

  Protection Mode: MaxAvailability
  Members:
  NY - Primary database
    FS - Far sync instance
      SF - Physical standby database

to time without far sync (direct route)

  Protection Mode: MaxAvailability
  Members:
  NY - Primary database
    SF - Physical standby database

With far sync, the switchover timing data are:

switchover to  date  switchover start  new primary open  elapsed to open (s)  redo route  protection mode
SF 9/5/2019 05:54:16.9 05:55:11.4 54.6 far sync Max Perf
NY 9/5/2019 06:02:11.7 06:04:12.2 120.5 far sync Max Perf
SF 9/5/2019 06:09:34.2 06:10:40.9 66.8 far sync Max Perf
NY 9/5/2019 06:13:32.7 06:15:21.9 109.1 far sync Max Perf
SF 9/5/2019 13:38:38.6 13:40:18.1 99.5 far sync Max Avail
NY 9/5/2019 14:07:56.7 14:10:23.7 147.0 far sync Max Avail
SF 9/5/2019 14:15:32.9 14:17:58.9 145.9 far sync Max Avail
NY 9/5/2019 14:20:04.6 14:21:38.9 94.3 far sync Max Avail
Average 104.7
Stdev 33.5

compared to without far sync:

switchover to  date  switchover start  new primary open  elapsed to open (s)  redo route  protection mode
SF 9/5/2019 06:19:48.0 06:20:47.4 59.4 direct Max Perf
NY 9/5/2019 06:23:10.5 06:24:06.8 56.3 direct Max Perf
SF 9/5/2019 06:27:20.5 06:28:19.5 59.0 direct Max Perf
NY 9/5/2019 06:29:57.2 06:30:54.0 56.8 direct Max Perf
SF 9/5/2019 14:34:49.6 14:35:50.1 60.5 direct Max Avail
NY 9/5/2019 14:38:04.4 14:38:59.3 54.9 direct Max Avail
SF 9/5/2019 14:40:59.6 14:41:58.1 58.5 direct Max Avail
NY 9/5/2019 14:44:04.3 14:44:59.4 55.1 direct Max Avail
Average 57.6
Stdev 2.1
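The summary statistics in these tables can be reproduced from the elapsed-time columns, for example in Python:

```python
from statistics import mean, stdev

# Elapsed time to open (seconds), copied from the two tables above.
far_sync = [54.6, 120.5, 66.8, 109.1, 99.5, 147.0, 145.9, 94.3]
direct   = [59.4, 56.3, 59.0, 56.8, 60.5, 54.9, 58.5, 55.1]

for name, times in (("far sync", far_sync), ("direct", direct)):
    print(f"{name}: avg={mean(times):.1f}s stdev={stdev(times):.1f}s")
# far sync: avg=104.7s stdev=33.5s
# direct: avg=57.6s stdev=2.1s
```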

Without far sync, switchover time is lower (Average) and more consistent (Stdev). Far sync, therefore, imposes a switchover time penalty.

Conclusion

A far sync implementation is presented. There are several points of interest.

  • The platform is Red Hat Linux 7.2 on Amazon AWS EC2.
  • The network setup is described in a separate article.
  • You can measure network latency and throughput with qperf.
  • The grid and oracle home users are separate accounts.
  • Database storage is in ASM, not operating system files.
  • The number of standby log files depends on the expected workload.
  • Parameter compatible must be set the same at the primary, far sync, and standby.
  • Check standby log size before duplicating the database.
  • File transfer is by RMAN only.
  • SSH trust is not required to set up Data Guard.
  • Multiple RMAN channels should be used to duplicate the database.
  • Configure Restart for all instances.
  • Validate the static listeners.
  • Data Guard configuration and operations are done from the broker, not sqlplus.
  • The standby and the far sync are set up at the same time, not separately.
  • Reboot hosts and confirm automatic startup, startup mode, and Data Guard role.
  • Test switchover after configuring restart.
  • Data Guard broker manages fal_server at all nodes during role transitions.
  • You can change the protection mode to Maximum Availability.
  • Far sync can reduce or prevent data loss.
  • With far sync, switchover takes longer and the elapsed time is less consistent.

We have presented an Oracle Data Guard 19c setup with far sync. Some techniques here are variants on contributions found elsewhere. Some findings on performance are presented.

Data Guard network in AWS

By Brian Fitzgerald

Introduction

This is a cross-region Data Guard network in Amazon AWS. This article covers VPCs, peering, and security groups.

VPC

We are starting with a default VPC at the primary with CIDR block 172.31.0.0/16. At the standby, in preparation for peering, we will create a new VPC with non-overlapping CIDR block 172.32.0.0/20.
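VPC peering requires non-overlapping CIDR blocks. A quick sanity check with Python's ipaddress module, using the two blocks above:

```python
import ipaddress

# Peering requires non-overlapping CIDR blocks.
primary = ipaddress.ip_network("172.31.0.0/16")   # default VPC at the primary
standby = ipaddress.ip_network("172.32.0.0/20")   # new VPC at the standby

print(primary.overlaps(standby))   # False: safe to peer

# A /20 carved from the primary's range would overlap and could not be peered:
print(primary.overlaps(ipaddress.ip_network("172.31.80.0/20")))   # True
```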

cr.vpc

Enable DNS hostnames, which is required for the Oracle grid setup.

vpc.enable.dns.hostnames2

We need to create at least one subnet.

cr.subnet

Optionally, you may create an internet gateway and route.

Peering

At the standby VPC, set up peering to the primary VPC:

cr.peering

Note the acknowledgement:

conf.peering

At the primary, accept the request.

accept.peering

Press “Yes, Accept”.

accept.peering.yes.png

Note the acknowledgement. Select “Modify my route tables now”:

will.modify.route

At the standby, add a route to the primary VPC.

edit.routes.sf

At the primary, add a route to the standby VPC.

edit.routes.va

Security Groups

Create primary, far sync, and standby EC2 instances. If you have Enterprise Manager set up already, you can include its host in the security group rules now. Assuming these IP addresses:

Description IP Address
primary 172.31.86.22
far sync 172.31.28.23
standby 172.32.10.34
Enterprise Manager 172.31.82.194

A minimalist security group arrangement involves just the database boxes. In that case, set up security groups such as the following. At the primary EC2 instance, accept incoming Oracle connections from the far sync and the standby. At all EC2 instances, accept incoming SSH and EM agent connections.

sg.prim

At the far sync, accept Oracle connections from the primary and the standby.

sg.far

At the standby, accept Oracle connections from the primary and the far sync.

sg.stby

Suppose, however, that there are database application clients on subnets 172.32.0.0/20 and 172.31.80.0/20. In that case, you could revise the primary and standby security groups as follows. At both the primary and the standby, accept Oracle connections from the primary and standby subnets and from the far sync.

sg.app.prim

sg.app.stby

Instead of opening up incoming SSH to all hosts, you can create one or more bastion hosts, and you can restrict incoming SSH to only the bastion.

Conclusion

Using the AWS console, you can set up networking for a cross-region Data Guard network. At the VPC level, the first key point is to select non-overlapping IP address ranges with a view to establishing peering. For a smooth grid install, enable DNS hostnames. After your EC2 instances are created, you must configure security groups to accept incoming database, EM agent, and SSH connections. Security groups should be as restrictive as possible.

grid setup error: could not retrieve local node name

By Brian Fitzgerald

Error

In AWS EC2 Red Hat Linux, gridSetup.sh reports this error:

PRVF-0002 : could not retrieve local node name

Solution

Set the EnableDnsHostnames flag.

Error details

In Linux, you run Oracle 19c grid setup:

[grid@ip-10-0-2-197 ~]$ cd /u01/app/19.3.0/grid
[grid@ip-10-0-2-197 grid]$ ./gridSetup.sh

After the splash screen:

splash

this error appears:

errmsg

The error appears in several log files. For example: /tmp/GridSetupActions2019-09-01_09-40-29AM/gridSetupActions2019-09-01_09-40-29AM.log

SEVERE: [Sep 1, 2019 9:40:36 AM] [FATAL] PRVF-0002 : could not retrieve local node name
ip-10-0-2-197.ec2.internal: ip-10-0-2-197.ec2.internal: Name or service not known.
Refer associated stacktrace #oracle.install.commons.util.exception.AbstractErrorAdvisor:165

/tmp/GridSetupActions2019-09-01_09-40-29AM/time2019-09-01_09-40-29AM.log

oracle.cluster.verification.VerificationException: PRVF-0002 : could not retrieve local node name
ip-10-0-2-197.ec2.internal: ip-10-0-2-197.ec2.internal: Name or service not known

Note that the local hostname does not resolve to an IP address:

[grid@ip-10-0-2-197 grid]$ hostname
ip-10-0-2-197.ec2.internal
[grid@ip-10-0-2-197 grid]$ ping ip-10-0-2-197.ec2.internal
ping: ip-10-0-2-197.ec2.internal: Name or service not known

Cause

The Virtual Private Cloud (VPC) has the EnableDnsHostnames flag set to Disabled (False). This could happen in a non-default VPC, meaning a VPC that you create. You can see this in the AWS console:

vpc

or from python:

Python 2.7.5 (default, Jun 11 2019, 14:33:56)
[GCC 4.8.5 20150623 (Red Hat 4.8.5-39)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> from boto3 import client
>>> rsp = client('ec2').describe_vpc_attribute(
...     Attribute='enableDnsHostnames',
...     VpcId = 'vpc-0786f6ef4dc16cefc'
... )
>>> rsp['EnableDnsHostnames']['Value']
False

or from the command line:

[grid@ip-10-0-2-197 grid]$ aws ec2 describe-vpc-attribute --vpc-id vpc-0786f6ef4dc16cefc --attribute enableDnsHostnames
{
    "VpcId": "vpc-0786f6ef4dc16cefc",
    "EnableDnsHostnames": {
        "Value": false
    }
}

Fix

Set the EnableDnsHostnames flag, which the AWS manual describes:

Indicates whether the instances launched in the VPC get DNS hostnames. If enabled, instances in the VPC get DNS hostnames; otherwise, they do not.

From the VPC Dashboard, select your VPC. Click Actions, and select “Edit DNS hostnames”. At the Edit DNS hostnames screen, check “enable” and press “Save”.

edit.dns.hostnames

or, in python:

>>> from boto3 import client
>>> rsp = client('ec2').modify_vpc_attribute(
...     VpcId = 'vpc-0786f6ef4dc16cefc',
...     EnableDnsHostnames = { 'Value': True }
... )

or, using the aws command line, issue modify-vpc-attribute:

[grid@ip-10-0-2-197 grid]$ aws ec2 modify-vpc-attribute --vpc-id vpc-0786f6ef4dc16cefc --enable-dns-hostnames
[grid@ip-10-0-2-197 grid]$ aws ec2 describe-vpc-attribute --vpc-id vpc-0786f6ef4dc16cefc --attribute enableDnsHostnames
{
    "VpcId": "vpc-0786f6ef4dc16cefc",
    "EnableDnsHostnames": {
        "Value": true
    }
}

Retest

The change does not affect the EC2 instance immediately.

[grid@ip-10-0-2-197 grid]$ ping ip-10-0-2-197.ec2.internal
ping: ip-10-0-2-197.ec2.internal: Name or service not known

Reboot the host:

[ec2-user@ip-10-0-2-197 scr]$ sudo reboot

Now test:

[ec2-user@ip-10-0-2-197 ~]$ ping ip-10-0-2-197.ec2.internal
PING ip-10-0-2-197.ec2.internal (10.0.2.197) 56(84) bytes of data.

Now gridSetup.sh starts normally, and the installer menu appears:

[grid@ip-10-0-2-197 grid]$ ./gridSetup.sh

installer

Discussion

Note that error PRVF-0002 has been widely reported in blogs. In this case, the environment is AWS EC2 Red Hat Linux 7.2 and Oracle 19c Grid Infrastructure. The error also appears on Oracle Linux, and in Oracle versions as early as 11g. PRVF-0002 can also appear in a silent install.

The fix is sometimes given as a local /etc/hosts edit. However, in this blog article, I am proposing a fix that you implement once per VPC, so that you don’t have to edit /etc/hosts on every new instance. You also get more consistent deployments across all VPCs, not just the default VPC.

Conclusion

gridSetup error “PRVF-0002 : could not retrieve local node name” can be resolved at the VPC level by setting EnableDnsHostnames.

Dual stack listeners on Linux

By Brian Fitzgerald

Question

$ netstat -ntl | grep :1521
tcp6 0 0 :::1521 :::* LISTEN

Q: Does the netstat output shown here mean that the listener accepts only IPv6 connections?

A: No. By default, a Linux listener uses a dual stack socket.

Oracle listener trace

Here is an strace of the Oracle TNS listener socket binding.

32649 socket(AF_INET6, SOCK_STREAM, IPPROTO_IP) = 8
32649 setsockopt(8, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
32649 bind(8, {sa_family=AF_INET6, sin6_port=htons(1521), inet_pton(AF_INET6, "::", &sin6_addr), sin6_flowinfo=0, sin6_scope_id=0}, 28) = 0
32649 listen(8, 128)

In the bind call, notice that the socket address family is AF_INET6, and the IP address shown is “::”, meaning listen on all IP addresses on the local host. The netstat output looks like this:

[grid@ip-172-32-10-34 ~]$ netstat -ntl | grep :1521
tcp6 0 0 :::1521 :::* LISTEN

However, the listener will accept either IPv4 or IPv6. You can check this by testing IPv4 and IPv6 one at a time:

[ec2-user@ip-172-32-10-34 ~]$ nc -v -4 localhost 1521 < /dev/null
Ncat: Version 7.50 ( https://nmap.org/ncat )
Ncat: Connected to 127.0.0.1:1521.
Ncat: 0 bytes sent, 0 bytes received in 0.02 seconds.
[ec2-user@ip-172-32-10-34 ~]$ nc -v -6 localhost 1521 < /dev/null
Ncat: Version 7.50 ( https://nmap.org/ncat )
Ncat: Connected to ::1:1521.
Ncat: 0 bytes sent, 0 bytes received in 0.03 seconds.

The connect calls were:

connect(3, {sa_family=AF_INET, sin_port=htons(1521), sin_addr=inet_addr("127.0.0.1")}, 16) = -1 EINPROGRESS (Operation now in progress)
connect(3, {sa_family=AF_INET6, sin6_port=htons(1521), inet_pton(AF_INET6, "::1", &sin6_addr), sin6_flowinfo=0, sin6_scope_id=0}, 28) = -1 EINPROGRESS (Operation now in progress)

You can compare the Oracle listener's single dual-stack socket to other listeners that use a separate socket per address family.

Linux sshd trace

By comparison, here is a trace of the sshd listener socket bindings.

1142 socket(AF_INET, SOCK_STREAM, IPPROTO_TCP) = 3
1142 fcntl(3, F_GETFL) = 0x2 (flags O_RDWR)
1142 fcntl(3, F_SETFL, O_RDWR|O_NONBLOCK) = 0
1142 setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
1142 bind(3, {sa_family=AF_INET, sin_port=htons(22), sin_addr=inet_addr("0.0.0.0")}, 16) = 0
1142 listen(3, 128) = 0
...
1142 socket(AF_INET6, SOCK_STREAM, IPPROTO_TCP) = 4
1142 fcntl(4, F_GETFL) = 0x2 (flags O_RDWR)
1142 fcntl(4, F_SETFL, O_RDWR|O_NONBLOCK) = 0
1142 setsockopt(4, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
1142 setsockopt(4, SOL_IPV6, IPV6_V6ONLY, [1], 4) = 0
1142 bind(4, {sa_family=AF_INET6, sin6_port=htons(22), inet_pton(AF_INET6, "::", &sin6_addr), sin6_flowinfo=0, sin6_scope_id=0}, 28) = 0
1142 listen(4, 128)

Notice that for socket 3, the address family is AF_INET and the listener IP address is given as “0.0.0.0”, again meaning listen on all IP addresses. Examining socket 4 trace carefully, we see that before the bind call, socket option IPV6_V6ONLY is set. The netstat output looks like this:

[ec2-user@ip-172-32-10-34 ~]$ netstat -ntl | grep :22
tcp 0 0 0.0.0.0:22 0.0.0.0:* LISTEN
tcp6 0 0 :::22 :::* LISTEN

netstat displays one output line per socket. sshd has two listener sockets, one for IPv4 and a separate IPv6 socket.

IPv6-only listener

You can demonstrate an IPv6-only listener:

[ec2-user@ip-172-32-10-34 ~]$ nc -6 -l 6666
[ec2-user@ip-172-32-10-34 ~]$ netstat -ntl | grep :6666
tcp6       0      0 :::6666                 :::*                    LISTEN

An IPv4 connection fails:

[ec2-user@ip-172-32-10-34 ~]$ nc -4 localhost 6666
Ncat: Connection refused.

The nc utility makes two connection attempts:

socket(AF_INET, SOCK_STREAM, IPPROTO_TCP) = 3
fcntl(3, F_GETFL)                       = 0x2 (flags O_RDWR)
fcntl(3, F_SETFL, O_RDWR|O_NONBLOCK)    = 0
connect(3, {sa_family=AF_INET, sin_port=htons(6666), sin_addr=inet_addr("127.0.0.1")}, 16) = -1 EINPROGRESS (Operation now in progress)
...
socket(AF_INET, SOCK_STREAM, IPPROTO_TCP) = 4
fcntl(4, F_GETFL)                       = 0x2 (flags O_RDWR)
fcntl(4, F_SETFL, O_RDWR|O_NONBLOCK)    = 0
connect(4, {sa_family=AF_INET, sin_port=htons(6666), sin_addr=inet_addr("127.0.0.1")}, 16) = -1 EINPROGRESS (Operation now in progress)
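The same refusal can be reproduced in Python by setting IPV6_V6ONLY on the listening socket, which is what sshd did in the trace above:

```python
import socket

# With IPV6_V6ONLY set, the IPv6 socket no longer accepts IPv4 clients.
srv = socket.socket(socket.AF_INET6, socket.SOCK_STREAM)
srv.setsockopt(socket.IPPROTO_IPV6, socket.IPV6_V6ONLY, 1)  # IPv6 only
srv.bind(("::", 0))
srv.listen(5)
port = srv.getsockname()[1]

refused = False
try:
    socket.create_connection(("127.0.0.1", port), timeout=2)
except ConnectionRefusedError:
    refused = True          # expected: no IPv4 socket is bound to this port
print("IPv4 connection refused:", refused)
srv.close()
```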

IPv4-only listener

Likewise, you can demonstrate an IPv4-only listener.

[ec2-user@ip-172-32-10-34 ~]$ nc -4 -l 4444

Netstat:

[ec2-user@ip-172-32-10-34 ~]$ netstat -ntl | grep :4444
tcp 0 0 0.0.0.0:4444 0.0.0.0:* LISTEN

IPv6 connection fails:

[ec2-user@ip-172-32-10-34 ~]$ nc -6 localhost 4444
Ncat: Connection refused.

The connect call:

connect(3, {sa_family=AF_INET6, sin6_port=htons(4444), inet_pton(AF_INET6, "::1", &sin6_addr), sin6_flowinfo=0, sin6_scope_id=0}, 28) = -1 EINPROGRESS (Operation now in progress)

Conclusion

In the netstat output,

[ec2-user@ip-172-32-10-34 ~]$ netstat -ntl | grep :1521
tcp6 0 0 :::1521 :::* LISTEN

The lack of a line such as

tcp 0 0 0.0.0.0:1521 0.0.0.0:* LISTEN

does not mean that the listener does not accept IPv4 connections. It could mean that the listener implements a dual stack socket.

grid 19c install with ASM filter driver

Introduction

Starting in Oracle 12.2, the grid installation experience changed in two major ways:

  1. In the past, you unzipped the binaries into a download folder and executed runInstaller. Oracle Universal Installer (OUI) copied the binaries to your grid oracle home. In the new procedure, you unzip the grid binaries into the grid home and then run gridSetup.sh to configure the software in-place.
  2. Instead of configuring asmlib, you are going to initialize the disks using ASM Filter Driver (AFD).

These changes raise new organizational issues and introduce a new, critical step.

Scope

The scope of this article is:

  • Installation to cloud virtual machine (Azure or AWS EC2)
  • Oracle Restart (Not RAC)
  • x86_64 hardware
  • Linux 7
  • udev is not considered
  • New install, not an upgrade
  • No preexisting asmlib

Grid infrastructure in the cloud

RAC

The main purpose of Oracle Grid Infrastructure was to support Real Application Clusters (RAC). RAC requires sharing storage volumes across hosts, which most cloud providers disallow. An iSCSI server can be used to share drives, but at the cost of added complexity. RAC deployments are, therefore, less common in the cloud.

ASM

Oracle Grid Infrastructure is extremely useful, even without RAC. Automatic Storage Management is a main component of grid. Some benefits of ASM are:

  • Convenient, consolidated, managed storage
  • Efficient RMAN and Data Guard administration
  • Monitoring and managing space in ASM using Oracle Enterprise Manager

Restart

An Oracle standalone (non-RAC) database running on grid infrastructure is known as a “restart”. You use grid to manage Oracle startup and shutdown, and to monitor the database instance. Oracle Restart works very well in the cloud.

Organizational issues

Previously, during the operating system setup, asmlib could be configured along with other root steps, prior to grid installation. However, AFD labeling must wait until grid installation has begun. In an organization with segregation of SA and DBA duties, the DBA is going to need to schedule three separate tasks for the SA.

  1. Setup the operating system
  2. Label ASM disks using AFD
  3. Run root.sh during the grid setup

Instead of personal handoffs, you can do one of these:

  1. Have one person with root and grid access perform all steps.
  2. Use an automated deployment tool.
  3. Run a script as root with some steps su’d to grid.

This blog article covers manual command line steps as root or grid, and use of the gridSetup.sh X windows GUI.

Steps leading up to AFD

The steps leading up to ASM disk labeling are, briefly:

  • Select a compatible machine image (Linux on x86_64)
  • Create the OS (ex: 2 CPU, 8G RAM, 32G operating system disk)
  • Update and install packages
  • Configure the operating system (swap, sysctl.conf)
  • Create the grid user and directories
  • Download and unzip the software

Now you are ready to configure your ASM disks.

Disk allocation

In the cloud, you can allocate new disks from the web console. You can also use a command line tool. Finally, you can write a script to do the job, in Python for example. You can select from the available types (hard drive or SSD), you can specify the size, and, in some cases, you can specify the IOPS.

You can attach a disk to a live system, and you can detach a disk from a live system. Before allocating new disks, list the existing disks:

[root@grid19c ~]# df /
Filesystem 1K-blocks Used Available Use% Mounted on
/dev/sda2 33025276 12824288 20200988 39% /
[root@grid19c ~]# swapon
NAME TYPE SIZE USED PRIO
/mnt/resource/swapfile file 2G 0B -2
[root@grid19c ~]# ls -l /dev/sd*
brw-rw----. 1 root disk 8, 0 Aug 23 12:40 /dev/sda
brw-rw----. 1 root disk 8, 1 Aug 23 12:40 /dev/sda1
brw-rw----. 1 root disk 8, 2 Aug 23 12:40 /dev/sda2
brw-rw----. 1 root disk 8, 16 Aug 23 12:40 /dev/sdb
brw-rw----. 1 root disk 8, 17 Aug 23 12:40 /dev/sdb1

AWS

We are mainly interested in Linux systems that are available in the AWS Marketplace. The virtualization type in the AWS Marketplace right now is predominantly Hardware Virtual Machine (HVM). In the systems that I have checked, the possible device names are /dev/xvd[a-z], or only 26 devices, which is not very flexible. The allowable device size range is 1 GiB to 16384 GiB. The largest SSD devices are burstable up to 64000 IOPS. In AWS, an attached device can be resized.

aws.disks

AWS lets you choose the Linux device node name. You can retrieve the node name from the console, the command line, or the API. For example:

>>> from boto3 import resource
>>> resource('ec2').Volume('vol-0679cfe2a209db2ed').attachments[0]['Device']
'/dev/sdg'

Azure

In Azure, disks range in size from 32 GiB to 32767 GiB. A disk can be resized, and the type (HDD or SSD) may be modified only when the disk is unattached. The largest premium SSD offers IOPS up to 20000.

azure.disks

Azure does not identify the new disk names for you. I have not found a way to retrieve the operating system node name from the Azure cloud. The closest I can get is to display the attachment status.

>>> from azure.common.credentials import get_azure_cli_credentials
>>> from azure.mgmt.compute import ComputeManagementClient
>>> cli = ComputeManagementClient(
... *get_azure_cli_credentials()
... )
>>> cli.disks.get(
... resource_group_name='ora',
... disk_name='asm-0'
... ).disk_state
'Attached'

This seems like a gap to me. You want a reliable method to distinguish new disks from existing disks, and a positive method to identify each disk.

Identify the new disks

Example:

brw-rw----. 1 root disk 8, 32 Aug 23 12:40 /dev/sdc
brw-rw----. 1 root disk 8, 48 Aug 23 12:40 /dev/sdd
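One simple way to distinguish new disks from existing ones is to snapshot the device nodes before and after attachment. A sketch (the helper name is mine, not a standard tool):

```python
from glob import glob

def new_device_nodes(before, after):
    """Return device nodes present after attachment but not before."""
    return sorted(set(after) - set(before))

# Snapshot before attaching, attach the volumes, then snapshot again:
before = glob("/dev/sd*")
# ... attach the new disks from the cloud console or CLI here ...
after = glob("/dev/sd*")
print(new_device_nodes(before, after))
```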

AFD label

Prior to starting the grid install, you need to label the disk or disks that you intend to use for the first ASM disk group. We will use asmcmd afd_label. This is a critical step.

afd_label

To label your disks, sudo to root, and execute

asmcmd afd_label <label> <device> --init

For example:

[root@grid19c ~]# export ORACLE_HOME=/u01/app/19.3.0/grid
[root@grid19c ~]# export ORACLE_BASE=/tmp
[root@grid19c ~]# cd /u01/app/19.3.0/grid/bin
[root@grid19c bin]# ./asmcmd afd_label DATA01_00001 /dev/sdc --init

  • ORACLE_BASE=/tmp avoids creating root-owned files under the grid oracle base.
  • Exercise care to follow this instruction exactly, because an incorrect invocation may not print an error message at all.

separation of duties

If root steps are not done by the DBA, then the DBA should prepare a script for the AFD labeling. For example:

#!/bin/bash

export ORACLE_HOME=/u01/app/19.3.0/grid
export ORACLE_BASE=/tmp
cd /u01/app/19.3.0/grid/bin
./asmcmd afd_label DATA01_00001 /dev/sdc --init

afd_lslbl

If your disk discovery string is the default value ‘/dev/sd*’, as in Azure, you will be able to list your ASM disks with this simple command:

[root@grid19c bin]# ./asmcmd afd_lslbl
Could not open pfile '/etc/oracleafd.conf'--------------------------------------------------------------------------------
Label Duplicate Path
================================================================================
DATA01_00001 /dev/sdc

On AWS, your disk string might be ‘/dev/xvd*’, so this might not work:

[root@ip-172-31-86-22 bin]# ./asmcmd afd_lslbl
Could not open pfile '/etc/oracleafd.conf'No devices to be scanned.

In that case, you need to specify the disk string on the command line:

[root@ip-172-31-86-22 bin]# ./asmcmd afd_lslbl '/dev/xvd*'
--------------------------------------------------------------------------------
Label Duplicate Path
================================================================================
ASM01_00001 /dev/xvdc

After you run root.sh, you do not need to specify the disk discovery string.

[root@ip-172-31-25-179 bin]# ./asmcmd afd_lslbl
--------------------------------------------------------------------------------
Label Duplicate Path
================================================================================
ASM01_00001 /dev/xvdc

After root.sh, afd_lslbl gets its disk string from file /etc/oracleafd.conf:

# cat /etc/oracleafd.conf
afd_diskstring='/dev/xvd*'

afd_unlabel

You can erase the AFD label with:

[root@grid19c bin]# ./asmcmd afd_unlabel /dev/sdc --init
[root@grid19c bin]# ./asmcmd afd_lslbl
Could not open pfile '/etc/oracleafd.conf'No devices to be scanned.

Bugs and limitations

the afd_label --init option

If you leave out the --init option, afd_label might not write a label to your disk, but it also prints no error message and returns a success code.

[root@grid19c bin]# ./asmcmd afd_label DATA01_00001 /dev/sdc
[root@grid19c bin]# echo $?
0
[root@grid19c bin]# ./asmcmd afd_lslbl
Could not open pfile '/etc/oracleafd.conf'No devices to be scanned.

You can use dd to verify that the disk is empty:

[root@grid19c bin]# dd if=/dev/sdc ibs=8192 obs=8192 count=1 | od -X
0000000 00000000 00000000 00000000 00000000
*
0020000
1+0 records in
1+0 records out
8192 bytes (8,2 kB) copied, 0.032763 s, 25.0 MB/s
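
The same all-zeros test can be scripted. This is a sketch, not an Oracle tool: the blank_check helper and the /tmp demo files are invented for illustration.

```shell
#!/bin/bash
# blank_check: report whether the first 8 KiB of a device or file is all
# zero bytes, i.e. carries no label. (Hypothetical helper, for illustration.)
blank_check() {
  local nz
  nz=$(( $(head -c 8192 "$1" | tr -d '\0' | wc -c) ))   # non-zero byte count
  if [ "$nz" -eq 0 ]; then
    echo "$1: blank"
  else
    echo "$1: has data ($nz non-zero bytes in first 8 KiB)"
  fi
}

# Demo against throwaway files instead of real devices:
dd if=/dev/zero of=/tmp/blankdisk bs=8192 count=1 2>/dev/null
printf 'ORCLDISKDATA01_00001' > /tmp/labeleddisk
blank_check /tmp/blankdisk      # -> /tmp/blankdisk: blank
blank_check /tmp/labeleddisk    # -> /tmp/labeleddisk: has data (20 non-zero bytes in first 8 KiB)
```

Running the same check against /dev/sdc before afd_label confirms that the disk really is blank.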

the afd_unlabel --init option

Likewise, if you run afd_unlabel without the --init option, asmcmd may silently fail to unlabel your disk.

[root@grid19c bin]# ./asmcmd afd_label DATA01_00001 /dev/sdc --init
[root@grid19c bin]# ./asmcmd afd_lslbl
Could not open pfile '/etc/oracleafd.conf'--------------------------------------------------------------------------------
Label Duplicate Path
================================================================================
DATA01_00001 /dev/sdc
[root@grid19c bin]# ./asmcmd afd_unlabel /dev/sdc
[root@grid19c bin]# ./asmcmd afd_lslbl
Could not open pfile '/etc/oracleafd.conf'--------------------------------------------------------------------------------
Label Duplicate Path
================================================================================
DATA01_00001 /dev/sdc

You can also use dd to view the label:

[root@grid19c bin]# dd if=/dev/sdc ibs=8192 obs=8192 count=1 | od -X
0000000 00000000 00000000 00000000 1f75bc96
0000020 00000000 00000000 00000000 00000000
0000040 4c43524f 4b534944 41544144 305f3130
0000060 31303030 00000000 00000000 00000000
0000100 00000000 00000000 00000000 00000000
*
0000440 00000000 00000000 0500000a 5d5ee7d3
0000460 00000000 00000000 00000000 00000000
*
1+0 records in
1+0 records out
8192 bytes (8.2 kB) copied*
0020000
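
In the dump above, octal offset 0000040 (decimal 32) holds the magic string ORCLDISK followed by the label name. You can read the label back as text; note that the offset and length here are inferred from the dump, not from any documented on-disk format, and read_afd_label is an invented helper:

```shell
#!/bin/bash
# read_afd_label: print the ASCII label ("ORCLDISK" magic plus disk name)
# found at byte offset 32 of the header. Offsets inferred from the od dump.
read_afd_label() {
  dd if="$1" bs=1 skip=32 count=40 2>/dev/null | tr -d '\0'
  echo
}

# Demo against a throwaway file laid out like the dump above:
{ head -c 32 /dev/zero; printf 'ORCLDISKDATA01_00001'; } > /tmp/fakedisk
read_afd_label /tmp/fakedisk    # -> ORCLDISKDATA01_00001
```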

other utilities

In this context, other afd commands might run, produce no output, do nothing, and exit with a success code.

[root@grid19c bin]# ./asmcmd afd_state
[root@grid19c bin]# echo $?
0
[root@grid19c bin]# ./asmcmd afd_configure
[root@grid19c bin]# echo $?
0

At this stage of the grid installation, only the afd_label command is needed. You can also run afd_unlabel and afd_lslbl. If you experiment with other commands or options, you may end up in a state that is difficult to diagnose.

grid installation directory

Oracle grid can be installed in virtually any directory. As shipped, however, the path "/u01/app/19.0.0/grid" is hard-coded into scripts such as kfod. You cannot run kfod and other such scripts at this early stage. Later in the installation procedure, root.sh replaces such files with scripts that can run out of any installation directory.

[root@grid19c bin]# ./asmcmd
/u01/app/19.3.0/grid/bin/kfod: line 22: /u01/app/19.0.0/grid/bin/kfod.bin: No such file or directory
Use of uninitialized value $result[0] in scalar chomp at /u01/app/19.3.0/grid/lib/asmcmdbase.pm line 5982.
Use of uninitialized value $result[0] in split at /u01/app/19.3.0/grid/lib/asmcmdbase.pm line 5985.
Use of uninitialized value $clus_mode in scalar chomp at /u01/app/19.3.0/grid/lib/asmcmdbase.pm line 5987.
Use of uninitialized value $clus_mode in concatenation (.) or string at /u01/app/19.3.0/grid/lib/asmcmdbase.pm line 5988.
Use of uninitialized value $clus_mode in string eq at /u01/app/19.3.0/grid/lib/asmcmdbase.pm line 5993.
...
ASMCMD> exit
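
A quick way to see which wrapper scripts still embed the hard-coded default path is to grep for it. This is a diagnostic sketch; find_hardcoded and the demo file names are inventions:

```shell
#!/bin/bash
# find_hardcoded: list files under a directory that still contain the
# hard-coded default grid path, which root.sh later rewrites.
find_hardcoded() {
  grep -rl '/u01/app/19.0.0/grid' "$1" 2>/dev/null
}

# Demo with throwaway files standing in for the real bin directory:
mkdir -p /tmp/fakebin
printf '#!/bin/sh\nexec /u01/app/19.0.0/grid/bin/kfod.bin "$@"\n' > /tmp/fakebin/kfod
printf '#!/bin/sh\nexec "$ORACLE_HOME"/bin/kfod.bin "$@"\n' > /tmp/fakebin/kfod.fixed
find_hardcoded /tmp/fakebin     # -> /tmp/fakebin/kfod
```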

conclusion

Most utilities are designed to run with full functionality after gridSetup.sh and root.sh have completed. One notable exception is the small subset of asmcmd afd subcommands, run with very specific options, that must work before the install.

Because of the bugs and limitations in the asmcmd AFD options, it is critical to be aware of them and to get the AFD disk labeling correct the first time.

Grid setup

This section describes the grid setup, with emphasis on the ASM Disk Group screen.

  • start X server (Xming) on Windows
  • On Linux, sudo to grid and cd to the grid home
  • Start gridSetup.sh
  • At the Configuration Options screen, select Standalone (Oracle Restart).
  • Click Next to go to the Create ASM Disk Group screen.

Here is where you are going to create your first ASM disk group. gridSetup.sh scans the operating system for labeled disks. Labeled disks matching “/dev/sd*” will appear in your display.

cr.asm.dg.azure

On some AWS systems, you need to search for "/dev/xvd*". Click "Change Disk Discovery Path". For example:

ch.dsk.dsc.path

cr.asm.dg

You can change the disk group name, allocation unit size, and redundancy. Select the disk or disks that you want in your first disk group. Verify that Configure Oracle ASM Filter Driver is checked. Press Next.

Continuing the grid install

After the Create ASM Disk Group screen, installation can proceed smoothly:

  • Complete the remaining screens
  • When prompted, run root.sh

After root.sh is finished, all utilities in grid $ORACLE_HOME/bin should operate with full functionality.

Summary

Oracle Grid Infrastructure substantially improves manageability. Grid is less complex to install than Oracle RAC, but requires a disk installation step which requires attention to detail. The Oracle 19c grid installation requires configuration of the ASM Filter Driver prior to running gridSetup.sh. You can use cloud and Linux administrative tools to allocate storage and then use asmcmd to label the disks.

Oracle binaries or patch download from Linux command line

By Brian Fitzgerald

Introduction

Our objective is to download Oracle binaries or patches directly to Linux using the bash shell. For binaries, we will log in to Oracle and accept the license agreement in a Windows browser, and then perform the download in Linux. For patches, we will log in to MOS. The technique is useful for cloud or on-prem.

Linux preparations

Complete the following preparations.

Check your Linux version

$ uname -r
3.10.0-1062.9.1.el7.x86_64

wget download from Oracle may fail on Red Hat Enterprise Linux EL5 or lower.

Check your ssl version

$ openssl version
OpenSSL 1.0.2k-fips 26 Jan 2017

wget download from Oracle may fail with OpenSSL version 0.9 or lower.

Install wget

If necessary, install wget

sudo -i -u root
yum -y install wget

cd to your download area:

Redirect cat standard output to a file.

[grid@ora19c ~]$ mkdir download
[grid@ora19c ~]$ cd download/
[grid@ora19c download]$ cat > cook.txt

Let the cat session wait there.

Windows preparations

browser

Chrome is used in this example.

cookie plugin

  • Install the EditThisCookie plugin.
  • Right click EditThisCookie and navigate to Options
  • For export format, select “Netscape HTTP Cookie File”

cook

other browsers

Export methods exist for other browsers. For example, in Firefox, you may use "Export Cookies" by Rotem Dan. Also, you may try your browser's built-in inspection tools.

binaries download page

  • Login to oracle.com with your ID and password.
  • Navigate to the download page. For example:

https://www.oracle.com/database/technologies/oracle19c-linux-downloads.html

Export cookies

Complete the cookie export and start the download within 30 minutes:

  • From Chrome click the download icon.
  • When prompted, accept the license agreement.
  • Click the large download button.

accept

  • Interrupt the download.
  • Click the EditThisCookie icon
  • In the EditThisCookie, clear the search box,
  • Enter “oracle.com”
  • Using the scrollbar, check that these cookies are present:
domain name
.oracle.com oraclelicense
edelivery.oracle.com OAMAuthnCookie_edelivery.oracle.com:443

export   oamauth

  • If OAMAuthnCookie_edelivery.oracle.com:443 is missing, it could be because:
    • you did not start the download
    • you did not search oracle.com
  • Click the export button. Note the confirmation message:

exported

  • Paste the cookies into your Linux cat session.
  • Type ctrl-d (end of file) into the cat session.
  • Check Linux cookies file cook.txt with egrep:
[grid@ora19c download]$ egrep 'oraclelicense|OAMAuthnCookie_edelivery' cook.txt
.oracle.com     TRUE    /       FALSE   1566398899      oraclelicense   152015
edelivery.oracle.com    FALSE   /       TRUE    0       OAMAuthnCookie_edelivery.oracle.com:443 0ba9acafe65e911747b1cc43daf2bb6269988c0b%7EbOV%2F2lmNMEtBQ0AQibaanasshtknkgx1dWreuPjqkvXEFXKl3n3wj1zmKsG1lkM56uv%2F2cWPSSrH3HRWYEsWaDh2eGTSS8fizlEZF01NQdIF6hEG8rp5qkdvQv3twKXCO%2BpAgxiixhlxb61xlMvNDiLiv8JpDY97yvTSf1QyDxuY7Fjs03Qb0Zaq6i88NPsqkq0gu1C8fEbwEZbgW55YAUVQQdKfOpN4yR2iXenEJfiiKmHTCISZRvPNzgPf0JxPK3fVnvLsAO9HEFBsu%2Fwx0uUsYqe%2FHPJJwkFjBLudqkD8bO84t7HTCED80o%2BrxzcOJ%2BE%2BuvlaqItiidQ7asFuHQanPUS10CeTEXXWzK3h4Lni4Gp29wKtnQ6ADJfPET%2FCnKehcnEsB9lYeQliSvdhWkRD9v3Azr5SYkhEdsDCF4z%2FWXP1dOT6gB9oQVn1352FcJbnA8Zo6ulnXPhKdwJE21SiskmOAelR0GVvA9M3DNU%2FokD1s2Msy293j9goob%2Fpj0IvtsTyuxOdRc1p4gV1XyZbulYn8q1q6Fnu7SpEq7hxiKh6TjY1737uh2w%2FTO3G%2F1QviFlrG%2BK1Ts76KNp45lYwefYA7U29v9W4H4gJ4Cs%3D
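
The egrep check above can be wrapped in a small sanity script that fails fast before you start wget. This is a sketch; check_cookies is an invented helper:

```shell
#!/bin/bash
# check_cookies: fail fast if the exported cookie file lacks either cookie
# needed for the binaries download.
check_cookies() {
  local f=$1 rc=0
  grep -q 'oraclelicense' "$f" || { echo "missing oraclelicense"; rc=1; }
  grep -q 'OAMAuthnCookie_edelivery' "$f" || { echo "missing OAMAuthnCookie"; rc=1; }
  return $rc
}

# Demo with a minimal stand-in cookie file:
printf 'oraclelicense\nOAMAuthnCookie_edelivery.oracle.com:443\n' > /tmp/demo-cook.txt
check_cookies /tmp/demo-cook.txt && echo "cookies look complete"
```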

Download

In Linux, start the download with wget:

[grid@ora19c download]$ wget --load-cookies=cook.txt 'https://download.oracle.com/otn/linux/oracle19c/190000/LINUX.X64_193000_grid_home.zip'

Downloading patches from MOS

The procedure for downloading patches from My Oracle Support is simpler because you don’t have to acknowledge the license agreement and then start and interrupt the download. However, MOS download links contain the “&” and “?” characters, so you need to quote the URL.

To download a patch from MOS:

  • Login to MOS
  • open EditThisCookie
  • clear the URL window and enter “oracle.com”
  • click the export button
  • In your shell, issue cat > /tmp/cook.txt
  • Paste the exported cookies into the cat session
  • Type ctrl-D
  • navigate to the patch download link. For example:

dl.patch

  • hover your mouse over the download button and right-click
  • select “Copy link address”
  • paste the link into your shell between quotes, such as in this example:
wget --load-cookies=/tmp/cook.txt ''
  • after paste:
 wget --load-cookies /tmp/cook.txt 'https://updates.oracle.com/Orion/Services/download/p30501926_180000_Linux-x86-64.zip?aru=23242022&patch_file=p30501926_180000_Linux-x86-64.zip'
  • You can specify the output file using the “-O” option, e.g.
-O p30501926_180000_Linux-x86-64.zip
  • Putting it all together, you get:
wget --load-cookies=/tmp/cook.txt 'https://updates.oracle.com/Orion/Services/download/p30501926_180000_Linux-x86-64.zip?aru=23242022&patch_file=p30501926_180000_Linux-x86-64.zip' -O p30501926_180000_Linux-x86-64.zip
  • Press enter to start the download

Cookies file check

For MOS downloads, there is only one required cookie, namely Oracle_updates_auth:

grep Oracle_updates_auth  /tmp/cook.txt
.updates.oracle.com     TRUE    /       FALSE   1582699648      Oracle_updates_auth     ZDA1YzU4OThkODUyM2Y3NjgzZWUxYmNhNGQzMDFjN2UtNTM2MTZjNzQ2NTY0NWY1ZmY5ZDVkMDZmZGZkMWQyMzBkY2FmZGI5NDdlOWE3NWVkMWY1NWVjYjYzZmE4ZDc3ZjAwN2JkYjMzNzViYjAyMmZkZjUzNmRkYWE3OWRiYzg4NDIzY2JlY2JkMzcwMjk5OWQyZDNlMWYzYTAzZTFhYjU

Special characters in the URL

You should quote the URL. Otherwise:

  • If ‘&’ appears in the URL, the shell will interpret the string to the right of the ‘&’ as a separate command.
  • If ‘?’ appears in the URL, the shell may interpret it as a metacharacter.

An example of this problem while downloading a Release Update from MOS:

wget --load-cookies /tmp/cook.txt https://updates.oracle.com/Orion/Services/download/p30501926_180000_Linux-x86-64.zip?aru=23242022&patch_file=p30501926_180000_Linux-x86-64.zip -O p30501926_180000_Linux-x86-64.zip
[1] 13367
--2020-02-25 14:15:12--  https://updates.oracle.com/Orion/Services/download/p30501926_180000_Linux-x86-64.zip?aru=23242022
bash: -O: command not found...
Resolving updates.oracle.com (updates.oracle.com)... oracle:/ora_local/download/oracle/PSU_PATCHES
=>141.146.44.51
Connecting to updates.oracle.com (updates.oracle.com)|141.146.44.51|:443... connected.
HTTP request sent, awaiting response... 302 Found
Location: https://aru-akam.oracle.com/adcarurepos/vol/patch14/PLATFORM/CORE/Linux-x86-64/R600000000071008/p30501926_180000_Linux-x86-64.zip?FilePath=/adcarurepos/vol/patch14/PLATFORM/CORE/Linux-x86-64/R600000000071008/p30501926_180000_Linux-x86-64.zip&File=p30501926_180000_Linux-x86-64.zip&params=UGw0Nnppc2RUZnBGTkx2aUEraURsUTphcnU9MjMyNDIwMjImZW1haWw9YmYyMTkwQG55dS5lZHUmZmlsZV9pZD0xMDY0NzI4MTYmcGF0Y2hfZmlsZT1wMzA1MDE5MjZfMTgwMDAwX0xpbnV4LXg4Ni02NC56aXAmdXNlcmlkPW8tYmYyMTkwQG55dS5lZHUmc2l6ZT0xMTQ4Mjg4MjMmY29udGV4dD1BQDEwK0hAYWFydXZtdHAwNy5vcmFjbGUuY29tK1BAJmRvd25sb2FkX2lkPTQxMjI3Mzc2Mw@@&AuthParam=1582658233_7e9698372721c6852f3ca053d3a8e443 [following]
--2020-02-25 14:15:13--  https://aru-akam.oracle.com/adcarurepos/vol/patch14/PLATFORM/CORE/Linux-x86-64/R600000000071008/p30501926_180000_Linux-x86-64.zip?FilePath=/adcarurepos/vol/patch14/PLATFORM/CORE/Linux-x86-64/R600000000071008/p30501926_180000_Linux-x86-64.zip&File=p30501926_180000_Linux-x86-64.zip&params=UGw0Nnppc2RUZnBGTkx2aUEraURsUTphcnU9MjMyNDIwMjImZW1haWw9YmYyMTkwQG55dS5lZHUmZmlsZV9pZD0xMDY0NzI4MTYmcGF0Y2hfZmlsZT1wMzA1MDE5MjZfMTgwMDAwX0xpbnV4LXg4Ni02NC56aXAmdXNlcmlkPW8tYmYyMTkwQG55dS5lZHUmc2l6ZT0xMTQ4Mjg4MjMmY29udGV4dD1BQDEwK0hAYWFydXZtdHAwNy5vcmFjbGUuY29tK1BAJmRvd25sb2FkX2lkPTQxMjI3Mzc2Mw@@&AuthParam=1582658233_7e9698372721c6852f3ca053d3a8e443
Resolving aru-akam.oracle.com (aru-akam.oracle.com)... 104.124.100.114
Connecting to aru-akam.oracle.com (aru-akam.oracle.com)|104.124.100.114|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 114828823 (110M) [application/zip]
Saving to: ‘p30501926_180000_Linux-x86-64.zip?aru=23242022’

100%[=========================================================================================================================================================================>] 114,828,823 42.1MB/s   in 2.6s

2020-02-25 14:15:15 (42.1 MB/s) - ‘p30501926_180000_Linux-x86-64.zip?aru=23242022’ saved [114828823/114828823]

Notice that:

  • the shell interpreted
patch_file=p30501926_180000_Linux-x86-64.zip -O

as a command and printed this error:

bash: -O: command not found...
  • The file got saved as ‘p30501926_180000_Linux-x86-64.zip?aru=23242022’

Solution:
Quote the URL.

 wget --load-cookies /tmp/cook.txt 'https://updates.oracle.com/Orion/Services/download/p30501926_180000_Linux-x86-64.zip?aru=23242022&patch_file=p30501926_180000_Linux-x86-64.zip'

Special characters in the output file name

You may find that wget saves the file to an unexpected name. For example:

ls
p30501926_180000_Linux-x86-64.zip?aru=23242022&patch_file=p30501926_180000_Linux-x86-64.zip

Solution:

Specify the output file using “-O”, as previously explained.

Summary

Using Chrome, you can log in to Oracle and accept the license agreement. Next, using EditThisCookie, you can export the needed cookies. Finally, using wget and the --load-cookies option, you can start your download.

The possible benefit is that the download will run faster on the Linux server than on your Windows host. Also, direct download saves you an additional transfer step from Windows.

X windows for Oracle DBAs

By Brian Fitzgerald

Introduction

This is for DBAs who want to use Oracle X windows-based tools such as OUI or DBCA. Techniques described here can lead to a smooth installation experience. Beyond just a cookbook, this blog article also demonstrates the principles behind X forwarding. Red Hat Linux on AWS EC2 is presented here as an example platform.

In a hurry? Skip to the tl;dr at the end.

X-windows

The X windows system consists of a server, a display, and clients. The X server accepts requests for graphical output from the clients and renders those requests on the display.  In older usage, the server and the client processes usually ran on the same host, a desktop Unix system. In the present context, the X server runs on your Windows PC and displays to your screen. The X server usually listens on TCP port 6000.

In X-windows lingo, applications and utilities are called clients. In the present context, clients run on the Oracle host system, an instance of AWS EC2. Examples of clients include xterm and Oracle Universal Installer (OUI), which run on separate server hardware. Our objective, therefore, is to run an Oracle installation tool, such as OUI, on the Linux host and have it connect back to the X server on your PC so you can view and control the graphical user interface (GUI). Paradoxically, the X server does not run on "server" hardware, but on your Windows PC. This table should clear up any confusion.

Description X Server X Client
Hardware PC server
Operating system Windows Linux
Has display monitor yes no
X server port (default 6000) binds, listens, accepts connections connects to
Software Xming xterm, dbca, etc.

The client application connects to the X server via sshd on Linux and via PuTTY on your PC.

AWS EC2

Red Hat Linux on Amazon EC2 is presented as an example platform. AWS EC2 instances are built from an available Amazon Machine Image (AMI). For an Oracle Database installation, be sure to choose an AMI with a supported processor type, such as Intel x86_64 (not ARM), and a supported operating system, such as Red Hat (not Ubuntu or Amazon Linux).

The final step in EC2 launch is to select an existing key pair or create and download a new key pair. A key pair file from AWS has extension .pem.

Linux packages

Linux package installs are required before you can run X clients. In addition, it is helpful to install basic troubleshooting tools.

After you launch a new EC2 instance, run yum update.

# yum -y update

Running yum update may get you a few new tools, such as lsipc and lsmem.

Depending on the AMI, you may find that basic Linux tools are missing. Installing those tools makes troubleshooting easier. For example:

# yum -y install nc lsof unzip strace traceroute

You want to set up a working X windows environment, but before you do, notice three things from the login shell:

[ec2-user@ip-172-31-89-75 ~]$ echo $DISPLAY

[ec2-user@ip-172-31-89-75 ~]$ netstat -ntl | grep :60..
  1. The DISPLAY environment variable is not set
  2. There is no listener in the 6000 range
  3. No file .Xauthority is present

Now install the X windows packages.

# yum -y install motif xorg-x11-utils xterm xauth

xterm here is for testing X windows.

In /etc/ssh/sshd_config, check:

X11Forwarding yes
X11UseLocalhost no

If necessary, edit /etc/ssh/sshd_config and then issue:

service sshd restart

After installing xauth, log in using a new PuTTY session. On the first login, the following output will appear:

/usr/bin/xauth: file /home/ec2-user/.Xauthority does not exist

Now recheck:

[ec2-user@ip-172-31-89-75 ~]$ echo $DISPLAY
localhost:10.0
[ec2-user@ip-172-31-89-75 ~]$ netstat -ntl | grep :60..
tcp        0      0 127.0.0.1:6010          0.0.0.0:*               LISTEN
tcp6       0      0 ::1:6010                :::*                    LISTEN
  1. The DISPLAY environment variable is set
  2. There is a new listener in the 6000 range.
  3. A new .Xauthority file exists

The “10” in DISPLAY refers to the offset from 6000. When you set your DISPLAY to localhost:10.0, all X clients will connect to port 6010 by default.
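
The mapping from DISPLAY to TCP port can be computed with shell parameter expansion, as in this small sketch:

```shell
# DISPLAY=localhost:10.0 means X clients connect to TCP port 6000 + 10.
DISPLAY=localhost:10.0
disp=${DISPLAY#*:}        # strip the hostname      -> "10.0"
offset=${disp%%.*}        # strip the screen number -> "10"
port=$(( 6000 + offset ))
echo "$port"              # -> 6010
```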

Notice which process is listening:

[root@ip-172-31-88-44 ~]# lsof -Pi :6010
COMMAND  PID     USER   FD   TYPE DEVICE SIZE/OFF NODE NAME
sshd    1351 ec2-user    9u  IPv6  20649      0t0  TCP localhost:6010 (LISTEN)
sshd    1351 ec2-user   10u  IPv4  20650      0t0  TCP localhost:6010 (LISTEN)

The sshd process is listening on port 6010. You are going to run X-client application on the Linux box. The client will connect to port 6010. The sshd process will forward the X-requests to your PC.

X server

There are several MS Windows-based X servers available, Xming, for example. By default, the X server listens on port 6000. Confirmation:

C:\>netstat -na | findstr :6000 | findstr LISTEN
TCP 0.0.0.0:6000 0.0.0.0:0 LISTENING

X server issues

Can’t load font

A message such as this appears:

xterm: cannot load font -misc-fixed-medium-r-semicondensed--13-120-75-75-c-60-iso10646-1

Solution:

If you are using Xming, then install Xming fonts. For example: Xming-fonts-7-7-0-10-setup.exe

xdpyinfo

Behind the scenes, OUI runs xdpyinfo. If xdpyinfo is missing, this message will appear:

ERROR: Unable to verify the graphical display setup. This application requires X display. Make sure that xdpyinfo exist under PATH variable.

To get xdpyinfo, install package xorg-x11-utils. An example of normal OUI output is:

Checking monitor: must be configured to display at least 256 colors.   Actual 16777216    Passed

Warning: No xauth data; using fake authentication data for X11 forwarding.

Symptom:

You run ssh -X and observe a warning message such as:

brian@PQHKEA * xauth $ ssh -X -i ~/.ssh/acme-key.pem ec2-user@3.84.232.118
Warning: No xauth data; using fake authentication data for X11 forwarding.
Last login: Sat Nov 7 18:31:30 2020 from pool-98-113-87-97.nycmny.fios.verizon.net

The message is from the local ssh client, not the remote host. You can confirm this by referring to package https://github.com/openssh/openssh-portable, file clientloop.c.

Fix:

xauth add :${DISPLAY#*:} . `mcookie`

Explanation:

:${DISPLAY#*:} gives you the DISPLAY value with the hostname stripped off, for example :10.0.

mcookie generates a 128-bit random hexadecimal number for use with the X authority system.

xterm issue

message:

Warning: locale not supported by C library, locale unchanged

Solution:

export LC_ALL=C

X libraries

The manual may instruct you to install certain X windows runtime libraries, or OUI may so instruct you. The library list depends on the Oracle product and the version. Example:

yum -y install libX11 libXau libXi libXtst libXrender

If you have installed xterm, then you already have these dependent libraries.

mwm

mwm is the Motif Window Manager. It lets you move and resize windows. Without a window manager, you may encounter a dialog box that is waiting for input but is inaccessible because it is behind another window. Here is the X root window after dragging three windows to non-overlapping locations.

mwm

The shell commands leading up to this were:

xterm & disown
mwm & disown
unzip -q ~/download/LINUX.X64_193000_db_home.zip
./runInstaller

Network

The X windows display technique described here assumes that you can connect via the secure shell (ssh) from your PC directly to the Oracle server host. The AWS EC2 instance has a Public IP address that is reachable from your PC.

The X windows protocol is very "chatty", meaning that clients make a high number of small server requests. Over a high-latency link, that chattiness makes every button press on the GUI respond slowly. Having a fast network connection is essential. One might give some thought to enabling SSH compression; however, SSH compression does not mitigate the chattiness.

putty.compression

Enabling SSH compression will usually not result in a noticeable improvement in responsiveness.

Running an X windows client across sudo

In a prior section, xterm worked fine from the login shell (ec2-user). In practice, you will sudo to the oracle account to execute runInstaller or dbca. You may also sudo to the grid account to run gridSetup.sh. First, test xterm from the login account:

[ec2-user@ip-172-31-82-194 ~]$ echo $DISPLAY
localhost:11.0
[ec2-user@ip-172-31-82-194 ~]$ xterm
(it works)

Notice that the offset in the DISPLAY variable here is 11. The offset is not always 10; it depends on what port is available for X forwarding when you log in.

[ec2-user@ip-172-31-82-194 ~]$ netstat -ntl | grep :60..
tcp        0      0 127.0.0.1:6010          0.0.0.0:*               LISTEN
tcp        0      0 127.0.0.1:6011          0.0.0.0:*               LISTEN
tcp6       0      0 ::1:6010                :::*                    LISTEN
tcp6       0      0 ::1:6011                :::*                    LISTEN

In this example, port 6010 was already in use when I logged in, so sshd started listening on port 6011 and set the offset to 11.

If you try to run an X client across a sudo, you will get this error:

[ec2-user@ip-172-31-82-194 ~]$ sudo -i -u oracle
[oracle@ip-172-31-82-194 ~]$ xterm
xterm: Xt error: Can't open display:
xterm: DISPLAY is not set
[oracle@ip-172-31-82-194 ~]$ echo $DISPLAY

The DISPLAY environment setting did not carry over to the sudo shell. You can try setting DISPLAY, but you will get a different error:

[oracle@ip-172-31-82-194 ~]$ export DISPLAY=localhost:11.0
[oracle@ip-172-31-82-194 ~]$ xterm
PuTTY X11 proxy: wrong authorisation protocol attemptedError: Can't open display: localhost:11.0

To fix the authorization error, copy .Xauthority from the login account to the oracle account:

[ec2-user@ip-172-31-82-194 ~]$ sudo cp -up ~/.Xauthority ~oracle/.Xauthority
[ec2-user@ip-172-31-82-194 ~]$ sudo chown oracle:` id -gn oracle ` ~oracle/.Xauthority
[ec2-user@ip-172-31-82-194 ~]$ sudo -i -u oracle
[oracle@ip-172-31-82-194 ~]$ xterm

To automate the copying, I added this code to the login user .bash_profile:

if [ -z "$SUDO_USER" -a -f ~/.Xauthority ] ; then
    for u in oracle grid ; do
        if id $u > /dev/null 2>&1 ; then
            eval hd=~$u
            sudo cp -up ~/.Xauthority $hd/.Xauthority
            sudo chown $u:$(id -gn $u) $hd/.Xauthority
        fi
    done
fi

To automate setting DISPLAY, I added this code to the oracle .bash_profile:

if [ -z "$DISPLAY" ] ; then
    for port in {6010..6019} ; do
        if netstat -ntl | grep -q :$port ; then
            offset=$(( port - 6000 ))
            export DISPLAY=localhost:$offset.0
            break
        fi
    done
fi

To summarize, before sudo, copy .Xauthority from the login account to oracle. After sudo, set the DISPLAY environment variable. Automate those settings in .bash_profile.

puttygen

To connect PuTTY from Windows to AWS EC2, you need to generate a PuTTY Private Key (PPK) file. Start with the PEM (Privacy Enhanced Mail) file that you downloaded when you created your EC2 instance. Start PuTTY Key Generator (puttygen), load the PEM file, click Save private key, and provide a name. Example:

us-east-1-key.pem
us-east-1-key.ppk

puttygen

X forwarding with PuTTY

To establish a session with PuTTY, navigate the PuTTY screens and make these settings.

Screen Item Example
Session Host Name (or IP address) 3.86.243.42
Connection->Data Auto-login username ec2-user
Connection->SSH->Auth Private key file for authentication C:\Users\Brian Fitzgerald\.aws\us-east-1-key.ppk
Connection->SSH->X11 Enable X11 forwarding checked
Connection->SSH->X11 X display location localhost:0.0

putty.host

You may save your settings for future reuse.

putty.user

ppk.png

The private key file is the ppk file that you converted from the pem file.

putty.x11

X display location refers to the X server on your PC, and is usually localhost:0.0.

A simple command line example

Instead of using PuTTY, you could use ssh. This example uses bash on Windows Subsystem for Linux. The hypothetical hosts are lcl and rem.

Generate your key

lcl $ ssh-keygen
Generating public/private rsa key pair.
Enter file in which to save the key (/home/bf/.ssh/id_rsa):
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /home/bf/.ssh/id_rsa.
Your public key has been saved in /home/bf/.ssh/id_rsa.pub.
The key fingerprint is:
SHA256:Jb9cfmEZI8r1FNe611gM0dBLYInq+/CXs35NPpSlB88 bf@lcl
The key's randomart image is:
+---[RSA 2048]----+
|            .o*=o|
|           ....++|
|        . o o =+.|
|         * o +o=+|
|        S + . =O+|
|         o + .++E|
|         .+ . +=.|
|         .o  = oo|
|          .ooo+ .|
+----[SHA256]-----+

Copy your key to remote

lcl $ ssh-copy-id rem
/usr/bin/ssh-copy-id: INFO: Source of key(s) to be installed: "/home/bf/.ssh/id_rsa.pub"
The authenticity of host 'rem (10.130.33.179)' can't be established.
ECDSA key fingerprint is SHA256:Gr8YlEJrgTQdsNMGXOZoU+6ugg4TPyX5B9lCLTE0g/8.
Are you sure you want to continue connecting (yes/no)? yes
/usr/bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed
/usr/bin/ssh-copy-id: INFO: 1 key(s) remain to be installed -- if you are prompted now it is to install the new keys


bf@rem's password:

Number of key(s) added: 1

Now try logging into the machine, with:   "ssh 'rem'"
and check to make sure that only the key(s) you wanted were added.

Set your DISPLAY

Startup Xming, then run:

lcl $ export DISPLAY=127.0.0.1:0.0

ssh using the -X flag

lcl $ ssh -X rem
/usr/bin/xauth:  file /home/bf/.Xauthority does not exist
rem $

test

rem $ xdpyinfo | wc
    287    1136    9544

permit your .Xauthority file

rem $ chmod 750 ~
rem $ chmod 640 ~/.Xauthority

sudo to oracle

rem $ sudo su - oracle
[sudo] password for bf:

copy .Xauthority

oracle@rem $ cp -p ~bf/.Xauthority .Xauthority

set your DISPLAY

oracle@rem $ export DISPLAY=10.130.33.179:10.0

test

oracle@rem $ xdpyinfo | wc
    287    1136    9544

Oracle Universal Installer

oracle@rem $ cd /u01/download/oracle/em13400/
oracle@rem $ ./em13400_linux64.bin -J-Djava.io.tmpdir=/u01/tmp

oui.em

Notes

Display

rem $ echo $DISPLAY
10.130.33.179:10.0

Notice :10.0, not :0.0

Ports

rem $ netstat -nat | grep :60
tcp        0      0 0.0.0.0:6010            0.0.0.0:*               LISTEN
tcp6       0      0 :::6010                 :::*

Notice 6010, not 6000

File

rem $ ls -l .Xauthority
-rw-------. 1 bf dba 100 Jul  6 17:41 .Xauthority

X forwarding via bastion host

You can use ssh from the Windows command line to set up X forwarding via a tunnel at the bastion. Identify the following items:

item example
bastion key file us-east-1-key.pem
arbitrary Windows local port 3333
oracle box private IP address 172.31.95.14
ssh port 22
bastion host user brian
bastion public IP address 52.90.61.241

and use them as follows:

ssh -i keyfile -C -N -L localport:oraclebox:22 bastionuser@bastionhost

For example:

C:\>ssh -i us-east-1-key.pem -N -L 3333:172.31.95.14:22 ec2-user@52.90.61.241
  • The bastion key file is the pem file that you downloaded when you created the EC2 instance
  • The local port is a free port on your PC that you will choose arbitrarily.
  • The oracle box is the private IP address of the oracle box that you want to ssh to.
  • Bastion user is your login on the bastion host, e.g. ec2-user.
  • bastionhost is the public IP address of your bastion host.

Next, in a second cmd window:

set DISPLAY=localhost:0.0
ssh -i us-east-1-key.pem -Y -p 3333 ec2-user@localhost
Warning: No xauth data; using fake authentication data for X11 forwarding.
Last login: Tue Nov 19 02:50:02 2019 from ip-172-31-88-110.ec2.internal
Last login: Tue Nov 19 02:50:02 2019 from ip-172-31-88-110.ec2.internal
ec2-user@ip-172-31-95-14 ~$ xterm

The requirements for the bastion host are different from the oracle box.

  • Establishing ssh trust from the bastion to the oracle box is not required.
  • The tunneling session is not responsible for X forwarding. Option -X is not required.
  • Disabling X11Forwarding at the bastion does not disable X forwarding as described here, because the X11 forwarding happens at the oracle host and in your local ssh client.
  • The bastion login user does not have to be the same as the oracle box login user.
  • The bastion host does not have to be the same hardware or operating system architecture as the oracle box. Amazon Linux works.
  • In some bastion EC2 systems, no configuration is required. ssh tunneling works out of the box.
  • If the bastion /etc/ssh/sshd_config has “AllowTcpForwarding no”, then ssh tunneling is disabled.

You do not need to log on to the bastion with an interactive shell. After the tunneling session is established, you will not interact with it.

X windows without X11 forwarding

If you have a connection from Linux to the X11 port, usually 6000, on your PC, then you can attempt X windows without X11 forwarding. You are going to set DISPLAY=pc.dnsname.or.ipaddress:0.0, where display 0 corresponds to TCP port 6000.

Verify port 6000 with:

C:\>netstat -na | findstr LISTEN | findstr :6000
TCP 0.0.0.0:6000 0.0.0.0:0 LISTENING

“0.0.0.0:6000” refers to listening on port 6000 on all IP addresses on all network interfaces. If you see

 TCP 127.0.0.1:6000 0.0.0.0:0 LISTENING

then your X server will not accept connections from outside your PC.

You can find your IP address with:

C:\>ipconfig | findstr Address
Link-local IPv6 Address . . . . . : fe80::f9f0:5830:cd8e:7502%24
IPv4 Address. . . . . . . . . . . : 192.168.1.7
Link-local IPv6 Address . . . . . : fe80::85d7:7339:ff49:e5e6%51
IPv4 Address. . . . . . . . . . . : 192.168.74.113

The addresses shown in this example are private IP addresses. If the oracle box and the PC are not on the same private network, you cannot reach the PC at this address.

To get the NATted IP address you can use netstat:

[ec2-user@ip-172-31-86-22 ~]$ netstat -nat | grep :22 | grep ESTABLISHED
tcp 0 64 172.31.86.22:22 74.101.107.146:49459 ESTABLISHED
tcp 0 0 172.31.86.22:22 74.101.107.146:54036 ESTABLISHED

or last:

[ec2-user@ip-172-31-86-22 ~]$ last -1ai ec2-user
ec2-user pts/1 Tue Aug 27 09:05 still logged in 74.101.107.146

The PC IP address is, therefore, 74.101.107.146. Instead of the IP address, you may refer to the DNS name. The DNS name appears when you log in.

Using username "ec2-user".
Authenticating with public key "imported-openssh-key"
Last login: Mon Aug 26 15:57:51 2019 from pool-74-101-107-146.nycmny.fios.verizon.net

or from last:

[ec2-user@ip-172-31-86-22 ~]$ last -1a ec2-user
ec2-user pts/1        Tue Aug 27 09:05   still logged in    pool-74-101-107-146.nycmny.fios.verizon.net

Before attempting runInstaller, test connectivity using nc:

[ec2-user@ip-172-31-86-22 ~]$ time nc -v 74.101.107.146 6000
Ncat: Version 7.50 ( https://nmap.org/ncat )
Ncat: Connection timed out.

real    0m10.029s
user    0m0.012s
sys     0m0.004s

By default, nc times out in 10 seconds.

If the nc connection times out, then an X-windows client will also time out.

[ec2-user@ip-172-31-86-22 ~]$ export DISPLAY=74.101.107.146:0.0
[ec2-user@ip-172-31-86-22 ~]$ time xterm
xterm: Xt error: Can't open display: 74.101.107.146:0.0

real    2m7.366s
user    0m0.021s
sys     0m0.016s

The TCP timeout depends on tcp_syn_retries:

[root@ip-172-31-86-22 ~]# cat /proc/sys/net/ipv4/tcp_syn_retries
6

If tcp_syn_retries = 6, then the TCP timeout works out to 127 seconds: the initial SYN waits 1 second, and each of the 6 retransmissions doubles the wait, so the total is 1 + 2 + 4 + 8 + 16 + 32 + 64 = 127 seconds.
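The retransmission arithmetic can be sketched in a few lines of Python (syn_timeout_seconds is a hypothetical helper written for this article, not a kernel API):

```python
# Sketch of the SYN retransmission schedule: the initial SYN waits
# 1 second, and each retry doubles the wait.  With tcp_syn_retries = 6
# the waits are 1 + 2 + 4 + 8 + 16 + 32 + 64 seconds.
def syn_timeout_seconds(tcp_syn_retries: int, initial_wait: int = 1) -> int:
    return sum(initial_wait * 2 ** i for i in range(tcp_syn_retries + 1))

print(syn_timeout_seconds(6))  # 127
```

This matches the roughly 127-second xterm timeout observed above.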

You can monitor the connection in a separate window.

[ec2-user@ip-172-31-86-22 ~]$ netstat -nat | grep :6000
tcp        0      1 172.31.86.22:36310      74.101.107.146:6000     SYN_SENT

SYN_SENT lasting for a long time means that the connection is about to time out. runInstaller will take more than twice as long as xterm does to time out.

[oracle@ip-172-31-86-22 dbhome_1]$ export DISPLAY=74.101.107.146:0.0
[oracle@ip-172-31-86-22 dbhome_1]$ time ./runInstaller
ERROR: Unable to verify the graphical display setup. This application requires X display. Make sure that xdpyinfo exist under PATH variable.

Can't connect to X11 window server using '74.101.107.146:0.0' as the value of the DISPLAY variable.

real    4m17.524s
user    0m2.835s
sys     0m0.303s

First, xdpyinfo times out:

[ec2-user@ip-172-31-86-22 ~]$ ps -t pts/1
  PID TTY          TIME CMD
 3324 pts/1    00:00:00 bash
11279 pts/1    00:00:00 sudo
11281 pts/1    00:00:00 bash
11311 pts/1    00:00:00 runInstaller
11315 pts/1    00:00:00 xdpyinfo

Then OUI (java) times out:

[ec2-user@ip-172-31-86-22 ~]$ ps -t pts/1
  PID TTY          TIME CMD
 3324 pts/1    00:00:00 bash
11279 pts/1    00:00:00 sudo
11281 pts/1    00:00:00 bash
11311 pts/1    00:00:00 runInstaller
11436 pts/1    00:00:00 perl
11455 pts/1    00:00:01 java

To save time, test your connection using nc, before attempting runInstaller.
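If nc is not installed, a few lines of Python make the same kind of probe. This is a minimal sketch; tcp_probe is a hypothetical helper, and the commented-out host and port are the example values from above:

```python
import socket

# Minimal stand-in for "nc -v host 6000": attempt a TCP connection and
# report success or failure instead of letting an X client hang.
def tcp_probe(host: str, port: int, timeout: float = 10.0) -> bool:
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Example: probe the X server port before exporting DISPLAY.
# if not tcp_probe("74.101.107.146", 6000):
#     print("X server unreachable; runInstaller would also time out")
```

Like nc, the probe fails fast on a refused connection and waits up to the timeout on a filtered one.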

dbca issues

dbca NoClassDefFoundError: sun.awt.X11.XToolkit

The DISPLAY environment variable is set, but this message appears:

[oracle@ip-172-31-88-246 ~]$ dbca
Exception in thread "main" java.lang.NoClassDefFoundError: Could not initialize class sun.awt.X11.XToolkit
        at java.lang.Class.forName0(Native Method)
        at java.lang.Class.forName(Class.java:264)
        at java.awt.Toolkit$2.run(Toolkit.java:860)
        at java.awt.Toolkit$2.run(Toolkit.java:855)
        at java.security.AccessController.doPrivileged(Native Method)
        at java.awt.Toolkit.getDefaultToolkit(Toolkit.java:854)
        at com.jgoodies.looks.LookUtils.isLowResolution(LookUtils.java:484)
        at com.jgoodies.looks.LookUtils.<clinit>(LookUtils.java:249)
        at com.jgoodies.looks.plastic.PlasticLookAndFeel.<clinit>(PlasticLookAndFeel.java:135)
        at java.lang.Class.forName0(Native Method)
        at java.lang.Class.forName(Class.java:348)
        at javax.swing.SwingUtilities.loadSystemClass(SwingUtilities.java:1874)
        at javax.swing.UIManager.setLookAndFeel(UIManager.java:582)
        at oracle.install.commons.util.Application.startup(Application.java:940)
        at oracle.install.commons.flow.FlowApplication.startup(FlowApplication.java:181)
        at oracle.install.commons.flow.FlowApplication.startup(FlowApplication.java:198)
        at oracle.assistants.common.base.driver.AssistantApplication.startup(AssistantApplication.java:328)
        at oracle.assistants.dbca.driver.DBConfigurator.startup(DBConfigurator.java:378)
        at oracle.assistants.dbca.driver.DBConfigurator.main(DBConfigurator.java:513)

Cause: The X server is down or not reachable. Troubleshoot as you would an xterm issue, such as:

[oracle@ip-172-31-88-246 ~]$ xterm
xterm: Xt error: Can't open display: localhost:10.0

Client startup step by step

If you follow these steps one at a time and resolve issues as you go, it is less likely that you will encounter difficulty when you try to run an Oracle tool:

  • In PuTTY, configure X11 forwarding.
  • Connect PuTTY and log in.
  • Check the DISPLAY environment variable.
  • Test xterm.
  • Test xdpyinfo:
xdpyinfo > /dev/null
  • Copy .Xauthority to the oracle account.
  • sudo to oracle.
  • Set the DISPLAY environment variable.
  • Test xterm again.
  • Start mwm.
  • Finally, start your Oracle tool, e.g.:
./runInstaller

Summary

It is extremely helpful to configure a proper Linux and X windows environment before attempting to run an Oracle tool such as OUI. Commands nc, lsof, strace, traceroute, lsipc, and lsmem were used while writing this blog article. From xorg-x11-utils, xlsclients and xev were used. From motif, mwm was used. xauth is required for X windows. An oracle installation may require unzip and xdpyinfo. Not all of the needed utilities are installed by default. You must install them yourself.

If you Google “x windows forwarding not working” you will find cases where one person’s advice works for one person and not another. Trying random tips and tricks without understanding can lead to solutions that do not work reliably, and to enabling unnecessary options. To improve understanding, I have added diagnostic output from netstat and lsof.

Be sure to understand that the X server runs on your PC and listens on port 6000 by default. On the oracle box, sshd listens on port 6000 + an offset, often 10. Setting DISPLAY=localhost:10.0 will lead X clients to connect to the sshd listening on port 6010 on the local host. sshd will forward the X traffic back to the X server on your PC.

It is best to break the setup and testing into multiple, simple steps, rather than attempting to start runInstaller in one go.

tl;dr

  • Install needed packages
  • Copy .Xauthority to the oracle account
  • Set the DISPLAY environment variable

 

ssh to postgres issue solved

by Brian Fitzgerald

Scenario

You have double checked everything, but ssh to postgres does not work.

Conditions

  1. You did a postgres package install
  2. ssh to other accounts works
  3. ssh to postgres using a key does not work
  4. Directory .ssh mode is 700
  5. File authorized_keys mode is 600

Selinux

Check whether selinux is enforcing access controls.

[postgres@test ~]$ getenforce
Enforcing

Note in /var/log/audit/audit.log, “avc: denied…scontext…sshd…tcontext…postgresql_db_t”

type=AVC msg=audit(1562467696.927:316): avc:  denied  { read } for  pid=2282 comm="sshd" name="authorized_keys" dev="sda2" ino=56806 scontext=system_u:system_r:sshd_t:s0-s0:c0.c1023 tcontext=unconfined_u:object_r:postgresql_db_t:s0 tclass=file permissive=0

Note that the inode is 56806. Check ls -i:

[postgres@test ~]$ cd .ssh

[postgres@test .ssh]$ ls -i authorized_keys
56806 authorized_keys
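The inode comparison can also be scripted. This is a minimal Python sketch; avc_inode is a hypothetical helper, and the AVC line below is fabricated for illustration:

```python
import os
import re
import tempfile

# Hypothetical helper: pull the ino= field out of an AVC audit record.
def avc_inode(avc_line: str) -> int:
    return int(re.search(r"\bino=(\d+)\b", avc_line).group(1))

# Fabricated example: compare the audit record's inode with os.stat,
# the programmatic equivalent of checking "ls -i".
with tempfile.NamedTemporaryFile(delete=False) as f:
    path = f.name
real_ino = os.stat(path).st_ino
fake_avc = f'avc:  denied  {{ read }} name="authorized_keys" ino={real_ino}'
print(avc_inode(fake_avc) == real_ino)  # True
os.unlink(path)
```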

Check the authorized_keys selinux type:

[postgres@test .ssh]$ ls -Z authorized_keys
-rw-------. postgres postgres unconfined_u:object_r:postgresql_db_t:s0 authorized_keys

The type is “postgresql_db_t”. In order for ssh to work, the type needs to be “ssh_home_t”.

Solution

Issue restorecon -R .ssh:

[postgres@test ~]$ restorecon -R .ssh
[postgres@test ~]$ ls -RZ .ssh
.ssh:
-rw-------. postgres postgres unconfined_u:object_r:ssh_home_t:s0 authorized_keys

Check:

[postgres@pgstby .ssh]$ ssh test date
Sun Jul  7 02:57:21 UTC 2019

ssh works!

Note that new files created under directory .ssh will inherit the necessary ssh_home_t type.

chcon

If the home directory has an incompatible selinux type, then restorecon will not work. In that case, run chcon, e.g.:

[postgres@test ~]$ chcon -R unconfined_u:object_r:ssh_home_t:s0 ~/.ssh

or

[postgres@test ~]$ chcon -R system_u:object_r:usr_t:s0 ~/.ssh

semanage

If restorecon did not work, you can add a file-context rule so that it works in the future:

[root@test ~]# semanage fcontext -a -t ssh_home_t ~postgres/'.ssh(/.+)?'

Cause

The Linux system was built from the Azure image Red Hat Enterprise Linux 7.6, which has selinux enabled by default.

Postgres was installed from package rh-postgresql10, which created the user postgres and then set the selinux type of most files and directories to postgresql_db_t.

The .ssh directory inherited its type from its parent.

Normal home directory

The normal type of a home directory is user_home_dir_t. Subdirectory .ssh gets created as type ssh_home_t:

[normal@test ~]$ ls -dZ
drwx------. normal normal unconfined_u:object_r:user_home_dir_t:s0 .
[normal@test ~]$ mkdir .ssh
[normal@test ~]$ ls -dZ .ssh
drwxrwxr-x. normal normal unconfined_u:object_r:ssh_home_t:s0 .ssh

Affected systems

In this example, the system was Red Hat Linux on Azure, but it can happen on any system running selinux. The issue was noticed after a PostgreSQL package install. The same problem could appear with other package installs that create home directories.

Recommended practice

On selinux-enabled systems, create the .ssh directory this way:

[postgres@test ~]$ mkdir .ssh
[postgres@test ~]$ chmod 700 .ssh
[postgres@test ~]$ restorecon -R .ssh

Connecting across VPCs using Peering

By Brian Fitzgerald

Introduction

The requirement to connect applications across regions is ubiquitous. In Amazon Web Services (AWS), applications are deployed to a Virtual Private Cloud (VPC), but a VPC is specific to a single AWS region — to connect across regions, it is necessary to connect across VPCs. For speed and security, it is preferable to connect VPCs across Amazon’s internal networks, not across the public internet. We are going to establish our cross-VPC connection using peering. For this peering to succeed, planning is necessary to avoid overlapping IP address ranges. Peering across AWS accounts will also be demonstrated.

Virtual Private Clouds

An Amazon AWS Virtual Private Cloud (VPC) is an isolated network in a single region. A VPC covers all availability zones in the region and can have multiple subnets. A VPC covers a specific CIDR (Classless Inter-Domain Routing) IP address range, or “block”. In this section, we’re going to cover VPC IP address ranges, which is going to lead into the next section on VPC peering.

Networks

IP network configuration in AWS VPCs is quite flexible. The network number can reflect almost any legal IPv4 address range. IPv6 CIDR ranges are also available. A VPC CIDR block size can range from /16 netmask (65534 IP addresses) to /28 netmask (14 IP addresses).

Amazon recommends that you specify a CIDR block from the private IPv4 address ranges as specified in RFC 1918:

  • 10.0.0.0 – 10.255.255.255 (10/8 prefix)
  • 172.16.0.0 – 172.31.255.255 (172.16/12 prefix)
  • 192.168.0.0 – 192.168.255.255 (192.168/16 prefix)

If you use an AWS tool to automatically create a VPC, you will find that the generated CIDR follows that guidance. Here are some examples of VPC CIDR blocks generated by AWS tools:

  • 10.0.0.0/16
  • 172.31.0.0/16

You may create a VPC with a CIDR block outside the RFC 1918 ranges, but most users will refrain from doing so on aesthetic grounds, or to avoid misunderstandings. You may not create an AWS VPC CIDR block beginning with 0. or 127.

The default limit on VPCs is 5 per region. To get a higher limit, you have to open a support case and submit a limit increase request. To avoid needing a higher VPC limit, you might decide to create your VPCs as large as allowable, i.e. netmask /16, and to avoid inadvertently creating new ones. Some AWS tools only work by creating a new VPC. For example, if you use the Getting Started menu to set up an Elastic Container Service with load balancing, you will have no option but to create a new VPC. Other AWS tools offer to create a new VPC. For example, if you create a new EC2 or RDS instance in the AWS console, the menu offers creating a new VPC as an option. In conclusion, you can tightly manage your number of VPCs, or you can request a higher limit.

One way to simplify and standardize your administration is to create all your VPCs using RFC 1918 IP address ranges and netmask /16. If you do that, you can create networks among these ranges:

  • 10.0.0.0/16 – 10.255.0.0/16 (256 networks)
  • 172.16.0.0/16 – 172.31.0.0/16 (16 networks)
  • 192.168.0.0/16 (1 network)

for a total of up to 273 VPCs per region. For many applications, 273 VPCs per region is ample.
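The count of 273 can be checked with the standard ipaddress module, enumerating the /16 subnets of each RFC 1918 block:

```python
import ipaddress

# Count the /16 networks available inside each RFC 1918 private range.
rfc1918 = ["10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16"]
counts = [len(list(ipaddress.ip_network(block).subnets(new_prefix=16)))
          for block in rfc1918]
print(counts)       # [256, 16, 1]
print(sum(counts))  # 273
```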

In a new account, the default VPC in each region has CIDR block 172.31.0.0/16. As of this writing, a new account covers 17 regions. The upshot is that a new AWS account has 17 VPCs all with the same CIDR block. In the case of isolated VPCs, this is not a problem, but default VPCs cannot be connected by peering because the IP address ranges overlap.

In addition to creating new VPCs, you can increase the size of an existing VPC by adding additional CIDR blocks. However, you cannot mix across RFC 1918 IP address ranges. Specifically, you cannot combine “10.” and “172.16” CIDR blocks in a single VPC.

In conclusion, if you consider all allowable ranges and netmasks, you can choose from over a half billion possible CIDR blocks. However, even if you restrict your choice to private networks and the largest allowable size, you can choose from among 273 different CIDR blocks.

Subnets

Subnetting VPCs is also quite flexible. You may specify a subnet mask ranging from /16 (65534 IP addresses) to /28 (14 IP addresses). By default, a VPC may have up to 200 subnets.

You cannot create a subnet larger than netmask /16. For example, if you compose a VPC from two contiguous netmask /16 networks, you could not, therefore, create a single netmask /15 subnet.

VPC summary

A new account covers multiple regions, each with a default VPC with CIDR block range 172.31.0.0/16. You can create additional VPCs. Some users may decide to stick with netmask /16 and RFC 1918 networks. The detailed explanations of VPC IP addresses in this section set the stage for the next section, which is VPC peering.

VPC Peering

A simple way to connect across VPCs is to establish VPC peering. VPC peering connects two VPCs to form a single network. Traffic is routed not across the Internet, but across a private AWS network. VPC peering is more secure and more reliable than using an internet gateway.

VPC peering requires that the VPC CIDR blocks do not overlap. Subnets are not considered. In other words, if two VPCs have overlapping CIDRs, you cannot establish VPC peering, even if no existing subnets overlap.

All default VPC CIDR blocks are 172.31.0.0/16. You cannot establish VPC peering between default VPCs because the IP address ranges overlap. You must create one or more new VPCs.
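The overlap rule can be checked with the standard ipaddress module. Two default VPCs share 172.31.0.0/16 and therefore overlap; a freshly numbered VPC such as 10.1.0.0/16 does not:

```python
import ipaddress

# The peering precondition: VPC CIDR blocks must not overlap.
default_vpc = ipaddress.ip_network("172.31.0.0/16")
print(default_vpc.overlaps(ipaddress.ip_network("172.31.0.0/16")))  # True
print(default_vpc.overlaps(ipaddress.ip_network("10.1.0.0/16")))    # False
```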

For this blog, we’re going to set up VPC peering across regions us-east-1, ap-northeast-1, and eu-west-2 (N. Virginia, Tokyo, and London). We’ll start by deleting the default VPC, namely 172.31.0.0/16, and creating these VPCs:

Region Region ID VPC ID CIDR block
N. Virginia us-east-1 vpc-0ed2447f33a01d301 10.1.0.0/16
Tokyo ap-northeast-1 vpc-07251b9829e270787 10.2.0.0/16
London eu-west-2 vpc-0bf90b5507089c175 10.3.0.0/16

For example:

vpc

I have deleted the default VPCs for neatness: I have no need for them now. In each new VPC, create a subnet. Also, only for the sake of this blog, I’ll create an internet gateway in us-east-1 and add a route to the Internet via the gateway. Spin up an EC2, download the key pair, convert to putty keys, connect with PuTTY and we’re in. Install nc:

sudo yum -y update
sudo yum -y install nc

In ap-northeast-1 and eu-west-2, spin up an EC2 in each. Save the SSH keys (*.pem) for later.

Region Subnet EC2 IP address public IP address
us-east-1 10.1.0.0/24 10.1.0.244 107.23.67.190
ap-northeast-1 10.2.0.0/24 10.2.0.241
eu-west-2 10.3.0.0/24 10.3.0.183

Attempt to connect from us-east-1 to ap-northeast-1 and eu-west-2, and the connections time out.

timeout

Now, we’ll setup VPC peering.

Setting up VPC peering

To set up VPC peering, send an invitation by following the Create Peering Connection dialog. For example, from us-east-1, invite ap-northeast-1.

invite

In the ap-northeast-1 region, accept the invitation.

accept

In the “Actions” menu, select “Accept request”. In the dialog, click “Yes, Accept”, and in the next dialog, click “Modify my route tables now”, or select “Route Tables” from the left navigation pane.

In the us-east-1 route table, add a route to 10.2.0.0/16 via the peered connection. In the ap-northeast-1 route table, add a route to 10.1.0.0/16 via the peered connection. You can skip ahead to the screenshots in the next subsection to get a preview of the final route table.

Retest the connection to ap-northeast-1. Success:

tcp22ok

Likewise:

  • In us-east-1, invite the eu-west-2 VPC to peering.
  • In eu-west-2, accept the peering invitation.
  • In the us-east-1 route table, add a route to 10.3.0.0/16 via the peering connection.
  • In the eu-west-2 route table, add a route to 10.1.0.0/16 via the peering connection.

Using WinSCP, copy the *.pem files that you downloaded when you created the EC2s to the us-east-1 EC2. Change the file mode to 600. Now ssh succeeds:

from us-east-1:
ssh -i ap-northeast-1-key.pem ec2-user@10.2.0.241
ssh -i eu-west-2-key.pem ec2-user@10.3.0.183

Connecting Tokyo to London

Peering is not transitive, meaning that, so far, you cannot connect directly from ap-northeast-1 to eu-west-2 or vice versa. You may, however, set up peering directly between ap-northeast-1 and eu-west-2. Be sure to update the route tables. The final us-east-1 route table looks thus:

rout.us-east-1

The ap-northeast-1 route table is:

rout.ap-northeast-1

The eu-west-2 route table is:

rout.eu-west-2

Once routing is setup, you can connect between any two IP addresses in the three regions.

If you want to ssh from ap-northeast-1 to eu-west-2, then copy eu-west-2-key.pem to  ap-northeast-1 first.

from us-east-1:
cd .ssh/
scp -i ap-northeast-1-key.pem -p eu-west-2-key.pem  ec2-user@10.2.0.241:.ssh
ssh -i ap-northeast-1-key.pem ec2-user@10.2.0.241
from ap-northeast-1:
cd .ssh/
ssh -i eu-west-2-key.pem ec2-user@10.3.0.183

Again, note that only the us-east-1 EC2 instance is public. The ap-northeast-1 and eu-west-2 EC2 instances are private, and are accessible only via the us-east-1 EC2 instance.

This was an example of interconnecting three regions in the same account. The connections were accomplished using ssh (port 22). In the next section, we will connect across two separate AWS accounts via Oracle database link.

Connection across accounts

So far, we have set up VPC peering across regions in the same account. Now we are going to establish VPC peering across separate AWS accounts. Set up your accounts. Set up VPCs with non-overlapping CIDR blocks. For example:

Account number Region Region ID VPC ID CIDR block
665575760545 Seoul ap-northeast-2 vpc-04260ecd771d09cdb 10.5.0.0/16
128887077649 Singapore ap-southeast-1 vpc-043e1448a4e98a416 10.6.0.0/16

In each VPC, set up at least two subnets in separate availability zones.

Account number VPC ID Subnet Availability Zone
665575760545 vpc-04260ecd771d09cdb 10.5.0.0/24 ap-northeast-2b (apne2-az2)
665575760545 vpc-04260ecd771d09cdb 10.5.1.0/24 ap-northeast-2a (apne2-az1)
128887077649 vpc-043e1448a4e98a416 10.6.0.0/24 ap-southeast-1c (apse1-az3)
128887077649 vpc-043e1448a4e98a416 10.6.1.0/24 ap-southeast-1a (apse1-az1)

Create databases

In the first account:

  • Set up an internet gateway and a route for the sake of this blog.
  • Enable DNS hostnames.
  • Create an Oracle Database RDS (internet facing for the sake of this blog).
  • Test from Oracle SQL Developer.

dbsuccess

In the second account, create an Oracle Database RDS, private. Enable listener log exports. RDS Summary:

Account number DB ident Endpoint
665575760545 seoul-ora seoul-ora.c7oolvrrvu91.ap-northeast-2.rds.amazonaws.com
128887077649 singapore-ora singapore-ora.cdhkgqcl8pkk.ap-southeast-1.rds.amazonaws.com

(… continued)

Account number IP Address PORT db name
665575760545 52.79.225.94 1521 ORCL
128887077649 10.6.1.194 1521 ORCL

Note that the seoul-ora IP address is public and the singapore-ora IP address is private.

From the first account, from seoul-ora, the create database link statement succeeds:

CREATE DATABASE LINK singapore_link 
CONNECT TO admin IDENTIFIED BY "sing..33"
USING 'singapore-ora.cdhkgqcl8pkk.ap-southeast-1.rds.amazonaws.com:1521/ORCL';

A query across the database link times out:

select host_name from v$instance@singapore_link;
ORA-12170: TNS:Connect timeout occurred

Next we will setup VPC peering across accounts.

VPC Peering across accounts

From the first account, in Seoul, send the invitation:

invite.acct

From the second account, in Singapore, accept the invitation.

accept.acct

Click “Yes, Accept”, and in the next dialog, click “Modify my route tables now”, or select “Route Tables” from the left navigation pane. Add a route to destination 10.5.0.0/16 via peering connection pcx-09199f486b1e1a533.

From the first account, in Seoul, add a route to 10.6.0.0/16 via peering connection pcx-09199f486b1e1a533.

From the second account:

  • Select Singapore
  • Navigate to Services->RDS
  • Select singapore-ora
  • Identify the security group
  • Navigate to the security group
  • Add inbound rule:
    • TCP port: 1521
    • Source: 10.5.0.0/16

Here is a screenshot of the Singapore RDS security group inbound rules.

sing.sg

Retry the database link from Seoul:

select host_name from v$instance@singapore_link;
HOST_NAME
ip-172-21-2-91

Success. Peering across AWS accounts works.

Note that the query returns the Singapore RDS hostname, ip-172-21-2-91. The host is an EC2 instance that is not accessible from your AWS account.

In the second account, in Singapore, CloudWatch, in Logs, in /aws/rds/instance/singapore-ora/listener, observe the establish record.

12-JUN-2019 21:44:13 * (CONNECT_DATA=(SERVICE_NAME=ORCL)(CID=
 (PROGRAM=oracle)(HOST=ip-172-23-0-229)(USER=Brian Fitzgerald))) 
 * (ADDRESS=(PROTOCOL=tcp)(HOST=10.5.1.93)(PORT=64171)) 
 * establish * ORCL * 0

Notice that connection source IP address 10.5.1.93 is in Seoul VPC CIDR block 10.5.0.0/16.

Again, note that only the ap-northeast-2 (Seoul) RDS instance is public. The ap-southeast-1 (Singapore) RDS instance is private, and is accessible only via the ap-northeast-2 RDS instance.

Our result: a successful TCP connection across AWS accounts. In this case, we set up a database link across Oracle databases. The connection could as well have been a Microsoft SQL Server linked server, or a database client, such as ODBC or JDBC. VPC peering is not limited to database technology. A wide range of applications, tools, and services can be deployed across AWS VPCs, regions, or accounts by leveraging VPC peering. This blog has covered TCP over IPv4, but other transports, such as UDP, can be considered. IPv6 is supported in AWS, as well as IPv4.

Programming

All actions that were demonstrated from the AWS console can be accomplished programmatically.

Command Line Interfaces

The AWS command line interface (CLI) can be used to issue commands that perform the same actions that were demonstrated from the AWS console. For example, to drop and re-create the N. Virginia to Tokyo peering:

C:\>aws ec2 delete-vpc-peering-connection --vpc-peering-connection-id pcx-0006343192557953b
True
C:\>aws ec2 create-vpc-peering-connection --peer-vpc-id vpc-07251b9829e270787 --vpc-id vpc-0ed2447f33a01d301 --peer-region ap-northeast-1
VPCPEERINGCONNECTION    2019-06-21T01:54:00.000Z        pcx-0d5aa3deb15773138
ACCEPTERVPCINFO 665575760545    ap-northeast-1  vpc-07251b9829e270787
REQUESTERVPCINFO        10.1.0.0/16     665575760545    us-east-1       vpc-0ed2447f33a01d301
CIDRBLOCKSET    10.1.0.0/16
PEERINGOPTIONS  False   False   False
STATUS  initiating-request      Initiating Request to 665575760545

C:\>aws ec2 accept-vpc-peering-connection --vpc-peering-connection-id pcx-0d5aa3deb15773138 --region ap-northeast-1
VPCPEERINGCONNECTION    pcx-0d5aa3deb15773138
ACCEPTERVPCINFO 10.2.0.0/16     665575760545    ap-northeast-1  vpc-07251b9829e270787
CIDRBLOCKSET    10.2.0.0/16
PEERINGOPTIONS  False   False   False
REQUESTERVPCINFO        10.1.0.0/16     665575760545    us-east-1       vpc-0ed2447f33a01d301
CIDRBLOCKSET    10.1.0.0/16
PEERINGOPTIONS  False   False   False
STATUS  provisioning    Provisioning

All other actions demonstrated from the AWS console in this blog article can be run from the CLI. Examples:

aws ec2 create-vpc
aws ec2 create-route
aws ec2 create-subnet
aws ec2 authorize-security-group-ingress

to mention only a few.

Programming APIs

The commands can be scripted in several languages, including javascript, powershell, and python. The python library is boto3. EC2 client methods include these examples:

create_vpc()
create_vpc_peering_connection()
accept_vpc_peering_connection()
create_route()
create_subnet()
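As a sketch of the boto3 route (assuming boto3 is installed and AWS credentials are configured; the VPC IDs are the example values from the tables above, and peering_request_params is a hypothetical helper written for this article), the peering request from the CLI example translates to:

```python
# Hypothetical helper: assemble the arguments for the EC2 client method
# create_vpc_peering_connection, mirroring the CLI example above.
def peering_request_params(requester_vpc: str, accepter_vpc: str,
                           peer_region: str) -> dict:
    return {
        "VpcId": requester_vpc,        # the requesting (us-east-1) VPC
        "PeerVpcId": accepter_vpc,     # the accepting (Tokyo) VPC
        "PeerRegion": peer_region,
    }

params = peering_request_params("vpc-0ed2447f33a01d301",
                                "vpc-07251b9829e270787",
                                "ap-northeast-1")
print(params["PeerRegion"])  # ap-northeast-1

# The actual call is commented out so the sketch stays side-effect free:
# import boto3
# ec2 = boto3.client("ec2", region_name="us-east-1")
# response = ec2.create_vpc_peering_connection(**params)
```

The accepter then calls accept_vpc_peering_connection in its own region, just as in the CLI example.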

 

Technical summary

The key details needed to setup VPC peering are:

  • Non-overlapping IP address ranges across VPCs.
  • Send and accept invitation.
  • Adding routes to the route table.
  • Security group inbound rules that cover the remote IP address ranges.

In this blog article, we explained, demonstrated, or mentioned:

  • AWS VPC peering across three regions.
  • Deleting the default VPC
  • Creating new VPCs
  • Use of RFC 1918 private IP address ranges
  • Use of /16 netmask for VPC
  • Sending and accepting VPC peering invitations.
  • Adding routes to the route table
  • Creation of subnets
  • Use of /24 subnet mask
  • Downloading ssh key pairs (pem)
  • Using PuTTYgen to convert ssh keys to putty keys (ppk).
  • Connecting to EC2 via PuTTY or WinSCP.
  • VPC peering across AWS accounts
  • Installing nc
  • Using nc in EC2 to test TCP connectivity
  • ssh across EC2 using ssh key pairs (pem)
  • Limiting RDS access via security groups.
  • Creating a database link
  • Reviewing RDS listener log

Conclusion

AWS VPC peering is a great way to connect applications across VPCs, regions, or accounts. VPC peering is faster, more reliable, and more secure than using the Internet. VPC peering can be implemented smoothly by avoiding overlapping IP address ranges. This blog covered ssh and database connections, but VPC peering applies to a wide range of networked application technology.

Notes on private temporary tables

By Brian Fitzgerald

New in 18c, private temporary tables are temporary database objects that are dropped at the end of a transaction or session. Private temporary tables are stored in memory and each one is visible only to the session that created it.

Here are a few findings on private temporary tables (PTT).

Basic operation

Create, insert, and select.

create private temporary table ora$ptt_a
(
 n number
) on commit preserve definition;
insert into ora$ptt_a ( n ) values ( 0 );
select n 
from ora$ptt_a;

Table created.
1 row(s) inserted.

N
0

A PTT cannot be partitioned or index-organized.

Using “on commit drop definition”

Commit drops the PTT if the PTT is declared “on commit drop definition”.

create private temporary table ora$ptt_a
(
n number
) on commit drop definition;
insert into ora$ptt_a ( n ) values ( 0 );
commit;
select n 
from ora$ptt_a;

Table created.
1 row(s) inserted.
Statement processed.
ORA-00942: table or view does not exist

The same applies to rollback:

drop table ora$ptt_a;
create private temporary table ora$ptt_a
(
n number
)
on commit drop definition;
rollback;
select n from ora$ptt_a;
Table dropped.
Table created.
Statement processed.
ORA-00942: table or view does not exist

No commit on create

Creating a PTT does not itself issue a commit, as it would with a conventional table. This fact leads to the primary use case of PTTs:

  • “When an application stores temporary data in transient tables that are populated once, read few times, and then dropped at the end of a transaction or session”

In other words, a PTT can be used as a driving table which you populate once, and then use for multiple queries.

Rollback

Rollback rolls back the insert, but not the create table.

drop table ora$ptt_a;
create private temporary table ora$ptt_a
(
n number
)
on commit preserve definition;
insert into ora$ptt_a ( n ) values ( 4 );
rollback;
select n from ora$ptt_a;

Table dropped.
Table created.
1 row(s) inserted.
Statement processed.
no data found

Alter table

Alter table add column fails with an erroneous ORA-00942:

alter table ora$ptt_a add m number;
ORA-00942: table or view does not exist

ORA-00942 can be taken to mean that Oracle searched for, and did not find, a conventional table; the misleading message is indicative of a bug.

Views

Views with info about PTTs are:

USER_PRIVATE_TEMP_TABLES

DBA_PRIVATE_TEMP_TABLES

There is no ALL_PRIVATE_TEMP_TABLES.

select * from USER_PRIVATE_TEMP_TABLES;

(results transposed)

SID 2284
SERIAL# 40199
OWNER SQL_RADNMXBEQPEYTXKXEYBLDVRPC
TABLE_NAME ORA$PTT_B
TABLESPACE_NAME TEMP
DURATION SESSION
NUM_ROWS 0
BLOCKS 0
AVG_ROW_LEN 0
LAST_ANALYZED 2/18/2018 23:01
TXN_ID 0
SAVE_POINT_NUM 0

sid, serial# refer to the session that created the PTT.

Parallel

The parallel create option succeeds:

drop table ora$ptt_a;
create private temporary table ora$ptt_a
 (
 d date
 )
 on commit preserve definition 
 parallel 8;

Table dropped. 

Table created.

However, I find the parallel degree nowhere in the catalog.

Altering the parallel degree fails with an erroneous ORA-00942:

alter table ora$ptt_a parallel 4;
ORA-00942: table or view does not exist 

No primary key

You cannot declare a primary key:

drop table ora$ptt_a;
create private temporary table ora$ptt_a
(
n number primary key
) 
on commit drop definition;

ORA-14451: unsupported feature with temporary table

You cannot create indexes, defaults, or not null constraints.

Dropping

You can explicitly drop a PTT.

drop table ora$ptt_a;

Table dropped.

Prefix

You can prefix the table name with the owner when referring to it.

drop table SQL_ZZOZIKMDVVNDRUUEJJIJXJMKR.ora$ptt_a;
create private temporary table SQL_ZZOZIKMDVVNDRUUEJJIJXJMKR.ora$ptt_a
on commit preserve definition
as select level n
from dual
connect by level <= 2;

rowid

You can query rowid. dbms_rowid.rowid_object returns a number that is unique per PTT and not in user_objects.

create private temporary table ora$ptt_a
as
select level n from dual connect by level <= 3;

create private temporary table ora$ptt_b
as
select level n from dual connect by level <= 3;

select rowid, n, dbms_rowid.rowid_object(rowid) from ora$ptt_a;
select rowid, n, dbms_rowid.rowid_object(rowid) from ora$ptt_b;
select max(object_id) from user_objects;

Table created.
Table created.
ROWID N DBMS_ROWID.ROWID_OBJECT(ROWID)
AATFyHAABAADFyIAAA 1 5004423
AATFyHAABAADFyIAAB 2 5004423
AATFyHAABAADFyIAAC 3 5004423
ROWID N DBMS_ROWID.ROWID_OBJECT(ROWID)
AATF4HAABAADF4IAAA 1 5004807
AATF4HAABAADF4IAAB 2 5004807
AATF4HAABAADF4IAAC 3 5004807
MAX(OBJECT_ID)
129337

Flashback query

Flashback queries are not allowed on temporary tables. The expected message is:

ORA-30051: VERSIONS clause not allowed here

However, flashback query on a PTT fails with an erroneous ORA-00942:

select n from ora$ptt_a
versions between scn 10717996 and 10720679;
ORA-00942: table or view does not exist

Grant

You cannot grant access to a PTT; the grant fails with an erroneous ORA-00942:

grant select on ora$ptt_a to system;
ORA-00942: table or view does not exist

Truncate

You can truncate a PTT. Truncating a PTT does not issue a COMMIT, as it would with a conventional table.

create global temporary table gtt_a
(
n number 
)
on commit delete rows;
create private temporary table ora$ptt_a
(
n number
)
on commit drop definition;
insert into ora$ptt_a
select level from dual 
connect by level <= 10000;
insert into gtt_a
select level from dual 
connect by level < 10000;
select count(*) nptt from ora$ptt_a;
truncate table ora$ptt_a;
select count(*) nptt from ora$ptt_a;
select count(*) ngtt from gtt_a;
commit;
select count(*) ngtt from gtt_a;

Table created.
Table created.
10000 row(s) inserted.
9999 row(s) inserted.
NPTT
10000
Table truncated.
NPTT
0
NGTT
9999
Statement processed.
NGTT
0

Gather stats

You cannot gather stats on a private temporary table.

begin
dbms_stats.gather_table_stats(null,'ora$ptt_a');
end;
/

ORA-20000: Unable to analyze TABLE "SQL_RADNMXBEQPEYTXKXEYBLDVRPC"."ORA$PTT_A", insufficient privileges or does not exist
ORA-06512: at "SYS.DBMS_STATS", line 39094
ORA-06512: at "SYS.DBMS_STATS", line 38371
ORA-06512: at "SYS.DBMS_STATS", line 38530
ORA-06512: at "SYS.DBMS_STATS", line 39076
ORA-06512: at line 2
ORA-06512: at "SYS.DBMS_SQL", line 1721

You cannot analyze a PTT:

analyze table ora$ptt_a estimate statistics;

ORA-00942: table or view does not exist 

PTT statistics normally show zero rows; however, the statistics are populated when the PTT is built with CREATE TABLE AS SELECT.

create private temporary table ora$ptt_a
as select level n
from dual
connect by level <= 10000;
select num_rows, blocks, avg_row_len
from user_private_temp_tables;
Table created.
NUM_ROWS BLOCKS AVG_ROW_LEN
10000 16 4

Metadata

PTTs do not appear in USER_TABLES, USER_SEGMENTS, or USER_OBJECTS.

PTTs do not have an object_id.

I have found no PTT column metadata (so far); the x$ tables may be worth checking.
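
For example, a catalog probe like the following returns nothing for the PTT, even when run in the creating session (sketch):

create private temporary table ora$ptt_a
(
n number
)
on commit preserve definition;

select count(*) from user_objects where object_name like 'ORA$PTT%';
-- 0: the PTT is not registered in USER_OBJECTS
select count(*) from user_tables where table_name like 'ORA$PTT%';
-- 0: nor in USER_TABLES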

Multiple sessions

Multiple sessions by the same user can create a PTT having the same name. The definition and the data are visible only to the creating session.
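
A minimal sketch; the session labels are illustrative, and each block should be run in a separate connection as the same user:

-- Session 1
create private temporary table ora$ptt_a
(
n number
)
on commit preserve definition;
insert into ora$ptt_a ( n ) values ( 1 );

-- Session 2 (same user, separate connection): the same name is free
create private temporary table ora$ptt_a
(
n number
)
on commit preserve definition;
select count(*) from ora$ptt_a;
-- 0: session 1's row is not visible here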

Name clash

You cannot create a conventional table, or any other object, beginning with “ORA$PTT_”.

create procedure ora$ptt_p is begin null; end;
/

ORA-32463: cannot create an object with a name 
matching private temporary table prefix 

The PTT prefix can be changed using initialization parameter PRIVATE_TEMP_TABLE_PREFIX, but it cannot be modified at the session level.

alter session set PRIVATE_TEMP_TABLE_PREFIX = 'PRIV$TMP_';

ORA-02096: specified initialization parameter is not modifiable 
with this option

To change the PTT prefix at the instance level, issue, for example:

alter system set PRIVATE_TEMP_TABLE_PREFIX = 'PRIV$TMP_' deferred;

Current sessions are unaffected; future connections pick up the new setting.
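
A session connected after the deferred change takes effect could then use the new prefix (sketch, assuming the instance-level alter system shown above):

-- In a new session started after the deferred change:
create private temporary table priv$tmp_a
(
n number
)
on commit drop definition;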

PL/SQL

You can use a PTT in an anonymous PL/SQL block:

drop table ora$ptt_a;
create private temporary table ora$ptt_a
(
n number
)
on commit preserve definition;
insert into ora$ptt_a ( n ) values ( 7 );
declare
 l_num number;
begin
 select n into l_num from ora$ptt_a;
 dbms_output.put_line('l_num='||l_num);
end;
/
Table dropped.
Table created.
1 row(s) inserted.
l_num=7

A PTT column cannot be referenced in a %TYPE declaration.

drop table ora$ptt_a;
create private temporary table ora$ptt_a
(
n number
)
on commit preserve definition;
declare
 l_num ora$ptt_a.n%type;
begin
 null;
end;
/
Table dropped.
Table created.
ORA-06550: line 2, column 9:
PLS-00201: identifier 'ORA$PTT_A.N' must be declared

You cannot create a stored procedure that uses a PTT in static SQL.

drop table ora$ptt_a;
create private temporary table ora$ptt_a
(
n number
)
on commit preserve definition;

create or replace procedure pr
as
begin
 insert into ora$ptt_a ( n ) values ( 0 );
end;
/
select line, position, text from user_errors where name = 'PR';

Table dropped. 

Table created.
Error at line: 12
LINE POSITION TEXT
4 14 PL/SQL: ORA-14451: unsupported feature with temporary table
4 2 PL/SQL: SQL Statement ignored

You can write an anonymous PL/SQL block that declares a procedure that uses a PTT.

drop table ora$ptt_a;
create private temporary table ora$ptt_a
(
n number
)
on commit preserve definition;

declare
procedure pr
as
begin
 insert into ora$ptt_a ( n ) values ( 5 );
end pr;
begin
 pr;
end;
/
select n from ora$ptt_a;
Table dropped.
Table created.
1 row(s) inserted.

N
5

Notice the feedback on the insert. I have not seen such feedback in PL/SQL before. This demo was run on Oracle Live SQL.

You can create a package that creates and uses a PTT using dynamic SQL.

drop table ora$ptt_a;
create or replace package ptt_pkg
is
 procedure crptt;
 procedure insptt;
 function pttval
 return number;
end ptt_pkg;
/
create or replace package body ptt_pkg
is
 procedure crptt
 is
 begin
 execute immediate
 q'{create private temporary table ora$ptt_a
 (
 n number
 )
 on commit preserve definition}';
 end crptt;

procedure insptt
 is
 begin
 execute immediate
 q'{insert into ora$ptt_a ( n ) values ( 3 )}';
 end insptt;

function pttval
 return number
 is
 l_num number;
 begin
 execute immediate
 q'{select n
 from ora$ptt_a}' into l_num;
 return l_num;
 end pttval;
end ptt_pkg;
/

declare
 l_num number;
begin
 ptt_pkg.crptt;
 ptt_pkg.insptt;
 l_num := ptt_pkg.pttval;
 dbms_output.put_line('in package l_num='||l_num);
end;
/

Table dropped.
Package created.
Package Body created.
in package l_num=3

An anonymous PL/SQL block will not compile if it statically references an object that does not yet exist. This will not work:

declare
l_num number;
begin
ptt_pkg.crptt;
ptt_pkg.insptt;
select n into l_num
from ora$ptt_a;
end;
/

ORA-06550: line 7, column 6: 
PL/SQL: ORA-00942: table or view does not exist

In-memory

Dan Morgan reports “ORA-14451: unsupported feature with temporary table” in a PDB with inmemory_size set. Please refer to the blog feedback.

Explain plan

You can run explain plan on a statement that uses a PTT.

drop table ora$ptt_a;
create private temporary table ora$ptt_a
 (
 n number
 )
 on commit preserve definition;
explain plan for
select n
from ora$ptt_a;
select * from table ( dbms_xplan.display );
Table dropped.
Table created.
Statement processed.

PLAN_TABLE_OUTPUT
Plan hash value: 2125934360

---------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 13 | 2 (0)| 00:00:01 |
| 1 | TABLE ACCESS STORAGE FULL| ORA$PTT_A | 1 | 13 | 2 (0)| 00:00:01 |
---------------------------------------------------------------------------------------

Parallel plan

The optimizer can create a parallel plan on a PTT.

drop table ora$ptt_a;
create private temporary table ora$ptt_a
 (
 n number
 )
 on commit preserve definition
 parallel 8;
explain plan for
select n
from ora$ptt_a;
select * from table ( dbms_xplan.display );
Table dropped.
Table created.
Statement processed.

PLAN_TABLE_OUTPUT
Plan hash value: 2895541530

-----------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | TQ |IN-OUT| PQ Distrib |
-----------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 13 | 2 (0)| 00:00:01 | | | |
| 1 | PX COORDINATOR | | | | | | | | |
| 2 | PX SEND QC (RANDOM) | :TQ10000 | 1 | 13 | 2 (0)| 00:00:01 | Q1,00 | P->S | QC (RAND) |
| 3 | PX BLOCK ITERATOR | | 1 | 13 | 2 (0)| 00:00:01 | Q1,00 | PCWC | |
| 4 | TABLE ACCESS STORAGE FULL| ORA$PTT_A | 1 | 13 | 2 (0)| 00:00:01 | Q1,00 | PCWP | |
-----------------------------------------------------------------------------------------------------------------------

Note
-----
 - dynamic statistics used: dynamic sampling (level=2)
 - Degree of Parallelism is 8 because of table property

Troubleshooting

Query troubleshooting could be impeded by the scarcity of catalog information about a PTT's definition.

Prerequisites

The CREATE TABLE privilege is required to create a PTT, and granting CREATE TABLE permits creating any type of table, not just PTTs. The feature is clearly intended for use in application code at run time, so the DBA faces a choice: grant CREATE TABLE to the application runtime user, or forgo private temporary tables.
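
A minimal sketch of the grant, where APP_RT is a hypothetical application runtime user:

grant create table to app_rt;
-- APT_RT can now create PTTs at run time,
-- but can also create any other type of table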

Caution on new features

The PTT is a new 18c feature, so exercise care and stay alert for bugs. Bugs tend to appear when multiple lightly used features are combined; beware of performance problems, internal errors, corruption, and optimizer bugs that produce wrong results. Use PTTs when the potential business value outweighs the risk and the added testing cost, and follow conservative practices when you do.

Conclusion

PTTs are a new Oracle Database 18c feature. Before using them, it is a good idea to be aware of the restrictions.