How to Really Failover Virtual IP to Another Node

  • by

Since 11g, we rely more and more on SCAN Virtual IP to load balance and fail over connections among nodes of a cluster, so the original Virtual IP (VIP) is not as important as the old days. That's why VIP becomes optional for grid installation from 18c. However, some systems may still depend on VIP to fail over connections at connect-time.

In reality, if one node in a cluster failed to service, then its VIP automatically failover to another available node. That is to say, the failover of VIP is literally done when its home node leaves the cluster, but the problem is, we can't use it to connect to the database. It turns out that the failover VIP is useless.

In this post, we not only show you how to failover VIP but also provide a solution to really use the failover VIP.

  1. Failover Virtual IP (VIP)
  2. Solution to Use Failover VIP

A. Failover Virtual IP (VIP)

I'm going to make the second node of the RAC servers leave the cluster. Let's see steps in this part.

  1. Check Status before Failover
  2. Make the Second Node Leave Cluster
  3. Check Status of After Failover

1. Check Status before Failover

Configuration of Both VIP

[grid@primary01 ~]$ srvctl config vip -vip primary01-vip
VIP exists: network number 1, hosting node primary01
VIP Name: primary01-vip
VIP IPv4 Address: 192.168.0.111
VIP IPv6 Address:
VIP is enabled.
VIP is individually enabled on nodes:
VIP is individually disabled on nodes:
[grid@primary01 ~]$ srvctl config vip -vip primary02-vip
VIP exists: network number 1, hosting node primary02
VIP Name: primary02-vip
VIP IPv4 Address: 192.168.0.112
VIP IPv6 Address:
VIP is enabled.
VIP is individually enabled on nodes:
VIP is individually disabled on nodes:

As you can see, every VIP belongs to its home node.

192.168.0.111 primary01-vip
192.168.0.112 primary02-vip

Resource Status of Both VIP

[grid@primary01 ~]$ crsctl status resource ora.primary01.vip
NAME=ora.primary01.vip
TYPE=ora.cluster_vip_net1.type
TARGET=ONLINE
STATE=ONLINE on primary01
[grid@primary01 ~]$ crsctl status resource ora.primary02.vip
NAME=ora.primary02.vip
TYPE=ora.cluster_vip_net1.type
TARGET=ONLINE
STATE=ONLINE on primary02

Current Status of Both VIP

[grid@primary01 ~]$ srvctl status vip -vip primary01-vip
VIP 192.168.0.111 is enabled
VIP 192.168.0.111 is running on node: primary01
[grid@primary01 ~]$ srvctl status vip -vip primary02-vip
VIP 192.168.0.112 is enabled
VIP 192.168.0.112 is running on node: primary02

2. Make the Second Node Leave Cluster

Here we stop the services on node 2 by the normal procedure, you can also perform a blackout on node 2. They have the same effect.

Setup Grid Environment for root

We setup a grid environment for root, because we have to use root to perform some critical operations on the cluster.

[root@primary01 ~]# . /home/grid/.bash_profile

Stop Cluster on Node 2 by Root

Stopping all cluster service on node 2 needs root privilege.

[root@primary01 ~]# crsctl stop cluster -n primary02
CRS-2673: Attempting to stop 'ora.crsd' on 'primary02'
CRS-2790: Starting shutdown of Cluster Ready Services-managed resources on server 'primary02'
CRS-2673: Attempting to stop 'ora.chad' on 'primary02'
CRS-2673: Attempting to stop 'ora.orclcdb.db' on 'primary02'
CRS-2677: Stop of 'ora.orclcdb.db' on 'primary02' succeeded
CRS-33673: Attempting to stop resource group 'ora.asmgroup' on server 'primary02'
CRS-2673: Attempting to stop 'ora.OCR.dg' on 'primary02'
CRS-2673: Attempting to stop 'ora.DATA.dg' on 'primary02'
CRS-2673: Attempting to stop 'ora.MGMT.dg' on 'primary02'
CRS-2673: Attempting to stop 'ora.LISTENER.lsnr' on 'primary02'
CRS-2673: Attempting to stop 'ora.LISTENER_SCAN1.lsnr' on 'primary02'
CRS-2677: Stop of 'ora.DATA.dg' on 'primary02' succeeded
CRS-2677: Stop of 'ora.OCR.dg' on 'primary02' succeeded
CRS-2677: Stop of 'ora.LISTENER.lsnr' on 'primary02' succeeded
CRS-2677: Stop of 'ora.LISTENER_SCAN1.lsnr' on 'primary02' succeeded
CRS-2673: Attempting to stop 'ora.primary02.vip' on 'primary02'
CRS-2673: Attempting to stop 'ora.scan1.vip' on 'primary02'
CRS-2677: Stop of 'ora.MGMT.dg' on 'primary02' succeeded
CRS-2673: Attempting to stop 'ora.asm' on 'primary02'
CRS-2677: Stop of 'ora.primary02.vip' on 'primary02' succeeded
CRS-2677: Stop of 'ora.scan1.vip' on 'primary02' succeeded
CRS-2677: Stop of 'ora.chad' on 'primary02' succeeded
CRS-2677: Stop of 'ora.asm' on 'primary02' succeeded
CRS-2673: Attempting to stop 'ora.ASMNET1LSNR_ASM.lsnr' on 'primary02'
CRS-2677: Stop of 'ora.ASMNET1LSNR_ASM.lsnr' on 'primary02' succeeded
CRS-2673: Attempting to stop 'ora.asmnet1.asmnetwork' on 'primary02'
CRS-2677: Stop of 'ora.asmnet1.asmnetwork' on 'primary02' succeeded
CRS-33677: Stop of resource group 'ora.asmgroup' on server 'primary02' succeeded.
CRS-2672: Attempting to start 'ora.scan1.vip' on 'primary01'
CRS-2672: Attempting to start 'ora.primary02.vip' on 'primary01'
CRS-2676: Start of 'ora.scan1.vip' on 'primary01' succeeded
CRS-2672: Attempting to start 'ora.LISTENER_SCAN1.lsnr' on 'primary01'
CRS-2676: Start of 'ora.LISTENER_SCAN1.lsnr' on 'primary01' succeeded
CRS-2676: Start of 'ora.primary02.vip' on 'primary01' succeeded
CRS-2673: Attempting to stop 'ora.ons' on 'primary02'
CRS-2677: Stop of 'ora.ons' on 'primary02' succeeded
CRS-2673: Attempting to stop 'ora.net1.network' on 'primary02'
CRS-2677: Stop of 'ora.net1.network' on 'primary02' succeeded
CRS-2792: Shutdown of Cluster Ready Services-managed resources on 'primary02' has completed
CRS-2677: Stop of 'ora.crsd' on 'primary02' succeeded
CRS-2673: Attempting to stop 'ora.storage' on 'primary02'
CRS-2677: Stop of 'ora.storage' on 'primary02' succeeded
CRS-2673: Attempting to stop 'ora.asm' on 'primary02'
CRS-2677: Stop of 'ora.asm' on 'primary02' succeeded
CRS-2673: Attempting to stop 'ora.cluster_interconnect.haip' on 'primary02'
CRS-2677: Stop of 'ora.cluster_interconnect.haip' on 'primary02' succeeded
CRS-2673: Attempting to stop 'ora.ctssd' on 'primary02'
CRS-2673: Attempting to stop 'ora.evmd' on 'primary02'
CRS-2677: Stop of 'ora.ctssd' on 'primary02' succeeded
CRS-2677: Stop of 'ora.evmd' on 'primary02' succeeded
CRS-2673: Attempting to stop 'ora.cssd' on 'primary02'
CRS-2677: Stop of 'ora.cssd' on 'primary02' succeeded

3. Check Status of After Failover

Current Status of Failover VIP

[grid@primary01 ~]$ srvctl status vip -vip primary02-vip
VIP 192.168.0.112 is enabled
VIP 192.168.0.112 is running on node: primary01

Good, it's on the first node.

Resource Status of Failover VIP

[grid@primary01 ~]$ crsctl status resource ora.primary02.vip
NAME=ora.primary02.vip
TYPE=ora.cluster_vip_net1.type
TARGET=ONLINE
STATE=INTERMEDIATE on primary01

Did you see that? The state of failover VIP is INTERMEDIATE, not ONLINE. According to the Resource State of Oracle Clusterware 19c, INTERMEDIATE means:

A resource is partially online. For example, the Oracle Database VIP resource fails over to another server when its home server leaves the cluster. However, applications cannot use this VIP to access the database while it is on a non-home server. Similarly, when an Oracle Database instance is started and not open, the resource is partially online: it is running but is not available to provide services.

Does it mean that we can't use the failover VIP? It's a bad news.

NIC Status of Failover VIP

The status of the first network interface of the first node proves that it has bound the VIP.

[grid@primary01 ~]$ ifconfig -a
...
ens33:6: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 192.168.0.112  netmask 255.255.255.0  broadcast 192.168.0.255
        ether 00:0c:29:dc:65:90  txqueuelen 1000  (Ethernet)

Yes, it did fail over to the first node.

Listener Status

[grid@primary01 ~]$ lsnrctl status
...
Listener Parameter File   /u01/app/19.0.0/grid/network/admin/listener.ora
Listener Log File         /u01/app/grid/diag/tnslsnr/primary01/listener/alert/log.xml
Listening Endpoints Summary...
  (DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(KEY=LISTENER)))
  (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=192.168.0.11)(PORT=1521)))
  (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=192.168.0.111)(PORT=1521)))
  (DESCRIPTION=(ADDRESS=(PROTOCOL=tcps)(HOST=primary01)(PORT=5500))(Security=(my_wallet_directory=/u01/app/oracle/product/19.0.0/db_1/admin/ORCLCDB/xdb_wallet))(Presentation=HTTP)(Session=RAW))
Services Summary...
Service "+ASM" has 1 instance(s).
  Instance "+ASM1", status READY, has 1 handler(s) for this service...
Service "+ASM_DATA" has 1 instance(s).
  Instance "+ASM1", status READY, has 1 handler(s) for this service...
Service "+ASM_MGMT" has 1 instance(s).
  Instance "+ASM1", status READY, has 1 handler(s) for this service...
Service "+ASM_OCR" has 1 instance(s).
  Instance "+ASM1", status READY, has 1 handler(s) for this service...
Service "ORCLCDB" has 1 instance(s).
  Instance "ORCLCDB1", status READY, has 1 handler(s) for this service...
Service "ORCLCDBXDB" has 1 instance(s).
  Instance "ORCLCDB1", status READY, has 1 handler(s) for this service...
Service "b19aa2333d82dcfbe0530c2aa8c0ee3a" has 1 instance(s).
  Instance "ORCLCDB1", status READY, has 1 handler(s) for this service...
Service "orclpdb" has 1 instance(s).
  Instance "ORCLCDB1", status READY, has 1 handler(s) for this service...
The command completed successfully

No, we don't see 192.168.0.112 (primary02-vip) on the list of the listener. That is to say, no listener for 192.168.0.112.

All facts tell us that we cannot use the failover VIP to service external connections, although VIP actually failed over to another node. It seems to be the end of story.

Next, let's see our solution.

B. Solution to Use Failover VIP

The solution is to create a new listener running upon the failover VIP. Let's see steps in this part.

  1. Create a listener in ORACLE_HOME by oracle
  2. Add Listener to Database
  3. Startup the New Listener
  4. Show Listener Status
  5. Test Connection

1. Create a listener in ORACLE_HOME by oracle

Please note that, we configure a new listener at database-level, not grid-level.

Node 1

[oracle@primary01 ~]$ vi $ORACLE_HOME/network/admin/listener.ora
LISTENER2 =
  (DESCRIPTION_LIST =
    (DESCRIPTION =
      (ADDRESS = (PROTOCOL = TCP)(HOST = 192.168.0.112)(PORT = 1521))
    )
  )

Node 2

[oracle@primary02 ~]$ vi $ORACLE_HOME/network/admin/listener.ora
LISTENER2 =
  (DESCRIPTION_LIST =
    (DESCRIPTION =
      (ADDRESS = (PROTOCOL = TCP)(HOST = 192.168.0.111)(PORT = 1521))
    )
  )

2. Add Listener to Database

We use multiple addresses to set LOCAL_LISTENER with both scope and all SID. This can make the database register services with the new listener.

[oracle@primary01 ~]$ sqlplus / as sysdba
...
SQL> alter system set local_listener='(ADDRESS_LIST=(ADDRESS = (PROTOCOL = TCP)(HOST = 192.168.0.111)(PORT = 1521))(ADDRESS = (PROTOCOL = TCP)(HOST = 192.168.0.112)(PORT = 1521)))' scope=both sid='*';

System altered.

SQL> show parameter local_listener

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
local_listener                       string      (ADDRESS_LIST=(ADDRESS = (PROT
                                                 OCOL = TCP)(HOST = 192.168.0.1
                                                 11)(PORT = 1521))(ADDRESS = (P
                                                 ROTOCOL = TCP)(HOST = 192.168.
                                                 0.112)(PORT = 1521)))

3. Startup the New Listener

Since the VIP of node 2 has been failed over to node 1, we can start LISTENER2 at node 1. Vice versa, if node 1 failed, we should start it on node 2.

[oracle@primary01 ~]$ lsnrctl start listener2

LSNRCTL for Linux: Version 19.0.0.0.0 - Production on 26-DEC-2020 15:32:15

Copyright (c) 1991, 2019, Oracle.  All rights reserved.

Starting /u01/app/oracle/product/19.0.0/db_1/bin/tnslsnr: please wait...

TNSLSNR for Linux: Version 19.0.0.0.0 - Production
System parameter file is /u01/app/oracle/product/19.0.0/db_1/network/admin/listener.ora
Log messages written to /u01/app/oracle/diag/tnslsnr/primary01/listener2/alert/log.xml
Listening on: (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=192.168.0.112)(PORT=1521)))

Connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=192.168.0.112)(PORT=1521)))
STATUS of the LISTENER
------------------------
Alias                     listener2
Version                   TNSLSNR for Linux: Version 19.0.0.0.0 - Production
Start Date                26-DEC-2020 15:32:15
Uptime                    0 days 0 hr. 0 min. 0 sec
Trace Level               off
Security                  ON: Local OS Authentication
SNMP                      OFF
Listener Parameter File   /u01/app/oracle/product/19.0.0/db_1/network/admin/listener.ora
Listener Log File         /u01/app/oracle/diag/tnslsnr/primary01/listener2/alert/log.xml
Listening Endpoints Summary...
  (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=192.168.0.112)(PORT=1521)))
The listener supports no services
The command completed successfully

Of course, no service registers with the listener so far. It's waiting for services to register with.

4. Show Listener Status

[oracle@primary01 ~]$ lsnrctl status listener2
...
Listening Endpoints Summary...
  (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=192.168.0.112)(PORT=1521)))
Services Summary...
Service "ORCLCDB" has 1 instance(s).
  Instance "ORCLCDB1", status READY, has 1 handler(s) for this service...
Service "ORCLCDBXDB" has 1 instance(s).
  Instance "ORCLCDB1", status READY, has 1 handler(s) for this service...
Service "b19aa2333d82dcfbe0530c2aa8c0ee3a" has 1 instance(s).
  Instance "ORCLCDB1", status READY, has 1 handler(s) for this service...
Service "orclpdb" has 1 instance(s).
  Instance "ORCLCDB1", status READY, has 1 handler(s) for this service...
The command completed successfully

All services are back.

5. Test Connection

We use SQL*Plus to test the connection of the database from a client.

C:\Users\edchen>sqlplus system/password@192.168.0.112:1521/orclcdb

SQL*Plus: Release 11.2.0.4.0 Production on Wed Dec 26 15:53:17 2020

Copyright (c) 1982, 2013, Oracle.  All rights reserved.


Connected to:
Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production

SQL> select instance_name from v$instance;

INSTANCE_NAME
----------------
ORCLCDB1

We make it.

Later on, the service of LISTENER2 will become unavailable when node 2 comes back.

Leave a Reply

Your email address will not be published. Required fields are marked *