Data Guard

How to Resolve ORA-16649 after a Fast-Start Failover

After a fast-start failover, the broker will try to reinstate the standby database (former primary database) for sustaining data guard architecture. While reinstating, the broker startup only one node of RAC to perform transportation service, which will remain the other node of RAC in mount state. Even when the primary role has been switched back, there’s still one node serviceable to public. If you try to open the mounted instance, it will throw ORA-16649 error.

Let’s see the problem on the primary side after a fast-start failover, reinstatement and then a switchback.
  1. Login to the primary database, and check the database status by instance.
  2. $ sqlplus / as sysdba
    SQL> select inst_id, open_mode, database_role, switchover_status from gv$database;

       INST_ID OPEN_MODE            DATABASE_ROLE    SWITCHOVER_STATUS
    ---------- -------------------- ---------------- --------------------
             2 MOUNTED              PRIMARY          NOT ALLOWED
             1 READ WRITE           PRIMARY          TO STANDBY

    One instance is still in mount state, even you restart the clusterware. It seems the broker won’t let it open.
  3. Try to open the mounted instance #2.
  4. SQL> alter database open;
    alter database open
    *
    ERROR at line 1:
    ORA-16649: possible failover to another database prevents this database from being opened

    It did show more valuable information about the wrong status of database and prevent it from being opened, I suspect the broker blocks the way to open.
  5. Disable data guard broker.
  6. $ srvctl stop database -d compdb
    $ srvctl start database -d compdb -o nomount
    $ sqlplus / as sysdba
    SQL> alter system set dg_broker_start=FALSE scope=spfile;

    System altered.

  7. Bounce the primary database normally.
  8. $ srvctl stop database -d compdb
    $ srvctl start database -d compdb
    $ sqlplus / as sysdba
    SQL> select inst_id, open_mode, database_role, switchover_status from gv$database;

       INST_ID OPEN_MODE            DATABASE_ROLE    SWITCHOVER_STATUS
    ---------- -------------------- ---------------- --------------------
             1 READ WRITE           PRIMARY          TO STANDBY
             2 READ WRITE           PRIMARY          TO STANDBY

    The problematic instance is opened, but the broker is disabled currently, we should try to bring the broker back to be functional.
  9. Switch logfile on every instance of the primary RAC.
  10. SQL> alter system switch logfile;

    System altered.

    Then wait for the standby database applies the new received logfile. This may take a few seconds.
  11. Enable data guard broker for fixing up the broker.
  12. SQL> alter system set dg_broker_start=TRUE scope=both;

    System altered.

    Currently, the broker still has problems, we need go further.
  13. Optionally, disable fast-start failover before re-enabling the broker.
  14. If your fast-start failover is not enabled, you can skip this step.
    DGMGRL> disable fast_start failover;
    Disabled.
    DGMGRL> show configuration;

    Configuration - COMPDR

      Protection Mode: MaxPerformance
      Databases:
        primdb - Primary database
        standb - Physical standby database

    Fast-Start Failover: DISABLED

    Configuration Status:
    ORA-16610: command "Broker automatic health check" in progress
    DGM-17017: unable to determine configuration status

    The broker is still confused with current status.
  15. Re-enable the configuration.
  16. DGMGRL> disable configuration;
    Disabled.
    DGMGRL> enable configuration;
    Enabled.
    DGMGRL> show configuration;

    Configuration - COMPDR

      Protection Mode: MaxPerformance
      Databases:
        primdb - Primary database
        standb - Physical standby database

    Fast-Start Failover: DISABLED

    Configuration Status:
    SUCCESS

    DGMGRL> show database primdb;

    Database - primdb

      Role:            PRIMARY
      Intended State:  TRANSPORT-ON
      Instance(s):
        primdb1
        primdb2

    Database Status:
    SUCCESS

    DGMGRL> show database standb;

    Database - standb

      Role:            PHYSICAL STANDBY
      Intended State:  APPLY-ON
      Transport Lag:   0 seconds
      Apply Lag:       0 seconds
      Real Time Query: OFF
      Instance(s):
        standb1 (apply instance)
        standb2

    Database Status:
    SUCCESS

    The broker is back to normal.
  17. Enable fast-start failover, this is an optional step.
  18. DGMGRL> enable fast_start failover;
    Enabled.

  19. Restart the primary database normally, then check the status.
  20. $ srvctl stop database -d compdb
    $ srvctl start database -d compdb
    $ sqlplus / as sysdba
    SQL> select inst_id, open_mode, database_role, switchover_status from gv$database;

       INST_ID OPEN_MODE            DATABASE_ROLE    SWITCHOVER_STATUS
    ---------- -------------------- ---------------- --------------------
             1 READ WRITE           PRIMARY          TO STANDBY
             2 READ WRITE           PRIMARY          TO STANDBY

    There is no problem now.
I think the key point in the whole issue is that the broker got confused after a failover, you have to restart and re-enable the broker in order to make the database normal.

Leave a Reply

Your email address will not be published. Required fields are marked *