Data Guard

How to Implement Fast-Start Failover 11g

Before enabling fast-start failover in data guard broker, the only required precondition is enabling Flashback Database. But before enabling Flashback Database, you must enable Flash Recovery Area (FRA). The other configurations that are not required but they might make your fast-start failover go smoothly.
  • Set FastStartFailoverTarget on a database which is the target database to be switched to play the primary role after a failover.
  • Set the protection mode to MaxAvailability.
  • Set LogXptMode to SYNC.
Assuming Data Guard Broker is well configured. Let’s enable Fast-Start Failover step by step.
  1. Make sure Flashback Database is ON in both databases.
  2. SQL> SELECT flashback_on FROM v$database;

    FLASHBACK_ON
    ------------------
    YES

  3. Make sure Flashback Database are all enabled at tablespace-level in both databases.
  4. SQL> SELECT name, flashback_on FROM v$tablespace;

    NAME                           FLA
    ------------------------------ ---
    SYSTEM                         YES
    SYSAUX                         YES
    UNDOTBS1                       YES
    USERS                          YES
    TEMP                           YES
    EXAMPLE                        YES
    UNDOTBS2                       YES

    7 rows selected.

  5. Check Data Guard Broker configuration.
  6. $ dgmgrl
    DGMGRL> connect sys/password@primdb1
    DGMGRL> show configuration verbose;

    Configuration - COMPDR

      Protection Mode: MaxPerformance
      Databases:
        primdb - Primary database
        standb - Physical standby database

    Fast-Start Failover: DISABLED

    Configuration Status:
    SUCCESS

    So far, Fast-Start Failover have not been enabled.
  7. Set a target standby database at database-level in the broker configuration.
  8. DGMGRL> EDIT DATABASE primdb SET PROPERTY FastStartFailoverTarget='standb';
    Property "faststartfailovertarget" updated
    DGMGRL> EDIT DATABASE standb SET PROPERTY FastStartFailoverTarget='primdb';
    Property "faststartfailovertarget" updated
    DGMGRL> show database verbose primdb;

    Database - primdb

      Role:            PRIMARY
      Intended State:  TRANSPORT-ON
      Instance(s):
        primdb1
        primdb2

      Properties:
        DGConnectIdentifier             = 'primdb'
        ObserverConnectIdentifier       = ''
        LogXptMode                      = 'ASYNC'
        DelayMins                       = '0'
        Binding                         = 'optional'
        MaxFailure                      = '0'
        MaxConnections                  = '1'
        ReopenSecs                      = '300'
        NetTimeout                      = '30'
        RedoCompression                 = 'DISABLE'
        LogShipping                     = 'ON'
        PreferredApplyInstance          = ''
        ApplyInstanceTimeout            = '0'
        ApplyParallel                   = 'AUTO'
        StandbyFileManagement           = 'AUTO'
        ArchiveLagTarget                = '0'
        LogArchiveMaxProcesses          = '30'
        LogArchiveMinSucceedDest        = '1'
        DbFileNameConvert               = 'standb, primdb'
        LogFileNameConvert              = ''
        FastStartFailoverTarget         = 'standb'
        StatusReport                    = '(monitor)'
        InconsistentProperties          = '(monitor)'
        InconsistentLogXptProps         = '(monitor)'
        SendQEntries                    = '(monitor)'
        LogXptStatus                    = '(monitor)'
        RecvQEntries                    = '(monitor)'
        HostName(*)
        SidName(*)
        StaticConnectIdentifier(*)
        StandbyArchiveLocation(*)
        AlternateLocation(*)
        LogArchiveTrace(*)
        LogArchiveFormat(*)
        TopWaitEvents(*)
        (*) - Please check specific instance for the property value

    Database Status:
    SUCCESS

    DGMGRL> show database verbose standb;

    Database - standb

      Role:            PHYSICAL STANDBY
      Intended State:  APPLY-ON
      Transport Lag:   0 seconds
      Apply Lag:       0 seconds
      Real Time Query: OFF
      Instance(s):
        standb1 (apply instance)
        standb2

      Properties:
        DGConnectIdentifier             = 'standb'
        ObserverConnectIdentifier       = ''
        LogXptMode                      = 'ASYNC'
        DelayMins                       = '0'
        Binding                         = 'OPTIONAL'
        MaxFailure                      = '0'
        MaxConnections                  = '1'
        ReopenSecs                      = '300'
        NetTimeout                      = '30'
        RedoCompression                 = 'DISABLE'
        LogShipping                     = 'ON'
        PreferredApplyInstance          = ''
        ApplyInstanceTimeout            = '0'
        ApplyParallel                   = 'AUTO'
        StandbyFileManagement           = 'AUTO'
        ArchiveLagTarget                = '0'
        LogArchiveMaxProcesses          = '30'
        LogArchiveMinSucceedDest        = '1'
        DbFileNameConvert               = 'primdb, standb'
        LogFileNameConvert              = ''
        FastStartFailoverTarget         = 'primdb'
        StatusReport                    = '(monitor)'
        InconsistentProperties          = '(monitor)'
        InconsistentLogXptProps         = '(monitor)'
        SendQEntries                    = '(monitor)'
        LogXptStatus                    = '(monitor)'
        RecvQEntries                    = '(monitor)'
        HostName(*)
        SidName(*)
        StaticConnectIdentifier(*)
        StandbyArchiveLocation(*)
        AlternateLocation(*)
        LogArchiveTrace(*)
        LogArchiveFormat(*)
        TopWaitEvents(*)
        (*) - Please check specific instance for the property value

    Database Status:
    SUCCESS

    If you have only one standby database, you can skip this step.
  9. Set allowable time for data applying lag between both databases.
  10. DGMGRL> EDIT CONFIGURATION SET PROPERTY FastStartFailoverLagLimit=30;
    Property "faststartfailoverlaglimit" updated

  11. Set allowable time for observer lose connection from primary database, if disconnection time exceeds the threshold, it will trigger a fast-start failover.
  12. DGMGRL> EDIT CONFIGURATION SET PROPERTY FastStartFailoverThreshold=60;
    Property "faststartfailoverthreshold" updated

  13. Open another console to start observer.
  14. $ dgmgrl
    DGMGRL> connect sys@standb1
    DGMGRL> START OBSERVER;
    Observer started

    Since the observer will listen the status of all databases after started, therefore, it’s normal for the prompt stays unreturned, and please don’t interrupt it.

    An observer is usually the third party beside primary and standby to judge the health condition of data guard, but we can choose any one of the nodes to play the role, in this case, we choose the first node (standby01.example.com) of standby RAC.

    The observer can be enabled before or after enabling fast-start failover. In our case, we choose to enable observer first.
  15. Enable fast-start failover.
  16. DGMGRL> ENABLE FAST_START FAILOVER;
    Enabled.

  17. Check the broker configuration.
  18. DGMGRL> show configuration verbose;

    Configuration - COMPDR

      Protection Mode: MaxPerformance
      Databases:
        primdb - Primary database
        standb - (*) Physical standby database

      (*) Fast-Start Failover target

    Fast-Start Failover: ENABLED

      Threshold:        60 seconds
      Target:           standb
      Observer:         standby01.example.com
      Lag Limit:        30 seconds
      Shutdown Primary: TRUE
      Auto-reinstate:   TRUE

    Configuration Status:
    SUCCESS

  19. Check Fast-Start Failover configuration.
  20. DGMGRL> SHOW FAST_START FAILOVER;

    Fast-Start Failover: ENABLED

      Threshold:        60 seconds
      Target:           standb
      Observer:         standby01.example.com
      Lag Limit:        30 seconds
      Shutdown Primary: TRUE
      Auto-reinstate:   TRUE

    Configurable Failover Conditions
      Health Conditions:
        Corrupted Controlfile          YES
        Corrupted Dictionary           YES
        Inaccessible Logfile            NO
        Stuck Archiver                  NO
        Datafile Offline               YES

      Oracle Error Conditions:
        (none)

    All of the configurations are correctly set.

When fast-start is ready, let’s simulate a blackout on primary database to test fast-failover function. Our goal is to failover to standby database by the broker when primary database is blackout, and then reinstate the latter standby database automatically after it goes to mount state.
  1. Keep observer console on the screen.
  2. DGMGRL> start observer;
    Observer started

  3. Check current applying lag on standby database.
  4. SQL> column name format a10;
    SQL> column value format a12;
    SQL> column datum_time format a20;
    SQL> column time_computed format a20;
    SQL> SELECT name, value, datum_time, time_computed FROM V$DATAGUARD_STATS WHERE name like 'apply lag';

    NAME       VALUE        DATUM_TIME           TIME_COMPUTED
    ---------- ------------ -------------------- --------------------
    apply lag  +00 00:00:00 11/03/2012 10:13:20  11/03/2012 10:13:35

  5. Shutdown abort the primary database to simulate a blackout.
  6. $ srvctl stop database -d compdb -o abort
    After waiting for 60 seconds, the observer is displaying failover message.
  7. Watch the failover process on the observer.
  8. ...
    10:14:22.73  Saturday, November 03, 2012
    Initiating Fast-Start Failover to database "standb"...
    Performing failover NOW, please wait...
    Failover succeeded, new primary is "standb"
    10:15:44.17  Saturday, November 03, 2012

    You can see the failover is successful.
  9. Manually startup current standby (former primary) database to mount state.
  10. Currently, the current standby database is down due to a blackout. The broker will not perform reinstatement on it until it’s in the mount state.
    $ srvctl start database -d compdb -o mount
  11. Keep watching the failover process on observer, you can see the broker is trying to reinstate current standby database.
  12. ...
    10:17:33.30  Saturday, November 03, 2012
    Initiating reinstatement for database "primdb"...
    Reinstating database "primdb", please wait...
    Operation requires shutdown of instance "primdb2" on database "primdb"
    Shutting down instance "primdb2"...
    ORA-01109: database not open

    Database dismounted.
    ORACLE instance shut down.
    Operation requires startup of instance "primdb2" on database "primdb"
    Starting instance "primdb2"...
    ORACLE instance started.
    Database mounted.
    Continuing to reinstate database "primdb" ...
    Reinstatement of database "primdb" succeeded
    10:20:45.15  Saturday, November 03, 2012

    It shows the reinstatement is successful, the data guard architecture is sustained and data synchronization is back to normal.
  13. Restart current standby database (the former primary database).
  14. Since the broker startup only one instance of current standby RAC after reinstatement to receive and apply redo logs, you would be better to restart all instances of the current standby database to mount state.
    $ srvctl stop database -d compdb
    $ srvctl start database -d compdb -o mount

  15. Check status after a successful fast-start failover.
    1. On current primary database
    2. SQL> select inst_id, open_mode, database_role, switchover_status from gv$database;

         INST_ID OPEN_MODE            DATABASE_ROLE    SWITCHOVER_STATUS
      ---------- -------------------- ---------------- --------------------
               2 READ WRITE           PRIMARY          TO STANDBY
               1 READ WRITE           PRIMARY          TO STANDBY

    3. On current standby database
    4. SQL> select inst_id, open_mode, database_role, switchover_status from gv$database;

         INST_ID OPEN_MODE            DATABASE_ROLE    SWITCHOVER_STATUS
      ---------- -------------------- ---------------- --------------------
               2 MOUNTED              PHYSICAL STANDBY NOT ALLOWED
               1 MOUNTED              PHYSICAL STANDBY NOT ALLOWED

    5. Overall Data Guard Broker status.
    6. DGMGRL> SHOW FAST_START FAILOVER;

      Fast-Start Failover: ENABLED

        Threshold:        60 seconds
        Target:           primdb
        Observer:         standby01.example.com
        Lag Limit:        30 seconds
        Shutdown Primary: TRUE
        Auto-reinstate:   TRUE

      Configurable Failover Conditions
        Health Conditions:
          Corrupted Controlfile          YES
          Corrupted Dictionary           YES
          Inaccessible Logfile            NO
          Stuck Archiver                  NO
          Datafile Offline               YES

        Oracle Error Conditions:
          (none)

      DGMGRL> SHOW CONFIGURATION VERBOSE;

      Configuration - COMPDR

        Protection Mode: MaxPerformance
        Databases:
          standb - Primary database
          primdb - (*) Physical standby database

        (*) Fast-Start Failover target

      Fast-Start Failover: ENABLED

        Threshold:        60 seconds
        Target:           primdb
        Observer:         standby01.example.com
        Lag Limit:        30 seconds
        Shutdown Primary: TRUE
        Auto-reinstate:   TRUE

      Configuration Status:
      SUCCESS

      DGMGRL> SHOW DATABASE VERBOSE primdb;

      Database - primdb

        Role:            PHYSICAL STANDBY
        Intended State:  APPLY-ON
        Transport Lag:   0 seconds
        Apply Lag:       0 seconds
        Real Time Query: OFF
        Instance(s):
          primdb1
          primdb2 (apply instance)

        Properties:
          DGConnectIdentifier             = 'primdb'
          ObserverConnectIdentifier       = ''
          LogXptMode                      = 'ASYNC'
          DelayMins                       = '0'
          Binding                         = 'optional'
          MaxFailure                      = '0'
          MaxConnections                  = '1'
          ReopenSecs                      = '300'
          NetTimeout                      = '30'
          RedoCompression                 = 'DISABLE'
          LogShipping                     = 'ON'
          PreferredApplyInstance          = ''
          ApplyInstanceTimeout            = '0'
          ApplyParallel                   = 'AUTO'
          StandbyFileManagement           = 'AUTO'
          ArchiveLagTarget                = '0'
          LogArchiveMaxProcesses          = '30'
          LogArchiveMinSucceedDest        = '1'
          DbFileNameConvert               = 'standb, primdb'
          LogFileNameConvert              = ''
          FastStartFailoverTarget         = 'standb'
          StatusReport                    = '(monitor)'
          InconsistentProperties          = '(monitor)'
          InconsistentLogXptProps         = '(monitor)'
          SendQEntries                    = '(monitor)'
          LogXptStatus                    = '(monitor)'
          RecvQEntries                    = '(monitor)'
          HostName(*)
          SidName(*)
          StaticConnectIdentifier(*)
          StandbyArchiveLocation(*)
          AlternateLocation(*)
          LogArchiveTrace(*)
          LogArchiveFormat(*)
          TopWaitEvents(*)
          (*) - Please check specific instance for the property value

      Database Status:
      SUCCESS

      DGMGRL> SHOW DATABASE VERBOSE standb;

      Database - standb

        Role:            PRIMARY
        Intended State:  TRANSPORT-ON
        Instance(s):
          standb1
          standb2

        Properties:
          DGConnectIdentifier             = 'standb'
          ObserverConnectIdentifier       = ''
          LogXptMode                      = 'ASYNC'
          DelayMins                       = '0'
          Binding                         = 'OPTIONAL'
          MaxFailure                      = '0'
          MaxConnections                  = '1'
          ReopenSecs                      = '300'
          NetTimeout                      = '30'
          RedoCompression                 = 'DISABLE'
          LogShipping                     = 'ON'
          PreferredApplyInstance          = ''
          ApplyInstanceTimeout            = '0'
          ApplyParallel                   = 'AUTO'
          StandbyFileManagement           = 'AUTO'
          ArchiveLagTarget                = '0'
          LogArchiveMaxProcesses          = '30'
          LogArchiveMinSucceedDest        = '1'
          DbFileNameConvert               = 'primdb, standb'
          LogFileNameConvert              = ''
          FastStartFailoverTarget         = 'primdb'
          StatusReport                    = '(monitor)'
          InconsistentProperties          = '(monitor)'
          InconsistentLogXptProps         = '(monitor)'
          SendQEntries                    = '(monitor)'
          LogXptStatus                    = '(monitor)'
          RecvQEntries                    = '(monitor)'
          HostName(*)
          SidName(*)
          StaticConnectIdentifier(*)
          StandbyArchiveLocation(*)
          AlternateLocation(*)
          LogArchiveTrace(*)
          LogArchiveFormat(*)
          TopWaitEvents(*)
          (*) - Please check specific instance for the property value

      Database Status:
      SUCCESS

      DGMGRL>

    After situations are all cleared on the primary side, we can switchback to the production environment to keep the data guard architecture working.
  16. Switchback to production environment.
  17. DGMGRL> switchover to primdb;
    Performing switchover NOW, please wait...
    Operation requires a connection to instance "primdb2" on database "primdb"
    Connecting to instance "primdb2"...
    Connected.
    Operation requires shutdown of instance "standb2" on database "standb"
    Shutting down instance "standb2"...
    Database closed.
    Database dismounted.
    ORACLE instance shut down.
    New primary database "primdb" is opening...
    Operation requires shutdown of instance "standb1" on database "standb"
    Shutting down instance "standb1"...
    ORA-01109: database not open

    Database dismounted.
    ORACLE instance shut down.
    Operation requires startup of instance "standb1" on database "standb"
    Starting instance "standb1"...
    ORACLE instance started.
    Database mounted.
    Switchover succeeded, new primary is "primdb"
    DGMGRL> show configuration;

    Configuration - COMPDR

    Protection Mode: MaxPerformance
    Databases:
    primdb - Primary database
    standb - (*) Physical standby database

    Fast-Start Failover: ENABLED

    Configuration Status:
    SUCCESS


If you do enable Flashback Database, but leave one or more tablespaces to be disabled on the feature, the reinstatement will fail like this:
...
21:30:01.61  Saturday, November 03, 2012
Initiating reinstatement for database "primdb"...
Reinstating database "primdb", please wait...
Error: ORA-16653: failed to reinstate database

Failed.
Reinstatement of database "primdb" failed

Though the failover is successful, but the reinstatement is failed.

It’s usually enough to do a successful fast-start failover on MaxPerformance mode, but if you have availability issues, you can change the mode into MaxAvailability. DGMGRL> DISABLE FAST_START FAILOVER;
Disabled.
DGMGRL> EDIT DATABASE 'primdb' SET PROPERTY LogXptMode=SYNC;
Property "logxptmode" updated
DGMGRL> EDIT DATABASE 'standb' SET PROPERTY LogXptMode=SYNC;
Property "logxptmode" updated
DGMGRL> EDIT CONFIGURATION SET PROTECTION MODE AS MaxAvailability;
Succeeded.
DGMGRL> ENABLE FAST_START FAILOVER;
Enabled.
DGMGRL> show configuration verbose

Configuration - COMPDR

  Protection Mode: MaxAvailability
  Databases:
    primdb - Primary database
    standb - (*) Physical standby database

  (*) Fast-Start Failover target

Fast-Start Failover: ENABLED

  Threshold:        60 seconds
  Target:           standb
  Observer:         standby01.example.com
  Lag Limit:        30 seconds (not in use)
  Shutdown Primary: TRUE
  Auto-reinstate:   TRUE

Configuration Status:
SUCCESS

DGMGRL> show database verbose primdb

Database - primdb

  Role:            PRIMARY
  Intended State:  TRANSPORT-ON
  Instance(s):
    primdb1
    primdb2

  Properties:
    DGConnectIdentifier             = 'primdb'
    ObserverConnectIdentifier       = ''
    LogXptMode                      = 'sync'
    DelayMins                       = '0'
    Binding                         = 'optional'
    MaxFailure                      = '0'
    MaxConnections                  = '1'
    ReopenSecs                      = '300'
    NetTimeout                      = '30'
    RedoCompression                 = 'DISABLE'
    LogShipping                     = 'ON'
    PreferredApplyInstance          = ''
    ApplyInstanceTimeout            = '0'
    ApplyParallel                   = 'AUTO'
    StandbyFileManagement           = 'AUTO'
    ArchiveLagTarget                = '0'
    LogArchiveMaxProcesses          = '30'
    LogArchiveMinSucceedDest        = '1'
    DbFileNameConvert               = 'standb, primdb'
    LogFileNameConvert              = ''
    FastStartFailoverTarget         = 'standb'
    StatusReport                    = '(monitor)'
    InconsistentProperties          = '(monitor)'
    InconsistentLogXptProps         = '(monitor)'
    SendQEntries                    = '(monitor)'
    LogXptStatus                    = '(monitor)'
    RecvQEntries                    = '(monitor)'
    HostName(*)
    SidName(*)
    StaticConnectIdentifier(*)
    StandbyArchiveLocation(*)
    AlternateLocation(*)
    LogArchiveTrace(*)
    LogArchiveFormat(*)
    TopWaitEvents(*)
    (*) - Please check specific instance for the property value

Database Status:
SUCCESS

DGMGRL> show database verbose standb

Database - standb

  Role:            PHYSICAL STANDBY
  Intended State:  APPLY-ON
  Transport Lag:   0 seconds
  Apply Lag:       0 seconds
  Real Time Query: OFF
  Instance(s):
    standb1 (apply instance)
    standb2

  Properties:
    DGConnectIdentifier             = 'standb'
    ObserverConnectIdentifier       = ''
    LogXptMode                      = 'sync'
    DelayMins                       = '0'
    Binding                         = 'OPTIONAL'
    MaxFailure                      = '0'
    MaxConnections                  = '1'
    ReopenSecs                      = '300'
    NetTimeout                      = '30'
    RedoCompression                 = 'DISABLE'
    LogShipping                     = 'ON'
    PreferredApplyInstance          = ''
    ApplyInstanceTimeout            = '0'
    ApplyParallel                   = 'AUTO'
    StandbyFileManagement           = 'AUTO'
    ArchiveLagTarget                = '0'
    LogArchiveMaxProcesses          = '30'
    LogArchiveMinSucceedDest        = '1'
    DbFileNameConvert               = 'primdb, standb'
    LogFileNameConvert              = ''
    FastStartFailoverTarget         = 'primdb'
    StatusReport                    = '(monitor)'
    InconsistentProperties          = '(monitor)'
    InconsistentLogXptProps         = '(monitor)'
    SendQEntries                    = '(monitor)'
    LogXptStatus                    = '(monitor)'
    RecvQEntries                    = '(monitor)'
    HostName(*)
    SidName(*)
    StaticConnectIdentifier(*)
    StandbyArchiveLocation(*)
    AlternateLocation(*)
    LogArchiveTrace(*)
    LogArchiveFormat(*)
    TopWaitEvents(*)
    (*) - Please check specific instance for the property value

Database Status:
SUCCESS


Fast-start failover can control the steps of failover automatically, it’s fast and easy to setup, and almost no human intervention is needed during failover. As a DBA, you must watch out a false failover very closely, a false failover due to temporary network problems results an unnecessary failover and it costs your time to recover or maybe your credits.

In case a false failover happens during a planned shutdown, you should take one of the following preventive steps to secure the planned shutdown:
  • Disable fast-start failover.
  • Remain fast-start failover enabled, but stop the observer.
  • Remain fast-start failover enabled and observer started, but shutdown the standby database first.

Leave a Reply

Your email address will not be published. Required fields are marked *