
What Makes an Instance Shut Itself Down

Why a running instance shuts itself down can be a complicated topic, because we have to work out what the instance was reacting to. What made it decide to bail out of its current situation? Was it trying to protect itself from further damage? In the cases I have seen, the causes all point to one thing: I/O problems.

Due to ORA-19809, Space Failure

If you restart the database, the newly started instance may survive for a few seconds to a few minutes, and then it shuts itself down again due to ORA-19809.

Errors in file /u01/app/oracle/diag/rdbms/orcl/ORCL/trace/ORCL_ora_3516.trc:
ORA-19809: limit exceeded for recovery files
ORA-19804: cannot reclaim 68847616 bytes disk space from 1073741824 bytes limit

NET  (PID:3516): Error 19809 Creating archive log file to '/u01/app/oracle/fast_recovery_area/ORCL/archivelog/...
Errors in file /u01/app/oracle/diag/rdbms/orcl/ORCL/trace/ORCL_ora_3516.trc:
ORA-16038: log 1 sequence# 13 cannot be archived
ORA-00312: online log 1 thread 1: '/u01/app/oracle/oradata/ORCL/redo01.log'
USER (ospid: ): terminating the instance due to ORA error

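ORA-19809 was thrown because the Fast Recovery Area (FRA) was exhausted: ORA-19804 shows a limit of 1073741824 bytes (1 GB), and the 68847616 bytes needed could not be reclaimed. Even worse, archiving was trying to write there at the same moment, so online log 1 could not be archived (ORA-16038) and the instance was terminated. I have talked about how to resolve ORA-19809: limit exceeded for recovery files in another post.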
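To make the numbers in the ORA-19804 line easier to read, here is a minimal Python sketch that pulls the two byte counts out of the message. The function name `parse_ora_19804` is hypothetical, and the pattern only assumes the message wording shown in the excerpt above; it is a reading aid, not an Oracle tool.

```python
import re

# Pattern based only on the ORA-19804 wording seen in the alert log excerpt.
ORA_19804 = re.compile(
    r"ORA-19804: cannot reclaim (\d+) bytes disk space from (\d+) bytes limit"
)

def parse_ora_19804(line):
    """Return (bytes_needed, limit_bytes), or None if the line is not ORA-19804."""
    match = ORA_19804.search(line)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))

line = "ORA-19804: cannot reclaim 68847616 bytes disk space from 1073741824 bytes limit"
needed, limit = parse_ora_19804(line)
# prints: needed 65.7 MiB, limit 1.0 GiB
print(f"needed {needed / 1024**2:.1f} MiB, limit {limit / 1024**3:.1f} GiB")
```

In other words, a 1 GB recovery area could not free up roughly 66 MB for the next archived log.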

Due to ORA-00471, IO Failure

When ORA-00471 shows up in the alert log, the running instance has already been shut down.

Errors in file /oracle/admin/ORCL/bdump/ORCL_dbw5_43853451.trc:
ORA-07445: exception encountered: core dump [] [] [] [] [] []

Errors in file /oracle/admin/ORCL/bdump/ORCL_pmon_13500160.trc:
ORA-00471: DBWR process terminated with error

PMON: terminating instance due to error 471
Instance terminated by PMON, pid = 13500160

As we can see, PMON decided to terminate the instance because of ORA-00471. Even though we know a database writer process (dbw5) was involved, the root cause is still unknown because ORA-07445 carries no arguments. We suspected that the problem came from the I/O channels. Do you have any idea about it? Please leave a comment below.
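When an alert log is long, it can help to list which ORA errors were logged just before the termination message. The following is a rough Python sketch of that idea, not a diagnostic tool: the function name `errors_before_termination` is hypothetical, and the termination pattern only assumes the two wordings that appear in the excerpts in this post ("terminating instance" and "terminating the instance").

```python
import re

ORA_CODE = re.compile(r"ORA-\d{5}")
# Assumption: matches only the termination wordings seen in this post.
TERMINATION = re.compile(r"terminating (?:the )?instance")

def errors_before_termination(lines):
    """Return distinct ORA codes, in order, seen before the first termination line.

    Returns None if no termination line is found.
    """
    codes = []
    for line in lines:
        if TERMINATION.search(line):
            return codes
        for code in ORA_CODE.findall(line):
            if code not in codes:
                codes.append(code)
    return None

alert_log = [
    "Errors in file /oracle/admin/ORCL/bdump/ORCL_dbw5_43853451.trc:",
    "ORA-07445: exception encountered: core dump [] [] [] [] [] []",
    "Errors in file /oracle/admin/ORCL/bdump/ORCL_pmon_13500160.trc:",
    "ORA-00471: DBWR process terminated with error",
    "PMON: terminating instance due to error 471",
]
print(errors_before_termination(alert_log))
```

For the excerpt above this lists ORA-07445 first and ORA-00471 second, which matches the order of events: the writer process crashed, then PMON reacted.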

Due to ORA-63999, Disk Failure

Normally, files stored on a disk array are pretty safe, but not 100% safe. For example, losing 2 disks in the same RAID 5 group will cost you files. Let's see what happens when we lose only one data file.

Errors in file /u01/app/oracle/diag/rdbms/orcl/ORCL/trace/ORCL_ckpt_1757.trc:
ORA-63999: data file suffered media failure
ORA-01116: error in opening database file 5
ORA-01110: data file 5: '/u01/app/oracle/oradata/ORCL/example01.dbf'
ORA-27041: unable to open file
Linux-x86_64 Error: 2: No such file or directory
Additional information: 3
Errors in file /u01/app/oracle/diag/rdbms/orcl/ORCL/trace/ORCL_ckpt_1757.trc  (incident=20161):
ORA-63999 [] [] [] [] [] [] [] [] [] [] [] []
Incident details in: /u01/app/oracle/diag/rdbms/orcl/ORCL/incident/incdir_20161/ORCL_ckpt_1757_i20161.trc

Errors in file /u01/app/oracle/diag/rdbms/orcl/ORCL/trace/ORCL_mz00_2331.trc:
ORA-01110: data file 5: '/u01/app/oracle/oradata/ORCL/example01.dbf'
ORA-01565: error in identifying file '/u01/app/oracle/oradata/ORCL/example01.dbf'
ORA-27037: unable to obtain file status
Linux-x86_64 Error: 2: No such file or directory
Additional information: 7

USER (ospid: ): terminating the instance due to ORA error

As we can see, the checkpoint process (ckpt) could not open data file 5 and another process (mz00) could not even identify it, because the file no longer existed on disk. So the instance decided to shut itself down immediately.
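A quick way to confirm what the log is saying is to take the data file paths reported by ORA-01110 and check whether they still exist. Below is a minimal Python sketch under a few assumptions: the function name `missing_datafiles` is hypothetical, the pattern follows the ORA-01110 wording in the excerpt above, and the check only means something when run on the database host by an OS user who can see those paths.

```python
import os
import re

# Pattern based on the ORA-01110 wording seen in the alert log excerpt.
ORA_01110 = re.compile(r"ORA-01110: data file (\d+): '([^']+)'")

def missing_datafiles(lines, exists=os.path.exists):
    """Map file number -> path for ORA-01110 files that are not on disk.

    `exists` is injectable so the logic can be tried without a real database host.
    """
    reported = {}
    for line in lines:
        match = ORA_01110.search(line)
        if match:
            reported[int(match.group(1))] = match.group(2)
    return {number: path for number, path in reported.items() if not exists(path)}

alert_log = [
    "ORA-63999: data file suffered media failure",
    "ORA-01116: error in opening database file 5",
    "ORA-01110: data file 5: '/u01/app/oracle/oradata/ORCL/example01.dbf'",
    "ORA-27041: unable to open file",
]
for number, path in missing_datafiles(alert_log).items():
    print(f"data file {number} is missing: {path}")
```

A file that is present but unreadable would not be caught by this check; it only covers the "No such file or directory" case shown in the log.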
