Can not allocate log, archival required
Saw an error-like message in alert log:
ORACLE Instance ORCL - Can not allocate log, archival required
Actually, this is not an error, this message implies LGWR is waiting for archival completion of next redo log in turn. This kind of waiting could be from one of the following reasons:
Respective section is described below.
A. Archiving not started (ARCH)
For Oracle 9i, the archiving processes, also called ARCH, ARCn specifically, needs to be started explicitly even though the database is in archivelog mode. Therefore, you can start the archiving manually like this:
SQL> archive log start;
It looks weird? Yes, it does. For newer versions of database (10g onward), there's no need to start archiving separately. It will do it automatically. As an experienced DBA, I think you should know this in order to do more troubleshootings.
If you fell into this case, this message is truly an error. You have to fix it. Because nobody is able to move a muscle in this database until the archival is back.
B. Temporary IO bottleneck
If you fell into this case, this message is sort of a warning or notification, not an error. You may ignore it.
When the database is dumping or migrating data, the archiving could go faster and faster. In this moment, LGWR is also busy switching logs and possibly waiting for next log file to be released from archiving. That's why you saw this message repeating itself in alert log.
As such, you should keep tracking the rate of archiving, there could be tremedous amount of archived logs generated in a short period. The server could face serious disk IO bottleneck right now. Later on you should try to improve IO performance in order to reduce any waiting time.
Here I list some candidates that you can consider to improve at low cost.
- Async IO: It is kind of a writing buffer of disks, which can let processes write their data to this buffer and return immediately as if they have finished their disk writing. You should enable it whenever it's possible.
- Redo log size: If the size of redo logs is small, LGWR may be busy switching log files, which could cause a lot of waits. Please try to raise the size to avoid frequent switching.
- Parallelism: You might have set a high parallelism on your job before running. Probably it's too high to make disk IO work smoothly. Please try to find out the optimized number of parallelism.
- Less redo logs: You can turn off logging by alter table nologging + insert append hint to ease the database server. It will generate very little redo logs but with only one exception. force logging at database level will suppress any other nologging at object level and make them useless.
- Data pump: I talked about how to optimize data pump in the post - Tips to Speed Up Data Pump Export or Import. You can adopt some useful tips for your needs from this post.
For your references on disk IO monitoring:
C. Data Guard
In ASYNC + ARCH mode of data guard, the primary database was waiting for the archiving to complete on the standby database. Mostly, the bandwidth of network could be the major issue of the problem.