I am sure there have been countless how-to’s, documents, and etc written on the topic, but recent occurrences have provided me the drive to do a quick blog on it. This is definitely something that an Oracle DBA showing over 20 years of experience on their resume should know.
Archivelog space filling up elicits the same symptoms as one of your table spaces filling up, none of which are any good.
A few fun facts:
- The archivelogs are filled groups of redo log files that are moved to an offline location for archiving, the archive destination. The query sample provided below contains a view with a plethora of other pertinent information.
- These archivelogs are backed up and purged along with regularly scheduled RMAN backups
- If your archivelog is suddenly filling up, take a look at your backups as they are probably failing, thus the archivelog is not getting purged out
- The archivelog destination being filled up will elicit some interesting errors depending on one’s Oracle release, but will result in no transactions or connections being allowed by the database
It is a good idea to monitor your archivelog destination size, along with table space sizes, this is a redundant check of one’s backups, especially if you have another group or team that handles that portion. These are proactive steps that will prevent a loss of service.
Archive destination query:
- Select name, round(space_limit/1048576), round(space_used/1048576), round(space_used/1048576)*100/round(space_limit/1048576) from v$recovery_file_dest;
- This can be put into a daily job to email the results
If your archive destination is full, or about to be full, there are a couple possible scenarios here and the last 2 involve losing your archivelogs and are for more extreme situations, for instance an outage situation:
- Re-run a failed backup – this will backup your archivelogs and is the best path
- If you have the time and no outage has occurred, but since you have followed my recommendation to run a periodic alert and report on the archive destination space, you are golden, right?
- Connect to the instance with rman and delete the archive logs
- Connect to the ASM instance with the ASM command line utility and delete them manually
- This requires additional cleanup work, but very fast recovery time
Option 1 is simple and straightforward with the other 2 being a little more involved and carrying complications. I have mentioned them in the order of best path to follow if in this situation. We will now get into options 2 and 3 in more detail.
Option 2: This is fairly straightforward and will work most of the time, however note that you are deleting your archived redo logs. In most cases this not a huge issue and mitigates the outage. Make sure your environmental settings are correct and perform the following at a Linux/Unix command line:
- rman target /
- backup archivelog all delete input;
This does exactly how it sounds, you could backup them up to another location of course, however if pressed for time and the database is in a valid state this is quick.
Option 3: This option is in the rare instances that last resort option 2 fails to work. In a RAC environment this will need to be done as the grid user and with the ORACLE_SID set as the ASM SID using the ASM asmcmd command line utility. At a Linux/Unix command line with the environmental settings correct:
- asmcmd> cd /<FLASH_RECOVERY_LOCA>/<SID>/<DATE>
- asmcmd> rm thr*
This will delete all the threads within the dated folder, however there is another set of steps to cleanup after this, which is to get rman to realize these archivelogs are gone:
- set environment as user oracle for the target SID
- rman target /
- crosscheck archivelog all;
- delete expired archivelog all;
Option 3 is generally my last resort and have only had to use it a handful of times when the space was completely filled up and no connections at all could be made on 11gR2. I have not had to do this with 12c to date.
Getting to the reason for this post, I have seen a DBA with +20 years of experience working on a project for a testing environment where to alleviate the strain of RMAN backups and the storage requirements they disabled RMAN backups.
The goal was to disable RMAN backups completely and rely on a few days of datapumps, it is just a test environment was the thought here, so the DBA in question proceeded to disable all RMAN backups and etc without realizing, possibly knowing or whatever the reason, but they did not take the databases out of archivelog mode. This meant that with daily RMAN backups not occurring anymore that the archivelog destination proceeded to fill up over a few days time.
Not a good situation.