30 Performing User-Managed Database Flashback and Recovery
A user-managed backup and recovery strategy means a method that does not depend on RMAN. Use the flashback features of Oracle Database in a user-managed backup and recovery strategy.
30.1 Performing Flashback Database with SQL*Plus
You can use SQL*Plus to perform flashback database operations on multitenant container databases (CDBs) and pluggable databases (PDBs). Oracle Flashback Database returns your entire database or an entire PDB to a previous state without requiring you to restore files from backup.
Flashback Database requires you to create a fast recovery area for your database and enable the collection of flashback logs. The requirements and preparations for flashback database are the same whether you use RMAN or SQL*Plus.
See Also:
-
Performing Flashback and Database Point-in-Time Recovery for details about how the Flashback Database feature works, requirements for using Flashback Database, and how to enable the collection of flashback logs required for Flashback Database
30.1.1 Performing Flashback Database of CDBs with SQL*Plus
Use the SQL*Plus FLASHBACK DATABASE
command to return a
whole multitenant container database (CDB) to a prior state. This command performs the same
function as the RMAN FLASHBACK DATABASE
command.
Prerequisites:
The prerequisites for performing a flashback database operation are described in "Prerequisites for Flashback Database and Restore Points".
See Also:
Performing Flashback and Database Point-in-Time Recovery for details about how the Flashback Database feature works, requirements for using Flashback Database, and how to enable the collection of flashback logs required for Flashback Database
30.1.2 Performing Flashback Database of PDBs with SQL*Plus
Use the SQL*Plus FLASHBACK
DATABASE
command to return a specific pluggable database (PDB) to a prior state. The remaining PDBs in the multitenant container database (CDB) are not impacted by the flashback operation on a single PDB.
FLASHBACK DATABASE
command performs the same function as the RMAN FLASHBACK DATABASE
command.
To perform a flashback of a PDB using SQL*Plus:
Note:
Flashback operations on a proxy PDB are not supported.
See Also:
Performing Flashback and Database Point-in-Time Recovery for details about how the Flashback Database feature works, requirements for using Flashback Database, and how to enable the collection of flashback logs required for Flashback Database
30.2 Overview of User-Managed Media Recovery
This section provides an overview of recovery with SQL*Plus. This section contains the following topics:
30.2.1 About User-Managed Restore and Recovery
Typically, you restore a file when a media failure or user error has damaged or deleted one or more data files. In a user-managed restore operation, you use an operating system utility to restore a backup of the file.
If a media failure affects data files, then the recovery procedure depends on:
-
The archiving mode of the database:
ARCHIVELOG
orNOARCHIVELOG
-
The type of media failure
-
The files affected by the media failure (data files, control files, archived redo logs, and the server parameter file are all candidates for restore operations)
If either a permanent or temporary media failure affects any data files of a database operating in NOARCHIVELOG
mode, then the database automatically shuts down. If the media failure is temporary, then correct the underlying problem and restart the database. Usually, crash recovery recovers all committed transactions from the online redo log. If the media failure is permanent, then recover the database as described in "Recovering a Database in NOARCHIVELOG Mode".
Table 30-1 explains the implications for media recovery when you lose files in a database that runs in ARCHIVELOG
mode.
Table 30-1 User-Managed Restore Operations
If You Lose... | Then... |
---|---|
Data files in the |
The database automatically shuts down. If the hardware problem is temporary, then fix it and restart the database. Usually, crash recovery recovers lost transactions. If the hardware problem is permanent, then restore the data files from backups and recover the database as described in "Performing Closed Database Recovery". |
Data files not in the |
Affected data files are taken offline, but the database stays open. If the unaffected portions of the database must remain available, then do not shut down the database. Take tablespaces containing problem data files offline using the temporary option, then recover them as described in "Performing Open Database Recovery". |
All copies of the current control file |
You must restore a backup control file and then open the database with the If you do not have a backup, then you can attempt to re-create the control file. If possible, use the script included in the |
One copy of a multiplexed control file |
Copy an intact multiplexed control file into the location of the damaged or missing control file and open the database. If you cannot copy the control file to its original location, then edit the initialization parameter file to reflect a new location or remove the damaged control file. Then, open the database. |
One or more archived logs required for media recovery |
You must restore backups of these archived logs for recovery to proceed. You can restore either to the default or nondefault location. If you do not have backups, then you must perform incomplete recovery up to an SCN before the first missing redo log and open |
The server parameter file (SPFILE) |
If you have a backup of the server parameter file, then restore it. Alternatively, if you have a backup of the client-side initialization parameter file, then you can restore a backup of this file, start the instance, and then re-create the server parameter file. |
Note:
Restore and recovery of Oracle Managed Files is no different from restore and recovery of user-named files.
To perform media recovery, Oracle recommends that you use the RECOVER
statement in SQL*Plus. You can also use the SQL statement ALTER
DATABASE
RECOVER
, but the RECOVER
statement is often simpler. To start any type of media recovery, you must adhere to the following restrictions:
-
You must have administrator privileges.
-
All recovery sessions must be compatible.
-
One session cannot start complete media recovery while another performs incomplete media recovery.
-
You cannot start media recovery if you are connected to the database through a shared server process.
30.2.2 Automatic Recovery with the RECOVER Command
When using SQL*Plus to perform media recovery, the easiest strategy is to perform automatic recovery with the SQL*Plus RECOVER
command. Automatic recovery initiates recovery without manually prompting SQL*Plus to apply each individual archived redo log.
When using SQL*Plus, you have the following options for automating the application of the default file names of archived redo logs needed during recovery:
-
Issuing
SET
AUTORECOVERY
ON
before issuing theRECOVER
command. If you perform recovery withSET
AUTORECOVERY
OFF
, which is the default, then you must enter file names manually or accept the suggested file name by pressing Enter. -
Specifying the
AUTOMATIC
keyword as an option of theRECOVER
command.
In either case, no interaction is required when you issue the RECOVER
command if the necessary files are in the correct locations with the correct names. When the database successfully applies a redo log file, the following message is returned:
Log applied.
You are then prompted for the next redo log in the sequence. If the most recently applied log is the last required log, then recovery is terminated.
The file names used for automatic recovery are derived from the concatenated values of LOG_ARCHIVE_FORMAT
with LOG_ARCHIVE_DEST_
n
, where n
is the highest value among all enabled, local destinations. For example, assume that the following initialization parameter settings are in effect in the database instance:
LOG_ARCHIVE_DEST_1 = "LOCATION=/arc_dest/loc1/" LOG_ARCHIVE_DEST_2 = "LOCATION=/arc_dest/loc2/" LOG_ARCHIVE_DEST_STATE_1 = DEFER LOG_ARCHIVE_DEST_STATE_2 = ENABLE LOG_ARCHIVE_FORMAT = arch_%t_%s_%r.arc
In this example, SQL*Plus automatically suggests the file name /arc_dest/loc2/arch_%t_%s_%r.arc
(where %t
is the thread, %s
is the sequence and %r
is the resetlogs ID).
30.2.2.1 Automatic Recovery with SET AUTORECOVERY
After restoring data file backups, you can run the SET
AUTORECOVERY
ON
command to enable automatic recovery. For example, you could enter the following commands in SQL*Plus to perform automatic recovery and open the database:
STARTUP MOUNT SET AUTORECOVERY ON RECOVER DATABASE ALTER DATABASE OPEN;
Note:
After issuing the SQL*Plus RECOVER
command, you can view all files that have been considered for recovery in the V$RECOVERY_FILE_STATUS
view. You can access status information for each file in the V$RECOVERY_STATUS
view. These views are not accessible after you terminate the recovery session.
30.2.2.2 Automatic Recovery with the AUTOMATIC Option of the RECOVER Command
Besides using SET
AUTORECOVERY
to turn on automatic recovery, you can also simply specify the AUTOMATIC
keyword in the RECOVER
command. For example, you could enter the following commands in SQL*Plus to perform automatic recovery and open the database:
STARTUP MOUNT RECOVER AUTOMATIC DATABASE ALTER DATABASE OPEN;
If you use an Oracle Real Application Clusters configuration, and if you are performing incomplete recovery or using a backup control file, then the database can only compute the name of the first archived redo log file from the first redo thread. You may have to manually apply the first log file from the other redo threads. After the first log file in a given thread has been supplied, the database can suggest the names of the subsequent logs in this thread.
30.2.3 Recovery When Archived Logs Are in the Default Location
No additional setup is required to perform recovery when the archived redo log files are present in the default location.
During recovery, as a log is needed, the database suggests the file name. If you run nonautomatic media recovery with SQL*Plus, then the output is displayed in the format shown by this example:
ORA-00279: change 53577 generated at 11/26/02 19:20:58 needed for thread 1 ORA-00289: suggestion : /oracle/oradata/trgt/arch/arcr_1_802.arc ORA-00280: change 53577 for thread 1 is in sequence #802 Specify log: [<RET> for suggested | AUTO | FROM logsource | CANCEL ]
Similar messages are returned when you use an ALTER
DATABASE
...
RECOVER
statement. However, no prompt is displayed.
The database constructs suggested archived log file names by concatenating the current values of the initialization parameters LOG_ARCHIVE_DEST_
n
(where n
is the highest value among all enabled, local destinations) and LOG_ARCHIVE_FORMAT
and using log history data from the control file. The following are possible settings:
LOG_ARCHIVE_DEST_1 = 'LOCATION = /oracle/oradata/trgt/arch/' LOG_ARCHIVE_FORMAT = arcr_%t_%s.arc SELECT NAME FROM V$ARCHIVED_LOG; NAME ---------------------------------------- /oracle/oradata/trgt/arch/arcr_1_467.arc /oracle/oradata/trgt/arch/arcr_1_468.arc /oracle/oradata/trgt/arch/arcr_1_469.arc
Thus, if all the required archived log files are present at the LOG_ARCHIVE_DEST_1
destination, and if the value for LOG_ARCHIVE_FORMAT
is never altered, then the database can suggest and apply log files to complete media recovery automatically.
30.2.4 Recovery When Archived Logs Are in a Nondefault Location
To perform media recovery when archived redo log files are stored in a nondefault location, you must specify the location of archived redo log files.
You have the following mutually exclusive options when performing media recovery when archived logs are not in their default location:
-
Edit the
LOG_ARCHIVE_DEST_
n
parameter that specifies the location of the archived redo logs, then recover as usual.This task is described in Resetting the Archived Log Destination.
-
Use the
SET
statement in SQL*Plus to specify the nondefault log location before recovery, or theLOGFILE
parameter of theRECOVER
command.This task is described in Overriding the Archived Log Destination.
30.2.4.2 Overriding the Archived Log Destination
In some cases, you may want to override the current setting for the archiving destination parameter as a source for archived log files.
To recover archived logs in a nondefault location with SET LOGSOURCE:
Note:
Overriding the redo log source does not affect the archive redo log destination for online redo log groups being archived.
30.2.5 Recovery Using Storage Snapshot Optimization
Storage Snapshot Optimization enables you to use third-party snapshots of the database, taken when the database is not in backup mode, to recover the database either to the current time or to a specified point in time after the snapshot was created. If the database was not placed in backup mode when the storage snapshot was created, then you can perform recovery using this snapshot only if the snapshot conforms to Oracle requirements. See "Making Backups with Third-Party Snapshot Technologies". If these conditions are met, then you can take the same basic recovery steps as any other backup method, using either RMAN or SQL*Plus.
If the storage snapshot does not conform to the requirements for using Storage Snapshot Optimization, then you create a snapshot by placing the data files in backup mode. To perform recovery using such snapshots, use the procedure described in Performing Complete Database Recovery Using SQL*Plus or Performing Incomplete Database Recovery.
Specifying the Time for Snapshot Recovery
If a storage snapshot was created when the database was not in backup mode, you must specify the SNAPSHOT TIME
option while using this snapshot to recover the database. The SNAPSHOT TIME
option can be used in both the RMAN or SQL*Plus RECOVER
command. The time specified using the SNAPSHOT TIME
option must be a time that is immediately after the snapshot is complete. If you specify an incorrect time, then the database may be corrupt in a way that is not repairable.
Because the time clocks in the storage array, where the snapshot takes place, and the machine hosting the Oracle Database may not be perfectly synchronized, it is recommended that you add a few seconds to the time that you specify in the SNAPSHOT TIME
option. This helps you avoid any possibility of leaving the files in an inconsistent state by recovering to a point before the snapshot was taken.
All times specified in the RECOVER
command, including in the SNAPSHOT TIME
clause, are assumed to be in the time zone of the Oracle Database host. However, the time clocks in the storage array may be in a different time zone from the Oracle Database host. If the storage array reports its snapshot times in a different time zone, then you must take that difference into account when specifying the time in the SNAPSHOT TIME
option.
Note:
The recovery point, specified by the UNTIL
option, cannot be earlier than the specified SNAPSHOT TIME
.
Examples: Recovery Using Storage Snapshots
The examples in this section use the RECOVER DATABASE
command to perform recovery using snapshots. You can use the RECOVER DATABASE
command from RMAN or SQL*Plus. However, the UNTIL CANCEL
clause is valid only in SQL*Plus.
To completely recover a database:
RECOVER DATABASE;
To recover a database using a particular snapshot:
This example recovers uses a snapshot taken on August 15 at 2:00 P.M. to recover the database. The UNTIL TIME
clause can specify any time after the snapshot.
RECOVER DATABASE UNTIL TIME '10/15/2012 15:00:00' SNAPSHOT TIME '10/15/2012 14:00:00';
To perform a partial recovery using archived redo log files:
This example uses the log files from a snapshot taken on August 15 at 2:00 P.M.
RECOVER DATABASE UNTIL CANCEL SNAPSHOT TIME '10/15/2012 14:00:00';
30.2.6 Recovery Cancellation During User-Managed Recovery
If you start media recovery and must then interrupt it, then either enter CANCEL
when prompted for a redo log file, or use your operating system's interrupt signal if you must terminate when recovering an individual data file, or when automated recovery is in progress. After recovery is canceled, you can resume it later with the RECOVER
command. Recovery resumes where it left off when it was canceled.
30.2.7 Parallel Media Recovery
By default, Oracle Database uses parallel media recovery to improve performance of the roll forward phase of media recovery. In parallel recovery of media, the database uses a "division of labor" approach to allocate different processes to different data blocks while rolling forward, thereby making the procedure more efficient. The number of processes used is derived from the CPU_COUNT
initialization parameter, which by default equals the number of CPUs on the system. For example, if parallel recovery is performed on a system where CPU_COUNT
is 4
, and only one data file is recovered, then four spawned processes read blocks from the archive logs and apply redo.
Typically, media recovery is limited by data block reads and writes. Parallel recovery attempts to use all of the available I/O bandwidth of the system to improve performance. Unless there is a system I/O bottleneck or poor asynchronous I/O support, parallel recovery is likely to improve performance of recovery.
To override the default behavior of performing parallel recovery, use the SQL*Plus RECOVER
command with the NOPARALLEL
option, or RECOVER
PARALLEL
0
. The RECOVERY_PARALLELISM
initialization parameter controls instance or crash recovery only. Media recovery is not affected by the value used for RECOVERY_PARALLELISM
.
See Also:
SQL*Plus User's Guide and Reference for more information about the SQL*Plus RECOVER
...
PARALLEL
and NOPARALLEL
commands
30.3 Performing Complete Database Recovery Using SQL*Plus
Typically, you perform complete recovery of the database when a media failure has made one or more data files inaccessible. During complete database recovery, you use all available redo to recover the database to the current SCN.
The V$RECOVER_FILE
view indicates which files need recovery. Depending on the circumstances, you can either recover the whole database or recover individual tablespaces or data files. Because you do not have to open the database with the RESETLOGS
option after complete recovery, you have the option of recovering some data files at one time and the remaining data files later.
The topics in this section describe the steps necessary to complete media recovery operations.
30.3.1 Performing Closed Database Recovery
When performing complete recovery while the database is not open, you can recover either all damaged data files in one operation or perform individual recovery of each damaged data file in separate operations.
This procedure assumes the following:
-
The current control file is available.
-
You have backups of all needed data files.
-
All necessary archived redo logs are available.
To restore and recover damaged or missing data files:
30.3.2 Performing Open Database Recovery
You can perform complete recovery of non-SYSTEM
data files in a database while the database is open.
This procedure assumes the following:
-
The current control file is available.
-
You have backups of all needed data files.
-
All necessary archived redo logs are available.
It is possible for a media failure to occur while the database remains open, leaving the undamaged data files online and available for use. Damaged data files—but not the tablespaces that contain them—are automatically taken offline if the database writer cannot write to them. If the database writer cannot open a data file, an error is still returned. Queries that cannot read damaged files return errors, but the data files are not taken offline because of the failed queries. For example, you may run a SQL query and see output such as:
ERROR at line 1: ORA-01116: error in opening database file 3 ORA-01110: data file 11: '/oracle/oradata/trgt/cwmlite02.dbf' ORA-27041: unable to open file SVR4 Error: 2: No such file or directory Additional information: 3
Note:
You cannot use the procedure in this section to perform complete media recovery on the SYSTEM
tablespace while the database is open. If the media failure damages data files of the SYSTEM
tablespace, then the database automatically shuts down.
To restore data files in an open database:
See Also:
-
Oracle Database Administrator's Guide to learn about creating data files and Oracle Database SQL Language Reference to learn about
ALTER
DATABASE
RENAME
FILE
-
"Recovering After the Loss of All Current Control Files" and "Re-Creating a Control File" for information about restoring or re-creating the control file
-
"Re-Creating Data Files When Backups Are Unavailable" for information about performing recovery when data file backups are missing
-
"Performing Incomplete Database Recovery" for information about performing database point-in-time recovery when you are missing redo required to completely recover the database
30.3.3 Performing Crash and Instance Recovery of CDBs
Oracle Database performs crash and instance recovery for the entire multitenant container database (CDB). You cannot recover individual pluggable databases (PDBs).
This procedure assumes the following:
-
The current control file is available.
-
You have backups of all needed data files.
-
All necessary archived redo logs are available.
To perform crash and instance recovery for a CDB:
See Also:
-
"Recovering After the Loss of All Current Control Files" and "Re-Creating a Control File" for information about restoring or re-creating the control file
-
"Re-Creating Data Files When Backups Are Unavailable" for information about performing recovery when data file backups are missing
-
"Performing Incomplete Database Recovery" for information about performing database point-in-time recovery when you are missing redo required to completely recover the database
30.4 Performing Incomplete Database Recovery
Incomplete recovery is also known as database point-in-time recovery.
Typically, you perform database point-in-time recovery (DBPITR) in the following situations:
-
You want to recover the database to an SCN before a user or administrative error.
-
The database contains corrupt blocks.
-
Complete database recovery failed because all necessary archived redo logs were not available.
-
You are creating a test database or a reporting database from production database backups.
If the database is operating in ARCHIVELOG
mode, and if the only copy of an archived redo log file is damaged, then the damaged file does not affect the present operation of the database. Table 30-2 describes situations that can arise depending on when the redo log was written and when you backed up the data file.
Table 30-2 Loss of Archived Redo Logs
If You Backed Up... | Then... |
---|---|
All data files after the filled online redo log group (which is now archived) was written |
The archived version of the filled online redo log group is not required for complete media recovery. |
A specific data file before the filled online redo log group was written |
If the corresponding data file is damaged by a permanent media failure, then use the most recent backup of the damaged data file and perform tablespace point-in-time recovery of the damaged data file, up to the damaged archived redo log file. |
Caution:
If you know that an archived redo log group has been damaged, then immediately back up all data files so that you have a whole database backup that does not require the damaged archived redo log.
The technique for DBPITR is very similar to the technique for performing closed database recovery, except that you terminate DBPITR by specifying a particular time or SCN or entering CANCEL
. Cancel-based recovery prompts you with the suggested file names of archived redo logs. Recovery stops when you specify CANCEL
instead of a file name or when all redo has been applied to the data files. Cancel-based recovery is the best technique to control which archived log terminates recovery.
30.4.1 Performing Cancel-Based Incomplete Recovery
In cancel-based recovery, recovery proceeds by prompting you with the suggested file names of archived redo log files. Recovery stops when you specify CANCEL
instead of a file name or when all redo has been applied to the data files.
This procedure assumes the following:
-
The current control file is available.
-
You have backups of all needed data files.
To perform cancel-based recovery:
See Also:
-
"About User-Managed Media Recovery Problems" for descriptions of situations that can cause
ALTER
DATABASE
OPEN
RESETLOGS
to fail -
"Recovering After the Loss of All Current Control Files" for information about restoring or re-creating the control file
-
"Re-Creating Data Files When Backups Are Unavailable" for information about performing recovery when data file backups are missing
30.4.2 Performing Time-Based or Change-Based Incomplete Recovery
You can specify an SCN or time for the end point of incomplete recovery.
If your database is affected by seasonal time changes (for example, daylight savings time), then you may experience a problem if a time appears twice in the redo log and you want to recover to the second, or later time. To handle time changes, perform cancel-based or change-based recovery.
This procedure assumes the following:
-
The current control file is available.
-
You have backups of all needed data files.
To perform change-based or time-based recovery:
See Also:
-
"Recovering After the Loss of All Current Control Files" for information about restoring or re-creating the control file
-
"Re-Creating Data Files When Backups Are Unavailable" for information about performing recovery when data file backups are missing
30.5 Recovering a Database in NOARCHIVELOG Mode
If a media failure damages data files in a NOARCHIVELOG
database, then the only option for recovery is usually to restore a consistent whole database backup. If you are using logical backups created by Oracle Data Pump Export to supplement regular physical backups, then you can also attempt to restore the database by importing an exported backup of the database into a re-created database or a database restored from an old backup.
To restore and recover the most recent whole database backup:
See Also:
Oracle Database Administrator's Guide for more information about renaming and relocating data files, and Oracle Database SQL Language Reference to learn about ALTER
DATABASE
RENAME
FILE
30.6 Troubleshooting Media Recovery
This section describes how to troubleshoot user-managed media recovery, that is, media recovery performed without using Recovery Manager (RMAN). This section includes the following topics:
30.6.1 About User-Managed Media Recovery Problems
Table 30-4 describes potential problems that can occur during media recovery.
Table 30-4 Media Recovery Problems
Problem | Description |
---|---|
Missing or misnamed archived log |
Recovery stops because the database cannot find the archived log recorded in the control file. |
When you attempt to open the database, error |
This error commonly occurs because:
|
Two possible cases are as follows:
|
|
Logs may be corrupted while they are stored on or copied between storage systems. If |
|
If you enable the parallel redo feature, then the database generates redo logs in a new format. Prior releases of Oracle are unable to apply parallel redo logs. However, releases before Oracle9i Database Release 2 (9.2) can detect the parallel redo format and indicate the inconsistency with the following error message: |
|
A data file backup may have contained a corrupted data block, or the data block may become corrupted either during recovery or when it is copied to the backup. If |
|
Random problems |
Memory corruptions and other transient problems can occur during recovery. |
The symptoms of media recovery problems are usually external or internal errors signaled during recovery. For example, an external error indicates that a redo block or a data block has failed checksum verification checks. Internal errors can be caused by either bugs in the database or errors arising from the underlying operating system and hardware.
If media recovery encounters a problem while recovering a database backup, then whether it is a stuck recovery problem or a problem during redo application, the database always stops and leaves the data files undergoing recovery in a consistent state, that is, at a consistent SCN preceding the failure. You can then do one of the following:
-
Open the database read-only to investigate the problem.
-
Open the database with the
RESETLOGS
option, if the requirements for openingRESETLOGS
have been met. TheRESETLOGS
restrictions apply to opening the physical standby database as well, because a standby database is updated by a form of media recovery.
In general, opening the database read-only or opening with the RESETLOGS
option requires all online data files to be recovered to the same SCN. If this requirement is not met, then the database may signal ORA-1113
or other errors when you attempt to open it. Some common causes of ORA-1113
are described in Table 30-4.
The basic methodology for responding to media recovery problems occurs in the following phases:
-
Try to identify the cause of the problem. Run a trial recovery if needed.
-
If the problem is related to missing redo logs or if you suspect that there is a redo log, memory, or data block corruption, then try to resolve the problem using the methods described in Table 30-5.
-
If you cannot resolve the problem using the methods described in Table 30-5, then do one of the following:
-
Open the database with the
RESETLOGS
option if you are recovering a whole database backup. If you have performed serial media recovery, then the database contains all the changes up to but not including the changes at the SCN where the corruption occurred. No changes from this SCN onward are in the recovered part of the database. If you have restored online backups, then openingRESETLOGS
succeeds only if you have recovered through all theALTER
...
END
BACKUP
operations in the redo stream. -
Proceed with recovery by allowing media recovery to corrupt data blocks. After media recovery completes, try performing block media recovery using RMAN.
-
Call Oracle Support Services as a last resort.
See Also:
"Performing Disaster Recovery" to learn about block media recovery
-
30.6.2 Investigating the Media Recovery Problem: Phase 1
If media recovery encounters a problem, then obtain as much information as possible after recovery halts. You do not want to waste time fixing the wrong problem, which may make matters worse.
The goal of this initial investigation is to determine whether the problem is caused by incorrect setup, corrupted redo logs, corrupted data blocks, memory corruption, or other problems. If you see a checksum error on a data block, then the data block is corrupted. If you see a checksum error on a redo log block, then the redo log is corrupted.
Sometimes the cause of a recovery problem can be difficult to determine. Nevertheless, the methods in this section enable you to quickly recover a database even when you do not completely understand the cause of the problem.
To investigate media recovery problems:
- Examine the
alert.log
to see whether the error messages give general information about the nature of the problem. For example, does thealert_
SID
.log
indicate any checksum failures? Does thealert_
SID
.log
indicate that media recovery may have to corrupt data blocks to continue? - Check the trace file generated by the Oracle Database during recovery. It may contain additional error information.
30.6.3 Trying to Fix the Recovery Problem Without Corrupting Blocks: Phase 2
Depending on the type of media recovery problem you suspect, you have different solutions at your disposal. You can try one or a combination of the techniques described in Table 30-5. These solutions are common repair techniques and fairly safe for resolving most media recovery issues.
Table 30-5 Media Recovery Solutions
If You Suspect... | Then... |
---|---|
Missing or misnamed archived redo logs |
Determine whether you entered the correct file name. If you did, then check whether the log is missing from the operating system. If it is missing, and if you have a backup, then restore the backup and apply the log. If you do not have a backup, then if possible perform incomplete recovery up to the point of the missing log. |
|
Review the causes of this error in Table 30-4. Ensure that all read/write data files requiring recovery are online. If you use a backup control file for recovery, then the control file and data files must be at a consistent SCN for the database to be opened. If you do not have the necessary redo, then you must re-create the control file. |
Corrupt archived logs |
The log is corrupted if the checksum verification on the log redo block fails. If The |
Archived logs with incompatible parallel redo format |
If you run an Oracle Database release before Oracle9i Database Release 2, and if you attempt to apply redo logs created with the parallel redo format, then you must do the following steps:
|
Memory corruption or transient problems |
You may be able to fix the problem by shutting down the database and restarting recovery. The database should be left in a consistent state if the second attempt also fails. |
Corrupt data blocks |
Restore and recover the data file again with user-managed methods, or restore and recover individual data blocks with the RMAN A data block is corrupted if the checksum verification on the block fails. If |
If you cannot fix the problem with the methods described in Table 30-5, then there may be no easy way to fix the problem without losing data. You have these options:
-
Open the database with the
RESETLOGS
option (for whole database recovery).This solution discards all changes after the point where the redo problem occurred, but guarantees a logically consistent database.
-
Allow media recovery to corrupt one or more data blocks and then proceed.
This option only succeeds if the alert log indicates that recovery can continue if it is allowed to corrupt a data block, which is the case for most recovery problems. This option is best if you must bring up the database quickly and recover all changes. If you are considering this option, then proceed to "Deciding Whether to Allow Recovery to Mark as Corrupt Blocks: Phase 3".
See Also:
" Performing Block Media Recovery "to learn how to perform block media recovery with the
RECOVER ... BLOCK
command
30.6.4 Deciding Whether to Allow Recovery to Mark as Corrupt Blocks: Phase 3
When media recovery encounters a problem, the alert log may indicate that recovery can continue if it is allowed to mark as corrupt the data block causing the problem. The alert log contains information about the block: its block type, block address, the tablespace it belongs to, and so forth. For blocks containing user data, the alert log may also report the data object number.
In this case, the database can proceed with recovery if it is allowed to mark the problem block as corrupt. Nevertheless, this response is not always advisable. For example, if the block is an important block in the SYSTEM
tablespace, marking the block as corrupt can eventually prevent you from opening the recovered database. Another consideration is whether the recovery problem is isolated. If this problem is followed immediately by many other problems in the redo stream, then you may want to open the database with the RESETLOGS
option.
For a block containing user data, you can usually query the database to discover which object or table owns this block. If the database is not open, then you can open the database read-only, even if you are recovering a whole database backup. The following example cancels recovery and opens the database read-only:
CANCEL ALTER DATABASE OPEN READ ONLY;
Assume that the data object number reported in the alert_
SID
.log
is 8031
. You can determine the owner, object name, and object type by issuing this query:
SELECT OWNER, OBJECT_NAME, SUBOBJECT_NAME, OBJECT_TYPE FROM DBA_OBJECTS WHERE DATA_OBJECT_ID = 8031;
To determine whether a recovery problem is isolated, you can run a diagnostic trial recovery, which scans the redo stream for problems but does not actually make any changes to the recovered database. If a trial recovery discovers any recovery problems, then it reports them in the alert_
SID
.log
. You can use the RECOVER
...
TEST
statement to invoke trial recovery, as described in "Executing the RECOVER... TEST Statement".
After you have done these investigations, you can follow the guidelines in Table 30-6 to decide whether to allow recovery to permit corrupt blocks.
Table 30-6 Guidelines for Allowing Recovery to Permit Corrupt Blocks
If the Problem Is... | and the Block Is... | Then... |
---|---|---|
Not isolated |
You can open the database with the |
|
Isolated |
In the |
Do not corrupt the block, because it may eventually prevent you from opening the database. However, sometimes data in the |
Isolated |
Index data |
Consider corrupting index blocks because the index can be rebuilt later after the database has been recovered. |
Isolated |
User data |
Decide based on the importance of the data. If you continue with data file recovery and corrupt a block, then you lose data in the block. However, you can use RMAN to perform block media recovery later, after data file recovery completes. If you open |
Isolated |
Rollback or undo data |
If all of the transactions are committed, then consider corrupting the rollback or undo block. The database is not harmed if the transactions that generated the undo are never rolled back. However, if those transactions are rolled back, then corrupting the undo block can cause problems. If you are unsure, then contact Oracle Support Services. |
See Also:
"Performing Trial Recovery" to learn how to perform trial recovery, and "Allowing Recovery to Corrupt Blocks: Phase 4" if you decide to allow recovery to permit corrupt blocks
30.6.6 Performing Trial Recovery
When problems such as stuck recovery occur, you have a difficult choice. If the block is relatively unimportant, and if the problem is isolated, then it is better to corrupt the block. But if the problem is not isolated, then it may be better to open the database with the RESETLOGS
option.
Because of this situation, Oracle Database supports trial recovery. A trial recovery applies redo in a way similar to normal media recovery, but it never writes its changes to disk and it always rolls back its changes. Trial recovery occurs only in memory.
See Also:
30.6.6.1 How Trial Recovery Works
By default, if a trial recovery encounters a stuck recovery or similar problem, then it always marks the data block as corrupt in memory when this action can allow recovery to proceed. The database writes errors generated during trial recovery to alert files. These errors are clearly marked as test run errors.
Like normal media recovery, trial recovery can prompt you for archived log file names and ask you to apply them. Trial recovery ends when:
-
The database runs out of the maximum number of buffers in memory that trial recovery is permitted to use
-
An unrecoverable error is signaled, that is, an error that cannot be resolved by corrupting a data block
-
You cancel or interrupt the recovery session
-
The next redo record in the redo stream changes the control file
-
All requested redo has been applied
When trial recovery ends, the database removes all effects of the test run from the system—except the possible error messages in the alert files. If the instance fails during trial recovery, then the database removes all effects of trial recovery from the system, because trial recovery never writes changes to disk.
Trial recovery lets you foresee what problems might occur if you were to continue with normal recovery. For problems caused by ongoing memory corruption, trial recovery and normal recovery can encounter different errors.
30.6.6.2 Executing the RECOVER... TEST Statement
You can use the TEST
option for any RECOVER
command. For example, you can start SQL*Plus and then issue any of the following commands:
RECOVER DATABASE TEST RECOVER DATABASE USING BACKUP CONTROLFILE UNTIL CANCEL TEST RECOVER TABLESPACE users TEST RECOVER DATABASE UNTIL CANCEL TEST
By default, trial recovery always attempts to corrupt blocks in memory if this action allows trial recovery to proceed. Trial recovery by default can corrupt an unlimited number of data blocks. You can specify the ALLOW
n
CORRUPTION
clause on the RECOVER
...
TEST
statement to limit the number of data blocks that trial recovery can corrupt in memory.
A trial recovery command is usable in any scenario in which a normal recovery command is usable. Nevertheless, you only need to run trial recovery when recovery runs into problems.
See Also: