How to Perform a Complete Recovery of Your Oracle 12c Database - dummies

How to Perform a Complete Recovery of Your Oracle 12c Database

By Chris Ruel, Michael Wessler

Complete recovery is what you want to shoot for. It means you recover every block and every transaction that was committed into the Oracle 12c database. You let Oracle take over and do the recovery until the end of all the backup files and archive logs; don’t manually intervene and stop it before it’s finished.

How to perform a complete recovery: One or more Oracle 12c data files

If you see the “unable to identify/lock datafile” error, you need to do a couple of things:

  • Know the extent of the damage. This knowledge helps so you only have to do one recovery instead of two.

  • Determine whether the lost file is required for the instance to run (or if it is an application data file). Required data files are SYSTEM, SYSAUX, and UNDO. This is important for reducing your overall mean time to recovery (MTTR). Oracle crashes only if you lose a required data file.

If you determine that the lost files aren’t important to basic operation, you can open the database (if it even went down) before you begin recovery. That allows at least partial data access to some users. You may prefer that users remain out of the system until you’re done.

What’s the first thing to do if Oracle 12c crashes? Go to the Data Recovery Advisor (DRA). For the DRA to work, the database has to, at the very least, be in NOMOUNT state. The database can’t be completely shut down. The DRA can also be run with the database open, minimizing downtime.

In the following exercise, the USERS tablespace data file was lost while the database was running. Because that data file is not a required data file (system, sysaux, undo), you can do the recovery without even shutting down the database.

  1. Log in to your target with RMAN.

  2. Type this:

    <list failure;>

    You see something like this:

    List of Database Failures
    =========================
    Failure ID Priority Status  Time Detected    Summary
    ---------- -------- --------- -------------------- -------
    722    HIGH   OPEN   30-JUN-2013 11:25:20 One or more non-system datafiles are missing

    A non-system (critical) file is missing. What to do?

  3. Ask the DRA what to do:

    <advise failure;>

    You see something like this:

    Database Role: PRIMARY
    List of Database Failures
    =========================
    Failure ID Priority Status  Time Detected    Summary
    ---------- -------- --------- -------------------- -------
    722    HIGH   OPEN   30-JUN-2013 11:25:20 One or more non-system datafiles are missing
    analyzing automatic repair options; this may take some time
    allocated channel: ORA_DISK_1
    channel ORA_DISK_1: SID=56 device type=DISK
    analyzing automatic repair options complete
    Mandatory Manual Actions
    ========================
    no manual actions available
    Optional Manual Actions
    =======================
    1. If file /u01/app/oracle/oradata/dev12c/users01.dbf was
    unintentionally renamed or moved, restore it Automated Repair Options ======================== Option Repair Description ------ ------------------ 1 Restore and recover datafile 4 Strategy: The repair includes complete media recovery with no data loss Repair script: /u01/app/oracle/diag/rdbms/dev12c/dev12c/hm/reco_3875560744.hm

    Get a load of that! Not only does the DRA tell you exactly what you need to do, but it also provides a script so you don’t have to write a single line of code. If you open that script, it looks something like this:

      # restore and recover datafile
      sql 'alter database datafile 4 offline';
      restore ( datafile 4 );
      recover datafile 4;
      sql 'alter database datafile 4 online';
  4. Type the following to have the DRA fix the problem:

    <repair failure;>

    You see something like this:

    RMAN> repair failure;
    Strategy: The repair includes complete media recovery with no data loss
    Repair script: /u01/app/oracle/diag/rdbms/dev12c/dev12c/hm/reco_3875560744.hm
    contents of repair script:
      # restore and recover datafile
      sql 'alter database datafile 4 offline';
      restore ( datafile 4 );
      recover datafile 4;
      sql 'alter database datafile 4 online';
    Do you really want to execute the above repair (enter YES or NO)? YES
    repair failure complete

    At the very end of an Advisor-based recovery, if the database was closed, it asks whether you want to open the database. Most of the time you will choose Yes. You might choose No if you want to spend more time going over what happened before you release the database back to the users.

Here are some nitpicks about the DRA, here is what we’d say:

  • The DRA doesn’t say you can take data files offline and then open the database for everyone else if the database is closed. It tells you that the files can be offline and recovered if the database is already open. At least it told you that they were non-system files.

  • If you have to restore the files to a new location, the DRA can’t take over and do the whole recovery for you.

    Say you lost a disk and it ain’t coming back. The DRA isn’t smart enough to choose a new location for you and incorporate that into a repair script. It tells you what’s wrong and what it suggests doing, which may help get you going in the right direction, but it falls short after that.

Be realistic. How can you expect it to have every situation indexed for all types of systems and environments?

How to perform a complete recovery: One or more Oracle 12c control files

What if you manage to lose all your control files and your database crashes?

Control files are critical system files.

DRA to the rescue:

  1. Log in to RMAN.

  2. List failure.

  3. Advise failure.

  4. Repair failure.

But wait a minute . . . Why didn’t these steps open the database? Recovery from losing all your control files is a little more involved than standard data file recovery. (Supposedly the DRA doesn’t want to continue with the recovery without you getting a chance to check things out.)