2012年4月12日星期四

Database Hang and Recover Procedure


In our example, we assume the database is hanging for unknown reason.

1. Try to identify process ID the top loading process

2. Try to check the oracle user session detail by using the above process ID

3. If there is no particular high load process or we cannot connect to oracle database to check user session detail (i.e. connect via sqlplus but seem hang). We assume the database is hang.

4. Dump System Statistics
   e.g. export ORACLE_SID=DB_SID
          sqlplus -prelim / as sysdba
          oradebug setmypid
          oradebug unlimit
          oradebug dump systemstate 10
          -> wait 90 seconds (i.e. This is not a command)
          oradebug dump systemstate 10
          -> wait 90 seconds (i.e. This is not a command)
          oradebug dump systemstate 10
          exit


5. Perform Hang Analysis
   e.g. export ORACLE_SID=DB_SID
          sqlplus -prelim / as sysdba
          oradebug setmypid
          oradebug unlimit
          oradebug hanganalyze 3
          -> wait 90 seconds (i.e. This is not a command)
          oradebug hanganalyze 3
          exit

6. Create SR in oracle and update the trace files generated  in Step 5 and Step 6 to Oracle SR

7. Shutdown Database abnormally by killing the Oracle SMON process and all related process will be killed

8. Check all process related to target instance still existed. If they are still existed, please kill them manually.

9. Start oracle instance in restricted mode
   e.g. export ORACLE_SID=DB_SID
          sqlplus / nolog
          connect / as sysdba
          startup restrict

10. If the database startup normally, please shutdown it to make sure database can be shutdown normally.

11. Start oracle instance in restricted mode and check the date integrity first.

12. If everything ok, please disable restricted mode and notify user that the system is back to normal

沒有留言: