Archive for November, 2013

I am posting today in honor of my friend's mother or father, who has suffered and is still suffering from cancer. May God cure their disease soon and make them healthy and well.

A few days ago one of my friends asked me what an RMAN incremental backup is and when to use it. So today I am talking about that.

Below are some real-world RMAN recovery scenarios.

 RMAN incremental backups back up only datafile blocks that have changed since a specified previous backup. You can make incremental backups of databases, individual tablespaces or datafiles.

The most important reason for doing incremental backups is associated with data warehouse environments, where many operations are done in NOLOGGING mode and data changes do not go to the archived log files.
Considering the massive size of data warehouses today, and the fact that most of the data in them does not change, full backups are often impractical. Therefore, doing incremental backups in RMAN is an ideal alternative. Always choose a backup strategy that meets the acceptable MTTR (mean time to recover) for your environment.

Incremental backups can be either level 0 or level 1. A level 0 incremental backup, which is the base for subsequent incremental backups, copies all blocks containing data, backing the datafile up into a backup set just as a full backup would. The only difference between a level 0 incremental backup and a full backup is that a full backup is never included in an incremental strategy.

A level 1 incremental backup can be either of the following types:

  • differential backup, which backs up all blocks changed after the most recent incremental backup at level 1 or 0
  • cumulative backup, which backs up all blocks changed after the most recent incremental backup at level 0

Differential Incremental Backups

RMAN> BACKUP INCREMENTAL LEVEL 1 DATABASE;

[Figure 1: differential incremental backups]

In the example shown in Figure 1, the following occurs:

  • Sunday: An incremental level 0 backup backs up all blocks that have ever been in use in this database.
  • Monday – Saturday: On each day from Monday through Saturday, a differential incremental level 1 backup backs up all blocks that have changed since the most recent incremental backup at level 1 or 0. So, the Monday backup copies blocks changed since the Sunday level 0 backup, the Tuesday backup copies blocks changed since the Monday level 1 backup, and so forth.
  • The cycle is repeated for the next week.

Cumulative Incremental Backups

RMAN> BACKUP INCREMENTAL LEVEL 1 CUMULATIVE DATABASE;

[Figure 2: cumulative incremental backups]

In the example shown in Figure 2, the following occurs:

  • Sunday: An incremental level 0 backup backs up all blocks that have ever been in use in this database.
  • Monday – Saturday: A cumulative incremental level 1 backup copies all blocks changed since the most recent level 0 backup. Because the most recent level 0 backup was created on Sunday, the level 1 backup on each day Monday through Saturday backs up all blocks changed since the Sunday backup.
  • The cycle is repeated for the next week.
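
The difference between the two weekly cycles can be sketched in a few lines of shell. This is only an illustration of the rule described above, assuming a Sunday level 0 and a level 1 on every other day; the function name baseline is made up for this example and nothing here talks to Oracle.

```shell
#!/bin/sh
# Illustrative only: prints which earlier backup each level 1 backup is
# measured against in a weekly cycle that starts with a Sunday level 0.
baseline() {
    # $1 = mode (differential|cumulative), $2 = day of the level 1 backup
    mode=$1
    day=$2
    prev=Sun
    for d in Mon Tue Wed Thu Fri Sat; do
        if [ "$d" = "$day" ]; then
            if [ "$mode" = cumulative ] || [ "$prev" = Sun ]; then
                echo "Sun level 0"        # cumulative always goes back to level 0
            else
                echo "$prev level 1"      # differential uses the previous day
            fi
            return 0
        fi
        prev=$d
    done
    return 1                              # unknown day name
}

for d in Mon Tue Wed; do
    echo "$d: differential since $(baseline differential "$d"), cumulative since $(baseline cumulative "$d")"
done
```

A differential backup is smaller each day, but a restore needs every level 1 since Sunday; a cumulative backup grows through the week, but a restore needs only the level 0 and the latest level 1.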

RMAN command to create a level 0 backup, which is needed before running a level 1 incremental backup
RMAN> BACKUP INCREMENTAL LEVEL 0 DATABASE;

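RMAN command to run a level 1 (differential) backup. It backs up all blocks changed since the most recent incremental backup at level 1 or level 0. If no level 0 backup exists, the behavior depends on the COMPATIBLE setting: at 10.0.0 or higher, RMAN copies all blocks changed since the file was created and stores the result as a level 1 backup; at lower settings it generates a level 0 backup.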
RMAN> BACKUP INCREMENTAL LEVEL 1 DATABASE;

RMAN command to run a level 1 cumulative backup. It backs up all blocks changed since the most recent level 0 backup. If no level 0 backup exists, RMAN cannot take a true incremental and instead backs up all blocks changed since the file was created (or generates a level 0 backup, depending on the COMPATIBLE setting).
RMAN> BACKUP INCREMENTAL LEVEL 1 CUMULATIVE DATABASE;

RMAN command to run a level 1 backup of the database that skips datafiles which cannot be read due to I/O errors, excluding them from the backup
RMAN> BACKUP INCREMENTAL LEVEL 1 SKIP INACCESSIBLE DATABASE;


 

There are a couple of ways to determine whether a database is registered with an RMAN catalog.

Using RMAN: when you connect to the RMAN catalog and run a command such as "list backup" against an unregistered database, it generates an error as shown below
$ rman target / catalog rmancataloguser/rmancatalogpassword@catalogdb

Recovery Manager: Release 10.2.0.4.0 – Production on Sun Jul 17 08:33:42 2011

Copyright (c) 1982, 2007, Oracle. All rights reserved.

connected to target database: TEST01 (DBID=1023910334)
connected to recovery catalog database

RMAN> list backup;

RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03002: failure of list command at 07/17/2011 08:33:48
RMAN-06004: ORACLE error from recovery catalog database: RMAN-20001: target database not found in recovery catalog

Another way would be to connect to the catalog schema through SQL*Plus and check view RC_DATABASE
$ sqlplus rmancataloguser/rmancatalogpassword@catalogdb
SQL*Plus: Release 10.2.0.4.0 – Production on Sun Jul 17 08:45:58 2011

Copyright (c) 1982, 2007, Oracle. All Rights Reserved.

Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 – 32bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options

SQL> select * from rc_database where name = 'TEST01';

no rows selected
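
The first check can also be scripted by looking for the RMAN-20001 error in captured RMAN output. The sketch below is a hypothetical helper (the name catalog_status is made up here) that only reads text on stdin; how you capture the output of your own rman session is left open.

```shell
#!/bin/sh
# Hypothetical check: reads captured RMAN output on stdin and reports whether
# it contains the RMAN-20001 "not found in recovery catalog" error.
catalog_status() {
    if grep -q 'RMAN-20001'; then
        echo "not registered"
    else
        echo "no RMAN-20001 seen"
    fi
}

# Example input taken from the error stack shown above.
printf '%s\n' \
  'RMAN-03002: failure of list command at 07/17/2011 08:33:48' \
  'RMAN-06004: ORACLE error from recovery catalog database: RMAN-20001: target database not found in recovery catalog' \
  | catalog_status
```

The absence of RMAN-20001 does not prove registration on its own, which is why the second branch is worded cautiously.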


 

Remember that only the SPFILE and the control file can be restored from an autobackup.

This scenario shows how to restore when all control files have been lost.

1) Try to shut down the database. A normal shutdown fails because the control file(s) no longer exist, so we need to perform a shutdown abort
SQL> shutdown immediate;
ORA-00210: cannot open the specified control file
ORA-00202: control file: ‘/apps/oracle/oradata/TEST01/control1.ora’
ORA-27041: unable to open file
Linux-x86_64 Error: 2: No such file or directory
Additional information: 3

SQL> shutdown abort;
ORACLE instance shut down.

2) Start the database in nomount to restore the control file. Because the control file is missing, the instance can only be started in nomount
SQL> startup nomount;
ORACLE instance started.

Total System Global Area 209715200 bytes
Fixed Size 2019672 bytes
Variable Size 109055656 bytes
Database Buffers 96468992 bytes
Redo Buffers 2170880 bytes

3) Start RMAN, connect to the target database and set the DBID
$ $ORACLE_HOME/bin/rman

Recovery Manager: Release 10.2.0.1.0 – Production on Tue Dec 7 21:08:53 2010

Copyright (c) 1982, 2005, Oracle. All rights reserved.

RMAN> connect target /

connected to target database: TEST (not mounted)

RMAN> set dbid 1992878807;

executing command: SET DBID

4) Restore controlfile from autobackup
RMAN> restore controlfile from autobackup;

Starting restore at 16-NOV-13
using channel ORA_DISK_1

..
Finished restore at 16-NOV-13

5) Mount the database
RMAN> alter database mount;

database mounted
released channel: ORA_DISK_1

If you later get RMAN-06189:

RMAN-06189: current DBID number does not match target mounted database (number)
Cause: SET DBID was used to set a DBID that does not match the DBID of the database to which RMAN is connected.
Action: If the current operation is a restore to copy the database, do not mount the database. Otherwise, avoid using the SET DBID command, or restart RMAN.

 

6) Recover database
RMAN> recover database;

Starting recover at 16-NOV-13

….

..

media recovery complete, elapsed time: 00:00:02
Finished recover at 16-NOV-13

7) Open the database with RESETLOGS

Why open with RESETLOGS after an incomplete recovery, or after recovery with a restored control file? Because OPEN RESETLOGS:

1. Archives the current online redo logs, if they are accessible.

2. Clears the contents of the online redo logs and resets them to log sequence 1.
RMAN> alter database open resetlogs;

database opened
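
The RMAN part of this scenario can be collected into one command list. The sketch below only prints the commands used in the steps above for a given DBID; the function name controlfile_restore_cmds is made up for this example, and it assumes the instance has already been started in NOMOUNT. Nothing is executed against a database.

```shell
#!/bin/sh
# Sketch only: prints the RMAN commands from the steps above for a given DBID.
# Assumes the instance is already in NOMOUNT; it does not run RMAN.
controlfile_restore_cmds() {
    case $1 in
        ''|*[!0-9]*)
            echo "usage: controlfile_restore_cmds <numeric DBID>" >&2
            return 1
            ;;
    esac
    cat <<EOF
SET DBID $1;
RESTORE CONTROLFILE FROM AUTOBACKUP;
ALTER DATABASE MOUNT;
RECOVER DATABASE;
ALTER DATABASE OPEN RESETLOGS;
EOF
}

# Example with the DBID used in this post.
controlfile_restore_cmds 1992878807
```

If you save the printed commands to a file, review them first and then hand the file to RMAN yourself; running the steps interactively, as shown above, makes it easier to stop when something looks wrong.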

Here are some scenarios/examples of restoring the spfile through RMAN.

1. In this scenario an autobackup of the spfile is present and the database is in the nomount state. To use the autobackup, the DBID needs to be set before restoring the spfile.
$ $ORACLE_HOME/bin/rman

Recovery Manager: Release 10.2.0.1.0 – Production on Sun Nov 28 14:40:05 2010

Copyright (c) 1982, 2005, Oracle. All rights reserved.

RMAN> connect target /

connected to target database: TEST01 (not mounted)

RMAN> set dbid 1992878807;

executing command: SET DBID

RMAN> restore spfile from autobackup;

Starting restore at 16-NOV-13
using target database control file instead of recovery catalog

..

channel ORA_DISK_1: SPFILE restore from autobackup complete
Finished restore at 16-NOV-13

2. In this scenario the database is in the nomount state and the filename of the backup piece is known, so it can be passed directly to restore spfile
$ $ORACLE_HOME/bin/rman

Recovery Manager: Release 10.2.0.1.0 – Production on Mon Nov 29 17:45:04 2010

Copyright (c) 1982, 2005, Oracle. All rights reserved.

RMAN> connect target /

connected to target database: TEST01 (not mounted)

RMAN> restore spfile from '/apps/oracle/product/10.2.0/db_1/dbs/backup_piece_name';
-- RMAN> restore spfile from 'backup_piece_name';

Starting restore at 16-NOV-13
using channel ORA_DISK_1

….

Finished restore at 28-NOV-10

 

Scenario where no autobackup exists within the last 7 days, which is how far back RMAN searches by default, so the restore does not find one.

If you want RMAN to look for the spfile further back than 7 days, use MAXDAYS. RMAN then searches from the current date back to the current date minus MAXDAYS:

RMAN> restore spfile from autobackup maxdays 200;

The session below shows the default 7-day search failing:
$ $ORACLE_HOME/bin/rman

Recovery Manager: Release 10.2.0.1.0 – Production on Sun Nov 28 14:40:05 2010

Copyright (c) 1982, 2005, Oracle. All rights reserved.

RMAN> connect target /

connected to target database: TEST01 (not mounted)

RMAN> set dbid 1992878807;

executing command: SET DBID

RMAN> restore spfile from autobackup;

Starting restore at 16-NOV-13
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: sid=35 devtype=DISK

channel ORA_DISK_1: looking for autobackup on day: 20131115
channel ORA_DISK_1: looking for autobackup on day: 20131114
channel ORA_DISK_1: looking for autobackup on day: 20131113
channel ORA_DISK_1: looking for autobackup on day: 20131112
channel ORA_DISK_1: looking for autobackup on day: 20131111
channel ORA_DISK_1: looking for autobackup on day: 20131110
channel ORA_DISK_1: looking for autobackup on day: 20131109
channel ORA_DISK_1: no autobackup in 7 days found
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03002: failure of restore command at 11/16/2013 15:08:30
RMAN-06172: no autobackup found or specified handle is not a valid copy or piece

 

When CONTROLFILE AUTOBACKUP is turned on, Oracle backs up the control file and spfile automatically, in a separate backup piece. This happens whenever a backup of the database or archived logs runs, and whenever the database structure changes, for example when a tablespace is created, a datafile is added, or a tablespace is dropped.

The following shows how to turn on CONTROLFILE AUTOBACKUP
RMAN> connect target /

connected to target database: TEST (DBID=1992878807)

RMAN> configure CONTROLFILE AUTOBACKUP ON;

old RMAN configuration parameters:
CONFIGURE CONTROLFILE AUTOBACKUP OFF;
new RMAN configuration parameters:
CONFIGURE CONTROLFILE AUTOBACKUP ON;
new RMAN configuration parameters are successfully stored

Below is the code to turn on controlfile autobackup through PL/SQL and set the format.
VARIABLE RECNO NUMBER;
EXECUTE :RECNO := SYS.DBMS_BACKUP_RESTORE.SETCONFIG('CONTROLFILE AUTOBACKUP','ON');
EXECUTE :RECNO := SYS.DBMS_BACKUP_RESTORE.SETCONFIG('CONTROLFILE AUTOBACKUP FORMAT FOR DEVICE TYPE','DISK TO ''T_%U''');

If all datafiles in a tablespace are lost or corrupted, you need to restore and recover the tablespace. The following example shows the steps.

  1. The tablespace cannot be taken offline normally, because Oracle tries to checkpoint its datafiles first, and the datafile cannot be opened.

SQL> alter tablespace newton offline;
alter tablespace newton offline
*
ERROR at line 1:
ORA-01116: error in opening database file 4
ORA-01110: data file 4: ‘/apps/oracle/oradata/TEST01/newton01.dbf’
ORA-27041: unable to open file
Linux-x86_64 Error: 2: No such file or directory
Additional information: 3

  •  With OFFLINE IMMEDIATE, Oracle does not check whether the file exists and does not perform a checkpoint, so the tablespace will need media recovery before it can come back online

SQL> alter tablespace newton offline immediate;

Tablespace altered.

  •  Display datafile status

SQL> select d.name, d.status
from v$datafile d, v$tablespace t
where t.name = 'NEWTON'
and t.ts# = d.ts#;

NAME STATUS
—————————————————————— ——-
/apps/oracle/oradata/TEST01/newton01.dbf RECOVER

  •  Restore the tablespace

RMAN> run {
2> restore tablespace NEWTON;
3> }

Starting restore at 15-NOV-13
using channel ORA_DISK_1

….

..
channel ORA_DISK_1: restore complete, elapsed time: 00:00:15
Finished restore at 15-NOV-13

  • The tablespace has been restored but not recovered, so it cannot be brought online until it is recovered

SQL> alter tablespace newton online;
alter tablespace newton online
*
ERROR at line 1:
ORA-01113: file 4 needs media recovery
ORA-01110: data file 4: ‘/apps/oracle/oradata/TEST01/newton01.dbf’

  •  Recover the tablespace

SQL> recover tablespace newton;
Media recovery complete.

  •  The tablespace can now be brought online

SQL> alter tablespace newton online;

Tablespace altered.

  • Display the datafile status again

SQL> select d.name, d.status
from v$datafile d, v$tablespace t
where t.name = 'NEWTON'
and t.ts# = d.ts#;

NAME STATUS
————————————————————————————————————————————
/apps/oracle/oradata/TEST01/newton01.dbf ONLINE


 

Using the steps below you can take a cold backup using RMAN. Because it is a cold backup, the database is in the mount state and does not have to be in ARCHIVELOG mode.

Step 1) Shutdown database

SQL> shutdown immediate;
Database closed.
Database dismounted.
ORACLE instance shut down.

Step 2) Start database in mount stage

SQL> startup mount;
ORACLE instance started.

Total System Global Area 167772160 bytes
Fixed Size 2019320 bytes
Variable Size 75497480 bytes
Database Buffers 88080384 bytes
Redo Buffers 2174976 bytes

Step 3) Start RMAN, connect to the target database (and to the recovery catalog if you are using one), and back up the database
$ $ORACLE_HOME/bin/rman target /

Recovery Manager: Release 10.2.0.1.0 – Production on Fri Apr 23 02:33:38 2010

Copyright (c) 1982, 2005, Oracle. All rights reserved.

connected to target database: TEST01 (DBID=1992878807, not open)

RMAN> backup database;

 

To find the RMAN catalog version, log in to the catalog schema through SQL*Plus and query the table rcver, which holds the version.

SQL> select * from rcver;

VERSION
————
10.02.00.00

SQL> select object_type from user_objects where object_name = 'RCVER';

OBJECT_TYPE
——————-
TABLE

SQL> desc rcver;
Name Null? Type
—————————————– ——– —————————-
VERSION NOT NULL VARCHAR2(12)

Hope it helps. If you have any recovery or restore scenarios of your own, you can post them in the comments.

Thanks & Regards

Nimai Karmakar


One issue I have faced in the past is that some blocks got corrupted on a pre-production database. I can't give you the exact scenario, but I will try to recreate it here.

So, if some of our data blocks get corrupted, how can we restore them?

The answer is simple: Oracle Block Media Recovery.

When some of the data blocks in a datafile are physically corrupted and you do not have RMAN backups, you have to restore and recover the whole datafile from another kind of backup (for example a cold or hot backup, or an expdp/impdp export) just to repair those few blocks, which can be quite a hectic job. If you use RMAN backups, you can benefit from a very powerful feature called BMR (Block Media Recovery). With Block Media Recovery, only the corrupted blocks are restored and recovered from a backup, instead of the whole file.

The steps to resolve block corruption are simple:

  • Start SQL*Plus and connect to the target database.
  • Query V$DATABASE_BLOCK_CORRUPTION to determine whether corrupt blocks exist. For example, execute the following statement:
SQL> SELECT * FROM V$DATABASE_BLOCK_CORRUPTION;
  • (if exist) Start RMAN and connect to the target database.
  • Recover all blocks marked corrupt in V$DATABASE_BLOCK_CORRUPTION. The following command repairs all physically corrupted blocks recorded in the view:
RMAN> RECOVER CORRUPTION LIST;

Let's create the scenario.

For this we need a test tablespace with its own datafile, where we can corrupt a data block. So let's create the tablespace.

Log in to SQL*Plus.

$ sqlplus / as sysdba

create tablespace test datafile '/u01/apps/oradata/test/test01.dbf' size 1G autoextend on next 10M
extent management local segment space management auto
/

Tablespace created.

create user nimai identified by password default tablespace test

quota unlimited on test

/

User created.

grant connect , resource,dba to nimai;

Grant succeeded.

conn nimai/password;

Connected.

create table testnimai as select * from all_objects;

Table created.

SQL> exit

OK, now we have a tablespace named test, which holds a table named testnimai owned by the user nimai.

Let's take a backup of this tablespace's datafile using RMAN.

$ rman target /

RMAN> run
{
allocate channel no1 type disk;
sql 'alter system switch logfile';
backup datafile 5 format '/u02/rman_backup/backup/nimai_%d_%s_%p_%t';
release channel no1;
}

Starting backup at 14-NOV-13

using channel no1

channel no1: starting full datafile backupset

channel no1: specifying datafile(s) in backupset

input datafile fno=00005 name=/u01/apps/oradata/test/test01.dbf

channel no1: starting piece 1 at 14-NOV-13

channel no1: finished piece 1 at 14-NOV-13

piece handle=/u02/rman_backup/backup/nimai_5vhcr4vz_.bkp

comment=NONE

channel no1: backup set complete, elapsed time: 00:00:01

Finished backup at 14-NOV-13

Check that your backup piece exists

RMAN> list backup;

RMAN> exit

Recovery Manager complete.

Now we have a backup of data file “/u01/apps/oradata/test/test01.dbf”.

Let's check where the table's segment header is.

$ sqlplus / as sysdba

select segment_name,header_file,header_block from dba_segments where segment_name = 'TESTNIMAI'
and owner = 'NIMAI';

SEGMENT_NAME                 HEADER_FILE HEADER_BLOCK

—————————- ———– ————

TESTNIMAI                              5           10

SQL> exit

The header of the table is in block 10, so if we corrupt the next block we have a scenario to test. Let's corrupt the next block, which is 11, using the "dd" command in Linux (Note: please use the dd command carefully in your environment, as it can overwrite important blocks). The command below assumes the default 8 KB database block size, so bs=8192 with seek=11 lands on block 11; conv=notrunc stops dd from truncating the rest of the datafile, and the here-document supplies a few junk bytes to write.

$ cd /u01

$ dd of=/u01/apps/oradata/test/test01.dbf bs=8192 conv=notrunc seek=11 << EOF
corruption test
EOF

OK, the command has run and block 11 is now corrupted in the datafile "/u01/apps/oradata/test/test01.dbf".
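
If you want to see what a dd overwrite at a given block offset does before touching any datafile, you can try it on a scratch file first. The sketch below assumes an 8192-byte block size and uses a temporary file created with mktemp; the file and the marker text are made up for the demonstration.

```shell
#!/bin/sh
# Safe demonstration on a throwaway file; never point this at a real datafile.
BS=8192                                   # assumed database block size
scratch=$(mktemp)

# Build a 16-block file of zeros.
dd if=/dev/zero of="$scratch" bs=$BS count=16 2>/dev/null

# Overwrite the start of block 11 in place. conv=notrunc keeps the rest of the
# file; without it dd would cut the file off right after the written bytes.
printf 'corrupt' | dd of="$scratch" bs=$BS seek=11 conv=notrunc 2>/dev/null

size=$(wc -c < "$scratch" | tr -d ' ')
# Read back the 7 bytes at byte offset 11 * 8192.
marker=$(dd if="$scratch" bs=1 skip=$((11 * BS)) count=7 2>/dev/null)
echo "size=$size marker=$marker"
rm -f "$scratch"
```

The file size stays at 16 blocks and only the first bytes of block 11 change, which is the effect wanted for a single-block corruption test.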

$ sqlplus / as sysdba

SQL> alter system flush buffer_cache;

System altered.

We need to flush the buffer cache because if block 11 is still cached, Oracle serves it from memory and never reads the corrupted copy from the datafile.

The block corruption will show up:

  1. when we query v$database_block_corruption (once the corruption has been detected),
  2. in the alert log file,
  3. when we run the dbverify (dbv) utility against the file,
  4. when we query the table.

Let's simply check the way a user would, by querying the table.

SQL> conn nimai/password

Connected.

SQL> select count(*) from testnimai;

select count(*) from testnimai

*

ERROR at line 1:

ORA-01578: ORACLE data block corrupted (file # 5, block # 11)

ORA-01110: data file 5: ‘/u01/apps/oradata/test/test01.dbf’

SQL> exit

Our scenario is now in place, so we can go ahead and recover block 11 of datafile 5.

$ rman target /

RMAN> BLOCKRECOVER DATAFILE 5 BLOCK 11;

Starting blockrecover at 14-NOV-13

using target database control file instead of recovery catalog

allocated channel: ORA_DISK_1

channel ORA_DISK_1: sid=154 devtype=DISK

channel ORA_DISK_1: restoring block(s)

channel ORA_DISK_1: specifying block(s) to restore from backup set

restoring blocks of datafile 00005

channel ORA_DISK_1: restored block(s) from backup piece 1

piece handle=/u02/rman_backup/backup/nimai_5vhcr4vz_.bkp

channel ORA_DISK_1: block restore complete, elapsed time: 00:00:01

starting media recovery

media recovery complete, elapsed time: 00:00:01

Finished blockrecover at 14-NOV-13

RMAN> exit

Recovery Manager complete.

BLOCK MEDIA RECOVERY Complete.

$ sqlplus nimai/password

SQL> select count(*) from testnimai;

COUNT(*)

———-

40688

SQL> exit

Let's now corrupt more than one block in the file and test Block Media Recovery again. As before, the commands assume an 8 KB block size.

$ cd /u01

$ dd of=/u01/apps/oradata/test/test01.dbf bs=8192 conv=notrunc seek=11 << EOF
corruption test
EOF

$ dd of=/u01/apps/oradata/test/test01.dbf bs=8192 conv=notrunc seek=12 << EOF
corruption test
EOF

$ dd of=/u01/apps/oradata/test/test01.dbf bs=8192 conv=notrunc seek=13 << EOF
corruption test
EOF

$ sqlplus nimai/password

SQL> select count(*) from testnimai;

COUNT(*)

———-

40688

So why are we not getting the error? Remember, as I said earlier, we need to flush the buffer cache: while the blocks are still cached, Oracle does not read them from the datafile. Here the query was answered from the buffer cache.

SQL> conn / as sysdba

Connected.

SQL> alter system flush buffer_cache;

System altered.

Now query again.

SQL> conn nimai/password

Connected.

SQL> select count(*) from testnimai;

select count(*) from testnimai

*

ERROR at line 1:

ORA-01578: ORACLE data block corrupted (file # 5, block # 11)

ORA-01110: data file 5: ‘/u01/apps/oradata/test/test01.dbf’

SQL> exit

Now suppose you have three corrupt blocks in your datafile, and your automated backup script ran during the night and backed up the file, marking those blocks as corrupt. When RMAN finds corrupt blocks in a datafile during a backup, it reports them in v$backup_corruption.

$ rman target /

Once we know the corrupt block count, we can set the backup script to tolerate those 3 blocks (we corrupted blocks 11, 12 and 13) instead of failing on them.

We will use the SET MAXCORRUPT command in RMAN, which lets the backup of file 5 continue past up to 3 corrupt blocks; those blocks are marked corrupt in the backup and recorded in v$backup_corruption.

RMAN> run {

allocate channel no1 type disk;

set maxcorrupt for datafile 5 to 3;

backup datafile 5;

release channel no1;

}

executing command: SET MAX CORRUPT

using target database control file instead of recovery catalog

Starting backup at 14-NOV-13

allocated channel: no1

channel no1: sid=158 devtype=DISK

channel no1: starting full datafile backupset

channel no1: specifying datafile(s) in backupset

input datafile fno=00005

name=/u01/apps/oradata/test/test01.dbf

channel no1: starting piece 1 at 14-NOV-13

channel no1: finished piece 1 at 14-NOV-13

piece handle=

channel ORA_DISK_1: backup set complete, elapsed time: 00:00:03

Finished backup at 14-NOV-13

RMAN> exit

Recovery Manager complete.

The backup is complete. Now let's query v$backup_corruption to see our corrupted block count.

$ sqlplus / as sysdba

select piece#, file#, block# , blocks , marked_corrupt from v$backup_corruption;

PIECE#      FILE#     BLOCK#     BLOCKS MAR

———- ———- ———- ———- —

1          5         11          3 YES

SQL> exit

It tells us there are three corrupt blocks in datafile 5, starting at block 11. Now we can simply go to RMAN and use the BLOCKRECOVER command to recover all these blocks from the backup we took earlier.
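
If you prefer to recover blocks one by one rather than through the corruption list, a row such as FILE# 5, BLOCK# 11, BLOCKS 3 can be expanded into individual commands. The helper below is a hypothetical sketch (the name recover_cmds is made up here); it reads "FILE# BLOCK# BLOCKS" triples and only prints BLOCKRECOVER commands in the form used earlier in this post. It does not run RMAN.

```shell
#!/bin/sh
# Hypothetical helper: reads "FILE# BLOCK# BLOCKS" triples, one per line, and
# prints one BLOCKRECOVER command per corrupt block. Text output only.
recover_cmds() {
    while read -r file block blocks; do
        [ -n "$file" ] || continue          # skip blank lines
        i=0
        while [ "$i" -lt "$blocks" ]; do
            echo "BLOCKRECOVER DATAFILE $file BLOCK $((block + i));"
            i=$((i + 1))
        done
    done
}

# Example: three consecutive corrupt blocks starting at block 11 of file 5.
printf '5 11 3\n' | recover_cmds
```

Printing the commands first gives you a chance to review exactly which blocks will be touched.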

$ rman target /

RMAN> list backup;

RMAN> BLOCKRECOVER CORRUPTION LIST from tag=tag_name;

CORRUPTION LIST means all the blocks listed as corrupt in v$database_block_corruption, which is populated when a backup or validation finds corrupt blocks.

OR

RMAN> RECOVER CORRUPTION LIST;

RMAN> exit

Recovery Manager complete.

Now let's query again and check that the previously corrupted data blocks are readable.

$ sqlplus nimai/password

SQL> select count(*) from testnimai;

COUNT(*)

———-

40688

SQL> exit

All blocks are recovered successfully. Take RMAN backups regularly, because you never know when you will face this kind of scenario in your production environment.

Hope it helps.

Thanks & Regards

Nimai Karmakar


Guys, as promised in my last post, oracle-performance-tuning-queries, I am back with some DBA monitoring shell scripts. The shell scripts below can be used by a DBA for daily database monitoring. Some were written by my fellow DBAs.

Please edit the configuration as per your environment.

  • This script checks tablespace usage. If a tablespace is less than 10 percent free, it sends an alert e-mail.

#!/bin/bash
#####################################################################
## check_tablespace.sh ##
#####################################################################
##
####### Start of configuration
##
######## Oracle environment variables ##########
##
export ORACLE_BASE=/u01/app
export ORACLE_HOME=/rac/app/oracle/product/11.2.0
export ORACLE_UNQNAME=TEST01
export PATH=$PATH:$ORACLE_HOME/bin:$ORACLE_HOME/rdbms/admin:$ORACLE_HOME/lib
export alrt=/rac/app/oracle/diag/rdbms/test01/TEST01/trace/alert_TEST01.log
export asmalrt=/rac/app/oracle/diag/asm/+asm/+ASM1/trace/alert_+ASM1.log
export TNS_ADMIN=$ORACLE_HOME/network/admin
export MONITOR_DIR=$ORACLE_HOME/dba-scripts/monitor

export ORACLE_SID=TEST01
##
######## Other variables #################
##
DBA=nimai.karmakar@hotmail.com
DATABASE=TEST01
datevar=$(date)
datevar2=$(date '+%Y-%m-%d-%H-%M')
##
####### End of configuration
##
sqlplus -s "/ as sysdba" << SQL1
set feed off
set linesize 100
set pagesize 200
spool /rac/app/oracle/product/11.2.0/dba-scripts/monitor/tablespace.alert
SELECT F.TABLESPACE_NAME,
TO_CHAR ((T.TOTAL_SPACE - F.FREE_SPACE),'999,999') "USED(MB)",
TO_CHAR (F.FREE_SPACE, '999,999') "FREE(MB)",
TO_CHAR (T.TOTAL_SPACE, '999,999') "TOTAL(MB)",
TO_CHAR ((ROUND ((F.FREE_SPACE/T.TOTAL_SPACE)*100)),'999,99999')||' %' PERCENT_FREE
FROM   (
SELECT       TABLESPACE_NAME,
ROUND (SUM (BLOCKS*(SELECT VALUE/1024
FROM V\$PARAMETER
WHERE NAME = 'db_block_size')/1024)
) FREE_SPACE
FROM DBA_FREE_SPACE WHERE TABLESPACE_NAME NOT IN ('SYSTEM','SYSAUX','TEMP','USERS','UNDOTBS1','UNDOTBS2')
GROUP BY TABLESPACE_NAME
) F,
(
SELECT TABLESPACE_NAME,
ROUND (SUM (BYTES/1048576)) TOTAL_SPACE
FROM DBA_DATA_FILES WHERE TABLESPACE_NAME NOT IN ('SYSTEM','SYSAUX','TEMP','USERS','UNDOTBS1','UNDOTBS2')
GROUP BY TABLESPACE_NAME
) T
WHERE F.TABLESPACE_NAME = T.TABLESPACE_NAME
AND (ROUND ((F.FREE_SPACE/T.TOTAL_SPACE)*100)) < 10;
spool off
exit
SQL1
if [ `cat /rac/app/oracle/product/11.2.0/dba-scripts/monitor/tablespace.alert|wc -l` -gt 0 ]
then
echo "Tablespace less than 10% free on ${DATABASE}. Please add space as necessary" >> /rac/app/oracle/product/11.2.0/dba-scripts/monitor/tablespace.tmp
cat /rac/app/oracle/product/11.2.0/dba-scripts/monitor/tablespace.alert >> /rac/app/oracle/product/11.2.0/dba-scripts/monitor/tablespace.tmp
mailx -s "Tablespace percent usage for ${DATABASE} at $datevar" $DBA < /rac/app/oracle/product/11.2.0/dba-scripts/monitor/tablespace.tmp
#####mv /disk1/tablespace.tmp /disk1/tablespace_$datevar2.alert
rm /rac/app/oracle/product/11.2.0/dba-scripts/monitor/tablespace.alert
rm /rac/app/oracle/product/11.2.0/dba-scripts/monitor/tablespace.tmp
fi
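
The threshold test at the end of the query can be tried out without a database. The sketch below is a hypothetical stand-in (the name low_space and the sample figures are made up): it reads lines of "NAME FREE_MB TOTAL_MB" and prints the tablespaces whose rounded free percentage is below the limit, mirroring the ROUND(...) < 10 condition in the script.

```shell
#!/bin/sh
# Hypothetical stand-in for the query's final filter: reads lines of
# "NAME FREE_MB TOTAL_MB" and prints tablespaces below the free-space limit.
low_space() {
    awk -v limit="${1:-10}" '
        $3 > 0 {
            pct = int(($2 / $3) * 100 + 0.5)   # round to nearest whole percent
            if (pct < limit) printf "%s %d%%\n", $1, pct
        }'
}

# Example: DATA01 is 5% free and is reported, INDEX01 is 40% free and is not.
printf 'DATA01 50 1000\nINDEX01 400 1000\n' | low_space 10
```

Passing the limit as an argument makes it easy to experiment with a different alert threshold before changing the SQL.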

  • Script to gather SGA stats in timely manner

####################################################
##
##SGA.sh
##
####################################################
export ORACLE_HOME=/rac/app/oracle/product/11.2.0
export ORACLE_SID=TEST01
export ORACLE_UNQNAME=TEST01
export PATH=$PATH:$ORACLE_HOME/bin:$ORACLE_HOME/rdbms/admin:$ORACLE_HOME/lib
export alrt=/rac/app/oracle/diag/rdbms/test01/TEST01/trace/alert_test011.log
export asmalrt=/rac/app/oracle/diag/asm/+asm/+ASM1/trace/alert_+ASM1.log
export TNS_ADMIN=$ORACLE_HOME/network/admin
MAILID=nimai.karmakar@hotmail.com
#
#
#
sqlplus -s "/ as sysdba" << SQL1
set feed off
set linesize 100
set pagesize 200
spool /rac/app/oracle/product/11.2.0/dba-scripts/monitor/logs/sga_stats.log
col inst_id format 9999999 heading "INSTANCE ID"
col sga_size/1024 format 9999999 heading "SGA SIZE"
col sga_size_factor format 9999999 heading "SGA SIZE FACTOR"
col estd_physical_reads format 9999999 heading "PHYSICAL READ"
select inst_id, sga_size/1024, sga_size_factor, estd_db_time, estd_db_time_factor, estd_physical_reads from gv\$sga_target_advice
order by inst_id, sga_size_factor;
spool off
exit;
SQL1

if [ `cat /rac/app/oracle/product/11.2.0/dba-scripts/monitor/logs/sga_stats.log|wc -l` -gt 0 ]; then
cat /rac/app/oracle/product/11.2.0/dba-scripts/monitor/logs/sga_stats.log
mailx -s "Statistics at `date +%H+%M` for `hostname`" $MAILID < /rac/app/oracle/product/11.2.0/dba-scripts/monitor/logs/sga_stats.log
fi
exit

  • Script to check Server Process (RAC)

#!/bin/bash
export PATH=$PATH:/grid/app/bin/

SUBJECT="Server Process failed for - Server `hostname` on `date '+%m/%d/%y %X %A '`"
REPSUBJECT="Server `hostname` health check report at `date +%H:%M` hours on `date +%d-%m-%Y`"
ERRLOG=$MONITOR_DIR/logs/server_process_check.log
REPORT=$MONITOR_DIR/logs/server_process_report.txt
BODY=$MONITOR_DIR/server_process_report_email_body.txt

## Delete the errorlog file if found
/usr/bin/find  $ERRLOG -type f -exec rm {} \; > /dev/null 2>&1

##Report recipients
MAILID='nimai.karmakar@hotmail.com'
chour=`date +%H%M`

if [ `grep -i "TEST01 HEALTH CHECK FOR" $REPORT | wc -l` -eq 0 ]; then
echo "----------------------------TEST01 HEALTH CHECK FOR `date +%m_%d_%Y`---------------------------------" > $REPORT
fi

echo " " >> $REPORT

echo "----------------------------CRS process status `date +%m_%d_%Y_%H:%M`----------------------------" >> $REPORT
crsctl check crs | while read outputline
do
if test `echo $outputline | grep 'online' | wc -l` -eq 0 ## This will check if the CRS process is online or not
then
echo "Date        : "`date '+%m/%d/%y %X %A '` >> $ERRLOG
echo "Details     :"`crsctl check  crs` >> $ERRLOG
echo "  " >> $ERRLOG
echo " Details     :"`crsctl check  crs` >> $REPORT
echo " " >> $REPORT
echo " Skipping other tests " >> $REPORT
echo "  " >> $ERRLOG
##mutt -s "$SUBJECT" $MAILID < $ERRLOG
/bin/mail -s "$SUBJECT" "$MAILID" < $ERRLOG
## Note: in bash this exit only leaves the subshell running the while loop
exit
else
echo " Details     :"`crsctl check  crs` >> $REPORT
echo " " >> $REPORT
fi
done

echo "----------------------------PMON process Status count on `date +%m_%d_%Y_%H:%M`----------------------------" >> $REPORT
if test `ps -ef|grep pmon|grep -v grep |wc -l` -ne 2 ## This will check the no of pmon process for each of the server
then
echo "Date        : "`date '+%m/%d/%y %X %A '` >> $ERRLOG
echo "Details     :"`ps -ef|grep pmon|grep -v 'grep' |wc -l` >> $ERRLOG
echo " " >> $ERRLOG
echo " PMON process not found. Oracle Instance on `hostname` may be down . Require immediate attention" >> $ERRLOG
echo " " >> $REPORT
echo " Skipping other tests " >> $REPORT
echo "  " >> $ERRLOG
/bin/mail -s "$SUBJECT" "$MAILID" < $ERRLOG
exit
else
echo " PMON process count :"`ps -ef|grep pmon|grep -v 'grep' |wc -l` >> $REPORT
echo " " >> $REPORT
fi

echo "----------------------------Listener Status on `date +%m_%d_%Y_%H:%M`----------------------------" >> $REPORT

##Check whether listener is running. Output should be 1
if test `ps -ef|grep tnslsnr | grep -v "grep" |wc -l` -ne 1 ##Check the no of listener running.
then
echo "Date        : "`date '+%m/%d/%y %X %A '` >> $ERRLOG
echo "Details     :"`ps -ef|grep tnslsnr |grep -v 'grep' |wc -l` >> $ERRLOG
echo " " >> $ERRLOG
echo " Listener on `hostname` may be down . Require immediate attention" >> $ERRLOG
echo " " >> $REPORT
echo "  " >> $ERRLOG
/bin/mail -s "$SUBJECT" "$MAILID" < $ERRLOG
else
echo " Listener process count :"`ps -ef|grep tnslsnr |grep -v 'grep' |wc -l` >> $REPORT
echo " " >> $REPORT
fi

echo "----------------------------Checking number of oracle processes `date +%m_%d_%Y_%H:%M`----------------------------" >> $REPORT

##Check Process count of "oracle" user. Output should be less than 1000
if test `ps -ef|grep -i oracle |wc -l` -ge 1000
then
echo "Date        : "`date '+%m/%d/%y %X %A '` >> $ERRLOG
echo "Details     : "`ps -ef|grep -i oracle|wc -l` >> $ERRLOG
echo "  " >> $ERRLOG
echo " Count of processes exceeded 1000. Require immediate attention" >>  $ERRLOG
echo "  " >> $ERRLOG
echo "  " >> $ERRLOG
/bin/mail -s "$SUBJECT" "$MAILID" < $ERRLOG
else
echo "Number of oracle processes: " `ps -ef|grep -i oracle |wc -l` >> $REPORT
echo " " >> $REPORT
fi

##Send the report at particular times (e.g 1500 hours or 2300 hours)
if [ $chour -ge 1500 -a $chour -lt 1502 ]; then
mutt  -s "$REPSUBJECT" -a $REPORT $MAILID < $BODY
if [ $? -eq 0 ]; then
cp $REPORT $MONITOR_DIR/logs/server_process_report_`date +%d-%m-%Y`.txt
> $REPORT
fi
fi

if [ $chour -ge 2350 -a $chour -lt 2355 ]; then
mutt -s "$REPSUBJECT" -a $REPORT $MAILID < $BODY
if [ $? -eq 0 ]; then
cp $REPORT $MONITOR_DIR/logs/server_process_report_`date +%d-%m-%Y`.txt
> $REPORT
fi
fi

exit

  • Script for purging old files

echo 'Setting your environment'

###ORACLE_SID=#replace with your SID
ORACLE_SID=TEST01
export ORACLE_SID
###
###BDUMP=#replace with your BDUMP path
BDUMP=/rac/app/oracle/diag/rdbms/test01/TEST01/trace
export BDUMP
###
###ADUMP=#replace with your ADUMP path
ADUMP=/rac/app/oracle/admin/TEST01/adump
export ADUMP
###
###UDUMP=#replace with your UDUMP path
UDUMP=/rac/app/oracle/diag/rdbms/test01/TEST01/trace
export UDUMP

DT=`date "+%d%m%y"`
export DT
PID=${$}
export PID
FSEQ=${PID}
export FSEQ

################## Creating Backup Dir if not exist #############################
echo 'Creating Backup Dir if not exist'
mkdir -p $BDUMP/bdump_oldfiles
mkdir -p $UDUMP/udump_oldfiles
## The audit section below works in $ADUMP/oldfiles, so create that directory
mkdir -p $ADUMP/oldfiles

#### Deleting old Alert log files and trace files################################
echo 'Deleting old Alert log files and trace files'
cd $BDUMP/bdump_oldfiles
find . -name "*.trc.gz"  -mtime +5 -exec rm {} \;
find . -name "*.log.gz"  -mtime +5 -exec rm {} \;

cd $BDUMP
ls -lrt | grep ".trc" | awk '{print "mv  "$9 " $BDUMP/bdump_oldfiles "}' > /tmp/mv$FSEQ.sh
sh /tmp/mv$FSEQ.sh
rm /tmp/mv$FSEQ.sh

#### Backup and Purging of Alert logfile #######################################
echo 'Backup and Purging of Alert logfile'
cd $BDUMP
cp alert_$ORACLE_SID.log $BDUMP/bdump_oldfiles/alert_$ORACLE_SID.log

cd $BDUMP
>  alert_$ORACLE_SID.log

#### Compression of old Alert log files ########################################
gzip -f $BDUMP/bdump_oldfiles/*.log

#### Deleting old user trace files #############################################
cd $UDUMP/udump_oldfiles
find $UDUMP/udump_oldfiles -name "*.trc.gz"  -mtime +5 -exec rm {} \;

cd $UDUMP
ls -lrt | grep ".trc" | awk '{print "mv  "$9 " $UDUMP/udump_oldfiles "}' > /tmp/mv$FSEQ.sh
sh /tmp/mv$FSEQ.sh
rm /tmp/mv$FSEQ.sh

cd $UDUMP/udump_oldfiles
ls -lrt | grep ".trc" | grep -v ".gz" | awk '{print "gzip -f  " $9 }' > /tmp/gzip$FSEQ.sh
sh  /tmp/gzip$FSEQ.sh
rm  /tmp/gzip$FSEQ.sh

#### Deleting old audit files ##################################################
cd $ADUMP/oldfiles
find $ADUMP/oldfiles -name "*.aud.gz" -mtime +5 -type f -exec rm {} \;

cd $ADUMP
ls -lrt | grep ".aud" | awk '{print "mv  "$9 " $ADUMP/oldfiles "}' > /tmp/mv$FSEQ.sh
sh /tmp/mv$FSEQ.sh
rm /tmp/mv$FSEQ.sh

cd $ADUMP/oldfiles
ls -lrt | grep ".aud" | grep -v ".gz" | awk '{print "gzip -f  " $9 }' > /tmp/gzip$FSEQ.sh
sh  /tmp/gzip$FSEQ.sh
rm  /tmp/gzip$FSEQ.sh

############################# END ######################################
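
The script above builds a temporary script from `ls` output for every move and compress step. As an alternative, here is a minimal sketch that does the same work with `find` directly. The helper name `archive_files` is hypothetical (it is not part of the original script), and the destination must be given as an absolute path because the function changes directory.

```shell
#!/bin/sh
# archive_files SRC_DIR DEST_DIR PATTERN
# Hypothetical helper: moves files matching PATTERN from the top level of
# SRC_DIR into DEST_DIR (absolute path), then gzips them there.
archive_files() {
    src=$1 dest=$2 pattern=$3
    mkdir -p "$dest" || return 1
    (
        cd "$src" || exit 1
        # "! -name . -prune" keeps find at the top level, so an archive
        # directory that lives inside SRC_DIR is never descended into
        find . ! -name . -prune -type f -name "$pattern" -exec mv {} "$dest"/ \;
    ) || return 1
    # Compress whatever is not yet compressed in the archive directory
    find "$dest" -type f -name "$pattern" -exec gzip -f {} \;
}

# Example call, using the variables defined in the script above:
# archive_files "$BDUMP" "$BDUMP/bdump_oldfiles" "*.trc"
```

Quoting the variables also keeps the move working when a file name contains spaces, which the `awk '{print $9}'` approach does not handle.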

  • This script checks the database recovery area usage and sends an alert e-mail.

#####################################################################
## db_recovery_area_usage.sh ##
##
## Author : Nimai Karmakar
#####################################################################
###!/bin/bash
##
#######
####### Start of configuration
#######
######## Oracle Environment variables ##########
##
export ORACLE_BASE=/u01/app
export ORACLE_HOME=/u01/app/oracle/product/11.2.0/test01
export JAVA_HOME=/usr/java/jdk1.6.0_30/
export PATH=${ORACLE_HOME}/bin:${JAVA_HOME}/bin:$PATH:.:
export ORACLE_SID=TEST01
##
######## Other variables #################
##
DBA=nimai.karmakar@hotmail.com
DATABASE=TEST01
datevar=$(date)
datevar2=$(date '+%Y-%m-%d-%H-%M')
##
#######
####### End of configuration
#######
##
sqlplus -s "/ as sysdba" << SQL1
set feed off
set linesize 100
set pagesize 200
spool /disk1/scripts/archive.alert
set lines 100
col Location format a60
select     name "Location"
,  floor(space_limit / 1024 / 1024/1024) "Size GB"
,  ceil(space_used  / 1024 / 1024/1024) "Used GB"
from       v\$recovery_file_dest
order by name
/
spool off
exit
SQL1
if [ `cat /disk1/scripts/archive.alert|awk '{print $3}'|wc -l` -gt 0 ]
then
cat /disk1/scripts/archive.alert > /disk1/scripts/archive.tmp
mail -s "DB Recovery area usage for ${DATABASE} at $datevar" $DBA < /disk1/scripts/archive.tmp
rm /disk1/scripts/archive.tmp
rm /disk1/scripts/archive.alert
fi
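
The script above mails the usage report on every run. If you would rather be alerted only when the recovery area is nearly full, the sketch below shows one way to turn the "Used GB" and "Size GB" figures into a percentage and compare it with a limit. The helper name `pct_used` and the 85% threshold are assumptions for illustration, not part of the original script.

```shell
#!/bin/sh
# pct_used USED LIMIT : prints the integer percentage of USED over LIMIT,
# rounded down. Hypothetical helper for illustration.
pct_used() {
    used=$1 limit=$2
    # Guard against a zero or missing limit to avoid division by zero
    if [ -z "$limit" ] || [ "$limit" -le 0 ]; then
        echo 0
        return 1
    fi
    echo $(( $used * 100 / $limit ))
}

THRESHOLD=85   # assumed alert threshold in percent; pick one that suits your site

# Example values in GB, in the form the query above reports them
if [ "$(pct_used 45 50)" -ge "$THRESHOLD" ]; then
    echo "Recovery area usage above ${THRESHOLD}%"
fi
```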

  • This script checks the alert log file for ORA errors and sends an alert e-mail if it finds any.

#####################################################################
## check_alert_log.sh ##
##
## Author : Nimai Karmakar
#####################################################################
###!/bin/bash
##
#######
####### Start of configuration
#######
######## Oracle Environment variables ##########
##
export ORACLE_BASE=/u01/app
export ORACLE_HOME=/u01/app/oracle/product/11.2.0/test01
export JAVA_HOME=/usr/java/jdk1.6.0_30/
export PATH=${ORACLE_HOME}/bin:${JAVA_HOME}/bin:$PATH:.:
export ORACLE_SID=TEST01
##
######## Other variables #################
##
DBA=nimai.karmakar@hotmail.com
datevar=$(date)
##
#######
#######
#######
####### End of configuration
#######
#######
##
cd $ORACLE_BASE/diag/rdbms/test01/$ORACLE_SID/trace/
if [ -f alert_${ORACLE_SID}.log ]
then
tail -200 alert_${ORACLE_SID}.log > /disk1/scripts/alert_work.log
grep ORA- /disk1/scripts/alert_work.log >> /disk1/scripts/alert.err
grep Shut /disk1/scripts/alert_work.log >> /disk1/scripts/alert.err
fi
error=`cat /disk1/scripts/alert.err`
export error

if [ `cat /disk1/scripts/alert.err|wc -l` -gt 0 ]
then
mailx -s "${ORACLE_SID} ORACLE ALERT ERRORS $datevar" $DBA < /disk1/scripts/alert_work.log
fi
rm -f /disk1/scripts/alert.err
rm -f /disk1/scripts/alert_work.log

exit
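
The check above goes through two work files before it decides whether to send mail. The core test can also be written as a small function, sketched below. The name `count_ora_errors` is hypothetical, and the default of 200 lines simply mirrors the `tail -200` used above.

```shell
#!/bin/sh
# count_ora_errors FILE [LINES]
# Prints how many of the last LINES lines (default 200) of FILE contain "ORA-".
# Hypothetical helper for illustration.
count_ora_errors() {
    file=$1 lines=${2:-200}
    # An unreadable or missing alert log counts as zero errors
    [ -r "$file" ] || { echo 0; return 1; }
    tail -n "$lines" "$file" | grep -c 'ORA-'
}

# Example use with the variables from the script above:
# if [ "$(count_ora_errors alert_${ORACLE_SID}.log)" -gt 0 ]; then
#     tail -n 200 alert_${ORACLE_SID}.log | mailx -s "${ORACLE_SID} ORACLE ALERT ERRORS" $DBA
# fi
```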

  • SCRIPT FOR CHECKING MOUNT POINT SPACE IN UNIX

#!/bin/ksh
######################################
#rm /rac/app/oracle/product/11.2.0/dba-scripts/monitor/logs/ERROR_LOG.txt
echo "df -k output for `date` `uname -n`" > /rac/app/oracle/product/11.2.0/dba-scripts/monitor/logs/ERROR_LOG.txt
echo " " >> /rac/app/oracle/product/11.2.0/dba-scripts/monitor/logs/ERROR_LOG.txt
echo "File system usage exceeded the threshold on `uname -n` server- `date`" >> /rac/app/oracle/product/11.2.0/dba-scripts/monitor/logs/ERROR_LOG.txt
echo " " >> /tmp/dfk.txt
i=1
while [ $i -le `df -k | grep -v proc | grep -v capacity | wc -l` ] ;do
if [ `df -k | grep -v proc | grep -v capacity | head -n $i | tail -1 | awk '{print $5}' | \sed -e 's/%//'` -gt 90 ] ; then
echo "File system usage exceeded the threshold on `uname -n` server- `date`" >> /rac/app/oracle/product/11.2.0/dba-scripts/monitor/logs/ERROR_LOG.txt
echo " " >> /rac/app/oracle/product/11.2.0/dba-scripts/monitor/logs/ERROR_LOG.txt
df -k | grep -v proc | grep -v capacity | head -n $i | tail -1 >> /rac/app/oracle/product/11.2.0/dba-scripts/monitor/logs/ERROR_LOG.txt
fi
((i=i+1))

done
## The three header lines are always present, so report only when the loop added more
if [ `cat /rac/app/oracle/product/11.2.0/dba-scripts/monitor/logs/ERROR_LOG.txt | wc -l` -gt 3 ] ; then
#cat /rac/app/oracle/product/11.2.0/dba-scripts/monitor/logs/ERROR_LOG.txt | mailx -s "File system full alert" nimai.karmakar@hotmail.com
cat /rac/app/oracle/product/11.2.0/dba-scripts/monitor/logs/ERROR_LOG.txt
else
exit
fi
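
The loop above runs `df -k` several times for every file system. The same check can be done in a single pass with `awk`, as in the sketch below. It assumes the POSIX output format of `df -P` (usage in column 5, mount point in column 6) and device names without spaces. The function name `over_threshold` is hypothetical.

```shell
#!/bin/sh
# over_threshold LIMIT
# Reads "df -P" style output on stdin and prints "mountpoint percent" for
# every file system whose usage is above LIMIT percent.
# Hypothetical helper for illustration.
over_threshold() {
    awk -v limit="$1" 'NR > 1 {
        pct = $5
        sub(/%/, "", pct)                       # strip the trailing percent sign
        if (pct + 0 > limit + 0) print $6, pct  # compare as numbers
    }'
}

# Typical use (NR > 1 skips the header line that df prints first):
# df -P -k | over_threshold 90
```

Because the function reads standard input, you can try it on a saved copy of `df` output before you schedule it.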

  • Script to check the node Eviction (RAC)

#!/bin/bash
. ~/.bash_profile

mailid='nimai.karmakar@hotmail.com'
date=`date  +%Y-%m-%d" "%H:`

alertlog=/rac/app/oracle/product/11.2.0/dba-scripts/logs/ioerr.log
errlog=err_`date --date="0 days ago" +_%d_%m_%y_%H_%M`.txt

err1=`grep -B 2 "$date" $alertlog | grep -A 2 "An I/O error" | wc -l`
err2=`grep -B 2 "$date" $alertlog | grep -A 2 "Network communication with node" | wc -l`

if [ $err1 -ge 1 -o $err2 -ge 1 ]
then
echo "Node eviction condition found in server `hostname`. Immediately check DB alert log for further action"  >> $errlog
echo "" >> $errlog
echo `grep -B 2 "$date" $alertlog | grep -A 2 "An I/O error"` >> $errlog
echo "" >> $errlog
echo `grep -B 2 "$date" $alertlog | grep -A 2 "Network communication with node"` >> $errlog
## Use the variables defined above: $mailid (not $mailto) and $errlog (not $errLog)
mutt -s "Node eviction type condition found in `hostname`" $mailid < $errlog
rm $errlog
fi

That's it. I hope it's helpful.

Thanks & Regards

Nimai Karmakar
