Tuesday, November 6, 2012

GoldenGate High availability using Clusterware

This post shows how to set up high availability for GoldenGate.

Clusterware manages the GoldenGate resources and starts, stops, and relocates the GoldenGate processes.

For 10g & 11gR1 Clusterware.

--Add VIP for Goldengate application

--As oracle, create the resource profile. Use the IP address that was pre-allocated (section 1.2.2). In the -o options, oi is the network interface, ov the VIP address, and on the netmask.
/u00/app/oracle/product/10.2.0/CRS/bin/crs_profile -create ggatevip -t application -a /u00/app/oracle/product/10.2.0/CRS/bin/usrvip -p balanced -h dummyxt200,dummyxt205 -o oi=eth0,ov=10.10.10.10,on=255.255.255.0

--Next, Register the VIP as oracle:
/u00/app/oracle/product/10.2.0/CRS/bin/crs_register ggatevip

--Because the assignment of an IP address is done by the root user, you have to set the ownership of the VIP to the root user. 

--Connect as root and execute:
sudo /u00/app/oracle/product/10.2.0/CRS/bin/crs_setperm ggatevip -o root

--As root, allow oracle to run the script to start the VIP.
sudo /u00/app/oracle/product/10.2.0/CRS/bin/crs_setperm ggatevip -u user:oracle:r-x

--Then, as oracle, start the VIP:
/u00/app/oracle/product/10.2.0/CRS/bin/crs_start ggatevip

--To validate whether the VIP is running and on which node it is running, execute:
/u00/app/oracle/product/10.2.0/CRS/bin/crs_stat ggatevip -t

Ping the GoldenGate VIP to confirm that it responds.
ping -c4 [Goldengate VIP]

Copy & Test GoldenGate action script
Copy the GoldenGate action script from the addendum of this post to $CRS_HOME/crs/public on each node.

$CRS_HOME/crs/public/ggaction.scr
/u00/app/oracle/product/10.2.0/CRS/crs/public/ggaction.scr

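Make the script executable. The command below grants full permissions to all users; a tighter mode such as 755 also leaves it executable.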
chmod 777 ggaction.scr

Once copied, test it with one of the arguments below.

/u00/app/oracle/product/10.2.0/CRS/crs/public/ggaction.scr  [start|stop|check]
/u00/app/oracle/product/10.2.0/CRS/crs/public/ggaction.scr stop
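
The manual test above can be wrapped in a small helper that runs each action in turn and prints its exit code, since the action script reports success with exit code 0 and failure with a non-zero code. This is a minimal sketch: the function name test_action_script is hypothetical and not part of GoldenGate or Clusterware, and running it against the real script will start and stop GoldenGate on that node.

```shell
#!/bin/sh
# Hypothetical helper (not an Oracle tool): run an action script with each
# action and report the exit codes.
# Usage: test_action_script /path/to/ggaction.scr [action ...]
test_action_script()
{
    script=$1
    shift
    # Default to the three actions listed above when none are given
    if [ $# -eq 0 ]
    then
        set -- start check stop
    fi
    failures=0
    for action in "$@"
    do
        "${script}" "${action}" >/dev/null 2>&1
        rc=$?
        echo "${action}: exit code ${rc}"
        if [ "${rc}" -ne 0 ]
        then
            failures=$((failures + 1))
        fi
    done
    # The return value is the number of actions that exited non-zero
    return "${failures}"
}
```

For example: test_action_script /u00/app/oracle/product/10.2.0/CRS/crs/public/ggaction.scr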

Add Goldengate application resource to cluster
--As oracle, create the profile for the GoldenGate application (-r makes ggatevip a required resource; ci=10 sets the check interval to 10 seconds).
/u00/app/oracle/product/10.2.0/CRS/bin/crs_profile -create goldengate_app -t application -r ggatevip -a /u00/app/oracle/product/10.2.0/CRS/crs/public/ggaction.scr -o ci=10

--As oracle
/u00/app/oracle/product/10.2.0/CRS/bin/crs_register goldengate_app

--As root
sudo /u00/app/oracle/product/10.2.0/CRS/bin/crs_setperm goldengate_app -o root

--As root
sudo /u00/app/oracle/product/10.2.0/CRS/bin/crs_setperm goldengate_app -u user:oracle:r-x

--As oracle
/u00/app/oracle/product/10.2.0/CRS/bin/crs_start goldengate_app

--As oracle.
/u00/app/oracle/product/10.2.0/CRS/bin/crs_stat goldengate_app -t

Name           Type           Target    State     Host
------------------------------------------------------------
ggatevip       application    ONLINE    ONLINE    dummyxt200
goldengate_app application    ONLINE    ONLINE    dummyxt200
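
The status check can also be scripted. The sketch below assumes the crs_stat -t table layout shown above (name in column 1, state in column 4, host in column 5) and that the resource name is printed in full rather than truncated. The function name crs_resource_state is hypothetical, not a Clusterware command.

```shell
#!/bin/sh
# Hypothetical helper: read `crs_stat -t` output on stdin and print
# "<state> <host>" for the named resource. Returns 0 only if it is ONLINE.
crs_resource_state()
{
    awk -v res="$1" '
        $1 == res { state = $4; host = $5; found = 1 }
        END {
            if (!found) { print "UNKNOWN"; exit 1 }
            # An OFFLINE resource has no host column
            if (host == "") print state; else print state, host
            exit (state == "ONLINE" ? 0 : 1)
        }'
}
```

For example: crs_stat -t | crs_resource_state goldengate_app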

Manage Application.
--To relocate GoldenGate to a different node:
dummyxt200@:/u00/app/oracle/product/10.2.0/CRS/crs/public :CRS $crs_relocate -f goldengate_app
Attempting to stop `goldengate_app` on member `dummyxt200`
Stop of `goldengate_app` on member `dummyxt200` succeeded.
Attempting to stop `ggatevip` on member `dummyxt200`
Stop of `ggatevip` on member `dummyxt200` succeeded.
Attempting to start `ggatevip` on member `dummyxt208`
Start of `ggatevip` on member `dummyxt208` succeeded.
Attempting to start `goldengate_app` on member `dummyxt208`
Start of `goldengate_app` on member `dummyxt208` succeeded.

--Test relocating the resource to every node in the cluster, naming the target node each time.

crs_relocate -f goldengate_app -c Node_name

--Confirm that the GoldenGate processes have started on the new node.

--Run the commands below on the node to which the GoldenGate application resource was relocated.
> ps -ef | grep mgr
root     23942     1  0 15:41 ?        00:00:00 ./mgr PARAMFILE /ORAGG/product/11.2.1/gghome10/dirprm/mgr.prm REPORTFILE /ORAGG/product/11.2.1/gghome10/dirrpt/MGR.rpt PROCESSID MGR PORT 7809
> ps -ef | grep extract
root     23958 23942  2 15:41 ?        00:00:01 /ORAGG/product/11.2.1/gghome10/extract PARAMFILE /ORAGG/product/11.2.1/gghome10/dirprm/identde.prm REPORTFILE /ORAGG/product/11.2.1/gghome10/dirrpt/IDENTDE.rpt PROCESSID IDENTDE USESUBDIRS
root     23959 23942  0 15:41 ?        00:00:00 /ORAGG/product/11.2.1/gghome10/extract PARAMFILE /ORAGG/product/11.2.1/gghome10/dirprm/identdp.prm REPORTFILE /ORAGG/product/11.2.1/gghome10/dirrpt/IDENTDP.rpt PROCESSID IDENTDP USESUBDIRS
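
Grepping for mgr and extract separately can be combined into one filter on the GoldenGate home. A minimal sketch, assuming the Linux ps -ef column layout shown above (PID in column 2, command in column 8); gg_processes is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: read `ps -ef` output on stdin and print
# "<pid> <program>" for each process that references the given GoldenGate home.
gg_processes()
{
    awk -v home="$1" '
        # Match lines that mention the home, skipping any grep command itself
        index($0, home) && $8 != "grep" {
            n = split($8, parts, "/")
            print $2, parts[n]
        }'
}
```

For example: ps -ef | gg_processes /ORAGG/product/11.2.1/gghome10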

--To stop Goldengate using clusterware. 
> crs_stop goldengate_app
Attempting to stop `goldengate_app` on member `dummyxt208`
Stop of `goldengate_app` on member `dummyxt208` succeeded.
dummyxt208 | CRS | /ORAGG/product/11.2.1/gghome10
> crs_stat -t
Name           Type           Target    State     Host
------------------------------------------------------------
ggatevip       application    ONLINE    ONLINE    dummyxt208
goldengate_app application    OFFLINE   OFFLINE

--To start Goldengate using clusterware. 
dummyxt200@:/ORAGG/product/11.2.1/gghome10 :CRS $crs_start goldengate_app
Attempting to start `ggatevip` on member `dummyxt200`
Start of `ggatevip` on member `dummyxt200` succeeded.
Attempting to start `goldengate_app` on member `dummyxt200`
Start of `goldengate_app` on member `dummyxt200` succeeded.

For 11gR2 Clusterware

--Add a VIP for the GoldenGate application

. oraenv
ORACLE_SID = [TARGG1] ? GRID
The Oracle base for ORACLE_HOME=/u00/app/11.2.0/GRID is /u00/app/oracle

--As root 
sudo /u00/app/11.2.0/GRID/bin/appvipcfg create -network=1 -ip=10.10.10.10 -vipname=mvggatevip -user=root

-----------------Output---------------
Production Copyright 2007, 2008, Oracle.All rights reserved
2012-11-04 10:24:30: Creating Resource Type
2012-11-04 10:24:30: Executing cmd: /u00/app/11.2.0/GRID/bin/crsctl add type app.appvip.type -basetype cluster_resource -file /u00/app/11.2.0/GRID/crs/template/appvip.type
2012-11-04 10:24:30: Create the Resource
2012-11-04 10:24:30: Executing cmd: /u00/app/11.2.0/GRID/bin/crsctl add resource mvggatevip -type app.appvip.type -attr USR_ORA_VIP=10.10.10.10,START_DEPENDENCIES=hard(ora.net1.network) pullup(ora.net1.network),STOP_DEPENDENCIES=hard(ora.net1.network),ACL='owner:root:rwx,pgrp:root:r-x,other::r--,user:root:r-x'

dummyxt271 | GRID | /export/home/oracle
--As root,  Allow the Oracle Grid infrastructure software owner (e.g. oracle) to run the script to start the VIP.
sudo /u00/app/11.2.0/GRID/bin/crsctl setperm resource mvggatevip -u user:oracle:r-x

--As oracle, start the VIP:
/u00/app/11.2.0/GRID/bin/crsctl start resource mvggatevip
CRS-2672: Attempting to start 'mvggatevip' on 'dummyxt274'
CRS-2676: Start of 'mvggatevip' on 'dummyxt274' succeeded


--To validate whether the VIP is running and on which node it is running, execute:
/u00/app/11.2.0/GRID/bin/crsctl status resource mvggatevip
NAME=mvggatevip
TYPE=app.appvip.type
TARGET=ONLINE
STATE=ONLINE on dummyxt274

--At this point you can also connect to another server in the subnet and ping the VIP's IP address. You should get a reply from this IP address.

ping -c4 mvggatevip

Copy & Test Goldengate Action script.
--Copy the GoldenGate action script from the addendum of this post to $GRID_HOME/crs/public on each node, then test it:
/u00/app/11.2.0/GRID/crs/public/ggaction.scr [start|stop|check]
/u00/app/11.2.0/GRID/crs/public/ggaction.scr start

Add goldengate application resource.
--As Oracle
/u00/app/11.2.0/GRID/bin/crsctl add resource ggateapp -type cluster_resource -attr  "ACTION_SCRIPT=/u00/app/11.2.0/GRID/crs/public/ggaction.scr, CHECK_INTERVAL=30, START_DEPENDENCIES='hard(mvggatevip) pullup(mvggatevip)', STOP_DEPENDENCIES='hard(mvggatevip)'"

--As Oracle 
dummyxt271 | GRID | /u00/app/11.2.0/GRID/crs/public
crsctl start resource ggateapp
CRS-2672: Attempting to start 'ggateapp' on 'dummyxt274'
CRS-2676: Start of 'ggateapp' on 'dummyxt274' succeeded

--As Oracle 
crsctl status resource ggateapp
NAME=ggateapp
TYPE=cluster_resource
TARGET=ONLINE
STATE=ONLINE on dummyxt274
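
To find the hosting node from a script, the STATE line can be parsed. A minimal sketch, assuming the NAME=/STATE= output format shown above; crsctl_online_node is a hypothetical helper name, not a crsctl subcommand.

```shell
#!/bin/sh
# Hypothetical helper: read `crsctl status resource <name>` output on stdin
# and print the node from the "STATE=ONLINE on <node>" line.
# Returns 1 if no such line is present (resource not online).
crsctl_online_node()
{
    awk -F= '
        $1 == "STATE" && $2 ~ /^ONLINE on / {
            sub(/^ONLINE on /, "", $2)
            print $2
            found = 1
        }
        END { exit (found ? 0 : 1) }'
}
```

For example: crsctl status resource ggateapp | crsctl_online_node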

--Confirm that GoldenGate has started on that node.
dummyxt274 | ORA102 | /export/home/oracle
ps -ef | grep mgr
oracle   12582     1  0 10:46 ?        00:00:00 ./mgr PARAMFILE /ORAGG/product/11.2.1/gghome11/dirprm/mgr.prm REPORTFILE /ORAGG/product/11.2.1/gghome11/dirrpt/MGR.rpt PROCESSID MGR PORT 7809

--Relocate Goldengate Application. 
crsctl relocate resource ggateapp -f
CRS-2673: Attempting to stop 'ggateapp' on 'dummyxt274'
CRS-2677: Stop of 'ggateapp' on 'dummyxt274' succeeded
CRS-2673: Attempting to stop 'mvggatevip' on 'dummyxt274'
CRS-2677: Stop of 'mvggatevip' on 'dummyxt274' succeeded
CRS-2672: Attempting to start 'mvggatevip' on 'dummyxt271'
CRS-2676: Start of 'mvggatevip' on 'dummyxt271' succeeded
CRS-2672: Attempting to start 'ggateapp' on 'dummyxt271'
CRS-2676: Start of 'ggateapp' on 'dummyxt271' succeeded

--Test relocation to each node in turn.
crsctl relocate resource ggateapp -n Node_name -f
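
The per-node relocation test can be looped. A minimal sketch using the relocate command shown above; the function name relocate_each_node and the DRY_RUN variable are hypothetical, and the node list is supplied by the caller. The loop stops at the first relocation that fails.

```shell
#!/bin/sh
# Hypothetical helper: relocate a resource to each listed node in turn.
# Set DRY_RUN=1 to print the commands instead of running them.
relocate_each_node()
{
    resource=$1
    shift
    for node in "$@"
    do
        if [ "${DRY_RUN:-0}" = "1" ]
        then
            echo "crsctl relocate resource ${resource} -n ${node} -f"
        else
            # Stop at the first relocation that fails
            crsctl relocate resource "${resource}" -n "${node}" -f || return 1
        fi
    done
}
```

For example, to preview the commands: DRY_RUN=1; relocate_each_node ggateapp dummyxt271 dummyxt274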

Goldengate Action script.
#!/bin/sh
#############################################################################
#@(#) Clusterware script to manage Golden Gate v1.0
# Script to Manage Golden Gate from Clusterware
# Change GGS_HOME and ORACLE_HOME below to match your environment.
#############################################################################
GGS_HOME=/ORAGG/product/11.2.1/gghome11
LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:${GGS_HOME}
ORACLE_HOME=/u00/app/oracle/product/11.2.0/DB
export GGS_HOME LD_LIBRARY_PATH ORACLE_HOME

# Function runCmd: run a single GGSCI command and keep its output in ${result}
runCmd()
{
    ggsciCmd=$1
    result=`${GGS_HOME}/ggsci << EOF
${ggsciCmd}
exit
EOF`
}

# Function checkMgr: exit 0 if the Manager PID recorded in MGR.pcm belongs to
# a running mgr process, otherwise exit 1
checkMgr()
{
    if [ -f "${GGS_HOME}/dirpcs/MGR.pcm" ]
    then
        pid=`cut -f8 "${GGS_HOME}/dirpcs/MGR.pcm"`
        # An empty PID would match every process, so treat it as not running
        if [ -z "${pid}" ]
        then
            exit 1
        fi
        # ps -e right-aligns the PID column, so a leading space can push the
        # PID into field 2 when the line is split on single spaces
        pidField2=`ps -e | grep "${pid}" | grep mgr | cut -d " " -f2`
        pidField1=`ps -e | grep "${pid}" | grep mgr | cut -d " " -f1`
        if [ "${pid}" = "${pidField2}" ] || [ "${pid}" = "${pidField1}" ]
        then
            exit 0
        else
            exit 1
        fi
    else
        exit 1
    fi
}

# Main code: Clusterware passes the action as the first argument
case $1 in
'start')
    runCmd 'start manager'
    runCmd 'start er *'
    sleep 5
    checkMgr
    ;;
'stop')
    runCmd 'stop er *'
    runCmd 'stop er *!'
    runCmd 'stop manager!'
    exit 0
    ;;
'check')
    checkMgr
    ;;
'clean')
    runCmd 'stop er *'
    runCmd 'stop er *!'
    runCmd 'kill er *'
    runCmd 'stop manager!'
    exit 0
    ;;
'abort')
    runCmd 'stop er *!'
    runCmd 'kill er *'
    runCmd 'stop manager!'
    exit 0
    ;;
*)
    echo "Usage: $0 {start|stop|check|clean|abort}"
    exit 1
    ;;
esac
# End of Script
#############################################################################

Reference: Oracle GoldenGate Best Practices: Oracle GoldenGate high availability using Oracle Clusterware [ID 1313703.1]

2 comments:

  1. Hi Jignesh, In 10g & 11gR1, while relocating GG to another node, it is assumed that the filesystem hosting the cfg & trail files is available. In 11gR2, ACFS can handle that, but in 10g, how does one relocate the filesystem too?

    Thanks

  2. Hi Jignesh, nice post. Thanks for sharing.
    How can the start/stop of the GoldenGate processes be automated using the resource "ggateapp" (which you created in your post)?

    Thanks, Satish