Tuesday, November 18, 2014

Exadata Schedule Exachk using OEM and command Line | Exadata healthcheck OEM

There is a new plug-in to install and schedule Exachk on Exadata/Exalogic systems through 12c Grid Control.

The plug-in must first be deployed on the OMS server and then on the Exadata/Exalogic agents.

From the 12c home page: Setup (upper-right corner) -> Extensibility -> Plug-ins -> Engineered Systems -> Oracle Exadata Healthcheck.

Once it is deployed on the 12c OMS and the Exadata agents, you can create monitoring targets for the Exadata Healthcheck metrics.

https://docs.oracle.com/cd/E24628_01/install.121/e27420/toc.htm#PICHK105 

To schedule using the command line:

Download the latest Exachk from:
Oracle Exadata Database Machine exachk or HealthCheck (Doc ID 1070954.1)
Exachk Health-Check Tool for Exalogic (Doc ID 1449226.1)


Note: Exachk must be run from the Enterprise vServer in a virtual Exalogic environment.

---Check the version.

./exachk -v
EXACHK  VERSION: 12.1.0.2.1

---Set up user equivalency between the DB nodes, the storage cells, and the IB switches.

./exachk -initpresetup   

---Set up auto restart and the Exachk daemon.

./exachk -initsetup
Setting up exachk auto restart functionality using inittab
Starting exachk daemon. . .  .

---Check the auto restart and daemon status.

./exachk -initcheck
Auto restart functionality is configured.
exachk daemon is running. PID : 00000

---Schedule exachk.

AUTORUN_SCHEDULE * * * *       :- Automatic run at a specific time in daemon mode.
                 | | | |
                 | | | +----- day of week (0 - 6) (0 = Sunday ... 6 = Saturday)
                 | | +------- month (1 - 12)
                 | +--------- day of month (1 - 31)
                 +----------- hour (0 - 23)

./exachk -set "AUTORUN_SCHEDULE=23 * * 0;AUTORUN_FLAGS= -a -o v;NOTIFICATION_EMAIL=firstname.lastname@company.com;PASSWORD_CHECK_INTERVAL=1;"

---List all parameters for exachk.

./exachk -get all
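
To confirm an individual setting without listing everything, you can also query a single parameter. A minimal sketch, assuming your exachk release supports per-parameter -get (check ./exachk -h if unsure):

# verify the values just set for the daemon
./exachk -get AUTORUN_SCHEDULE
./exachk -get NOTIFICATION_EMAIL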

Sunday, November 2, 2014

Exadata Patching | Upgrade Exadata

This is a high-level, step-by-step instruction for the APR-2014 (11.2.0.4) Exadata QFSDP.

- If you are on the Jan-2015 QFSDP, do not place the PSU on an NFS mount.
- Do not use -s to shut down the cluster if you are using dbnodeupdate.sh 4.13.
- Run patchmgr -reset_force every time before you run the patch precheck.
- dbnodeupdate.sh runs only on the node you are patching (the reverse of upgrading Exalogic compute nodes).
- patchmgr always runs from a database node.


It is divided into three major parts:

- Upgrade the Exadata database server RPMs
- Upgrade the storage server image and the InfiniBand switches
- Upgrade the Grid home and Oracle homes

1 Exadata Database Server
This part requires a reboot and a shutdown of CRS on the local node. We will apply this patch in a rolling fashion.
The database node is updated by a script provided with the Exadata QFSDP zip file.
Check the usage of the script:
dbnodeupdate.sh: Exadata Database Server Patching using the DB Node Update Utility (Doc ID 1553103.1)

1.1 Copy the zip file to a local directory and inflate the dbnodeupdate script that comes with the QFSDP.

--You can skip this part if the patch is already located on local storage.

 dcli -g db_group -l root "mkdir /u01/patches/YUM/"
 dcli -g db_group -l root "cp <patch unzip location>/18370227/Infrastructure/11.2.3.3.0/ExadataDatabaseServer/p17809253_112330_Linux-x86-64.zip /u01/patches/YUM"

This inflates the dbnodeupdate.sh script:

 cd <patch_location>/18370227/Infrastructure/ExadataDBNodeUpdate/3.2
 unzip p16486998_121110_Linux-x86-64.zip

Usage of the script:

 Usage: dbnodeupdate.sh [ -u | -r | -c ] -l <baseurl|zip file> [-p] <phase> [-n] [-s] [-q] [-v] [-t] [-a] <alert.sh> [-b] [-m] | [-V] | [-h]  
 -u            Upgrade  
 -r            Rollback  
 -c            Complete post actions (relink all homes, enable GI to start)  
 -l <baseurl|zip file>  Baseurl (http or zipped iso file for the repository)  
 -s            Shutdown stack before upgrading/rolling back  
 -p            Bootstrap phase (1 or 2) only to be used when instructed by dbnodeupdate.sh  
 -q            Quiet mode (no prompting) only be used in combination with -t  
 -n            No backup will be created  
 -t            'to release' - used when in quiet mode or used when updating to one-offs/releases via 'latest' channel (requires 11.2.3.2.1)  
 -v            Verify prereqs only. Only to be used with -u and -l option  
 -b            Peform backup only  
 -a <alert.sh>      Full path to shell script used for alert trapping  
 -m            Install / update-to exadata-sun/hp-computenode-minimum only (11.2.3.3.0 and later)  
 -V            Print version  
 -h            Print usage  

1.2 Run the pre-upgrade prerequisite check.

 ./dbnodeupdate.sh -u -l /u01/patches/YUM/p18876946_112331_Linux-x86-64.zip -v  
 ##########################################################################################################################  
 #                                                                           #  
 # Guidelines for using dbnodeupdate.sh (rel. 3.53):                                            #  
 #                                                            #  
 # - Prerequisites for usage:                                               #  
 #     1. Refer to dbnodeupdate.sh options. See MOS 1553103.1                             #  
 #     2. Use the latest release of dbnodeupdate.sh. See patch 16486998                        #  
 #     3. Run the prereq check with the '-v' option.                                 #  
 #                                                            #  
 #  I.e.: ./dbnodeupdate.sh -u -l /u01/my-iso-repo.zip -v                                #  
 #     ./dbnodeupdate.sh -u -l http://my-yum-repo -v                                 #  
 #                                                            #  
 # - Prerequisite dependency check failures can happen due to customization:                       #  
 #   - The prereq check detects dependency issues that need to be addressed prior to running a successful update.    #  
 #   - Customized rpm packages may fail the built-in dependency check and system updates cannot proceed until resolved. #  
 #                                                            #  
 #  When upgrading from releases later than 11.2.2.4.2 to releases before 11.2.3.3.0:                  #  
 #   - Conflicting packages should be removed before proceeding the update.                      #  
 #                                                            #  
 #  When upgrading to releases 11.2.3.3.0 or later:                                   #  
 #   - When the 'exact' package dependency check fails 'minimum' package dependency check will be tried.        #  
 #   - When the 'minimum' package dependency check also fails,                             #  
 #    the conflicting packages should be removed before proceeding.                          #  
 #                                                            #  
 # - As part of the prereq checks and as part of the update, a number of rpms will be removed.              #  
 #  This removal is required to preserve Exadata functioning. This should not be confused with obsolete packages.    #  
 #   - See /var/log/cellos/packages_to_be_removed.txt for details on what packages will be removed.          #  
 #                                                            #  
 # - In case of any problem when filing an SR, upload the following:                           #  
 #   - /var/log/cellos/dbnodeupdate.log                                        #  
 #   - /var/log/cellos/dbnodeupdate.<runid>.diag                                    #  
 #   - where <runid> is the unique number of the failing run.                             #  
 #                                                            #  
 ##########################################################################################################################  
 Continue ? [y/n]  
 y  
  (*) 2015-02-01 15:46:28: Unzipping helpers (QFSDP_JULY2014_EXADATA/19069261/Infrastructure/ExadataDBNodeUpdate/3.53/dbupdate-helpers.zip) to /opt/oracle.SupportTools/dbnodeupdate_helpers  
  (*) 2015-02-01 15:46:29: Initializing logfile /var/log/cellos/dbnodeupdate.log  
  Warning: Active NFS and/or SMBFS mounts found on this DB node.  
       Before taking a backup or performing the actual update these need to be unmounted.  
       For the actual update (not now) dbnodeupdate.sh will try unmounting them silently.  
       During collection of system configuration (prereq) stale network mounts may cause long waits and dbnodeupdate.sh to stall  
       It is therefore recommended (not required) to unmount any active network mount now before continuing.  
 Continue ? [y/n]  
 y  
  (*) 2015-02-01 15:47:52: Collecting system configuration details. This may take a while...  
  (*) 2015-02-01 15:48:40: Validating system details for known issues and best practices. This may take a while...  
  (*) 2015-02-01 15:48:40: Checking free space in /u01/patches/YUM/iso.stage.010215154537  
  (*) 2015-02-01 15:48:40: Unzipping /u01/patches/YUM/p18876946_112331_Linux-x86-64.zip to /u01/patches/YUM/iso.stage.010215154537, this may take a while  
  (*) 2015-02-01 15:48:51: Original /etc/yum.conf moved to /etc/yum.conf.010215154537, generating new yum.conf  
  (*) 2015-02-01 15:48:51: Generating Exadata repository file /etc/yum.repos.d/Exadata-computenode.repo  
  ERROR: Duplicate entries detected in /etc/fstab. Correct settings and rerun dbnodeupdate.sh.  
  (*) 2015-02-01 15:50:03: Cleaning up iso and temp mount points  
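
The prereq run above stopped on duplicate /etc/fstab entries. A quick, hedged way to spot mount points defined more than once before rerunning the check:

# list mount points that appear more than once in /etc/fstab (comments and blank lines ignored)
awk '!/^#/ && NF {print $2}' /etc/fstab | sort | uniq -d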



1.3 Upgrade the compute node from the local YUM zip file.

This command upgrades and then reboots the compute node at the end of the patch. Including -s stops CRS on this node first.
You can monitor progress with: tail -f /var/log/cellos/dbnodeupdate.log

 ./dbnodeupdate.sh -u -l /u01/patches/YUM/p18876946_112331_Linux-x86-64.zip -n -s  
  (*) 2015-02-03 00:12:29: Cleaning up the yum cache.  
  (*) 2015-02-03 00:12:31: Performing yum package dependency check for 'exact' dependencies. This may take a while...  
  (*) 2015-02-03 00:12:33: 'Exact' package dependency check failed.  
  (*) 2015-02-03 00:12:54: Performing yum package dependency check for 'minimum' dependencies. This may take a while...  
  (*) 2015-02-03 00:12:56: 'Minimum' package dependency check succeeded.  
 Active Image version  : 11.2.3.3.0.131014.1  
 Active Kernel version : 2.6.39-400.126.1.el5uek  
 Active LVM Name    : /dev/mapper/VGExaDb-LVDbSys1  
 Inactive Image version : n/a  
 Inactive LVM Name   : /dev/mapper/VGExaDb-LVDbSys2  
 Current user id    : root  
 Action         : upgrade  
 Upgrading to      : 11.2.3.3.1.140529.1 (to exadata-sun-computenode-minimum)  
 Baseurl        : file:///var/www/html/yum/unknown/EXADATA/dbserver/030215001041/x86_64/ (iso)  
 Iso file        : /u01/patches/YUM/iso.stage.030215001041/112331_base_repo_140529.1.iso  
 Create a backup    : No  
 Shutdown stack     : Yes (Currently stack is up)  
 RPM exclusion list   : Not in use (add rpms to /etc/exadata/yum/exclusion.lst and restart dbnodeupdate.sh)  
 RPM obsolete list   : /etc/exadata/yum/obsolete.lst (lists rpms to be removed by the update)  
             : RPM obsolete list is extracted from exadata-sun-computenode-11.2.3.3.1.140529.1-1.x86_64.rpm  
 Exact dependencies   : Will fail on a next update. Update to 'exact' will be not possible. Falling back to 'minimum'  
             : See /var/log/cellos/exact_conflict_report.030215001041.txt for more details  
             : Update target switched to 'minimum'  
 Minimum dependencies  : No conflicts  
 Logfile        : /var/log/cellos/dbnodeupdate.log (runid: 030215001041)  
 Diagfile        : /var/log/cellos/dbnodeupdate.030215001041.diag  
 Server model      : SUN SERVER X4-2  
 Remote mounts exist  : Yes (dbnodeupdate.sh will try unmounting)  
 dbnodeupdate.sh rel.  : 3.53 (always check MOS 1553103.1 for the latest release of dbnodeupdate)  
 Note          : After upgrading and rebooting run './dbnodeupdate.sh -c' to finish post steps.  
 The following known issues will be checked for but require manual follow-up:  
  (*) - Issue - Yum rolling update requires fix for 11768055 when Grid Infrastructure is below 11.2.0.2 BP12  
 Continue ? [y/n]  
 Continue ? [y/n]  
 y  
  (*) 2015-02-03 00:13:06: Verifying GI and DB's are shutdown  
  (*) 2015-02-03 00:13:06: Shutting down GI and db  
  (*) 2015-02-03 00:13:36: Collecting console history for diag purposes  
  (*) 2015-02-03 00:14:00: Successfully unmounted network mount /nfs_mount/backup02  
  (*) 2015-02-03 00:14:00: Successfully unmounted network mount /nfs_mount/backup01  
  (*) 2015-02-03 00:14:05: Successfully unmounted network mount /nfs_mount/backup01  
  (*) 2015-02-03 00:14:06: Successfully unmounted network mount /nfs_mount/backup02  
  (*) 2015-02-03 00:14:06: Successfully unmounted network mount /nfs_mount/p01_bak01  
  (*) 2015-02-03 00:14:06: Successfully unmounted network mount /nfs_mount/p01_bak02  
  (*) 2015-02-03 00:14:06: Successfully unmounted network mount /nfs_mount/p01_bak01  
  (*) 2015-02-03 00:14:06: Successfully unmounted network mount /nfs_mount/p01_bak02  
  (*) 2015-02-03 00:14:06: Unmount of /boot successful  
  (*) 2015-02-03 00:14:06: Check for /dev/sda1 successful  
  (*) 2015-02-03 00:14:06: Mount of /boot successful  
  (*) 2015-02-03 00:14:06: Disabling stack from starting  
  (*) 2015-02-03 00:14:13: ExaWatcher stopped successful  
  (*) 2015-02-03 00:14:13: Validating the specified source location.  
  (*) 2015-02-03 00:14:14: Cleaning up the yum cache.  
  (*) 2015-02-03 00:14:17: Performing yum update. Node is expected to reboot when finished.  
  (*) 2015-02-03 00:16:45: Waiting for post rpm script to finish. Sleeping another 60 seconds (60 / 900)  
 Remote broadcast message (Tue Feb 3 00:16:53 2015):  
 Exadata post install steps started.  
 It may take up to 15 minutes.  
  (*) 2015-02-03 00:17:45: Waiting for post rpm script to finish. Sleeping another 60 seconds (120 / 900)  
  (*) 2015-02-03 00:18:45: Waiting for post rpm script to finish. Sleeping another 60 seconds (180 / 900)  
  (*) 2015-02-03 00:19:45: Waiting for post rpm script to finish. Sleeping another 60 seconds (240 / 900)  
  (*) 2015-02-03 00:20:45: Waiting for post rpm script to finish. Sleeping another 60 seconds (300 / 900)  
  (*) 2015-02-03 00:21:45: Waiting for post rpm script to finish. Sleeping another 60 seconds (360 / 900)  
  (*) 2015-02-03 00:22:45: Waiting for post rpm script to finish. Sleeping another 60 seconds (420 / 900)  
 Remote broadcast message (Tue Feb 3 00:23:13 2015):  
 Exadata post install steps completed.  
  (*) 2015-02-03 00:23:45: Waiting for post rpm script to finish. Sleeping another 60 seconds (480 / 900)  
  (*) 2015-02-03 00:24:46: All post steps are finished.  
  (*) 2015-02-03 00:24:46: System will reboot automatically for changes to take effect  
  (*) 2015-02-03 00:24:46: After reboot run "./dbnodeupdate.sh -c" to complete the upgrade  
  (*) 2015-02-03 00:25:05: Cleaning up iso and temp mount points  
  (*) 2015-02-03 00:25:06: Rebooting now...  
 Broadcast message from root (pts/6) (Tue Feb 3 00:25:06 2015):  
 The system is going down for reboot NOW!  
 ----------------------------  
 1st time reboot.   
 ----------------------------  
 ./dbnodeupdate.sh -c  
 Continue ? [y/n]  
 y  
  (*) 2015-02-03 01:49:49: Unzipping helpers (/QFSDP_JULY2014_EXADATA/19069261/Infrastructure/ExadataDBNodeUpdate/3.53/dbupdate-helpers.zip) to /opt/oracle.SupportTools/dbnodeupdate_helpers  
  (*) 2015-02-03 01:49:49: Initializing logfile /var/log/cellos/dbnodeupdate.log  
  (*) 2015-02-03 01:49:50: Collecting system configuration details. This may take a while...  
 Active Image version  : 11.2.3.3.1.140529.1  
 Active Kernel version : 2.6.39-400.128.17.el5uek  
 Active LVM Name    : /dev/mapper/VGExaDb-LVDbSys1  
 Inactive Image version : n/a  
 Inactive LVM Name   : /dev/mapper/VGExaDb-LVDbSys2  
 Current user id    : root  
 Action         : finish-post (validate image status, fix known issues, cleanup, relink and enable crs to auto-start)  
 Shutdown stack     : No (Currently stack is down)  
 Logfile        : /var/log/cellos/dbnodeupdate.log (runid: 030215014947)  
 Diagfile        : /var/log/cellos/dbnodeupdate.030215014947.diag  
 Server model      : SUN SERVER X4-2  
 dbnodeupdate.sh rel.  : 3.53 (always check MOS 1553103.1 for the latest release of dbnodeupdate)  
 The following known issues will be checked for but require manual follow-up:  
  (*) - Issue - Yum rolling update requires fix for 11768055 when Grid Infrastructure is below 11.2.0.2 BP12  
 Continue ? [y/n]  
 y  
  (*) 2015-02-03 01:54:28: Verifying GI and DB's are shutdown  
  (*) 2015-02-03 01:54:31: Verifying firmware updates/validations. Maximum wait time: 60 minutes.  
  (*) 2015-02-03 01:54:31: If the node reboots during this firmware update/validation, re-run './dbnodeupdate.sh -c' after the node restarts.........  
 Broadcast message from root (console) (Tue Feb 3 02:03:08 2015):  
 The system is going down for system halt NOW!  
 ----------------------------  
 2nd time reboot.   
 ----------------------------  
 [root@pwerxd01dbadm04 3.53]# ./dbnodeupdate.sh -c  
 Continue ? [y/n]  
 y  
  (*) 2015-02-03 02:13:44: Unzipping helpers (/19069261/Infrastructure/ExadataDBNodeUpdate/3.53/dbupdate-helpers.zip) to /opt/oracle.SupportTools/dbnodeupdate_helpers  
  (*) 2015-02-03 02:13:45: Initializing logfile /var/log/cellos/dbnodeupdate.log  
  (*) 2015-02-03 02:13:45: Collecting system configuration details. This may take a while...  
 Active Image version  : 11.2.3.3.1.140529.1  
 Active Kernel version : 2.6.39-400.128.17.el5uek  
 Active LVM Name    : /dev/mapper/VGExaDb-LVDbSys1  
 Inactive Image version : n/a  
 Inactive LVM Name   : /dev/mapper/VGExaDb-LVDbSys2  
 Current user id    : root  
 Action         : finish-post (validate image status, fix known issues, cleanup, relink and enable crs to auto-start)  
 Shutdown stack     : No (Currently stack is down)  
 Logfile        : /var/log/cellos/dbnodeupdate.log (runid: 030215021342)  
 Diagfile        : /var/log/cellos/dbnodeupdate.030215021342.diag  
 Server model      : SUN SERVER X4-2  
 dbnodeupdate.sh rel.  : 3.53 (always check MOS 1553103.1 for the latest release of dbnodeupdate)  
 The following known issues will be checked for but require manual follow-up:  
  (*) - Issue - Yum rolling update requires fix for 11768055 when Grid Infrastructure is below 11.2.0.2 BP12  
 Continue ? [y/n]  
 y  
  (*) 2015-02-03 01:16:00: Verifying GI and DB's are shutdown  
  (*) 2015-02-03 01:16:02: Verifying firmware updates/validations. Maximum wait time: 60 minutes.  
  (*) 2015-02-03 01:16:02: If the node reboots during this firmware update/validation, re-run './dbnodeupdate.sh -c' after the node restarts..  
  (*) 2015-02-03 01:16:02: Collecting console history for diag purposes  
  (*) 2015-02-03 01:16:23: No rpms to remove  
  (*) 2015-02-03 01:16:45: EM Agent (in /u01/app/EMbase/core/12.1.0.3.0) stopped successfully  
  (*) 2015-02-03 01:16:45: Relinking all homes  
  (*) 2015-02-03 01:16:45: Unlocking /u01/app/11.2.0.4/grid  
  (*) 2015-02-03 01:16:51: Relinking /oracle/product/11.2.0.3 as orapnacod04 (WARNING: this home is not linked with rds - relink will also be done without rds option)  
  (*) 2015-02-03 01:17:02: Relinking /oracle/product/11.2.0.3 as orapnacop01 (with rds option)  
  (*) 2015-02-03 01:17:14: Relinking /oracle/product/11.2.0.3 as orapnacoq01 (WARNING: this home is not linked with rds - relink will also be done without rds option)  
  (*) 2015-02-03 01:17:25: Relinking /u01/app/11.2.0.4/grid as grid (with rds option)  
  (*) 2015-02-03 01:17:38: Executing /u01/app/11.2.0.4/grid/crs/install/rootcrs.pl -patch  
  (*) 2015-02-03 01:19:44: Sleeping another 60 seconds while stack is starting (1/5)  
  (*) 2015-02-03 01:19:44: Stack started  
  (*) 2015-02-03 01:19:44: Enabling stack to start at reboot. Disable this when the stack should not be starting on a next boot  
  (*) 2015-02-03 01:20:28: EM Agent (in /u01/app/EMbase/core/12.1.0.3.0) started successfully  
  (*) 2015-02-03 01:20:28: All post steps are finished.  

2 Exadata Storage Server and InfiniBand Switch

2.1 Download the patch manager (patchmgr) plug-in from MOS note 1487339.1.

2.2 Preparing Exadata Cells for Patch Application

Execute the commands below as root from a compute node.
Generate rsa/dsa keys on the compute node:

 ssh-keygen -t rsa
 ssh-keygen -t dsa

Push the keys to all cells and verify connectivity:

 dcli -l root -g cell_group -k
 dcli -g cell_group -l root 'hostname -i'

2.3 Set disk_repair_time on the ASM disk groups.

 select dg.name,a.value from v$asm_diskgroup dg, v$asm_attribute a where dg.group_number=a.group_number and a.name='disk_repair_time';  
 alter diskgroup diskgroup_name set attribute 'disk_repair_time'='3.6h';  
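
A minimal sketch of running both statements from a database node, assuming the ASM environment is set for the Grid Infrastructure owner; DATA01 is a placeholder disk group name:

# check disk_repair_time on all disk groups, then raise it for one disk group
sqlplus -s "/ as sysasm" <<'EOF'
set lines 200
select dg.name, a.value
  from v$asm_diskgroup dg, v$asm_attribute a
 where dg.group_number = a.group_number
   and a.name = 'disk_repair_time';
-- DATA01 is a placeholder; substitute your own disk group name
alter diskgroup DATA01 set attribute 'disk_repair_time' = '3.6h';
EOF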

2.4 Shut down database services on the compute nodes for a NON-ROLLING patch.

NOTE: THIS PART APPLIES ONLY IF YOU DECIDE TO APPLY THE PATCH IN A NON-ROLLING FASHION. DO NOT SHUT DOWN IF THE PATCH IS ROLLING.

Run the commands below as root from a compute node:
 dcli -g dbs_group -l root "/u01/app/11.2.0/grid/bin/crsctl stop crs -f"  
 dcli -g dbs_group -l root "ps -ef | grep grid"   
 dcli -g cell_group -l root "cellcli -e alter cell shutdown services all"  
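
Before continuing with a non-rolling patch, a hedged check that all cell services are really down:

# each cell should report cellsrv/MS/RS as stopped
dcli -g cell_group -l root "cellcli -e list cell attributes cellsrvStatus,msStatus,rsStatus"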

2.5 Run the patch precheck on the storage cells using patchmgr.

 cd <patchlocation>/18370227/Infrastructure/11.2.3.3.0/ExadataStorageServer_InfiniBandSwitch/patch_11.2.3.3.0.131014.1  
 ./patchmgr -cells ~/cell_group -reset_force  
 ./patchmgr -cells cell_group -patch_check_prereq [-rolling] [-ignore_alerts] [-smtp_from "addr" -smtp_to "addr1 addr2 addr3 ..."]  


2.6 Patch the cells using patchmgr.

If the prerequisite checks pass, start the patch application. Use the -rolling option if you plan a rolling update. Use the -ignore_alerts option to ignore any open hardware alerts on the cells and continue. Use the -smtp_from and -smtp_to options to set e-mail addresses that receive patchmgr alert messages.

 ./patchmgr -cells ~/cell_group -reset_force  
 ./patchmgr -cells cell_group -patch [-rolling] [-ignore_alerts] [-smtp_from "from_email_address"] [-smtp_to "to_email_address1  to_email_address2 ..."]  

2.7 Check whether any grid disks are inactive or offline.

 dcli -g ~/cell_group -l root                     \  
 "cat /root/attempted_deactivated_by_patch_griddisks.txt | grep -v   \  
 ACTIVATE | while read line; do str=\`cellcli -e list griddisk where  \  
 name = \$line attributes name, status, asmmodestatus\`; echo \$str | \  
 grep -v \"active ONLINE\"; done"  

2.8 Run the InfiniBand switch prerequisite check.

 ./patchmgr -ibswitches -upgrade -ibswitch_precheck

2.9 Apply the patch to the InfiniBand switches.

 cd <patch location>/18370227/Infrastructure/11.2.3.3.0/ExadataStorageServer_InfiniBandSwitch/patch_11.2.3.3.0.131014.1
 ./patchmgr -ibswitches -upgrade
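
After patchmgr completes, a hedged way to confirm the new firmware release on each switch; ibs_group is a hypothetical file listing the InfiniBand switch hostnames, and root SSH equivalence to the switches is assumed:

 dcli -g ibs_group -l root version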

3 Database and Grid Home Upgrade.

3.1 Distribute the GI and OH patches to NFS or /tmp.
3.2 Install the latest OPatch and OPlan (see the version check after the OPlan commands below).
3.3 Generate the steps to patch GI using OPlan.

 <$GRID_HOME>/OPatch/oplan generateApplySteps <patch location>/18370227/database/11.2.0.4.6_QDPE_Apr2014/18371656  
 <$GRID_HOME>/OPatch/oplan generateRollbackSteps <patch location>/18370227/database/11.2.0.4.6_QDPE_Apr2014/18371656  
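
After installing the latest OPatch (step 3.2), a hedged sanity check across the database nodes; it assumes ORACLE_HOME resolves to the same path on every node:

 dcli -g ~/dbs_group -l oracle "$ORACLE_HOME/OPatch/opatch version"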

3.4 Create the OCM response file.

 dcli -g ~/dbs_group -l oracle $ORACLE_HOME/OPatch/ocm/bin/emocmrsp -output /home/oracle

3.5 Follow the instructions in the OPlan-generated file.

Thursday, October 30, 2014

Exalogic Patching - Virtual

This is the procedure to apply the Exalogic Virtual 2.0.6.1.1 (April 2014) PSU.

1 Set up the PSU
2 Run precheck for Patch
3 Create Rack History File.
4 Apply Patch on Exalogic Control servers.
5 Upgrade QDR InfiniBand (NM2-GW) Gateway Switches.
6 Update the Compute Node Base Image ROLLING.
7 Patch the OVMM, PC1, and PC2 Templates
8 Upgrade the Guest vServer  (VN02)
9 Observation
10 Reference

Important Notes About the Upgrade Procedure


- Always run ExaPatch from the PSU bundle directory (where the exapatch_descriptor.py file is located), using the full path to exapatch, from any compute node within the rack.
- Do not run ExaPatch directly on the compute node being patched.
- ExaPatch patches the NM2-GW and NM2-36P switches one switch at a time (rolling upgrade). The switch running the master subnet manager is patched after all the non-master switches.
- The upgrade procedure supports upgrading the compute nodes one node at a time (rolling upgrade). Upgrading one node at a time ensures that the hosted services and applications are not disrupted.
- Oracle recommends that these patches be applied to a test or non-production system before they are applied to the production system. The total time taken for patching the test system can be used as a baseline for scheduling the maintenance windows to patch the production system.
- Perform the patching by following the steps exactly as documented in the readme.


--------------------------------------------
1 Set up the PSU
--------------------------------------------

~~~~~~~~

1. Log in to the compute node as root.
2. cd /exalogic-lcdata/patches
3. Add execute permissions for psuSetup.sh using the chmod command. In the following example,
      execute permissions for all users are added:
      # chmod a+x psuSetup.sh
4.  Run the script:
./psuSetup.sh ZFS_IP_Address [--mountonly] [--unmountonly] [--remount] [--verbose] [--force ] [--help]

[root@DummyCN01 patches]# ./psuSetup.sh 1.1.1.10

INFO: Pre-requiste Check...
INFO: Checking for Python version...
INFO: Python version check... succeeded
INFO: Checking for Root permissions...
INFO: /exalogic-lcdata is already mounted from 1.1.1.10
INFO: /exalogic-lctools is already mounted from 1.1.1.10
INFO: Extracting PSU Bundle data to /exalogic-lcdata. This will take a few minutes
ExaBR 1.1 (build 5951)
ExaBR 1.1 (build 5951)
INFO: /exalogic-lctools Version: 14.1 and expatch Version: 1.2.1 is already installed
INFO: Installation complete.
INFO: ExaPatch is installed in /exalogic-lctools/bin/exapatch
INFO: PSU is installed in /exalogic-lcdata/patches/Virtual/18178980/
#####

--------------------------------------------
2 Run precheck for Patch
--------------------------------------------

~~~~~~~~

1) Copy the descriptor file:
cp /exalogic-lcdata/patches/Virtual/18178980/Infrastructure/2.0.6.1.1/exapatch_descriptor.py ./exalogic-lctools/bin
2) cd to the exapatch directory:
cd /exalogic-lcdata/patches/Virtual/18178980/Infrastructure/2.0.6.1.1/exalogic-lctools/bin
3) Run the ExaPatch pre-patch check:
[root@DummyCN01 bin]# exapatch -a prePatchCheck
Logging to file /var/log/exapatch_20150730200807.log
log file: /var/log/exapatch_20200730200807.log
#####

Check authentication from ExaPatch.

The following prerequisites must be fulfilled on all Exalogic Control vServers before they can be upgraded to version 12.1.4 b2500:

- When updating the Exalogic Control services, ExaPatch must be run on a compute node with TCP/IP access to all Exalogic Control vServers.
- All the Exalogic Control vServers must be running. Verify that access to vServer-EC-OVMM and to the two vServer-EC-EMOC-PC vServers is OK by running the following ExaPatch command:
         [root@compute-node]# /exalogic-lctools/bin/exapatch -a checkAuthentication
- Log in to the Exalogic Control BUI and make sure that assets (switches, storage) are managed by a single Proxy Controller (PC) at a time. If any of the assets appear to be managed by both Proxy Controllers, refer to the Troubleshooting section.
- Back up the Exalogic Control stack using the ExaBR tool, as described in Section 4.1 of the Oracle Exalogic Elastic Cloud Backup and Recovery Guide Using ExaBR.


-------------------------------------------------
3) Create Rack History File.
-------------------------------------------------

~~~~~~~~
[root@DummyCN01 bin]# ./exapatch -a getHistory
"/exalogic-lcdata/inventory/rack_history.xml"
INFO: Creating rack history file: /exalogic-lcdata/inventory/rack_history.xml
INFO: Updated rack history file: /exalogic-lcdata/inventory/rack_history.xml
#####

Make sure each component appears in the rack_history file; otherwise you will see warnings like:
WARNING: unable to update patch history: Unable to find unique identifier "xxyyzz123" for Compute-Node 1.1.557.66 in the rack history file.
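
A hedged spot check that a given node made it into the file; DummyCN01 is the example hostname used above, substitute your own hostname or the serial number shown in the warning:

grep -i DummyCN01 /exalogic-lcdata/inventory/rack_history.xml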

-------------------------------------------------
4 ) Apply Patch on Exalogic Control servers.
-------------------------------------------------

1. cd to the patch directory:
cd /exalogic-lcdata/patches/1.78980/Infrastructure/2.0.6.1.1
2. Run the ExaPatch tool as follows:

~~~~~~~~
[root@DummyCN01 2.0.6.1.1]# /exalogic-lctools/bin/exapatch -a runExtension -p Exalogic_Control/emoc_patch_extension.py exapatch_descriptor.py
Logging to file /var/log/exapatch_20200804100747.log
Enter Compute-Node root password:
Enter ILOM-ComputeNode root password:
Enter ILOM-ZFS root password:
Enter vServer-EC-EMOC-PC 1.1.557.74 root password:
Enter vServer-EC-EMOC-PC 1.1.557.75 root password:
Enter EMOC-PC-service 1.1.557.74 root password:
Enter EMOC-PC-service 1.1.557.75 root password:
INFO: EMOC-PC-service 1.1.557.74 successfully completed all pre-patch checks
INFO: EMOC-PC-service 1.1.557.75 successfully completed all pre-patch checks
INFO: EMOC-EC-service 1.1.558.21 successfully completed all pre-patch checks
Upgrading PC software on EMOC-PC-service host 1.1.557.74 from version:
        12.1.4.2330
WARNING: unable to update patch history: RackHistory source file "/exalogic-lcdata/inventory/rack_history.xml" does not exist.
Upgrading PC software on EMOC-PC-service host 1.1.557.75 from version:
        12.1.4.2330
WARNING: unable to update patch history: RackHistory source file "/exalogic-lcdata/inventory/rack_history.xml" does not exist.
INFO: EMOC-PC-service 1.1.557.74 successfully completed all post-patch checks
Completed upgrade of PC software on EMOC-PC-service host 1.1.557.74 new version:
        12.1.4.2500
INFO: EMOC-PC-service 1.1.557.75 successfully completed all post-patch checks
Completed upgrade of PC software on EMOC-PC-service host 1.1.557.75 new version:
        12.1.4.2500
Upgrading EC software on EMOC-EC-service host 1.1.558.21 from version:
        12.1.4.2330
INFO: uploading patch bundle to vServer
INFO: running upgrade scripts
WARNING: unable to update patch history: RackHistory source file "/exalogic-lcdata/inventory/rack_history.xml" does not exist.
Completed upgrade of EC software on EMOC-EC-service host 1.1.558.21 new version:
        12.1.4.2500
INFO: EMOC-EC-service 1.1.558.21 successfully completed all post-patch checks
#####

-------------------------------------------------
5) Upgrade QDR InfiniBand (NM2-GW) Gateway Switches.
-------------------------------------------------

~~~~~~~~
----Kill idle user SSH processes.
Log in as root and run the following command to identify them:
[root@nm2gw-ib01 ~]# ps -ef | grep ssh


----Reduce the timeout for idle root sessions by editing sshd_config:
 
[root@nm2gw-ib01 ~]# vi /etc/ssh/sshd_config
-- 8< -- 
ClientAliveInterval 60
ClientAliveCountMax 3
-- 8< -- 

Restart the sshd service:
    
---[root@nm2gw-ib01 ~]# service sshd restart 
Stopping sshd:                                             [  OK  ]
Starting sshd:                                             [  OK  ]
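
A hedged follow-up to confirm the new idle-session values are in place:

[root@nm2gw-ib01 ~]# grep -E '^ClientAlive(Interval|CountMax)' /etc/ssh/sshd_config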

---Verify that only one target is displayed by the sessions command, as shown in the following example:

[root@nm2gw-ib01 ~]# spsh
-> show /SP/sessions
/SP/sessions
Targets:
120350 (current)
      Properties:
      Commands:
      cd
      show
----Edit the timeout for the ILOM session:

 -> set /SP/cli timeout=1
 Set 'timeout' to '1'

---Get the gwinstance value:
showgwconfig

----Updating the Firmware of the NM2-GW Switches

This step takes approximately 41 minutes.

Log in to a compute node and cd to the patch directory:
cd /exalogic-lcdata/patches/1.78980/Infrastructure/2.0.6.1.1
Apply the patch:

    [root@compute-node]# /exalogic-lctools/bin/exapatch -a patch nm2-gw

[root@DummyCN01 2.0.6.1.1]# /exalogic-lctools/bin/exapatch -a patch nm2-gw
Logging to file /var/log/exapatch_20200804105003.log
INFO: NM2-GW-IB-Switch 1.1.557.72 pre-patch checks may run for approximately 10 minutes.
INFO: NM2-GW-IB-Switch 1.1.557.72 successfully completed all pre-patch checks
Upgrading InfiniBand software on NM2-GW-IB-Switch host 1.1.557.72 from version:
        SUN DCS gw version: 2.1.3-4
        Build time: Aug 28 2013 18.6:06
        FPGA version: 0x34
        SP board info:
        Hardware Revision: 0x0007
        Firmware Revision: 0x0000
        BIOS version: SUN0R100
        BIOS date: 06/22/2010
INFO: IB switch upgrade can take 10-15 minutes.
Completed upgrade of InfiniBand software on NM2-GW-IB-Switch host 1.1.557.72 new version:
        SUN DCS gw version: 2.1.4-1
        Build time: Jan.7 2020 11:18:57
        FPGA version: 0x34
        SP board info:
        Hardware Revision: 0x0007
        Firmware Revision: 0x0000
        BIOS version: SUN0R100
        BIOS date: 06/22/2010
INFO: NM2-GW-IB-Switch 1.1.557.72 post-patch checks may run for approximately 10 minutes.
INFO: NM2-GW-IB-Switch 1.1.557.72 successfully completed all post-patch checks
INFO: Changing master NM2-GW-IB-Switch 1.1.557.71 to standby.
INFO: NM2-GW-IB-Switch 1.1.557.71 pre-patch checks may run for approximately 10 minutes.
INFO: NM2-GW-IB-Switch 1.1.557.71 successfully completed all pre-patch checks
Upgrading InfiniBand software on NM2-GW-IB-Switch host 1.1.557.71 from version:
        SUN DCS gw version: 2.1.3-4
        Build time: Aug 28 2013 18.6:06
        FPGA version: 0x34
        SP board info:
        Hardware Revision: 0x0007
        Firmware Revision: 0x0000
        BIOS version: SUN0R100
        BIOS date: 06/22/2010
INFO: IB switch upgrade can take 10-15 minutes.
Completed upgrade of InfiniBand software on NM2-GW-IB-Switch host 1.1.557.71 new version:
        SUN DCS gw version: 2.1.4-1
        Build time: Jan.7 2020 11:18:57
        FPGA version: 0x34
        SP board info:
        Hardware Revision: 0x0007
        Firmware Revision: 0x0000
        BIOS version: SUN0R100
        BIOS date: 06/22/2010
INFO: NM2-GW-IB-Switch 1.1.557.71 post-patch checks may run for approximately 10 minutes.
INFO: NM2-GW-IB-Switch 1.1.557.71 successfully completed all post-patch checks
2020-08-04T10:50 2020-08-04T11:30:55
#####
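
A hedged post-check from a compute node: the version command on each gateway switch should now report the 2.1.4-1 release (nm2gw-ib01 is the switch name used above; the second switch name is assumed):

[root@DummyCN01 2.0.6.1.1]# ssh root@nm2gw-ib01 version
[root@DummyCN01 2.0.6.1.1]# ssh root@nm2gw-ib02 version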

-----If Exalogic and Exadata are connected via a spine switch, verify-topology will not work; use ibnetdiscover instead.

If the output does not contain all the NM2-GW switches in the rack, verify-topology will fail. To fix this, do the following:

~~~~~~~~
[root@DummyCN01 bin]# verify-topology
[ Exalogic Machine Infiniband Cabling Topology Verification Tool ]
[ERROR] switch xxyyxxzzz909090 does not configured to set ib node description.
[root@DummyCN01 bin]# ll /opt/exalogic.tools/tools/idgen.sh
-rwxr--r-- 1 root root 869 Oct 18  2013 /opt/exalogic.tools/tools/idgen.sh
[root@DummyCN01 bin]# /opt/exalogic.tools/tools/idgen.sh
spawn ssh-keygen -t dsa
Generating public/private dsa key pair.
Enter file in which to save the key (/root/.ssh/id_dsa):
/root/.ssh/id_dsa already exists.
Overwrite (y/n)? y
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /root/.ssh/id_dsa.
Your public key has been saved in /root/.ssh/id_dsa.pub.
The key fingerprint is:
root@DummyCN01.domain.com
[root@DummyCN01 bin]# /opt/exalogic.tools/tools/dcli -c 1.1.555.71,1.1.555.72 -k
root@1.1.555.72's password:
root@1.1.555.71's password:
1.1.555.71: ssh key added
1.1.555.72: ssh key added
[root@DummyCN01 bin]# /opt/exalogic.tools/tools/dcli -c 1.1.555.71,1.1.555.72 -f /opt/exalogic.tools/tools/network_tools/switch_node_desc_config.tgz -d /tmp
[root@DummyCN01 bin]# /opt/exalogic.tools/tools/dcli -c 1.1.555.71 -f /opt/exalogic.tools/tools/network_tools/remote_config_switch_node_desc.sh -d /tmp "/tmp/remote_config_switch_node_desc.sh 1"
1.1.555.71: exalogic/
1.1.555.71: exalogic/rc.local
1.1.555.71: exalogic/config_ib_desc.sh
1.1.555.71: exalogic/ib_set_node_desc.sh
1.1.555.71: Successfully configured switch node description
[root@DummyCN01 bin]# /opt/exalogic.tools/tools/dcli -c 1.1.555.72 -f /opt/exalogic.tools/tools/network_tools/remote_config_switch_node_desc.sh -d /tmp "/tmp/remote_config_switch_node_desc.sh 2"
1.1.555.72: exalogic/
1.1.555.72: exalogic/rc.local
1.1.555.72: exalogic/config_ib_desc.sh
1.1.555.72: exalogic/ib_set_node_desc.sh
1.1.555.72: Successfully configured switch node description
#####

-------------------------------------------------
6) Update the Compute Node Base Image ROLLING.
-------------------------------------------------

Note: Updating the base image on a compute node does not require a reboot of the compute node.

-----Compute nodes can be patched in the following ways:
Rolling: Patch one node at a time
o With this method only one node is being patched at any point in time, so the other nodes can continue to provide services, but the patching process takes more time.
Parallel: Patch multiple nodes simultaneously
o This method applies compute node patches/updates across multiple nodes in parallel. Using ExaPatch, you can patch a subset of nodes at a time or patch all the nodes in the Exalogic rack.
It is recommended that you patch one compute node, verify that everything works as expected, and then attempt to patch multiple compute nodes in parallel.

-----Prerequisites

Before upgrading:
Ensure that the compute node base image is at v2.0.6.1.0.
There is no need to back up the compute nodes. If a compute node ever needs to be restored, install the base image and use the Exalogic Configuration Utility (ECU) to configure it.
Ensure that at least 80 MB of free space exists in the root (/) partition. You can free up disk space by running yum clean all or by deleting files that are no longer needed in the /tmp directory. Do not delete files in the directories /var/log/xen, /var/tmp/exalogic, or /var/tmp/ebi_conf.pre20611.bak. In the /var/log directory, do not delete ExaPatch log files or the file called ebi_20611.log.
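
A hedged pre-flight check for the 80 MB requirement, run on each compute node:

# available space (MB) in the root partition
df -m / | awk 'NR==2 {print $4 " MB free on /"}'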


------Set up the patch on all nodes

Transfer psuSetup.sh from the downloaded location to this machine:
    [root@compute-node2]# scp root@compute-node1:~/psuSetup.sh .

Run psuSetup.sh, specifying the ZFS active head IPoIB address and the --mountonly option:
    [root@compute-node2]# ./psuSetup.sh 1.1.555.120 --mountonly

First, upgrade one of the compute nodes.

~~~~~~~~
[root@DummyCN01 2.0.6.1.1]# /exalogic-lctools/bin/exapatch -a patch cn -h 1.1.557.68
Logging to file /var/log/exapatch_202008041.750.log
Enter Compute-Node root password:
INFO: Compute-Node 1.1.557.68 successfully completed all pre-patch checks
Upgrading Base Image software on Compute-Node host 1.1.557.68 from version:
        Image version       : 2.0.6.1.0
        Image build version : 228455
        BIOS                : 25010600-07/08/2013
        OFED                : OFED-IOV-1.5.5-1.0.054
        InfiniBand card     : 2.11.1282
        Disk controller     : 12.12.0-.78
Completed upgrade of Base Image software on Compute-Node host 1.1.557.68 new version:
        Image version       : 2.0.6.1.1
        Image build version : 228455
        BIOS                : 25010600-07/08/2013
        OFED                : OFED-IOV-1.5.5-1.0.054
        InfiniBand card     : 2.11.1282
        Disk controller     : 12.12.0-.78
INFO: Compute-Node 1.1.557.68 successfully completed all post-patch checks
#####
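
Per the recommendation above, verify the first node before patching the rest. A hedged check (1.1.557.68 is the node just patched):

[root@DummyCN01 2.0.6.1.1]# ssh root@1.1.557.68 "imageinfo | grep 'Image version'"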


Now upgrade the rest of the compute nodes.

~~~~~~~~
[root@DummyCN02 2.0.6.1.1]#/exalogic-lctools/bin/exapatch -a patch cn -h 1.1.557.63 -h 1.1.557.64 -h 1.1.557.65 -h 1.1.557.66 -h 1.1.557.67
Logging to file /var/log/exapatch_20200804124409.log
Enter Compute-Node root password:
        Image build version : 228455
        BIOS                : 25010600-07/08/2013
        OFED                : OFED-IOV-1.5.5-1.0.054
        InfiniBand card     : 2.11.1282
        Disk controller     : 12.12.0-.78
Upgrading Base Image software on Compute-Node host 1.1.557.63 from version:
        Image version       : 2.0.6.1.0
        Image build version : 228455
        BIOS                : 25010600-07/08/2013
        OFED                : OFED-IOV-1.5.5-1.0.054
        InfiniBand card     : 2.11.1282
        Disk controller     : 12.12.0-.78
WARNING: unable to update patch history: Unable to find unique identifier "xyxyxyxy" for Compute-Node 1.1.557.66 in the rack history file.
Completed upgrade of Base Image software on Compute-Node host 1.1.557.64 new version:
        Image version       : 2.0.6.1.1
        Image build version : 228455
        BIOS                : 25010600-07/08/2013
        OFED                : OFED-IOV-1.5.5-1.0.054
        InfiniBand card     : 2.11.1282
        Disk controller     : 12.12.0-.78
INFO: Compute-Node 1.1.557.64 successfully completed all post-patch checks


#####

----Verify the upgrade.
This process takes less than 5 minutes. The main updates to a compute node are ksplice and the version update, so the node being upgraded is not rebooted. You may connect to the serial console through ILOM and monitor the upgrade process.
After all the compute nodes other than the one on which ExaPatch is running have been upgraded, log in to the second compute node (cn02) and run ExaPatch to upgrade the first compute node (cn01).
Verify the new compute node base image version:

~~~~~~~~

[root@DummyCN01 ~]# dcli -g cn_group.txt 'imageinfo | grep "Image version"'
DummyCN01: Image version       : 2.0.6.1.1

Get the image version history:

[root@DummyCN01 ~]# dcli -g cn_group.txt 'imagehistory | grep "Image version"'
DummyCN01: Image version       : 2.0.6.1.1
DummyCN01: Image version       : 2.0.6.1.0

#####



-------------------------------------------------
7 Patch the OVMM, PC1, and PC2 Templates
-------------------------------------------------
--- Prerequisites
Reboot the Exalogic Control vServer stack.
While patching a vServer or guest vServer, monitor progress from another session via xm console <id>.
All Exalogic Control vServers must be running when the patching procedure starts.


----Apply the patch.


[root@compute-node1]# /exalogic-lctools/bin/exapatch -a patch ectemplates
Patching each vServer template can take at least 25 minutes, so the total duration for this step is at least 75 minutes. During the patching process, progress can be tracked by using xm console to log in to the console of the vServer being patched. If you are tracking progress, after each vServer reboot you should run xm console again to reconnect to the vServer console.

~~~~~~~~
[root@DummyCN01 2.0.6.1.1]# /exalogic-lctools/bin/exapatch -a patch ectemplates
Logging to file /var/log/exapatch_2020080.73044.log
Enter vServer-EC-EMOC-PC 1.1.557.74 root password:
Enter vServer-EC-EMOC-PC 1.1.557.75 root password:
Enter EMOC-PC-service 1.1.557.74 root password:
Enter EMOC-PC-service 1.1.557.75 root password:
INFO: vServer-EC-OVMM 1.1.558.21 successfully completed all pre-patch checks
Upgrading Base Template on vServer-EC-OVMM host 1.1.558.21 from version:
        2.0.6.1.1
Completed upgrade of Base Template on vServer-EC-OVMM host 1.1.558.21 new version:
        2.0.6.1.1
INFO: vServer-EC-OVMM 1.1.558.21 successfully completed all post-patch checks
Completed patching VM template vServer-EC-OVMM 1.1.558.21
INFO: vServer-EC-EMOC-PC 1.1.557.74 successfully completed all pre-patch checks
Starting to patch VM template vServer-EC-EMOC-PC 1.1.557.74
Upgrading Base Template on vServer-EC-EMOC-PC host 1.1.557.74 from version:
        2.0.6.1.0
Completed upgrade of Base Template on vServer-EC-EMOC-PC host 1.1.557.74 new version:
        2.0.6.1.1
INFO: vServer-EC-EMOC-PC 1.1.557.74 successfully completed all post-patch checks
Completed patching VM template vServer-EC-EMOC-PC 1.1.557.74
INFO: vServer-EC-EMOC-PC 1.1.557.75 successfully completed all pre-patch checks
Starting to patch VM template vServer-EC-EMOC-PC 1.1.557.75
Upgrading Base Template on vServer-EC-EMOC-PC host 1.1.557.75 from version:
        2.0.6.1.0
Completed upgrade of Base Template on vServer-EC-EMOC-PC host 1.1.557.75 new version:
        2.0.6.1.1
INFO: vServer-EC-EMOC-PC 1.1.557.75 successfully completed all post-patch checks
Completed patching VM template vServer-EC-EMOC-PC 1.1.557.75
#####

----Workaround for a failed exapatch run while updating the EC templates.

Refer to April 2014 PSU Upgrade - Known Issues (Doc ID 1673860.1).

~~~~~~~~
1. On the EC vServer, back up /etc/nsswitch.conf.
2. Comment out nis, nisplus, and ldap in /etc/nsswitch.conf.
3. Go to /opt/egbt_upgrade/BaseTemplate/2.0.6.0.1/scripts/ and run ./egbt_upgrade.sh --unattended.
4. Reboot.
5. Run a forced upgrade from the compute node:
/exalogic-lctools/bin/exapatch -force -a patch ectemplates
6. Check imageinfo and imagehistory on the vServer to make sure it is at the April PSU level.
7. Restore /etc/nsswitch.conf.
#####
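
For step 6 of the workaround, a hedged verification run on the EC vServer itself:

imageinfo | grep 'Image version'
imagehistory | grep 'Image version'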

-------------------------------------------------
8 Upgrade the Guest vServer  (VN02)
-------------------------------------------------


Log in to the first compute node and run the following command:

~~~~~~~~
[root@compute-node1]# /exalogic-lctools/bin/exapatch -a patch vserver -h <vserverIP1> [-h <vServerIPn>]
To patch multiple guest vServers, specify multiple -h vserverIP options.
The following is sample console output while patching from 2.0.6.0.0:
Logging to file /var/log/exapatch_20200302134.7.log
Enter root password for guest vServers: 
INFO: vServer-Generic xx.xx.xx.xx successfully completed all pre-patch checks
Upgrading Base Template on vServer-Generic host xx.xx.xx.xx from version:
      2.0.6.0.0
Completed patching VM templates.     
#####
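
A hedged follow-up after the guest vServer patch, assuming the guest base image ships the same imageinfo utility as the EC vServers (<vserverIP1> as above):

ssh root@<vserverIP1> "imageinfo | grep 'Image version'"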

-------------------------------------------------
9 Observation
-------------------------------------------------

After upgrading each component, exapatch updates the rack_history.xml file.
If the -l option is not provided, exapatch generates a log file name itself.
The non-master IB switch is upgraded first; the master role then moves to the already-patched switch, and the patch is applied to the former master.
It would be worth checking whether existing connections are dropped during "INFO: Changing master NM2-GW-IB-Switch 1.1.557.71 to standby."
The gwinstance value may increase.
The base image update does not reboot the compute node or migrate the VMs running on it.
WARNING: unable to update patch history: Unable to find unique identifier "1337FML021" for Compute-Node 1.1.557.66 in the rack history file.
While patching a vServer or guest vServer, monitor progress from another session via xm console <id>.
All exapatch commands must be run from the /exalogic-lcdata/patches/Virtual/1.78980/Infrastructure/2.0.6.1.1 directory, using the fully qualified path to exapatch.


-------------------------------------------------
10 Reference
-------------------------------------------------

Exalogic April 2014 PSU - 2.0.6.1.1 (Linux - Physical - Exalogic X4-2) Infrastructure Upgrade Guide (Doc ID 1638838.1)
Refer to April 2014 PSU Upgrade - Known Issues (Doc ID 1673860.1)
Exalogic Infrastructure PSU Patching - Troubleshooting Guide (Doc ID 1590392.1)
Exalogic Infrastructure PSU Upgrade - ExaPatch Guide (Doc ID 1612143.1)

Thursday, August 7, 2014

Goldengate Filter data based on Before image and after Value in Extract.

Below is an example of filtering data based on the existing (before-image) value of a column and the new updated value.

This is used in OGG 11.2. In OGG 12c, use @BEFORE to get the before image.

X_COMMAND will be set to "REFRESH" if STATUS changes from Inactive to Active.

X_COMMAND=@IF (((@STRFIND(before.STATUS, "Inactive") > 0 ) and (@STRFIND(STATUS, "Active") >0 )) > 0 , "REFRESH", @CASE(@GETENV("GGHEADER", "OPTYPE"), "SQL COMPUPDATE", "UPDATE", "PK UPDATE", "UPDATE"))


You can also use this expression on its own to compare the existing and new values:
((@STRFIND(before.STATUS, "Inactive") > 0 ) and (@STRFIND(STATUS, "Active") >0 )) > 0

Monday, June 30, 2014

Goldengate Filter data using Lookup Table and SQLEXEC .


It is true that you can use multiple SQLEXEC clauses in GoldenGate.

The code below shows how to filter data from a source table using a lookup table, obtaining the filter value dynamically.

It checks whether the ORG_ID value from SOURCE_TABLE exists in LOOKUP_PARAMETER_TAB. To filter using the lookup table, I used FILTER and obtained the lookup value from a SQLEXEC. If the value matches, a second SQLEXEC executes a stored procedure, getting its parameters dynamically with the @GETVAL and @GETENV functions and passing them to the procedure.
TABLE "INV"."SOURCE_TABLE",FILTER (ORG_ID = @GETVAL(lookup.XYZ)),SQLEXEC (ID lookup, QUERY "select distinct (MASTER_ORG_ID) XYZ from ABC.LOOKUP_PARAMETER_TAB mp  where mp.MASTER_ORG_ID= :p_organization_id_X", PARAMS (p_organization_id_X=ORG_ID),BEFOREFILTER , TRACE),SQLEXEC (ID ABC.PACKAGE_TO_EXEC.PROCEDURE_TO_EXEC,SPNAME ABC.PACKAGE_TO_EXEC.PROCEDURE_TO_EXEC,PARAMS (P_DBNAME=@GETENV("DBENVIRONMENT","DBNAME"),P_TRANSACTION_ID=@GETENV("TRANSACTION", "XID"),P_COMMAND=@CASE(@GETENV("GGHEADER", "OPTYPE"), "INSERT", "INSERT", "SQL COMPUPDATE", "UPDATE", "PK UPDATE", "UPDATE","DELETE","DELETE"),P_INVENT_ITEM_ID=INVENT_ITEM_ID,P_ORGANIZATION_ID=ORG_ID),AFTERFILTER, TRACE);

Tuesday, June 17, 2014

Goldengate Stored procedure parameter in SQLEXEC with GETENV


Here GG is extracting the ABC.DUMMY table, and while it does that it also does the following:

- Checks WHERE ORGANIZATION_ID = 83.
- If true, executes procedure DUMMY_PROCEDURE_NAME in package XYZ.DUMMY_PKG (GG has a bug around fully qualifying the schema and package name as a prefix).
- While executing the procedure, it obtains values at run time and passes them to the procedure:

~ database name, using @GETENV("DBENVIRONMENT","DBNAME")
~ object name
~ INVENTORY_ITEM_ID (taken from the source column)
~ command type INSERT/UPDATE/DELETE; a @CASE function converts SQL COMPUPDATE / PK UPDATE to the string UPDATE
~ transaction XID of the database

TABLE ABC.DUMMY, WHERE (ORGANIZATION_ID = 83), &
  SQLEXEC (ID XYZ.DUMMY_PKG.DUMMY_PROCEDURE_NAME, &
    SPNAME XYZ.DUMMY_PKG.DUMMY_PROCEDURE_NAME, &
    PARAMS (P_DBNAME=@GETENV("DBENVIRONMENT","DBNAME"), &
            P_OBJECT_NAME='DUMMY', &
            P_INVENTORY_ITEM_ID=INVENTORY_ITEM_ID, &
            P_ORGANIZATION_ID=ORGANIZATION_ID, &
            P_COMMAND=@CASE(@GETENV("GGHEADER", "OPTYPE"), "INSERT", "INSERT", "SQL COMPUPDATE", "UPDATE", "PK UPDATE", "UPDATE","DELETE","DELETE"), &
            P_TRANSACTION_ID=@GETENV("TRANSACTION", "XID")));

Friday, May 30, 2014

ExaLogic | List VM on Compute Node.


As part of the ELLC toolkit, ExaPatch has a variety of commands that can also be handy for administrators.

ExaPatch is used for Exalogic patching.

We can query ExaPatch to see how many VMs are running on each compute node.

/exalogic-lctools/bin/exapatch -a listVMs
Logging to file /var/log/exapatch_20140529151107.log
Compute-Node: 10.10.100.101:
        000aaa00000600004995bfe5e81dcbdf (ExalogicControlOpsCenterPC1)
        000aaa0000060000d299f9c94afe41da (ExalogicControl)

Compute-Node: 10.10.100.102:
        000aaa00000600001f012b58d1fb85b6 (dummyvn02)
        000aaa0000060000cc95187b5e532439 (ExalogicControlOpsCenterPC2)

Compute-Node: 10.10.100.103:
        000aaa0000060000063eb61a02ae11c1 (dummyvn04)
        000aaa000006000021013dc081e56c98 (dummyvn18)

Compute-Node: 10.10.100.104:
        000aaa00000600006955f56fefc87683 (dummyvn01)

Compute-Node: 10.10.100.105:
        000aaa00000600007f2ee56ea19ed478 (dummyvn09)

Compute-Node: 10.10.100.106:
        000aaa0000060000268c78fb303c0e25 (dummyvn03)

Compute-Node: 10.10.100.107:
        000aaa00000600003056e4719030bef3 (dummyvn50)

Compute-Node: 10.10.100.108:
        000aaa000006000044490976c7f0c9a6 (dummyvn08)


Another good use of ExaPatch is getting the versions of all components in the Exalogic stack.

/exalogic-lctools/bin/exapatch -a getVersion

Wednesday, May 21, 2014

Exadata Bug-list fixed from Image 11.2.3.3 and Later.

Exadata image 11.2.3.3 and later (as on the X4-2) has the bugs below fixed.

Fixes:
-----

9964936     ENHANCE EXADATA ASR TO FILE AUTO SRS FOR (PREDICTIVE) BATTERY FAILURES
11065811     FOR CELLDISK IMPORT REQUIRED PUBLISH EVENT TO ASM + DEL ENDIANNES FRM OWNER FILE
11683510     BATTERY TEMP NORMAL CLEAR MSG RECEIVED WITHOUT ANY ALERT
11838804     ACCEPT MULTIPLE DNS SERVERS FOR ILOM WHERE APPLICABLE IN IPCONF
12357450     FIX TIMESTAMPS IN ALERTS MINED FROM SEL ON V1 SYSTEMS
12708278     CELLSRV SHOULD LOG GD NAME AND OFFSET FOR IO ERRORS
13361797     RS FAILED TO RESTART MS IN RARE SCENARIOS
13495012     ALTER CELL SNMPSUBSCRIBER COMMAND SETS TYPE=ASR ON WRONG ENTRY
13498201     REMOVE SPURIOUS [SERV CELLSRV HANG DETECTED] AFTER CELLSRV CRASH
13521330     NEED TO IMPLEMENT "LIST DATABASE" FOR THE EM IORM UI
13618724     REMOVE ERROR MESSAGE PREFIX FROM ALERT TEXT
13725681     DCLI -K CREATES DIRECTORY /~/.SSH ON SOLARIS 10 NODES.
13737794     VARIABLE-BINDINGS ORDER OF CELLCLI TESTMESSAGE IS NOT VALID.
13807139     IMPROVE RS AND CELLSRV STARTUP IP ERROR REPORTING AND DIAGNOSTICS
13822165     ADD SHOW BANNER AND HIDE STDERR ARGUMENTS TO DCLI
13838283     BETTER ERROR MESSAGE FOR CREATE CELLDISK ON PHYSICAL DISKS IN FAILURE STATUS
13923317     COMMAND PARSING ANOMALY IN CELLCLI RESULTS IN UNEXPECTED SETTINGS
13934957     CELL PATCHING PREREQ-CHECK SHOULD FAIL IF IPCONF -VERIFY IS NOT OK
13934966     CELL PATCHING PREREQ-CHECK SHOULD FAIL IF LIST ALERTHISTORY" SHOW ALERTS
13935080     PATCH SHOULD NOT STATE IT FAILED IF PREREQS ARE NOT OK.
13938302     ALERT IS NOT CREATED IF VALUE OF CL_TEMP METRIC TRESPASSED BUILD-IN THRESHOLD
13973225     MS SHOULD GENERATE ALERTS / TRAPS WHEN SAS LANES IN THE SAS EXPANDER FAIL
13977078     ASR - GRID CONTROL AND ASR TRAP DESTINATION ENTRY REMOVED
14008398     HANDLE INVALID MODEL FROM DMIDECODE MORE GRACEFULLY
14043671     CELLSRV SHUTDOWN WITHOUT FORCE FAILS IF A DISKGROUP IS DISMOUNTED
14045900     SUMMARY TEXT FOR CALIBRATE COMMAND DOES NOT REPORT ERROR 
14065914     FLASH LOGGING FEATURE SHOULD HAVE METRIC FOR BUFFER ALLOCATION FAILURES
14092566     TRACK FLASH CACHE STATE WHEN NOT CACHING
14103957     FLASH CACHE/LOG NOT CREATED ON RESTORED FLASH DISK
14107147     RESYNC TIME SHOULD NOT BE PART OF PATCH TIMEOUT
14148776     ALTER CELLDISK NAME - NEW NAME IS NOT REFLECTED BY LIST FLASHCACHE AND CACHEDBY
14165314     DUPLICATE FS ALERTS SHOULD BE SUPPRESSED
14177001     CUSTOMIZED BATTERY LEARN CYCLE IN EXADATA
14192222     CELL SERVER MODEL SHOULD REFLECT HC OR HP
14199144     MS SENDS ALERTS FROM SNMP TRAP FROM NON-LOCAL SPS
14222004     RENAMING A CELL REGENERATES NEW TEMPERATURE ALERTS AND DONOT CLEAR OLD ALERTS
14239811     MS PD STATUS SHOULD SAY 'FAILED' INSTEAD OF 'CRITICAL'
14244206     MS SCHEDULED BBU RELEARN FAILS TO CHECK BBU STATUS NOR RETRIES 
14263653     NEED PERMANENT FIX FOR 14263651
14305629     DISK INSERTION PROCESSING TOO LONG
14311898     WRONG IORM PLAN OBJECTIVE IS DISPLAYED AFTER DOWNGRADING FROM 11.2.3.2
14312177     SLOW FLASH WITH FLASHLOG ON IT CAUSES CELL SERVICES TO FAIL STARTUP
14313375     ERROR MSG ON /TMP/OC4JPATCH/7439847 : NO SUCH FILE OR DIRECTORY
14356436     ALTER FLASHCACHE FLUSH NEEDS BETTER ERROR MESSAGE FOR WTFC
14366869     CELLSRV DIES WITH SIGSEGV IN FLASHLOGSTORE::SCANACTIVETABLE ALONG LRGSAIOV TEST
14368098     MS: FD WITH CRITICAL PD STATUS STILL SHOWN AS NORMAL
14378866     INCREASE NUMBER OF GRIDDISKS ALLOWED PER BATCH TO AVOID A DOUBLE REBALANCE
14464028     CELLDISK INVALID ERROR WHEN FLASHCACHE CREATED WITH NOT PRESENT STATUS CD 
14480010     FLASH LOG NEEDS IMPROVED CONCURRENCY FOR RE-ENABLING DISKS
14502930     CRITICAL ALERT GENERATED WHEN USB IS REBUILT SUCCESSFULLY 
14505249     IORM OBJECTIVE BALANCED AND LOW_LATENCY DON'T KICK IN FOR SOLO WORKLOAD MODE
14555001     CREATE FLASHCACHE ALL CREATES FLASHCACHE OF SIZE 128M WITH NOT NORMAL CELLDISKS 
14569694     PATCHMGR -CLEANUP SHOULD CLEANING UP _PATCH_HCTAP_/ OR PROVIDE INSTRUCTIONS TO
14588372     ASR - EXADATA FAILING TO SEND SNMP PACKET DUE TO JAVA NULL POINTER ERROR
14610867     FLASH LOG REDO LOG WRITE HISTOGRAM NEEDS TO BE MORE EASILY EXPORTED
14612318     ADD CELLCLI AND SCRIPT SUPPORT TO REPLACE AND RE-ENABLE BBU
14621505     EXADATA GRIDDISK AUTO CREATION FAILED
14646784     JAVA.UTIL.NOSUCHELEMENTEXCEPTION IN MS LOG WHEN EMAIL DELIVERY RETRY FAILS
14674689     FLASH LOG ACTIVE TABLE SIZE SHOULD BE INCREASED
14692944     NEED DETAILED STATS WHY CELL IOS ARE NOT CACHED
14747900     IORM_DATABASE METRICS SHOW DBUA DATABASE METRICS
14758854     FLASHCACHE SIZE CHANGES WHEN CORRUPT CELLDISK IS MADE NORMAL
14769540     DESCRIBE GRIDDISK DOES NOT LIST THE CACHEDBY ATTRIBUTE
14770723     EXPOSE LIFE LEFT ON EACH AURA2 CARD AS A PERCENTAGE 
14803349     DOM CONFINEMENT SHOULD CONSIDER PARTNERSHIP
14808660     CELLSRV FAILED TO START, HIT ORA-600[FLASHLOGPIECELIST::ADDPIECE]
14841844     ASM DOES NOT HAVE ITS OWN DB_* DATABASE METRIC
15871310     CREATE GRIDDISK ALL COMMAND FAILS WHEN A CELLDISK IS NOT NORMAL
15897446     ASR-EXADATA CELL LOCATION OF MIB CAUSING FIFO ERRORS
15963552     POKE FROM CELLSRV IS MISSING WHEN THERE IS NO ASM METADATA
15974057     ALWAYS FLUSH FLASHCACHE FOR WRITEBACK MODE WHEN DOWNGRADING BELOW 11.2.3.2.0
15994904     CELL NEEDS EIGHTH RACK SUPPORT
16001442     FAILED CELLCLI COMMANDS HAVE ZERO EXIT STATUS
16006228     DBSERVER_BACKUP.SH TAR COMMAND SHOULD CORRECTLY HANDLE SPARSE FILES
16028248     LUN AND PHYSICALDISK INFOR NOT CORRECT FOR LIST CELLDISK AFTER RENAME
16064753     ALTER FLASHCACHE ALL WITH A FLASHCACHE ATTRIBUTE REPORTS SUCCESS
16065180     FLASHCACHEMODE SET TO WRITETHROUGH FOR WRB FLASHCACHE
16067726     MS HANG AFTER [OSSMISC:OSSMISC_TIMER_TICKS] WHEN TIME JUMPS BACKWARDS
16074182     ADD "LIST DATABASE" COMMAND TO CELLCLI AND MS
16074653     CPU IMPROVEMENTS FOR FLASHACACHE
16081052     FLASH LOG DISK SIZE SHOULD BE CORRECTLY VALIDATED
16081421     HARD DISK FAILED TO MOUNT AUTOMATICALLY AFTER FLASH DISK FAILURE AND REPLACEMENT
16092303     MS SERVER.XML FILE TRUNCATED AS PART OF A CELL POWER CYCLE
16105593     CELL RPM UPGRADE CHECKING ALL TRACE FILES, CAUSING EXCESSIVE DELAY
16174361     ALTER FLASHCACHE ALL FLUSH DOES NOT FLUSH PEER FAILURE OR POOR PERF CELLDISKS
16193439     ASR/ SNMP TRAP SYSTEM IDENTIFIER FAILURE TO BE SENT ON HALRT FAULTS
16213900     "LIST LUN LUN_NAME" COMMAND FAILS WHEN CELLSRV IS STOPPED
16231174     MS ILLEGALMONITORSTATEEXCEPTION DURING ERASE
16232311     FLASH DISK ALERT SHOULD INCLUDE THE SERIAL NUMBER OF THE WHOLE CARD
16246710     SUNDIAG SHOULD RETRIEVE LSIDIAG_FULL WHEN AURA2.X PROBLEM DETECTED
16278024     FORCE DROP OF WBFC CD GD SHOULD CHECK FOR ASM REDUNDANCY
16278105     REDUCE NUMBER OF CONCURRENT IOS IN A CELL
16371635     CLUSTER WIDE CRASH AFTER ALL CELLSRV CRASHED WITH ORA-600 [KGKPLOALLOC1]
16392070     USE PSID TO UNIQUELY IDENTIFY THE INFINIBAND HCA FOR FIRMWARE CHECK AND UPDATES
16411024     NEED SUPPRESS DISK POWER STATE CHANGE ALERTS
16411732     QUERY/CREATE TABLE ETC. FAILING WITH ORA-27614: SMART I/O FAILED DUE TO AN ERROR
16413066     LIST CELL DOES NOT DISPLAY ALL ATTRIBUTES ON RE-CREATING CELL
16417471     CELLCLI LIST METRICCURRENT FC_IO_ERRS - CELL-02016: METRIC DOES NOT EXIST: FC_IO
16463547     OSS SUPPORT FOR PERMANENT KEEP ACROSS CELLSRV BOUNCE
16472221     WBFC: ORA-600 [FCCGETGDCLS_1] DURING FDOM OFFLINE
16472355     WBFC: CELLSERV HANG DURING FDOM OFFLINE/ONLINE 
16481592     PATCHMGR CLEANUP COLLECTING IRRELEVANT CONTENT
16487249     KERNEL SYSTEM TIME DRIFTING TOO FAST FOR NTP
16495446     UNCORE FREQUENCY IS NOT DISABLED ON X3-2 DB NODES 
16501767     DURING IMAGE UPGRADE TO 11.2.3.2.1 CELL NODE DOES NOT COME UP AFTER LAST REBOOT
16508451     IMPROVE LOG FILE STITCHING BY PATCHMGR
16510225     RE-IMAGE DOESN'T ENABLE ALL CPU CORES
16537444     EXADATA: CELLSRV HANG HAPPENED IN CELLTRANSITION
16538569     CHECKDEVEACHBOOT -FIX MDALL FIX GRUB CAN FAIL WHEN MD4 NOT SYNC'ED
16585329     CONFIGURE LOGROTATE TO ROTATE AND BZIP2 ALL .LOG/.TRC FILES IN /VAR/LOG/CELLOS
16586268     CRASHCORE FILE IS OVERWRITTEN EVERY TIME IN CASE OF OS CRASHDUMP
16590105     FSCK CHECKS NOT DISABLED IN FRESH IMAGE
16591877     ASMDISKGROUPNAME, ASMDISKNAME IS NOT POPULATED CORRECTLY AFTER ASM DG RECREATION
16605828     TEST CASE FOR BUG15882436
16684067     CELLSRV 1M REMOTE RECEIVE PORT BUFFER DEPLETION
16688320     UPON SENDASRTRAP FAILURE, MS FAILED TO SEND REMAINING SNMP SUBSCRIBERS
16688982     WBFC: ASSIGNING FLASH CD IS 10 MINS DELAY AFTER CELLSRV RESTART
16694632     IORM PLAN RESET DOES NOT FREE SUB HEAP EXTENTS
16696321     LNX64-11204-OSS: CELLSRV HITS ORA-600 [DISKIOSCHED::SETPLAN:DB OTHER]
16696985     IMPROVE ALERT LOG MESSAGES WHEN NETWORK ISN'T AVAILABLE
16699385     LNX64-11204-CSS: 160 DB IN ONE CLUSTER, NODE REBOOT AFTER CSSD CRASH
16704019     A "DISK REMOVED" ALERT IS SENT WHEN A DISK ACTUALLY FAILS
16705313     DCLI HANDLES HEAD PIPE INCORRECTLY
16717229     CHANGE DEFAULT VALUE FOR VM/MIN_FREE_KBYTES TO 500MB
16745871     DISABLE THE BUILT-IN CELL AMBIENT THRESHOLD
16768684     FLASHLOG NEEDS PERFORMANCE IMPROVEMENTS BASED ON SLOB RESULTS
16769818     VERIFY-TOPOLOGY NEEDS TO ACCOMODATE NODE_DESC CHANGES & LACK OF SPINE SWITCH
16774368     DISABLE EOIB IN EXADATA IMAGE TO AVOID EXCESSIVE SM LOGGING / LOG WRAP
16775584     MS: "LIST FLASHCACHECONTENT" RETURNS DUPLICATE KEEP OBJECTS
16777412     EXPOSE FLASHCACHE BYPASS REASON METRICS (BUG 14692944 ) VIA MS
16777594     GCW:PATCHMGR CLEANUP DIDN'T HAPPEN WHEN ANY PID MATCH INSTALL.PID
16777751     DCLI DOES NOT CAPTURE REMOTE HOST IDENTIFICATION CHANGED ERROR
16782749     FIX WBFC WRITE METRICS AND ADD REDIRTY METRIC
16796626     TURN OFF OSWATCHER MAKING EXAWATCHER THE ONLY ONE IN USE
16807611     MS FILE DELETION LOGGING AND ROLLING RENAME WRAP ISSUES
16809426     EXADATA ABSOLUTE SERVICE TIME VIOLATION DETECTED ON ONE DISK AFFECTING OTHERS
16815398     REDUCE MTU SIZE ON DB NODES IB INTERFACE TO 7000
16836361     CD_IO_LOAD METRIC VALUES ARE INCORRECT AND TOO HIGH WHEN LOAD IS LOW
16845112     CELLCLI COMMANDS FAIL WITH CELL-2664: FAILED TO CREATE FLASHCACHE ERRORS
16849845     WRB: CELLSRV HANG DURING FDOM FAILURE USING SETPCI
16858835     OUTOFMEMORYERROR RS-7445 [SERV MS NOT RESPONDING] [IT WILL BE RESTARTED]
16864784     TEST NETWORK STATUS
16887059     REMOVE OFED_INFO (RPM OFED-SCRIPTS)
16903390     ORA-600 [PREDICATEDISK::WRITE_5]
16917575     HOTSPARE NOT RECLAIMED WHEN UPGRADING TO 11.2.3.2.1 
16921398     MISSING ARCFOUR CIPHER IN SSHD_CONFIG BREAKS SNAPSHOT
16932116     RESOURCE CONTROL NEEDS MULTIPLE RUNS TO GET CURRENT STATUS 
16949685     TEST DISKGROUP MOUNT WITH ONE INVALID IP IN CELLIP.ORA
16954519     UPDATE COPYRIGHT YEAR 2012 IN CELLCLI BANNER
16964406     DONT LOAD MLX4_EN DRIVER
16973508     WFC: FLASHCACHE SIZE WAS ROUNDED TO MULTIPLE OF 16MB
16977810     SYSTEM DISK IMAGE FAILED DUE TO MD4 NOT DEGRADED
16988043     DISCONTINUE CHECKSWPROFILE.SH
16992011     ALL MDS SHOWS "REMOVED" AFTER REBOOT
16998810     /OPT/ORACLE.EXAWATCHER/ARCHIVE DIRECTORY SHOULD BE ALLOWED TO BE A SYMLINK
17039567     NEED TO UPDATE /ETC/SYSCONFIG/KDUMP KDUMP_COMMANDLINE_APPEND LINE
17084429     RECLAIMDISKS.SH FAILED ABRUPTLY AND -RESTORE OPTION FAILS TO RESTORE.
17088220     ENABLE HWCHECKER IN SOLARIS EXADATA COMPUTE NODE
17157638     PARALLEL DML PRODUCES INCORRECT SQL RESULT
17214800     SUNDIAG SHOULD REFER TO EXAWATCHER INSTEAD OF OSWATCHER
17251471     SPEED UP SINGLE THREAD
17277236     ALTER LUN REENABLE ON FLASH LOG DISK CAN RESULT IN CELLSRV CRASH
17278319     SUNDIAG ENHANCEMENTS FOR ASR, ILOM, EXAWATCHER, CELL CONFINEMENT, NETWORK DATA
17285226     WBFC: CELLSRV HANG IN FLASHCACHECORE.H DUE TO DIRTY LRU QUEUE UPDATE
17295207     REPLACE ALL CURRENT USAGE OF DATE TO DATE FORMAT +'%F %T %Z' FOR LOGS/MESSAGES.
17307247     OSWATCHER NEEDS TO COLLECT MEGARAID FWTERMLOG PERIODICALLY
17313339     SYSTEM DISK REPLACEMENT FAILS ON HP V1 EQUIPMENT
17330822     LNX64-12.1-ASM,CELLSRV CRASH WITH ORA-600[~PREDICATEMAPELEMENT3]
17336036     MISLEADING ERROR ABOUT MISSING XML FILE WHEN UBIOSCONFIG FAILS
17346692     EIGHTH-RACK CONFIGURATION CHANGED TO ENABLED AFTER RESCUE
17349857     NEED TO SET BOOTWITHPINNEDCACHE TO 1 SO THAT EXADATA SYSTEM CAN BOOT WITH PINNED
17362109     ENABLE AUTOMATIC FW UPDATES ON LINUX DB NODES
17371176     DISABLE BUILDS OF NON-UEK OPTIONS FOR DB NODE UPDATE 
17383646     SERIALIZE THE INFINICHECK EXECUTION 
17390553     IORM LIMIT DIRECTIVE CAUSES OVER THROTTLING
17404812     CONCURRENTMODIFICATIONEXCEPTION IN CONFINETRANSITION
17417506     IO ERROR DETAILS FROM CACHE OBJECT
17444979     INFINICHECK FAILS IN EXPANSION STORAGE ONLY MODE
17451210     CELLCLI SERIALIZES CELLMONITOR COMMANDS TOO MUCH
17472203     ADD ALERT DESCRIPTION FOR CONFINED ALERTS; SO EMAILS DO NOT SHOW 'HARDWARE ALERT
17475687     LIST FLASHCACHE CAN DISPLAY VALUES NOT YET POPULATED BY SYNCDISKONCE
17484677     SOME NULL POINTER EXCEPTIONS IN MS
17489799     ROLLBACK FAILED ON V1 BECAUSE OF MISSING USB DEVICES
17504127     EXCESSIVE TRACE ENTRIES WHEN MS RECEIVES A SNMP TRAP
17510981     BLACKLIST EDAC MODULE FROM LOADING
17511671     CALIBRATE FAILED IN CREATING THE RAND_ALL.LUN FILE.
17511684     SET WATCHDOG_THRESH TO 30 AND SET PRINTK TO "4 4 1 7"
17512231     IMPROVE PATCHMGR ERROR MESSAGE WHEN DETECTING ACTIVE ALERTS IN ALERT HISTORY

Tuesday, March 11, 2014

GC buffer busy acquire in RAC

Links to this post

This is a classic case: the same UPDATE runs from multiple instances in a RAC database at the same time, and on top of that its execution plan is poor (a full table scan, as we will see below). The combination creates heavy block contention.

These waits used to surface as a single "gc buffer busy" event in 10g; in 11g it was split into the more granular "gc buffer busy acquire" and "gc buffer busy release" events, which report the contention at a finer level of detail.
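
If the contention is happening right now (rather than only visible in AWR), a quick check is to look at current waits across all instances. Below is a minimal sketch against GV$SESSION, assuming the standard 11g event names; adjust the event list to whatever your AWR report shows.

-- Sessions currently waiting on the split 11g "gc buffer busy" events, per instance and SQL_ID
SELECT inst_id,
       sql_id,
       event,
       COUNT(*) AS sessions_waiting
  FROM gv$session
 WHERE event IN ('gc buffer busy acquire', 'gc buffer busy release')
 GROUP BY inst_id, sql_id, event
 ORDER BY sessions_waiting DESC;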

We first analyzed the AWR reports from all three nodes for the same time frame and found "gc buffer busy acquire" and "read by other session" among the top wait events.


(AWR top wait event screenshots: Instance 1, Instance 2, Instance 3.)


So I ran a few queries that tell us more about what happened during this time frame.
You may also find this query in the Daily Performance Report post.


SELECT sql_id,
       text,
       elapsed_time,
       CPU_TIME,
       EXECUTIONS,
       ROUND (elapsed_time / EXECUTIONS, 0) avg_elp_time,
       PX_SERVERS,
       DISK_READ_BYTES,
       DISK_WRITE_BYTES,
       IO_INTERCONNECT_BYTES,
       OFFLOAD_ELIGIBLE_BYTES,
       CELL_SMART_SCAN_ONLY_BYTES,
       FLASH_CACHE_READS,
       ROWS_PROCESSED
  FROM (SELECT x.sql_id,
               SUBSTR (dhst.sql_text, 1, 30)                     text,
               ROUND (x.elapsed_time / 1000000, 0)               elapsed_time,
               ROUND (x.cpu_time / 1000000, 0)                   CPU_TIME,
               x.executions_delta                                EXECUTIONS,
               ROUND (x.DISK_READ_BYTES / 1048576, 0)            DISK_READ_BYTES,
               ROUND (x.DISK_WRITE_BYTES / 1048576, 0)           DISK_WRITE_BYTES,
               ROUND (x.IO_INTERCONNECT_BYTES / 1048576, 0)      IO_INTERCONNECT_BYTES,
               ROUND (x.OFFLOAD_ELIGIBLE_BYTES / 1048576, 0)     OFFLOAD_ELIGIBLE_BYTES,
               x.FLASH_CACHE_READS                               FLASH_CACHE_READS,
               ROUND (x.cell_smart_scan_only_BYTES / 1048576, 0) CELL_SMART_SCAN_ONLY_BYTES,
               x.ROWS_PROCESSED                                  ROWS_PROCESSED,
               x.PX_SERVERS                                      PX_SERVERS,
               ROW_NUMBER () OVER (PARTITION BY x.sql_id ORDER BY 0) rn
          FROM dba_hist_sqltext dhst,
               (SELECT dhss.sql_id                               sql_id,
                       SUM (dhss.cpu_time_delta)                 cpu_time,
                       SUM (dhss.elapsed_time_delta)             elapsed_time,
                       SUM (dhss.executions_delta)               executions_delta,
                       SUM (dhss.PHYSICAL_READ_BYTES_DELTA)      DISK_READ_BYTES,
                       SUM (dhss.PHYSICAL_WRITE_BYTES_DELTA)     DISK_WRITE_BYTES,
                       SUM (dhss.IO_INTERCONNECT_BYTES_DELTA)    IO_INTERCONNECT_BYTES,
                       SUM (dhss.IO_OFFLOAD_ELIG_BYTES_DELTA)    OFFLOAD_ELIGIBLE_BYTES,
                       SUM (dhss.OPTIMIZED_PHYSICAL_READS_DELTA) FLASH_CACHE_READS,
                       SUM (dhss.IO_OFFLOAD_RETURN_BYTES_DELTA)  cell_smart_scan_only_BYTES,
                       SUM (dhss.ROWS_PROCESSED_DELTA)           ROWS_PROCESSED,
                       SUM (dhss.PX_SERVERS_EXECS_DELTA)         PX_SERVERS
                  FROM dba_hist_sqlstat dhss
                 WHERE dhss.snap_id IN
                         (SELECT snap_id
                            FROM dba_hist_snapshot
                           ----- change the snap_id range here
                           WHERE SNAP_ID > 31822 AND SNAP_ID <= 31826)
                 GROUP BY dhss.sql_id) x
         WHERE x.sql_id = dhst.sql_id
           AND ROUND (x.elapsed_time / 1000000, 3) > 3600)
 WHERE rn = 1 AND EXECUTIONS > 0
 ORDER BY ELAPSED_TIME DESC;

---output

SQL_ID        TEXT                            ELAPSED_TIME  CPU_TIME  EXECUTIONS AVG_ELP_TIME   PX_SERVERS  DISK_READ_BYTES DISK_WRITE_BYTES IO_INTERCONNECT_BYTES OFFLOAD_ELIGIBLE_BYTES CELL_SMART_SCAN_ONLY_BYTES  FLASH_CACHE_READS ROWS_PROCESSED
------------- ------------------------------ ------------- --------- ----------- ------------ ------------ ---------------- ---------------- --------------------- ---------------------- -------------------------- -------------------- --------------
6vbxrnpxwc1mz BEGIN Sp_XYZ_DUMMY_Reversa             53686       830          98          548            0           344717                0                344824                    0                        107                    0             98
g25x6rr6x4yv7 UPDATE ABC_DUMMY_XYZACTION             44306       571          88          503            0            30440                0                 30426                    0                        -14                    0             88
74d8zqzh802xq SELECT MAX (BUSINESS_DATE || B         42335       186       14930            3            0             8293                0                  8293                    0                          0                    0          14921
g75678tr0ddmw BEGIN SP_PASSIVEPERIOD_CALC(:1         30266       150       10679            3            0             5567                0                  5567                    0                          0                    0          10678
cpz2fp6466vus SELECT CPT_TOTALHOLD_AMT FROM           9040       255         211           43            0           314231                0                314231                    0                          0                    0            209
7d4xjr17waxy7 BEGIN SP_CARD_ISSUENCE_SRT_TO_          4701       533         107           44            0           164865                0                164865                    0                          0                    0            103
c8hgnxkkr7jvz SELECT abc_ROW_ID, abc_SSNFAIL          4685       532         103           45            0           164861                0                164861                    0                          0                    0            103

Top hot objects by physical reads:

SELECT *
  FROM (SELECT do.OWNER || '.' || do.OBJECT_NAME || '..[' || do.OBJECT_TYPE || ']' AS OBJECTS,
               DHSS.INSTANCE_NUMBER             AS INST,
               SUM (DHSS.LOGICAL_READS_DELTA)   LOGICAL_READ,
               SUM (DHSS.PHYSICAL_READS_DELTA)  PHY_READ,
               SUM (DHSS.PHYSICAL_WRITES_DELTA) PHY_WRIT,
               SUM (DHSS.ITL_WAITS_DELTA)       ITL_WT,
               SUM (DHSS.ROW_LOCK_WAITS_DELTA)  ROW_LCK_WT
          FROM dba_hist_seg_stat DHSS, DBA_OBJECTS DO
         WHERE DHSS.SNAP_ID > 31822 AND DHSS.SNAP_ID <= 31826
           -- WHERE DHSS.SNAP_ID > 20135 AND DHSS.SNAP_ID <= 20183
           AND DHSS.OBJ# = DO.OBJECT_ID
         GROUP BY do.OWNER || '.' || do.OBJECT_NAME || '..[' || do.OBJECT_TYPE || ']',
                  DHSS.INSTANCE_NUMBER
         ORDER BY PHY_READ DESC)
 WHERE ROWNUM <= 40;


OBJECTS                                             INST         LOGICAL_READ             PHY_READ    PHY_WRIT   ITL_WT ROW_LCK_WT
-------------------------------------------------- ----- -------------------- -------------------- ----------- -------- ----------
corpABC.ABC_DUMMY_XYZACTION..[TABLE]                2             22388144             16357126        1797        0          0
corpABC.ABC_DUMMY_XYZACTION..[TABLE]                1             19444032             14708504        2055        0          0
corpABC.ABC_DUMMY_XYZACTION..[TABLE]                3             16945392             12908672        1367        0          0
corpABC.ABC_CAF_INFO_ENTRY..[TABLE]                 1              7789840              7788316          98        0          0
corpABC.ABC_CAF_INFO_ENTRY..[TABLE]                 2              6969952              6968505          97        0          0
corpABC.ABC_CAF_INFO_ENTRY..[TABLE]                 3              6354976              6353643         104        0          0
corpABC.ABC_APPL_PAN..[TABLE]                       3              1149808              1069323         227        0          0
corpABC.ABC_APPL_PAN..[TABLE]                       2               903440               803870         256        0          0
corpABC.XYZACTIONLOG..[TABLE]                       3              1867424               629380         689        0          0
corpABC.XYZACTIONLOG..[TABLE]                       1           1197310176               436078         723        0          0
corpABC.XYZACTIONLOG..[TABLE]                       2              1909632               372565        1255        0          0
SQL_ID affected by "gc buffer busy acquire"
SQL> SELECT INSTANCE_NUMBER,SQL_ID,COUNT(EVENT) FROM DBA_HIST_ACTIVE_SESS_HISTORY WHERE SNAP_ID > 31822 AND SNAP_ID<= 31826 AND EVENT LIKE '%busy acquire%' group by sql_id,INSTANCE_NUMBER order by 2 desc;

INSTANCE_NUMBER SQL_ID                COUNT(EVENT)
-------------   ------------- --------------------
            3   g25x6rr6x4yv7                 619
            2   g25x6rr6x4yv7                 548
            1   g25x6rr6x4yv7                 192
SQL_ID affected by "read by other session"
SQL> SELECT SQL_ID,COUNT(EVENT) FROM DBA_HIST_ACTIVE_SESS_HISTORY WHERE SNAP_ID > 31822 AND SNAP_ID<= 31826 AND EVENT LIKE '%read by%' group by sql_id order by 2 desc;

INSTANCE_NUMBER SQL_ID                COUNT(EVENT)
-------------   ------------- --------------------
            3   g25x6rr6x4yv7                 764
            2   g25x6rr6x4yv7                 658
            1   g25x6rr6x4yv7                 169
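The same ASH history can also point at the segment behind the waits, not just the SQL_ID. The sketch below groups DBA_HIST_ACTIVE_SESS_HISTORY by CURRENT_OBJ# for the same snapshot range; CURRENT_OBJ# is only populated for samples that recorded an object, so treat the counts as indicative rather than exact.

-- Objects against which the "gc buffer busy acquire" samples were recorded
SELECT o.owner || '.' || o.object_name AS object_name,
       COUNT(*)                        AS wait_samples
  FROM dba_hist_active_sess_history ash,
       dba_objects o
 WHERE ash.snap_id > 31822 AND ash.snap_id <= 31826
   AND ash.event = 'gc buffer busy acquire'
   AND ash.current_obj# = o.object_id
 GROUP BY o.owner || '.' || o.object_name
 ORDER BY wait_samples DESC;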
Now let's see the execution plan of the most affected SQL_ID:
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY_AWR('g25x6rr6x4yv7'));


--------------------
UPDATE ABC_DUMMY_XYZACTION SET ABC_TOTALHOLD_AMT = TRIM(TO_CHAR(:B4, '77777777777')), ABC_DUMMY_VALIDFLAG = 'N',
ABC_XYZACTION_FLAG = 'R' WHERE ABC_RRN = :B3 AND ABC_TXN_DATE = :B2 AND ABC_INST_CODE = :B1

Plan hash value: 4161612620
----------------------------------------------------------------------------------------------
| Id  | Operation          | Name                    | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------
|   0 | UPDATE STATEMENT   |                         |       |       | 12704 (100)|          |
|   1 |  UPDATE            |     ABC_DUMMY_XYZACTION |       |       |            |          |
|   2 |   TABLE ACCESS FULL|     ABC_DUMMY_XYZACTION |     1 |   106 | 12704   (1)| 00:02:33 |
----------------------------------------------------------------------------------------------


And of course this plan generates GC wait events: every execution from every instance scans the whole table. As you can see, corpABC.ABC_DUMMY_XYZACTION is also the top object for physical reads. We created an index on a few columns, which resolved the issue.
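
For reference, below is a minimal sketch of the kind of index that removes the full scan for this UPDATE. The index name is made up for illustration, and the exact column list and order to use depends on the selectivity of the predicate columns (ABC_RRN, ABC_TXN_DATE, ABC_INST_CODE) in the actual data.

-- Hypothetical index on the UPDATE's predicate columns, built online to avoid blocking DML
CREATE INDEX corpABC.IX_ABC_DUMMY_XYZ_RRN
    ON corpABC.ABC_DUMMY_XYZACTION (ABC_RRN, ABC_TXN_DATE, ABC_INST_CODE)
    ONLINE;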