Tuesday, October 8, 2013

Identifying and Diagnosing Hardware Failures Using ILOM on Exadata

Exadata uses the Sun ILOM (Integrated Lights Out Manager) to identify faulty hardware.

Below are some steps to identify and diagnose hardware faults on an Exadata machine from the ILOM command line.

You can take an ILOM snapshot and upload it to the SR. Using the snapshot, ASR will help you diagnose the problem and provide further support.

To take a snapshot, log in to the ILOM.

Oracle(R) Integrated Lights Out Manager

Version 3.1.2.10 r74387

Copyright (c) 2012, Oracle and/or its affiliates. All rights reserved.

-> help
The help command is used to view information about commands and targets

Usage: help [-format wrap|nowrap] [-o|-output terse|verbose]
[<command>|legal|targets|<target>|<target> <property>]

Special characters used in the help command are
[]   encloses optional keywords or options
<>   encloses a description of the keyword
     (If <> is not present, an actual keyword is indicated)
|    indicates a choice of keywords or options

help <target>              displays description if this target and its
properties
help <target> <property>   displays description of this property of this target
help targets               displays a list of targets
help legal                 displays the product legal notice

Commands are:
cd
create
delete
dump
exit
help
load
reset
set
show
start
stop
version
--- Choose which type of snapshot you want to take.

-> set /SP/diag/snapshot dataset=normal|full
-> set /SP/diag/snapshot dump_uri=ftp://username:pwd@host_ip_address/~
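
If you take snapshots often, it can help to generate the two set commands from a small script rather than typing the URI by hand. The following is only a minimal Python sketch: the function name snapshot_commands and its parameters are hypothetical, it only knows the two dataset values mentioned above, and it only builds the ftp form of the URI. It prints the command lines; it does not connect to the ILOM.

```python
# Hypothetical helper: builds the ILOM CLI lines for a snapshot.
# Assumption: only the dataset values named in this post are accepted.
ALLOWED_DATASETS = {"normal", "full"}


def snapshot_commands(dataset, user, password, host, path="~"):
    """Return the two 'set /SP/diag/snapshot ...' lines as strings."""
    if dataset not in ALLOWED_DATASETS:
        raise ValueError(f"unsupported dataset: {dataset!r}")
    uri = f"ftp://{user}:{password}@{host}/{path}"
    return [
        f"set /SP/diag/snapshot dataset={dataset}",
        f"set /SP/diag/snapshot dump_uri={uri}",
    ]


if __name__ == "__main__":
    # Placeholder credentials, same as in the command template above.
    for line in snapshot_commands("normal", "username", "pwd", "host_ip_address"):
        print("-> " + line)
```

The printed lines can then be pasted at the ILOM prompt.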


Identifying a Hardware Failure.

1) Method: list the faulted components.
-> show /SP/faultmgmt

 /SP/faultmgmt
    Targets:
        shell
        0 (/SYS/MB/P0/D7)
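
When you check several servers, the faulted FRU paths can be pulled out of this output with a short script. This is a rough Python sketch under the assumption that every fault appears in the Targets list as an index followed by a /SYS path in parentheses, as in the output above. The name faulted_frus is hypothetical, and the sample text is copied from this post.

```python
import re

# Matches lines such as "        0 (/SYS/MB/P0/D7)".
TARGET_RE = re.compile(r"^\s*(\d+)\s+\((/SYS/[^)]+)\)\s*$")


def faulted_frus(show_output):
    """Return {fault_index: fru_path} from 'show /SP/faultmgmt' text."""
    frus = {}
    for line in show_output.splitlines():
        match = TARGET_RE.match(line)
        if match:
            frus[int(match.group(1))] = match.group(2)
    return frus


SAMPLE = """\
 /SP/faultmgmt
    Targets:
        shell
        0 (/SYS/MB/P0/D7)
"""

if __name__ == "__main__":
    print(faulted_frus(SAMPLE))  # {0: '/SYS/MB/P0/D7'}
```

An empty dictionary means no faulted target was listed.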

2) Method, which is very detailed.

-> show -o table -level all /SP/faultmgmt


Target              | Property               | Value
--------------------+------------------------+---------------------------------
/SP/faultmgmt/0     | fru                    | /SYS/MB/P0/D7
/SP/faultmgmt/0/    | class                  | fault.memory.intel.sb.dimm_ce
 faults/0           |                        |
/SP/faultmgmt/0/    | sunw-msg-id            | SPX86-8004-CE
 faults/0           |                        |
/SP/faultmgmt/0/    | component              | /SYS/MB/P0/D7
 faults/0           |                        |
/SP/faultmgmt/0/    | uuid                   | 34d4bfaa-dummy-ebc8-f95a-dummy-
 faults/0           |                        | d17a
/SP/faultmgmt/0/    | timestamp              | 2013-10-05/23:13:06
 faults/0           |                        |
/SP/faultmgmt/0/    | fru_part_number        | 001-0003
 faults/0           |                        |
/SP/faultmgmt/0/    | fru_dash_level         | 01
 faults/0           |                        |
/SP/faultmgmt/0/    | fru_rev_level          | 50
 faults/0           |                        |
/SP/faultmgmt/0/    | fru_serial_number      | 0000dummy00000dummy
 faults/0           |                        |
/SP/faultmgmt/0/    | fru_manufacturer       | Hynix Semiconductor Inc.
 faults/0           |                        |
/SP/faultmgmt/0/    | fru_name               | 8192MB DDR3 SDRAM DIMM
 faults/0           |                        |
/SP/faultmgmt/0/    | system_manufacturer    | Oracle Corporation
 faults/0           |                        |
/SP/faultmgmt/0/    | system_name            | Exadata X3-2
 faults/0           |                        |
/SP/faultmgmt/0/    | system_part_number     | Exadata X3-2
 faults/0           |                        |
/SP/faultmgmt/0/    | system_serial_number   | AK00122916
 faults/0           |                        |
/SP/faultmgmt/0/    | chassis_manufacturer   | Oracle Corporation
 faults/0           |                        |
/SP/faultmgmt/0/    | chassis_name           | SUN FIRE X4270 M3
 faults/0           |                        |
/SP/faultmgmt/0/    | chassis_part_number    | 700000000
 faults/0           |                        |
/SP/faultmgmt/0/    | chassis_serial_number  | 1323XXXXX03F
 faults/0           |                        |
/SP/faultmgmt/0/    | system_component_manuf | Oracle Corporation
 faults/0           | acturer                |
/SP/faultmgmt/0/    | system_component_name  | SUN FIRE X4270 M3
 faults/0           |                        |
/SP/faultmgmt/0/    | system_component_part_ | 70000000
 faults/0           | number                 |
/SP/faultmgmt/0/    | system_component_seria | 1323XXXXX03F
 faults/0           | l_number               |
/SP/faultmgmt/0/    | serd_count             | 0x7b
 faults/0           |                        |
/SP/faultmgmt/0/    | _list_idx              | 0
 faults/0           |                        |
/SP/faultmgmt/0/    | _list_sz               | 1
 faults/0           |                        |
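
The table output wraps long targets, property names and values onto a second line, which makes it awkward to grep. The Python sketch below shows one way to stitch the wrapped rows back together. It assumes the layout shown above: three columns separated by "|", new rows starting in the first column, and continuation rows starting with a space. Wrapped pieces are joined without a space, which suits the identifiers here but may not suit free-text values. The name parse_ilom_table is hypothetical, and the sample is a subset of the rows above.

```python
def parse_ilom_table(text):
    """Return (target, property, value) tuples with wrapped cells rejoined."""
    rows = []
    for line in text.splitlines():
        cells = line.split("|", 2)
        if len(cells) != 3:
            continue  # blank lines and the ---+--- separator have no 3 cells
        if cells[0].strip() == "Target":
            continue  # header row
        if cells[0].startswith(" ") and rows:
            # Continuation row: append each fragment to the previous row.
            rows[-1] = [old + new.strip() for old, new in zip(rows[-1], cells)]
        else:
            rows.append([cell.strip() for cell in cells])
    return [tuple(row) for row in rows]


SAMPLE = """\
Target              | Property               | Value
--------------------+------------------------+---------------------------------
/SP/faultmgmt/0     | fru                    | /SYS/MB/P0/D7
/SP/faultmgmt/0/    | class                  | fault.memory.intel.sb.dimm_ce
 faults/0           |                        |
/SP/faultmgmt/0/    | system_component_manuf | Oracle Corporation
 faults/0           | acturer                |
/SP/faultmgmt/0/    | serd_count             | 0x7b
 faults/0           |                        |
"""

if __name__ == "__main__":
    fault = {prop: value for target, prop, value in parse_ilom_table(SAMPLE)
             if target == "/SP/faultmgmt/0/faults/0"}
    print(fault["class"])
    # serd_count is printed in hex; 0x7b is 123 in decimal.
    print(int(fault["serd_count"], 16))
```

With the rows rejoined, properties such as class, sunw-msg-id and fru_serial_number can be looked up by name when filling in the SR.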