[OpenIndiana-discuss] spontaneous reboot with record in fault management
jason matthews
jason at broken.net
Mon Aug 26 19:37:56 UTC 2013
on occasion i have systems spontaneously rebooting. i can often find entries like this in fault management, but they are not particularly helpful. i suspect there is really nothing wrong and the software is generating a panic and rebooting. is there a way to mask this from any type of action, or to figure out what the source of the issue is?
in this particular case, i watched the system dump 96gb of ram onto a dedicated dump device. however, i was unable to retrieve the data afterwards and received a message from savecore that read something like 'save core: bad magic number b'
any insights would be appreciated.
thanks,
j.
root@db017:~# fmadm faulty
--------------- ------------------------------------ -------------- ---------
TIME            EVENT-ID                             MSG-ID         SEVERITY
--------------- ------------------------------------ -------------- ---------
Aug 25 20:08:47 6c3020a1-e7bf-69e3-ab37-cb68d4324a0e SUNOS-8000-J0  Major
Host        : db017
Platform    : S5520UR   Chassis_id  : ............
Product_sn  :
Fault class : defect.sunos.eft.unexpected_telemetry 50%
              fault.sunos.eft.unexpected_telemetry 50%
Problem in  : dev:////pci@0,0
              faulted and taken out of service
Description : The diagnosis engine encountered telemetry from the listed
              devices for which it was unable to perform a diagnosis -
              Refer to http://sun.com/msg/SUNOS-8000-J0 for more information.
Response    : Error reports have been logged for examination by Sun.
Impact      : Automated diagnosis and response for these events will not occur.
Action      : Ensure that the latest Solaris Kernel and Predictive Self-Healing
              (PSH) patches are installed.