[OpenIndiana-discuss] system hang
gonczi at comcast.net
Wed Mar 30 23:04:23 UTC 2011
Hi Ben,
The first thing I usually try when a hang happens is loading the kernel
debugger (before the hang happens, of course).
First, make sure you shut off the graphical console (svcadm disable gdm).
This is a critical step, otherwise the mdb window pops open in Hyperspace
and you will not be able to access it, leaving you with the unpleasant option
of pulling the plug to restart the machine.
Next, you have two choices: either edit the boot stanza, or just
run mdb -K from one of your login sessions.
The boot stanza can be edited (temporarily, the changes are not saved between
reboots) by pressing "e" in the boot menu
while the cursor is on the kernel that you want to boot.
Here, you would replace ", console=graphic" with " -k -d "
(and probably delete the "splash image" line).
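
As a rough illustration of that edit, the sketch below applies the same substitution to a sample kernel line with sed. The sample line is an assumption about what a typical OpenIndiana GRUB entry looks like (including the exact spelling of the console option), so your entry may differ. In practice you make the change by hand in the GRUB editor rather than with a script. The function name add_kmdb_flags is hypothetical.

```sh
# Hypothetical helper: reads a GRUB kernel line on stdin and swaps the
# graphical console option for the kmdb boot flags
# (-k loads kmdb, -d stops in the debugger early in boot).
# The pattern accepts both "console=graphic" and "console=graphics".
add_kmdb_flags() {
    sed 's/,console=graphics\{0,1\}/ -k -d/'
}

# Assumed sample entry; single quotes keep $ISADIR and $ZFS-BOOTFS literal.
sample='kernel$ /platform/i86pc/kernel/$ISADIR/unix -B $ZFS-BOOTFS,console=graphics'
printf '%s\n' "$sample" | add_kmdb_flags
# prints: kernel$ /platform/i86pc/kernel/$ISADIR/unix -B $ZFS-BOOTFS -k -d
```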
If the system is able to come up, and you are just debugging some
predictable / reproducible hang, the mdb -K method is much easier.
Note that it is an uppercase K, and do verify that your console is in text mode.
You need to be near the console (ILOM is OK).
When you type mdb -K, the console pops into the debugger.
At this point, the machine is at a breakpoint, so you need to type ":c"
(i.e. "colon c") on your console to continue and let the machine run.
Given that you managed to load the debugger, you should be able to break
into mdb at will, by pressing a magic key combo on the console.
On SPARC, I recall it is ctrl ]
On Intel, try
F1 A, or
ctrl-alt-D (as in the letter D) or
shift-break
Try all of the above, to see which one triggers the debugger for you.
shift-break usually works for me.
If you are desperate and cannot find a key combo that works,
another possibility is to set up the system for NMI-triggered mdb.
Most motherboards have an NMI pin (see the motherboard docs).
If you short this to ground, the mobo generates an NMI
(a non-maskable interrupt).
It is common to have a GND (ground) pin right
next to this, so effectively you just momentarily connect the 2 pins.
You will need the following line in /etc/system to hook up the NMI to
trigger the debugger breakpoint:
set pcplusmp:apic_kmdb_on_nmi=1
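
To confirm the setting is actually in place before you rely on it, a small check along these lines may help. This is only a sketch: the function name has_nmi_kmdb is made up for illustration, and it assumes the setting is written on a single line as shown above. Lines starting with '*' are comments in /etc/system, so a commented-out copy is not counted.

```sh
# Hypothetical helper: succeeds if the given file contains an active
# (not commented-out) "set pcplusmp:apic_kmdb_on_nmi=1" line.
has_nmi_kmdb() {
    grep -Eq '^[[:space:]]*set[[:space:]]+pcplusmp:apic_kmdb_on_nmi[[:space:]]*=[[:space:]]*1([^0-9]|$)' "$1"
}

# Usage (on the machine being debugged; /etc/system is read at boot,
# so a reboot is needed before the setting takes effect):
#   has_nmi_kmdb /etc/system && echo "NMI setting is present"
```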
It would also be useful to verify that the machine is configured
to save crash dumps (see man dumpadm).
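
Running dumpadm with no arguments prints the current crash dump configuration. The sketch below checks that output for an enabled savecore. It assumes the usual "Savecore enabled: yes" line in the dumpadm output, and the function name savecore_enabled is hypothetical.

```sh
# Hypothetical helper: reads dumpadm output on stdin and succeeds
# if it reports that savecore is enabled.
savecore_enabled() {
    grep -Eq 'Savecore enabled:[[:space:]]*yes'
}

# Usage (as root on the machine being debugged):
#   dumpadm | savecore_enabled || echo "savecore is off; see dumpadm -y"
```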
Once your system is set up, get it to hang, and then break into the debugger,
and poke around. You may want to intentionally crash the machine at this point,
just to generate a crash dump.
It can be done in a number of ways; an easy one is writing
0 into the (r)ip register and typing :c
e.g.:
0>rip
:c
It is just easier to work on a crash dump than on a live system.
E.g., pipe the output of ::threadlist -v to a file, then pull that up in your favorite editor
to see what all the threads are doing.
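
One way to do this non-interactively, once savecore has written the dump files, is to feed the dcmd to mdb on standard input and redirect the output. This is a sketch: the function name dump_threadlist is hypothetical, and the file names unix.0 and vmcore.0 in the usage line are assumptions about how savecore named your dump (a compressed vmdump.0 would first have to be expanded, e.g. with savecore -f).

```sh
# Hypothetical helper.
# dump_threadlist UNIX_FILE CORE_FILE OUTPUT_FILE
# Runs "::threadlist -v" against a saved crash dump and writes the
# result to OUTPUT_FILE for viewing in an editor.
dump_threadlist() {
    echo '::threadlist -v' | mdb "$1" "$2" > "$3"
}

# Usage (in the savecore directory reported by dumpadm):
#   dump_threadlist unix.0 vmcore.0 /tmp/threads.txt
```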
The ::status command will, of course, indicate a null pointer dereference crash;
do not be thrown by that, since you know you intentionally caused it.
best wishes
Steve