[Xnv-team] [OpenIndiana Distribution - Bug #1625] Xorg hang (100% CPU), nvidia-related
illumos project
devnull at illumos.org
Thu Mar 1 21:25:51 UTC 2012
Issue #1625 has been updated by Marion Hakanson.
A couple of new pieces of information:
(1) My repeated failures to get a good crash dump turned out to be due to bug #1369; see that bug report for the workaround.
(2) When the hang happens, if you kill the Xorg process instead of rebooting, Xorg fails to restart, complaining that the nvidia card is not receiving interrupts. On both of my machines, other devices sharing the same IRQ also fail to receive any interrupts while in this state.
(3) The problem still occurs in the stable release candidate oi_151a2, with the latest nVidia driver, version 295.20. This is despite the new driver claiming to have fixed a bug that causes hangs on OpenGL systems, including X11/compiz machines like these.
(4) I do finally have a successful crash dump from the latest incident in (3) above. Here's an excerpt of an "mdb -k" session:
<pre>
> ::cpuinfo
 ID ADDR             FLG NRUN BSPL PRI RNRN KRNRN SWITCH THREAD           PROC
  0 fffffffffbc304a0  1f    0    0  10   no    no t-58   ffffff02d686cba0 Xorg
  1 ffffff02d4d61040  1f    0    0  -1   no    no t-3    ffffff000f696c40 (idle)
  2 ffffff02d50f0b00  1f    0    0  -1   no    no t-1    ffffff000f505c40 (idle)
  3 ffffff02d50ed500  1f    0    0  -1   no    no t-1    ffffff000f59dc40 (idle)
  4 ffffff02d50ec000  1f    0    0  -1   no    no t-20   ffffff000f564c40 (idle)
  5 fffffffffbc3aca0  1b    0    0  59   no    no t-2    ffffff02debef7c0 reboot
  6 ffffff02d50e2080  1f    1    0  -1   no    no t-5    ffffff000fbfec40 (idle)
  7 ffffff02d5269000  1f    0    0  -1   no    no t-5    ffffff000fc7fc40 (idle)
> ffffff02d686cba0::findstack
stack pointer for thread ffffff02d686cba0: ffffff000fcd9a00
ffffff000fcd9a80 xc_serv+0x186()
ffffff000fcd9b10 0xffffff02d686cba0()
ffffff000fcd9b40 restorecontext+0x13b()
ffffff000fcd9f00 getsetcontext+0x227()
ffffff000fcd9f10 _interrupt+0xba()
>
</pre>
Perhaps this isn't actually an nvidia-related problem, but instead some kind of interrupt-handling problem. I would welcome suggestions on what to look for in the crash dump I've got, or I can make it available for download if someone wants to take a crack at it themselves.
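As a rough aid for reading the ::cpuinfo output above (not part of the original mdb session), here is a minimal Python sketch that lists the CPUs not running the idle thread. The function name busy_cpus is hypothetical, and the sketch assumes whitespace-separated columns in the order shown in the header, with the PROC value possibly wrapped onto its own line.

```python
def busy_cpus(cpuinfo_text):
    """Return (cpu_id, thread_addr, proc) for each CPU not running the idle thread.

    Assumes the column order from the header line:
    ID ADDR FLG NRUN BSPL PRI RNRN KRNRN SWITCH THREAD PROC
    """
    rows = []
    for line in cpuinfo_text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0].isdigit():
            # A new CPU row starts with its numeric ID.
            rows.append(fields)
        elif rows:
            # Continuation line: the PROC column wrapped onto its own line.
            rows[-1].extend(fields)
    return [
        (int(r[0]), r[9], r[10])
        for r in rows
        if len(r) >= 11 and r[10] != "(idle)"
    ]


# A few rows copied from the excerpt above, including one wrapped "(idle)" line.
SAMPLE_CPUINFO = """\
ID ADDR FLG NRUN BSPL PRI RNRN KRNRN SWITCH THREAD PROC
0 fffffffffbc304a0 1f 0 0 10 no no t-58 ffffff02d686cba0 Xorg
1 ffffff02d4d61040 1f 0 0 -1 no no t-3 ffffff000f696c40
(idle)
5 fffffffffbc3aca0 1b 0 0 59 no no t-2 ffffff02debef7c0 reboot
"""

if __name__ == "__main__":
    for cpu_id, thread, proc in busy_cpus(SAMPLE_CPUINFO):
        print(f"CPU {cpu_id}: thread {thread} ({proc})")
```

On the excerpt above this singles out CPU 0 (Xorg) and CPU 5 (reboot); the thread address reported for CPU 0 is the one passed to ::findstack.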
----------------------------------------
Bug #1625: Xorg hang (100% CPU), nvidia-related
https://www.illumos.org/issues/1625
Author: Marion Hakanson
Status: New
Priority: Normal
Assignee: OI XNV
Category: XNV (X Window System)
Target version:
Difficulty: Hard
Tags: nvidia
This Xorg hang is happening on two of my systems. Both run OI-151a now, and both have the nVidia driver version 280.13. The problem also happened under OI-148 with the same nVidia driver, and with several previous driver revisions (I download them directly from the nVidia site). One machine is a 64-bit Intel Core i7 w/8GB RAM, GeForce 7300GT; the other is a 32-bit Intel P4 w/2GB RAM, GeForce 6200.
When the problem occurs, mouse and keyboard input are ignored, but one can log in remotely. You see the Xorg process using 100% CPU. Actually it shows in "prstat -m" as 33% USR, 33% LCK, 33% SLP. Restarting gdm is usually not sufficient; I think the nVidia card or driver is left in an unhappy state, with some blocks of black & white bars on the screen after Xorg exits, so I usually just reboot the system.
Also, the Xorg.0.log shows entries like the ones below at the end, prior to killing the Xorg process:
(WW) Oct 04 10:24:40 NVIDIA: WAIT (2, 6, 0x8000, 0x0000c290, 0x0000ca08)
(WW) Oct 04 10:24:47 NVIDIA: WAIT (1, 6, 0x8000, 0x0000c290, 0x0000ca08)
. . .
At the same time, this shows up in /var/adm/messages:
Oct 4 10:24:44 kyklops nvidia: [ID 702911 kern.notice] NVRM: Xid (0000:01:00): 16, Head 00000000 Count 00fdd77b
Oct 4 10:24:44 kyklops nvidia: [ID 702911 kern.notice] NVRM: Xid (0000:01:00): 16, Head 00000001 Count 00fdd74c
Oct 4 10:24:52 kyklops nvidia: [ID 702911 kern.notice] NVRM: Xid (0000:01:00): 16, Head 00000000 Count 00fdd77c
Oct 4 10:24:52 kyklops nvidia: [ID 702911 kern.notice] NVRM: Xid (0000:01:00): 16, Head 00000001 Count 00fdd74d
Oct 4 10:24:53 kyklops nvidia: [ID 702911 kern.notice] NVRM: Xid (0000:01:00): 8, Channel 0000001e
Oct 4 10:25:02 kyklops nvidia: [ID 702911 kern.notice] NVRM: Xid (0000:01:00): 8, Channel 00000020
Oct 4 10:25:02 kyklops nvidia: [ID 702911 kern.notice] NVRM: Xid (0000:01:00): 16, Head 00000000 Count 00fdd77e
Oct 4 10:25:02 kyklops nvidia: [ID 702911 kern.notice] NVRM: Xid (0000:01:00): 16, Head 00000001 Count 00fdd74f
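For anyone comparing incidents, the NVRM messages above can be tallied by Xid code. The following is only a minimal Python sketch: the function name tally_xids and the regular expression are assumptions based on the message format shown here, not on any documented log format, and the sample lines are copied from the excerpt above.

```python
import re
from collections import Counter

# Matches the "NVRM: Xid (<pci address>): <code>, <details>" form seen above.
XID_RE = re.compile(r"NVRM: Xid \(([0-9a-fA-F:.]+)\): (\d+), (.*)$")


def tally_xids(lines):
    """Count Xid events per (PCI address, Xid code) pair."""
    counts = Counter()
    for line in lines:
        m = XID_RE.search(line)
        if m:
            counts[(m.group(1), int(m.group(2)))] += 1
    return counts


# Four of the /var/adm/messages lines quoted above.
SAMPLE_MESSAGES = """\
Oct 4 10:24:52 kyklops nvidia: [ID 702911 kern.notice] NVRM: Xid (0000:01:00): 16, Head 00000001 Count 00fdd74d
Oct 4 10:24:53 kyklops nvidia: [ID 702911 kern.notice] NVRM: Xid (0000:01:00): 8, Channel 0000001e
Oct 4 10:25:02 kyklops nvidia: [ID 702911 kern.notice] NVRM: Xid (0000:01:00): 8, Channel 00000020
Oct 4 10:25:02 kyklops nvidia: [ID 702911 kern.notice] NVRM: Xid (0000:01:00): 16, Head 00000000 Count 00fdd77e
""".splitlines()

if __name__ == "__main__":
    for (addr, code), n in sorted(tally_xids(SAMPLE_MESSAGES).items()):
        print(f"{addr} Xid {code}: {n}")
```

To run it against a real log, the lines of /var/adm/messages could be passed in place of SAMPLE_MESSAGES, for example via open("/var/adm/messages").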
Note that the problem is difficult to replicate on demand. It seems to happen at random times, and probably at or near the time of a window focus change. It does not appear to be necessary to be viewing flash video or a web page, or using any particular X application. I do have "compiz" window-manager effects enabled.
Let me know what kind of debug information I can collect, and how.