[Xnv-team] [OpenIndiana Distribution - Bug #1625] Xorg hang (100% CPU), nvidia-related

illumos project devnull at illumos.org
Thu Mar 22 19:05:34 UTC 2012


Issue #1625 has been updated by Marion Hakanson.


Ken, thanks for looking into this.  I understand your reasons for closing this item -- heck, I can't even reproduce it on demand, yet it does continue to happen on a regular basis on at least one of my two desktop machines.

As was hinted at above, I'm wondering whether this is in fact not specific to Xorg or nvidia, but rather a case of lost interrupts peculiar to my hardware.  On both of these systems the nvidia driver shares an IRQ with some other onboard devices (keyboard, mouse, disk, or ehci drivers).  In fact, when I upgraded my older machine to oi151a2 (prestable), the IRQ assignments seem to have changed (the nvidia driver no longer shares with ehci and pci-ide), and I haven't had this problem on that machine since.

If I could get some guidance on troubleshooting or diagnosing a system in the state where drivers are no longer receiving interrupts, I believe I could make some progress with this problem.

Could this issue be re-classified appropriately, and kept open until I give up on it?


Dell Optiplex 980:

<pre>
# echo "::interrupts -d" | mdb -k
IRQ  Vect IPL Bus    Trg Type   CPU Share APIC/INT# Driver Name(s) 
1    0x41 5   ISA    Edg Fixed  4   1     0x0/0x1   i8042#0
9    0x80 9   PCI    Lvl Fixed  1   1     0x0/0x9   acpi_wrapper_isr
11   0xd1 14  PCI    Lvl Fixed  2   1     0x0/0xb   hpet_isr
12   0x42 5   ISA    Edg Fixed  5   1     0x0/0xc   i8042#0
16   0x82 9   PCI    Lvl Fixed  7   2     0x0/0x10  ehci#0, nvidia#0
23   0x83 9   PCI    Lvl Fixed  0   1     0x0/0x17  ehci#1
24   0x40 5   PCI    Edg MSI    3   1     -         ahci#0
25   0x81 7   PCI    Edg MSI    6   1     -         pcieb#0
26   0x60 6   PCI    Edg MSI    1   1     -         e1000g#0
32   0x20 2          Edg IPI    all 1     -         cmi_cmci_trap
160  0xa0 0          Edg IPI    all 0     -         poke_cpu
208  0xd0 14         Edg IPI    all 1     -         kcpc_hw_overflow_intr
209  0xd3 14         Edg IPI    all 1     -         cbe_fire
210  0xd4 14         Edg IPI    all 1     -         cbe_fire
240  0xe0 15         Edg IPI    all 1     -         xc_serv
241  0xe1 15         Edg IPI    all 1     -         apic_error_intr
# 
</pre>


Old Pentium-4 machine:
<pre>
# echo "::interrupts -d" | mdb -k
IRQ  Vect IPL Bus    Trg Type   CPU Share APIC/INT# Driver Name(s) 
1    0x41 5   ISA    Edg Fixed  0   1     0x0/0x1   i8042#0
4    0xb0 12  ISA    Edg Fixed  0   1     0x0/0x4   asy#0
6    0x44 5   ISA    Edg Fixed  0   1     0x0/0x6   fdc#0
7    0x45 5   ISA    Edg Fixed  0   1     0x0/0x7   ecpp#0
9    0x80 9   PCI    Lvl Fixed  0   1     0x0/0x9   acpi_wrapper_isr
12   0x42 5   ISA    Edg Fixed  0   1     0x0/0xc   i8042#0
15   0x43 5   ISA    Edg Fixed  0   1     0x0/0xf   ata#1
16   0x81 9   PCI    Lvl Fixed  0   3     0x0/0x10  uhci#3, uhci#0, nvidia#0
18   0x84 9   PCI    Lvl Fixed  0   4     0x0/0x12  uhci#2, e1000g#0, pci-ide#1, pci-ide#1
19   0x83 9   PCI    Lvl Fixed  0   1     0x0/0x13  uhci#1
22   0x40 5   PCI    Lvl Fixed  0   1     0x0/0x16  pci-ide#2
23   0x82 9   PCI    Lvl Fixed  0   1     0x0/0x17  ehci#0
160  0xa0 0          Edg IPI    all 0     -         poke_cpu
208  0xd0 14         Edg IPI    all 1     -         kcpc_hw_overflow_intr
209  0xd1 14         Edg IPI    all 1     -         cbe_fire
210  0xd3 14         Edg IPI    all 1     -         cbe_fire
240  0xe0 15         Edg IPI    all 1     -         xc_serv
241  0xe1 15         Edg IPI    all 1     -         apic_error_intr
# 
</pre>
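
To make the IRQ sharing in the two listings above easier to spot, here is a minimal Python sketch that parses saved "::interrupts -d" output and reports which drivers share an IRQ with nvidia. The function names (parse_interrupts, sharing_with) and the SAMPLE text are hypothetical, made up for illustration; the row pattern assumes the column layout shown above, and a line starting with a comma is treated as a wrapped continuation of the previous driver list.

```python
import re

# Matches a data row of `::interrupts -d` output: IRQ, vector, then everything
# up to the APIC/INT# column ("-" or e.g. "0x0/0x10"), then the driver list.
# Assumption: the column layout is the one shown in the listings above.
ROW = re.compile(
    r"^(\d+)\s+0x[0-9a-fA-F]+\s+.*?\s(?:-|0x[0-9a-fA-F]+/0x[0-9a-fA-F]+)\s+(.+)$"
)


def parse_interrupts(text):
    """Map IRQ number -> list of driver instance names (hypothetical helper)."""
    table = {}
    last_irq = None
    for line in text.splitlines():
        line = line.rstrip()
        m = ROW.match(line)
        if m:
            last_irq = int(m.group(1))
            table[last_irq] = [d.strip() for d in m.group(2).split(",") if d.strip()]
        elif line.startswith(",") and last_irq is not None:
            # Continuation of a driver list that was wrapped onto the next line.
            table[last_irq] += [d.strip() for d in line.split(",") if d.strip()]
    return table


def sharing_with(table, driver="nvidia"):
    """Return {irq: [other drivers]} for every IRQ that `driver` shares."""
    result = {}
    for irq, names in table.items():
        if any(n.split("#")[0] == driver for n in names):
            others = [n for n in names if n.split("#")[0] != driver]
            if others:
                result[irq] = others
    return result


# Three rows copied from the Optiplex 980 listing, plus the header line.
SAMPLE = """\
IRQ  Vect IPL Bus    Trg Type   CPU Share APIC/INT# Driver Name(s)
16   0x82 9   PCI    Lvl Fixed  7   2     0x0/0x10  ehci#0, nvidia#0
23   0x83 9   PCI    Lvl Fixed  0   1     0x0/0x17  ehci#1
24   0x40 5   PCI    Edg MSI    3   1     -         ahci#0
"""

if __name__ == "__main__":
    print(sharing_with(parse_interrupts(SAMPLE)))
```

Run against the sample rows, this prints {16: ['ehci#0']}, matching the shared IRQ 16 on the Optiplex 980. It only summarizes the table; it says nothing about whether interrupts are actually being lost.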

----------------------------------------
Bug #1625: Xorg hang (100% CPU), nvidia-related
https://www.illumos.org/issues/1625

Author: Marion Hakanson
Status: Closed
Priority: Low
Assignee: OI XNV
Category: XNV (X Window System)
Target version: oi_151_stable
Difficulty: Hard
Tags: nvidia


This Xorg hang is happening on two of my systems. Both run OI-151a now, and both have the nVidia driver version 280.13. The problem also happened under OI-148 with the same nVidia driver, and with several previous driver revisions (I download them directly from the nVidia site). One machine is a 64-bit Intel Core i7 w/8GB RAM, GeForce 7300GT; the other is a 32-bit Intel P4 w/2GB RAM, GeForce 6200.

When the problem occurs, mouse and keyboard input are ignored, but one can still log in remotely. The Xorg process then appears to be using 100% CPU, although "prstat -m" actually shows it as 33% USR, 33% LCK, 33% SLP. Restarting gdm is usually not sufficient: I think the nVidia card or driver is left in an unhappy state, with some blocks of black & white bars on the screen after Xorg exits, so I usually just reboot the system.

Also, the Xorg.0.log shows entries like the ones below at the end, prior to killing the Xorg process:

(WW) Oct 04 10:24:40 NVIDIA: WAIT (2, 6, 0x8000, 0x0000c290, 0x0000ca08)
(WW) Oct 04 10:24:47 NVIDIA: WAIT (1, 6, 0x8000, 0x0000c290, 0x0000ca08)
. . .

At the same time, this shows up in /var/adm/messages:

Oct 4 10:24:44 kyklops nvidia: [ID 702911 kern.notice] NVRM: Xid (0000:01:00): 16, Head 00000000 Count 00fdd77b
Oct 4 10:24:44 kyklops nvidia: [ID 702911 kern.notice] NVRM: Xid (0000:01:00): 16, Head 00000001 Count 00fdd74c
Oct 4 10:24:52 kyklops nvidia: [ID 702911 kern.notice] NVRM: Xid (0000:01:00): 16, Head 00000000 Count 00fdd77c
Oct 4 10:24:52 kyklops nvidia: [ID 702911 kern.notice] NVRM: Xid (0000:01:00): 16, Head 00000001 Count 00fdd74d
Oct 4 10:24:53 kyklops nvidia: [ID 702911 kern.notice] NVRM: Xid (0000:01:00): 8, Channel 0000001e
Oct 4 10:25:02 kyklops nvidia: [ID 702911 kern.notice] NVRM: Xid (0000:01:00): 8, Channel 00000020
Oct 4 10:25:02 kyklops nvidia: [ID 702911 kern.notice] NVRM: Xid (0000:01:00): 16, Head 00000000 Count 00fdd77e
Oct 4 10:25:02 kyklops nvidia: [ID 702911 kern.notice] NVRM: Xid (0000:01:00): 16, Head 00000001 Count 00fdd74f
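
For comparing one hang against another, the NVRM lines above could be tallied per device and Xid number. This is a minimal Python sketch under the assumption that every message follows the "NVRM: Xid (<device>): <number>," shape seen above; count_xids and SAMPLE_LOG are hypothetical names, and the sketch does not interpret what any Xid number means.

```python
import re
from collections import Counter

# Assumption: messages look like "NVRM: Xid (0000:01:00): 16, Head ..." as in
# the /var/adm/messages excerpt above.
XID = re.compile(r"NVRM: Xid \(([0-9a-fA-F:.]+)\): (\d+),")


def count_xids(lines):
    """Count occurrences of each (device, Xid number) pair (hypothetical helper)."""
    counts = Counter()
    for line in lines:
        m = XID.search(line)
        if m:
            counts[(m.group(1), int(m.group(2)))] += 1
    return counts


# Three lines copied from the excerpt above.
SAMPLE_LOG = [
    "Oct 4 10:24:44 kyklops nvidia: [ID 702911 kern.notice] NVRM: Xid (0000:01:00): 16, Head 00000000 Count 00fdd77b",
    "Oct 4 10:24:44 kyklops nvidia: [ID 702911 kern.notice] NVRM: Xid (0000:01:00): 16, Head 00000001 Count 00fdd74c",
    "Oct 4 10:24:53 kyklops nvidia: [ID 702911 kern.notice] NVRM: Xid (0000:01:00): 8, Channel 0000001e",
]

if __name__ == "__main__":
    for (device, xid), n in sorted(count_xids(SAMPLE_LOG).items()):
        print(device, xid, n)
```

To use it on a real log, the lines of /var/adm/messages can be passed in place of SAMPLE_LOG, for example by iterating over the open file.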

Note that the problem is difficult to replicate on demand. It seems to happen at random times, and probably at or near the time of a window focus change. It does not appear to be necessary to be viewing flash video or a web page, or using any particular X application. I do have "compiz" window-manager effects enabled.

Let me know what kind of debug information I can collect, and how.


