[OpenIndiana-discuss] A KVM switch broke my server?! LightDM is panicking the kernel

Carl Brewer carl at bl.echidna.id.au
Sat Sep 4 00:10:57 UTC 2021


This is the panic message:


root@skaro:/var/log# fmdump -Vp -u 017daaba-0d44-c582-d73d-b554bcf017a5
TIME UUID                                 SUNW-MSG-ID
Sep 04 2021 09:50:42.514056000 017daaba-0d44-c582-d73d-b554bcf017a5 
SUNOS-8000-KL

   TIME                 CLASS                                 ENA
   Sep 04 09:50:41.8497 ireport.os.sunos.panic.dump_pending_on_device 
0x0000000000000000

nvlist version: 0
         version = 0x0
         class = list.suspect
         uuid = 017daaba-0d44-c582-d73d-b554bcf017a5
         code = SUNOS-8000-KL
         diag-time = 1630713042 443560
         de = fmd:///module/software-diagnosis
         fault-list-sz = 0x1
         fault-list = (array of embedded nvlists)
         (start fault-list[0])
         nvlist version: 0
                 version = 0x0
                 class = defect.sunos.kernel.panic
                 certainty = 0x64
                 asru = 
sw:///:path=/var/crash/skaro/.017daaba-0d44-c582-d73d-b554bcf017a5
                 resource = 
sw:///:path=/var/crash/skaro/.017daaba-0d44-c582-d73d-b554bcf017a5
                 savecore-succcess = 0
                 os-instance-uuid = 017daaba-0d44-c582-d73d-b554bcf017a5
                 panicstr = hat_devload: loading a mapping to free page 
fffffe0001007830
                 panicstack = unix:hat_devload+1ba () | 
gfx_private:gfxp_map_kernel_space+b5 () | nvidia:_nv027867rm+58 () | 
9c40000000000 () | nvidia:_nv002650rm+0 () | nvidia:_nv002231rm+0 () |
                 crashtime = 1630713007
                 panic-time = September  4, 2021 at 09:50:07 AM AEST AEST
         (end fault-list[0])

         fault-status = 0x1
         severity = Major
         __ttl = 0x1
         __tod = 0x6132b4d2 0x1ea3df40
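
The epoch fields in this report can be cross-checked against the human-readable panic-time. The following is a minimal Python sketch, not part of the original output. It assumes AEST is a fixed UTC+10 offset and that the first word of __tod is a seconds-since-epoch value (it appears to equal diag-time). The helper name to_aest is made up for illustration.

```python
from datetime import datetime, timedelta, timezone

# Values copied from the fmdump report above.
crashtime = 1630713007      # crashtime field, seconds since the epoch
diag_time = 1630713042      # first number of the diag-time field
tod_seconds = 0x6132b4d2    # first word of the __tod field (assumed to be seconds)

# Assumption: AEST is a fixed UTC+10 offset.
AEST = timezone(timedelta(hours=10), "AEST")


def to_aest(epoch_seconds: int) -> str:
    """Render an epoch timestamp as AEST wall-clock time (hypothetical helper)."""
    return datetime.fromtimestamp(epoch_seconds, tz=AEST).strftime(
        "%Y-%m-%d %H:%M:%S %Z"
    )


print("panic:    ", to_aest(crashtime))
print("diagnosed:", to_aest(diag_time))
print("gap (s):  ", diag_time - crashtime)
print("__tod matches diag-time:", tod_seconds == diag_time)
```

Under those assumptions the crashtime decodes to 09:50:07 AEST, which agrees with the panic-time line, and the diagnosis was recorded 35 seconds later, after the reboot.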


This suggests that the problem is somewhere in the Nvidia driver? This
stuff is way beyond my debugging skill level.
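
To make the panicstack easier to read, it can be split into frames and grouped by kernel module. This is a rough Python sketch, assuming the usual "module:symbol+offset () |" layout and that frames are listed innermost first. The function name parse_frames is hypothetical.

```python
# panicstack value copied from the fmdump report, with the mail line wraps joined.
panicstack = (
    "unix:hat_devload+1ba () | "
    "gfx_private:gfxp_map_kernel_space+b5 () | nvidia:_nv027867rm+58 () | "
    "9c40000000000 () | nvidia:_nv002650rm+0 () | nvidia:_nv002231rm+0 () |"
)


def parse_frames(stack):
    """Split a panicstack string into (module, symbol+offset) pairs."""
    frames = []
    for raw in stack.split("|"):
        entry = raw.replace("()", "").strip()
        if not entry:
            continue  # skip the empty piece after the trailing '|'
        module, sep, symbol = entry.partition(":")
        if sep:
            frames.append((module, symbol))
        else:
            frames.append((None, entry))  # bare address, no module:symbol
    return frames


frames = parse_frames(panicstack)
for module, symbol in frames:
    print(f"{module or '?':12} {symbol}")

# Modules in order of first appearance, duplicates removed.
modules = list(dict.fromkeys(m for m, _ in frames if m))
print("modules:", modules)
```

Read that way, the nvidia module calls gfxp_map_kernel_space in gfx_private, which calls hat_devload in unix, where the panic fired. That only shows the nvidia module is on the call path; it does not by itself prove where the fault lies.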

dmesg reports:

Sep  4 09:50:41 skaro genunix: [ID 936769 kern.info] winlock0 is 
/pseudo/winlock@0
Sep  4 09:50:41 skaro devfsadmd[684]: [ID 511948 daemon.error] di_init 
failed for /pci@0,0/pci1458,5007@14/input: No such device or address
Sep  4 09:50:41 skaro svc.startd[9]: [ID 652011 daemon.warning] 
svc:/application/virtualbox/zoneaccess:default: Method 
"/opt/VirtualBox/VBoxZoneAccess" failed with exit status 127.
Sep  4 09:50:41 skaro savecore: [ID 570001 auth.error] reboot after 
panic: hat_devload: loading a mapping to free page fffffe0001007830
Sep  4 09:50:41 skaro savecore: [ID 620374 auth.error] Panic crashdump 
pending on dump device but dumpadm -n in effect; run savecore(1M) 
manually to extract. Image UUID 017daaba-0d44-c582-d73d-b554bcf017a5.
Sep  4 09:50:42 skaro svc.startd[9]: [ID 652011 daemon.warning] 
svc:/application/virtualbox/zoneaccess:default: Method 
"/opt/VirtualBox/VBoxZoneAccess" failed with exit status 127.
Sep  4 09:50:42 skaro last message repeated 1 time
Sep  4 09:50:42 skaro svc.startd[9]: [ID 748625 daemon.error] 
application/virtualbox/zoneaccess:default failed: transitioned to 
maintenance (see 'svcs -xv' for details)
Sep  4 09:50:42 skaro svc.startd[9]: [ID 748625 daemon.error] 
network/fail2ban:default failed repeatedly: transitioned to maintenance 
(see 'svcs -xv' for details)
Sep  4 09:50:42 skaro fmd: [ID 377184 daemon.error] SUNW-MSG-ID: 
SUNOS-8000-KL, TYPE: Defect, VER: 1, SEVERITY: Major
Sep  4 09:50:42 skaro EVENT-TIME: Sat Sep  4 09:50:42 AEST 2021
Sep  4 09:50:42 skaro PLATFORM: B460MAORUSPRO, CSN: Default-string, 
HOSTNAME: skaro
Sep  4 09:50:42 skaro SOURCE: software-diagnosis, REV: 0.1
Sep  4 09:50:42 skaro EVENT-ID: 017daaba-0d44-c582-d73d-b554bcf017a5
Sep  4 09:50:42 skaro DESC: The system has rebooted after a kernel 
panic.  Refer to http://illumos.org/msg/SUNOS-8000-KL for more information.
Sep  4 09:50:42 skaro AUTO-RESPONSE: The failed system image was dumped 
to the dump device.  If savecore is enabled (see dumpadm(1M)) a copy of 
the dump will be written to the savecore directory .
Sep  4 09:50:42 skaro IMPACT: There may be some performance impact while 
the panic is copied to the savecore directory.  Disk space usage by 
panics can be substantial.
Sep  4 09:50:42 skaro REC-ACTION: If savecore is not enabled then please 
take steps to preserve the crash image.
Sep  4 09:50:42 skaro Use 'fmdump -Vp -u 
017daaba-0d44-c582-d73d-b554bcf017a5' to view more panic detail. Please 
refer to the knowledge article for additional information.
Sep  4 09:50:43 skaro mac: [ID 435574 kern.info] NOTICE: e1000g0 link 
up, 1000 Mbps, full duplex
Sep  4 09:50:47 skaro nvidia_modeset: [ID 107833 kern.notice] Unloading

I don't think the VirtualBox stuff is relevant.
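
One way to check which log sources actually mention the panic is to split each syslog line into its tag, level, and message. The sketch below is illustrative only: it uses a handful of the lines above with the mail wraps joined, and the regular expression is an assumption about this particular syslog layout, not a general parser.

```python
import re
from collections import Counter

# A sample of the dmesg lines above, with the mail line wraps joined.
LOG = """\
Sep  4 09:50:41 skaro svc.startd[9]: [ID 652011 daemon.warning] svc:/application/virtualbox/zoneaccess:default: Method "/opt/VirtualBox/VBoxZoneAccess" failed with exit status 127.
Sep  4 09:50:41 skaro savecore: [ID 570001 auth.error] reboot after panic: hat_devload: loading a mapping to free page fffffe0001007830
Sep  4 09:50:42 skaro svc.startd[9]: [ID 748625 daemon.error] network/fail2ban:default failed repeatedly: transitioned to maintenance (see 'svcs -xv' for details)
Sep  4 09:50:43 skaro mac: [ID 435574 kern.info] NOTICE: e1000g0 link up, 1000 Mbps, full duplex
Sep  4 09:50:47 skaro nvidia_modeset: [ID 107833 kern.notice] Unloading
"""

# timestamp, host, tag with optional [pid], "[ID n facility.level]", message
LINE = re.compile(
    r"^\w{3}\s+\d+ \d\d:\d\d:\d\d \S+ "
    r"(?P<tag>[^\s\[:]+)(?:\[\d+\])?: "
    r"\[ID \d+ (?P<facility>\w+)\.(?P<level>\w+)\] "
    r"(?P<msg>.*)$"
)

records = [m.groupdict() for m in map(LINE.match, LOG.splitlines()) if m]
per_tag = Counter(r["tag"] for r in records)
panic_tags = sorted({r["tag"] for r in records if "panic" in r["msg"].lower()})

print(dict(per_tag))
print("tags mentioning the panic:", panic_tags)
```

In this sample only the savecore lines mention the panic. The VirtualBox method failure reports exit status 127, which by shell convention usually means the command could not be found, so it looks like a separate problem.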

I have debugging info in /var/crash/skaro after running savecore:

root@skaro:/var/cron# mkdir -p /var/crash/skaro
root@skaro:/var/cron# savecore
savecore: System dump time: Sat Sep  4 09:50:07 2021

savecore: Saving compressed system crash dump in /var/crash/skaro/vmdump.0
savecore: Decompress the crash dump with
'savecore -vf /var/crash/skaro/vmdump.0'
root@skaro:/var/cron# cd /var/crash/skaro/
root@skaro:/var/crash/skaro# ls -la
total 950365
drwxr-xr-x   2 root     root           4 Sep  4 10:08 .
drwxr-xr-x   3 root     root           3 Sep  4 10:08 ..
-rw-r--r--   1 root     root           2 Sep  4 10:08 bounds
-rw-r--r--   1 root     root     486211584 Sep  4 10:08 vmdump.0
root@skaro:/var/crash/skaro# savecore -vf /var/crash/skaro/vmdump.0
savecore: System dump time: Sat Sep  4 09:50:07 2021

savecore: saving system crash dump in /var/crash/skaro/{unix,vmcore}.0
Constructing namelist /var/crash/skaro/unix.0
Constructing corefile /var/crash/skaro/vmcore.0
  0:05 100% done: 501992 of 501992 pages saved
3155 (0%) zero pages were not written
0:05 dump decompress is done
root@skaro:/var/crash/skaro# ls -la
total 4959227
drwxr-xr-x   2 root     root           6 Sep  4 10:08 .
drwxr-xr-x   3 root     root           3 Sep  4 10:08 ..
-rw-r--r--   1 root     root           2 Sep  4 10:08 bounds
-rw-r--r--   1 root     root     3858992 Sep  4 10:08 unix.0
-rw-r--r--   1 root     root     2085015552 Sep  4 10:08 vmcore.0
-rw-r--r--   1 root     root     486211584 Sep  4 10:08 vmdump.0
root@skaro:/var/crash/skaro# file *
bounds:         ascii text
unix.0:         ELF 64-bit LSB executable AMD64 Version 1, statically 
linked, not stripped, no debugging information available
vmcore.0:       SunOS 5.11 illumos-6703a0e87b 64-bit Intel crash dump 
from 'skaro'
vmdump.0:       SunOS 5.11 illumos-6703a0e87b 64-bit Intel compressed 
crash dump from 'skaro'

There's nothing at http://illumos.org/msg/SUNOS-8000-KL to help.

Um ....




