[OpenIndiana-discuss] Plugging NVMe drive to USB-C triggers system panic

Stephan Althaus Stephan.Althaus at Duedinghausen.eu
Mon Dec 7 20:59:12 UTC 2020


Hello!

When I plug an NVMe drive into the USB-C port of my laptop, the system
panics.

fmadm reports that my device "JMICRON-JMS583" is in an error state,
but the system should not panic because of that, should it?

(https://www.jmicron.com/products/list/13)

What does the panicstack say?
Why does the system panic if it believes that resetting the XHCI bus is
sufficient?


Any hints welcome!
Greetings, Stephan


*Details*

After the reboot, the console prints many of these messages until I unplug
the USB-C device:

Dec  7 20:36:11 dell6510 xhci: [ID 291945 kern.warning] WARNING: xhci3: Encountered unsupported xHCI version 0.ffff
Dec  7 20:36:11 dell6510 xhci: [ID 884440 kern.warning] WARNING: xhci3: Root hub has 255 ports, but system only supports 31, limiting to 31
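
Note: both warnings are consistent with the driver reading all-ones (0xffffffff) from the controller's capability registers, which is what a read typically returns when a PCIe device is no longer reachable. That is an interpretation, not something the log states. The sketch below assumes the field layout from the xHCI specification (HCIVERSION in the upper 16 bits of the first capability dword, MaxPorts in bits 31:24 of HCSPARAMS1); the helper names are made up for illustration.

```python
# Illustrative only: decode two xHCI capability fields from raw register values.
# Field positions are taken from the xHCI specification; helper names are hypothetical.

ALL_ONES_32 = 0xFFFFFFFF  # what a read of an unreachable device usually returns


def hci_version(cap_dword0: int) -> int:
    """HCIVERSION occupies the upper 16 bits of the first capability dword."""
    return (cap_dword0 >> 16) & 0xFFFF


def max_ports(hcsparams1: int) -> int:
    """MaxPorts occupies bits 31:24 of HCSPARAMS1."""
    return (hcsparams1 >> 24) & 0xFF


if __name__ == "__main__":
    print(f"version field: {hci_version(ALL_ONES_32):x}")  # ffff
    print(f"root hub ports: {max_ports(ALL_ONES_32)}")     # 255
```

An all-ones read yields a version field of ffff and 255 ports, the same values that appear in the two warnings.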

*$ dmesg*

<snip>
reboot after panic: XHCI runtime reset required
Dec  7 20:36:10 dell6510 savecore: [ID 780920 auth.error] Panic crashdump pending on dump device but dumpadm -n in effect; run savecore(1M) manually to extract. Image UUID 0e9cf93e-0463-cf99-97d4-ef9612843fe8.
<snip>

*$ sudo fmadm faulty*
--------------- ------------------------------------ -------------- ---------
TIME            EVENT-ID                             MSG-ID         SEVERITY
--------------- ------------------------------------ -------------- ---------
Dec 07 20:34:41 3e63abab-31e9-479b-adc8-f7dcf8bdafe6 DISK-8000-3E   Critical

Host        : dell6510
Platform    : Precision-7720    Chassis_id  : 49JT5M2
Product_sn  :

Fault class : fault.io.scsi.cmd.disk.dev.rqs.derr
Affects     : dev:///:devid=id1,sd@n3044564198835410//pci@0,0/pci8086,a114@1c,4/pci8086,15da@0/pci8086,15da@2/pci1028,7b1@0/storage@3/disk@0,0
                   faulted and taken out of service
FRU         : hc://:product-id=Precision-7720:server-id=dell6510:chassis-id=49JT5M2:serial=DD56419883E4A:part=JMICRON-JMS583:revision=0204/motherboard=0/hostbridge=4/pciexrc=4/pciexbus=5/pciexdev=0/pciexfn=0/pciexbus=6/pciexdev=2/pciexfn=0/pciexbus=61/pciexdev=0/pciexfn=0/port=0/usb-device=0/disk=0
                   faulty

Description : A non-recoverable hardware failure was detected by the device
               while performing a command.
               Refer to http://illumos.org/msg/DISK-8000-3E for more
               information.

Response    : The device may be offlined or degraded.

Impact      : The device has failed. The service may have been lost or
               degraded.

Action      : Ensure that the latest drivers and patches are installed.
               Schedule a repair procedure to replace the affected
               device. Use 'fmadm faulty' to find the affected disk.

--------------- ------------------------------------ -------------- ---------
TIME            EVENT-ID                             MSG-ID         SEVERITY
--------------- ------------------------------------ -------------- ---------
Dec 07 20:36:11 0e9cf93e-0463-cf99-97d4-ef9612843fe8 SUNOS-8000-KL  Major

Host        : dell6510
Platform    : Precision-7720    Chassis_id  : 49JT5M2
Product_sn  :

Fault class : defect.sunos.kernel.panic
Affects     : sw:///:path=/var/crash/dell/.0e9cf93e-0463-cf99-97d4-ef9612843fe8
                   faulted but still in service
Problem in  : sw:///:path=/var/crash/dell/.0e9cf93e-0463-cf99-97d4-ef9612843fe8
                   faulted but still in service

Description : The system has rebooted after a kernel panic.  Refer to
               http://illumos.org/msg/SUNOS-8000-KL for more information.

Response    : The failed system image was dumped to the dump device.  If
               savecore is enabled (see dumpadm(1M)) a copy of the dump will be
               written to the savecore directory .

Impact      : There may be some performance impact while the panic is copied to
               the savecore directory.  Disk space usage by panics can be
               substantial.

Action      : If savecore is not enabled then please take steps to preserve the
               crash image.
               Use 'fmdump -Vp -u 0e9cf93e-0463-cf99-97d4-ef9612843fe8' to view
               more panic detail.  Please refer to the knowledge article for
               additional information.

--------------- ------------------------------------ -------------- ---------
TIME            EVENT-ID                             MSG-ID         SEVERITY
--------------- ------------------------------------ -------------- ---------
Dec 07 20:34:41 7492a400-17a2-eb87-ec75-ada813c9a447 PCIEX-8000-0A  Critical

Host        : dell6510
Platform    : Precision-7720    Chassis_id  : 49JT5M2
Product_sn  :

Fault class : fault.io.pciex.device-interr
Affects     : dev:////pci@0,0/pci8086,a114@1c,4/pci8086,15da@0
               dev:////pci@0,0/pci8086,a114@1c,4/pci8086,15da@0/pci8086,15da@2/pci1028,7b1@0
               dev:////pci@0,0/pci8086,a114@1c,4
                   faulted and taken out of service
FRU         : "MB" (hc://:product-id=Precision-7720:server-id=dell6510:chassis-id=49JT5M2/motherboard=0)
                   faulty

Description : A problem was detected for a PCIEX device.
               Refer to http://illumos.org/msg/PCIEX-8000-0A for more
               information.

Response    : One or more device instances may be disabled

Impact      : Loss of services provided by the device instances associated with
               this fault

Action      : Schedule a repair procedure to replace the affected device.  Use
               fmadm faulty to identify the device or contact your illumos
               distribution team for support.

--------------- ------------------------------------ -------------- ---------
TIME            EVENT-ID                             MSG-ID         SEVERITY
--------------- ------------------------------------ -------------- ---------
Dec 07 20:34:44 df189892-4b85-ea85-84ab-d8a535350bcf SUNOS-8000-J0  Major

Host        : dell6510
Platform    : Precision-7720    Chassis_id  : 49JT5M2
Product_sn  :

Fault class : fault.sunos.eft.unexpected_telemetry 50%
               defect.sunos.eft.unexpected_telemetry 50%
Problem in  : dev:////pci@0,0
                   faulted and taken out of service

Description : The diagnosis engine encountered telemetry from the listed
               devices for which it was unable to perform a diagnosis -
               Refer to http://illumos.org/msg/SUNOS-8000-J0 for more
               information.  Refer to http://illumos.org/msg/SUNOS-8000-J0 for
               more information.

Response    : Error reports have been logged for examination by your illumos
               distribution team.

Impact      : Automated diagnosis and response for these events will not occur.

Action      : Ensure that the latest illumos Kernel and Predictive Self-Healing
               (PSH) updates are installed.

--------------- ------------------------------------ -------------- ---------
TIME            EVENT-ID                             MSG-ID         SEVERITY
--------------- ------------------------------------ -------------- ---------
Dec 07 20:34:41 eab08cd5-4eeb-cd65-e14c-e11932c36288 SUNOS-8000-KL  Major

Host        : dell6510
Platform    : Precision-7720    Chassis_id  : 49JT5M2
Product_sn  :

Fault class : defect.sunos.kernel.panic
Affects     : sw:///:path=/var/crash/dell/.eab08cd5-4eeb-cd65-e14c-e11932c36288
                   faulted but still in service
Problem in  : sw:///:path=/var/crash/dell/.eab08cd5-4eeb-cd65-e14c-e11932c36288
                   faulted but still in service

Description : The system has rebooted after a kernel panic.  Refer to
               http://illumos.org/msg/SUNOS-8000-KL for more information.

Response    : The failed system image was dumped to the dump device.  If
               savecore is enabled (see dumpadm(1M)) a copy of the dump will be
               written to the savecore directory .

Impact      : There may be some performance impact while the panic is copied to
               the savecore directory.  Disk space usage by panics can be
               substantial.

Action      : If savecore is not enabled then please take steps to preserve the
               crash image.
               Use 'fmdump -Vp -u eab08cd5-4eeb-cd65-e14c-e11932c36288' to view
               more panic detail.  Please refer to the knowledge article for
               additional information.


*fmdump*

TIME                           UUID                                 SUNW-MSG-ID
Dec 07 2020 20:34:41.581780000 3e63abab-31e9-479b-adc8-f7dcf8bdafe6 DISK-8000-3E

   TIME                 CLASS                                 ENA
   Dec 07 20:34:33.1042 ereport.io.scsi.cmd.disk.dev.rqs.derr 0x00d5ec4d51d00c01

nvlist version: 0
     version = 0x0
     class = list.suspect
     uuid = 3e63abab-31e9-479b-adc8-f7dcf8bdafe6
     code = DISK-8000-3E
     diag-time = 1607369681 472777
     de = fmd:///module/eft
     fault-list-sz = 0x1
     fault-list = (array of embedded nvlists)
     (start fault-list[0])
     nvlist version: 0
         version = 0x0
         class = fault.io.scsi.cmd.disk.dev.rqs.derr
         certainty = 0x64
         resource = hc://:product-id=Precision-7720:server-id=dell6510:chassis-id=49JT5M2:serial=DD56419883E4A:part=JMICRON-JMS583:revision=0204/motherboard=0/hostbridge=4/pciexrc=4/pciexbus=5/pciexdev=0/pciexfn=0/pciexbus=6/pciexdev=2/pciexfn=0/pciexbus=61/pciexdev=0/pciexfn=0/port=0/usb-device=0/disk=0
         asru = dev:///:devid=id1,sd@n3044564198835410//pci@0,0/pci8086,a114@1c,4/pci8086,15da@0/pci8086,15da@2/pci1028,7b1@0/storage@3/disk@0,0
         fru = hc://:product-id=Precision-7720:server-id=dell6510:chassis-id=49JT5M2:serial=DD56419883E4A:part=JMICRON-JMS583:revision=0204/motherboard=0/hostbridge=4/pciexrc=4/pciexbus=5/pciexdev=0/pciexfn=0/pciexbus=6/pciexdev=2/pciexfn=0/pciexbus=61/pciexdev=0/pciexfn=0/port=0/usb-device=0/disk=0
     (end fault-list[0])

     fault-status = 0x1
     severity = Critical
     __ttl = 0x1
     __tod = 0x5fce83d1 0x22ad4220

TIME                           UUID                                 SUNW-MSG-ID
Dec 07 2020 20:36:11.733251000 0e9cf93e-0463-cf99-97d4-ef9612843fe8 SUNOS-8000-KL

   TIME                 CLASS                                 ENA
   Dec 07 20:36:10.5791 ireport.os.sunos.panic.dump_pending_on_device 0x0000000000000000

nvlist version: 0
     version = 0x0
     class = list.suspect
     uuid = 0e9cf93e-0463-cf99-97d4-ef9612843fe8
     code = SUNOS-8000-KL
     diag-time = 1607369771 580716
     de = fmd:///module/software-diagnosis
     fault-list-sz = 0x1
     fault-list = (array of embedded nvlists)
     (start fault-list[0])
     nvlist version: 0
         version = 0x0
         class = defect.sunos.kernel.panic
         certainty = 0x64
         asru = sw:///:path=/var/crash/dell/.0e9cf93e-0463-cf99-97d4-ef9612843fe8
         resource = sw:///:path=/var/crash/dell/.0e9cf93e-0463-cf99-97d4-ef9612843fe8
         savecore-succcess = 0
         os-instance-uuid = 0e9cf93e-0463-cf99-97d4-ef9612843fe8
         panicstr = XHCI runtime reset required
         panicstack = xhci:xhci_taskq+393cdf07 () | genunix:taskq_thread+2cd () | unix:thread_start+b () |
         crashtime = 1607369692
         panic-time =  7 December 2020 at 20:34:52 CET CET
     (end fault-list[0])

     fault-status = 0x1
     severity = Major
     __ttl = 0x1
     __tod = 0x5fce842b 0x2bb485b8
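
Note: the panicstack field is a "|"-separated list of frames in module:function+offset form, innermost frame first. A minimal sketch for splitting it up follows; the Frame type and parse_panicstack helper are hypothetical names, and the offsets are assumed to be hexadecimal, as is usual for illumos stack traces.

```python
import re
from typing import List, NamedTuple


class Frame(NamedTuple):
    module: str
    function: str
    offset: int  # byte offset into the function, parsed as hex


# One frame looks like "genunix:taskq_thread+2cd ()".
FRAME_RE = re.compile(
    r"(?P<module>[^:\s]+):(?P<function>[^+\s]+)\+(?P<offset>[0-9a-fA-F]+)"
)


def parse_panicstack(text: str) -> List[Frame]:
    """Split a panicstack string into frames, innermost first."""
    frames = []
    for part in text.split("|"):
        match = FRAME_RE.search(part)
        if match:
            frames.append(
                Frame(match["module"], match["function"], int(match["offset"], 16))
            )
    return frames


# Copied from the panicstack field of the SUNOS-8000-KL record.
PANICSTACK = (
    "xhci:xhci_taskq+393cdf07 () | genunix:taskq_thread+2cd () | "
    "unix:thread_start+b () |"
)

if __name__ == "__main__":
    for frame in parse_panicstack(PANICSTACK):
        print(f"{frame.module:8} {frame.function:14} +0x{frame.offset:x}")
```

Read bottom-up, the stack says that a kernel thread (thread_start) was running a task queue worker (taskq_thread), which called into the xhci module, where the panic was raised. The offset 0x393cdf07 in the first frame is several hundred megabytes, far larger than any function, so that frame's symbol is probably not resolved reliably and should not be read literally.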

TIME                           UUID                                 SUNW-MSG-ID
Dec 07 2020 20:34:41.290518000 7492a400-17a2-eb87-ec75-ada813c9a447 PCIEX-8000-0A

   TIME                 CLASS                                 ENA
   Dec 07 20:33:15.5142 ereport.io.service.lost               0x8481ef5245200801

nvlist version: 0
     version = 0x0
     class = list.suspect
     uuid = 7492a400-17a2-eb87-ec75-ada813c9a447
     code = PCIEX-8000-0A
     diag-time = 1607369681 131074
     de = fmd:///module/eft
     fault-list-sz = 0x3
     fault-list = (array of embedded nvlists)
     (start fault-list[0])
     nvlist version: 0
         version = 0x0
         class = fault.io.pciex.device-interr
         certainty = 0x28
         resource = hc://:product-id=Precision-7720:server-id=dell6510:chassis-id=49JT5M2/motherboard=0/hostbridge=4/pciexrc=4/pciexbus=5/pciexdev=0/pciexfn=0/pciexbus=6/pciexdev=2/pciexfn=0/pciexbus=61/pciexdev=0/pciexfn=0
         asru = dev:////pci@0,0/pci8086,a114@1c,4/pci8086,15da@0/pci8086,15da@2/pci1028,7b1@0
         fru = hc://:product-id=Precision-7720:server-id=dell6510:chassis-id=49JT5M2/motherboard=0
         location = MB
     (end fault-list[0])
     (start fault-list[1])
     nvlist version: 0
         version = 0x0
         class = fault.io.pciex.device-interr
         certainty = 0x28
         resource = hc://:product-id=Precision-7720:server-id=dell6510:chassis-id=49JT5M2/motherboard=0/hostbridge=4/pciexrc=4/pciexbus=5/pciexdev=0/pciexfn=0
         asru = dev:////pci@0,0/pci8086,a114@1c,4/pci8086,15da@0
         fru = hc://:product-id=Precision-7720:server-id=dell6510:chassis-id=49JT5M2/motherboard=0
         location = MB
     (end fault-list[1])
     (start fault-list[2])
     nvlist version: 0
         version = 0x0
         class = fault.io.pciex.device-interr
         certainty = 0x14
         resource = hc://:product-id=Precision-7720:server-id=dell6510:chassis-id=49JT5M2/motherboard=0/hostbridge=4/pciexrc=4
         asru = dev:////pci@0,0/pci8086,a114@1c,4
         fru = hc://:product-id=Precision-7720:server-id=dell6510:chassis-id=49JT5M2/motherboard=0
         location = MB
     (end fault-list[2])

     fault-status = 0x1 0x1 0x1
     severity = Critical
     __ttl = 0x1
     __tod = 0x5fce83d1 0x1150f3f0
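
Note: the certainty fields in these records are percentages written in hexadecimal. The sketch below converts the three values from the PCIEX-8000-0A fault-list; the helper name is hypothetical and the short ASRU labels are only there to keep the output readable.

```python
# certainty values copied from fault-list[0..2] of the PCIEX-8000-0A record,
# keyed by the last component of each asru path (labels shortened by hand).
SUSPECTS = {
    "pci1028,7b1@0": "0x28",
    "pci8086,15da@0": "0x28",
    "pci8086,a114@1c,4": "0x14",
}


def certainty_percent(value: str) -> int:
    """Convert a hexadecimal certainty such as '0x28' to a percentage."""
    return int(value, 16)


if __name__ == "__main__":
    for asru, raw in SUSPECTS.items():
        print(f"{asru:20} {certainty_percent(raw):3d}%")
    total = sum(certainty_percent(raw) for raw in SUSPECTS.values())
    print(f"{'total':20} {total:3d}%")
```

The three suspects come out as 40%, 40% and 20%, which add up to 100%. In the same way, 0x64 in the DISK-8000-3E and SUNOS-8000-KL records is 100%, and 0x32 in the SUNOS-8000-J0 record is the 50% that fmadm faulty prints next to each fault class.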

TIME                           UUID                                 SUNW-MSG-ID
Dec 07 2020 20:34:44.491026000 df189892-4b85-ea85-84ab-d8a535350bcf SUNOS-8000-J0

   TIME                 CLASS                                 ENA
   Dec 07 20:34:41.7708 ereport.io.pciex.rc.ce-msg            0x00f6357b6cb01c01

nvlist version: 0
     version = 0x0
     class = list.suspect
     uuid = df189892-4b85-ea85-84ab-d8a535350bcf
     code = SUNOS-8000-J0
     diag-time = 1607369684 471770
     de = fmd:///module/eft
     fault-list-sz = 0x2
     fault-list = (array of embedded nvlists)
     (start fault-list[0])
     nvlist version: 0
         version = 0x0
         class = defect.sunos.eft.unexpected_telemetry
         certainty = 0x32
         resource = dev:////pci@0,0
         reason = no valid path to component was found in ereport.io.pciex.rc.ce-msg
         retire = 0
         response = 0
     (end fault-list[0])
     (start fault-list[1])
     nvlist version: 0
         version = 0x0
         class = fault.sunos.eft.unexpected_telemetry
         certainty = 0x32
         resource = dev:////pci@0,0
         reason = no valid path to component was found in ereport.io.pciex.rc.ce-msg
         retire = 0
         response = 0
     (end fault-list[1])

     fault-status = 0x1 0x1
     severity = Major
     __ttl = 0x1
     __tod = 0x5fce83d4 0x1d447650

TIME                           UUID                                 SUNW-MSG-ID
Dec 07 2020 20:34:41.055285000 eab08cd5-4eeb-cd65-e14c-e11932c36288 SUNOS-8000-KL

   TIME                 CLASS                                 ENA
   Dec 07 20:34:39.1380 ireport.os.sunos.panic.dump_pending_on_device 0x0000000000000000

nvlist version: 0
     version = 0x0
     class = list.suspect
     uuid = eab08cd5-4eeb-cd65-e14c-e11932c36288
     code = SUNOS-8000-KL
     diag-time = 1607369681 35677
     de = fmd:///module/software-diagnosis
     fault-list-sz = 0x1
     fault-list = (array of embedded nvlists)
     (start fault-list[0])
     nvlist version: 0
         version = 0x0
         class = defect.sunos.kernel.panic
         certainty = 0x64
         asru = sw:///:path=/var/crash/dell/.eab08cd5-4eeb-cd65-e14c-e11932c36288
         resource = sw:///:path=/var/crash/dell/.eab08cd5-4eeb-cd65-e14c-e11932c36288
         savecore-succcess = 0
         os-instance-uuid = eab08cd5-4eeb-cd65-e14c-e11932c36288
         panicstr = XHCI runtime reset required
         panicstack = xhci:xhci_taskq+393cdf07 () | genunix:taskq_thread+2cd () | unix:thread_start+b () |
         crashtime = 1607369595
         panic-time =  7 December 2020 at 20:33:15 CET CET
     (end fault-list[0])

     fault-status = 0x1
     severity = Major
     __ttl = 0x1
     __tod = 0x5fce83d1 0x34b9508
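
Note: the __tod pairs are seconds and nanoseconds since the Unix epoch in hexadecimal, and crashtime is epoch seconds in decimal. The sketch below decodes the values quoted in the records above; the helper names are hypothetical, and the fixed UTC+1 offset is an assumption based on the "CET" suffix of the panic-time fields.

```python
from datetime import datetime, timedelta, timezone
from typing import Tuple

# Assumption: local time is CET (UTC+1), as the panic-time fields suggest.
CET = timezone(timedelta(hours=1), "CET")


def decode_tod(sec_hex: str, nsec_hex: str) -> Tuple[int, int]:
    """Turn a '__tod = 0x... 0x...' pair into (seconds, nanoseconds)."""
    return int(sec_hex, 16), int(nsec_hex, 16)


def to_cet(epoch_seconds: int) -> datetime:
    """Convert epoch seconds to a timezone-aware CET datetime."""
    return datetime.fromtimestamp(epoch_seconds, tz=CET)


# crashtime values of the two SUNOS-8000-KL records
FIRST_PANIC = 1607369595
SECOND_PANIC = 1607369692

if __name__ == "__main__":
    secs, nsecs = decode_tod("0x5fce83d1", "0x22ad4220")  # DISK-8000-3E record
    print(to_cet(secs).strftime("%H:%M:%S"), f"+ {nsecs} ns")
    print("first panic: ", to_cet(FIRST_PANIC).strftime("%H:%M:%S"))
    print("second panic:", to_cet(SECOND_PANIC).strftime("%H:%M:%S"))
    print("seconds between panics:", SECOND_PANIC - FIRST_PANIC)
```

The decoded __tod of the DISK-8000-3E record is 20:34:41.581780000, the same time fmdump prints for that event. The two crashtime values decode to 20:33:15 and 20:34:52, matching the panic-time fields, so the system panicked twice within 97 seconds.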


