[OpenIndiana-discuss] ZFS hangs - causes host to panic

Stephan Budach stephan.budach at jvm.de
Wed Feb 7 10:36:24 UTC 2018



----- Original Message -----
> From: "Stephan Budach" <stephan.budach at jvm.de>
> To: "Discussion list for OpenIndiana" <openindiana-discuss at openindiana.org>
> Sent: Tuesday, 16 January 2018 14:15:37
> Subject: [OpenIndiana-discuss] ZFS hangs - causes host to panic
> 
> 
> Hi,
> 
> 
> I am currently putting my new NVMe servers through their paces and I
> have already experienced two panics on one of those hosts.
> After the host took "forever" to write the crash dump, I found this
> in the syslog after reboot:
> 
> 
> 
> Jan 16 13:25:29 nfsvmpool09 savecore: [ID 570001 auth.error] reboot
> after panic: I/O to pool 'nvmeTank02' appears to be hung.
> Jan 16 13:25:29 nfsvmpool09 savecore: [ID 771660 auth.error] Panic
> crashdump pending on dump device but dumpadm -n in effect; run
> savecore(1M) manually to extract. Image UUID
> 995846d5-8c94-4f68-bada-e05ae5e4cb25(fault-management initiated).
> 
> 
> I ran mdb against the crash dump, but I am still a novice at reading
> this information:
> 
> 
> 
> root at nfsvmpool09:/var/crash/nfsvmpool09# mdb unix.0 vmcore.0
> Loading modules: [ unix genunix specfs dtrace mac cpu.generic uppc
> apix scsi_vhci zfs sata sd ip hook neti sockfs arp usba fctl stmf
> stmf_sbd mm lofs i40e idm cpc crypto fcip fcp random ufs logindmux
> nsmb ptm smbsrv nfs sppp ipc ]
> > $C
> ffffd000f5dd79d0 vpanic()
> ffffd000f5dd7a20 vdev_deadman+0x10b(ffffd0320fb69980)
> ffffd000f5dd7a70 vdev_deadman+0x4a(ffffd0333b018940)
> ffffd000f5dd7ac0 vdev_deadman+0x4a(ffffd03228f796c0)
> ffffd000f5dd7af0 spa_deadman+0xad(ffffd03229543000)
> ffffd000f5dd7b90 cyclic_softint+0xfd(ffffd031eac4db00, 0)
> ffffd000f5dd7ba0 cbe_low_level+0x14()
> ffffd000f5dd7bf0 av_dispatch_softvect+0x78(2)
> ffffd000f5dd7c20 apix_dispatch_softint+0x35(0, 0)
> ffffd000f5da1990 switch_sp_and_call+0x13()
> ffffd000f5da19e0 apix_do_softint+0x6c(ffffd000f5da1a50)
> ffffd000f5da1a40 apix_do_interrupt+0x362(ffffd000f5da1a50, 2)
> ffffd000f5da1a50 _interrupt+0xba()
> ffffd000f5da1bc0 acpi_cpu_cstate+0x11b(ffffd031e98a43e0)
> ffffd000f5da1bf0 cpu_acpi_idle+0x8d()
> ffffd000f5da1c00 cpu_idle_adaptive+0x13()
> ffffd000f5da1c20 idle+0xa7()
> ffffd000f5da1c30 thread_start+8()
> > 
> 
> 
> Can anybody make something useful of that?
> 
> 
> Thanks,
> Stephan


I have been trying to hunt this down further, as it only seems to affect some NVMe SSDs; consequently, the error moves along with wherever I put these NVMe SSDs. What seems to happen is that, at some random point, writes to an NVMe SSD stop completing, and eventually the ZFS deadman timer kicks in and panics the host.
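As an aside, for anyone who would rather keep the host up for debugging than collect yet another crash dump: the deadman panic itself is tunable on illumos. A sketch for /etc/system (tunable names as found in the illumos ZFS code; please verify the defaults on your own release before changing anything):

```
* Sketch: keep the host up for debugging instead of panicking on hung I/O.
* zfs_deadman_enabled = 0 disables the deadman panic entirely;
* zfs_deadman_synctime_ms raises the hang threshold (in milliseconds).
set zfs:zfs_deadman_enabled = 0
set zfs:zfs_deadman_synctime_ms = 1000000
```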

What I was able to gather is that at that point the SSD becomes 100% busy with no actual transfer between the device and the host. iostat -xenM will show something like this:

                            extended device statistics       ---- errors ---
    r/s    w/s   Mr/s   Mw/s wait actv wsvc_t asvc_t  %w  %b s/w h/w trn tot device
    0,0    0,0    0,0    0,0  0,0  1,0    0,0    0,0   0 100   0  27   0  27 c21t1d0
    0,0    0,0    0,0    0,0  0,0  0,0    0,0    0,0   0   0   0   3   0   3 c14t1d0
    0,0    0,0    0,0    0,0  0,0  0,0    0,0    0,0   0   0   0   2   0   2 c29t1d0
    0,0    0,0    0,0    0,0  0,0  0,0    0,0    0,0   0   0   0   3   0   3 c6t1d0
    0,0    0,0    0,0    0,0  0,0  0,0    0,0    0,0   0   0   0   3   0   3 c15t1d0
    0,0    0,0    0,0    0,0  0,0  1,0    0,0    0,0   0 100   0   2   0   2 c13t1d0
    0,0    0,0    0,0    0,0  0,0  0,0    0,0    0,0   0   0   0  28   0  28 c23t1d0
    0,0    0,0    0,0    0,0  0,0  0,0    0,0    0,0   0   0   0   2   0   2 c16t1d0
    0,0    0,0    0,0    0,0  0,0  0,0    0,0    0,0   0   0   0   2   0   2 c24t1d0
    0,0    0,0    0,0    0,0  0,0  0,0    0,0    0,0   0   0   0  27   0  27 c19t1d0
    0,0    0,0    0,0    0,0  0,0  0,0    0,0    0,0   0   0   0  27   0  27 c22t1d0
    0,0    0,0    0,0    0,0  0,0  0,0    0,0    0,0   0   0   0   3   0   3 c12t1d0
    0,0    0,0    0,0    0,0  0,0  0,0    0,0    0,0   0   0   0   2   0   2 c17t1d0
    0,0    0,0    0,0    0,0  0,0  0,0    0,0    0,0   0   0   0   3   0   3 c7t1d0
    0,0    0,0    0,0    0,0  0,0  0,0    0,0    0,0   0   0   0  27   0  27 c20t1d0
    0,0    0,0    0,0    0,0  0,0  0,0    0,0    0,0   0   0   0   3   0   3 c10t1d0
    0,0    0,0    0,0    0,0  0,0  0,0    0,0    0,0   0   0   0   3   0   3 c26t1d0
    0,0    0,0    0,0    0,0  0,0  0,0    0,0    0,0   0   0   0   2   0   2 c8t1d0
    0,0    0,0    0,0    0,0  0,0  0,0    0,0    0,0   0   0   0   3   0   3 c25t1d0
    0,0    0,0    0,0    0,0  0,0  0,0    0,0    0,0   0   0   0   2   0   2 c27t1d0
 1844,2    0,0   14,4    0,0  0,0  0,4    0,0    0,2   0  39   0  27   0  27 c18t1d0
    0,0    0,0    0,0    0,0  0,0  0,0    0,0    0,0   0   0   0   3   0   3 c11t1d0
    0,0    0,0    0,0    0,0  0,0  0,0    0,0    0,0   0   0   0   3   0   3 c9t1d0
    0,0    0,0    0,0    0,0  0,0  0,0    0,0    0,0   0   0   0   2   0   2 c28t1d0
    0,0    0,0    0,0    0,0 98,0  1,0    0,0    0,0 100 100   0   0   0   0 nvmeTank01
    0,0    0,0    0,0    0,0 104,0  1,0    0,0    0,0 100 100   0   0   0   0 nvmeTank02
 1844,2    0,0   14,4    0,0  0,0  0,5    0,0    0,3   0  50   0   0   0   0 poolc18d1t0
    0,0    0,0    0,0    0,0  0,0  0,0    0,0    0,0   0   0   0   0   0   0 rpool
    0,0    0,0    0,0    0,0  0,0  0,0    0,0    0,0   0   0   0   0   0   0 c4t0d0
    0,0    0,0    0,0    0,0  0,0  0,0    0,0    0,0   0   0   0   0   0   0 c4t1d0

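To catch the stuck device as it happens, one can watch the iostat output with a small filter. A rough sketch (not from any official tool): the field positions match the -xenM layout above, pool rows like nvmeTank01 are deliberately excluded, and the ',' decimals of my locale simply parse as 0 in awk:

```shell
# Read 'iostat -xenM' output on stdin and print any cXtYdZ device that
# is 100% busy while moving no data -- the signature shown above.
# Fields: $3 = Mr/s, $4 = Mw/s, $10 = %b, $15 = device name.
awk '$10 == 100 && ($3 + 0) == 0 && ($4 + 0) == 0 && $15 ~ /^c[0-9]+t[0-9]+d[0-9]+$/ {
    print "stuck device:", $15
}'
```

Run it on the affected host as, e.g., `iostat -xenM 5 | awk -f stuck.awk` (with the awk program saved to stuck.awk) to get a line per interval for every device showing the hang signature.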
c21t1d0 and c13t1d0 are blocking their respective zpools, but I have also seen other SSDs behave this way, so I am wondering how likely it is that I got a really bad batch of NVMe SSDs, since at the moment at least three devices exhibit this odd behaviour.





