[OpenIndiana-discuss] Kernel panic on hung zpool accessed via lofi

Watson, Dan Dan.Watson at bcferries.com
Fri Sep 11 18:57:46 UTC 2015


Hi all,

I've been enjoying OI for quite a while, but I'm running into a problem accessing a zpool whose disk image files sit on a ZFS file system and are attached via lofi, and I hope someone can give me a hint.

To recover data from a zpool, I've copied slice 0 of each disk to a different host under /alt (a ZFS file system):
root@represent:/alt# ls
c1t50014EE0037B0FF3d0s0.dd  c1t50014EE0AE25CF55d0s0.dd  c1t50014EE2081874CAd0s0.dd  c1t50014EE25D6CDE92d0s0.dd  c1t50014EE25D6DDBC7d0s0.dd  c1t50014EE2B2C380C3d0s0.dd
c1t50014EE0037B105Fd0s0.dd  c1t50014EE0AE25EFD1d0s0.dd  c1t50014EE20818C0ECd0s0.dd  c1t50014EE25D6DCF0Ed0s0.dd  c1t50014EE2B2C27AE2d0s0.dd  c1t50014EE6033DD776d0s0.dd
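
The images are plain block-for-block copies of each slice, roughly along these lines (the exact transfer command below is illustrative, not a transcript; hostnames and paths are taken from the listings above):

# illustrative only -- block copy of one slice from the original host,
# streamed over ssh into /alt on this host; repeated per disk
dd if=/dev/rdsk/c1t50014EE0037B0FF3d0s0 bs=1048576 | \
    ssh represent 'cat > /alt/c1t50014EE0037B0FF3d0s0.dd'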

I use lofiadm to access the disk images as devices, because for some reason ZFS can't access a "device"-formatted vdev when it is presented as a plain file:
root@represent:/alt# lofiadm
Block Device             File                           Options
/dev/lofi/1              /alt/c1t50014EE0037B0FF3d0s0.dd        -
/dev/lofi/2              /alt/c1t50014EE0037B105Fd0s0.dd        -
/dev/lofi/3              /alt/c1t50014EE0AE25CF55d0s0.dd        -
/dev/lofi/4              /alt/c1t50014EE0AE25EFD1d0s0.dd        -
/dev/lofi/5              /alt/c1t50014EE2081874CAd0s0.dd        -
/dev/lofi/6              /alt/c1t50014EE20818C0ECd0s0.dd        -
/dev/lofi/7              /alt/c1t50014EE25D6CDE92d0s0.dd        -
/dev/lofi/8              /alt/c1t50014EE25D6DCF0Ed0s0.dd        -
/dev/lofi/9              /alt/c1t50014EE25D6DDBC7d0s0.dd        -
/dev/lofi/10             /alt/c1t50014EE2B2C27AE2d0s0.dd        -
/dev/lofi/11             /alt/c1t50014EE2B2C380C3d0s0.dd        -
/dev/lofi/12             /alt/c1t50014EE6033DD776d0s0.dd        -
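
For completeness, each image was attached with lofiadm -a, e.g.:

# attach one image file as a block device (repeated for each .dd file)
lofiadm -a /alt/c1t50014EE0037B0FF3d0s0.dd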

The zpool is identifiable:
root@represent:/alt# zpool import -d /dev/lofi
   pool: oldtank
     id: 13463599998639852818
  state: ONLINE
 status: One or more devices are missing from the system.
 action: The pool can be imported using its name or numeric identifier.
   see: http://illumos.org/msg/ZFS-8000-2Q
 config:
        oldtank                  ONLINE
          raidz2-0               ONLINE
            /dev/lofi/4          ONLINE
            /dev/lofi/2          ONLINE
            /dev/lofi/1          ONLINE
            /dev/lofi/3          ONLINE
            /dev/lofi/8          ONLINE
            /dev/lofi/10         ONLINE
            /dev/lofi/11         ONLINE
            /dev/lofi/7          ONLINE
            /dev/lofi/6          ONLINE
            /dev/lofi/9          ONLINE
            /dev/lofi/5          ONLINE
            /dev/lofi/12         ONLINE
        cache
          c1t50015178F36728A3d0
          c1t50015178F3672944d0

Then I import the zpool (this command never exits):
root@represent:/alt# zpool import -d /dev/lofi oldtank
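
(For reference, the same import could also be attempted read-only, which might sidestep whatever I/O ends up hanging on the write side; everything shown below is from the plain read-write import above:)

# possible variation: read-only import of the lofi-backed pool
zpool import -o readonly=on -d /dev/lofi oldtank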

In another window, iostat output shows that the system has managed to add the zpool:
                            extended device statistics       ---- errors ---
    r/s    w/s   Mr/s   Mw/s wait actv wsvc_t asvc_t  %w  %b s/w h/w trn tot device
  101.1    0.0    1.7    0.0  0.3  2.8    2.9   27.5  28 100   0   0   0   0 lofi1
  118.6    0.0    1.3    0.0  0.3  2.9    2.4   24.3  28 100   0   0   0   0 lofi2
  123.8    0.0    1.0    0.0  0.3  2.9    2.7   23.3  31  94   0   0   0   0 lofi3
  133.1    0.0    1.1    0.0  0.4  2.8    2.7   20.7  34  92   0   0   0   0 lofi4
  144.8    0.0    1.6    0.0  0.2  2.7    1.3   18.7  17  97   0   0   0   0 lofi5
  132.3    0.0    1.2    0.0  0.2  2.5    1.4   18.7  17  95   0   0   0   0 lofi6
  100.3    0.0    1.0    0.0  0.2  2.7    1.9   26.6  18 100   0   0   0   0 lofi7
  117.3    0.0    1.2    0.0  0.2  2.7    1.9   23.3  21  99   0   0   0   0 lofi8
  142.1    0.0    1.0    0.0  0.3  2.5    1.9   17.3  26  85   0   0   0   0 lofi9
  142.8    0.0    1.0    0.0  0.2  2.5    1.5   17.4  20  83   0   0   0   0 lofi10
  144.1    0.0    0.9    0.0  0.3  2.7    2.0   19.0  28  96   0   0   0   0 lofi11
  101.8    0.0    0.8    0.0  0.2  2.7    2.2   26.1  21  96   0   0   0   0 lofi12
 1502.1    0.0   13.7    0.0 3229.1 35.3 2149.7   23.5 100 100   0   0   0   0 oldtank
...
  195.6    0.0    5.8    0.0  0.0  6.1    0.0   31.4   0  95   0   0   0   0 c0t50014EE25F8307D2d0
  200.9    0.0    5.8    0.0  0.0  7.5    0.0   37.2   0  97   0   0   0   0 c0t50014EE2B4CAA6D3d0
  200.1    0.0    5.8    0.0  0.0  7.0    0.0   35.1   0  97   0   0   0   0 c0t50014EE25F74EC15d0
  197.9    0.0    5.9    0.0  0.0  7.2    0.0   36.2   0  96   0   0   0   0 c0t50014EE25F74DD46d0
  198.1    0.0    5.5    0.0  0.0  6.7    0.0   34.0   0  95   0   0   0   0 c0t50014EE2B4D7C1C9d0
  202.4    0.0    5.9    0.0  0.0  6.9    0.0   34.1   0  97   0   0   0   0 c0t50014EE2B4CA8F9Bd0
  223.9    0.0    6.9    0.0  0.0  8.8    0.0   39.1   0 100   0   0   0   0 c0t50014EE20A2DAE1Ed0
  201.6    0.0    5.9    0.0  0.0  6.6    0.0   32.9   0  96   0   0   0   0 c0t50014EE25F74F90Fd0
  210.9    0.0    6.0    0.0  0.0  8.7    0.0   41.5   0 100   0   0   0   0 c0t50014EE20C083E31d0
  222.9    0.0    6.5    0.0  0.0  9.1    0.0   40.7   0  99   0   0   0   0 c0t50014EE2B6B2FA22d0
  214.4    0.0    6.1    0.0  0.0  8.9    0.0   41.6   0 100   0   0   0   0 c0t50014EE20C07F3F3d0
  222.1    0.0    6.5    0.0  0.0  9.7    0.0   43.6   0 100   0   0   0   0 c0t50014EE2615D7B2Ed0
  219.1    0.0    6.2    0.0  0.0  9.5    0.0   43.3   0  99   0   2   8  10 c0t50014EE2B6B3FB99d0
  217.6    0.0    6.2    0.0  0.0  8.6    0.0   39.3   0  98   0   0   0   0 c0t50014EE20C07E598d0
  216.4    0.0    6.1    0.0  0.0  7.7    0.0   35.5   0 100   0   0   0   0 c0t50014EE2615D7ADAd0
  216.4    0.0    6.2    0.0  0.0  9.1    0.0   42.1   0 100   0   0   0   0 c0t50014EE2B6B3F65Ed0
...
3360.1    0.0   97.3    0.0 447.0 129.1  133.0   38.4 100 100   0   0   0   0 tank

But eventually the system panics, dumps core, and reboots.

Looking at the core dump, I get the following:
> ::status
debugging crash dump vmcore.0 (64-bit) from represent
operating system: 5.11 oi_151a9 (i86pc)
image uuid: 19b88adb-6510-e6e9-a723-95f098c85108
panic message: I/O to pool 'oldtank' appears to be hung.
dump content: kernel pages only
> $c
vpanic()
vdev_deadman+0xda(ffffff01ceb3e800)
vdev_deadman+0x37(ffffff01cff9f000)
vdev_deadman+0x37(ffffff01d53c4800)
spa_deadman+0x69(ffffff01ce186580)
cyclic_softint+0xdc(fffffffffbc30640, 0)
cbe_low_level+0x17()
av_dispatch_softvect+0x5f(2)
dispatch_softint+0x34(0, 0)
switch_sp_and_call+0x13()
dosoftint+0x59(ffffff0007a05ad0)
do_interrupt+0x114(ffffff0007a05ad0, 1)
_interrupt+0xba()
mach_cpu_idle+6()
cpu_idle+0xaf()
cpu_idle_adaptive+0x19()
idle+0x114()
thread_start+8()
>
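
The stack points at the ZFS deadman cyclic (spa_deadman/vdev_deadman), which panics when an outstanding I/O to the pool is judged hung. As a rough sketch, and assuming the usual illumos zfs_deadman_enabled tunable is present in this oi_151a9 build, the panic (though not the underlying hang) could be suppressed from /etc/system while experimenting:

* assumes the standard illumos zfs_deadman_enabled tunable exists in this build
* (a reboot is required for /etc/system changes to take effect)
set zfs:zfs_deadman_enabled = 0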

I have been able to reproduce this problem several times, although the import has managed to get far enough to rename the original zpool.

Has anyone else encountered this issue with lofi-mounted zpools?
I'm using mpt_sas with SATA drives, and I _DO_ have error counters climbing on some of those drives; could that be the cause?
Any other ideas?

I'd greatly appreciate any suggestions.

Thanks!
Dan
