[OpenIndiana-discuss] Kernel panic on hung zpool accessed via lofi
Watson, Dan
Dan.Watson at bcferries.com
Fri Sep 11 18:57:46 UTC 2015
Hi all,
I've been enjoying OI for quite a while, but I'm running into a problem accessing a zpool that lives on disk image files (sitting on a zfs filesystem, accessed via lofi), and I hope someone can give me a hint.
To recover data from a zpool, I've copied slice 0 from all of the disks to a different host, under /alt (a zfs file system):
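Each image was taken with a plain dd of slice 0, something like this (device path illustrative, run against the original disks):
# dd if=/dev/rdsk/c1t50014EE0037B0FF3d0s0 of=/alt/c1t50014EE0037B0FF3d0s0.dd bs=1048576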
root at represent:/alt# ls
c1t50014EE0037B0FF3d0s0.dd c1t50014EE0AE25CF55d0s0.dd c1t50014EE2081874CAd0s0.dd c1t50014EE25D6CDE92d0s0.dd c1t50014EE25D6DDBC7d0s0.dd c1t50014EE2B2C380C3d0s0.dd
c1t50014EE0037B105Fd0s0.dd c1t50014EE0AE25EFD1d0s0.dd c1t50014EE20818C0ECd0s0.dd c1t50014EE25D6DCF0Ed0s0.dd c1t50014EE2B2C27AE2d0s0.dd c1t50014EE6033DD776d0s0.dd
I use lofiadm to present the disk images as block devices, because for some reason zfs won't import a "device"-formatted vdev when it is presented as a plain file.
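Each image was attached one at a time, e.g.:
# lofiadm -a /alt/c1t50014EE0037B0FF3d0s0.dd
/dev/lofi/1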
root at represent:/alt# lofiadm
Block Device File Options
/dev/lofi/1 /alt/c1t50014EE0037B0FF3d0s0.dd -
/dev/lofi/2 /alt/c1t50014EE0037B105Fd0s0.dd -
/dev/lofi/3 /alt/c1t50014EE0AE25CF55d0s0.dd -
/dev/lofi/4 /alt/c1t50014EE0AE25EFD1d0s0.dd -
/dev/lofi/5 /alt/c1t50014EE2081874CAd0s0.dd -
/dev/lofi/6 /alt/c1t50014EE20818C0ECd0s0.dd -
/dev/lofi/7 /alt/c1t50014EE25D6CDE92d0s0.dd -
/dev/lofi/8 /alt/c1t50014EE25D6DCF0Ed0s0.dd -
/dev/lofi/9 /alt/c1t50014EE25D6DDBC7d0s0.dd -
/dev/lofi/10 /alt/c1t50014EE2B2C27AE2d0s0.dd -
/dev/lofi/11 /alt/c1t50014EE2B2C380C3d0s0.dd -
/dev/lofi/12 /alt/c1t50014EE6033DD776d0s0.dd -
The zpool is identifiable
root at represent:/alt# zpool import -d /dev/lofi
pool: oldtank
id: 13463599998639852818
state: ONLINE
status: One or more devices are missing from the system.
action: The pool can be imported using its name or numeric identifier.
see: http://illumos.org/msg/ZFS-8000-2Q
config:
oldtank ONLINE
raidz2-0 ONLINE
/dev/lofi/4 ONLINE
/dev/lofi/2 ONLINE
/dev/lofi/1 ONLINE
/dev/lofi/3 ONLINE
/dev/lofi/8 ONLINE
/dev/lofi/10 ONLINE
/dev/lofi/11 ONLINE
/dev/lofi/7 ONLINE
/dev/lofi/6 ONLINE
/dev/lofi/9 ONLINE
/dev/lofi/5 ONLINE
/dev/lofi/12 ONLINE
cache
c1t50015178F36728A3d0
c1t50015178F3672944d0
And I import the zpool (this command never exits)
root at represent:/alt# zpool import -d /dev/lofi oldtank
In another window it is evident that the system has managed to add the zpool
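(The stats below are extended iostat output; the exact flags and interval are approximate:)
# iostat -xnMe 5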
extended device statistics ---- errors ---
r/s w/s Mr/s Mw/s wait actv wsvc_t asvc_t %w %b s/w h/w trn tot device
101.1 0.0 1.7 0.0 0.3 2.8 2.9 27.5 28 100 0 0 0 0 lofi1
118.6 0.0 1.3 0.0 0.3 2.9 2.4 24.3 28 100 0 0 0 0 lofi2
123.8 0.0 1.0 0.0 0.3 2.9 2.7 23.3 31 94 0 0 0 0 lofi3
133.1 0.0 1.1 0.0 0.4 2.8 2.7 20.7 34 92 0 0 0 0 lofi4
144.8 0.0 1.6 0.0 0.2 2.7 1.3 18.7 17 97 0 0 0 0 lofi5
132.3 0.0 1.2 0.0 0.2 2.5 1.4 18.7 17 95 0 0 0 0 lofi6
100.3 0.0 1.0 0.0 0.2 2.7 1.9 26.6 18 100 0 0 0 0 lofi7
117.3 0.0 1.2 0.0 0.2 2.7 1.9 23.3 21 99 0 0 0 0 lofi8
142.1 0.0 1.0 0.0 0.3 2.5 1.9 17.3 26 85 0 0 0 0 lofi9
142.8 0.0 1.0 0.0 0.2 2.5 1.5 17.4 20 83 0 0 0 0 lofi10
144.1 0.0 0.9 0.0 0.3 2.7 2.0 19.0 28 96 0 0 0 0 lofi11
101.8 0.0 0.8 0.0 0.2 2.7 2.2 26.1 21 96 0 0 0 0 lofi12
1502.1 0.0 13.7 0.0 3229.1 35.3 2149.7 23.5 100 100 0 0 0 0 oldtank
...
195.6 0.0 5.8 0.0 0.0 6.1 0.0 31.4 0 95 0 0 0 0 c0t50014EE25F8307D2d0
200.9 0.0 5.8 0.0 0.0 7.5 0.0 37.2 0 97 0 0 0 0 c0t50014EE2B4CAA6D3d0
200.1 0.0 5.8 0.0 0.0 7.0 0.0 35.1 0 97 0 0 0 0 c0t50014EE25F74EC15d0
197.9 0.0 5.9 0.0 0.0 7.2 0.0 36.2 0 96 0 0 0 0 c0t50014EE25F74DD46d0
198.1 0.0 5.5 0.0 0.0 6.7 0.0 34.0 0 95 0 0 0 0 c0t50014EE2B4D7C1C9d0
202.4 0.0 5.9 0.0 0.0 6.9 0.0 34.1 0 97 0 0 0 0 c0t50014EE2B4CA8F9Bd0
223.9 0.0 6.9 0.0 0.0 8.8 0.0 39.1 0 100 0 0 0 0 c0t50014EE20A2DAE1Ed0
201.6 0.0 5.9 0.0 0.0 6.6 0.0 32.9 0 96 0 0 0 0 c0t50014EE25F74F90Fd0
210.9 0.0 6.0 0.0 0.0 8.7 0.0 41.5 0 100 0 0 0 0 c0t50014EE20C083E31d0
222.9 0.0 6.5 0.0 0.0 9.1 0.0 40.7 0 99 0 0 0 0 c0t50014EE2B6B2FA22d0
214.4 0.0 6.1 0.0 0.0 8.9 0.0 41.6 0 100 0 0 0 0 c0t50014EE20C07F3F3d0
222.1 0.0 6.5 0.0 0.0 9.7 0.0 43.6 0 100 0 0 0 0 c0t50014EE2615D7B2Ed0
219.1 0.0 6.2 0.0 0.0 9.5 0.0 43.3 0 99 0 2 8 10 c0t50014EE2B6B3FB99d0
217.6 0.0 6.2 0.0 0.0 8.6 0.0 39.3 0 98 0 0 0 0 c0t50014EE20C07E598d0
216.4 0.0 6.1 0.0 0.0 7.7 0.0 35.5 0 100 0 0 0 0 c0t50014EE2615D7ADAd0
216.4 0.0 6.2 0.0 0.0 9.1 0.0 42.1 0 100 0 0 0 0 c0t50014EE2B6B3F65Ed0
...
3360.1 0.0 97.3 0.0 447.0 129.1 133.0 38.4 100 100 0 0 0 0 tank
But eventually the system panics, core dumps and reboots.
Looking at the core dump in mdb I get the following:
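(The dump was opened with something like the following, assuming the default savecore directory:)
# cd /var/crash/represent && mdb unix.0 vmcore.0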
> ::status
debugging crash dump vmcore.0 (64-bit) from represent
operating system: 5.11 oi_151a9 (i86pc)
image uuid: 19b88adb-6510-e6e9-a723-95f098c85108
panic message: I/O to pool 'oldtank' appears to be hung.
dump content: kernel pages only
> $c
vpanic()
vdev_deadman+0xda(ffffff01ceb3e800)
vdev_deadman+0x37(ffffff01cff9f000)
vdev_deadman+0x37(ffffff01d53c4800)
spa_deadman+0x69(ffffff01ce186580)
cyclic_softint+0xdc(fffffffffbc30640, 0)
cbe_low_level+0x17()
av_dispatch_softvect+0x5f(2)
dispatch_softint+0x34(0, 0)
switch_sp_and_call+0x13()
dosoftint+0x59(ffffff0007a05ad0)
do_interrupt+0x114(ffffff0007a05ad0, 1)
_interrupt+0xba()
mach_cpu_idle+6()
cpu_idle+0xaf()
cpu_idle_adaptive+0x19()
idle+0x114()
thread_start+8()
>
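From what I can tell this is the ZFS deadman timer going off (spa_deadman/vdev_deadman): an outstanding I/O on the lofi-backed vdevs exceeded the deadman timeout, so the kernel deliberately panicked. If I have the tunable name right (I haven't verified it against oi_151a9), the panic itself could be suppressed as a stopgap, e.g.:
# echo "zfs_deadman_enabled/W 0" | mdb -kw
or with "set zfs:zfs_deadman_enabled = 0" in /etc/system, though that would only hide the hang rather than fix whatever is stalling the I/O.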
I have been able to reproduce this problem several times, although the import has managed to get far enough along to rename the original zpool.
Has anyone else encountered this issue with lofi mounted zpools?
I'm using mpt_sas with SATA drives, and I _DO_ have error counters climbing for some of those drives. Could that be the likely cause?
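The counters I'm referring to are the per-drive soft/hard/transport errors, visible with e.g.:
# iostat -En
and the matching ereports with:
# fmdump -eV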
Any other ideas?
I'd greatly appreciate any suggestions.
Thanks!
Dan