[OpenIndiana-discuss] Kernel panic on hung zpool accessed via lofi

Jim Klimov jimklimov at cos.ru
Mon Sep 14 19:56:17 UTC 2015


On 14 September 2015 20:23:18 CEST, "Watson, Dan" <Dan.Watson at bcferries.com> wrote:
>>-----Original Message-----
>>From: Jim Klimov [mailto:jimklimov at cos.ru] 
>>Sent: September 12, 2015 10:31 AM
>>To: Discussion list for OpenIndiana; Watson, Dan;
>>openindiana-discuss at openindiana.org
>>Subject: Re: [OpenIndiana-discuss] Kernel panic on hung zpool accessed
>>via lofi
>>
>>On 11 September 2015 20:57:46 CEST, "Watson, Dan"
>><Dan.Watson at bcferries.com> wrote:
>>>Hi all,
>>>
>>>I've been enjoying OI for quite a while, but I'm running into a
>>>problem accessing a zpool built on disk image files that sit on ZFS
>>>and are attached via lofi, and I hope someone can give me a hint.
><snip>
>>>I have been able to reproduce this problem several times, although it
>>>has managed to complete enough to rename the original zpool.
>>>
>>>Has anyone else encountered this issue with lofi mounted zpools?
>>>I'm using mpt_sas with SATA drives, and I _DO_ have error counters
>>>climbing for some of those drives; could that be the cause?
>>>Any other ideas?
>>>
>>>I'd greatly appreciate any suggestions.
>>>
>>>Thanks!
>>>Dan
>>>
>>
>>From the zpool status I see it also refers to cache disks. Are those
>>device names actually available (present and not used by another pool)?
>>Can you remove them from the pool after you've imported it?
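>>
>>For example, a minimal sketch (the pool and device names here are
>>placeholders; use the ones from your 'zpool status' output):
>>
>>  # show the pool layout, including cache (L2ARC) devices
>>  zpool status oldtank
>>  # drop a cache device that is absent or no longer wanted
>>  zpool remove oldtank c2t5d0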
>>
>>Consider importing with '-N' to not automount (and autoshare)
>>filesystems from this pool, and '-R /a' or some other empty/absent
>>altroot path to ensure a lack of conflicts when you do mount (this
>>also keeps the pool out of the zpool.cache file used for later
>>autoimports). At least, mounting and sharing, being a (partially)
>>kernel-side operation, is something that might time out...
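>>
>>A sketch of such an import (the pool name is from your earlier mail;
>>'/a' is just an empty altroot):
>>
>>  # import without mounting or sharing, under a throwaway altroot
>>  zpool import -N -R /a oldtank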
>>
>>Also, you might want to tune or disable the deadman timer and increase
>>other acceptable latencies (see the OI wiki or other resources).
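>>
>>For instance, assuming the stock illumos tunable name, something like
>>this in /etc/system (takes effect on the next boot):
>>
>>  * disable the ZFS deadman panic on long-stalled I/O
>>  set zfs:zfs_deadman_enabled = 0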
>>
>>How much RAM does the box have? You pay for the ARC cache twice, once
>>for oldtank and once for the pool which hosts the dd image files, so
>>maybe tune down primary/secondary caching for the file store.
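>>
>>Something along these lines (the dataset name is hypothetical):
>>
>>  # cache only metadata, not file data, for the image-hosting dataset
>>  zfs set primarycache=metadata tank/images
>>  zfs set secondarycache=metadata tank/images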
>>
>>How did you get into this recovery situation? Maybe oldtank is
>>corrupted and so is trying to recover during import? E.g. I had a
>>history with a deduped pool where I deleted lots of data; the kernel
>>wanted more RAM to process the delete-queue of blocks than I had, and
>>it took dozens of panic-reboots to complete (progress can be tracked
>>with zdb).
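>>
>>One way to watch that progress (zdb reads the pool without importing
>>it; walking all blocks can take a long time):
>>
>>  # block statistics; the deferred-free/delete-queue numbers should
>>  # shrink between crashes
>>  zdb -bb oldtank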
>>
>>Alternately, you can import the pool read-only to maybe avoid these
>>recoveries altogether, if you only want to retrieve the data.
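>>
>>A read-only import could look like this:
>>
>>  # readonly=on should avoid replaying pending recovery/delete work
>>  zpool import -o readonly=on -N -R /a oldtank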
>>
>>Jim
>>
>>--
>>Typos courtesy of K-9 Mail on my Samsung Android
>
>I can't remove the cache drives from the zpool, as all zpool commands
>seem to hang waiting for something; the drives are not available on the
>host (anymore). I'm hoping they show up as absent/degraded.
>
>I'll try -N and/or -R /a
>
>I'll read up on how to tune the deadman timer. I've been looking at
>https://smartos.org/bugview/OS-2415, which lists lots of useful things
>to tune.
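>
>For reference, the live-system variant of that toggle (assuming the
>tunable name from the bug report) would be something like:
>
>  # flip the deadman switch in the running kernel, no reboot needed
>  echo 'zfs_deadman_enabled/W 0' | mdb -kw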
>
>I ended up doing this because the original host of the zpool stopped
>being able to make it to multi-user while attached to the disks. With
>SATA disks in a SAS tray, that usually means (to me) that one of the
>disks is faulty and sending resets to the controller, causing the whole
>disk tray to reset. I tried identifying the faulty disk, but tested
>individually, all the disks worked fine. I decided to try copying the
>disk images to the alternate host to try to recover the data. Further
>oddities have cropped up on the original host, so I'm going to try
>connecting the original disk tray to an alternate host.
>
>I'll try read-only first. I was unaware there was a way to do this. I
>obviously need a ZFS refresher.
>
>Thanks Jim!
>
>Dan

BTW, regarding '-N' to not mount filesystems from the imported pool: you can follow up with 'zfs mount -a' if/when the pool gets imported. It may be that one specific dataset fails to mount and/or share due to some fatal errors, while the others work ok. If that does bite, on the next reboot you can script a one-liner to mount the filesystems one by one and determine the troublemaker. ;)
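
A sketch of such a one-liner (the pool name is a placeholder; '-t
filesystem' skips zvols, which cannot be mounted):

  # try each dataset in turn and report the ones that fail to mount
  for fs in $(zfs list -H -t filesystem -o name -r oldtank); do
    echo "mounting $fs"
    zfs mount "$fs" || echo "FAILED: $fs"
  done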
--
Typos courtesy of K-9 Mail on my Samsung Android


