[OpenIndiana-discuss] ZFS help

Udo Grabowski (IMK) udo.grabowski at kit.edu
Mon Mar 13 08:00:44 UTC 2017


Hi,

(see inline hints)


On 13/03/2017 01:27, Jeff Woolsey wrote:
> TL;DR: I'd already done most of that, and wouldn't do most of what you
> counsel against.
>
> It's clear I left out details of the saga that preceded this.  This is a
> generic x86 box (PC) with 4 SATA ports, 3.5GB memory, and one
> hyperthreaded 3.2GHz CPU. One of my mirrored 2TB disks flaked out, and
> pending getting another, I replaced it with an apparently-working spare
> 1.5TB.  The way the pools are laid out is historical; this system
> started out on a pair of 500s.  Anyway, the 1.5TB was also flakey.  Most
> of the time I could pacify things by power-cycling the individual
> recalcitrant disk, which would then recover the pool with just a scrub,
> usually; a resilver otherwise.
>
> As for this disk, the reason it looks like I have mirrored two slices on
> the same disk is that there were a number of reboots and devfsadm -C in
> between there.   ZFS managed to deal with that most of the time.  Note
> also that it _was_ /dev/dsk/c5d0s3.  It isn't any more.  I just want the
> pool to forget about that disk entirely (since it's not there (UNAVAIL),
> just what is the pool resilvering _from_???).
>
> 24 hours later the resilvering has not advanced _at all_.

This pool is waiting for the missing device and will not recover
from that state without a reboot (we have had this situation a couple
of times). When you reboot, the second disk should either be out of
its slot or be a fresh disk with nothing on it, so that the system
declares it corrupt on reboot. Unfortunately, the remaining disk also
seems to have serious data errors, so the pool will show severe data
loss (see 'zpool status -v' after the repair). The weird number you
see appears to be the GUID of the missing disk; its label is corrupt
in some way.
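
Once the pool is accessible again after the reboot, something along
these lines (pool name taken from your status output) should list the
files affected by those permanent data errors:

  # show per-file permanent errors for the pool
  zpool status -v cloaking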

Look at 'iostat -exn'; it should tell you more about the errors, as
should a 'smartctl -a -d sat,12 /dev/rdsk/c5d0s2', which shows the
specific errors recorded by the disk itself.
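
A minimal check along these lines (device name taken from your zpool
output; the 'sat,12' device type is an assumption for a SATA disk and
may not be needed on your controller):

  # per-device soft/hard/transport error counters
  iostat -exn
  # SMART attributes and the error log recorded by the disk itself
  smartctl -a -d sat,12 /dev/rdsk/c5d0s2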

After the reboot, you will very probably also have to send a 'repair'
to the fmadm event IDs before the pool starts to resilver. If all of
that fails, you can move the /etc/zfs/zpool.cache file out of the way,
reboot, and try to import the pool by hand, but I would treat that as
a last resort: you may lose the complete pool. Also note that the
device name is recorded in the pool's own labels (see
'zdb -l /dev/dsk/c5d0s6'); if it is wrong, it will only be corrected
after a reboot or reimport. A 'touch /reconfigure' before the reboot
may also be advisable. If the pool then fails to import, there are
trickier options to roll the pool back on import, but I hope that
will not be necessary.
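
Roughly, the sequence would look like this (the event UUID is whatever
'fmadm faulty' reports for this fault, and the cache file backup name
is just an example; the forced import is the last resort mentioned
above):

  # list outstanding faults and note the event UUID for this pool/disk
  fmadm faulty
  # tell FMA the fault has been repaired
  fmadm repair <event-uuid>

  # check which device path the pool labels actually record
  zdb -l /dev/dsk/c5d0s6

  # last resort: set the cache file aside, force device reconfiguration,
  # reboot, then import by hand ('zpool import -F' tries a rollback)
  mv /etc/zfs/zpool.cache /etc/zfs/zpool.cache.bak
  touch /reconfigure
  reboot
  zpool import cloaking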

> ECC is not available in PC architecture; the systems I do have with ECC
> are ten times slower, SCSI-only, and SPARC (for which OI-current is not
> available).

Not having ECC means you have to be extra careful about the integrity
of your memory, since problems there have a severe impact on the
integrity of your pool and are not detectable by a scrub. Run
extensive memory tests regularly to find flaky memory. Certain
mainboard SATA adapters are also problematic, like the old JMicron
JMB36x (I have such a beast at home and had to disable it entirely to
get rid of scrub errors on my pools), especially with fast SSDs. And
finally, dying capacitors in your AC adapter or on the mainboard can
drive the whole machine crazy. Look out for bulged capacitors; they
are definitely dead. If you can get access to an oscilloscope, check
the supply voltages for cleanness (I just had a case of hefty ripples
on the 5V line while all other voltages were clean, and the machine
did weird stuff before it finally refused to start at all). With a
little experience and a good temperature-controlled soldering iron,
dead capacitors can be exchanged for new ones even on a multilayered
board, but that is definitely not a beginner's task...

>
> I'll try the dd thing, and try to import the image that results.  I
> suspect it may have the same problem.
>
>
> On 3/11/17 10:54 PM, Nikola M wrote:
>> On 03/11/17 11:25 PM, Jeff Woolsey wrote:
>>> # uname -a
>>> SunOS bombast 5.11 illumos-2816291 i86pc i386 i86pc
>>> # cat /etc/release
>>>                OpenIndiana Hipster 2016.10 (powered by illumos)
>>>           OpenIndiana Project, part of The Illumos Foundation (C)
>>> 2010-2016
>>>                           Use is subject to license terms.
>>>                              Assembled 30 October 2016
>>> # # zpool status cloaking
>>>     pool: cloaking
>>>    state: ONLINE
>>> status: One or more devices is currently being resilvered.  The pool
>>> will
>>>           continue to function, possibly in a degraded state.
>>> action: Wait for the resilver to complete.
>>>     scan: resilver in progress since Sat Mar 11 11:42:37 2017
>>>       8.25M scanned out of 358G at 962/s, (scan is slow, no estimated
>>> time)
>>>       5.31M resilvered, 0.00% done
>>> config:
>>>
>>>           NAME                     STATE     READ WRITE CKSUM
>>>           cloaking                 ONLINE      97     0     0
>>>             mirror-0               ONLINE     582     0     0
>>>               c5d0s6               ONLINE       0     0   582
>>> (resilvering)
>>>               8647373200783277078  UNAVAIL      0     0     0  was
>>> /dev/dsk/c5d0s3
>>>
>>> errors: 120 data errors, use '-v' for a list
>>
>> Not an expert (should really ask OpenZFS people what to do next),
>> yet seems like you do have disk died/unavailable and even after
>> reboot, it would continue doing the operation that started.
>> "120 data errors" looks like you DO have some data errors form disk
>> itself and also you do have 582 checksum errors in transfer from/to disk.
>>
>> I hope you have Backups elsewhere, I hope you are not using SATA disks
>> on SAS to SATA expanders (unreliable), I hope you are not using SATA
>> disks on SAS controller (not recommended), I hope you are using ECC
>> RAM (must have if valuing data).
>> Also it seems you have done some weird thing.., adding 2 disk slices
>> on the SAME disk to a mirror..
>> What is the point of that, when you can always set 'zfs set copies=2'
>> for any dataset to get duplicated data copies on same pool. anyway?
>>
>>> 8.25M scanned out of 358G at 962/s, (scan is slow, no estimated time)
>>
>> When I start zpool scrub, it starts slowly but later it does speed up.
>> I would recommend turning machine off, booting from some live USB/DVD
>> media and dump with dd (disk dump) _Everything_ on that disk/working
>> partition/slice elsewhere (on image, device) for safekeeping, in case
>> other disk dies too.
>>
>>> # zpool reopen cloaking
>>> cannot reopen 'cloaking': pool I/O is currently suspended
>>> # zpool detach cloaking /dev/dsk/c5d0s3
>>> cannot detach /dev/dsk/c5d0s3: pool I/O is currently suspended
>>> # zpool detach cloaking 8647373200783277078
>>> cannot detach 8647373200783277078: pool I/O is currently suspended
>>> # zpool detach cloaking randomtrash
>>> cannot detach randomtrash: no such device in pool
>>> #
>>>
>>> How can I get rid of the UNAVAIL disk slice so that this pool doesn't
>>> try to resilver (From what, pray tell?) all the time.  I don't know
>>> where that ugly number came from--this system only has SATA disks.  I
>>> have a new mirror slice just waiting for it as soon as it stops doing
>>> this.  zpool clear  just hangs. Meanwhile, despite its assertions of
>>> ONLINE,
>>>
>>> # zfs list -r cloaking
>>> cannot open 'cloaking': pool I/O is currently suspended
>>> # zpool remove cloaking 8647373200783277078
>>> cannot remove 8647373200783277078: only inactive hot spares, cache,
>>> top-level, or log devices can be removed
>>> # zpool offline cloaking 8647373200783277078
>>> cannot offline 8647373200783277078: pool I/O is currently suspended
>>> #
>>>
>>> I'm of the opinion that the data is mostly intact (unless zpool has been
>>> tricked into resilvering a data disk from a blank one (horrors)).
>>>
>>> # zpool export cloaking
>>>
>>> hangs.

-- 
Dr.Udo Grabowski    Inst.f.Meteorology a.Climate Research IMK-ASF-SAT
http://www.imk-asf.kit.edu/english/sat.php
KIT - Karlsruhe Institute of Technology            http://www.kit.edu
Postfach 3640,76021 Karlsruhe,Germany  T:(+49)721 608-26026 F:-926026


