[OpenIndiana-discuss] ZFS help

Fri Mar 17 04:40:26 UTC 2017

You mean you can slice a lofi disk? My whole career is a lie!

;)

" 'With the first link, the chain is forged. The first speech censured, the
first thought forbidden, the first freedom denied, chains us all
irrevocably.' Those words were uttered by Judge Aaron Satie as wisdom and
warning... The first time any man's freedom is trodden on, we’re all
damaged." - Jean-Luc Picard, quoting Judge Aaron Satie, Star Trek: TNG
episode "The Drumhead"
- Alex Smith
- Kent, Washington (metropolitan Seattle area)

On Thu, Mar 16, 2017 at 7:48 PM, Jeff Woolsey <jlw at jlw.com> wrote:

> On 3/13/17 1:00 AM, Udo Grabowski (IMK) wrote:
> > Hi,
> >
> > (see inline hints)
> >
> >
> > This pool is in wait state for the missing device and will not
> > recover without a reboot (we had this situation a couple of times).
> > The only way to get out of this is a reboot, and the second disk should
> > either be not in the slot, or it should be a fresh disk with nothing
> > on it, so that the system declares that as corrupt on reboot.
>
> That may be difficult as zfs thinks that each submirror is a different
> slice on the same disk.  About all I can do there is fill the "missing"
> slice with garbage (or zeros).  (As it happens, the pool that
> temporarily is not living there is called "missing"...).  Well, that
> didn't work.  zdb shows that device with a different label than the one
> in the pool.
>
> > Unfortunately, the other disk seems to have serious data errors, so
> > the pool will show severe dataloss (see zpool status -v after repair).
> > The weird number you see seems to be the GUID of the disk, the label
> > is corrupt in some way.
>
> That other disk has been replaced, and is now working fine for the other
> pools.
>
> >
> > Look at iostat -exn, it should tell you more about the errors,
> > as well as a 'smartctl -a -dsat,12 /dev/rdsk/c5d0s2' that shows
> > the specific errors recorded by the disk controller.
>
> The BIOS was complaining of SMART reporting imminent failure on the
> temporary replacement disk (1500GB).  That disk is also no longer in the
> system.
> >
> > After reboot, you also very probably will have to send a 'repair' to
> > the fmadm event ids before it will start to resilver the pool. If that
> > all fails, you may evacuate the /etc/zfs/zpool.cache file and reboot
> > and try to import that pool by hand, but I would try that as a last
> > resort, you may loose the complete pool. Also note that the device name
> > is recorded on the pool itself, see zdb -l /dev/dsk/c5d0s6, and that
> > name will be adapted only correctly after a reboot or reimport if it is
> > wrong. 'touch /reconfigure' before reboot may also be advisable.
> > If it fails to import, there are more tricky options to rollback the
> > pool on import, but I hope that this is not necessary.
> >
> Looks like it's time for heavier artillery.
>
> >  And finally, dying capacitors
> > in your AC-adapter or on the mainboard can drive the whole machine
> > crazy;
>
> I've replaced those before on an earlier incantation of this system
> (micro-ATX socket 754 Athlon64 3000+).  Unlikely to be the cause here,
> as the symptom of this in the past was catatonia (i.e. dead, no response
> at all).  But I'll keep it in mind.
>
> >>
> >> I'll try the dd thing, and try to import the image that results.  I
> >> suspect it may have the same problem.
> >>
>
> So far I've been unable to convince  zpool to import from /dev/lofi/1 .
> I'm guessing it's because there is no fdisk label there, or I haven't
> figured out how to slice a lofi "disk".
>
> Meanwhile, the list of things I can't do remains:
>
> >>>
> >>>> # zpool reopen cloaking
> >>>> cannot reopen 'cloaking': pool I/O is currently suspended
> >>>> # zpool detach cloaking /dev/dsk/c5d0s3
> >>>> cannot detach /dev/dsk/c5d0s3: pool I/O is currently suspended
> >>>> # zpool detach cloaking 8647373200783277078
> >>>> cannot detach 8647373200783277078: pool I/O is currently suspended
> >>>> # zpool clear cloaking       just hangs. Meanwhile, despite its
> >>>> assertions of
> >>>> ONLINE,
> >>>>
> >>>> # zfs list -r cloaking
> >>>> cannot open 'cloaking': pool I/O is currently suspended
> >>>> # zpool remove cloaking 8647373200783277078
> >>>> cannot remove 8647373200783277078: only inactive hot spares, cache,
> >>>> top-level, or log devices can be removed
> >>>> # zpool offline cloaking 8647373200783277078
> >>>> cannot offline 8647373200783277078: pool I/O is currently suspended
> >>>> #
> >>>>
> >>>> I'm of the opinion that the data is mostly intact (unless zpool has
> >>>> been
> >>>> tricked into resilvering a data disk from a blank one (horrors)).
> >>>>
> >>>> # zpool export cloaking
> >>>>
> >>>> hangs.
> >
>
> --
> Jeff Woolsey {{woolsey,jlw}@jlw,first.last@{gmail,jlw}}.com
> Nature abhors straight antennas, clean lenses, and empty storage.
> "Delete! Delete! OK!" -Dr. Bronner on disk space management
> Card-sorting, Joel.  -Crow on solitaire
>
>
> _______________________________________________
> openindiana-discuss mailing list
> openindiana-discuss at openindiana.org
> https://openindiana.org/mailman/listinfo/openindiana-discuss
>