[OpenIndiana-discuss] in desperate need of fsck.zfs

Ray Arachelian ray at arachelian.com
Tue Jul 24 14:54:23 UTC 2012


On 07/24/2012 07:46 AM, James Carlson wrote:
> Ray Arachelian wrote:
>> I think it's high time we get an fsck.zfs tool.  While attempting to
> I think there might be a misunderstanding here.  Please read through the
> original PSARC materials for ZFS, particularly the 1-pager:
>
> http://arc.opensolaris.org/caselog/PSARC/2002/240/onepager.opensolaris
>
> The reason that ZFS doesn't have fsck isn't because of an oversight or
> mistake.  It's by design.  ZFS is designed such that the on-disk data
> structures are always self-consistent, if they are readable at all.
>

I agree with almost everything you say; however, I have a zpool that is
unmountable, unreadable, and unclearable.  I can access the individual
drives via dd or format's analyze/read test without issue, even after
upgrading to 151a5.
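(For reference, the raw-device check I'm doing is just something along
these lines; the device name below is a placeholder for my actual disks:)

     # read the whole raw device end to end to confirm the hardware itself is readable
     dd if=/dev/rdsk/c1t0d0p0 of=/dev/null bs=1024k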

I accept that bad blocks, bad disks, bad controllers, bad drivers, bad
RAM, or bugs can cause failures.  However, this zpool experienced only
temporary I/O failures.  That should be recoverable, or at the very
least the data should still be accessible.

There should, at minimum, be a mechanism whereby I can mount the file
systems on this pool read-only, with failmode set to continue, so that
I can recover the data.
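Here's roughly what I would expect to be able to do.  This is only a
sketch: "tank" stands in for my pool name, and I'm assuming the
read-only import and per-import property settings behave the way the
zpool man page suggests:

     # bring the pool in read-only, with failmode relaxed, purely to copy data off
     zpool export tank
     zpool import -f -o readonly=on -o failmode=continue tank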

How do I get there?  zpool clear fails at the pool level and at each
individual drive.  I realize the pool has gone read-only to prevent
further damage, but how do I read it if I can't mount it?
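For the record, here is roughly what I've tried and what I understand
the recovery-rewind options to be.  Pool and device names are
stand-ins, and the -n flag keeps the rewind as a dry run so nothing is
committed:

     zpool clear tank                  # fails at the pool level
     zpool clear tank c1t0d0           # fails per device as well
     zpool export tank
     zpool import -F -n tank           # dry run: would a txg rewind make it importable?
     zpool import -F -o readonly=on tank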

I'm able to run format's analyze/read test on the drives (I haven't run
it on all of them, but on some).  I'm also able to run zdb against the
pool, and it spits out a list of file and directory objects like this:
     28033    1    16K     1K  1.50K     1K  100.00  ZFS plain file
     28034    1    16K     1K  1.50K     1K  100.00  ZFS plain file
     28035    1    16K     1K  1.50K     1K  100.00  ZFS plain file
     28036    1    16K     1K  1.50K     1K  100.00  ZFS plain file
     28037    1    16K     1K  1.50K     1K  100.00  ZFS plain file
     28038    1    16K    512  1.50K    512  100.00  ZFS directory
     28039    1    16K     1K  1.50K     1K  100.00  ZFS plain file
     28040    1    16K    512  1.50K    512  100.00  ZFS directory
     28041    1    16K    512  1.50K    512  100.00  ZFS directory
     28042    1    16K     1K  1.50K     1K  100.00  ZFS plain file
     28043    1    16K     1K  1.50K     1K  100.00  ZFS plain file
     28044    1    16K     1K  1.50K     1K  100.00  ZFS plain file
     28045    1    16K     1K  1.50K     1K  100.00  ZFS plain file
     28046    1    16K     1K  1.50K     1K  100.00  ZFS plain file
     28047    1    16K     1K  1.50K     1K  100.00  ZFS plain file
     28048    1    16K     1K  1.50K     1K  100.00  ZFS plain file
     28049    1    16K     1K  1.50K     1K  100.00  ZFS plain file
     28050    1    16K     1K  1.50K     1K  100.00  ZFS plain file
     28051    1    16K     1K  1.50K     1K  100.00  ZFS plain file
     28052    1    16K     1K  1.50K     1K  100.00  ZFS plain file
     28053    1    16K     1K  1.50K     1K  100.00  ZFS plain file
     28054    1    16K     1K  1.50K     1K  100.00  ZFS plain file
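(That listing came from a zdb dataset walk, something like the commands
below as far as I can tell; "tank/data" stands in for the real dataset.
The columns appear to be object number, indirection level, indirect and
data block sizes, on-disk and logical sizes, percent full, and object
type.)

     # walk the object set of the damaged dataset and list every object
     zdb -dddd tank/data
     # -e is supposedly the way to point zdb at a pool that isn't imported
     zdb -e -dddd tank/data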

But how do I actually recover the data from there?
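Is grovelling every object out by hand really the only option?  From
what I can tell from the zdb man page, something like the following
could dump a single object or even a raw block off a vdev, but that's
hardly a recovery path for a whole pool (the object number, vdev,
offset, and size here are made-up examples):

     # dump everything zdb knows about one object, including its block pointers
     zdb -dddd tank/data 28033
     # read a raw block straight off a vdev, given vdev:offset:size from a block pointer
     zdb -R tank 0:400000:20000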

If this were, say, UFS, I could always mount it read-only and run rsync
against whatever data was left to copy it elsewhere.  How do I do that here?
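In other words, if a read-only import ever succeeded, the whole
recovery would be no more than this (paths are stand-ins):

     zpool import -f -o readonly=on tank
     rsync -av /tank/ /otherpool/tank-rescue/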

So here's where I start to disagree: if there were I/O errors, why
doesn't clearing them reset the error count and allow me to force the
pool to mount its file systems?  If there's bad metadata, how do I
correct it if there's no fsck?

Suppose enough blocks were overwritten, or went bad on the disks, that
some files or directories are no longer accessible (say, two blocks in
the same stripe on two disks of a raidz1).

With UFS or other file systems, fsck would recover from this at the
sacrifice of the affected data, but not at the loss of all the data on
the volume.  Even if there were no fsck, there would at least be the
ability to mount it read-only.  So I don't quite buy the idea that ZFS
doesn't need an fsck.

The theory of ZFS is nice and all, but it's completely useless in
helping me recover the data off this pool -- is there anything that can
be done to at least allow me to mount the data read-only?  Or to mark
the pool's metadata so that it thinks it hasn't experienced fatal flaws
for long enough to read the data off it?
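The only "pretend it's fine" knobs I've found mentioned anywhere are
the aok and zfs_recover tunables plus the extreme-rewind import,
roughly as below.  I haven't dared to run these without someone
confirming they're sane for this situation, and the pool name is again
a stand-in:

     # relax ZFS assertion handling so the import doesn't give up at the first sign of damage
     echo "aok/W 1" | mdb -kw
     echo "zfs_recover/W 1" | mdb -kw
     # then attempt a forced, extreme-rewind (-X), read-only import
     zpool import -f -FX -o readonly=on tank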



