[OpenIndiana-discuss] in desperate need of fsck.zfs

Michael Stapleton michael.stapleton at techsologic.com
Tue Jul 24 15:13:55 UTC 2012


Not to be argumentative, but UFS would be toast if you lost the
superblock and did not know the location of any backup superblocks.
The official answer would be to recover from backup.  It sounds to me
like what you need is a data recovery program.
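
(For comparison, when a backup superblock location is known, the standard
UFS incantation is roughly the following -- c0t0d0s6 is just a placeholder
slice, and 32 is customarily the first backup copy:

    # newfs -N /dev/rdsk/c0t0d0s6              (prints the fs parameters,
                                                including the backup superblock
                                                offsets, without writing anything)
    # fsck -F ufs -o b=32 /dev/rdsk/c0t0d0s6

Even then, fsck "repairs" by discarding whatever it cannot reconcile, which
is exactly the behaviour ZFS was designed to avoid needing.)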

Maybe Solaris 11 could mount your pool?
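
If you do put the disks in front of newer bits (Solaris 11, or a current
illumos), the first step there would just be to see what it makes of them:

    # zpool import                  (no arguments: scan /dev/dsk, list any pools
                                     found, their state, and the suggested action)
    # zpool import -d /dev/dsk      (same thing, pointing it explicitly at a
                                     device directory)

The read-only and rewind import options discussed further down apply on
that box as well.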

Mike

On Tue, 2012-07-24 at 10:54 -0400, Ray Arachelian wrote:

> On 07/24/2012 07:46 AM, James Carlson wrote:
> > Ray Arachelian wrote:
> >> I think it's high time we get an fsck.zfs tool.  While attempting to
> > I think there might be a misunderstanding here.  Please read through the
> > original PSARC materials for ZFS, particularly the 1-pager:
> >
> > http://arc.opensolaris.org/caselog/PSARC/2002/240/onepager.opensolaris
> >
> > The reason that ZFS doesn't have fsck isn't because of an oversight or
> > mistake.  It's by design.  ZFS is designed such that the on-disk data
> > structures are always self-consistent, if they are readable at all.
> >
> 
> I agree with almost everything you say; however, I have a zpool that is
> unmountable, unreadable, and unclearable.  I can access the individual
> drives via dd or format-analyze-read without issue.  This is even after
> upgrading to 151a5.
> 
> I agree and accept that bad blocks, bad disks, bad controllers,
> drivers, RAM, or bugs can cause failures.  However, this zpool
> experienced temporary I/O failures.  These should be recoverable, or at
> the very least the data on it should still be accessible.
> 
> There should, at a minimum, be a mechanism whereby I can mount the
> volumes on this pool read-only, with failmode set to continue, so that
> I can recover data.
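
Something close to that mechanism does exist: pool properties can be set at
import time, and reasonably recent bits can import a pool read-only.  A
sketch, with 'tank' standing in for the real pool name (no guarantee it gets
past whatever is wedging the import):

    # zpool import -f -o readonly=on -o failmode=continue tank
    # zpool status -v tank          (once it's in, see what it thinks is broken)

Read-only import is relatively new, so whether it is available depends on
how current the bits are.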
> 
> How do I do this?  zpool clear fails at the zpool level and at each
> individual drive.  I realize it's gone read-only to prevent further
> damage.  But how do I read it if I can't mount it?
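
For anyone following along, the clears in question look roughly like this
(pool and device names are placeholders), and it is worth seeing what the
pool and FMA report afterwards:

    # zpool clear tank
    # zpool clear tank c5t2d0
    # zpool status -v tank          (still FAULTED / I/O suspended?)
    # fmadm faulty                  (any faults FMA is still holding against
                                     the devices)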
> 
> I'm able to run format-analyze-disk on the drives (I haven't run it on
> all of them, but on some).  I'm able to run zdb against it, and it
> spits out a list of file and directory objects like this:
>      28033    1    16K     1K  1.50K     1K  100.00  ZFS plain file
>      28034    1    16K     1K  1.50K     1K  100.00  ZFS plain file
>      28035    1    16K     1K  1.50K     1K  100.00  ZFS plain file
>      28036    1    16K     1K  1.50K     1K  100.00  ZFS plain file
>      28037    1    16K     1K  1.50K     1K  100.00  ZFS plain file
>      28038    1    16K    512  1.50K    512  100.00  ZFS directory
>      28039    1    16K     1K  1.50K     1K  100.00  ZFS plain file
>      28040    1    16K    512  1.50K    512  100.00  ZFS directory
>      28041    1    16K    512  1.50K    512  100.00  ZFS directory
>      28042    1    16K     1K  1.50K     1K  100.00  ZFS plain file
>      28043    1    16K     1K  1.50K     1K  100.00  ZFS plain file
>      28044    1    16K     1K  1.50K     1K  100.00  ZFS plain file
>      28045    1    16K     1K  1.50K     1K  100.00  ZFS plain file
>      28046    1    16K     1K  1.50K     1K  100.00  ZFS plain file
>      28047    1    16K     1K  1.50K     1K  100.00  ZFS plain file
>      28048    1    16K     1K  1.50K     1K  100.00  ZFS plain file
>      28049    1    16K     1K  1.50K     1K  100.00  ZFS plain file
>      28050    1    16K     1K  1.50K     1K  100.00  ZFS plain file
>      28051    1    16K     1K  1.50K     1K  100.00  ZFS plain file
>      28052    1    16K     1K  1.50K     1K  100.00  ZFS plain file
>      28053    1    16K     1K  1.50K     1K  100.00  ZFS plain file
>      28054    1    16K     1K  1.50K     1K  100.00  ZFS plain file
> 
> But how do I actually recover the data from there?
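
zdb can dig further than that listing, though it is painful by hand.  A
sketch, assuming the pool is 'tank', the dataset is 'tank/data', and 28033
is one of the objects above -- the dataset name and the vdev:offset:size
numbers below are made up and would have to come from the block pointers
that -dddddd prints (and the exact -R syntax has shifted a bit between
releases):

    # zdb -e -dddddd tank/data 28033        (dump that object in detail,
                                             including its block pointers;
                                             -e works on a pool that is not
                                             imported)
    # zdb -e -R tank 0:4b2c00:1000:d        (read one block by vdev:offset:size;
                                             the d flag asks zdb to decompress it)

Reassembling files block by block that way is data recovery, not repair.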
> 
> If this were, say, UFS, I could always mount it read-only and run rsync
> against whatever data was left to copy it elsewhere.  How do I do that?
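
Assuming the read-only import sketched earlier succeeds, the rsync half
works exactly as it would have on UFS -- the datasets come up at their usual
mountpoints, just not writable.  Roughly (pool name and paths are
placeholders):

    # zpool import -f -o readonly=on tank
    # rsync -a /tank/export/home/ /mnt/rescue/home/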
> 
> So here's where I start to disagree: if there were I/O errors, why
> doesn't clearing them reset the count and allow me to force the pool to
> mount its volumes?  If there's bad metadata, how do I correct it when
> there's no fsck?
> 
> Suppose enough blocks were overwritten or went bad on the volume that
> some files/directories are no longer accessible (say, two blocks in the
> same stripe on two disks of a raidz1).
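
One note on that scenario: when the pool can be opened at all, ZFS already
does something analogous to sacrificing only the affected files -- a scrub
plus 'zpool status -v' enumerates the files with unrecoverable errors so
that everything else can still be copied off:

    # zpool scrub tank
    # zpool status -v tank          (ends with "Permanent errors have been
                                     detected in the following files:" and a
                                     list of paths)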
> 
> With UFS, or other file systems, fsck would recover from this at the
> sacrifice of that data, but not at the loss of all the data on the
> volume.  Even if there were no fsck, there would at least be the
> ability to mount it read-only.  So I don't quite buy the idea that ZFS
> doesn't need an fsck.
> 
> The theory of ZFS is nice and all that, but it's completely useless in
> helping me recover the data off this pool -- is there anything that can
> be done to at least allow me to mount the data read-only?  Or to mark
> the metadata on the zpool so that it thinks it hasn't experienced fatal
> flaws, for long enough to read the data off it?
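
The closest thing to telling the pool it never saw those fatal flaws is the
transaction rewind that was added for exactly this kind of wreck: importing
with -F discards the last few transaction groups and falls back to an older,
hopefully consistent, uberblock.  A sketch, with 'tank' again a stand-in,
and -n first so it only reports what it would do:

    # zpool import -fFn tank        (dry run: says whether a rewind would make
                                     the pool importable and what point in time
                                     it would be rolled back to)
    # zpool import -fF tank         (perform the rewind)

I believe illumos also carries an undocumented extreme-rewind flag (-X) on
top of -F for worse cases, but that is last-resort territory, ideally tried
only against copies of the disks.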



