[OpenIndiana-discuss] in desperate need of fsck.zfs
Ray Arachelian
ray at arachelian.com
Tue Jul 24 14:54:23 UTC 2012
On 07/24/2012 07:46 AM, James Carlson wrote:
> Ray Arachelian wrote:
>> I think it's high time we get an fsck.zfs tool. While attempting to
> I think there might be a misunderstanding here. Please read through the
> original PSARC materials for ZFS, particularly the 1-pager:
>
> http://arc.opensolaris.org/caselog/PSARC/2002/240/onepager.opensolaris
>
> The reason that ZFS doesn't have fsck isn't because of an oversight or
> mistake. It's by design. ZFS is designed such that the on-disk data
> structures are always self-consistent, if they are readable at all.
>
I agree with almost everything you say; however, I have a zpool that is
unmountable, unreadable, and unclearable. I can access the individual
drives via dd or format's analyze/read test without issue, even after
upgrading to 151a5.
I accept that bad blocks, bad disks, bad controllers, bad drivers, RAM,
or bugs can cause failures. However, this zpool experienced only
temporary I/O failures; those should be recoverable, or at the very
least the data should still be accessible.
There should, at minimum, be a mechanism to mount the volumes on this
pool read-only, with failmode set to continue, so that I can recover
the data.
How do I do this? zpool clear fails at the pool level and on each
individual drive. I realize the pool has gone read-only to prevent
further damage, but how do I read it if I can't mount it?
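The closest thing I can find in the zpool man page is a read-only
import with failmode overridden at import time. Something along these
lines is what I have in mind ('data' is just a placeholder pool name
here, and I'm assuming 151a5's zpool import accepts readonly=on;
correct me if it doesn't):

    # assumes the pool has been exported (or the box rebooted) first
    zpool import -o readonly=on -o failmode=continue data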
I'm able to run format's analyze pass on the drives (I haven't run it
on all of them, but on some). I'm able to run zdb against it, and it
spits out a list of file and directory objects like this:
28033 1 16K 1K 1.50K 1K 100.00 ZFS plain file
28034 1 16K 1K 1.50K 1K 100.00 ZFS plain file
28035 1 16K 1K 1.50K 1K 100.00 ZFS plain file
28036 1 16K 1K 1.50K 1K 100.00 ZFS plain file
28037 1 16K 1K 1.50K 1K 100.00 ZFS plain file
28038 1 16K 512 1.50K 512 100.00 ZFS directory
28039 1 16K 1K 1.50K 1K 100.00 ZFS plain file
28040 1 16K 512 1.50K 512 100.00 ZFS directory
28041 1 16K 512 1.50K 512 100.00 ZFS directory
28042 1 16K 1K 1.50K 1K 100.00 ZFS plain file
28043 1 16K 1K 1.50K 1K 100.00 ZFS plain file
28044 1 16K 1K 1.50K 1K 100.00 ZFS plain file
28045 1 16K 1K 1.50K 1K 100.00 ZFS plain file
28046 1 16K 1K 1.50K 1K 100.00 ZFS plain file
28047 1 16K 1K 1.50K 1K 100.00 ZFS plain file
28048 1 16K 1K 1.50K 1K 100.00 ZFS plain file
28049 1 16K 1K 1.50K 1K 100.00 ZFS plain file
28050 1 16K 1K 1.50K 1K 100.00 ZFS plain file
28051 1 16K 1K 1.50K 1K 100.00 ZFS plain file
28052 1 16K 1K 1.50K 1K 100.00 ZFS plain file
28053 1 16K 1K 1.50K 1K 100.00 ZFS plain file
28054 1 16K 1K 1.50K 1K 100.00 ZFS plain file
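If I'm reading the zdb man page right, individual objects from that
listing can be dumped in more detail with something like the following
('data/export' is only a placeholder dataset name, and 28033 is just
one object number from the list above):

    # prints the dnode, including the object's path and its block pointers
    zdb -dddd data/export 28033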
But how do I actually recover the data from there?
If this were, say, UFS, I could always mount it read-only and run rsync
against whatever data was left to copy it elsewhere. How do I do that here?
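What I'm after is the ZFS equivalent of that workflow, roughly this
(again assuming the read-only import above actually succeeds and the
pool mounts under /data; /backup is a placeholder destination):

    zfs mount -a                    # mount whatever datasets will still mount
    rsync -aH /data/ /backup/data/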
So here's where I start to disagree: if there were I/O errors, why
doesn't clearing them reset the count and let me force the pool to
mount its volumes? If there's bad metadata, how do I correct it when
there's no fsck?
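As far as I can tell, the only fsck-like knob ZFS offers is the
import-time rewind that came with the pool recovery work, which
discards the last few transactions to get back to a consistent state.
My understanding is that it's used like this (dry run with -n first;
'data' is still a placeholder), but I'd welcome corrections:

    zpool import -nF data    # report whether a rewind would allow import
    zpool import -F data     # actually discard recent transactions and import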
Suppose enough blocks were overwritten or went bad that some files or
directories are no longer accessible (say, two blocks in the same
stripe on two disks of a raidz1).
With UFS or other file systems, fsck would recover from this at the
cost of that data, but not at the cost of all the data on the volume.
Even if there were no fsck, there would at least be the ability to
mount the file system read-only. So I don't quite buy the idea that ZFS
doesn't need an fsck.
The theory of ZFS is nice and all, but it's completely useless in
helping me recover the data off this pool. Is there anything that can
be done to at least let me mount the data read-only? Or to mark the
zpool's metadata so that it thinks it hasn't experienced fatal failures
for long enough to read the data off it?
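The only other thing I've seen mentioned, on various lists rather than
in the man page, is an "extreme rewind" variant of -F that tries much
older txgs; treat this as hearsay, since I don't know whether 151a5
even has it:

    # undocumented as far as I know; use at your own risk
    zpool import -FX data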