[OpenIndiana-discuss] in desperate need of fsck.zfs

James Carlson carlsonj at workingcode.com
Tue Jul 24 11:46:32 UTC 2012


Ray Arachelian wrote:
> I think it's high time we get an fsck.zfs tool.  While attempting to

I think there might be a misunderstanding here.  Please read through the
original PSARC materials for ZFS, particularly the 1-pager:

http://arc.opensolaris.org/caselog/PSARC/2002/240/onepager.opensolaris

The reason that ZFS doesn't have fsck isn't because of an oversight or
mistake.  It's by design.  ZFS is designed such that the on-disk data
structures are always self-consistent, if they are readable at all.

Of course, no file system can guarantee that the on-disk data are always
readable.  Hardware device failures do happen, and if you have enough
failures, you will in fact lose data.  That's what backups are all about.

What ZFS does guarantee, though, is that if you can read the data, then
the structures read are always self-consistent and correct.  It
guarantees that by a design that ensures that the pointed-to data are
always written and committed before the pointers that reference that data.
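
To make the ordering concrete, here is a rough sketch in Python.  It is a
toy model and not ZFS code: the ToyStore class, its commit() method, and
the crash_before_pointer flag are all made up for illustration.  The only
thing it tries to show is that new data is written to a fresh location
first and the single root pointer is switched last, so an interruption
between the two steps leaves the previous consistent state in place.

```python
class ToyStore:
    """Toy copy-on-write store (hypothetical, not the ZFS implementation)."""

    def __init__(self):
        self.blocks = {}     # address -> bytes; stands in for the disk
        self.next_addr = 0   # blocks are never overwritten in place
        self.root = None     # the one pointer that is updated last

    def _write_block(self, data):
        # Always allocate a fresh address for new data.
        addr = self.next_addr
        self.next_addr += 1
        self.blocks[addr] = data
        return addr

    def commit(self, data, crash_before_pointer=False):
        # Step 1: write and commit the pointed-to data.
        addr = self._write_block(data)
        if crash_before_pointer:
            # Simulated power loss: the new block exists but nothing
            # references it, so readers never see a half-done update.
            return
        # Step 2: only now switch the pointer to the new data.
        self.root = addr

    def read(self):
        return None if self.root is None else self.blocks[self.root]


store = ToyStore()
store.commit(b"version 1")
store.commit(b"version 2", crash_before_pointer=True)
assert store.read() == b"version 1"   # old state is still intact
store.commit(b"version 3")
assert store.read() == b"version 3"
```

In this model a reader sees either the old version or the new one, never
a mixture, which is why there is nothing left for a repair pass to fix.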

What does fsck do?  It reads through the data structures and verifies
that the UFS i-nodes and free lists and directory entries are all
self-consistent, and applies "reasonable" heuristics to modify them if
they are not consistent.  But with ZFS, there are no inconsistent
on-disk states, so there's nothing that a "fsck.zfs tool" could possibly do.

ZFS's zpool "scrub" facility allows you to check for unreadable blocks
and, to the extent that the chosen topology allows it, correct for them.
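
As a rough illustration of what "check and, where the topology allows,
correct" means, here is another toy sketch in Python.  It assumes each
block has a checksum stored separately from the block itself and that a
mirror holds more than one copy; the checksum() and scrub() functions and
the dictionary layout are invented for this example and are not how
zpool is implemented.

```python
import hashlib


def checksum(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def scrub(mirrors, expected):
    """Check every block on every mirror side against its checksum.

    mirrors:  list of dicts mapping block id -> bytes (one dict per disk)
    expected: dict mapping block id -> checksum recorded at write time
    Returns (repaired, unrecoverable), where repaired is a list of
    (disk index, block id) pairs and unrecoverable is a list of block ids.
    """
    repaired, unrecoverable = [], []
    for block_id, want in expected.items():
        # Look for any copy whose contents still match the checksum.
        good = None
        for disk in mirrors:
            if block_id in disk and checksum(disk[block_id]) == want:
                good = disk[block_id]
                break
        if good is None:
            # No redundancy left: the damage can be reported, not fixed.
            unrecoverable.append(block_id)
            continue
        # Rewrite every missing or mismatching copy from the good one.
        for idx, disk in enumerate(mirrors):
            if block_id not in disk or checksum(disk[block_id]) != want:
                disk[block_id] = good
                repaired.append((idx, block_id))
    return repaired, unrecoverable


expected = {0: checksum(b"hello"), 1: checksum(b"world")}
disk_a = {0: b"hello", 1: b"world"}
disk_b = {0: b"hello", 1: b"wXrld"}   # one damaged copy

repaired, lost = scrub([disk_a, disk_b], expected)
assert repaired == [(1, 1)] and lost == []
assert disk_b[1] == b"world"
```

With a single disk and no second copy, the same function can only list
the damaged block as unrecoverable, which matches the "to the extent that
the chosen topology allows it" caveat above.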

So what does that leave?  If you have crummy disks, replace them.  If
you have poor drivers, upgrade or rewrite them.  If you're using a
technology with consistently badly written code or poorly performing
interfaces (USB, sadly, seems to fall into this camp), then use something
else.  The design of the file system is supposed to protect you against
random failures due to bad sectors or entire drives or spindles going
down, not against all possible data loss or malicious equipment.

And if you have found a bug in ZFS, then it needs to be fixed.  fsck in
UFS is not there as a means to work around bugs in UFS -- if anything,
it can be expected to have more bugs than UFS because it's much more
rarely used -- but as a means to cope with transient inconsistent
on-disk states that may persist between system boots.  The same is true
for ZFS; if there's a bug, it needs to be fixed, not papered over with
another "tool."

-- 
James Carlson         42.703N 71.076W         <carlsonj at workingcode.com>


