[OpenIndiana-discuss] summary: in desperate need of fsck.zfs

Ray Arachelian ray at arachelian.com
Wed Jul 25 18:46:44 UTC 2012


On 07/25/2012 02:26 PM, Jim Klimov wrote:
>
> I think it is a bug for the system to hang upon requests to a
> faulty pool - unless this behavior was requested with "failmode".
> From what I gather below, the box itself no longer hangs upon
> hitting problems (but some zfs/zpool commands still do?)
Yeah, I wish it were a timeout value instead: wait X seconds rather
than wait forever.
I suspect the zfs cache file may actually store unwritten data, so
deleting it might not be a great idea in normal situations.  (I've seen
its size change quite a bit.)
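
A rough sketch of the "wait X seconds rather than forever" idea, done
from userland in Python rather than inside ZFS.  The helper name
run_with_timeout is made up for illustration, and this only stops the
calling script from blocking: a zpool process that is stuck inside the
kernel may ignore the kill and linger until the pool recovers or the
box reboots.

```python
import subprocess


def run_with_timeout(argv, seconds):
    """Run argv and return (returncode, output), or None on timeout.

    Hypothetical helper: it bounds how long the caller waits, not how
    long the kernel keeps the child process blocked.
    """
    proc = subprocess.Popen(
        argv,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        text=True,
    )
    try:
        out, _ = proc.communicate(timeout=seconds)
    except subprocess.TimeoutExpired:
        # Ask the child to die, but do not wait for it: a process
        # blocked on a faulted pool may never act on the signal.
        proc.kill()
        return None
    return proc.returncode, out
```

For example, run_with_timeout(["zpool", "status", "tank"], 30) would
give up after 30 seconds, where "tank" is a placeholder pool name.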


> There was a bug I reported, a fix for which I hope was included
> in oi_151a5 (I'm not sure though), regarding deduped datasets
> used without verification.
>
> If you had written data marked as deduped (even if there was a
> single copy of the file), and then a block of this file went
> corrupted on-disk, and you rewrote this file with a copy from
> backup, there could previously be some bad side-effects causing
> kernel panics.

I'm not using dedupe.  I had it on a different zpool a long time ago,
and it turned out to eat too much memory: this box only has 4G of RAM,
which doesn't seem to be enough for what I want out of it.

> For remote reboots you could use an UPS (good idea on a storage
> box anyway) with management via LAN or COM/USB ports. You can
> then use NUT or other toolsets to reboot the UPS on request ;)

Good idea.  I do have an RPS somewhere, but have to hook it up. (Remote
power supply - works over serial port, not over ethernet, but good enough.)

> Didn't you say the JBOD has an eSATA port, and you're getting
> hold of an HBA with such a port? You might re-hang the JBOD
> to a different link technology to see if that helps.

It does, but this box, a Gigabyte G31M-ES2L, doesn't seem to support
SATA multiplexing.  I've tried it with a SATA to eSATA bracket on the
motherboard SATA ports, no love there.  There's an internal Sil RAID
controller, but that didn't support it either.  I've got an LSI Logic
SAS controller that I'm going to try next, once the SAS to SATA cable
I'm waiting on arrives.  Hopefully that driver will support SATA
muxing.  But I'll probably upgrade the motherboard as soon as I get
more cash in the toy budget, since I want to put in 16G of RAM that
I've got left over.


> Technically, the pool should be excluded from the cache whenever
> you (successfully) do an explicit "zpool export".

Except that in this state, once the I/O gets broken, it stays broken and
all zpool commands regarding that pool get frozen.

> There might be some processing, like deferred-delete, which takes
> place upon pool import. If there's much data marked as deleted
> (especially on deduped datasets), the processing on smaller
> systems can take days and several reboots, due to kernel running
> out of RAM and not going to swap. (Fixes for this or similar
> cases were discussed, and maybe integrated as of current distro -
> I am not sure again). Mounting read-only, obviously, skips these
> deferred-delete processings, and maybe prohibits scrub as well.
Only problem is, once a zpool is marked as having enough I/O errors in
the zpool.cache file, all access to it seems to get disabled.  Not sure
if this is a bug or my misinterpretation of it, but it feels like a bug
to me.


>
> Hanging should not happen (unless requested by failmode), and
> an error code should be returned for the read-request instead.

Yup, except that the target zpool is in the same JBOD as the source, so
when the USB connection fails, both are marked as inaccessible.  In the
case of this morning's lockup, it's the target that's now reporting
"I/O error" whenever I run any zpool command.  So I'm sure it's in the
same kind of mess.  It's not locking up, but I've seen this kind of
thing prevent the box from rebooting, so I don't want to try until I
get home.  And this target zpool has failmode=wait, so I'm pretty sure
it will hang.  I have other stuff on this machine, so I want it to stay
accessible.
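
To make the failmode point concrete, here is a small Python sketch that
reads the failmode column out of text captured from "zpool get
failmode <pool>" and flags pools set to wait.  The function names and
the pool names in the usage note are made up for illustration, and the
sketch assumes the usual NAME / PROPERTY / VALUE / SOURCE column
layout.  It only parses text that was already captured; running zpool
get against a pool that is already faulted may itself block.

```python
def parse_failmode(zpool_get_output):
    """Return {pool_name: failmode} from `zpool get failmode` text."""
    result = {}
    for line in zpool_get_output.splitlines():
        fields = line.split()
        # Assumed columns: NAME PROPERTY VALUE SOURCE.  The header row
        # is skipped because its second field is "PROPERTY".
        if len(fields) >= 3 and fields[1] == "failmode":
            result[fields[0]] = fields[2]
    return result


def may_block(failmode):
    """True if I/O is held until the device comes back.

    "wait" blocks I/O until connectivity is restored, "continue"
    returns an error to new write requests, and "panic" crashes the
    host instead of hanging.
    """
    return failmode == "wait"
```

Fed the output for two hypothetical pools named "target" and "src",
parse_failmode returns a dict such as {"target": "wait", "src":
"continue"}, and may_block("wait") is the case that matches the hang
described above.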



