[OpenIndiana-discuss] ZFS keeps finding errors

Gregory Youngblood gregory at youngblood.me
Sat Dec 21 16:22:56 UTC 2013


I saw something like this long ago on an OpenSolaris box. Everything in it was desktop-grade (the machine, the disks, non-ECC memory), and every scrub found issues. I never did find the source. I did not have the issue on server-grade setups, and the individual parts worked and tested fine, with no errors when pulled or tested with Linux.

At the time I speculated that my SATA controller was not fully supported, even though it seemed to work. I'm not sure.

Greg


----- Reply message -----
From: "Robin Axelsson" <gu99roax at student.chalmers.se>
To: <openindiana-discuss at openindiana.org>
Subject: [OpenIndiana-discuss] ZFS keeps finding errors
Date: Sat, Dec 21, 2013 8:42 AM

I would look into hooking the hard drives up to another system with,
say, an LSI 1068-based controller and see how they behave there...


On 2013-12-21 16:13, Jim Klimov wrote:
> Hello all,
>
> I got access to my old Home-NAS again, which got me looking deep under
> the hood of ZFS in the first place due to strange errors with my pool.
> This is a Pentium-4 based PC including an Asus P5B-Deluxe motherboard
> with 7 SATA connectors (an Intel and a JMicron set), with 8GB of
> (non-ECC) memory. Used to be quite a machine back in its day as a PC
> and gaming station! ;)
>
> Currently it serves as an OpenIndiana-based storage unit, but serves
> poorly: despite using raidz2 over 6*2TB drives, it keeps finding errors
> (and dd'ing the offsets from disks shows that indeed there is trash on
> disk instead of proper data). Also the system disk finds and fixes some
> checksum errors on every scrub - luckily, it uses copies=2. While there
> was only the old 80GB SATA (which I thought had died - but did not) it
> usually found 2-6 errors per scrub. Now I have mirrored it with a newer
> 250GB SATA (picked from an HP Microserver barebone) to migrate the OS
> from the old disk, and it finds and fixes up to 30 errors per scrub.
>
> My guess is that these problems may be due to randomness from
> overheating in the CPU or chipset (though there are no real complaints
> here), insufficient power somehow, or lack of ECC. Or just plain age is
> showing... All I can say instrumentally is that long SMART tests did
> not find any errors.
>
> So far I am trying to evacuate the remaining data from this box to
> some other storage. But I am not sure what to do with it or its parts
> such as disks - can they be trusted to rebuild a new pool, for example?
> Or would it be safer to put them away or repurpose them back into a PC?
>
> Any ideas or comments?
> Thanks,
> //Jim
>
> _______________________________________________
> OpenIndiana-discuss mailing list
> OpenIndiana-discuss at openindiana.org
> http://openindiana.org/mailman/listinfo/openindiana-discuss
>
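
For the dd check Jim mentions in the quoted message (reading the reported
offsets straight from a disk to see what is really there), a minimal
Python sketch might look like the following. The function names
(read_block, hexdump, block_digest) are made up for illustration and are
not part of any ZFS tooling. The sketch assumes you already know the byte
offset you want to inspect, that it is sector-aligned when reading a raw
device, and that you have permission to read that device.

```python
import hashlib


def read_block(path, offset, length=512):
    """Read up to `length` bytes starting at byte `offset` of a file or device.

    Assumption: for raw disk devices, `offset` and `length` should be
    multiples of the sector size. Unbuffered mode keeps the read exact.
    """
    with open(path, "rb", buffering=0) as f:
        f.seek(offset)
        return f.read(length)


def hexdump(data, width=16):
    """Format bytes as offset, hex columns and printable ASCII."""
    lines = []
    for i in range(0, len(data), width):
        chunk = data[i:i + width]
        hex_part = " ".join(f"{b:02x}" for b in chunk)
        text = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
        lines.append(f"{i:08x}  {hex_part.ljust(width * 3)} {text}")
    return "\n".join(lines)


def block_digest(path, offset, length=512):
    """SHA-256 of one block, handy for comparing repeated reads."""
    return hashlib.sha256(read_block(path, offset, length)).hexdigest()
```

Printing hexdump(read_block(device, offset)) gives a quick look at whether
the block holds plausible data or trash. Calling block_digest on the same
offset several times, or again after moving the disk to another
controller as Robin suggests, shows whether the same bytes come back each
time. If the digests differ between reads, that would point at the path
to the disk (controller, cabling, memory) rather than at what is stored
on the platters, though this is only a rough indicator.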

