[OpenIndiana-discuss] vdev reliability was: Recommendations for fast storage

Richard Elling richard.elling at richardelling.com
Mon Apr 22 01:02:17 UTC 2013


On Apr 21, 2013, at 3:47 AM, Jim Klimov <jimklimov at cos.ru> wrote:

> On 2013-04-21 06:13, Richard Elling wrote:
>> Terminology warning below…
> 
> 
>> BER is the term most often used in networks, where the corruption is transient. For permanent
>> data faults, the equivalent is unrecoverable read error rate (UER), also expressed as a failure rate
>> per bit. ...
> 
> Well, with computers being networks of smaller components, besides the
> UER "contained" only in the storage device, which repeatably returns the
> error (or rather a response different from the stored and expected value),
> there is a place for the BER concept as you describe it: there are cables
> and soldered signal lines which can catch noise, there are protocols
> and firmwares which might mistreat some corner cases, etc., providing
> intermittent errors which are not there the second time you look.

The problem is finding a spec that you can design to. We have seen many
bad cables cause all sorts of latency (due to retries on bad transfers), but
transient error rates of this kind are not measured or spec'ed by disk vendors.
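
For the permanent-fault case, a published UER can at least be plugged into
a design estimate. A minimal sketch, assuming bit errors are independent and
using illustrative values (a hypothetical 4 TB drive read end to end, and a
UER of 1 error per 10^14 bits, not taken from any specific datasheet):

```python
import math


def p_at_least_one_ure(bytes_read, uer_per_bit):
    """Probability of one or more unrecoverable read errors.

    Assumes every bit fails independently with probability uer_per_bit,
    which is a simplification: real media errors tend to cluster.
    """
    bits = bytes_read * 8
    # 1 - (1 - p)^n, written with log1p/expm1 so tiny p does not underflow
    return -math.expm1(bits * math.log1p(-uer_per_bit))


# Illustrative assumptions only: 4 TB read, UER of 1e-14 per bit
p = p_at_least_one_ure(4e12, 1e-14)
print(f"P(at least one URE) = {p:.3f}")
```

The result is only as good as the independence assumption and the vendor's
number; it says nothing about the transient errors from cables or firmware.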

> Even UERs might not be persistent, if the HDD decides to relocate a
> detected-failing sector into spare areas, and returns some consistent
> replies to queries afterwards (I did have cases with old HDDs that
> did creak and rattle for a while and returned some bytes when querying
> bad sectors, and replies were different every time or IO errors were
> returned at the protocol layer instead of random garbage as data).
> 
>> The trend seems to be that BER data is not shown for laptop drives, which is a large part of
>> the HDD market. Presumably, this is because the load/unload failure mode dominates in
>> this use case as the drives are not continuously spinning. It is a good idea to use components
>> in the environment for which they are designed, so I'm pretty sure you'd never consider using
>> a laptop drive for a storage array.
> 
> This brings up an interesting question for home-NAS users: it does not
> seem unreasonable to use a laptop drive or two as an rpool in an array
> like the popular ZFS workhorse HP N40L. I agree that it seems improper
> to build an array for *intensive* IO with a horde of such disks, but
> do you have statistics to really discourage these two cases (rpool and
> intensive IO)? What about home-NASes which just occasionally see some
> IO, maybe in intensive bursts, but idle for hours otherwise?
> 
> Indeed, many portable-disk boxes contain a laptop drive. Arguably, they
> might also be more reliable mechanically, because they are intended for
> use in shaky environments.



You get what you pay for.
 -- richard

--

Richard.Elling at RichardElling.com
+1-760-896-4422




