[OpenIndiana-discuss] Recommendations for fast storage

Jim Klimov jimklimov at cos.ru
Tue Apr 16 23:15:06 UTC 2013


On 2013-04-16 23:37, Timothy Coalson wrote:
> On Tue, Apr 16, 2013 at 4:29 PM, Sašo Kiselkov <skiselkov.ml at gmail.com> wrote:
>
>> If you are IOPS constrained, then yes, raid-zn will be slower, simply
>> because any read needs to hit all data drives in the stripe. This is
>> even worse on writes if the raidz has bad geometry (number of data
>> drives isn't a power of 2).
>>
>
> Off topic slightly, but I have always wondered at this - what exactly
> causes non-power of 2 plus number of parities geometries to be slower, and
> by how much?  I tested for this effect with some consumer drives, comparing
> 8+2 and 10+2, and didn't see much of a penalty (though the only random test
> I did was read, our workload is highly sequential so it wasn't important).


My take on this is not that these geometries are slower, but that
they may be less efficient in terms of storage overhead.

Say you write a 16-sector block of userdata to your arrays.
With 8+2 that would be two full stripes of data and parity.
With 9+2 that would be a 9+2 and a 7+2 stripe. Access to
this data is less balanced, placing more load on the disks which
hold 2 sectors of this block and less load on those which hold
only one. It seems more "sad" when (e.g. due to compression)
you have 1 or 2 userdata sectors remaining on a second stripe, but
must still provide the 2 or 3 sectors of redundancy (raidz2 or
raidz3) for this mini stripe.
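
To make the arithmetic above concrete, here is a rough Python sketch
(not ZFS code). The function names are made up for illustration, and
it assumes data sectors simply fill the data columns one stripe at a
time, ignoring how the real allocator rotates parity and places
columns on disk.

```python
def raidz_stripes(data_sectors, data_disks, parity):
    """Split a block into stripes; return (data, parity) sector counts per stripe."""
    stripes = []
    remaining = data_sectors
    while remaining > 0:
        d = min(remaining, data_disks)   # last stripe may be short
        stripes.append((d, parity))      # every stripe pays full parity
        remaining -= d
    return stripes


def data_sectors_per_disk(data_sectors, data_disks):
    """Count how many data sectors of one block land on each data column."""
    full, extra = divmod(data_sectors, data_disks)
    return [full + 1 if i < extra else full for i in range(data_disks)]


if __name__ == "__main__":
    for data_disks in (8, 9):
        print(data_disks, "data disks:",
              raidz_stripes(16, data_disks, 2),
              data_sectors_per_disk(16, data_disks))
    # A block compressed down to 10 sectors on 8+2: the second stripe
    # carries 2 data sectors but still needs 2 parity sectors.
    print(raidz_stripes(10, 8, 2))
```

With 8 data disks every column holds 2 sectors of the block; with 9,
seven columns hold 2 and two hold 1, which is the imbalance described
above.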

Also, as I found, ZFS raidzN takes precautions not to leave
potentially unusable holes (e.g. 1 or 2 free sectors, where you
can't fit parity and data), so it would allocate full stripes
when you have sufficiently unlucky stripe lengths just a few
sectors shorter than full (e.g. 7+2 above would likely be allocated
as 9+2 with zeroed-out extra sectors). These things do add up to
gigabytes, though they can happen on power-of-two sized arrays with
compression just as easily (I found this with a 6-disk raidz2, with
4 data disks).
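
A rough way to estimate this padding in Python, as a sketch only. It
assumes the allocation (data plus parity sectors) is rounded up to a
multiple of (parity + 1) sectors so that no leftover gap is too small
to hold one data sector plus its parity. That rounding rule is an
assumption of this sketch, so the exact number of padding sectors in
a real pool may differ; the function name is hypothetical.

```python
def raidz_allocated_sectors(data_sectors, total_disks, parity):
    """Return (raw, padded) sector counts for one block on a raidz vdev.

    raw    = data sectors + parity sectors for every (possibly short) stripe
    padded = raw rounded up to a multiple of (parity + 1)  [assumed rule]
    """
    data_disks = total_disks - parity
    stripes = -(-data_sectors // data_disks)   # ceiling division
    raw = data_sectors + parity * stripes
    unit = parity + 1
    padded = -(-raw // unit) * unit            # round up to the unit
    return raw, padded


if __name__ == "__main__":
    # 6-disk raidz2 (4 data disks): a block compressed from 16 to 15
    # sectors may end up occupying the same space as the full block.
    for sectors in (16, 15, 2, 1):
        print(sectors, raidz_allocated_sectors(sectors, 6, 2))
```

Under that assumption, a 15-sector block on the 6-disk raidz2 needs 23
sectors raw but occupies 24, the same as an uncompressed 16-sector
block, so a power-of-two data width does not avoid the effect once
compression makes block sizes irregular.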

Jim



