[OpenIndiana-discuss] Recommendations for fast storage

Sašo Kiselkov skiselkov.ml at gmail.com
Wed Apr 17 09:16:11 UTC 2013


On 04/17/2013 02:08 AM, Edward Ned Harvey (openindiana) wrote:
>> From: Sašo Kiselkov [mailto:skiselkov.ml at gmail.com]
>>
>> If you are IOPS constrained, then yes, raid-zn will be slower, simply
>> because any read needs to hit all data drives in the stripe. 
> 
> Saso, I would expect you to know the answer to this question, probably:
> I have heard that raidz is more similar to raid-1e than raid-5.
> Meaning, when you write data to raidz, it doesn't get striped across
> all devices in the raidz vdev...  Rather, two copies of the data get
> written to any of the available devices in the raidz. Can you confirm?

No, this is not what happens. Raid-Z does stripe data across all leaf
vdevs (minus parity), and it does so by splitting the logical block up
into equally sized portions. A block *can* occupy less than the full
stripe width if the block is very small or the stripe is very wide, but
both cases should be rare. 4k-sectored devices change this calculation
quite dramatically, which is why I wouldn't recommend using them in
pools unless you understand how your workload and raidz geometry will
interact and plan for it. See vdev_raidz_map_alloc here:

https://github.com/illumos/illumos-gate/blob/master/usr/src/uts/common/fs/zfs/vdev_raidz.c#L434-L554

for all the fine details (the above description is grossly oversimplified).
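
To make the splitting a bit more concrete, here is a rough sketch of the
per-column arithmetic. It is loosely modeled on what vdev_raidz_map_alloc
computes, but it is not that function: the name raidz_column_sectors and
its parameters are made up for illustration, and it ignores the starting
offset, device rotation and skip-sector padding entirely.

```python
def raidz_column_sectors(block_bytes, sector_bytes, ncols, nparity):
    """Return the sectors written to each column, parity columns first.

    Hypothetical helper for illustration only; ignores the starting
    offset, device rotation and skip-sector padding.
    """
    data_cols = ncols - nparity
    s = -(-block_bytes // sector_bytes)  # block size in sectors, rounded up
    q = s // data_cols                   # full rows across all data columns
    r = s - q * data_cols                # data sectors left for a partial row
    bc = 0 if r == 0 else r + nparity    # columns that carry one extra sector
    acols = bc if q == 0 else ncols      # columns the block actually touches
    return [q + 1 if c < bc else q for c in range(acols)]


# 128 KiB block on a 6-wide raidz1:
print(raidz_column_sectors(128 * 1024, 512, 6, 1))   # [52, 52, 51, 51, 51, 51]
print(raidz_column_sectors(128 * 1024, 4096, 6, 1))  # [7, 7, 7, 6, 6, 6]

# 4 KiB block on the same vdev:
print(raidz_column_sectors(4096, 512, 6, 1))         # [2, 2, 2, 2, 1, 1]
print(raidz_column_sectors(4096, 4096, 6, 1))        # [1, 1]
```

The last line is the small-block case: with 4k sectors a 4 KiB block is a
single data sector plus a single parity sector, so it touches only two
drives and spends as much space on parity as on data.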

> If the behavior is to stripe across all the devices in the raidz,
> then the raidz iops really can't exceed that of a single device,
> because you have to wait for every device to respond before you
> have a complete block of data.  But if it's more like raid-1e and
> individual devices can read independently of each other, then at
> least theoretically, the raidz with n-devices in it could return
> iops performance on-par with n-times a single disk. 

As a general rule of thumb, a raidz vdev has the random-read IOPS of a
single drive. This is not exactly news:
https://blogs.oracle.com/roch/entry/when_to_and_not_to
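
As back-of-the-envelope arithmetic, the rule of thumb versus the
raid-1e-style hypothesis from your question looks like this. The function
names and the 150 IOPS per-drive figure are assumptions picked purely for
illustration, not measurements.

```python
def raidz_random_read_iops(drive_iops, raidz_vdevs):
    """Rule of thumb: each raidz vdev behaves like one drive for random reads."""
    return drive_iops * raidz_vdevs


def independent_read_iops(drive_iops, disks):
    """Hypothetical upper bound if every disk could serve reads on its own."""
    return drive_iops * disks


DRIVE_IOPS = 150  # assumed per-drive figure, illustration only

# One 6-disk raidz vdev:
print(raidz_random_read_iops(DRIVE_IOPS, 1))  # 150
# What the same 6 disks would give if they could read independently:
print(independent_read_iops(DRIVE_IOPS, 6))   # 900
```

So if you need more random-read IOPS out of raidz, the lever under this
rule of thumb is more vdevs, not wider ones.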

Cheers,
--
Saso


