[OpenIndiana-discuss] Recommendations for fast storage

Wed Apr 17 00:49:45 UTC 2013

On 2013-04-17 02:10, Jay Heyl wrote:
> Not to get into bickering about semantics, but I asked, "Or am I wrong
> about reads being issued in parallel to all the mirrors in the array?", to
> which you replied, "Yes, in normal case... this assumption is wrong... but
> reads should be in parallel." (Ellipses intended for clarity, not argument
> munging.) If reads are in parallel, then it seems as though my assumption
> is correct. I realize the system will discard data from all but the first
> reads and that using only the first response can improve performance, but
> in terms of number of IOPs, which is where I intended to go with this, it
> seems to me the mirrored system will have at least as many if not more than
> the raid-zn system.
>
> Or have I completely misunderstood what you intended to say?

Um, right... I got torn between several letters and forgot the details
of one. So, here is what I was replying to, with poor wording. *I thought
you meant* "A single read request from a program would be redirected as
a series of parallel requests to mirror components asking for the same
data, and whichever one answers first wins". That is not what happens,
and that is the "wrong" in my reply. Unless the first device to answer
returns garbage (something that doesn't match the expected checksum),
other copies are not read as part of this request.
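
To make that concrete, here is a rough sketch in Python of "read one
copy, go to the others only on a checksum mismatch". It is a toy model
and not the actual ZFS code: mirror_read(), the callables standing in
for disks, and the use of SHA-256 as the checksum are all assumptions
made up for illustration.

```python
import hashlib


def checksum(data: bytes) -> bytes:
    # Stand-in checksum (assumption); only the compare-to-expected step matters here.
    return hashlib.sha256(data).digest()


def mirror_read(children, expected, preferred=0):
    """Read one block from a mirror.

    children:  list of zero-argument callables, each returning one copy's bytes
               (hypothetical stand-ins for the mirror's disks).
    expected:  checksum the block is supposed to have.
    preferred: index of the copy to try first.

    Returns (data, reads_issued). Other copies are touched only if the
    copy read so far fails verification.
    """
    order = [preferred] + [i for i in range(len(children)) if i != preferred]
    reads = 0
    for i in order:
        data = children[i]()
        reads += 1
        if checksum(data) == expected:
            return data, reads
    raise IOError("no copy matched the expected checksum")


good = b"block contents"
healthy = [lambda: good, lambda: good]
damaged = [lambda: b"garbage", lambda: good]

data_ok, reads_ok = mirror_read(healthy, checksum(good))    # one disk touched
data_bad, reads_bad = mirror_read(damaged, checksum(good))  # falls back to the second copy
```

In the healthy case only one device does any work for this request,
which is the point: the second read happens only when the first copy
turns out to be bad.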

Now, if there are many requests issued on the system simultaneously,
which is most often the case, then reads from different requests are
directed to different disks, but again, one read goes to one disk
except in pathological cases. It is likely that the system selects a
disk to read from based, in part, on its expectation of where the
disk head is (i.e. which disk's last requested LBA is nearest to the
LBA we want now) in order to minimize latency and time lost to seeking.
Thus "sequential reads", where requests for nearby sectors come in
succession, are likely to be satisfied by a single disk in the
mirror, leaving the other disks available to satisfy other reads.
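
A toy model of that heuristic, again in Python and again only a sketch:
it assumes the only input is each disk's last requested LBA, which is a
simplification of the guess above and not how the real code is known to
decide. pick_disk() and simulate() are made-up names.

```python
def pick_disk(last_lba, lba):
    # Choose the disk whose last requested LBA is nearest to the wanted LBA;
    # ties go to the lowest-numbered disk.
    return min(range(len(last_lba)), key=lambda i: (abs(last_lba[i] - lba), i))


def simulate(requests, n_disks=2):
    """Return, for each requested LBA, the index of the disk that serves it."""
    last_lba = [0] * n_disks  # assumed starting head positions
    assignment = []
    for lba in requests:
        disk = pick_disk(last_lba, lba)
        last_lba[disk] = lba  # that disk's head is now assumed to be at this LBA
        assignment.append(disk)
    return assignment


# Two interleaved sequential streams: one near LBA 1000, one near LBA 10.
streams = simulate([1000, 1001, 10, 1002, 11, 1003])
# streams == [0, 0, 1, 0, 1, 0]
```

Each stream ends up pinned to its own disk, so the two disks serve two
independent readers instead of both seeking back and forth.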

Copies of a write request, however, are sent to all disks and committed
(flushed) before a synchronous request is accepted as completed
(for example, a write-and-commit of a TXG, i.e. a transaction group).
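
The write side can be sketched the same way. This is a simplified
illustration only: the Disk class with its separate cache and stable
storage and the mirror_sync_write() function are hypothetical, and a
real TXG commit involves much more than this.

```python
class Disk:
    """Hypothetical disk with a volatile write cache and stable storage."""

    def __init__(self):
        self.cache = {}   # written but not yet flushed
        self.stable = {}  # survives power loss

    def write(self, lba, data):
        self.cache[lba] = data

    def flush(self):
        # Move everything from the volatile cache to stable storage.
        self.stable.update(self.cache)
        self.cache.clear()


def mirror_sync_write(disks, lba, data):
    # Every mirror component gets its own copy of the write...
    for disk in disks:
        disk.write(lba, data)
    # ...and every component is flushed before completion is reported.
    for disk in disks:
        disk.flush()
    return True


mirror = [Disk(), Disk(), Disk()]
done = mirror_sync_write(mirror, 7, b"payload")
```

So unlike a read, a synchronous write costs an I/O plus a flush on
every disk in the mirror before the caller hears back.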

Hope this makes my point clearer, it is late here ;)
//Jim