[OpenIndiana-discuss] ZFS read speed(iSCSI)

Jim Klimov jimklimov at cos.ru
Fri Jun 7 14:43:16 UTC 2013


On 2013-06-07 14:09, Edward Ned Harvey (openindiana) wrote:
>> From: Heinrich van Riel [mailto:heinrich.vanriel at gmail.com]
>>
>> I will post my findings, but might take some time to fix the network in
>> time and they will have to deal with 1Gbps for the storage. The request is
>> to run ~90 VMs on 8 servers connected.
>
> With 90 VMs on 8 servers, being served ZFS iSCSI storage by 4x 1Gb Ethernet in LACP, you're really not going to care about any one VM being able to go above 1Gbit, because it's going to be so busy all the time that the 4 LACP-bonded ports will actually be saturated.  I think your machines are going to be slow.  I normally plan for 1Gbit per VM, in order to be comparable with a simple laptop.
>
> You're going to have a lot of random IO.  I'll strongly suggest you switch to mirrors instead of raidz.

I hold your practical knowledge in higher regard than my theoretical
hunches, but I believe typical PCs (including VDI desktops) don't do
much disk IO after they've loaded the OS or a requested application.
The exception is swapping due to low RAM, and swapping over iSCSI
sounds like a bad idea best avoided by other means (such as more RAM
in the guests) anyway.

That is, the IO may saturate a single link, but only in bursts, just
like any other sporadic, less-than-full-bandwidth IO.

And from what I read, if his 8 VM servers contact the ZFS storage
box with connections to many more targets, then on average all the
NICs should get their share of work, for one connection or another,
even as part of LACP trunks (which may be easier to manage than
VLAN-based MPxIO, though that has its own benefits). Right?..
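
LACP hashes each connection onto a single link, so any one iSCSI
session is still capped at 1Gbit, but many sessions to many targets
should spread over the trunk. A minimal sketch of the aggregation on
the OpenIndiana side (link names and the address are made up for the
example):

    # bond two physical NICs into an LACP aggregation (names are examples)
    dladm create-aggr -L active -l igb0 -l igb1 aggr0
    # hash on L4 (ports) too, so multiple iSCSI connections spread better
    dladm modify-aggr -P L4 aggr0
    # bring the aggregation up with a static address (address is an example)
    ipadm create-ip aggr0
    ipadm create-addr -T static -a 192.168.10.5/24 aggr0/v4

The switch ports on the other end have to be configured for LACP as
well, of course.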

The OP's particular scenario may differ from generic theory in that
it is an educational lab environment, i.e. there would be lots of
VMs cloned for class work, doing their IO bursts more or less at the
same time with the same data (like learning to install the same
program). If so, L2ARC caching can do a lot of good by keeping copies
of the origin dataset (the "golden image" the VMs shared before
cloning) as well as installation images, etc.
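
For instance, if the VM disks are zvols exported over iSCSI, the
golden-image setup might look roughly like this (pool, dataset and
device names are invented for the example):

    # add an SSD as an L2ARC cache device (device name is an example)
    zpool add tank cache c4t1d0
    # snapshot the golden zvol and clone a thin copy per VM;
    # clones share blocks with the origin, so the common data
    # is read from disk (and cached) only once
    zfs snapshot tank/golden@class1
    zfs clone tank/golden@class1 tank/vm01
    zfs clone tank/golden@class1 tank/vm02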
Dedup might seem like a good idea as well, but if the storage is only
needed for a short duration (a class, a semester), the overheads and
possible slowdowns during writes due to dedup might in fact make
matters worse. Maybe not; experiments would show, and they should be
easy to stage: run one class with some settings and another with
different ones, then measure the differences in response times, disk
usage, etc., and pick the better choice :)
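
Even before staging such an experiment, zdb can simulate the dedup
table on the existing data and give a rough estimate (pool and dataset
names are made up again):

    # simulate dedup over the pool and print an estimated ratio
    # (reads all the data, so it can take a while)
    zdb -S tank
    # if the estimate looks worthwhile, enable dedup on the lab data only
    zfs set dedup=on tank/lab
    # watch the actual ratio afterwards
    zpool list tank

The usual rule of thumb is that ratios much below 2x rarely justify
the DDT's RAM footprint.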

So here's my 2c, but they may be wrong ;)
HTH,
//Jim Klimov



