[OpenIndiana-discuss] [zfs] ZFS High-Availability and Sync Replication

Roel_D openindiana at out-side.nl
Mon Nov 19 22:19:12 UTC 2012


http://openindiana.org/pipermail/openindiana-discuss/2011-February/002451.html
Are we reinventing the wheel? 



Kind regards, 

The out-side

On 19 Nov 2012, at 11:35, Sašo Kiselkov <skiselkov.ml at gmail.com> wrote:

> On 11/18/2012 08:32 PM, Richard Elling wrote:
>> more below...
>> 
>> On Nov 18, 2012, at 3:13 AM, Sašo Kiselkov <skiselkov.ml at gmail.com> wrote:
>> 
>>> On 11/17/2012 03:03 AM, Richard Elling wrote:
>>>> On Nov 15, 2012, at 5:39 AM, Sašo Kiselkov <skiselkov.ml at gmail.com> wrote:
>>>> 
>>>>> I've been lately looking around the net for high-availability and sync
>>>>> replication solutions for ZFS and came up pretty dry - seems like all
>>>>> the jazz is going around on Linux with corosync/pacemaker and DRBD. I
>>>>> found a couple of tools, such as AVS and OHAC, but these seem rather
>>>>> unmaintained, so it got me wondering what others use for ZFS clustering,
>>>>> HA and sync replication. Can somebody please point me in the right
>>>>> direction?
>>>> 
>>>> Architecturally, replicating in this way is a bad idea. Past efforts to do 
>>>> block-level replication suffer from one or more of:
>>>>    1. coherency != consistency
>>>>    2. performance sux without nonvolatile write caches
>>>>    3. network bandwidth is not infinite
>>>>    4. the speed of light is too slow
>>>>    5. replicating big chunks of data exacerbates #3 and #4
>>>> 
>>>> AVS and friends worked ok for the time they were originally developed,
>>>> when disks were 9, 18, or 36 GB. For a JBOD full of 4TB disks, it just isn't
>>>> feasible.
>>>> 
>>>> In the market, where you do see successes for block-level replication, the
>>>> systems are constrained to avoid #2 and #5 (eg TrueCopy or SRDF).
>>>> 
>>>> For most practical applications today, the biggest hurdle is #1, by far. 
>>>> Fortunately, there are many solutions there: NoSQL, distributed databases,
>>>> (HA-)IMDBs, etc.
>>>> 
>>>> Finally, many use cases for block-level replication are solved in the metro
>>>> area by choosing the right hardware and using mirrors, thus solving #1 and
>>>> keeping things simple (KISS) at the same time.
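Points #3 and #4 above are easy to quantify. A quick back-of-the-envelope sketch: the 4 TB disk size comes from the message below, while the link speeds and the 100 km metro distance are assumptions of mine for illustration:

```python
# Rough numbers behind "network bandwidth is not infinite" (#3) and
# "the speed of light is too slow" (#4).  Decimal units throughout.

def resync_hours(disk_bytes, link_bits_per_s):
    """Time to push one full disk's worth of data over a link."""
    return disk_bytes * 8 / link_bits_per_s / 3600

def fiber_rtt_ms(km):
    """Round-trip light time through fiber (refractive index ~1.5)."""
    c_fiber = 3e8 / 1.5          # ~200,000 km/s in glass
    return 2 * km * 1000 / c_fiber * 1000

disk = 4e12                      # one 4 TB disk
print(f"full resync @  1 Gb/s: {resync_hours(disk, 1e9):.1f} h")
print(f"full resync @ 10 Gb/s: {resync_hours(disk, 1e10):.1f} h")
print(f"100 km metro RTT: {fiber_rtt_ms(100):.2f} ms per sync write")
```

So a single 4 TB disk takes roughly nine hours to resync at 1 Gb/s, and every synchronous write across a 100 km metro link eats about a millisecond of round-trip latency before any processing happens, which is why a JBOD full of such disks is not feasible to replicate this way.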
>>> 
>>> I understand that replication at the storage level is the Wrong Way(tm)
>>> to do it, but I need to cover this scenario for the rare cases where the
>>> application layer can't/won't do it itself. Specifically, I
>>> need to replicate VM backing storage for VMs that can't do software
>>> RAID-1 themselves (which is of course the best way).
>>> 
>>> In any case, I'm just looking at what's available in the market now.
>>> Ultimately I might go for shared storage + two heads + corosync/pacemaker
>>> (I got 'em to compile on Illumos).
>> 
>> 
>> If you are just building a HA cluster pair with shared storage, then there is
>> significant work already done with RSF-1, OHAC, VCS, etc. I've looked at
>> corosync in detail and it is a bit more DIY than the others. The easy part is
>> getting the thing to work when a node totally fails... the hard part is that
>> nodes rarely totally fail...
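For reference, the "easy part" above, recovering from a node that totally fails, reduces on the surviving head to a handful of pool operations. A minimal sketch of what a cluster agent (RSF-1, OHAC, pacemaker, ...) would ultimately run, where the pool name "tank" is a placeholder of mine and fencing is assumed to have already guaranteed the other head is down:

```shell
#!/bin/sh
# Failover sketch for a shared-storage ZFS pool, run on the surviving
# head.  NOT a drop-in agent: fencing/STONITH must already have
# guaranteed that the other head is really dead before this runs,
# otherwise a forced dual import will corrupt the pool.

POOL=tank   # placeholder pool name

# Force-import the pool despite the dead head's ownership stamp.
zpool import -f "$POOL" || exit 1

# Bring datasets and their NFS shares back online for clients.
zfs mount -a
zfs share -a
```

The hard part Richard alludes to, a node that is half-alive and still writing, is exactly why the fencing prerequisite matters more than the commands themselves.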
> 
> Naturally, it takes some testing and trial and error to get a cluster
> suite designed, which is why I'm not trying to do it myself and am
> looking for what others have done. Of the solutions you mention, after a
> bit of research I have gotten the following impressions:
> 
> RSF-1:
>  Pros:
>    *) seems OI/Illumos aware and friendly
>    *) commercial support available
>  Cons:
>    *) closed-source
>    *) no downloadable trial version
>    *) no price on website, which complicates market research
> 
> OHAC:
>  Pros:
>    *) open-source & free
>    *) Sun project, so probably well integrated with Solaris OSes
>  Cons:
>    *) dead, or at least the public part of it
>    *) documentation links dead or lead to Oracle walled gardens
> 
> VCS:
>  Pros:
>    *) commercial support available
>  Cons:
>    *) deeply proprietary (down to the L2 interconnect protocol)
>    *) no price on website
> 
> Of these, OHAC seems like the best bet, because we can try it out and
> deploy it freely. Back in the day it offered the additional peace of
> mind of being backed by Sun (meaning we could get commercial support
> from a reputable vendor, if need be); sadly, that is no longer the case.
> 
> Cheers,
> --
> Saso
> 
> _______________________________________________
> OpenIndiana-discuss mailing list
> OpenIndiana-discuss at openindiana.org
> http://openindiana.org/mailman/listinfo/openindiana-discuss

