[OpenIndiana-discuss] [zfs] ZFS High-Availability and Sync Replication
Roel_D
openindiana at out-side.nl
Mon Nov 19 22:19:12 UTC 2012
http://openindiana.org/pipermail/openindiana-discuss/2011-February/002451.html
Are we reinventing the wheel?
Kind regards,
The out-side
On 19 Nov 2012, at 11:35, Sašo Kiselkov <skiselkov.ml at gmail.com> wrote:
> On 11/18/2012 08:32 PM, Richard Elling wrote:
>> more below...
>>
>> On Nov 18, 2012, at 3:13 AM, Sašo Kiselkov <skiselkov.ml at gmail.com> wrote:
>>
>>> On 11/17/2012 03:03 AM, Richard Elling wrote:
>>>> On Nov 15, 2012, at 5:39 AM, Sašo Kiselkov <skiselkov.ml at gmail.com> wrote:
>>>>
>>>>> I've been lately looking around the net for high-availability and sync
>>>>> replication solutions for ZFS and came up pretty dry - seems like all
>>>>> the jazz is going around on Linux with corosync/pacemaker and DRBD. I
>>>>> found a couple of tools, such as AVS and OHAC, but these seem rather
>>>>> unmaintained, so it got me wondering what others use for ZFS clustering,
>>>>> HA and sync replication. Can somebody please point me in the right
>>>>> direction?
>>>>
>>>> Architecturally, replicating in this way is a bad idea. Past efforts to do
>>>> block-level replication suffer from one or more of:
>>>> 1. coherency != consistency
>>>> 2. performance sux without nonvolatile write caches
>>>> 3. network bandwidth is not infinite
>>>> 4. the speed of light is too slow
>>>> 5. replicating big chunks of data exacerbates #3 and #4
>>>>
>>>> AVS and friends worked ok for the time they were originally developed,
>>>> when disks were 9, 18, or 36 GB. For a JBOD full of 4TB disks, it just isn't
>>>> feasible.
>>>>
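To put rough numbers on points #3 to #5 quoted above, here is a minimal back-of-the-envelope sketch. Every input is an assumed example value, not something stated in the thread: a 24-bay JBOD, a 1 Gbit/s replication link running at 80% efficiency, and a signal speed in fiber of roughly 200,000 km/s. The function names are made up for illustration.

```python
# Back-of-the-envelope estimates; every input below is an assumed example value.

SPEED_IN_FIBER_KM_S = 200_000  # roughly 2/3 of the speed of light in vacuum


def full_resync_hours(capacity_tb: float, link_gbit_s: float,
                      efficiency: float = 0.8) -> float:
    """Hours needed to copy capacity_tb terabytes over a link_gbit_s link."""
    capacity_bits = capacity_tb * 1e12 * 8
    usable_bits_per_s = link_gbit_s * 1e9 * efficiency
    return capacity_bits / usable_bits_per_s / 3600


def min_sync_write_latency_ms(distance_km: float) -> float:
    """Lower bound on latency added to each synchronous write: one round trip."""
    return 2 * distance_km / SPEED_IN_FIBER_KM_S * 1000


if __name__ == "__main__":
    # 36 GB disks versus 4 TB disks in an assumed 24-bay JBOD.
    for disk_tb in (0.036, 4.0):
        hours = full_resync_hours(disk_tb * 24, link_gbit_s=1.0)
        print(f"24 x {disk_tb} TB over 1 Gbit/s: {hours:.1f} h")
    print(f"100 km apart: at least {min_sync_write_latency_ms(100):.1f} ms per write")
```

Under these assumptions a full resync of the 36 GB shelf is a matter of hours, while the 4 TB shelf takes on the order of eleven days, and no amount of bandwidth removes the per-write round trip.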
>>>> In the market, where you do see successes for block-level replication, the
>>>> systems are constrained to avoid #2 and #5 (e.g. TrueCopy or SRDF).
>>>>
>>>> For most practical applications today, the biggest hurdle is #1, by far.
>>>> Fortunately, there are many solutions there: NoSQL, distributed databases,
>>>> (HA-)IMDBs, etc.
>>>>
>>>> Finally, many use cases for block-level replication are solved in the metro
>>>> area by choosing the right hardware and using mirrors, thus solving #1 and
>>>> KISS at the same time.
>>>
>>> I understand that replication at the storage level is the Wrong Way(tm)
>>> to do it, but I need to cover this scenario for the rare cases where the
>>> application layer can't/won't do it itself. Most specifically, I
>>> need to replicate VM backing storage for VMs that can't do software
>>> RAID-1 themselves (which is of course the best way).
>>>
>>> In any case, I'm just looking at what's available in the market now.
>>> Ultimately I might go for shared storage + two heads + corosync/pacemaker
>>> (I got 'em to compile on Illumos).
>>
>>
>> If you are just building a HA cluster pair with shared storage, then there is
>> significant work already done with RSF-1, OHAC, VCS, etc. I've looked at
>> corosync in detail and it is a bit more DIY than the others. The easy part is
>> getting the thing to work when a node totally fails... the hard part is that
>> nodes rarely totally fail...
>
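The remark that nodes rarely fail totally can be illustrated with a small sketch. It is not how RSF-1, OHAC, VCS or corosync/pacemaker actually work; the probe names, states and step strings are all hypothetical. The idea is only that several independent health checks can disagree, and that any takeover of a shared pool should fence the peer first so the pool is never imported on both heads.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class NodeState(Enum):
    HEALTHY = "healthy"
    DEAD = "dead"
    PARTIAL = "partial"


@dataclass
class Probes:
    """Results of independent checks against the peer node (all hypothetical)."""
    net_heartbeat: bool   # peer answers over the cluster interconnect
    disk_heartbeat: bool  # peer still updates a heartbeat block on shared storage
    service_ok: bool      # the exported service (e.g. NFS or iSCSI) responds


def classify(p: Probes) -> NodeState:
    """Map the probe results to a coarse node state."""
    checks = (p.net_heartbeat, p.disk_heartbeat, p.service_ok)
    if all(checks):
        return NodeState.HEALTHY
    if not any(checks):
        return NodeState.DEAD
    return NodeState.PARTIAL


def takeover_plan(p: Probes) -> List[str]:
    """Return the ordered steps the surviving head would take."""
    state = classify(p)
    if state is NodeState.HEALTHY:
        return []
    # A peer that looks dead may only be unreachable, so fence in both cases
    # before importing the shared pool.
    steps = ["fence peer", "import pool", "start services"]
    if state is NodeState.PARTIAL:
        # A half-alive peer may still be able to give up the pool cleanly.
        steps.insert(0, "ask peer to release pool")
    return steps


if __name__ == "__main__":
    print(takeover_plan(Probes(net_heartbeat=False, disk_heartbeat=True,
                               service_ok=False)))
```

The total-failure case is the simple branch; the partial case is where the real design work in a cluster suite lies.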
> Naturally, it takes some testing and trial and error to get a cluster
> suite designed, which is why I'm not trying to do it myself and am
> looking for what others have done. Of the solutions you mention, after a
> bit of research I have gotten the following impressions:
>
> RSF-1:
> Pros:
> *) seems OI/Illumos aware and friendly
> *) commercial support available
> Cons:
> *) closed-source
> *) no downloadable trial version
> *) no price on website, which complicates market research
>
> OHAC:
> Pros:
> *) open-source & free
> *) Sun project, so probably well integrated with Solaris OSes
> Cons:
> *) dead, or at least the public part of it
> *) documentation links dead or lead to Oracle walled gardens
>
> VCS:
> Pros:
> *) commercial support available
> Cons:
> *) deeply proprietary (down to the L2 interconnect protocol)
> *) no price on website
>
> Of these, OHAC seems like the best bet, because we can try and apply it
> freely and back in the day we could get additional peace of mind by it
> being backed by Sun (and thus we could get commercial support from a
> reputable vendor, if need be) - sadly that is no longer the case.
>
> Cheers,
> --
> Saso
>
> _______________________________________________
> OpenIndiana-discuss mailing list
> OpenIndiana-discuss at openindiana.org
> http://openindiana.org/mailman/listinfo/openindiana-discuss