[OpenIndiana-discuss] Current ZFS Backup projects (OpenIndiana-discuss Digest, Vol 26, Issue 34)

Jim Klimov jimklimov at cos.ru
Thu Sep 13 09:23:12 UTC 2012


2012-09-12 6:05, Ong Yu-Phing wrote:
> Jim, I assume you are referring to this:
> http://wiki.openindiana.org/oi/rsync+daemon+service+on+OpenIndiana, thanks!

Yes, I think that's it ;)

> My concern is that typically rsync will take quite a while to traverse a
> large set of files before sending only changed files; a classic example
> is backing up say 1TB of maildir emails, it may take 4+ hours, and you
> now have to deal with a situation where your midnight backup is really a
> "somewhere between midnight and 4am" backup.  And of course, if you want
> to take snapshots/backups of comstar volumes, rsync isn't quite the
> right fit.
>
> On the other hand, a zfs snapshot gives an almost-at-the-time backup
> (give or take a few seconds), versus the aforementioned rsync.  The zfs
> snapshot can then be sent off-site, independent of the backup activity.

I got the impression that you needed to back up some "client" machines
with varied OSes, such as Windows or Linux desktops, onto a ZFS server.
In that case rsync should help, although you're right that it would
take a long time to scan the directory trees for changes.

With client FSes that support snapshots (ZFS, NTFS shadow copy) you
might have some luck writing scripts that take a snapshot on the client,
hold it (in the case of ZFS, to avoid it being destroyed while you're
working), rsync the changes from the snapshot (so the 0am backup is
really the state at 0am) and release/destroy the snapshot on the client.
In the case of ZFS at least, you might gain some optimization by using
"zfs diff" to determine the files changed between two snapshots on the
client - but then you should not destroy rsynced snapshots right
away, but keep a backlog of at least one or two. And you should have
some locking to prevent several instances of the backup job from crawling
the same client space and bringing IOPS to a halt.
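
As a rough sketch of the sequence above (in Python, and only under
assumptions: the script runs on a ZFS client and pushes to the server,
and the dataset, snapshot name, hold tag "backup", destination and lock
path are all made-up placeholders), it could look something like this.
The snapshot is released but not destroyed, so a later run can compare
against it with "zfs diff -H"; pruning old snapshots and propagating
deletions are left out, and paths that "zfs diff" prints in escaped form
are not handled.

```python
import fcntl
import os
import subprocess
from contextlib import contextmanager


@contextmanager
def exclusive_lock(path):
    """Take a non-blocking exclusive lock; raises BlockingIOError if held."""
    fd = os.open(path, os.O_CREAT | os.O_RDWR, 0o600)
    try:
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        yield
    finally:
        os.close(fd)  # closing the descriptor also drops the lock


def snapshot_commands(dataset, snap, mountpoint, dest, tag="backup"):
    """Build the snapshot, hold, rsync and release commands for one run."""
    full = f"{dataset}@{snap}"
    # Read from the frozen snapshot directory, not from the live filesystem.
    src = os.path.join(mountpoint, ".zfs", "snapshot", snap) + "/"
    return [
        ["zfs", "snapshot", full],
        ["zfs", "hold", tag, full],
        ["rsync", "-a", "--delete", src, dest],
        ["zfs", "release", tag, full],
    ]


def run_backup(dataset, snap, mountpoint, dest, lock_path,
               run=subprocess.check_call):
    """Run one backup pass; only one instance per lock file at a time."""
    snapshot, hold, rsync, release = snapshot_commands(
        dataset, snap, mountpoint, dest)
    with exclusive_lock(lock_path):
        run(snapshot)
        run(hold)
        try:
            run(rsync)
        finally:
            run(release)  # drop the hold even if rsync failed


def changed_paths(diff_output, mountpoint):
    """Parse `zfs diff -H` output into paths relative to the mountpoint."""
    prefix = mountpoint.rstrip("/") + "/"
    paths = []
    for line in diff_output.splitlines():
        fields = line.split("\t")
        if len(fields) < 2 or fields[0] == "-":
            continue  # skip blank lines and deletions
        path = fields[-1]  # for renames ("R") the last field is the new name
        rel = path[len(prefix):] if path.startswith(prefix) else ""
        if rel:
            paths.append(rel)
    return paths
```

The list returned by changed_paths() could be written to a file and
passed to rsync with --files-from, so that rsync does not have to walk
the whole tree.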

Now, before you ask "why not zfs-send client snapshots directly?" -
there may be reasons, such as incompatible ZFS versions on client
and server, differing dataset layouts, or a flaky network preventing
the transfer of large zfs-send streams (though that should have been
addressed by the resumable zfs-send feature, if that was integrated).

HTH,
//Jim Klimov



