[OpenIndiana-discuss] Is this kind of nfs speed just tiny bit outrageous?
James Carlson
carlsonj at workingcode.com
Mon Jan 5 11:43:39 UTC 2015
On 01/04/15 18:40, Harry Putnam wrote:
> James Carlson <carlsonj at workingcode.com> writes:
>> What does your invocation line look like? Is it like this:
> But if you mean the rsnapshot invocation then:
>
> rsnapshot (using rsync) is running on linux client so
>
> rsync line (one of several) would be:
>
> rsync /var/ >>> (NFS-mounted share on lchost) /nfs/bk/
> The latter being the NFS mount point.
Yes, that's what I meant. And, yes, that's an unnecessarily problematic
usage model for rsync.
> I'd guess most of the small files would be in the /var hierarchy, but
> that is only one of the local source paths.
>
> Even there, it is a newish install so not really that many small
> files, way less than the millions you mentioned.
>
> More like:
> # find /var/ -type f|wc -l
> 66269
It's doing a synchronous write on every changed file transferred. It
might be possible to use truss to monitor the number of written files,
or to use snoop or wireshark to see the number of NFS COMMIT operations
that go by.
How long does it take to do 66K file writes and fsyncs?
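As a rough local illustration of that per-file sync cost (this is a hypothetical micro-benchmark, not anything from the thread; the path, file count, and use of GNU dd's conv=fsync are all arbitrary assumptions):

```shell
# Hypothetical micro-benchmark: time many small synchronous writes.
# /tmp/fsync-test and the count of 1000 are arbitrary choices.
mkdir -p /tmp/fsync-test
time for i in $(seq 1 1000); do
    # conv=fsync flushes each file to stable storage before dd exits,
    # loosely mimicking the per-file commit that NFS imposes
    dd if=/dev/zero of=/tmp/fsync-test/f$i bs=1k count=1 conv=fsync 2>/dev/null
done
n=$(ls /tmp/fsync-test | wc -l)
echo "wrote $n files"
rm -rf /tmp/fsync-test
```

On a local filesystem this finishes quickly; over an NFS mount each of those flushes becomes a network round trip plus a stable write on the server, which is where the hours can go.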
> Really pretty small potatoes for such a hefty time consumption.
>
> From OP:
> real 314m10.421s
> user 0m0.454s
> sys 3m52.071s
>
> 314 minutes for 1.3 GB
Agreed; that does seem pretty darned slow. Another possible issue would
be network problems. It'd be worth looking for errors there.
>> I assume it's the former, and you're trying to write zillions of tiny
>> files via NFS. If so, I suspect you're seeing the action of NFS COMMIT:
>
>> https://blogs.oracle.com/roch/entry/nfs_and_zfs_a_fine
>
> I got lost after a few paragraphs there. Is it really like what I'm
> doing?
Precisely.
>> The simple answer is "don't do that." You can serialize the stream,
>> transfer the serialized stream over the network (via ssh or rsync's
>> own protocol), and then write locally. This is what I do with my
>> rsync jobs, regardless of whether the target server is Solaris, Linux,
>> AIX, or something else.
>
> Not sure what you mean by `serialize' (in rotation perhaps?) but if
> what you are saying is that ssh is faster... yes .. in my case it is
> orders of magnitude faster.
"Serialize" in this context means to convert multiple files into a
datastream for transfer. A simple example of serialization would be the
"tar" utility. Another example -- much more pertinent here -- is
rsync's own network protocol, which you can use by setting up an rsync
daemon and addressing it as "host::module/" rather than "/net/host/".
It transfers all of
the file deltas as a single data stream to the remote system and then
unpacks and applies them there.
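As an illustrative sketch of that daemon setup (the module name "bk", the path /tank/bk, and the hostname "zfshost" are all made up for the example, not taken from the thread):

```shell
# On the ZFS server: a minimal rsyncd.conf module, then start the daemon.
cat > /etc/rsyncd.conf <<'EOF'
[bk]
    path = /tank/bk
    read only = false
EOF
rsync --daemon

# On the Linux client: the double colon selects rsync's own protocol,
# so the deltas travel as one stream and the many small writes happen
# locally on the server rather than one COMMIT at a time over NFS.
rsync -a /var/ zfshost::bk/var/
```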
The higher-level point is that if you convert to a serialized form and
then do the writes locally on the server, you'll run much faster than if
you do each of the writes individually on each file from the client.
This is true regardless of the client and server involved.
So, if you care about the resulting performance, just don't do it with
/net/host/ as an access mechanism. If you're trying to use this as a
means to stress test NFS or as a way to generate load in order to tune
the server or (perhaps) redesign the protocol, then it might be useful,
but if the goal is actually transferring files, it's much less useful.
> Running the same rsnapshot/rsync over an sshfs-mounted share located
> on zfs server is dozens of times faster than what I reported for nfs.
>
> Can't find the `time' report now but I think single digits, maybe 8
> minutes.
I'm not talking about sshfs. Read the rsync man page to see how to run
rsync over ssh.
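For reference, an rsync-over-ssh invocation as the man page describes it looks like this (hostname, user, and paths here are hypothetical):

```shell
# A single colon selects the remote-shell transport (ssh by default in
# modern rsync). rsync runs on both ends, and the receiving side
# performs the writes locally -- no NFS COMMIT per file.
rsync -a -e ssh /var/ user@zfshost:/tank/bk/var/
```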
I don't know if sshfs provides the same integrity guarantees as NFS. I
suspect that one of your problems (it's possible you have more than one
here) is the NFS COMMIT mechanism.
>> https://blogs.oracle.com/perrin/entry/slog_blog_or_blogging_on
>
> Looking into your last URL now, but do you think your original
> thoughts on this still hold?
Yes, at least on the advisability of using /net/host/... as an rsync
target. That's just a bad idea.
--
James Carlson 42.703N 71.076W <carlsonj at workingcode.com>