[OpenIndiana-discuss] Missing the boat on using dedupe
hput
hputn3 at zohomail.com
Thu Jun 9 12:11:04 UTC 2022
First, I made a mess of trying to merge three slightly different
versions of a trove of images. The copies are directories full of
images on zfs filesystems on 2 different zfs-linux hosts, plus 1 image
trove on a windows 10 host.
I'm sorry that what follows is a bit hard to follow.
I started by rsyncing the Images directory on one zfs-linux host into a
very similar Images directory on a different zfs-linux host.
Then I rsynced a similar Images directory on windows 10 into the result
of the first merge.
When I added the windows directory to the mess, it seems to have
doubled, or nearly so, the size of the 3-way merged Images directory.
I didn't use any trick flags with rsync in either case. The first merge was just:
rsync -vvrptgoD --stats $linux_zfs_HOST1_Images/ \
$linux_zfs_HOST2_Images/
In that merge there wasn't much difference in final size.
By the way, both linux-zfs hosts use compression=lz4.
Result of first merge:
USED AVAIL REFER
284G 2.35T 284G
And finally:
rsync -vvrptgoD --stats $Win_10_HOST_Images/ \
$linux_zfs_HOST2_Images/
In that final merge, from the windows 10 host to zfs-linux HOST2, the
size doubled:
USED AVAIL REFER
569G 2.57T 569G
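One way to check whether the merged tree really holds byte-identical copies of the same images is to group files by checksum. What follows is only a minimal sketch, assuming GNU coreutils (sha256sum, and uniq with -w and --all-repeated). The function name list_dup_files is hypothetical and not part of rsync or zfs.

```shell
#!/bin/sh
# Hypothetical helper (not part of rsync or zfs): print groups of files
# under a directory whose contents are byte-identical, judged by SHA-256.
# Assumes GNU coreutils: sha256sum, and uniq with -w / --all-repeated.
# Caveat: file names containing a backslash or newline get an escaped
# output line from sha256sum and are not grouped correctly here.
list_dup_files() {
    dir=$1
    # Each output line is: <64 hex digits><two spaces><path>
    find "$dir" -type f -exec sha256sum {} + |
        LC_ALL=C sort |
        # Compare only the 64-character hash, keep every line whose hash
        # repeats, and separate the groups with a blank line.
        uniq -w64 --all-repeated=separate
}
```

Running something like `list_dup_files /Images | less` on the merged directory would show which files, if any, are exact copies of each other under different names or paths. Hashing several hundred gigabytes will take a while.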
OK, so I thought I'd experiment with dedup on an OI vbox-vm, and ship
the enlarged Images fs onto a dedup-enabled zpool and zfs fs.
On OI, I created a pool on one virtual disk and set:
zfs set dedup=on tdd
(the name tdd is from: test dedup)
Then I created the fs and set dedup on it as well:
zfs create -o mountpoint=/Images tdd/Images
zfs set dedup=on tdd/Images
By creating a dedup-enabled pool and zfs fs, I was trying to use zfs
dedup to fix my mess and remove duplicates.
The final move, on the 2nd host (zfs-linux HOST2), was to run:
zfs send -v p2/Images@$SNAP | pv | ssh oi zfs recv -v -F tdd/Images
(a bit simplified)
I expected to see something like a halving of the size when the
zfs send completed.
Starting size on zfs-linux HOST2 (simplified):
NAME USED AVAIL REFER MOUNTPOINT
p2/Images 569G 2.57T 569G /Images
Result on the deduped pool and zfs fs on OI (newly updated):
NAME USED AVAIL REFER MOUNTPOINT
tdd/Images 568G 382G 568G /Images
So, the result is very disappointing.
I guess I'm really missing the boat about how dedup is supposed to
work.
First, thanks for slogging thru my poor writing this far.
Can anyone explain where I'm going wrong?