[OpenIndiana-discuss] Missing the boat on using dedupe

hput hputn3 at zohomail.com
Thu Jun 9 12:11:04 UTC 2022


First, I made a mess of trying to merge three slightly different
versions of a trove of images: zfs fs directories full of images and
file structures on 2 different zfs-linux hosts, plus 1 image trove on
a windows 10 host.

I'm sorry, what follows is a bit hard to follow.

I started by rsyncing the Images directory on one of the zfs-linux
hosts into a very similar Images directory on a different zfs-linux
host.

Then I rsynced a similar Images directory on windows 10 into the result
of the first merge.

When I added the windows directory into the mess, it seems to have
doubled, or nearly so, the size of the 3-way merged Images directory.
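
One way to check whether the doubling comes from byte-identical files
sitting under different names or paths is to group files by checksum.
This is only a minimal sketch, assuming GNU coreutils (sha256sum) plus
find, sort and awk; the function name list_dupes is made up for
illustration and is not part of rsync or zfs.

```shell
#!/bin/sh
# list_dupes DIR: print "checksum  path" lines for every file under DIR
# whose content is identical to at least one other file under DIR.
# Hypothetical helper; assumes sha256sum from GNU coreutils is installed.
list_dupes() {
    find "$1" -type f -exec sha256sum {} + |
        sort |
        awk '{
            hash = $1
            if (hash == prev) {
                # Second or later member of a group: emit the first
                # member once, then the current line.
                if (!shown) { print prevline; shown = 1 }
                print $0
            } else {
                shown = 0
            }
            prev = hash
            prevline = $0
        }'
}
```

Running something like `list_dupes /Images` on the merged directory
would list the groups of identical files. It compares whole files only,
so it says nothing about files that merely look alike.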

I didn't use any trick flags with rsync in any of the cases, just:

  rsync -vvrptgoD --stats $linux_zfs_HOST1_Images/ \
                          $linux_zfs_HOST2_Images/

In that merge there wasn't much difference in final size.

By the way, both zfs-linux hosts use compression=lz4.

  Result of first merge:
  USED  AVAIL     REFER  
  284G  2.35T      284G

  And finally:
 
  rsync -vvrptgoD --stats $Win_10_HOST_Images/ \
                          $linux_zfs_HOST2_Images/

  In that final merge from the windows 10 host to zfs-linux HOST2, the
  size has doubled:
  
    USED  AVAIL     REFER
    569G  2.57T      569G

Ok, so I thought I'd experiment with dedup on an OI vbox-vm, and ship
the enlarged Images fs onto a dedup-enabled zpool and zfs fs.

On OI, I created a pool of one virtual disk (named tdd, for "test
dedup") and set:

  zfs set dedup=on tdd

Then I created the fs and set dedup on it as well:

  zfs create -o mountpoint=/Images tdd/Images
  zfs set dedup=on tdd/Images

By creating a `dedup-afied' pool and zfs fs I was trying to use zfs
dedup to fix my mess and remove duplicates.

The final move, on the 2nd zfs-linux host (HOST2), was to run:

  zfs send -v p2/Images@$SNAP | pv | ssh oi zfs recv -v -F tdd/Images

(again a bit simplified)

I expected to see something like a halving of the size when the
zfs send completed.

Starting size on zfs-linux HOST2:

NAME        USED  AVAIL  REFER  MOUNTPOINT
p2/Images   569G  2.57T   569G  /Images

(simplified above)

Result on the deduped pool and zfs fs on OI (newly updated):

NAME                                USED  AVAIL  REFER  MOUNTPOINT
tdd/Images                          568G   382G   568G  /Images
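
The figures above are per-dataset numbers from zfs list. zpool also
reports a pool-wide dedupratio property, which could be looked at
alongside them. A minimal sketch, assuming the pool name tdd from
above; show_dedup is a made-up wrapper name, while the zpool list and
zpool get calls are the standard ones.

```shell
#!/bin/sh
# show_dedup POOL: print pool-level space figures and the dedup ratio.
# Hypothetical wrapper; only calls the standard zpool list / zpool get.
show_dedup() {
    pool=$1
    # Bail out early on hosts where the zpool command is not available.
    if ! command -v zpool >/dev/null 2>&1; then
        echo "zpool not found" >&2
        return 1
    fi
    # Pool-wide size, allocated and free space, plus the dedup ratio.
    zpool list -o name,size,allocated,free,dedupratio "$pool"
    # The same ratio read as a single property.
    zpool get dedupratio "$pool"
}
```

It would be run on the OI host as `show_dedup tdd`.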


So, the result is very disappointing.


I guess I'm really missing the boat about how dedup is supposed to
work.

First, thanks for slogging thru my poor writing this far.


Can anyone explain where I'm going wrong?



