[OpenIndiana-discuss] Missing the boat on using dedupe

Judah Richardson judahrichardson at gmail.com
Thu Jun 9 14:35:09 UTC 2022


FWIW, deduplication is recommended only when the pool's estimated dedup
ratio is >= 2 (more info at
<https://docs.oracle.com/cd/E37838_01/html/E61017/gazss.html#SVZFSgjhav>).
The documentation implies that enabling dedup below that threshold creates
more problems than it solves.
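
As a rough sketch of applying that threshold (the helper name
dedup_worthwhile is made up for illustration, and the pool names in the
comments are taken from the quoted message below): zdb -S prints a simulated
dedup ratio for a pool that does not have dedup enabled, and zpool reports the
achieved ratio as the dedupratio property, in a form such as "1.00x".

```shell
# Hypothetical helper: succeed when a ratio such as "1.00x" is at least 2.
dedup_worthwhile() {
    ratio=${1%x}    # strip the trailing "x" that zpool prints
    awk -v r="$ratio" 'BEGIN { exit !(r + 0 >= 2) }'
}

# On a live system the ratio could come from (not run here):
#   zdb -S p2                                 # simulated ratio, dedup still off
#   zpool get -H -o value dedupratio tdd      # achieved ratio on a dedup pool
if dedup_worthwhile "1.00x"; then
    echo "dedup likely pays off"
else
    echo "dedup likely not worth the cost"
fi
```

A ratio of 1.00x means no blocks were shared at all, which would be
consistent with a received dataset that did not shrink.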

On Thu, Jun 9, 2022 at 7:11 AM hput via openindiana-discuss <
openindiana-discuss at openindiana.org> wrote:

> First, I made a mess of trying to merge three slightly different
> versions of a trove of images: ZFS filesystem directories full of images
> and file structures on 2 different zfs-linux hosts, plus 1 image trove on
> a Windows 10 host.
>
> I'm sorry that what follows is a bit hard to follow.
>
> I started by rsyncing one zfs-linux host's Images directory into a very
> similar Images directory on a different zfs-linux host.
>
> Then I rsynced a similar Images directory from Windows 10 into the result
> of that first merge.
>
> Adding the Windows directory to the mix seems to have doubled, or nearly
> doubled, the size of the 3-way merged Images directory.
>
> I didn't use any trick flags with rsync in any of the cases, just:
>
>   rsync -vvrptgoD --stats $linux_zfs_HOST1_Images/ \
>                           $linux_zfs_HOST2_Images/
>
> In that merge there wasn't much difference in final size.
>
> By the way, both zfs-linux hosts use compression=lz4.
>
>   Result of first merge:
>   USED  AVAIL     REFER
>   284G  2.35T      284G
>
>   And finally:
>
>   rsync -vvrptgoD --stats $Win_10_HOST_Images/ \
>                           $linux_zfs_HOST2_Images/
>
>   In that final merge from the Windows 10 host to zfs-linux HOST2, the
>   size has doubled.
>
>     USED  AVAIL     REFER
>     569G  2.57T      569G
>
> OK, so I thought I'd experiment with dedup on an OI VirtualBox VM and ship
> the enlarged pool onto a dedup-enabled zpool and zfs fs.
>
> On OI, I created a pool from one virtual disk (named tdd, for "test
> dedup") and set:
>
>   zfs set dedup=on tdd
>
> Then I created the filesystem and set dedup on it as well:
>
>   zfs create -o mountpoint=/Images tdd/Images
>   zfs set dedup=on tdd/Images
>
> By creating a dedup-enabled pool and zfs fs, I was trying to use zfs
> dedup to fix my mess and remove duplicates.
>
> As the final move, on the second host (zfs-linux HOST2), I ran:
>
>   zfs send -v p2/Images@$SNAP | pv | ssh oi zfs recv -v -F tdd/Images
>
> (again a bit simplified)
>
> I expected to see something like a halving of the size when the
> zfs send completed.
>
> Starting size on zfs-linux-HOST2
>
>             USED  AVAIL     REFER
> p2/Images   569G  2.57T      569G  /Images
>
> (simplified above)
>
> Result on the deduped pool and zfs fs on OI (newly updated):
>
> NAME                                USED  AVAIL  REFER  MOUNTPOINT
> tdd/Images                          568G   382G   568G  /Images
>
>
> So, the result is very disappointing.
>
>
> I guess I'm really missing the boat about how dedup is supposed to
> work.
>
> First, thanks for slogging through my poor writing this far.
>
>
> Can anyone explain where I'm going wrong?
>
>
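
On the merge described above: ZFS dedup matches identical blocks, not
files, so before relying on it, it may help to check whether the merged tree
really contains byte-identical files. A minimal sketch, assuming sha256sum
from GNU coreutils is installed and that no file name contains a newline;
find_duplicate_files is a hypothetical helper name, not an existing tool.

```shell
# Hypothetical helper: print every file under "$1" whose SHA-256 digest
# is shared with at least one other file, grouped by digest.
find_duplicate_files() {
    find "$1" -type f -exec sha256sum {} + |
        sort |
        awk '{
            hash = $1
            if (hash == prev) {
                # First repeat of a digest: also print the line before it.
                if (!shown) print prev_line
                print $0
                shown = 1
            } else {
                shown = 0
            }
            prev = hash
            prev_line = $0
        }'
}

# Example call (path taken from the quoted message, not run here):
#   find_duplicate_files /Images
```

If this prints few or no pairs, the files that look the same differ at the
byte level, and block-level dedup would have little to share.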

