[OpenIndiana-discuss] Inefficient zvol space usage on 4k drives
Steve Gonczi
gonczi at comcast.net
Thu Aug 8 01:21:32 UTC 2013
Hi Jim,
This looks to me more like a rounding-up problem, especially looking at the
bug report quoted. The waste factor increases as the block size goes
down, which roughly fits the ratio of a block's nominal size to
its minimal on-disk footprint.
For example, compressed blocks are variable size.
If a block compresses to some small but non-zero size, it still takes
up the size of the smallest on-disk allocation unit. For an 8K
block, the smallest non-zero allocation could be 4K (vs. 512 bytes).
A similar thing happens to small files that take up less than a single block's
worth of bytes. ZFS shrinks the block size for these to closely match the
actual bytes stored.
A one-byte file would take up merely a single sector, so for small files
a 512-byte vs. 4K minimum allocation can make a big difference.
If most of the blocks are compressed, or there are a lot of small files,
the 8K-vs-512 or the 8K-vs-4K ratio pretty much predicts a doubling of
the on-disk footprint at an 8K block size.
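The rounding-up effect described above can be sketched in a few lines. This is a back-of-the-envelope model, not actual ZFS allocator code; the 600-byte compressed size is a made-up example, and the allocation units correspond to 512-byte (ashift=9) and 4K (ashift=12) sectors.

```python
def on_disk_size(compressed_bytes, alloc_unit):
    """Round a block's compressed size up to the smallest allocation unit."""
    if compressed_bytes == 0:
        return 0
    # Ceiling division, then scale back up to whole allocation units.
    return -(-compressed_bytes // alloc_unit) * alloc_unit

# A hypothetical 8K block that compresses to 600 bytes:
print(on_disk_size(600, 512))    # 1024 bytes on a 512-byte-sector pool
print(on_disk_size(600, 4096))   # 4096 bytes on a 4K-sector pool
```

For this block the 4K pool uses 4x the space of the 512-byte pool, which is the kind of ratio-driven waste the paragraph above describes.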
I do not see how the sector size could cause a similarly significant
increase in the on-disk footprint by making metadata storage inefficient.
I presume when you are talking about metadata, you mean
the interior nodes (level > 0) of files.
If a file is <= 3 blocks in size, it will not have any interior nodes.
Otherwise, the nodes are allocated one page at a time, as many as needed.
Metadata pages currently contain 128 block pointer structs (128*128 bytes == 16K)
This interior node page size is independent of the file system's user-changeable
block size.
I do not believe that these pages are variable size.
So a rough guesstimate would be: one 16K metadata page for every 128 blocks
in the file.
(Technically, there could be multiple levels of interior node pages, but the 128x fanout
is so aggressive that you can neglect those for an order-of-magnitude
rough guess)
On the average, metadata takes up less than 2% of the space needed by user payload.
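The under-2% figure falls out of the numbers above: one 16K page of 128 block pointers (128 * 128 bytes) covers 128 data blocks. A quick sanity-check sketch, ignoring deeper indirection levels as suggested:

```python
BP_SIZE = 128            # bytes per block pointer struct
FANOUT = 128             # block pointers per metadata page
PAGE = BP_SIZE * FANOUT  # 16K metadata page

def metadata_fraction(block_size):
    """Approximate level-1 metadata bytes per byte of user payload."""
    return PAGE / (FANOUT * block_size)

print(metadata_fraction(8192))    # ~0.0156, i.e. about 1.6% at 8K blocks
print(metadata_fraction(131072))  # ~0.001 at 128K blocks
```

At an 8K block size the level-1 overhead is about 1.6%, consistent with the "less than 2%" estimate; larger block sizes shrink it further.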
I am planning on playing with 4K sectors to try to repeat the experiment mentioned;
I am curious what the performance and space usage implications are when
file size and compression are taken into consideration.
Steve
----- Original Message -----
Yes, I've had similar results on my rig and complained some time ago...
yet the ZFS world moves forward with desiring ashift=12 as the default
(and it may be inevitable ultimately). I think the main problem is that
small userdata blocks involve a larger portion of metadata, which may
come in small blocks which don't fully cover a sector (supposedly they
should be aggregated into up-to-16K clusters, but evidently they are not
always).