[OpenIndiana-discuss] ZFS improvements.

taemun taemun at gmail.com
Tue Nov 23 15:55:11 UTC 2010


Maurilio,

For the use case where one is using the standard (much larger) 128KB block
size, the metadata blocks are comparatively small, so the wasted space is a
much smaller fraction of the total.

My problem with ashift=9 on 4KB-sectored drives is that a 512B write incurs
an enormous latency hit, because the drive is physically reading 4KB,
modifying 512B of it, and writing the 4KB back. That read-modify-write cycle
also opens a window for an error to be made, silently corrupting your data.
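
(For anyone checking an existing pool, a quick way to see the ashift it was
created with, assuming a pool named "tank", is something along the lines of:

    zdb -C tank | grep ashift

which should report 9 for 512B-aligned vdevs and 12 for 4KB-aligned ones.)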

I was seeing around 20 IOPS for random 512B writes with the WD15EARS I had
on hand. I don't believe the drive was defective (or rather, I don't believe
WD would call it defective, beyond their pandering to Windows XP).
50 ms of random write latency is atrocious.
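
If anyone wants to reproduce that kind of number, a crude sketch along these
lines will do (this is not the tool I used; the device path is a placeholder
and the test is destructive, so point it at a scratch disk only):

    #!/bin/ksh
    # Time a batch of random 512B writes through the raw device.  The
    # drive's own write cache can mask the effect, so treat the result
    # as a rough guide only.
    dev=/dev/rdsk/c4t0d0s0     # placeholder scratch disk
    n=100
    i=0
    SECONDS=0
    while (( i < n )); do
        # oseek is in units of bs, so this scatters writes over ~1GB
        dd if=/dev/zero of=$dev bs=512 count=1 oseek=$((RANDOM * 64)) \
            >/dev/null 2>&1
        (( i += 1 ))
    done
    print "$n writes in $SECONDS seconds"

Anything much slower than a few milliseconds per write on an idle drive is
the read-modify-write penalty showing up.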

Thanks for the link. I'm aware of it (and have just used that binary to
create a new pool with 9x Seagate 2TB LP + 9x Samsung 2TB HD204UI). Living
dangerously :)

On 24 November 2010 02:39, Maurilio Longo <maurilio.longo at libero.it> wrote:

> taemun,
>
> I've run some tests with WD10EARS disks (1TB drives with 4KB sectors
> reported as 512B). I used four of them in a raidz1 zpool to keep a copy of
> around 20 sparse zvols I'm exporting through COMSTAR.
>
> To make a long story short: for sparse zvols (I don't know what happens
> with non-sparse zvols) using an 8KB block size, moving them from a pool on
> real 512B-sector disks to one on 4KB-sector disks wastes around 30% of the
> space.
>
> I suspect that for every 8KB block of data in a zvol, ZFS adds a sector of
> 'administrative' (metadata) data, which on a 4KB-sector disk is mostly
> wasted space (compression does not help much here, since an 8KB block would
> have to compress down to 4KB or less to give any saving).
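>
> A rough way to put numbers on this (the dataset names below are made up):
>
>   zfs get -p volsize,used,referenced pool512/somezvol
>   zfs get -p volsize,used,referenced pool4k/somezvol
>
> Comparing "used" for the same zvol on the two pools shows the ~30%
> difference directly.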
>
> So I'm going to recreate the pool with the standard zpool command, even if
> it is going to be a little slower, and keep using real 512B disks for the
> time being.
>
> Best regards.
>
> Maurilio.
>
> PS: here you can find a zpool command compiled with ashift=12, which works
> on OpenIndiana:
>
>
> http://digitaldj.net/2010/11/03/zfs-zpool-v28-openindiana-b147-4k-drives-and-you/
>
> PPS: Even a compressible ZFS filesystem (the source tree of snv release
> 143) shows around 30% of wasted space when moved to a 4KB zpool.
>
>
> taemun wrote:
> > If we're making a wishlist, something more immediate (and surely easy to
> > implement) would be a command-line option for zpool create to change the
> > ashift value. This would allow the use of "Advanced Format" drives, which
> > have a 4KB internal sector size but (incorrectly, for Windows XP's sake)
> > present themselves as having 512B sectors.
> >
> > e.g.
> > zpool create -o ashift=12 tank c1t0d0 c1t1d0
> >
> > This would provide an immediate benefit to anyone wanting to use modern,
> > cheap hard drives, where the user is savvy enough to know what the system
> > can't detect.
> >
> > A more advanced option would be to time read/write performance at
> > increasing block sizes (and offsets, if needed) in order to identify the
> > physical sector size (and its alignment). But a purely manual option is a
> > good start.
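> >
> > A very rough manual version of that probe (the device path is a
> > placeholder; this is destructive, so use a scratch disk only):
> >
> >     for bs in 512 1024 2048 4096 8192; do
> >         ptime dd if=/dev/zero of=/dev/rdsk/cXtYd0s0 bs=$bs count=1000
> >     done
> >
> > If the per-block cost drops sharply at 4096, the drive is very likely a
> > 4KB-sector unit behind 512B emulation (though the drive's own write cache
> > can blur the difference for sequential writes like these).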
>
> --
>  __________
> |  |  | |__| Maurilio Longo
> |_|_|_|____| farmaconsult s.r.l.
>
>
>
> _______________________________________________
> OpenIndiana-discuss mailing list
> OpenIndiana-discuss at openindiana.org
> http://openindiana.org/mailman/listinfo/openindiana-discuss
>

