[OpenIndiana-discuss] trying to get 4K aligned root pool on ESX

Jim Klimov jimklimov at cos.ru
Mon Aug 19 12:18:26 UTC 2013


On 2013-08-19 13:02, Edward Ned Harvey (openindiana) wrote:
>> From: Steve Goldthorpe [mailto:OpenIndiana at waistcoat.org.uk]
>> Sent: Sunday, August 18, 2013 12:23 PM
>>
>> No matter what I try I can't seem to get a 4K aligned root pool using the
>> OpenIndiana installer (oi151-a7 live image).
>> I'm using ESXi using 4K aligned disks.
>
> You shouldn't put ESX volumes in rpool.  Create a separate pool - if necessary first partition your disk - and put the volumes into the second pool.
>
> If you partition your disk, you should be aware that zfs only enables the disk hardware write-back cache for systems where zfs has control of the whole disk.  So you might have to script something with "format" to enable the disk cache on those devices.

Ned, while your point is valid, I read the OP's problem as the opposite:
he is trying to create a VMware ESX VM with OI inside, and this
VM's rpool is misaligned even if created with ashift=12.

The default layout for rpool involves a standard MBR partition table,
with a "p1" partition starting at sector #63 (in 512-byte sectors).
Inside, this partition is sliced with a Solaris SMI label - a table of
"slices". The zeroth "cylinder" (16065 512-byte sectors) is reserved
on x86 for "boot", and the slice0 used for rpool starts at cylinder #1,
so its offset from the start of the disk is 16065+63 sectors.
That is 16128 sectors - a value divisible by 8, so slice0 should
be aligned with 4KB sectors.
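
To make the arithmetic above easy to re-check, here is a minimal Python
sketch. The function and constant names are mine, purely for
illustration; the numbers (p1 at sector 63, a 16065-sector cylinder,
512-byte sectors) are the ones from the paragraph above. It only checks
the logical offset and knows nothing about how ESX or the underlying
storage maps those sectors.

```python
# Sketch: is the default rpool slice0 offset a multiple of 4KB?
# Names are hypothetical; the values come from the layout described above.

SECTOR_BYTES = 512          # logical sector size
MBR_P1_START = 63           # first MBR partition starts at sector #63
CYLINDER_SECTORS = 16065    # one "cylinder" (255 heads * 63 sectors)


def slice0_offset_sectors(p1_start=MBR_P1_START, cylinder=CYLINDER_SECTORS):
    """Offset of slice0 from the start of the disk, in 512-byte sectors.

    Cylinder 0 inside p1 is reserved for boot, so slice0 begins one
    cylinder after the start of p1.
    """
    return p1_start + cylinder


def is_aligned(offset_sectors, block_bytes=4096, sector_bytes=SECTOR_BYTES):
    """True if the byte offset is a whole number of block_bytes blocks."""
    return (offset_sectors * sector_bytes) % block_bytes == 0


if __name__ == "__main__":
    offset = slice0_offset_sectors()
    print("slice0 starts at sector", offset)
    print("4KB aligned:", is_aligned(offset))
```

Run as-is, this reports sector 16128 and an aligned result, matching
the reasoning above; passing a different p1_start shows how quickly
the alignment is lost.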

Just a few weeks ago I contemplated such misalignment on the mailing
lists, but this was a theoretical construct with no real way for me
to verify it without reinstallation. I can point Steve to that discussion
and kindly ask him to write back here if he tests any of those ideas,
and to tell us how it went. I think some DTrace scripts that float on
that list (from Richard Elling mostly, as well as some published on his
site) can help estimate latencies for different I/O offset moduli
and thus detect if there is an alignment problem.

That other discussion, about SSDs, involved larger alignment blocks like
512KB, which possibly required some magic with gparted to create the
MBR partition at an exact offset so that slice0 would be megabyte-aligned
or so. There could also be a need for "magic" on some drives
which lie in their LBA addressing and hide an extra 512-byte sector, so
that for legacy OSes a first partition starting at logical sector #63
is physically mapped to an offset of 64*512 bytes, which
is a whole number of hardware 4KB sectors - good for Win/Lin, bad for
Solaris SMI slice0 at cylinder 1.
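
Both cases in the paragraph above can be explored with a small Python
sketch. Everything here is an assumption for illustration: the helper
names are mine, the "hidden sector" is modeled as a simple +1 shift of
every logical sector, and the cylinder size is assumed to stay at 16065
sectors whatever the partition start. It is a way to play with the
numbers, not a description of what any particular drive or gparted
actually does.

```python
# Sketch: alignment with a hidden-sector shift and with larger blocks.
# All names and the +1 shift model are hypothetical.

SECTOR_BYTES = 512
CYLINDER_SECTORS = 16065    # assumed unchanged by the partition start


def physical_offset_bytes(logical_sector, hidden_sectors=0):
    """Byte offset on the media if the drive hides some leading sectors."""
    return (logical_sector + hidden_sectors) * SECTOR_BYTES


def is_aligned(offset_bytes, block_bytes):
    """True if offset_bytes is a whole number of block_bytes blocks."""
    return offset_bytes % block_bytes == 0


def first_p1_start_for_alignment(block_bytes, min_start=63, hidden_sectors=0):
    """Smallest p1 start sector (>= min_start) that aligns slice0.

    slice0 is assumed to begin one cylinder after the start of p1.
    """
    block_sectors = block_bytes // SECTOR_BYTES
    start = min_start
    while (start + CYLINDER_SECTORS + hidden_sectors) % block_sectors != 0:
        start += 1
    return start


if __name__ == "__main__":
    slice0 = 63 + CYLINDER_SECTORS
    # Drive that hides one sector: p1 lands on 4KB, slice0 does not.
    print(is_aligned(physical_offset_bytes(63, 1), 4096))
    print(is_aligned(physical_offset_bytes(slice0, 1), 4096))
    # Default slice0 against a 512KB block, and a p1 start that fixes it.
    print(is_aligned(physical_offset_bytes(slice0), 512 * 1024))
    print(first_p1_start_for_alignment(512 * 1024))
```

Under these assumptions the default slice0 is not 512KB-aligned, and
the search suggests moving the start of p1 rather than touching the
slice table; whether the installer accepts such a partition is a
separate question.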

That discussion's subthread started around here:
http://www.mail-archive.com/omnios-discuss@lists.omniti.com/msg00639.html

There's also a pretty good writeup from Oracle for their ZFS Storage Appliance:
http://www.oracle.com/technetwork/server-storage/sun-unified-storage/documentation/partitionalign-111512-1875560.pdf

HTH,
//Jim Klimov


