[OpenIndiana-discuss] Split-root installations
Jim Klimov
jimklimov at cos.ru
Mon Dec 2 01:06:40 UTC 2013
I've pursued the idea of a separate SMF service that takes care of
local ZFS-based split-root systems, instead of hacking into the
existing scripts (network or filesystem). All the logic that I had
earlier added to fs-root and fs-minimal has now moved into the new
script and service, fs-root-zfs. This service declares the networking
services and filesystem/root as its dependents, to ensure that it
runs before any consumers of /usr and of the other filesystems that
make up the root hierarchy. In fact, it mounts all of the ZFS
filesystems which are children of the current bootfs or of the
$rpool/SHARED dataset, and /var/run in particular, fulfilling
much of filesystem/minimal in one early blow. This should be
appreciated by NWAM in particular :)
It does check for filesystems that are listed in /etc/vfstab and
provided by non-ZFS technologies: the script should skip ZFS-mounting
onto those mountpoints and onto anything under them. Also, it only
runs when the root filesystem is ZFS.
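To illustrate the selection rules described above, here is a minimal Python sketch. It is not the fs-root-zfs code (that is a shell method script); the function names, the tuple layout and the decision strings are assumptions made up for this example. The input is assumed to look like parsed output of "zfs list -H -o name,mountpoint,canmount".

```python
def is_under(path, parent):
    """True if path equals parent or lies beneath it."""
    parent = parent.rstrip("/") or "/"
    if parent == "/":
        return True
    return path == parent or path.startswith(parent + "/")


def plan_mounts(datasets, bootfs, shared_root, non_zfs_mountpoints):
    """Decide what to do with each candidate dataset (hypothetical helper).

    datasets: list of (name, mountpoint, canmount) tuples.
    bootfs: the current root dataset, e.g. "rpool/ROOT/test".
    shared_root: the shared hierarchy, e.g. "rpool/SHARED".
    non_zfs_mountpoints: mountpoints that /etc/vfstab assigns to non-ZFS
        filesystems; nothing is ZFS-mounted at or below these.
    """
    plan = []
    for name, mountpoint, canmount in datasets:
        in_root = name.startswith(bootfs + "/")
        in_shared = name.startswith(shared_root + "/")
        if not (in_root or in_shared):
            continue  # not part of the root hierarchy, left to later services
        if any(is_under(mountpoint, m) for m in non_zfs_mountpoints):
            plan.append((name, mountpoint, "skip: under non-ZFS mountpoint"))
        elif canmount != "on":
            plan.append((name, mountpoint, "skip: canmount!=on"))
        else:
            plan.append((name, mountpoint, "mount"))
    return plan
```

Datasets outside the bootfs and SHARED hierarchies (such as rpool/export) are simply not considered here, on the assumption that the existing filesystem services mount them later.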
The large changes I proposed before to the older scripts are no
longer needed for local-ZFS split-root setups, though smaller
changes (provided in the new patches) are still required: they skip
zfs-mounting a filesystem if it has already been mounted.
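The "already mounted" test can be pictured roughly as follows. This is only a Python sketch of the idea, not the patched fs-root or fs-minimal code; the function names are hypothetical, and the input is assumed to be the text of /etc/mnttab (TAB-separated fields, with the mountpoint in the second field).

```python
def mounted_points(mnttab_text):
    """Collect the mountpoints found in mnttab-style text."""
    points = set()
    for line in mnttab_text.splitlines():
        fields = line.split("\t")
        if len(fields) >= 2:
            points.add(fields[1])
    return points


def should_zfs_mount(mountpoint, mnttab_text):
    """Return False if something is already mounted at this mountpoint."""
    return mountpoint not in mounted_points(mnttab_text)
```

With such a check in front of each "zfs mount" call, a second attempt to mount /usr or /var becomes a harmless no-op instead of an error that fails the method.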
My local tests were quite successful, so I posted the update at
the Wiki:
http://wiki.openindiana.org/oi/Advanced+-+Split-root+installation
http://wiki.openindiana.org/download/attachments/27230229/fs-root-zfs
http://wiki.openindiana.org/download/attachments/27230229/fs-root-zfs.xml
Slight but important fixes to the older fs-root and fs-minimal
scripts (as well as the whole of the new fs-root-zfs script) can
be reviewed in this patch:
http://wiki.openindiana.org/download/attachments/27230229/fs-root-zfs.patch
Below is an example of console output from booting a BE whose
mountpoint was left wrong by an untimely reset while this BE was
mounted for administrative tasks from another running BE (console
debugging was enabled with "touch /$BEMOUNT/.debug_mnt"):
OpenIndiana Build oi_151a8 64-bit (illumos 7256a34efe)
SunOS Release 5.11 - Copyright 1983-2010 Oracle and/or its affiliates.
All rights reserved. Use is subject to license terms.
Rootfs mountpoint not '/' but '/tmp/tmp.a5aGOb', trying to fix.
Fixing 'rpool/ROOT/test/opt' to use '/opt' mountpoint instead of '/tmp/tmp.a5aGOb/opt': shifted in same root hierarchy
Fixing 'rpool/ROOT/test/usr' to use '/usr' mountpoint instead of '/tmp/tmp.a5aGOb/usr': shifted in same root hierarchy
Fixing 'rpool/ROOT/test/usr/local' to use '/usr/local' mountpoint instead of '/tmp/tmp.a5aGOb/usr/local': shifted in same root hierarchy
Fixing 'rpool/ROOT/test/var' to use '/var' mountpoint instead of '/tmp/tmp.a5aGOb/var': shifted in same root hierarchy
Mounting '/usr': use 'rpool/ROOT/test/usr': in same root hierarchy
Mounting '/var': use 'rpool/ROOT/test/var': in same root hierarchy
Mounting '/var/adm': use 'rpool/SHARED/var/adm': the only option
Not ZFS-mounting '/tmp': equal or under a non-ZFS mountpoint '/tmp'
Mounting '/opt': use 'rpool/ROOT/test/opt': in same root hierarchy
Not mounting: '/usr' from 'rpool/ROOT/test/usr': something already mounted
Mounting '/usr/local': use 'rpool/ROOT/test/usr/local': in same root hierarchy
Not mounting: '/var' from 'rpool/ROOT/test/var': something already mounted
Not mounting: '/var' from 'rpool/SHARED/var': canmount!=on
Not mounting: '/var/adm' from 'rpool/SHARED/var/adm': something already mounted
Mounting '/var/cores': use 'rpool/SHARED/var/cores': in shared root hierarchy
Mounting '/var/crash': use 'rpool/SHARED/var/crash': in shared root hierarchy
Mounting '/var/log': use 'rpool/SHARED/var/log': in shared root hierarchy
Mounting '/var/mail': use 'rpool/SHARED/var/mail': in shared root hierarchy
Not mounting: '/var/spool' from 'rpool/SHARED/var/spool': canmount!=on
Mounting '/var/spool/clientmqueue': use 'rpool/SHARED/var/spool/clientmqueue': in shared root hierarchy
Mounting '/var/spool/mqueue': use 'rpool/SHARED/var/spool/mqueue': in shared root hierarchy
Mounting '/var/tmp': use 'rpool/SHARED/var/tmp': in shared root hierarchy
fs-root-zfs: completed without fatal errors
Hostname: openindiana
...
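The "Fixing ... shifted in same root hierarchy" lines above amount to stripping the stale alternate-root prefix from each child's mountpoint. A minimal Python sketch of that rewrite, assuming the stale prefix is simply the wrong mountpoint found on the rootfs dataset (the function name is hypothetical and this is not the actual shell code of fs-root-zfs):

```python
def fixed_mountpoint(mountpoint, stale_root):
    """Map a mountpoint recorded under a stale alternate root back to
    its place in the real root hierarchy; leave other paths untouched."""
    stale_root = stale_root.rstrip("/")
    if mountpoint == stale_root:
        return "/"
    if stale_root and mountpoint.startswith(stale_root + "/"):
        return mountpoint[len(stale_root):]
    return mountpoint
```

For the log above, '/tmp/tmp.a5aGOb/usr/local' with the stale root '/tmp/tmp.a5aGOb' maps to '/usr/local', while a mountpoint such as '/var/adm' that never carried the prefix is returned unchanged.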
The filesystem tree on this box is:
# df -k
Filesystem                           1K-blocks    Used Available Use% Mounted on
rpool/ROOT/test                        2065321  314150   1751171  16% /
swap                                   1347972    1112   1346860   1% /etc/svc/volatile
rpool/ROOT/test/usr                    2576515  825344   1751171  33% /usr
rpool/ROOT/test/var                    1809675   58504   1751171   4% /var
rpool/SHARED/var/adm                   1751283     112   1751171   1% /var/adm
swap                                   1346912      52   1346860   1% /var/run
rpool/ROOT/test/opt                    1752589    1418   1751171   1% /opt
rpool/ROOT/test/usr/local              1751202      31   1751171   1% /usr/local
rpool/SHARED/var/cores                 1762117   10946   1751171   1% /var/cores
rpool/SHARED/var/crash                 1751202      31   1751171   1% /var/crash
rpool/SHARED/var/log                   1751229      58   1751171   1% /var/log
rpool/SHARED/var/mail                  1751203      32   1751171   1% /var/mail
rpool/SHARED/var/spool/clientmqueue    1751203      32   1751171   1% /var/spool/clientmqueue
rpool/SHARED/var/spool/mqueue          1751202      31   1751171   1% /var/spool/mqueue
rpool/SHARED/var/tmp                   1751223      52   1751171   1% /var/tmp
/usr/lib/libc/libc_hwcap2.so.1         2576515  825344   1751171  33% /lib/libc.so.1
swap                                   1346868       8   1346860   1% /tmp
rpool/export                           1751203      32   1751171   1% /export
rpool/export/home                      1751203      32   1751171   1% /export/home
rpool/export/home/admin                1751846     675   1751171   1% /export/home/admin
rpool                                  1751217      46   1751171   1% /rpool
/export/home/admin                     1751846     675   1751171   1% /home/admin
Hope for comments,
//Jim Klimov
On 2013-11-30 10:25, Jim Klimov wrote:
> 2) I think one more valid approach to unroll these dependencies
> via SMF in a packageable manner has emerged, and a rather apparent
> one: to move (or duplicate, or invoke) the code from fs-root which
> mounts a zfs-based /usr filesystem into a service of its own, on
> which consumers of the /usr namespace would depend (optional_all).
>
> At start this service would check if current root is zfs, and
> if a child dataset or legacy-mounted ZFS /usr are known and
> available - it would mount the dataset if yes. Otherwise it
> would exit without an error. As a result, the networking
> scripts in my split-zfs-based-root case would be guaranteed
> to have a /usr before they run.
>
> It would (should) have no impact on systems that use monoroots
> on ZFS, or that use other roots (networked, metadevice, etc.) -
> these would work or fail the same as they do today.
>
> 3) Similarly, such a service can mount ZFS-based datasets of
> the rest of the root hierarchy if available (/var, children
> of bootfs, SHARED/*) and as a result of this, even the NWAM
> method on systems with local storage would have a complete
> environment to work in (for its LDAP/NIS interaction), all
> without major rehaul of SMF dependencies and method code.
>
> But in this extended case there is a possible though improbable
> loophole: if some parts of the operating environment including
> the rootfs are mounted from ZFS, but some major components
> like /var work from nfs/cachefs/ufs/... and then some datasets
> like /var/adm would be mounted on top of that. A script that
> only mounts a ZFS hierarchy in order to avoid dependencies
> on networking and metadevices would apparently ignore these
> other options; at most it can detect them in /etc/vfstab and
> stop mounting stuff under the involved mountpoint (this would
> come in later via filesystem service chains that exist today).
>
> And the current filesystem service methods should need to check
> that they don't mount the same (zfs) filesystem twice, so as
> to not bail out on "zfs mount" errors due to this.