[OpenIndiana-discuss] ZFS; what the manuals don't say ...

Jim Klimov jimklimov at cos.ru
Thu Oct 25 08:51:34 UTC 2012


2012-10-24 23:58, Timothy Coalson wrote:
 > I doubt I would like the outcome of having
 > some software make arbitrary decisions about which real filesystem
 > to put each file on, and then having one filesystem fail, so if you
 > really expect this, you may be happier keeping the two pools separate
 > and deciding where to put stuff yourself (since if you are expecting
 > a set of disks to fail, I expect you would have some idea as to which
 > ones it would be, for instance an external enclosure).

This sounds, to an extent, similar to (and doable with) hierarchical
storage management, such as Sun's SAMFS/QFS solution. Essentially,
this is a (virtual) filesystem where you set up storage rules based
on last access times and frequencies, data types, etc., and where
you have many tiers of storage (ranging from fast, small and
expensive to slow, bulky and cheap), such as SSD arrays - 15K SAS
arrays - 7.2K SATA - tape.

New incoming data ends up on the fast tier. Old stale data lives on
tapes. Data that is used now and then migrates between tiers. The
rules you define for the HSM system regulate how many copies you'd
store on which tier, so the loss of some devices should not be
fatal; they also govern cleaning up space on the faster tier to
receive new data or to cache old data requested by users and fetched
from slower tiers.
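
Purely for illustration, here is a minimal sketch of the kind of
placement rule such a system evaluates. It is plain Python with
invented tier names, age thresholds and copy counts; it is not
actual SAMFS/QFS policy syntax:

#!/usr/bin/env python3
"""Hypothetical HSM-style placement rule: pick a tier (and a copy
count) for a file from its last-access age. Tier names, thresholds
and copy counts are invented for illustration."""
import os
import sys
import time

# Ordered fast/expensive -> slow/cheap; ages are purely illustrative.
TIERS = [
    {"name": "ssd",     "max_age_days": 7,    "copies": 1},
    {"name": "sas15k",  "max_age_days": 30,   "copies": 1},
    {"name": "sata7k2", "max_age_days": 365,  "copies": 2},
    {"name": "tape",    "max_age_days": None, "copies": 2},  # catch-all
]

def place(path):
    """Return (tier_name, copies) based on the file's atime."""
    age_days = (time.time() - os.stat(path).st_atime) / 86400
    for tier in TIERS:
        if tier["max_age_days"] is None or age_days <= tier["max_age_days"]:
            return tier["name"], tier["copies"]

if __name__ == "__main__":
    for f in sys.argv[1:]:
        tier, copies = place(f)
        print(f, tier, copies)

A real HSM daemon would of course also weigh data type, free space
on each tier and access frequency, not just the last-access age.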


I did propose adding some HSM-type capabilities to ZFS, mostly with
the goal of power-saving on home-NAS machines, so that the box could
live with a couple of active disks (i.e. rpool and the "active-data"
part of the data pool) while most of the data pool's disks remain
spun down. Whenever a user reads some data from the pool (watching
a movie, listening to music or processing photos), the system would
prefetch the data (perhaps a folder with MP3s) onto the cache disks
and let the big ones spin down - with a home NAS and few users it is
likely that if you're watching a movie, your system is otherwise
unused for a couple of hours.
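
Just as a sketch of that prefetch idea (the cache path is an
assumption, and actually managing disk power is left out entirely),
a trivial helper could pull one folder onto the always-on disks
before playback:

#!/usr/bin/env python3
"""Hypothetical 'prefetch, then let the big disks idle' helper:
copy one folder (say, an album of MP3s) onto a small always-on
cache dataset. The /cache path is an assumption; spinning the main
pool's disks down afterwards is not handled here."""
import os
import shutil
import sys

CACHE_ROOT = "/cache/prefetch"   # dataset living on the always-on disks

def prefetch(folder):
    dest = os.path.join(CACHE_ROOT, os.path.basename(folder.rstrip("/")))
    shutil.copytree(folder, dest, dirs_exist_ok=True)
    return dest

if __name__ == "__main__":
    print("cached copy at:", prefetch(sys.argv[1]))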

Likewise, and this happens to be the trickier part, new writes to the
data pool should go to the active disks and occasionally sync to and
spread over the main pool's disks.
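
Again only as a hedged sketch (the paths, the interval and the
choice of rsync are all assumptions, not an existing tool), the
staging side could be as dumb as a periodic one-way sync from the
always-on disks onto the big pool:

#!/usr/bin/env python3
"""Hypothetical write-staging loop: new data lands on a small
'active' dataset on the always-on disks, and every few hours it is
pushed onto the large, normally idle pool. Paths, interval and the
use of rsync are assumptions for illustration."""
import subprocess
import time

ACTIVE = "/tank-active/data/"   # small staging area, always spinning
MAIN = "/tank/data/"            # big pool, usually allowed to idle
INTERVAL = 4 * 3600             # seconds between syncs

def sync_once():
    # Wakes the main pool, pushes accumulated writes, then lets it idle.
    subprocess.run(["rsync", "-a", ACTIVE, MAIN], check=True)

if __name__ == "__main__":
    while True:
        sync_once()
        time.sleep(INTERVAL)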

I hoped this could all be done transparently to users within ZFS,
but the overall discussion led to the conclusion that it would be
better done not within ZFS itself, but with some daemons (perhaps a
dtrace-abusing script) handling the data migration and abstraction
(the transparency to users). Besides, with the introduction of and
advances in the generic L2ARC, and with the possibility of
file-level prefetch, much of that discussion became moot ;)

Hope this small historical insight helps you :)
//Jim Klimov



