[OpenIndiana-discuss] ZFS; what the manuals don't say ...

Tue Oct 23 13:41:31 UTC 2012

On 10/23/2012 8:29 AM, Robin Axelsson wrote:
> Hi,
> I've been using zfs for a while but still there are some questions 
> that have remained unanswered even after reading the documentation so 
> I thought I would ask them here.
>
> I have learned that zfs datasets can be expanded by adding vdevs. Say 
> that you have created a raidz3 pool named "mypool" with the command
> # zpool create mypool raidz3 disk1 disk2 disk3 ... disk8
>
> you can expand the capacity by adding vdevs to it through the command
>
> # zpool add mypool raidz3 disk9 disk10 ... disk16
>
> The vdev that is added doesn't need to have the same raid/mirror 
> configuration or disk geometry, if I understand correctly. It will 
> merely be dynamically concatenated with the old storage pool. The 
> documentation says that it will be "striped" but it is not so clear 
> what that means if data is already stored in the old vdevs of the pool.
>
> Unanswered questions:
>
> * What determines _where_ the data will be stored on such a pool? 
> Will it fill up the old vdev(s) before moving on to the new one or 
> will the data be distributed evenly?
> * If the old pool is almost full, an even distribution will be 
> impossible, unless zpool rearranges/relocates data upon adding the 
> vdev. Is that what will happen upon adding a vdev?
> * Can the individual vdevs be read independently/separately? If say 
> the newly added vdev faults, will the entire pool be unreadable or 
> will I still be able to access the old data? What if I took a snapshot 
> before adding the new vdev?
>
> * Can several datasets be mounted to the same mount point, i.e. can 
> multiple "file system"-datasets be mounted so that they (the root of 
> them) are all accessed from exactly the same (POSIX) path and 
> subdirectories with coinciding names will be merged? The purpose of 
> this would be to seamlessly expand storage capacity this way just like 
> when adding vdevs to a pool.
> * If that's the case how will the data be distributed/allocated over 
> the datasets if I copy a data file to that path?
>
> Kind regards
> Robin.

*) yes, you can dynamically add more disks and zfs will just start using 
them.
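*) zfs stripes new writes across all vdevs as evenly as it can. Existing 
data is not moved when a vdev is added.
*) as your old vdev gets full, zfs will favor the newer, less full vdev 
when allocating blocks; once the old vdev is full, new blocks go only to 
the new one.
*) since it's a stripe across vdevs (and they should all be raidz2 or 
better!), if one vdev fails the whole pool, and every filesystem in it, 
will be unavailable. A snapshot does not help, because it lives in the 
same pool. Vdevs are not independent unless you put them in separate 
pools.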
*) you cannot have overlapping/mixed filesystems at exactly the same 
place; however, it is perfectly possible to have e.g. /export be on 
rootpool, /export/mystuff on zpool1 and /export/mystuff/morestuff be on 
zpool2.
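
The nested layout in the last point could be set up roughly as follows. 
This is only a sketch: the dataset names (rootpool/export, 
zpool1/mystuff, zpool2/morestuff) are made up for illustration, and on a 
real system /export may already exist as a dataset. By default the 
script only prints the zfs commands; the ZFS variable is a convention of 
this sketch, not something zfs itself reads.

```shell
#!/bin/sh
# Dry run by default: prints the zfs commands instead of running them.
# Set ZFS=zfs (as root, on a test system) to actually execute them.
ZFS="${ZFS:-echo zfs}"

# One dataset per pool, each mounted below the previous one.
# Dataset names are hypothetical; pool names and paths are from the text.
layout() {
    $ZFS create -o mountpoint=/export rootpool/export
    $ZFS create -o mountpoint=/export/mystuff zpool1/mystuff
    $ZFS create -o mountpoint=/export/mystuff/morestuff zpool2/morestuff
}

layout
```

Each path is still served by exactly one dataset, so a file copied to 
/export/mystuff lands on zpool1 and nowhere else; nothing is merged.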

The unasked question is "If I wanted the vdevs to be equally balanced, 
could I?". The answer is a qualified yes. What you would need to do is 
reopen every single file, buffer it to memory, then write every block 
out again, so that zfs allocates fresh blocks across all current vdevs. 
We did this operation once. It means that all vdevs will have roughly 
the same block allocation when you are done.
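
One way to do such a rewrite is sketched below. It is not necessarily 
the procedure we used: it copies each file to a temporary name in the 
same directory and renames the copy over the original. The function 
names and the ".rebalance" suffix are made up for this example. Some 
caveats: hard links are broken because each file gets a new inode, 
snapshots keep the old blocks so no space is freed until they are 
destroyed, files must not be in use while this runs, and file names 
containing newlines are not handled.

```shell
#!/bin/sh
# Rewrite one file so the filesystem allocates fresh blocks for it:
# copy to a temporary name in the same directory, verify the copy,
# then rename it over the original.
rewrite_file() {
    f=$1
    tmp="$f.rebalance.$$"          # hypothetical temp-name convention
    cp -p "$f" "$tmp" || { rm -f "$tmp"; return 1; }
    cmp -s "$f" "$tmp" || { rm -f "$tmp"; return 1; }
    mv -f "$tmp" "$f"
}

# Rewrite every regular file under a directory tree, skipping our own
# temporary files in case find sees them mid-run.
rewrite_tree() {
    find "$1" -type f ! -name '*.rebalance.*' | while IFS= read -r f; do
        rewrite_file "$f" || echo "skipped: $f" >&2
    done
}
```

Usage would be something like "rewrite_tree /export/mystuff", ideally on 
a test copy first. Comparing the per-vdev numbers from 
"zpool iostat -v mypool" before and after should show whether the 
allocation has evened out.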