[OpenIndiana-discuss] Replacing both disks in a mirror set

Martin Bochnig martin at martux.org
Mon Oct 8 23:07:44 UTC 2012


First, a reminder: never ever detach a disk before you have a third
disk that has already completed resilvering.
The term "detach" is misleading, because it detaches the disk from the
pool. Afterwards you cannot access the disk's previous contents
anymore. Your "detached" half of a mirror can neither be imported nor
mounted, and it cannot even be rescued (unlike a disk from a "zpool
destroy"ed pool). If I ever mentally recover from a 2TB (or 3 years!)
data loss caused by zfs encryption, then I may offer an implementation
with less ambiguous naming to Illumos.

"zpool detach" suggests that you could still use this disk as a
reserve backup copy of the pool you were detaching it from. And that
you could simply "zpool attach" it again, in case the other disk would

Unfortunately, this is not the case.
Well, you can of course attach it again, like any new or empty disk.
But only if you still have enough replicas, and that's not what
one wanted if one fell into this trap.
And there are no warnings about this in the zpool/zfs man pages.

What you want:

zpool replace <poolname> <vdev to be replaced> <new vdev>

But last weekend I lost 7 years of trust that I had in ZFS,
because an encrypted and gzip-9 compressed mirror on Oracle Solaris
11/11 x86 cannot be accessed anymore after VirtualBox forced
me to remove power from the host machine.
Since then a 1:1 mirror of 2TB disks cannot be mounted anymore. It
always ends in a kernel panic due to a pf in

The problem is that scrub doesn't find an error, and so has nothing
to auto-repair.
Even zpool attach successfully completes the resilver, but the newly
resilvered disk contains the same error. Be aware that ZFS is not free
of bugs.
If it stays like that (I contacted some folks for help), then my trust
in ZFS has destroyed, VAPORIZED 3 years of my work and life.

So, back to your question: To be as cautious as possible, what I would
do in your case:

0.)  zpool offline <poolname> <vdev you want to replace>

1.) Physically remove this disc (important, because I have seen cases
where zfs forgets after a reboot that you offlined a vdev)

2.) AFTER (!IMPORTANT!) you have physically disconnected the disc to be
replaced, "zpool detach" it, or alternatively use "zpool replace
<poolname> <oldvdev_that_you_disconnected_BEFOREinordertokeepitasafailsafebackup!> <newvdev>"

3.) Depending on whether you did detach or replace in step 2.), run
"zpool attach <poolname> <Firstvdevofthispool> <newvdev>", or omit this
step if you used "zpool replace" in step 2.)
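
The steps above can be written down as a small dry-run helper. This is
only a sketch: it prints the commands instead of running them, so the
order can be checked before a disk is touched. The function name
plan_swap and any pool or disk names passed to it are made up for
illustration. The zpool subcommands themselves (offline, detach,
replace, attach) are the ones named in the steps.

```shell
#!/bin/sh
# Dry-run sketch of steps 0.) to 3.) above: prints commands, runs nothing.
# plan_swap and all pool/disk names used with it are hypothetical.

# Usage: plan_swap MODE POOL OLDVDEV NEWVDEV [KEEPVDEV]
#   MODE     "replace" or "detach" (the two alternatives of step 2)
#   KEEPVDEV the mirror half that stays in the pool; required for "detach",
#            because step 3 then attaches NEWVDEV next to it
plan_swap() {
    mode=${1:-}
    pool=${2:-}
    old=${3:-}
    new=${4:-}
    keep=${5:-}

    # Validate before printing anything, so a bad call produces no partial plan.
    case $mode in
        replace) ;;
        detach)
            if [ -z "$keep" ]; then
                echo "plan_swap: detach mode needs KEEPVDEV" >&2
                return 1
            fi
            ;;
        *)
            echo "plan_swap: MODE must be 'replace' or 'detach'" >&2
            return 1
            ;;
    esac

    echo "zpool offline $pool $old"                                   # step 0
    echo "# physically disconnect $old now, before the next command"  # step 1
    if [ "$mode" = replace ]; then
        echo "zpool replace $pool $old $new"                          # step 2
    else
        echo "zpool detach $pool $old"                                # step 2
        echo "zpool attach $pool $keep $new"                          # step 3
    fi
}
```

For example, "plan_swap replace tank c1t5d0 c1t6d0" prints the offline
command, the reminder to pull the disc, and then the replace command, in
that order. Nothing is executed, so the output can be reviewed and typed
in by hand.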

What I do from now on: For each 1:1 mirror that I have I will take a
third disk, resilver it, offline and physically disconnect it, and
store it at a secure place.

Because if you have as much bad luck as I had last weekend, ZFS
replicates the data corruption, too.
And then you could have 1000 discs mirrored, and they would all contain
the corruption.
For this reason, you are only on the safe side if you physically
disconnect a third copy!
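
The third-disk routine described above could look roughly like this,
again as a dry-run sketch that only prints the commands. The names
plan_shelf_copy and resilver_in_progress are invented for this example,
and the check assumes that the "zpool status" output contains the phrase
"resilver in progress" while a resilver is running. Verify that against
the output of your own system before relying on it.

```shell
#!/bin/sh
# Dry-run sketch of the "third disk on the shelf" routine: prints commands only.
# plan_shelf_copy, resilver_in_progress and the disk names are hypothetical.

# Reads `zpool status` text on stdin; succeeds while a resilver is running.
# Assumption: the status output contains the phrase "resilver in progress".
resilver_in_progress() {
    grep -q "resilver in progress"
}

# Usage: plan_shelf_copy POOL EXISTINGVDEV THIRDVDEV
plan_shelf_copy() {
    pool=${1:-}
    existing=${2:-}
    third=${3:-}
    if [ -z "$pool" ] || [ -z "$existing" ] || [ -z "$third" ]; then
        echo "usage: plan_shelf_copy POOL EXISTINGVDEV THIRDVDEV" >&2
        return 1
    fi
    echo "zpool attach $pool $existing $third"
    echo "zpool status $pool   # repeat until the resilver has completed"
    echo "zpool offline $pool $third"
    echo "# physically disconnect $third and store it in a secure place"
}
```

On a live system the waiting step might be a loop such as
"while zpool status tank | resilver_in_progress; do sleep 60; done",
run between the attach and the offline. The important part is the order:
the third disk is only offlined and pulled after its resilver has
finished.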

Good luck!

On 10/8/12, Maurilio Longo <maurilio.longo at libero.it> wrote:
> Dan Swartzendruber wrote:
>> I'm not understanding your problem.  If you add a 3rd temporary disk, wait
>> for it to resilver, then replace c1t5d0, let the new disk resilver, then
>> detach the temporary disk, you will never have less than 2 up to date disks
>> in the mirror. What am I missing?
> Dan,
> you're right, I was trying to find a way to "move" the new disk in the failing
> disk bay instead of simply replacing the failing one :)
> Thanks for the advice!
> Maurilio.
