[OpenIndiana-discuss] Resilver restarting on second dead drive?
jsowoc at gmail.com
Thu Feb 9 14:31:16 UTC 2012
On Thu, Feb 9, 2012 at 12:39 AM, Roy Sigurd Karlsbakk <roy at karlsbakk.net> wrote:
> Hi all
> On this server, I had a drive dying on me, and the spare kicked in. I replaced the drive, detached the spare and "zpool replace"d the original dev, waiting for it to resilver. When this was almost finished (say, 2 hours left), a second drive died, the spare kicking in and resilver restarting. zpool status now showed resilver having been running for a few minutes, and more time to wait for salvation.
> Would it be possible/easy/good to change this behaviour to make zpool carry on resilvering the initial devices instead of restarting the whole process? Restarting it renders my system more vulnerable for a longer time (with two drives out of a RAIDz2), and also renders my nerves a bit more shaky, neither of which is beneficial.
> All the best
I can see your point, where if a resilver is close to finishing, you'd
want it to finish before working on yet another drive. However, if you
were to lose two drives in short succession, the current behaviour
would reconstruct both of them in parallel, i.e. reading from the
remaining good drives and simultaneously writing out both sets of
I don't quite understand what happened in your specific case. Let's
say you had a setup:
raidz2 c1d0 c1d1 c1d2 c1d3 spare c1d4 c1d5
Let's say c1d3 failed. A resilver started and d4 took d3's place -
you now have a non-degraded raidz2. You then physically swapped out d3
for a new drive and did "zpool replace". Until the replace command
completes, you still have the fully functioning zpool of c1d0 c1d1
c1d2 c1d4. When another drive, e.g. c1d2, fails, I would hope the
replace command is cancelled (it's cosmetic - d4 is doing fine instead
of d3) and instead the array is resilvered with c1d5 in place of c1d2.
Is this what happened (other than the specific disk numbers)?
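
To make the sequence I have in mind concrete, here is a toy model of it.
This is only a sketch of the bookkeeping (which disk sits in which slot);
the Raidz2 class and its methods are made up for illustration and say
nothing about how ZFS actually schedules or restarts a resilver.

```python
# Toy model of the hypothetical layout above. Not ZFS internals.

class Raidz2:
    def __init__(self, members, spares):
        self.members = list(members)  # disks currently filling the data slots
        self.spares = list(spares)    # idle hot spares, used in order

    def fail(self, disk):
        """Drop a failed member and put the next hot spare in its slot."""
        idx = self.members.index(disk)
        if self.spares:
            spare = self.spares.pop(0)
            self.members[idx] = spare  # the spare resilvers into this slot
            return spare
        self.members[idx] = None       # no spare left: the slot stays empty
        return None

    def missing(self):
        """Number of slots with no disk at all (raidz2 tolerates two)."""
        return sum(1 for m in self.members if m is None)


pool = Raidz2(["c1d0", "c1d1", "c1d2", "c1d3"], ["c1d4", "c1d5"])
first = pool.fail("c1d3")   # first failure: c1d4 takes over
second = pool.fail("c1d2")  # second failure: c1d5 takes over
print(pool.members)
```

In this model both spares end up as members and no slot is left empty,
which is the outcome I would expect in the scenario above.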