[OpenIndiana-discuss] Zpool replacing forever

Andrew Gabriel illumos at cucumber.demon.co.uk
Mon Jan 21 14:05:53 UTC 2019


On 21/01/2019 02:37, Gary Mills wrote:
> On Sun, Jan 20, 2019 at 08:16:44PM +0000, Andrew Gabriel wrote:
>> zfs is keeping the old disk around as a ghost, in case you can put it
>> back in and zfs can find good copies of the corrupt data on it during
>> the resilver. It will stay in the zpool until there's a clean scrub,
>> after which zfs will collapse out the replacing-0 vdev. (In your case,
>> you know there is no copy of the bad data so this won't help, but in
>> general it can.)
> I see.  That's a good explanation, one that I didn't see anywhere
> else.  I suppose that the man page for `zpool replace' should advise
> you to correct all errors before using the command.  That way, the
> confused status I saw would not arise.
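For illustration, the "ghost" shows up in `zpool status` as a replacing-0 vdev holding both the old and new device until the resilver and a clean scrub complete. A rough sketch of what that status looks like (pool and device names here are hypothetical):

```shell
# zpool status tank
  pool: tank
 state: DEGRADED
  scan: resilver in progress
config:
        NAME              STATE     READ WRITE CKSUM
        tank              DEGRADED     0     0     0
          raidz1-0        DEGRADED     0     0     0
            replacing-0   DEGRADED     0     0     0
              c1t1d0/old  UNAVAIL      0     0     0   # the ghost of the old disk
              c1t1d0      ONLINE       0     0     0   # the new disk being resilvered
            c1t2d0        ONLINE       0     0     0
            c1t3d0        ONLINE       0     0     0
```

Once a scrub completes with no errors, the replacing-0 vdev collapses and only the new disk remains in the config.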


Actually, you want to get that disk replacing ASAP to reduce the risk of 
catastrophic data loss, so I would not suggest holding off the replace 
until you've sorted out the errors. You could start on sorting out the 
errors while you are waiting for a replacement disk to arrive on site, 
but getting the replacement started ASAP is the top priority.

I've had a few cases of a second drive failing during a resilver. If we 
hadn't been some way into the first resilver, we would have lost about 4 
billion files. In the worst case of a double disk failure on RAIDZ1 that 
I saw, only 11,000 files out of over a billion were lost IIRC (and in 
most of the RAIDZ1 double disk failures, just a couple of hundred files 
were lost). 11,000 files is about half an hour to restore, versus 2 
months to restore the whole pool.


-- 

Andrew Gabriel
