[OpenIndiana-discuss] [zfs] problem on my zpool
Richard Elling
richard.elling at richardelling.com
Wed Oct 23 16:16:43 UTC 2013
On Oct 22, 2013, at 11:46 PM, Clement BRIZARD <clement at brizou.fr> wrote:
> I cleared the "degraded" disk. We will see what happens in 131 hours.
Yes, clearing is the proper procedure.
The predicted time to complete is usually wildly inaccurate until you get near the end
of resilvering or scrubbing. The estimated time remaining is based on bandwidth, but
the workload is limited by IOPS and throttling. If you read a file, it will be checked and
repaired, if necessary, so you can continue to use the pool as it scans the older data.
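
To illustrate why the estimate behaves this way, here is a minimal Python sketch of a bandwidth-only estimate: remaining data divided by the average scan rate so far. This is an illustration added for this write-up, not the actual ZFS code. The helper names (to_bytes, naive_eta_seconds, format_hm) are hypothetical, and it assumes the sizes printed by zpool status are binary units (1T = 2**40 bytes).

```python
# Naive scan ETA: remaining bytes divided by the average scan rate.
# Assumption: zpool status sizes use binary units (1T = 2**40 bytes).

UNITS = {"K": 2**10, "M": 2**20, "G": 2**30, "T": 2**40}


def to_bytes(text):
    """Convert a size such as '22.2T' or '48.6M' to bytes."""
    return float(text[:-1]) * UNITS[text[-1]]


def naive_eta_seconds(scanned, total, rate):
    """Seconds left if the scan kept its average rate to the end."""
    return (to_bytes(total) - to_bytes(scanned)) / to_bytes(rate)


def format_hm(seconds):
    """Format seconds as '<hours>h<minutes>m', truncating partial minutes."""
    minutes = int(seconds) // 60
    return f"{minutes // 60}h{minutes % 60:02d}m"


if __name__ == "__main__":
    # Figures taken from the resilver status quoted below in this message.
    eta = naive_eta_seconds("2.23G", "22.2T", "48.6M")
    print(format_hm(eta))  # roughly 133 hours
```

With the quoted figures this lands at roughly 133 hours, close to what zpool status printed. The small difference is expected because the displayed values are rounded. Since the calculation only looks at the average rate, it says nothing about IOPS limits or throttling, which is why the number swings so much early in a scan.
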
As to the root cause, it is more likely a common, transient fault. Think along the lines of
power supplies, cables, flaky motherboard, etc. The disks themselves are likely to be
fine. The original fault might or might not recur.
-- richard
>
>   pool: nas
>  state: ONLINE
> status: One or more devices is currently being resilvered. The pool will
>         continue to function, possibly in a degraded state.
> action: Wait for the resilver to complete.
>   scan: resilver in progress since Wed Oct 23 08:25:56 2013
>         2.23G scanned out of 22.2T at 48.6M/s, 133h22m to go
>         6.10M resilvered, 0.01% done
> config:
>
> NAME                         STATE     READ WRITE CKSUM  CAP   Product
> nas                          ONLINE       0     0     0
>   raidz1-0                   ONLINE       0     0     0
>     c8t50024E9004993E6Ed0p0  ONLINE       0     0     0  2 TB  SAMSUNG HD204UI
>     c8t50024E92062E7524d0    ONLINE       0     0     0  2 TB  SAMSUNG HD204UI
>     c8t50024E900495BE84d0p0  ONLINE       0     0     0  2 TB  SAMSUNG HD204UI
>     c8t50014EE25A5EEC23d0p0  ONLINE       0     0     0  2 TB  WDC WD20EARS-00M
>     c8t50024E9003F03980d0p0  ONLINE       0     0     0  2 TB  SAMSUNG HD204UI
>     c8t50014EE2B0D3EFC8d0    ONLINE       0     0     0  2 TB  WDC WD20EARX-00P
>     c8t50014EE6561DDB4Cd0p0  ONLINE       0     0     0  2 TB  WDC WD20EARS-00M
>     c8t50024E9003F03A09d0p0  ONLINE       0     0     0  2 TB  SAMSUNG HD204UI
>   raidz1-1                   ONLINE       0     0     0
>     c50t8d0                  ONLINE       0     0     0  (resilvering)  2 TB  ST2000DL004 HD20
>     c2d0                     ONLINE       0     0     0  (resilvering)  2 TB
>     c1d0                     ONLINE       0     0     0  (resilvering)  2 TB
>     c50t11d0                 ONLINE       0     0     0  2 TB  SAMSUNG HD204UI
>     c50t10d0                 ONLINE       0     0     0  (resilvering)  2 TB  SAMSUNG HD204UI
>
>
>
>
> On 23/10/2013 08:43, Clement BRIZARD wrote:
>> I woke up this morning and saw your messages. Unfortunately I had to reboot; the server had completely frozen.
>> Now I have this:
>>
>>   pool: nas
>>  state: DEGRADED
>> status: One or more devices is currently being resilvered. The pool will
>>         continue to function, possibly in a degraded state.
>> action: Wait for the resilver to complete.
>>   scan: resilver in progress since Wed Oct 23 08:19:42 2013
>>         5.81G scanned out of 22.2T at 49.2M/s, 131h43m to go
>>         15.6M resilvered, 0.03% done
>> config:
>>
>> NAME                         STATE     READ WRITE CKSUM
>> nas                          DEGRADED     0     0     0
>>   raidz1-0                   DEGRADED     0     0     0
>>     c8t50024E9004993E6Ed0p0  ONLINE       0     0     0
>>     c8t50024E92062E7524d0    ONLINE       0     0     0
>>     c8t50024E900495BE84d0p0  ONLINE       0     0     0
>>     c8t50014EE25A5EEC23d0p0  ONLINE       0     0     0
>>     c8t50024E9003F03980d0p0  ONLINE       0     0     0
>>     c8t50014EE2B0D3EFC8d0    ONLINE       0     0     0
>>     c8t50014EE6561DDB4Cd0p0  DEGRADED     0     0     0  too many errors
>>     c8t50024E9003F03A09d0p0  ONLINE       0     0     0
>>   raidz1-1                   ONLINE       0     0     0
>>     c50t8d0                  ONLINE       0     0     0  (resilvering)
>>     c2d0                     ONLINE       0     0     0  (resilvering)
>>     c1d0                     ONLINE       0     0     0  (resilvering)
>>     c50t11d0                 ONLINE       0     0     0
>>     c50t10d0                 ONLINE       0     0     0  (resilvering)
>>
>>
>>
>>
>>
>> On 23/10/2013 08:00, Jason Matthews wrote:
>>>
>>> First, don't reboot. If you do, you might not be able to remount the pool. The data you see is from the disks that are functioning. Listing the files and copying complete files are two different things. If you don't have a backup, you may need to copy whatever partial data you can from the broken pool.
>>>
>>> Now let's start by getting the disks back in good shape.
>>>
>>> Clear the degraded disk (zpool clear takes the pool name first, then the device):
>>> zpool clear nas c8t50014EE6561DDB4Cd0p0
>>>
>>> Reseat the missing disks in the hope that they come back, then clear them.
>>>
>>> Check cfgadm -al and make sure they are connected and configured.
>>>
>>> When you reseat them, check the messages (or dmesg) to see if the system notices the re-insertion. If it does see the disks, clear them in the pool in an effort to bring the pool back to an operational state.
>>>
>>> Sent from Jasons' hand held
>>>
>>> On Oct 22, 2013, at 5:04 PM, Clement BRIZARD <clement at brizou.fr> wrote:
>>>
>>>> Hello everybody,
>>>> I have a problem with my pool. Lately I have had some slowdowns on the NFS share of my ZFS pool. A weekly scrub began and is still running, but it worries me. It currently returns this:
>>>>
>>>>   pool: nas
>>>>  state: UNAVAIL
>>>> status: One or more devices are faulted in response to IO failures.
>>>> action: Make sure the affected devices are connected, then run 'zpool clear'.
>>>>    see: http://illumos.org/msg/ZFS-8000-HC
>>>>   scan: scrub in progress since Sun Oct 20 19:29:23 2013
>>>>         15.2T scanned out of 22.2T at 84.0M/s, 24h5m to go
>>>>         1.29G repaired, 68.67% done
>>>> config:
>>>>
>>>> NAME                         STATE     READ WRITE CKSUM
>>>> nas                          UNAVAIL     63     2     0  insufficient replicas
>>>>   raidz1-0                   DEGRADED     0     0     0
>>>>     c8t50024E9004993E6Ed0p0  ONLINE       0     0     0
>>>>     c8t50024E92062E7524d0    ONLINE       0     0     0
>>>>     c8t50024E900495BE84d0p0  ONLINE       0     0     0
>>>>     c8t50014EE25A5EEC23d0p0  ONLINE       0     0     0
>>>>     c8t50024E9003F03980d0p0  ONLINE       0     0     1  (repairing)
>>>>     c8t50014EE2B0D3EFC8d0    ONLINE       0     0     0
>>>>     c8t50014EE6561DDB4Cd0p0  DEGRADED     0     0   211  too many errors (repairing)
>>>>     c8t50024E9003F03A09d0p0  ONLINE       0     0    18  (repairing)
>>>>   raidz1-1                   UNAVAIL    131     9     0  insufficient replicas
>>>>     c50t8d0                  REMOVED      0     0     0  (repairing)
>>>>     c2d0                     ONLINE       0     0     0  (repairing)
>>>>     c1d0                     ONLINE       0     0     0  (repairing)
>>>>     c50t11d0                 ONLINE       0     0     0  (repairing)
>>>>     c50t10d0                 REMOVED      0     0     0
>>>>
>>>> errors: 10972861 data errors, use '-v' for a list
>>>>
>>>>
>>>> Really weird, I haven't disconnected any disk. For several hours, even though it said the pool was unavailable, I could still browse it via NFS. Now I can't anymore.
>>>>
>>>>
>>>> What do you think I should do?
>>>>
>>>>
>>>>
>>>> _______________________________________________
>>>> OpenIndiana-discuss mailing list
>>>> OpenIndiana-discuss at openindiana.org
>>>> http://openindiana.org/mailman/listinfo/openindiana-discuss
--
Richard.Elling at RichardElling.com
+1-760-896-4422