[OpenIndiana-discuss] Help debugging and replacing failed (maybe) hard drive

Jan Owoc jsowoc at gmail.com
Tue Jul 3 19:23:48 UTC 2012


On Tue, Jul 3, 2012 at 12:54 PM, Wood Peter <peterwood.sd at gmail.com> wrote:
> root@tzstor14:~# zpool status -v pool01
>   pool: pool01
>  state: DEGRADED
> status: One or more devices has been removed by the administrator.
>         Sufficient replicas exist for the pool to continue functioning in a
>         degraded state.

Am I correct in assuming that (to your knowledge) neither you nor any
other administrator has detached or removed any drives on or about
June 24th?


> action: Online the device using 'zpool online' or replace the device with
>         'zpool replace'.
>   scan: resilvered 71.9G in 0h36m with 0 errors on Sun Jun 24 05:37:43 2012
> config:
>
>         NAME           STATE     READ WRITE CKSUM
>         pool01         DEGRADED     0     0     0
>           raidz1-0     DEGRADED     0     0     0
>             spare-0    REMOVED      0     0     0
>               c3t0d0   REMOVED      0     0     0
>               c3t14d0  ONLINE       0     0     0
>             c3t1d0     ONLINE       0     0     0
>             c3t2d0     ONLINE       0     0     0
>             c3t3d0     ONLINE       0     0     0
>             c3t4d0     ONLINE       0     0     0
>             c3t5d0     ONLINE       0     0     0
>             c3t6d0     ONLINE       0     0     0
>           raidz1-1     ONLINE       0     0     0
>             c3t7d0     ONLINE       0     0     0
>             c3t8d0     ONLINE       0     0     0
>             c3t9d0     ONLINE       0     0     0
>             c3t10d0    ONLINE       0     0     0
>             c3t11d0    ONLINE       0     0     0
>             c3t12d0    ONLINE       0     0     0
>             c3t13d0    ONLINE       0     0     0
>         logs
>           mirror-2     ONLINE       0     0     0
>             c2t4d0     ONLINE       0     0     0
>             c2t5d0     ONLINE       0     0     0
>         cache
>           c2t2d0       ONLINE       0     0     0
>           c2t3d0       ONLINE       0     0     0
>         spares
>           c3t14d0      INUSE     currently in use
>
> errors: No known data errors


> There is ton of information on the Internet about zpool failures but it's
> mostly outdated and in some cases contradicting. I couldn't find anything
> that applies to my case.

I personally use the ZFS Administration Guide (I think the version for
OpenSolaris is the most similar to what OI has). The document should be
relatively comprehensive and (hopefully) not internally contradictory:
http://docs.oracle.com/cd/E19120-01/open.solaris/index.html


> - What state "REMOVED" means and why the drive was put in this state?
>   "iostat -En" shows no errors for this drive. I can't find any indication
> that the drive is bad.

Page 106 of the PDF of the ZFS Administration Guide:
"The device was physically removed while the system was running.
Device removal detection is hardware-dependent and might not be
supported on all platforms."
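
To pick out the devices that are in any state other than ONLINE without
reading the whole table, something like the following could work. It is
only a rough sketch: list_unhealthy is a name I made up, and it assumes
the column layout shown in the status output quoted above.

```shell
#!/bin/sh
# list_unhealthy (hypothetical helper): read `zpool status` output on
# stdin and print "name state" for every row of the config table whose
# STATE column is not ONLINE. Assumes the NAME/STATE header layout
# shown in the quoted output.
list_unhealthy() {
    awk '
        $1 == "NAME" && $2 == "STATE" { in_cfg = 1; next }  # table starts
        /^errors:/                    { in_cfg = 0 }        # table ends
        in_cfg && NF >= 2 && $2 != "ONLINE" { print $1, $2 }
    '
}

# Usage (on the storage host):
#   zpool status pool01 | list_unhealthy
```

Section labels such as "logs", "cache" and "spares" have only one field
and are skipped. A spare that is in use shows up as INUSE, which is
expected and not a fault in itself.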


> - If the drive has to be replaced could somebody please confirm that the
> following steps are sufficient:
>
> * zpool offline pool01 c3t0d0
> * zpool detach pool01 c3t0d0
> * Physically replace the drive with a new one
> * zpool add pool01 spare c3t0d0
> * zpool scrub pool01

I would do this a bit differently:
* zpool offline pool01 c3t0d0
** IF you believe the hard drive is faulty (vs. a random single error),
physically replace it
* zpool replace pool01 c3t0d0
** wait for automagic resilver (~36 minutes)
* zpool detach pool01 c3t14d0
* zpool add pool01 spare c3t14d0
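
If it helps to have the sequence above written down before touching the
pool, here is a small sketch that only prints the commands for review
and does not run anything. plan_replace is a made-up name, and the
remark about the spare possibly showing as AVAIL after the detach is my
assumption, so check 'zpool status' on your own system.

```shell
#!/bin/sh
# plan_replace POOL FAILED SPARE (hypothetical helper)
# Prints, but does not execute, the commands from the list above, in
# order, with reminders as comment lines.
plan_replace() {
    pool=$1 failed=$2 spare=$3
    echo "zpool offline $pool $failed"
    echo "# swap the physical drive now if you believe it is faulty"
    echo "zpool replace $pool $failed"
    echo "# wait until 'zpool status $pool' shows the resilver has completed"
    echo "zpool detach $pool $spare"
    echo "# skip the next command if 'zpool status' already lists $spare as AVAIL"
    echo "zpool add $pool spare $spare"
}

plan_replace pool01 c3t0d0 c3t14d0
```

Printing rather than executing is deliberate: the detach must not
happen until the resilver has finished, and that is easier to check by
eye than to script reliably.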

I like "my" way better because the drives/spares stay in the same
locations (you might have them labelled a certain way), but it does
require copying the data back one more time (if you left checksums
on, this shouldn't be a problem). No additional scrub should be
necessary, but if you have doubts about the drive, you can re-scrub just
in case (you should be scrubbing weekly or monthly anyway, right?).


Jan


