[OpenIndiana-discuss] zpool in sorry state

Eric Pierce agtcovert at gmail.com
Fri Jul 8 11:05:49 UTC 2011


I'm posting this in hopes someone can help me out.  Yesterday it appears we
lost 2-3 drives in our pool.  The pool is 22 drives in mirrored pairs with 2
hot spares, both of which activated.

Here's the current state of the pool:

pool: vmstorage
 state: DEGRADED
status: One or more devices has experienced an error resulting in data
        corruption.  Applications may be affected.
action: Restore the file in question if possible.  Otherwise restore the
        entire pool from backup.
   see: http://www.sun.com/msg/ZFS-8000-8A
 scan: resilvered 517G in 6h8m with 1 errors on Fri Jul  8 02:56:21 2011
config:

        NAME                         STATE     READ WRITE CKSUM
        vmstorage                    DEGRADED    18     0     0
          mirror-0                   ONLINE       0     0     0
            c1t5000C5002DAEAB3Cd0    ONLINE       0     0     0
            c1t5000C5002DAEDC16d0    ONLINE       0     0     0
          mirror-1                   ONLINE       0     0     0
            c1t5000C5002DAEDEA4d0    ONLINE       0     0     0
            c1t5000C5002DAEED3Ed0    ONLINE       0     0     0
          mirror-2                   DEGRADED     1     0     0
            spare-0                  DEGRADED     1     0     0
              c1t5000C5002DAEF015d0  DEGRADED     1     0     0  too many errors
              c1t5000C5002DAFEACCd0  ONLINE       0     0     2
            spare-1                  DEGRADED     2     0     0
              c1t5000C5002DAF0B60d0  FAULTED      0     0     0  too many errors
              c1t5000C5002DAFE81Bd0  ONLINE       0     0     2
          mirror-3                   DEGRADED     6     0     0
            c1t5000C5002DAF0E57d0    ONLINE       0     0     0
            c1t5000C5002DAF5AF7d0    FAULTED      0     0     0  too many errors
          mirror-4                   ONLINE       0     0     0
            c1t5000C5002DAF5D4Bd0    ONLINE       0     0     0
            c1t5000C5002DAF5EE4d0    ONLINE       0     0     0
          mirror-5                   ONLINE       0     0     0
            c1t5000C5002DAF8A56d0    ONLINE       0     0     0
            c1t5000C5002DAF08C0d0    ONLINE       0     0     0
          mirror-6                   ONLINE       0     0     0
            c1t5000C5002DAF9B8Bd0    ONLINE       0     0     0
            c1t5000C5002DAF76E9d0    ONLINE       0     0     0
          mirror-7                   ONLINE       0     0     0
            c1t5000C5002DAF241Ad0    ONLINE       0     0     0
            c1t5000C5002DAF556Ed0    ONLINE       0     0     0
          mirror-8                   ONLINE      11     0     0
            c1t5000C5002DAF1140d0    ONLINE      11     0     0
            c1t5000C5002DAF6428d0    ONLINE      11     0     0
          mirror-9                   ONLINE       0     0     0
            c1t5000C5002DAFC8EEd0    ONLINE       0     0     0
            c1t5000C5002DAFC42Fd0    ONLINE       0     0     0
          mirror-10                  ONLINE       0     0     0
            c1t5000C5002DAFCB37d0    ONLINE       0     0     0
            c1t5000C5002DAFD4A7d0    ONLINE       0     0     0
        logs
          mirror-11                  ONLINE       0     0     0
            c1t50015179594B0EF0d0    ONLINE       0     0     0
            c1t50015179594DC9E4d0    ONLINE       0     0     0
        cache
          c1t50015179594DD9AAd0      ONLINE       0     0     0
          c1t50015179594DD9C1d0      ONLINE       0     0     0
        spares
          c1t5000C5002DAFE81Bd0      INUSE     currently in use
          c1t5000C5002DAFEACCd0      INUSE     currently in use

errors: 1 data errors, use '-v' for a list

I know I've lost some data, which I'm currently restoring from backup (yes,
thankfully everything is backed up).

I have two questions though:
1) I haven't found a definitive answer on what I need to do to replace the
two failed drives.  Is it just 'zpool detach [failed_drive]', after which the
spare takes over permanently, or is it more complicated than that?
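For reference, here's the sequence I'm guessing at, using the device names
from the status output above (the <new_disk> placeholders are hypothetical
replacement drives, not real devices on this system) -- please correct me if
this is wrong:

```shell
# My guess at the replacement procedure -- unverified, corrections welcome.

# Replace the faulted disk in mirror-2 with a new physical drive:
zpool replace vmstorage c1t5000C5002DAF0B60d0 <new_disk1>

# Replace the faulted disk in mirror-3 the same way:
zpool replace vmstorage c1t5000C5002DAF5AF7d0 <new_disk2>

# After the resilver completes, I believe the hot spares should return
# to the spares list automatically; if not, detach them explicitly:
zpool detach vmstorage c1t5000C5002DAFE81Bd0
zpool detach vmstorage c1t5000C5002DAFEACCd0
```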
2) Some data (LUNs) on this pool is still available and functioning fine.
Can I trust this pool once I replace the failed drives, delete the trashed
LUNs, and scrub?  I can't destroy the whole pool and start over; I've got
customers irritated enough as it is.
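For question 2, my assumption is that the cleanup after replacing the drives
and deleting the bad LUNs would look something like the following -- again,
this is just my guess at the procedure, not something I've verified:

```shell
# My guess at the post-recovery cleanup -- unverified.

# Clear the accumulated error counters once the hardware is sorted out:
zpool clear vmstorage

# Re-verify every block in the pool:
zpool scrub vmstorage

# Check progress and list any remaining files with permanent errors:
zpool status -v vmstorage
```

If the scrub completes with no new errors, I'd like to believe the pool is
trustworthy again, but I'd appreciate confirmation from someone who has been
through this.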

Your advice/help is greatly appreciated.

