[OpenIndiana-discuss] zpool and HDD problems

Jim Klimov jimklimov at cos.ru
Mon Oct 5 04:17:26 UTC 2015


On 5 October 2015, 0:27:38 CEST, Rainer Heilke <rheilke at dragonhearth.com> wrote:
>Greetings.
>I've recently had three hard drives fail in my server. One was the OS 
>disk, so I just reinstalled. The other two, however, were each one-half
>of zpool mirrors. They are the problem disks.
>
>Both have been replaced, but now I cannot seem to work with them. In 
>format -e, they are giving errors, specifically:
>        1. c3d1 <drive type unknown>
>           /pci@0,0/pci-ide@11/ide@0/cmdk@1,0
>  and,
>        7. c7d1 <drive type unknown>
>           /pci@0,0/pci-ide@14,1/ide@0/cmdk@1,0
>
>There is also a third disk erroring out:
>      3. c5t9d1 <SS330055-         99JJXXK-0001 cyl 60797 alt 2 hd 255 sec 63>
>           /pci@0,0/pci1002,5a17@3/pci1000,9240@0/sd@9,1
>
>I am suspecting c3d1 to be an old OS mirror, due to the low controller 
>number.
>
>When I select 1 or 7, I get a Segmentation fault, and get booted out of
>the format utility. (If I select 3, the format utility never comes back,
>freezing.) A zpool status shows:
>
>   pool: Pool1
>  state: ONLINE
>status: Some supported features are not enabled on the pool. The pool can
>         still be used, but some features are unavailable.
>action: Enable all features using 'zpool upgrade'. Once this is done,
>         the pool may no longer be accessible by software that does not support
>         the features. See zpool-features(5) for details.
>   scan: resilvered 2.78M in 0h0m with 0 errors on Tue Sep 16 14:11:00 2014
>config:
>
>         NAME        STATE     READ WRITE CKSUM
>         Pool1       ONLINE       0     0     0
>           c5t8d1    ONLINE       0     0     0
>
>errors: No known data errors
>
>   pool: data
>  state: DEGRADED
>status: One or more devices has experienced an error resulting in data
>         corruption.  Applications may be affected.
>action: Restore the file in question if possible.  Otherwise restore the
>         entire pool from backup.
>    see: http://illumos.org/msg/ZFS-8000-8A
>   scan: resilvered 36.1M in 0h17m with 738 errors on Thu Oct  1 18:17:43 2015
>config:
>
>         NAME                     STATE     READ WRITE CKSUM
>         data                     DEGRADED 20.6K     0     0
>           mirror-0               DEGRADED 81.8K     0     0
>             7152018192933189428  FAULTED      0     0     0  was /dev/dsk/c11t8d1s0
>             c6d0                 ONLINE       0     0 81.8K
>
>errors: 737 data errors, use '-v' for a list
>
>(Doing a zpool status -v freezes the terminal.)
>
>The system has three disks connected to an LSI MegaRAID SAS 9240-8i 
>controller.
>
>I am suspecting that disk 3 (c5t9d1) might be the detached mirror of 
>Pool1 (c5t8d1), but being unable to work with it, I cannot verify this.
>I have no idea on how to deal with the data mirror. Should I just detach
>/dev/dsk/c11t8d1s0 (7152018192933189428) and hope that c6d0 will be 
>clean enough for a decent scrub? Or is /dev/dsk/c11t8d1s0
>(7152018192933189428) the disk with the less corrupted data? Not being 
>able to even get a listing (ls) of the data pool leaves me very hesitant.
>
>Does anyone have any ideas on how to clean this up?
>
>Thanks in advance,
>Rainer

As already noted, part of the problem may be the IDE access mode. For example, are your disks modern and large (over 2TB, IIRC)?

Did you rescan OS devices (devfsadm -Cv)?

Did you try other partitioning programs (parted, fdisk) to see if you can access the new disks at all, and in particular to verify that ZFS managed to create its partitioning? In the worst case you might have to define an MBR/EFI 'solaris' partition yourself and use it (as cXtYdZpN) directly as a pool vdev, or use format afterwards to define slices inside that partition and use cXtYdZs0 on the disk. I wrote some howtos about 'advanced setup' on the OI wiki that can help you get started.
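
A minimal sketch of the rescan and partition checks above, not a definitive procedure. The helper names run and inspect_disk are made up for illustration, and c3d1 is simply one of the problem disks from the format listing. The prtvtoc call assumes the disk carries an SMI (VTOC) label, which may not be the case for a disk that format reports as "drive type unknown". By default the script only prints the commands; set RUN=1 to actually execute them.

```shell
#!/bin/sh
# Sketch only. Nothing is executed unless RUN=1; by default each command
# is just printed so it can be reviewed first.
run() {
    echo "+ $*"
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    fi
}

# inspect_disk is a hypothetical helper; its argument is a device name
# as shown by format, e.g. c3d1.
inspect_disk() {
    disk="$1"
    run devfsadm -Cv                       # rebuild /dev links, clean up stale ones
    run fdisk -W - "/dev/rdsk/${disk}p0"   # dump the fdisk (MBR) table to stdout
    run prtvtoc "/dev/rdsk/${disk}s2"      # print the slice table, if an SMI label exists
}

inspect_disk "${1:-c3d1}"
```

If fdisk or prtvtoc cannot even open the device, that points at the access or hardware problems rather than at ZFS.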

But first I'd verify that all components work, including the hardware. The cabling may need to be re-plugged, the box may need a vacuum cleaner (or rather a blow-out), or the power supply may be nearly dead (aged capacitors, etc.).
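
Before opening the case, the OS side can give some hints about failing hardware. This is only a sketch: the helper names run and check_hardware are invented for illustration, and the commands are printed rather than executed unless RUN=1 is set.

```shell
#!/bin/sh
# Sketch only: prints the commands by default, runs them when RUN=1.
run() {
    echo "+ $*"
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    fi
}

# check_hardware is a hypothetical helper name.
check_hardware() {
    run iostat -En      # per-device soft, hard and transport error counters
    run fmadm faulty    # faults already diagnosed by the fault manager
    run fmdump -e       # summary of logged error reports (ereports)
}

check_hardware
```

Growing transport error counts on several disks at once would be consistent with a cabling, controller or power problem rather than with individual bad drives.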

Jim
--
Typos courtesy of K-9 Mail on my Samsung Android


