[OpenIndiana-discuss] Zpool crashes system on reboot and import
jimklimov at cos.ru
Wed Dec 11 13:21:44 UTC 2013
It might help to run mdb over the kernel crash dump files, so the developers at least have a stack trace of what went wrong. They may then come back with more specific questions (variable values and so on), but the general debugging steps (see the Wiki) come first anyway.
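To make the mdb step concrete, here is a rough dry-run sketch that only prints the commands one would typically run against a saved dump. The dump directory (/var/crash/<nodename>) and the dump number 0 are my assumptions, not something from this thread; check `dumpadm` for where your dumps actually land. The helper name crash_cmds is made up for illustration.

```shell
#!/bin/sh
# Dry-run sketch: prints the commands for reading a kernel crash dump with mdb.
# Assumptions (not from this thread): dumps are saved under /var/crash/<nodename>
# and the dump of interest is number 0 (vmdump.0). crash_cmds is a hypothetical helper.
crash_cmds() {
    n="${1:-0}"
    dir="${2:-/var/crash/$(uname -n)}"
    echo "cd $dir"
    # Expand the compressed vmdump.N into unix.N and vmcore.N.
    echo "savecore -vf vmdump.$n ."
    # ::status = panic summary, ::stack = panic stack, ::msgbuf = last kernel messages.
    for dcmd in '::status' '::stack' '::msgbuf'; do
        echo "echo '$dcmd' | mdb unix.$n vmcore.$n"
    done
}

crash_cmds 0
```

Review the printed lines, then paste them into a root shell; the output of ::status and ::stack is usually what people on the list ask for first.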
In the meantime you can also walk your pool with zdb -bscvL or similar to check for inconsistencies, e.g. whether the box crashed or rebooted with I/Os written out of order (labels or uberblocks updated before the data they point to was committed) because the disks or caches lied.
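As a sketch of that zdb pass: the helper below (zdb_cmds, a name I made up) just prints the command lines, adding -e when the pool is exported rather than imported. As I read the zdb man page, -u shows the active uberblock, -b traverses blocks and gathers statistics, -c verifies metadata checksums, -s reports I/O statistics, -v is verbose and -L skips leak tracking. Expect the traversal to take a long time on a pool of this size.

```shell
#!/bin/sh
# Dry-run sketch: prints zdb command lines for checking a pool.
# zdb_cmds is a hypothetical helper; the second argument ("imported" or
# "exported") is my own convention for choosing whether to add -e.
zdb_cmds() {
    pool="$1"
    state="${2:-imported}"
    flag=""
    if [ "$state" = "exported" ]; then
        flag="-e "    # -e: operate on a pool that is not imported
    fi
    echo "zdb ${flag}-u $pool"        # show the current uberblock (txg, timestamp)
    echo "zdb ${flag}-bscvL $pool"    # traverse the pool and verify metadata checksums
}

zdb_cmds data
```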
If so, you might have luck rolling back a few txgs on import, and you can model with zdb whether that helps (it would start from an older txg number and skip the last few, possibly corrupted, sync cycles).
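A hedged sketch of the rollback idea, again as a dry run that only prints commands. It relies on the documented recovery mode of zpool import (-F discards the last few transactions, -n together with -F only reports whether that would work) and on zdb's -t option, which, as far as I know, caps the txg used when searching for uberblocks. The helper name rewind_cmds and the txg argument are made up for illustration; take a real txg from the uberblock or label output. The pool has to be exported before any of this.

```shell
#!/bin/sh
# Dry-run sketch: prints commands for modelling and attempting a txg rewind.
# rewind_cmds is a hypothetical helper; the optional txg is whatever older
# transaction group you picked from zdb's uberblock/label output.
rewind_cmds() {
    pool="$1"
    txg="${2:-}"
    if [ -n "$txg" ]; then
        # Model the pool as of an older txg without changing anything on disk.
        echo "zdb -e -t $txg -bscvL $pool"
    fi
    # Ask whether recovery mode could make the pool importable (no changes made).
    echo "zpool import -F -n $pool"
    # Actual recovery import: discards the last few transactions.
    echo "zpool import -F $pool"
}

rewind_cmds data
```

The last command is the only destructive one, since the discarded transactions are gone for good, so it is worth running the first two and copying data off the read-only import beforehand.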
Hth, Jim
Typos courtesy of my Samsung Mobile
-------- Original message --------
From: CJ Keist <cj.keist at colostate.edu>
Date: 2013.12.11 5:31 (GMT+01:00)
To: Discussion list for OpenIndiana <openindiana-discuss at openindiana.org>
Subject: [OpenIndiana-discuss] Zpool crashes system on reboot and import
All,
Some time back we had an issue where I lost an entire zpool file system
due to a possibly bad RAID controller card. At that time I was strongly
encouraged to get a RAID card that supported JBOD and allow ZFS to
control all disks. Well, I did that, and unfortunately today I lost an
entire zpool that was configured with multiple raidz2 volumes. See below:
root at projects2:~# zpool status data
  pool: data
 state: ONLINE
  scan: scrub in progress since Tue Dec 10 18:11:19 2013
    211G scanned out of 30.2T at 1/s, (scan is slow, no estimated time)
    0 repaired, 0.68% done
config:

NAME                       STATE     READ WRITE CKSUM
data                       ONLINE       0     0     0
  raidz2-0                 ONLINE       0     0     0
    c3t50014EE25D929FBCd0  ONLINE       0     0     0
    c3t50014EE2B2E8E02Ed0  ONLINE       0     0     0
    c3t50014EE25C346397d0  ONLINE       0     0     0
    c3t50014EE206EB0DDDd0  ONLINE       0     0     0
    c3t50014EE25D932FC7d0  ONLINE       0     0     0
    c3t50014EE25C341621d0  ONLINE       0     0     0
    c3t50014EE206DE835Ed0  ONLINE       0     0     0
    c3t50014EE2083D20DAd0  ONLINE       0     0     0
    c3t50014EE2083D842Ed0  ONLINE       0     0     0
  raidz2-1                 ONLINE       0     0     0
    c3t50014EE2B2E8D8CCd0  ONLINE       0     0     0
    c3t50014EE2B18BE3A4d0  ONLINE       0     0     0
    c3t50014EE25C339C05d0  ONLINE       0     0     0
    c3t50014EE25D9307DAd0  ONLINE       0     0     0
    c3t50014EE2B2E7E5E8d0  ONLINE       0     0     0
    c3t50014EE206EB20ABd0  ONLINE       0     0     0
    c3t50014EE2B2E56CFAd0  ONLINE       0     0     0
    c3t50014EE25D92FC0Ad0  ONLINE       0     0     0
    c3t50014EE25C42CFDBd0  ONLINE       0     0     0
  raidz2-2                 ONLINE       0     0     0
    c3t50014EE25D933003d0  ONLINE       0     0     0
    c3t50014EE2B2E89EF3d0  ONLINE       0     0     0
    c3t50014EE2B2E8DC9Cd0  ONLINE       0     0     0
    c3t50014EE25C35933Ed0  ONLINE       0     0     0
    c3t50014EE2B1968F65d0  ONLINE       0     0     0
    c3t50014EE2083D6987d0  ONLINE       0     0     0
    c3t50014EE2083DDCACd0  ONLINE       0     0     0
    c3t50014EE25C42C384d0  ONLINE       0     0     0
    c3t50014EE206F2A389d0  ONLINE       0     0     0
  raidz2-3                 ONLINE       0     0     0
    c3t50014EE2B1967C56d0  ONLINE       0     0     0
    c3t50014EE2083E1931d0  ONLINE       0     0     0
    c3t50014EE2B1895807d0  ONLINE       0     0     0
    c3t50014EE25D9333E7d0  ONLINE       0     0     0
    c3t50014EE2B196397Ad0  ONLINE       0     0     0
    c3t50014EE25D930567d0  ONLINE       0     0     0
    c3t50014EE2B19D4F5Ad0  ONLINE       0     0     0
    c3t50014EE25D930525d0  ONLINE       0     0     0
    c3t50014EE2083DDCFAd0  ONLINE       0     0     0
  raidz2-4                 ONLINE       0     0     0
    c3t50014EE20721B2BBd0  ONLINE       0     0     0
    c3t50014EE2B2E8DC6Ad0  ONLINE       0     0     0
    c3t50014EE25C40CF9Fd0  ONLINE       0     0     0
    c3t50014EE25D24BC9Fd0  ONLINE       0     0     0
    c3t50014EE2B2E8DFDAd0  ONLINE       0     0     0
    c3t50014EE25C33BF64d0  ONLINE       0     0     0
    c3t50014EE25D9328C4d0  ONLINE       0     0     0
    c3t50014EE25C401FBFd0  ONLINE       0     0     0
    c3t50014EE2B1899AC5d0  ONLINE       0     0     0

errors: No known data errors
The system crashed, and when rebooted it would just core dump and
reboot again. After booting into single-user mode I found the zpool that
was crashing the system. I exported it and was able to bring the system
back up. When I try to import that pool, it again crashes my system.
I finally found that I could import the pool without crashing my system
if I imported it read-only:
zpool import -o readonly=on data
The output above is from the pool imported read-only.
I am looking for any advice on ways to save this pool. As you can see,
zpool reports no errors with the pool.
Running OI 151a8 i86pc i386 i86pc Solaris
--
C. J. Keist Email: cj.keist at colostate.edu
Systems Group Manager Solaris 10 OS (SAI)
Engineering Network Services Phone: 970-491-0630
College of Engineering, CSU Fax: 970-491-5569
Ft. Collins, CO 80523-1301
All I want is a chance to prove 'Money can't buy happiness'
_______________________________________________
OpenIndiana-discuss mailing list
OpenIndiana-discuss at openindiana.org
http://openindiana.org/mailman/listinfo/openindiana-discuss