[OpenIndiana-discuss] server hangs

Roman Naumenko roman at naumenko.ca
Thu Sep 1 18:42:54 UTC 2011


I need to dig into the MB manual, but it's basically all commodity hardware (although the MB is some server-type Asus).

--Roman N 

----- Original Message -----

> What about hw event logs? If you have power fluctuations, they might show
> up there.

> You can probably pull those out from your service processor, or boot
> to the BIOS and read them there.

> Sent from Jasons' hand held

> On Sep 1, 2011, at 8:37 AM, Roman Naumenko <roman at naumenko.ca> wrote:

> > That was costly troubleshooting.
> > All right then, I will wait for the next failure, look into it
> > once again, and maybe swap the PSU if nothing turns up again.
> >
> > --Roman N
> >
> > ----- Original Message -----
> >
> >> I burned through about 3 disks before I figured it out. Nothing in the
> >> logs made me think this, but the eventual failure of the disks alerted me
> >> that something hardwarish was happening.
> >
> >> On 08/31/11 11:01 PM, Roman Naumenko wrote:
> >>> Well, that might be the reason. 8 drives is certainly too much for a
> >>> stock PSU. But there should be some traces, no?
> >>> How did you figure out the reason for the errors on your system?
> >>>
> >>> --Roman
> >>>
> >>> Daniel Kjar said the following, on 31-08-11 9:43 PM:
> >>>> Careful... are you overtaxing your power supply? My 148 system was
> >>>> behaving like that when I put too many drives in an Ultra 20.
> >>>>
> >>>> On 8/31/2011 7:48 PM, Roman Naumenko wrote:
> >>>>> Hi,
> >>>>>
> >>>>> I have SunOS 5.11 oi_148 installed on my storage server with 8 disks
> >>>>> in a raidz2 pool.
> >>>>> It hangs about once a week and I have to restart it.
> >>>>> Can you help me troubleshoot it?
> >>>>>
> >>>>> It has some zfs volumes shared over nfs and afpd. (afpd is
> >>>>> unfortunately a development version to satisfy OSX Lion).
> >>>>>
> >>>>> roks@data:~$ afpd -V
> >>>>> afpd 2.2.0 - Apple Filing Protocol (AFP) daemon of Netatalk
> >>>>>
> >>>>> afpd has been compiled with support for these features:
> >>>>>
> >>>>> AFP3.x support: Yes
> >>>>> TCP/IP Support: Yes
> >>>>> DDP(AppleTalk) Support: No
> >>>>> CNID backends: dbd last tdb
> >>>>> SLP support: No
> >>>>> Zeroconf support: Yes
> >>>>> TCP wrappers support: Yes
> >>>>> Quota support: Yes
> >>>>> Admin group support: Yes
> >>>>> Valid shell checks: Yes
> >>>>> cracklib support: No
> >>>>> Dropbox kludge: No
> >>>>> Force volume uid/gid: No
> >>>>> ACL support: Yes
> >>>>> EA support: ad | sys
> >>>>> LDAP support: Yes
> >>>>>
> >>>>> It also has time-slider enabled, which is a pretty buggy piece of
> >>>>> hmmm software, but it shouldn't cause the server to crash or hang.
> >>>>>
> >>>>> The problems start with nfs and/or afpd timeouts on clients, but
> >>>>> I can still ssh to the server. I can't read any files or logs, though.
> >>>>> Then network service disappears within a few minutes, the console
> >>>>> freezes, and I have to do a hard restart at that point.
> >>>>>
> >>>>> Where should I look to understand what is causing this?
> >>>>> Since I can't reproduce the problem, I'd like to be prepared when
> >>>>> it happens next time.
> >>>>> I couldn't find anything unusual in the logs after the restart.
> >>>>>
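One way to prepare for the next hang, sketched in Python under some assumptions: run a few diagnostic commands on a timer, each with a timeout, and record which ones stop returning. The command list (`zpool status -x`, `fmdump -e`, `iostat -xne`) and the helper name `probe` are illustrative choices, not something taken from this thread. Because local files become unreadable during the hang, the output would have to land on another machine, for example by invoking the script over ssh from a client.

```python
import subprocess
import sys
import time

# Illustrative diagnostics (an assumption, not from the thread).
COMMANDS = [
    ["zpool", "status", "-x"],   # pool health summary
    ["fmdump", "-e"],            # fault manager error log
    ["iostat", "-xne"],          # per-device statistics and error counters
]


def probe(cmd, timeout=10.0):
    """Run cmd and return (status, seconds, output).

    status is 'ok', 'failed' (non-zero exit), 'timeout', or 'missing'.
    """
    start = time.monotonic()
    try:
        done = subprocess.run(
            cmd,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            timeout=timeout,
            text=True,
        )
        status = "ok" if done.returncode == 0 else "failed"
        output = done.stdout
    except subprocess.TimeoutExpired:
        status, output = "timeout", ""
    except FileNotFoundError:
        status, output = "missing", ""
    return status, time.monotonic() - start, output


if __name__ == "__main__":
    for cmd in COMMANDS:
        name = " ".join(cmd)
        # Announce the command first, so a probe that never returns is still visible.
        print(f"{time.strftime('%Y-%m-%d %H:%M:%S')} start {name}", flush=True)
        status, secs, _ = probe(cmd)
        print(f"{time.strftime('%Y-%m-%d %H:%M:%S')} {name}: {status} ({secs:.1f}s)", flush=True)
```

A command blocked inside the kernel may not be killable, in which case the probe itself can stall after the timeout; the "start" line printed beforehand then shows which command it was stuck on.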
> >>>>> time-slider complains for some reason about space on rpool:
> >>>>> Aug 31 19:41:36 data time-sliderd: [ID 702911 daemon.notice] No more hourly snapshots left
> >>>>> Aug 31 19:41:36 data time-sliderd: [ID 702911 daemon.warning] rpool exceeded 80% capacity. Hourly and daily automatic snapshots were destroyed
> >>>>>
> >>>>> Where does it see 80%?
> >>>>>
> >>>>> $ df -h
> >>>>>
> >>>>> Filesystem                      Size Used Avail Use% Mounted on
> >>>>> rpool/ROOT/solaris              5.5G 3.0G  2.6G  54% /
> >>>>> swap                            1.4G 396K  1.4G   1% /etc/svc/volatile
> >>>>> /usr/lib/libc/libc_hwcap1.so.1  5.5G 3.0G  2.6G  54% /lib/libc.so.1
> >>>>> swap                            1.4G 8.0K  1.4G   1% /tmp
> >>>>> swap                            1.4G  52K  1.4G   1% /var/run
> >>>>> rpool/export                    2.6G  32K  2.6G   1% /export
> >>>>> rpool/export/home               2.6G  33K  2.6G   1% /export/home
> >>>>> rpool/export/home/usr1          2.6G  38K  2.6G   1% /export/home/usr1
> >>>>> rpool/export/home/usr2          3.0G 385M  2.6G  13% /export/home/usr2
> >>>>> rpool                           2.6G  48K  2.6G   1% /rpool
> >>>>>
> >>>>>
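On the 80% question: `df` reports per-dataset figures, while the time-slider warning names the pool (`rpool`), whose overall usage is what `zpool list` shows in its CAP column. Space held by snapshots, and typically by the swap and dump volumes on the root pool, counts against the pool without appearing in any `df` row. A rough worked example in Python, assuming the 2.6G `Avail` value approximates the pool's free space (an assumption that only `zpool list` on the actual machine could confirm); the helper name is hypothetical:

```python
def min_pool_size_gib(free_gib, capacity_pct):
    """Smallest pool size consistent with free_gib free at capacity_pct used."""
    if not 0 <= capacity_pct < 100:
        raise ValueError("capacity_pct must be in [0, 100)")
    return free_gib / (1 - capacity_pct / 100)


free_gib = 2.6                   # 'Avail' column in the df output (assumed ~ pool free space)
threshold_pct = 80               # threshold named in the time-sliderd warning
visible_gib = 3.0 + 385 / 1024   # 'Used' for / and /export/home/usr2; other rows are KB-sized

size = min_pool_size_gib(free_gib, threshold_pct)
allocated = size - free_gib

print(f"pool size >= {size:.1f} GiB")
print(f"allocated >= {allocated:.1f} GiB")
print(f"not visible in df >= {allocated - visible_gib:.1f} GiB")
```

Under that assumption the pool would be at least about 13 GiB with roughly 10.4 GiB allocated, so around 7 GiB would be held by things `df` does not list. `zfs list -o space` breaks usage down by snapshots, the dataset itself, and children, which would show where it went.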
> >>>>> --Roman
> >>>>>
> >>>>> _______________________________________________
> >>>>> OpenIndiana-discuss mailing list
> >>>>> OpenIndiana-discuss at openindiana.org
> >>>>> http://openindiana.org/mailman/listinfo/openindiana-discuss
> >>>>
> >>>
> >
> >> --
> >> Dr. Daniel Kjar
> >> Assistant Professor of Biology
> >> Division of Mathematics and Natural Sciences
> >> Elmira College
> >> 1 Park Place
> >> Elmira, NY 14901
> >> 607-735-1826
> >> http://faculty.elmira.edu/dkjar
> >
> >> "...humans send their young men to war; ants send their old
> >> ladies"
> >> -E. O. Wilson


