[OpenIndiana-discuss] NFS exported dataset crashes the system

Ram Chander ramquick at gmail.com
Tue Apr 9 05:59:45 UTC 2013


There could be corruption in that directory. Can you run a scrub on the pool?

zpool scrub <pool>
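
The scrub runs in the background, so you would check on it afterwards with zpool status. A rough sketch of that check follows, assuming a pool named "tank" (a placeholder, substitute your own). pool_is_clean is a helper made up for this example, not part of ZFS; it only looks for the "No known data errors" summary line that zpool status prints for a healthy pool.

```shell
# Hypothetical helper (not a ZFS command): reads `zpool status` output on
# stdin and succeeds only if the "No known data errors" summary line is present.
pool_is_clean() {
    grep -q '^errors: No known data errors'
}

# Usage on the storage server ("tank" is a placeholder pool name):
#   zpool scrub tank
#   zpool status -v tank     # scrub progress, plus any files with errors
#   zpool status -v tank | pool_is_clean && echo "no known data errors"
```

If the helper fails once the scrub has finished, the zpool status -v output is the thing to read, since it lists the affected files.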


On Tue, Apr 9, 2013 at 6:43 AM, Peter Wood <peterwood.sd at gmail.com> wrote:

> I've asked the ZFS discussion list for help on this but now I have more
> information and it looks like a bug in the drivers or something.
>
> I have a number of Dell PE R710 and PE 2950 servers running OpenSolaris, OI
> 151a and OI 151a.7. All these systems are used as storage servers, clean OS
> install, no extra services running. The systems are NFS exporting a lot of
> ZFS datasets that are mounted on about ten CentOS-5.9 systems.
>
> The above setup has been working for 2+ years with no problem.
>
> Recently we bought two Supermicro systems:
>   Supermicro X9DRH-iF
>   Xeon E5-2620 @ 2.0 GHz 6-Core
>   128GB RAM
>   LSI SAS9211-8i HBA
>   32x 3TB Hitachi HUS723030ALS640, SAS, 7.2K
>
> I installed OI 151a.7 on them and started migrating data from the old Dell
> servers (zfs send/receive).
>
> Things have been working great for about two months until I migrated one
> particular directory to one of the new Supermicro systems and after about
> two days the system crashed. No network connectivity, black console, no
> response to keyboard keys, no activity lights (no error lights either) on
> the chassis. The only way out is to hit the reset button. Nothing in the
> logs as far as I can tell. Log entries just stop when the system crashes.
>
> In the following two months I did a lot of testing and a lot of trips to
> the colo in the middle of the night and the observation is that regardless
> of the OS everything works on the Dell servers. As soon as I move that
> directory to any of the Supermicro servers with OI 151a.7 it will crash
> them in anywhere from 2 hours to 5 days.
>
> The Supermicro servers can be idle, exporting nothing, or can be exporting
> 15+ other directories with high IOPS and working for months with no
> problems but as soon as I have them export that directory they'll crash in
> 5 days at the most.
>
> There is only one difference between that directory and all other exported
> directories. One of the client systems that mounts it and writes to it is
> an old Debian 5.0 system. No idea why that would crash a Supermicro system
> but not a Dell system.
>
> We worked directly with LSI developers and upgraded the firmware to some
> unpublished, prerelease development version to no avail. We disabled all
> power saving features and CPU C states in the BIOS and nothing changed.
>
> Any idea?
>
> Thanks a lot.
>
> -- Peter
> _______________________________________________
> OpenIndiana-discuss mailing list
> OpenIndiana-discuss at openindiana.org
> http://openindiana.org/mailman/listinfo/openindiana-discuss
>

