[OpenIndiana-discuss] NFS exported dataset crashes the system

Peter Wood peterwood.sd at gmail.com
Wed Apr 10 22:29:36 UTC 2013


On Wed, Apr 10, 2013 at 7:35 AM, Paul van der Zwan <paulz at vanderzwan.org> wrote:

>
> On 9 Apr 2013, at 3:13 , Peter Wood <peterwood.sd at gmail.com> wrote:
>
> > I've asked the ZFS discussion list for help on this but now I have more
> > information and it looks like a bug in the drivers or something.
> >
> > I have a number of Dell PE R710 and PE 2950 servers running OpenSolaris,
> > OI 151a and OI 151a.7. All these systems are used as storage servers,
> > clean OS install, no extra services running. The systems are NFS
> > exporting a lot of ZFS datasets that are mounted on about ten CentOS-5.9
> > systems.
> >
> > The above setup has been working for 2+ years with no problem.
> >
> > Recently we bought two Supermicro systems:
> >  Supermicro X9DRH-iF
> >  Xeon E5-2620 @ 2.0 GHz 6-Core
> >  128GB RAM
> >  LSI SAS9211-8i HBA
> >  32x 3TB Hitachi HUS723030ALS640, SAS, 7.2K
> >
> > I installed OI 151a.7 on them and started migrating data from the old
> > Dell servers (zfs send/receive).
> >
> > Things have been working great for about two months until I migrated one
> > particular directory to one of the new Supermicro systems and after about
> > two days the system crashed. No network connectivity, black console, no
> > response to keyboard keys, no activity lights (no error lights either) on
> > the chassis. The only way out is to hit the reset button. Nothing in the
> > logs as far as I can tell. Log entries just stop when the system crashes.
> >
> > In the following two months I did a lot of testing and a lot of trips to
> > the colo in the middle of the night and the observation is that,
> > regardless of the OS, everything works on the Dell servers. As soon as I
> > move that directory to any of the Supermicro servers with OI 151a.7 it
> > will crash them anywhere from 2 hours to 5 days later.
> >
> > The Supermicro servers can be idle, exporting nothing, or can be
> > exporting 15+ other directories with high IOPS and working for months
> > with no problems, but as soon as I have them export that directory
> > they'll crash within 5 days at most.
> >
> > There is only one difference between that directory and all the other
> > exported directories. One of the client systems that mounts it and
> > writes to it is an old Debian 5.0 system. No idea why that would crash
> > a Supermicro system but not a Dell system.
> >
> > We worked directly with LSI developers and upgraded the firmware to some
> > unpublished, prerelease development version to no avail. We disabled all
> > power saving features and CPU C states in the BIOS and nothing changed.
> >
> > Any idea?
>
> I had a similar kind of problem where a VirtualBox FreeBSD 9.1 VM could
> hang the server. It had /usr/src and /usr/obj NFS mounted from the OI a7
> box it was running on. They are separate NFS-shared datasets in one of
> my 3 pools.
>
> When I ran a make buildworld in that VM it consistently locked up the OI
> host: no console access, no network access (not even ping).
> As a test I switched to NFSv4 instead of NFSv3 and I have not seen a hang
> since. So it looked like a heavy NFSv3 load was the issue.
>
>         Paul
>
>
Makes sense. I haven't tried that.

If I'm correct, ZFS on OI supports NFSv2, NFSv3 and NFSv4.

By switching to NFSv4, do you mean that on your client machine (the FreeBSD
VM) you set up the NFS client to use the NFSv4 protocol? Do I understand
this correctly? Or did you do something on the OI server to accept only
NFSv4 connections?
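
Either way, it would help to know which protocol version each Linux client
(the CentOS and Debian boxes) has actually negotiated. A minimal sketch in
Python that reads mount entries in /proc/mounts format and reports the NFS
version per mount point. The host and path names in SAMPLE are made up, and
the option spellings it looks for (vers=, nfsvers=, or a bare v3 on older
kernels) are an assumption about what the client kernels print, not
something verified on these systems.

```python
# Hypothetical /proc/mounts contents; host and path names are invented.
SAMPLE = """\
oi-server:/tank/projects /mnt/projects nfs rw,vers=3,proto=tcp,hard 0 0
oi-server:/tank/scratch /mnt/scratch nfs rw,v3,rsize=32768,wsize=32768 0 0
oi-server:/tank/home /mnt/home nfs4 rw,proto=tcp,hard 0 0
/dev/sda1 / ext3 rw 0 0
"""


def nfs_versions(mounts_text):
    """Map each NFS mount point to the protocol version found in its options."""
    versions = {}
    for line in mounts_text.splitlines():
        fields = line.split()
        # Expected fields: device, mount point, fs type, options, dump, pass.
        if len(fields) < 4 or fields[2] not in ("nfs", "nfs4"):
            continue
        mount_point, fstype, options = fields[1], fields[2], fields[3].split(",")
        # An nfs4 filesystem type implies version 4 unless an option says more.
        version = "4" if fstype == "nfs4" else "unknown"
        for opt in options:
            if opt.startswith("vers=") or opt.startswith("nfsvers="):
                version = opt.split("=", 1)[1]
            elif opt in ("v2", "v3", "v4"):
                # Assumed spelling on older kernels.
                version = opt[1:]
        versions[mount_point] = version
    return versions


for mount_point, version in sorted(nfs_versions(SAMPLE).items()):
    print(mount_point, "NFSv" + version)
```

On a real client the same function can be fed the contents of /proc/mounts
instead of SAMPLE.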

Could you please give more information?

Thanks,

-- Peter

