[OpenIndiana-discuss] Need help decoding fmd fault/error

Udo Grabowski (IMK) udo.grabowski at kit.edu
Thu Jul 31 22:48:14 UTC 2014


On 31/07/2014 23:00, Scott LeFevre wrote:
> *** Up front, my apologies for this long post, but I found that most
> forums will ask for more detail, so I thought I'd try to provide it up
> front. ***
>
> I have a server at home running oi 151a9 and arrived home to find the
> system locked up.  Keyboard and network unresponsive. So I rebooted.
>
> As a quick aside, a little background on this box.  It has been running
> a SuperMicro X8SAX motherboard with two SuperMicro MV88SX6081 8-port
> SATA II PCI-X controllers for several years.  Between the two
> controllers, I have (9) 1TB drives.  In the past month or so, one drive
> will occasionally quit responding to the daily smartd short test and go
> offline.  The simplest way to fix this is to cold boot.  After a little
> resilvering, the raidz2 pool is back up and running.  This gave me the
> impression that I had one of the drives slowly going downhill until
> today.  This morning was one of those days that I woke up and had to
> cold boot the server to get a sleepy drive going again.
>
> So I started digging for the error.
> .....(snip)....
> .....(snap)....
> I'm left at this point not knowing where to point the finger.  Is it the
> motherboard and/or bus?  The HBAs?  And/or the disks?  Am I looking in
> the right spots?
>
> Any assistance in understanding this is appreciated.
>

Very often we see that a drive failing in a specific way
also brings down the controller bus after a while, up to
the point where all other disks on that bus start to
exhibit bus errors, although they are not defective.
So the lesson to learn is not to keep a broken drive
in the box for too long; it will not heal itself...



