[OpenIndiana-discuss] Need help decoding fmd fault/error
Udo Grabowski (IMK)
udo.grabowski at kit.edu
Thu Jul 31 22:48:14 UTC 2014
On 31/07/2014 23:00, Scott LeFevre wrote:
> *** Up front, my apologies for this long post, but I've found that most
> forums will ask for more detail, so I thought I'd try to provide it up
> front. ***
>
> I have a server at home running oi 151a9 and arrived home to find the
> system locked up. Keyboard and network unresponsive. So I rebooted.
>
> As a quick aside, a little background on this box. It's been running a
> SuperMicro X8SAX motherboard with two SuperMicro MV88SX6081 8-port SATA
> II PCI-X Controllers for several years. Between the two controllers, I
> have (9) 1TB drives. In the past month or so, I've had one drive quit
> responding to the daily smartd short test and go offline. The simplest
> way to fix this is to cold boot. After a little resilvering, the
> raidz2 pool is back up and running. This gave me the impression that I had
> one of the drives slowly going downhill until today. This morning was
> one of those days that I woke up and had to cold boot the server to get
> a sleepy drive going again.
>
> So I started digging for the error.
> .....(snip)....
> .....(snap)....
> I'm left at this point not knowing where to point the finger. Is it the
> motherboard and/or bus? The HBAs? And/or the disks? Am I looking in
> the right spots?
>
> Any assistance in understanding this is appreciated.
>
Very often we see that a drive failing in a specific way
also brings down the controller bus after a while, up to
the point where all other disks on that line start to
exhibit bus errors, although they are not defective.
So the lesson to learn is not to keep a broken drive
in the box for too long; it will not heal itself...