[OpenIndiana-discuss] SMART status changing?

Mon Apr 30 20:46:44 UTC 2012

On "complete drive failure" you are probably correct, but SMART has
layers.  There is an overall PASS/FAIL status, which generally reads:

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
See vendor-specific Attribute list for marginal Attributes.

If you look at the attribute list, you will often see sectors that are
pending failure or have already failed, yet the drive still says that
its status is PASSED.
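
To make the two layers concrete, here is a minimal Python sketch that
pulls both the overall status and the three sector counters out of
smartctl's text output. The sample text and its raw values are made up
for illustration, and the parser assumes the usual ten-column attribute
table with the raw value in the last column; treat it as a sketch, not
a robust parser.

```python
import re

# Attribute IDs worth watching: reallocated, pending, offline uncorrectable.
WATCHED_IDS = {5, 197, 198}

def parse_smartctl(text):
    """Return (overall_status, {attribute_id: raw_value}) from smartctl text."""
    status = None
    raw = {}
    for line in text.splitlines():
        m = re.search(r"overall-health self-assessment test result:\s*(\S+)", line)
        if m:
            status = m.group(1)
            continue
        fields = line.split()
        # Attribute rows start with a numeric ID; column 10 holds the raw value.
        if len(fields) >= 10 and fields[0].isdigit() and int(fields[0]) in WATCHED_IDS:
            raw[int(fields[0])] = int(fields[9])
    return status, raw

# Made-up sample: the drive says PASSED while sectors are pending.
SAMPLE = """\
=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       3
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       2
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       2
"""

status, raw = parse_smartctl(SAMPLE)
print(status, raw)
```

In practice the text would come from something like 'smartctl -a' on
the device rather than from a hard-coded string.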

I was responding on the assumption that smartctl was sending emails or
writing to the system log that 1 or 2 sectors had failed, and that those
sectors then disappeared from later reports because they had been
reallocated.
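
One way to check that assumption is to compare the raw counters from
two consecutive reports. The sketch below is only an illustration: the
snapshots are plain dicts keyed by attribute ID, the numbers are
invented, and the function names are hypothetical.

```python
def sector_deltas(previous, current):
    """Change in each raw counter between two reports (current - previous)."""
    ids = sorted(set(previous) | set(current))
    return {attr: current.get(attr, 0) - previous.get(attr, 0) for attr in ids}

def pending_cleared(previous, current):
    """True if sectors were pending in the earlier report and none are now."""
    return previous.get(197, 0) > 0 and current.get(197, 0) == 0

# Invented example: two pending sectors on day one, gone on day two.
day_one = {5: 0, 197: 2, 198: 2}
day_two = {5: 2, 197: 0, 198: 0}

print(sector_deltas(day_one, day_two))
print(pending_cleared(day_one, day_two))
```

If the pending count drops to zero between reports, that would explain
a monitoring check flipping from bad back to healthy without the drive
ever reporting an overall FAILED.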

So I think we need to better understand what smartctl is reporting: a
FAILED drive or a few sectors pending failure or failed.  

If the overall status states FAILED, it's probably about to die and
should be replaced ASAP.  I've seen one or two drives report FAILED and
then later PASSED, but they ultimately died in less than 48 hours.
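
Put together with the rule of thumb from my earlier message quoted
below, the decision could be sketched roughly like this in Python. The
threshold of 10, the labels, and the idea of summing attributes 5 and
197 to approximate "known bad sectors" are my assumptions for the
sketch, not anything smartctl itself defines.

```python
# Assumed threshold, from the "10-20+ sectors at a time" rule of thumb.
SUDDEN_FAILURE_THRESHOLD = 10

def triage(status_history, previous, current):
    """Rough replace/watch decision from status history and two counter snapshots."""
    # A drive that has ever reported FAILED is suspect, even if it says PASSED now.
    if any(s.startswith("FAILED") for s in status_history):
        return "replace now"
    # Reallocated (5) plus pending (197) approximates the known bad sectors.
    bad_before = previous.get(5, 0) + previous.get(197, 0)
    bad_now = current.get(5, 0) + current.get(197, 0)
    if bad_now - bad_before >= SUDDEN_FAILURE_THRESHOLD:
        return "replace soon"
    if bad_now > 0 or current.get(198, 0) > 0:
        return "watch"
    return "ok"

print(triage(["FAILED", "PASSED"], {}, {}))
print(triage(["PASSED"], {5: 0, 197: 0}, {5: 3, 197: 12}))
print(triage(["PASSED"], {5: 1, 197: 0}, {5: 1, 197: 2}))
```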

I hope that clarifies my response.
Cheers,
Scott LeFevre

On Mon, 2012-04-30 at 22:29 +0200, Roy Sigurd Karlsbakk wrote:

> Are you sure? I thought SMART usually reports failure only after a few hundred or a few thousand sectors have failed and spare sectors are running short…
> 
> roy
> 
> ----- Original message -----
> > I've run across this before. What generally happens is that a sector
> > becomes unreadable/failed and gets reported by smartctl. If the file
> > system is active and the file using that sector is updated or
> > overwritten, the drive will reallocate the sector and it will drop
> > out of SMART as a problem. The things to watch are the sector
> > reporting attributes shown by 'smartctl -a'. Specifically, keep tabs
> > on the following:
> > 
> > ID# ATTRIBUTE_NAME
> > 5 Reallocated_Sector_Ct
> > 197 Current_Pending_Sector
> > 198 Offline_Uncorrectable
> > 
> > It's been my experience that occasionally a few sectors will fail and
> > be reallocated, and it causes little harm. If you see 10-20+ sectors
> > fail at a time, then the drive's failure is close at hand and you
> > should replace the drive ASAP.
> > 
> > Hope this helps.
> > 
> > Scott LeFevre
> > 
> > 
> > On Mon, 2012-04-30 at 21:16 +0200, Roy Sigurd Karlsbakk wrote:
> > 
> > > Hi all
> > >
> > > I know this question isn't strictly an openindiana question, but I
> > > want to give it a shot…
> > >
> > > I have a few servers with some 300 drives in total with Icinga (a
> > > Nagios fork) monitoring the zpool and smartctl health. It sometimes
> > > happens that smartctl reports a bad drive, and it's replaced. A few
> > > times I've seen smartctl report a bad drive, and then, after a day
> > > or two, it suddenly changes its mind, reporting a perfectly healthy
> > > drive.
> > >
> > > Has anyone seen this happen?
> > >
> > > Vennlige hilsener / Best regards
> > >
> > > roy
> > > --
> > > Roy Sigurd Karlsbakk
> > > (+47) 98013356
> > > roy at karlsbakk.net
> > > http://blogg.karlsbakk.net/
> > > --
> > > In all pedagogy it is essential that the curriculum be presented
> > > intelligibly. It is an elementary imperative for all pedagogues to
> > > avoid excessive use of idioms of foreign origin. In most cases,
> > > adequate and relevant synonyms exist in Norwegian.
> > >
> > > _______________________________________________
> > > OpenIndiana-discuss mailing list
> > > OpenIndiana-discuss at openindiana.org
> > > http://openindiana.org/mailman/listinfo/openindiana-discuss
>