[OpenIndiana-discuss] Errors without errors

Thu Aug 5 08:03:07 UTC 2021

> On 5. Aug 2021, at 10:52, Michelle <michelle at msknight.com> wrote:
> 
> Thanks for this. So I'm possibly better off rolling back the OS
> snapshot after my backup has finished?

Maybe, maybe not. First of all, I have no idea what point the rollback would take you back to.

Secondly, the system has seen some errors. At this point the fault report does not tell us whether those were checksum errors or something else, and it seems to me that it is something else.

Here is why: your zpool output reports on c6t3d0, but the iostat -En output below does not include c6t3d0 at all. It seems to be missing.

What do you get from 'iostat -En c6t3d0'?
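
The comparison described above (disks listed in the pool versus disks that iostat -En reports) can be scripted. This is only a minimal sketch under assumptions: device names look like cNtNdN or cNdN as in this thread, the local awk supports extended regular expressions, and the function names pool_devices, iostat_devices and missing_devices are made up for illustration.

```shell
#!/bin/sh
# Sketch: find pool member disks that `iostat -En` does not report.
# Helper names are hypothetical; device-name pattern is an assumption.

# Read `zpool status` output on stdin, print the disk names it lists.
pool_devices() {
    awk '$1 ~ /^c[0-9]+(t[0-9]+)?d[0-9]+$/ { print $1 }' | sort -u
}

# Read `iostat -En` output on stdin, print the disk names it lists.
# Only the per-device header lines contain "Soft Errors:".
iostat_devices() {
    awk '/Soft Errors:/ && $1 ~ /^c[0-9]+(t[0-9]+)?d[0-9]+$/ { print $1 }' | sort -u
}

# Print names that are in the first (sorted) file but not in the second.
missing_devices() {
    comm -23 "$1" "$2"
}

# Possible usage on the affected host:
#   zpool status jaguar | pool_devices   > /tmp/pool.txt
#   iostat -En          | iostat_devices > /tmp/iostat.txt
#   missing_devices /tmp/pool.txt /tmp/iostat.txt
```

Any name printed by missing_devices is a disk the pool expects but the kernel is not currently reporting statistics for.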

Also, it would be a good idea to check /var/adm/messages: are there any SATA or I/O related messages around August 5, 02:00?
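
A rough way to narrow the log down to that time window is sketched below. It assumes the usual syslog timestamp layout ("Aug  5 02:..." with a space-padded day) and a grep that accepts -E; the keyword list is a guess at what storage drivers might log, and the function name storage_messages is hypothetical.

```shell
#!/bin/sh
# Sketch: show storage-looking log lines for a given timestamp prefix.
# storage_messages is a hypothetical helper; the keywords are assumptions.
storage_messages() {
    # $1: log file to search
    # $2: timestamp prefix to match at the start of a line, e.g. 'Aug  5 02:'
    grep "^$2" "$1" | grep -i -E 'sata|ahci|scsi|disk|timeout|reset|i/o'
}

# Possible usage on the affected host:
#   storage_messages /var/adm/messages 'Aug  5 02:'
```

If this prints nothing, widening the prefix (for example 'Aug  5 0') or dropping the keyword filter shows more of the surrounding context.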

FMA has definitely recorded an issue with the pool, so there must be something going on.

rgds,
toomas

> 
> I have removed the drive for the moment, and am running a backup. Just
> in case :-)
> 
> mich at jaguar:~$ iostat -En
> c5d1             Soft Errors: 0 Hard Errors: 0 Transport Errors: 0 
> Model: INTEL SSDSA2M04 Revision:  Serial No: CVGB949301PC040 
> Size: 40.02GB <40019116032 bytes>
> Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0 
> Illegal Request: 0 
> c6t1d0           Soft Errors: 0 Hard Errors: 0 Transport Errors: 0 
> Vendor: ATA      Product: WDC WD40EZRZ-00G Revision: 0A80 Serial No: WD-WCC7K5UK24LJ
> Size: 4000.79GB <4000787030016 bytes>
> Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0 
> Illegal Request: 0 Predictive Failure Analysis: 0 
> c6t0d0           Soft Errors: 0 Hard Errors: 0 Transport Errors: 0 
> Vendor: ATA      Product: WDC WD60EFRX-68L Revision: 0A82 Serial No: WD-WX21DA84EH0F
> Size: 6001.18GB <6001175126016 bytes>
> Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0 
> Illegal Request: 0 Predictive Failure Analysis: 0 
> c6t2d0           Soft Errors: 0 Hard Errors: 0 Transport Errors: 0 
> Vendor: ATA      Product: WDC WD60EFRX-68L Revision: 0A82 Serial No: WD-WX51DB880RJ4
> Size: 6001.18GB <6001175126016 bytes>
> Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0 
> Illegal Request: 0 Predictive Failure Analysis: 0 
> 
> 
> --------------- ------------------------------------  -------------- ---------
> TIME            EVENT-ID                              MSG-ID         SEVERITY
> --------------- ------------------------------------  -------------- ---------
> Aug 05 02:00:53 c5934fd6-5f4b-409e-b0f8-8f44ea8f99c4  ZFS-8000-FD    Major
> 
> Host        : jaguar
> Platform    : ProLiant-MicroServer      Chassis_id  : 5C7351P4L9
> Product_sn  : 
> 
> Fault class : fault.fs.zfs.vdev.io
> Affects     : zfs://pool=jaguar/vdev=740c01ae0d3c3109
>                  faulted and taken out of service
> Problem in  : zfs://pool=jaguar/vdev=740c01ae0d3c3109
>                  faulted and taken out of service
> 
> Description : The number of I/O errors associated with a ZFS device exceeded
>               acceptable levels.  Refer to
>               http://illumos.org/msg/ZFS-8000-FD for more information.
> 
> Response    : The device has been offlined and marked as faulted.  An attempt
>               will be made to activate a hot spare if available.
> 
> Impact      : Fault tolerance of the pool may be compromised.
> 
> Action      : Run 'zpool status -x' and replace the bad device.
> 
> 
> 
> On Thu, 2021-08-05 at 10:22 +0300, Toomas Soome via openindiana-discuss wrote:
>>> On 5. Aug 2021, at 09:35, Michelle <michelle at msknight.com> wrote:
>>> 
>>> Hi Folks,
>>> 
>>> About a month ago I updated my Hipster...
>>> SunOS jaguar 5.11 illumos-ca706442e6 i86pc i386 i86pc
>>> 
>>> This morning it was absolutely crawling. Couldn't even connect via SSH
>>> and had to bounce the box.
>>> 
>>> It was reporting a drive as faulted, but didn't give any numbers...
>>> everything was 0. I'm now not sure what happened and whether the drive
>>> is good, or whether I should roll back the OS.
>>> 
>>> (And the drive, a WD Red 6TB (not shingled), went out of warranty a week
>>> ago. How about that, eh?)
>>> 
>>> Grateful for any opinions please.
>>> 
>>> Thu  5 Aug 04:00:01 UTC 2021
>>> NAME   SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ   FRAG    CAP  DEDUP  HEALTH    ALTROOT
>>> lion  5.45T  5.28T   176G        -         -     4%    96%  1.00x  DEGRADED  -
>>> pool: jaguar
>>> state: DEGRADED
>>> status: One or more devices are faulted in response to persistent errors.
>>> 	Sufficient replicas exist for the pool to continue functioning in a
>>> 	degraded state.
>>> action: Replace the faulted device, or use 'zpool clear' to mark the device
>>> 	repaired.
>>> scan: scrub in progress since Thu Aug  5 00:00:00 2021
>>> 	6.00T scanned at 428M/s, 5.02T issued at 358M/s, 7.90T total
>>> 	1M repaired, 63.59% done, 0 days 02:20:17 to go
>>> config:
>>> 	NAME        STATE     READ WRITE CKSUM
>>> 	jaguar      DEGRADED     0     0     0
>>> 	  raidz1-0  DEGRADED     0     0     0
>>> 	    c6t0d0  ONLINE       0     0     0
>>> 	    c6t2d0  ONLINE       0     0     0
>>> 	    c6t3d0  FAULTED      0     0     0  too many errors  (repairing)
>>> 
>> 
>> Can you post the output from:
>> iostat -En
>> fmadm faulty
>> 
>> In any case, there is definitely a bug in the error reporting: the counters
>> are zero while “too many errors” is reported.
>> 
>> rgds,
>> toomas
>> _______________________________________________
>> openindiana-discuss mailing list
>> openindiana-discuss at openindiana.org
>> https://openindiana.org/mailman/listinfo/openindiana-discuss