[OpenIndiana-discuss] General ZFS questions (Michelle Knight)

Robin Axelsson gu99roax at student.chalmers.se
Thu Jan 13 15:49:16 UTC 2011


There is no longer any readily available documentation that explains all 
those error types. I don't understand them properly either, but if I had 
Soft, Hard, Media or Recoverable errors I would immediately suspect that 
something is wrong with the affected drive (which could be caused by a 
bad PSU or cable). I believe transport errors and illegal requests mean 
that something is wrong with the communication with the drive, but not 
necessarily with the drive itself.
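
As a rough sketch of that rule of thumb in Python: the grouping below only
encodes my guess above and is not an official mapping of the iostat
counters. The names DRIVE_SIDE, LINK_SIDE and suspects() are made up for
illustration.

```python
# Grouping follows the guess in this mail; it is an assumption,
# not a documented meaning of the iostat counters.
DRIVE_SIDE = {"Soft Errors", "Hard Errors", "Media Error", "Recoverable"}
LINK_SIDE = {"Transport Errors", "Illegal Request"}


def suspects(counters):
    """Return a sorted list of what to suspect for one device's counters."""
    found = set()
    for name, value in counters.items():
        if value <= 0:
            continue  # a counter at zero points at nothing
        if name in DRIVE_SIDE:
            found.add("drive (or its PSU/cable)")
        elif name in LINK_SIDE:
            found.add("communication path")
    return sorted(found)


# Counters for c2t2d0 as posted in this thread.
print(suspects({"Soft Errors": 0, "Hard Errors": 0, "Transport Errors": 0,
                "Media Error": 0, "Recoverable": 0, "Illegal Request": 6}))
```

A non-empty result is only a hint about where to look first, not a diagnosis.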

I have 18 illegal requests on each drive in the storage pool, which I 
suspect occur during boot. It's always the same number every time I 
boot, and since these errors don't increase during continuous 
operation I'm currently not too concerned about them.
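
The check I do by hand (note the counters after boot, look again later) 
could be sketched roughly like this in Python. It assumes the "iostat -En" 
layout shown in the quoted output further down; parse_iostat_en() and 
increased() are hypothetical helper names, and the "later" snapshot is 
invented purely to show a counter going up.

```python
import re

COUNTER_NAMES = (
    "Soft Errors", "Hard Errors", "Transport Errors",
    "Media Error", "Device Not Ready", "No Device",
    "Recoverable", "Illegal Request", "Predictive Failure Analysis",
)
# Match only the known counter names, so that fields such as
# "Revision: 1118" are not mistaken for error counters.
COUNTER_RE = re.compile("(" + "|".join(COUNTER_NAMES) + r"):\s*(\d+)")


def parse_iostat_en(text):
    """Return {device: {counter: value}} from `iostat -En` text."""
    devices = {}
    current = None
    for line in text.splitlines():
        if "Soft Errors:" in line:      # first line of a device block
            current = line.split()[0]
            devices[current] = {}
        if current is not None:
            for name, value in COUNTER_RE.findall(line):
                devices[current][name] = int(value)
    return devices


def increased(before, after):
    """List (device, counter, old, new) for every counter that grew.

    A device missing from `before` is treated as starting at zero.
    """
    changes = []
    for dev, counters in after.items():
        old = before.get(dev, {})
        for name, value in counters.items():
            if value > old.get(name, 0):
                changes.append((dev, name, old.get(name, 0), value))
    return changes


# One device block as posted in this thread, taken as the boot baseline.
boot = """c2t2d0           Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
Vendor: ATA      Product: ST32000542AS     Revision: CC34 Serial No:
Size: 2000.40GB<2000398934016 bytes>
Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
Illegal Request: 6 Predictive Failure Analysis: 0
"""
# Hypothetical later snapshot, made up for this example.
later = boot.replace("Transport Errors: 0", "Transport Errors: 3")

print(increased(parse_iostat_en(boot), parse_iostat_en(later)))
```

An empty list means nothing has grown since the baseline, which is the 
situation I described for my own pool.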

Since the entire system freezes, something other than the drive might be 
causing the failures. It could be the power supply, bad cables 
or a poor SAS/SATA controller. Maybe some controllers have very poor 
drivers that freeze the entire system when a drive is failing. I chose 
an LSI 1068 controller specifically to avoid such problems, and LSI 
drivers are native in Solaris/OpenSolaris. If I could, I would test the 
drives on a system that is proven to work.


On 2011-01-13 15:56, Michelle Knight wrote:
>> I would be grateful if someone clarified whether this is what Michelle
>> Knight is referring to or the numbers in the CKSUM column.
> Yes, the numbers in the CKSUM column, which trigger repairs during the scrub
> process.
>
>> Last fall I was affected by a number of freezes when accessing the
>> pool over the network. More thorough tests revealed that it was just the
>> storage pool that froze and not the entire system.
> I'm having the whole machine freeze out. Won't even listen to the power
> switch. All terminal sessions freeze and the mouse and keyboard on the console
> are frozen.
>
> The previous freezes happened sometime between half an hour and one and a half
> hours of up time. If she stays up for two hours now with both the backup
> drives disconnected, I'll try a scrub on the main data set. If that works,
> then I'll try copying the data off over a network connection.
>
> If it works after that, then I'm looking at having a problem when the backup
> ZFS pool is connected. This will obviously take more than a few hours to go
> through.
>
> After some SFTP file transfers, my iostat gives this, but I don't know what an
> illegal request number means...
>
> mich at jaguar:~# iostat -En
> c2t0d0           Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
> Vendor: ATA      Product: INTEL SSDSA2M040 Revision: 02HB Serial No:
> Size: 40.02GB<40020664320 bytes>
> Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
> Illegal Request: 8 Predictive Failure Analysis: 0
> c1t0d0           Soft Errors: 0 Hard Errors: 6 Transport Errors: 0
> Vendor: LITE-ON  Product: DVDRW LH-18A1P   Revision: GL03 Serial No:
> Size: 0.00GB<0 bytes>
> Media Error: 0 Device Not Ready: 6 No Device: 0 Recoverable: 0
> Illegal Request: 0 Predictive Failure Analysis: 0
> c2t1d0           Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
> Vendor: ATA      Product: INTEL SSDSA2M040 Revision: 02HB Serial No:
> Size: 40.02GB<40019582464 bytes>
> Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
> Illegal Request: 7 Predictive Failure Analysis: 0
> c2t2d0           Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
> Vendor: ATA      Product: ST32000542AS     Revision: CC34 Serial No:
> Size: 2000.40GB<2000398934016 bytes>
> Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
> Illegal Request: 6 Predictive Failure Analysis: 0
> c2t3d0           Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
> Vendor: ATA      Product: ST31500541AS     Revision: CC34 Serial No:
> Size: 1500.30GB<1500301910016 bytes>
> Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
> Illegal Request: 6 Predictive Failure Analysis: 0
> c2t4d0           Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
> Vendor: ATA      Product: SAMSUNG HD154UI  Revision: 1118 Serial No:
> Size: 1500.30GB<1500301910016 bytes>
> Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
> Illegal Request: 6 Predictive Failure Analysis: 0
>



