[OpenIndiana-discuss] Kernel panic on hung zpool accessed via lofi
Andrew Gabriel
illumos at cucumber.demon.co.uk
Wed Sep 16 20:42:14 UTC 2015
On 16/09/2015 19:24, Nikola M wrote:
> On 09/11/15 08:57 PM, Watson, Dan wrote:
>> I'm using mpt_sas with SATA drives, and I _DO_ have error counters
>> climbing for some of those drives - could that be the cause?
>> Any other ideas?
>
> It is generally strongly advised to use SATA disks on SATA controllers
> and SAS disks on SAS controllers, and to use a controller that can do JBOD.
>
> Also, using SAS-to-SATA multipliers, or port multipliers at all, is
> strongly discouraged, because they usually contain cheap logic that can
> go crazy, and the disk is then not under the direct control of the
> controller.
A disk interface specialist was telling me earlier today what goes wrong
here. The problem is that many SATA drives drop the phy interface when
they have some internal problem, even just retrying transfers. Normally
that doesn't matter a scrap when they are connected 1-to-1 to a SATA
controller. However, if they are connected to SAS fabric, it will cause
the SAS fabric to re-enumerate all the drives at least at that port
multiplier level, likely losing outstanding IOs on other drives, most
particularly other SATA drives as implementations of STP (SATA Tunneling
Protocol) in SAS HBAs/expanders just aren't very good. This often causes
OS drivers to report errors against the wrong drive - i.e. not
necessarily the one which is the root cause, but others where IOs are
lost - and you can't necessarily tell which was to blame (and probably
don't even realise you are being misled). It happens again if/when
the SATA drive recovers and brings its phy back up. This could cause FMA
to fault out the wrong drives in situations where you genuinely do have a
misbehaving drive, leaving the bad drive online when there's no pool
redundancy left to fault out any more drives.
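Given that mis-attribution problem, it can be worth cross-checking what FMA has diagnosed against the raw per-device error telemetry before trusting a fault. A rough sketch using the usual illumos commands (device names and the grep pattern are only illustrative; exact ereport payloads vary by driver):

```shell
# Per-device soft/hard/transport error counters as seen by the sd driver:
iostat -En

# Faults FMA has actually diagnosed (i.e. the drives it decided to retire):
fmadm faulty

# The underlying error reports (ereports) before diagnosis - useful to see
# whether errors cluster behind one expander/port rather than one disk:
fmdump -eV | grep -i device-path

# Pool-level view of which vdevs accumulated read/write/cksum errors:
zpool status -v
```

If the ereports all share a path through one expander or port multiplier while the counters are spread across several SATA drives, that pattern fits the lost-IO scenario described above better than several drives failing at once.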
Why is this not a problem with SAS drives? Well apparently they don't
drop their phy interfaces anywhere near as easily when such things
happen, because they are designed for use with SAS fabric where doing so
is known to be a problem. Even if they do drop their phy, it doesn't
result in confusing error reports from other drives on the SAS fabric.
Some SAS drives can actually reset and reboot their firmware if it
crashes without the phy interface being dropped.
> Also, which OI/illumos release is that? I read a while ago that some
> mpt_sas bugs had been fixed in illumos.
Somewhere around 18 months ago IIRC, Nexenta pushed a load of fixes for
this into their git repo. I don't think I've seen these picked up yet by
Illumos, although maybe I missed it? The fixes were in mpt_sas and FMA,
to more accurately determine when disks are going bad by pushing the
timing of the SCSI commands right down to the bottom of the stack (so
delays in the software stack are not mistaken for bad drives), and to
have FMA better analyse and handle errors when they do happen.
--
Andrew Gabriel