[OpenIndiana-discuss] Kernel panic on hung zpool accessed via lofi
Andrew Gabriel
illumos at cucumber.demon.co.uk
Wed Sep 16 20:42:14 UTC 2015
On 16/09/2015 19:24, Nikola M wrote:
> On 09/11/15 08:57 PM, Watson, Dan wrote:
>> I'm using mpt_sas with SATA drives, and I _DO_ have error counters
>> climbing for some of those drives - could that be the cause?
>> Any other ideas?
>
> It is generally strongly advised to use SATA disks on SATA controllers
> and SAS disks on SAS controllers, and to use a controller that can do JBOD.
>
> Also, using SAS-to-SATA multipliers, or port multipliers at all, is
> strongly discouraged, because they usually contain cheap logic that can
> go crazy, and the disk is then not under the direct control of the
> controller.
A disk interface specialist was telling me earlier today what goes wrong
here. The problem is that many SATA drives drop the phy interface when
they have some internal problem, even just retrying transfers. Normally
that doesn't matter a scrap when they are connected 1-to-1 to a SATA
controller. However, if they are connected to SAS fabric, it will cause
the SAS fabric to re-enumerate all the drives at least at that port
multiplier level, likely losing outstanding IOs on other drives, most
particularly other SATA drives as implementations of STP (SATA Tunneling
Protocol) in SAS HBAs/expanders just aren't very good. This often causes
OS drivers to report errors against the wrong drive - i.e. not
necessarily the one which is the root cause, but others where IOs are
lost - and you can't necessarily tell which was to blame (and probably
don't even realise you are being misled). It happens again if/when
the SATA drive recovers and brings its phy back up. This could cause FMA
to fault out the wrong drives in situations where you genuinely do have a
misbehaving drive, leaving the bad drive online when there's no pool
redundancy left to fault out any more drives.
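Given that mis-attribution problem, it can be worth cross-checking what FMA has diagnosed against the raw per-device error telemetry before trusting a fault. A rough sketch using the usual illumos commands (device names and the grep pattern are only illustrative; exact ereport payloads vary by driver):

```shell
# Per-device soft/hard/transport error counters as seen by the sd driver:
iostat -En

# Faults FMA has actually diagnosed (i.e. the drives it decided to retire):
fmadm faulty

# The underlying error reports (ereports) before diagnosis - useful to see
# whether errors cluster behind one expander/port rather than one disk:
fmdump -eV | grep -i device-path

# Pool-level view of which vdevs accumulated read/write/cksum errors:
zpool status -v
```

If the ereports all share a path through one expander or port multiplier while the counters are spread across several SATA drives, that pattern fits the lost-IO scenario described above better than several drives failing at once.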
Why is this not a problem with SAS drives? Well apparently they don't
drop their phy interfaces anywhere near as easily when such things
happen, because they are designed for use with SAS fabric where doing so
is known to be a problem. Even if they do drop their phy, it doesn't
result in confusing error reports from other drives on the SAS fabric.
Some SAS drives can actually reset and reboot their firmware if it
crashes without the phy interface being dropped.
> Also, which OI/illumos release is that? I read a while ago that some
> mpt_sas bugs had been fixed in illumos.
Somewhere around 18 months ago IIRC, Nexenta pushed a load of fixes for
this into their git repo. I don't think I've seen these picked up yet by
Illumos, although maybe I missed it? The fixes were in mpt_sas and FMA,
to more accurately determine when disks are going bad by pushing the
timing of the SCSI commands right down to the bottom of the stack (so
delays in the software stack are not mistaken for bad drives), and to
have FMA better analyse and handle errors when they do happen.
--
Andrew Gabriel