[OpenIndiana-discuss] Kernel panic on hung zpool accessed via lofi

Rich Murphey rich at whiteoaklabs.com
Wed Sep 16 19:44:59 UTC 2015


I'm also seeing 'deadman' panics caused by failing drives, not only with
SATA drives but also with SAS and SATA NAS drives.

In my limited experience, I've found SMART stats very effective in
resolving drive issues, using five specific metrics recommended by
Backblaze (below). By eliminating drives that have non-zero values for
any of these specific metrics, the panics (for me) were eliminated.

I mention this also because drives that are gradually failing can cause
intermittent hangs, and cause one to suspect SAS cables, expanders, etc.
I don't want to discourage you from swapping other parts to try to resolve
issues, but I'd suggest looking at the SMART metrics as well.

Best regards,
Rich


   - SMART 5 – Reallocated_Sector_Count.
   - SMART 187 – Reported_Uncorrectable_Errors.
   - SMART 188 – Command_Timeout.
   - SMART 197 – Current_Pending_Sector_Count.
   - SMART 198 – Offline_Uncorrectable.

https://www.backblaze.com/blog/hard-drive-smart-stats/
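
As a rough illustration of the check described above, here is a minimal
Python sketch that scans the attribute table printed by smartmontools'
`smartctl -A` and reports any of the five attributes whose raw value is
non-zero. The function name, the assumption that the raw value sits in the
tenth column, and the sample table are all illustrative and not taken from
any real drive; some drives report these attributes differently or not at
all, so treat it as a starting point only.

```python
import re

# The five SMART attribute IDs listed above.
WATCHED_IDS = {5, 187, 188, 197, 198}


def flag_smart_attributes(smartctl_output):
    """Return {attribute_id: raw_value} for watched attributes whose raw value is non-zero.

    Assumes the usual `smartctl -A` table layout, where the raw value is
    the tenth whitespace-separated column of each attribute row.
    """
    flagged = {}
    for line in smartctl_output.splitlines():
        fields = line.split()
        # Attribute rows start with a numeric ID and have at least 10 columns.
        if len(fields) < 10 or not fields[0].isdigit():
            continue
        attr_id = int(fields[0])
        if attr_id not in WATCHED_IDS:
            continue
        # Some drives append extra text to the raw value; keep the leading integer.
        match = re.match(r"\d+", fields[9])
        if match and int(match.group()) != 0:
            flagged[attr_id] = int(match.group())
    return flagged


# Made-up sample output for illustration only (not from a real drive).
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always       -       0
  9 Power_On_Hours          0x0032   095   095   000    Old_age   Always       -       4321
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       8
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       0
"""

if __name__ == "__main__":
    print(flag_smart_attributes(SAMPLE))
```

In practice the text would come from running `smartctl -A` against each
device (the device path and any `-d` device-type option depend on the
controller), and any drive with a non-empty result would be a candidate
for replacement.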




On Wed, Sep 16, 2015 at 1:50 PM Watson, Dan <Dan.Watson at bcferries.com>
wrote:

> I know it's not the best route to go, but for personal use and on a budget,
> SATA drives on SAS expanders are much easier to achieve.  Used 3Gbit SAS trays
> with expander can be had for $12/drive bay, while for retail there is still
> a significant price jump going from SATA interface to SAS. And even drives
> like the new Seagate 8TB "Cloud backup" drive don't have a SAS option,
> although now that I think about it they are probably marketed more towards
> "personal cloud" devices than actual datacenter based cloud services.
>
> Also the newer SATA drives are much less disruptive in a SAS tray than the
> early capacity drives. I've noticed that drives with a labeled WWN tend to
> be less error prone, and only when a drive completely dies do you get the
> cascading bus reset that kills all IO. Just don't daisy chain the SAS
> expanders/trays because that seems to introduce significant errors.
>
> This is an updated fresh install of OI. I'm not using any special
> publisher so I imagine it's somewhat out of date.
>
> I've managed to get the zpool working using the read-only import option
> mentioned previously and it seems to be working fine. I'm betting I just
> did not have enough RAM available to do dedupe.
>
> Thanks!
> Dan
>
> -----Original Message-----
> From: Nikola M [mailto:minikola at gmail.com]
> Sent: September 16, 2015 11:25 AM
> To: Discussion list for OpenIndiana
> Subject: Re: [OpenIndiana-discuss] Kernel panic on hung zpool accessed via
> lofi
>
> On 09/11/15 08:57 PM, Watson, Dan wrote:
> > I'm using mpt_sas with SATA drives, and I _DO_ have error counters
> > climbing for some of those drives, is it probably that?
> > Any other ideas?
>
> It is generally strongly advised to use SATA disks on SATA controllers
> and SAS disks on SAS controllers, and to use a controller that can do JBOD.
>
> Also, using SAS-to-SATA multipliers, or port multipliers at all, is
> strongly discouraged, because they usually contain cheap logic that can
> go crazy, and the disk is not under direct control of the controller.
>
> Also, which OI/illumos version is that? I read long ago that some
> mpt_sas bugs were fixed in illumos.
>
> The first two issues could be hardware problems, and such a config is
> usually unsupportable (I know it is not supported on SmartOS); the third
> issue could be looked into further.
>
>
> _______________________________________________
> openindiana-discuss mailing list
> openindiana-discuss at openindiana.org
> http://openindiana.org/mailman/listinfo/openindiana-discuss
>

