[OpenIndiana-discuss] disconnected drives, how to avoid in the future?
Rich
rercola at acm.jhu.edu
Thu Apr 12 21:42:38 UTC 2012
Those patches aren't yet in OI/IL mainline, as of when I looked today.
Regarding when they'll be usable, either in mainline or by fetching
them yourself...
17:33 < PMT> ping Triskelios - I don't suppose you have your pending
patches to mpt_sas (per
http://blogs.everycity.co.uk/alasdair/2011/05/adjusting-drive-timeouts-with-mdb-on-solaris-or-openindiana/)
laying around somewhere easily grabbable?
17:34 <@Triskelios> not at the moment, should land on our public repo
on bitbucket sometime soon
- Rich
On Thu, Apr 12, 2012 at 1:30 PM, Karl Rossing
<karl.rossing at barobinson.com> wrote:
> I'm running into this issue with disconnected drives on snv_134.
>
> Would upgrading to oi_151a2 have the updated mpt_sas driver as noted on
>
> http://blogs.everycity.co.uk/alasdair/2011/05/adjusting-drive-timeouts-with-mdb-on-solaris-or-openindiana/
> "Update (New): These timeouts don’t do squat because mpt_sas doesn’t honour
> the timeouts. This was recently uncovered by Nexenta and a patch to fix it
> is about to hit Illumos shortly. I’ll post when it does. Another patch is in
> progress which will further improve how mpt_sas handles failed drives.
> Thanks to Albert Lee for his work on them - you, sir, rock!"
>
> Karl
>
>
> On 01/10/2012 10:48 AM, Martin Frost wrote:
>>
>> > From: Jason Matthews<jason at broken.net>
>> > Date: Tue, 10 Jan 2012 08:26:08 -0800
>> >
>> >
>> > you can adjust the disk timeouts in solaris.
>>
>> Here's an article on how to do that, although it ends with the author
>> adding this comment "However in testing with failing harddrives (on
>> mpt_sas anyway), we see that the sd timeouts are completely ignored so
>> my entire post above is moot!"
>>
>>
>> http://blogs.everycity.co.uk/alasdair/2011/05/adjusting-drive-timeouts-with-mdb-on-solaris-or-openindiana/
>>
>> I haven't tested this, so does it work or not (in OpenIndiana)?
>>
>> Martin
>>
>> > there are two schools of thought here:
>> >
>> > 1) accommodate the extremely long timeouts of consumer drives and
>> > let the drive decide whether to report an error back (fail itself
>> > out)
>> >
>> > 2) set the timeouts very narrowly and be aggressive in letting zfs
>> > fail out disks.
>> >
>> > i generally go with option 2.
>> >
>> > Sent from Jasons' hand held
>> >
>> > On Jan 10, 2012, at 7:13 AM, Maurilio Longo <maurilio.longo at libero.it> wrote:
>> >
>> > > Geoff,
>> > >
>> > > I've hit this problem several times in the past, with OpenSolaris
>> > > and then with OpenIndiana.
>> > >
>> > > There are, to my knowledge, no available solutions; it is so by
>> > > design!
>> > >
>> > > If a disk stops responding the pool waits until after it responds
>> > > again (sometimes pulling it out of its slot and then reinserting
>> > > the disk causes a reset of the link and it starts working again).
>> > >
>> > > I was not able to assess what happens if I set failmode to
>> > > continue.
>> > >
>> > > I think it could be no better since you still cannot write to the
>> > > pool.
>> > >
>> > > This is IMHO the biggest problem of ZFS, in that I cannot
>> > > instruct it to stop using a failed device if it has some level of
>> > > redundancy still available.
>> > >
>> > > Wait is OK only if an entire vdev stops responding, not if a disk
>> > > in a vdev with redundancy has problems either fatal or
>> > > transitory.
>> > >
>> > > Best regards.
>> > >
>> > > Maurilio.
>> > >
>> > >
>> > > PS. Using server grade disks (those with TLER) makes it possible
>> > > to overcome this problem for transitory errors.
>> > >
>> > >
>> > > Geoff Nordli wrote:
>> > >
>> > >> Part of my concern is why one disk would have completely brought
>> > >> down the system. I have seen this come up on the list before,
>> > >> but I don't remember any resolutions to fixing it.
>> > >>
>> > >> Anyone have any clues to try to prevent this from happening in
>> > >> the future?
>> > >>
>> > >> thanks,
>> > >>
>> > >> Geoff
>>
>> _______________________________________________
>> OpenIndiana-discuss mailing list
>> OpenIndiana-discuss at openindiana.org
>> http://openindiana.org/mailman/listinfo/openindiana-discuss