[OpenIndiana-discuss] ZFS read speed(iSCSI)

Heinrich van Riel heinrich.vanriel at gmail.com
Tue Jun 11 19:19:29 UTC 2013


I don't think they just throw a vanilla copy of the OS on vanilla hardware
for that. Like all storage providers, they will have a set of specifics
around the drivers/OS and firmware, down to the disk level/model in most
cases. We don't have access to that tested interoperability matrix, and I
am sure there are tons of other custom bits. Last I checked, the EMC VNX
still runs Windows, but what does that really mean?

Looking at the COMSTAR documentation for FC, it says to put the adapter in
target mode and allocate LUNs. We are running EMC VNX/CX/DataDomain/Cisco
UCS (combo modules) plus tons of VMware and Windows systems connected to
the same fabric with no problems, using the same adapters. I can't believe
it is the fabric.

When the slowdown happens and I stop all IO from the initiator side, the
Solaris system is unable to reboot. It will say it is stopping system
services, but it stays stuck in that state, with no mention of any problems
during the shutdown or before the reboot.

I took one last stab, since one person mentioned that they are using FC
with OmniOS, and it does work for me, but only with the QLogic card. With
Emulex it also drops the link.
I will just have to accept that I can only connect to a single switch for
now, since we are an Emulex shop and I have only this one qlt card.



On Tue, Jun 11, 2013 at 8:51 AM, Michael Stapleton <
michael.stapleton at techsologic.com> wrote:

> I have no idea what the problem is, but it is worth noting that the last
> time I checked, Oracle's storage arrays were running Solaris and COMSTAR.
>
> Mike
>
> On Mon, 2013-06-10 at 20:36 -0400, Heinrich van Riel wrote:
>
> > Spoke too soon; it died again.
> > I give up. Just posting the result in case someone else runs into issues
> > with the FC target and finds this. Solaris is not even the answer. When
> > it slows down, I kill the copies and wait until there is no more IO; I
> > can see that from the VMware side and from pool IO. When I try to
> > reboot, it is not able to, the same as OI. Clearly a problem with
> > COMSTAR's ability to deal with FC. After a hard reset it will work again
> > for a short bit.
> > Last post.
> > Cheers
> >
> >
> >
> > On Mon, Jun 10, 2013 at 7:36 PM, Heinrich van Riel <
> > heinrich.vanriel at gmail.com> wrote:
> >
> > > Switched to the QLogic adapter using Solaris 11.1. Problem resolved...
> > > well, for now. Not as fast as OI with the Emulex adapter; perhaps it is
> > > the older pool/fs version, since I want to keep my options open for
> > > now. I am getting around 200MB/s when cloning. At least backups can run
> > > for now. We are getting a license for 11.1 for one year; I will worry
> > > about it again after that.
> > > I never had problems with any device connected over FC like this; that
> > > is usually the beauty of it, though expensive. The downside right now
> > > is that the qlt card I have only has a single port.
> > > thanks,
> > >
> > >
> > >
> > > On Mon, Jun 10, 2013 at 2:46 PM, Heinrich van Riel <
> > > heinrich.vanriel at gmail.com> wrote:
> > >
> > >> Just want to provide an update here.
> > >>
> > >> Installed Solaris 11.1 and reconfigured everything. Went back to the
> > >> Emulex card, since it is a dual port, to connect to both switches.
> > >> Same problem; well, the link does not fail, but it is writing at
> > >> 20K/s.
> > >>
> > >>
> > >> I am really not sure what to do anymore, other than to accept that the
> > >> FC target is no longer an option, but I will post in the Oracle
> > >> Solaris forum. Either this has been an issue for some time, or it is a
> > >> hardware combination, or perhaps I am doing something seriously wrong.
> > >>
> > >>
> > >>
> > >>
> > >>
> > >> On Sat, Jun 8, 2013 at 6:57 PM, Heinrich van Riel <
> > >> heinrich.vanriel at gmail.com> wrote:
> > >>
> > >>> I took a look at every server that I knew I could power down, or
> > >>> that is slated for removal in the future, and I found a QLogic
> > >>> adapter not in use.
> > >>>
> > >>> HBA Port WWN: 2100001b3280b
> > >>>         Port Mode: Target
> > >>>         Port ID: 12000
> > >>>         OS Device Name: Not Applicable
> > >>>         Manufacturer: QLogic Corp.
> > >>>         Model: QLE2460
> > >>>         Firmware Version: 5.2.1
> > >>>         FCode/BIOS Version: N/A
> > >>>         Serial Number: not available
> > >>>         Driver Name: COMSTAR QLT
> > >>>         Driver Version: 20100505-1.05
> > >>>         Type: F-port
> > >>>         State: online
> > >>>         Supported Speeds: 1Gb 2Gb 4Gb
> > >>>         Current Speed: 4Gb
> > >>>         Node WWN: 2000001b3280b
> > >>>
> > >>>
> > >>> The link does not go down, but it is useless; right from the start
> > >>> it is as slow as the Emulex was after I made the xfer change.
> > >>> So it is not a driver issue.
> > >>>
> > >>>  capacity    operations    bandwidth
> > >>> alloc free   read write   read write
> > >>> ----- -----  ----- -----  ----- -----
> > >>> 681G 53.8T 5 12 29.9K 51.3K
> > >>> 681G 53.8T 0 0 0 0
> > >>> 681G 53.8T 0 0 0 0
> > >>> 681G 53.8T 0 0 0 0
> > >>> 681G 53.8T 0 0 0 0
> > >>> 681G 53.8T 0 88 0 221K
> > >>> 681G 53.8T 0 0 0 0
> > >>> 681G 53.8T 0 0 0 0
> > >>> 681G 53.8T 0 0 0 0
> > >>> 681G 53.8T 0 0 0 0
> > >>> 681G 53.8T 0 163 0 812K
> > >>> 681G 53.8T 0 0 0 0
> > >>> 681G 53.8T 0 0 0 0
> > >>> 681G 53.8T 0 0 0 0
> > >>> 681G 53.8T 0 0 0 0
> > >>> 681G 53.8T 0 198 0 1.13M
> > >>> 681G 53.8T 0 0 0 0
> > >>> 681G 53.8T 0 0 0 0
> > >>> 681G 53.8T 0 0 0 0
> > >>> 681G 53.8T 0 0 0 0
> > >>> 681G 53.8T 0 88 0 221K
> > >>> 681G 53.8T 0 0 0 0
> > >>> 681G 53.8T 0 0 0 0
> > >>> 681G 53.8T 0 0 0 0
> > >>> 681G 53.8T 0 0 0 0
> > >>> 681G 53.8T 0 187 0 1.02M
> > >>> 681G 53.8T 0 0 0 0
> > >>> 681G 53.8T 0 0 0 0
> > >>> 681G 53.8T 0 0 0 0
> > >>> 681G 53.8T 0 0 0 0
> > >>>
> > >>> This is a clean install of a7 with nothing done other than NIC
> > >>> config in LACP. I did not attempt a reinstall of a5 yet, and probably
> > >>> won't either.
> > >>> I don't know what to do anymore. I was going to try OmniOS, but there
> > >>> is no way of knowing whether it would work.
> > >>>
> > >>>
> > >>> I will see if I can get approved for a Solaris license for one year;
> > >>> if not, I am switching back to Windows Storage Spaces. I can't back
> > >>> up the current lab on the EMC array to this node in any event, since
> > >>> there is no IP connectivity and FC is a dream.
> > >>>
> > >>> Guess I am the only one trying to use it as an FC target, so these
> > >>> problems go unnoticed.
> > >>>
> > >>>
> > >>>
> > >>> On Sat, Jun 8, 2013 at 4:55 PM, Heinrich van Riel <
> > >>> heinrich.vanriel at gmail.com> wrote:
> > >>>
> > >>>> Changing max-xfer-size causes the link to stay up, and no problems
> > >>>> are reported from stmf.
> > >>>>
> > >>>> #       Memory_model       max-xfer-size
> > >>>> #     ----------------------------------------
> > >>>> #       Small              131072 - 339968
> > >>>> #       Medium             339969 - 688128
> > >>>> #       Large              688129 - 1388544
> > >>>> #
> > >>>> # Range:  Min:131072   Max:1388544   Default:339968
> > >>>> #
> > >>>> max-xfer-size=339968;
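> > >>>>
> > >>>> (The setting lives in /kernel/drv/emlxs.conf. I rebooted for it to
> > >>>> take effect; if I recall correctly, forcing the driver to re-read
> > >>>> its .conf should also work:
> > >>>>
> > >>>> update_drv -f emlxs
> > >>>> )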
> > >>>>
> > >>>> As soon as I changed it to 339969 there was no more link loss, but
> > >>>> I should be so lucky that it solves my problem: after a few minutes
> > >>>> it would grind to a crawl, so much so that in VMware it takes well
> > >>>> over a minute to just browse a folder; we are talking a few K/s.
> > >>>>
> > >>>> Setting it to the max causes the link to go down again, and stmf
> > >>>> reports the following again:
> > >>>> FROM STMF:0062568: abort_task_offline called for LPORT: lport abort timed out
> > >>>>
> > >>>> I also played around with the buffer settings.
> > >>>>
> > >>>> Any ideas?
> > >>>> Thanks,
> > >>>>
> > >>>>
> > >>>>
> > >>>>  On Fri, Jun 7, 2013 at 8:38 PM, Heinrich van Riel <
> > >>>> heinrich.vanriel at gmail.com> wrote:
> > >>>>
> > >>>>> New card, different PCI-E slot (removed the other one), different
> > >>>>> FC switch (same model with the same code), older HBA firmware
> > >>>>> (2.72a2) = same result.
> > >>>>>
> > >>>>> On the setting changes: when it boots, it complains that this
> > >>>>> option does not exist: zfs_txg_synctime
> > >>>>> The changes still allowed for a constant write, but at a max of
> > >>>>> 100MB/s, so not much better than iSCSI over 1GbE. I guess I would
> > >>>>> need to increase write_limit_override. If I disable the settings
> > >>>>> again, it shows 240MB/s with bursts up to 300; both stats are from
> > >>>>> VMware's disk perf monitoring while cloning the same VM.
> > >>>>>
> > >>>>> All iSCSI LUNs remain active with no impact.
> > >>>>> So I will conclude, I guess, that this seems to be the problem that
> > >>>>> was there in 2009, from build ~100 to 128. When I search the error
> > >>>>> messages, all posts date back to 2009.
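> > >>>>>
> > >>>>> (In case it helps anyone reproducing this: a sketch of watching the
> > >>>>> target and link state from the storage side while the copies run;
> > >>>>> the egrep patterns just trim the output:
> > >>>>>
> > >>>>> stmfadm list-target -v | egrep 'Target|Operational'
> > >>>>> fcinfo hba-port | egrep 'Port Mode|State|Current Speed'
> > >>>>> )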
> > >>>>>
> > >>>>> I will try one more thing: a reinstall with 151a5, since a server
> > >>>>> that was removed from the env was running it with no issues, though
> > >>>>> with an older Emulex HBA, an LP10000 PCI-X.
> > >>>>> Looking at the notable changes in the release notes past a5, I do
> > >>>>> not see anything that changed that I would think would cause this
> > >>>>> behavior. Would this just be a waste of time?
> > >>>>>
> > >>>>>
> > >>>>>
> > >>>>> On Fri, Jun 7, 2013 at 6:36 PM, Heinrich van Riel <
> > >>>>> heinrich.vanriel at gmail.com> wrote:
> > >>>>>
> > >>>>>> In the debug info I see thousands of the following events:
> > >>>>>>
> > >>>>>> FROM STMF:0149225: abort_task_offline called for LPORT: lport abort timed out
> > >>>>>> FROM STMF:0149225: abort_task_offline called for LPORT: lport abort timed out
> > >>>>>> FROM STMF:0149225: abort_task_offline called for LPORT: lport abort timed out
> > >>>>>> FROM STMF:0149226: abort_task_offline called for LPORT: lport abort timed out
> > >>>>>> FROM STMF:0149226: abort_task_offline called for LPORT: lport abort timed out
> > >>>>>> FROM STMF:0149226: abort_task_offline called for LPORT: lport abort timed out
> > >>>>>> FROM STMF:0149227: abort_task_offline called for LPORT: lport abort timed out
> > >>>>>> FROM STMF:0149227: abort_task_offline called for LPORT: lport abort timed out
> > >>>>>> FROM STMF:0149227: abort_task_offline called for LPORT: lport abort timed out
> > >>>>>> emlxs1:0149228: port state change from 11 to 11
> > >>>>>> FROM STMF:0149228: abort_task_offline called for LPORT: lport abort timed out
> > >>>>>> FROM STMF:0149228: abort_task_offline called for LPORT: lport abort timed out
> > >>>>>> FROM STMF:0149228: abort_task_offline called for LPORT: lport abort timed out
> > >>>>>> :0149228: fct_port_shutdown: port-ffffff1157ff1278, fct_process_logo: unable to clean up I/O. iport-ffffff1157ff1378, icmd-ffffff1195463110
> > >>>>>> FROM STMF:0149229: abort_task_offline called for LPORT: lport abort timed out
> > >>>>>> FROM STMF:0149229: abort_task_offline called for LPORT: lport abort timed out
> > >>>>>> FROM STMF:0149229: abort_task_offline called for LPORT: lport abort timed out
> > >>>>>>
> > >>>>>>
> > >>>>>> And then the following as the port recovers.
> > >>>>>>
> > >>>>>> emlxs1:0150128: port state change from 11 to 11
> > >>>>>> emlxs1:0150128: port state change from 11 to 0
> > >>>>>> emlxs1:0150128: port state change from 0 to 11
> > >>>>>> emlxs1:0150128: port state change from 11 to 0
> > >>>>>> :0150850: fct_port_initialize: port-ffffff1157ff1278, emlxs initialize
> > >>>>>> emlxs1:0150950: port state change from 0 to e
> > >>>>>> emlxs1:0150953: Posting sol ELS 3 (PLOGI) rp_id=fffffd lp_id=22000
> > >>>>>> emlxs1:0150953: Processing sol ELS 3 (PLOGI) rp_id=fffffd
> > >>>>>> emlxs1:0150953: Sol ELS 3 (PLOGI) completed with status 0, did/fffffd
> > >>>>>> emlxs1:0150953: Posting sol ELS 62 (SCR) rp_id=fffffd lp_id=22000
> > >>>>>> emlxs1:0150953: Processing sol ELS 62 (SCR) rp_id=fffffd
> > >>>>>> emlxs1:0150953: Sol ELS 62 (SCR) completed with status 0, did/fffffd
> > >>>>>> emlxs1:0151053: Posting sol ELS 3 (PLOGI) rp_id=fffffc lp_id=22000
> > >>>>>> emlxs1:0151053: Processing sol ELS 3 (PLOGI) rp_id=fffffc
> > >>>>>> emlxs1:0151053: Sol ELS 3 (PLOGI) completed with status 0, did/fffffc
> > >>>>>> emlxs1:0151054: Posting unsol ELS 3 (PLOGI) rp_id=fffc02 lp_id=22000
> > >>>>>> emlxs1:0151054: Processing unsol ELS 3 (PLOGI) rp_id=fffc02
> > >>>>>> emlxs1:0151054: Posting unsol ELS 20 (PRLI) rp_id=fffc02 lp_id=22000
> > >>>>>> emlxs1:0151054: Processing unsol ELS 20 (PRLI) rp_id=fffc02
> > >>>>>> emlxs1:0151055: Posting unsol ELS 5 (LOGO) rp_id=fffc02 lp_id=22000
> > >>>>>> emlxs1:0151055: Processing unsol ELS 5 (LOGO) rp_id=fffc02
> > >>>>>> emlxs1:0151146: Posting unsol ELS 3 (PLOGI) rp_id=21500 lp_id=22000
> > >>>>>> emlxs1:0151146: Processing unsol ELS 3 (PLOGI) rp_id=21500
> > >>>>>> emlxs1:0151146: Posting unsol ELS 20 (PRLI) rp_id=21500 lp_id=22000
> > >>>>>> emlxs1:0151146: Processing unsol ELS 20 (PRLI) rp_id=21500
> > >>>>>> emlxs1:0151146: Posting unsol ELS 3 (PLOGI) rp_id=21600 lp_id=22000
> > >>>>>> emlxs1:0151146: Processing unsol ELS 3 (PLOGI) rp_id=21600
> > >>>>>> emlxs1:0151146: Posting unsol ELS 20 (PRLI) rp_id=21600 lp_id=22000
> > >>>>>> emlxs1:0151146: Processing unsol ELS 20 (PRLI) rp_id=21600
> > >>>>>> emlxs1:0151338: Posting unsol ELS 3 (PLOGI) rp_id=21500 lp_id=22000
> > >>>>>> emlxs1:0151338: Processing unsol ELS 3 (PLOGI) rp_id=21500
> > >>>>>> emlxs1:0151338: Posting unsol ELS 20 (PRLI) rp_id=21500 lp_id=22000
> > >>>>>> emlxs1:0151338: Processing unsol ELS 20 (PRLI) rp_id=21500
> > >>>>>> emlxs1:0151338: Posting unsol ELS 3 (PLOGI) rp_id=21600 lp_id=22000
> > >>>>>> emlxs1:0151338: Processing unsol ELS 3 (PLOGI) rp_id=21600
> > >>>>>> emlxs1:0151338: Posting unsol ELS 20 (PRLI) rp_id=21600 lp_id=22000
> > >>>>>> emlxs1:0151338: Processing unsol ELS 20 (PRLI) rp_id=21600
> > >>>>>> emlxs1:0151428: Posting unsol ELS 3 (PLOGI) rp_id=21500 lp_id=22000
> > >>>>>> emlxs1:0151428: Processing unsol ELS 3 (PLOGI) rp_id=21500
> > >>>>>> emlxs1:0151428: port state change from e to 4
> > >>>>>> emlxs1:0151428: Posting unsol ELS 20 (PRLI) rp_id=21500 lp_id=22000
> > >>>>>> emlxs1:0151428: Processing unsol ELS 20 (PRLI) rp_id=21500
> > >>>>>> emlxs1:0151428: Posting unsol ELS 3 (PLOGI) rp_id=21600 lp_id=22000
> > >>>>>> emlxs1:0151428: Processing unsol ELS 3 (PLOGI) rp_id=21600
> > >>>>>> emlxs1:0151428: Posting unsol ELS 20 (PRLI) rp_id=21600 lp_id=22000
> > >>>>>> emlxs1:0151428: Processing unsol ELS 20 (PRLI) rp_id=21600
> > >>>>>>
> > >>>>>> To be honest, it does not really tell me much, since I do not
> > >>>>>> understand COMSTAR to these depths. It would appear that the link
> > >>>>>> fails, so either a driver problem or a hardware issue? I will
> > >>>>>> replace the LPe11002 with a brand new, unopened one and then give
> > >>>>>> up on FC on OI.
> > >>>>>>
> > >>>>>>
> > >>>>>>
> > >>>>>>
> > >>>>>> On Fri, Jun 7, 2013 at 4:54 PM, Heinrich van Riel <
> > >>>>>> heinrich.vanriel at gmail.com> wrote:
> > >>>>>>
> > >>>>>>> I did find this in my inbox from 2009. I have been using FC with
> > >>>>>>> ZFS for quite some time, and only recently retired an OI a5
> > >>>>>>> install that was upgraded from OpenSolaris. It did not do real
> > >>>>>>> heavy-duty work, but I had a similar problem where we were stuck
> > >>>>>>> on build 99 for quite some time.
> > >>>>>>>
> > >>>>>>> To Jean-Yves Chevallier at Emulex:
> > >>>>>>> Any comments on the future of Emulex with regard to the COMSTAR
> > >>>>>>> project?
> > >>>>>>> It seems I am not the only one that has problems using Emulex in
> > >>>>>>> later builds. For now I am stuck with build 99.
> > >>>>>>> As always, any feedback would be greatly appreciated, since we
> > >>>>>>> have to decide between sticking with OpenSolaris & COMSTAR or
> > >>>>>>> starting to migrate to another solution; we cannot stay on build
> > >>>>>>> 99 forever.
> > >>>>>>> What I am really trying to find out is whether there is a
> > >>>>>>> roadmap/decision to ultimately only support QLogic HBAs in target
> > >>>>>>> mode.
> > >>>>>>>
> > >>>>>>> Response:
> > >>>>>>>
> > >>>>>>>
> > >>>>>>> Sorry for the delay in answering you. I do have news for you.
> > >>>>>>> First off, the interface used by COMSTAR has changed in recent
> > >>>>>>> Nevada releases (NV120 and up, I believe). Since it is not a
> > >>>>>>> public interface, we had no prior indication of this.
> > >>>>>>> We know of a number of issues, some in our driver, some in the
> > >>>>>>> COMSTAR stack. Based on the information we have from you and other
> > >>>>>>> community members, we have addressed all these issues in our next
> > >>>>>>> driver version; we will know for sure after we run our DVT (driver
> > >>>>>>> verification testing) next week. Depending on progress, this
> > >>>>>>> driver will be part of NV 128 or else NV 130.
> > >>>>>>> I believe it is worth taking another look based on these upcoming
> > >>>>>>> builds, which I imagine might also include fixes to the rest of
> > >>>>>>> the COMSTAR stack.
> > >>>>>>>
> > >>>>>>> Best regards.
> > >>>>>>>
> > >>>>>>>
> > >>>>>>> I can confirm that this was fixed in 128; all I did was update
> > >>>>>>> from 99 to 128, and there were no problems.
> > >>>>>>> It seems the same problem has now returned, and Emulex does not
> > >>>>>>> appear to be a good fit, since Sun mostly used QLogic.
> > >>>>>>>
> > >>>>>>> Guess it is back to iSCSI only for now.
> > >>>>>>>
> > >>>>>>>
> > >>>>>>>
> > >>>>>>> On Fri, Jun 7, 2013 at 4:40 PM, Heinrich van Riel <
> > >>>>>>> heinrich.vanriel at gmail.com> wrote:
> > >>>>>>>
> > >>>>>>>> I changed the settings. I do see it writing all the time now,
> > >>>>>>>> but the link still dies after a few minutes.
> > >>>>>>>>
> > >>>>>>>> Jun  7 16:30:57 emlxs: [ID 349649 kern.info] [ 5.0608]emlxs1: NOTICE: 730: Link reset. (Disabling link...)
> > >>>>>>>> Jun  7 16:30:57 emlxs: [ID 349649 kern.info] [ 5.0333]emlxs1: NOTICE: 710: Link down.
> > >>>>>>>> Jun  7 16:33:16 emlxs: [ID 349649 kern.info] [ 5.055D]emlxs1: NOTICE: 720: Link up. (4Gb, fabric, target)
> > >>>>>>>> Jun  7 16:33:16 fct: [ID 132490 kern.notice] NOTICE: emlxs1 LINK UP, portid 22000, topology Fabric Pt-to-Pt, speed 4G
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>> On Fri, Jun 7, 2013 at 3:06 PM, Jim Klimov <jimklimov at cos.ru>
> > >>>>>>>> wrote:
> > >>>>>>>>
> > >>>>>>>>> Comment below
> > >>>>>>>>>
> > >>>>>>>>>
> > >>>>>>>>> On 2013-06-07 20:42, Heinrich van Riel wrote:
> > >>>>>>>>>
> > >>>>>>>>>> One sec apart cloning 150GB vm from a datastore on EMC to OI.
> > >>>>>>>>>>
> > >>>>>>>>>>  capacity    operations    bandwidth
> > >>>>>>>>>> alloc free   read write   read write
> > >>>>>>>>>> ----- -----  ----- -----  ----- -----
> > >>>>>>>>>> 309G 54.2T 81 48 452K 1.34M
> > >>>>>>>>>> 309G 54.2T 0 8.17K 0 258M
> > >>>>>>>>>> 310G 54.2T 0 16.3K 0 510M
> > >>>>>>>>>> 310G 54.2T 0 0 0 0
> > >>>>>>>>>> 310G 54.2T 0 0 0 0
> > >>>>>>>>>> 310G 54.2T 0 0 0 0
> > >>>>>>>>>> 310G 54.2T 0 10.1K 0 320M
> > >>>>>>>>>> 311G 54.2T 0 26.1K 0 820M
> > >>>>>>>>>> 311G 54.2T 0 0 0 0
> > >>>>>>>>>> 311G 54.2T 0 0 0 0
> > >>>>>>>>>> 311G 54.2T 0 0 0 0
> > >>>>>>>>>> 311G 54.2T 0 10.6K 0 333M
> > >>>>>>>>>> 313G 54.2T 0 27.4K 0 860M
> > >>>>>>>>>> 313G 54.2T 0 0 0 0
> > >>>>>>>>>> 313G 54.2T 0 0 0 0
> > >>>>>>>>>> 313G 54.2T 0 0 0 0
> > >>>>>>>>>> 313G 54.2T 0 9.69K 0 305M
> > >>>>>>>>>> 314G 54.2T 0 10.8K 0 337M
> > >>>>>>>>>>
> > >>>>>>>>> ...
> > >>>>>>>>> Were it not for your complaints about link resets and
> > >>>>>>>>> "unusable" connections, I'd say this looks like normal behavior
> > >>>>>>>>> for async writes: they get cached up, and every 5 sec you have a
> > >>>>>>>>> transaction group (TXG) sync which flushes the writes from cache
> > >>>>>>>>> to disks.
> > >>>>>>>>>
> > >>>>>>>>> In fact, the picture still looks like that, and possibly is the
> > >>>>>>>>> reason for the hiccups.
> > >>>>>>>>>
> > >>>>>>>>> The TXG sync may be an IO-intensive process, which may block or
> > >>>>>>>>> delay many other system tasks; previously, when the interval
> > >>>>>>>>> defaulted to 30 sec, we got unusable SSH connections and
> > >>>>>>>>> temporarily "hung" disk requests on the storage server every
> > >>>>>>>>> half a minute when it was really busy (i.e. during the initial
> > >>>>>>>>> filling up with data from older boxes). It cached up about 10
> > >>>>>>>>> seconds' worth of writes, then spewed them out and could do
> > >>>>>>>>> nothing else. I don't think I ever saw network connections
> > >>>>>>>>> timing out or NICs reporting resets due to this, but I wouldn't
> > >>>>>>>>> be surprised if this were the cause in your case, though (i.e.
> > >>>>>>>>> disk IO threads preempting HBA/NIC threads for too long somehow,
> > >>>>>>>>> making the driver very puzzled about the state of its card).
> > >>>>>>>>>
> > >>>>>>>>> At the very least, TXG syncs can be tuned by two knobs: the
> > >>>>>>>>> time limit (5 sec default) and the size limit (when the cache is
> > >>>>>>>>> "this" full, begin the sync to disk). The latter is a realistic
> > >>>>>>>>> figure that can allow you to sync in shorter bursts, with fewer
> > >>>>>>>>> interruptions to smooth IO and process work.
> > >>>>>>>>>
> > >>>>>>>>> A somewhat related tunable is the number of requests that ZFS
> > >>>>>>>>> would queue up for a disk. Depending on its NCQ/TCQ abilities
> > >>>>>>>>> and random IO abilities (HDD vs. SSD), long or short queues may
> > >>>>>>>>> be preferable. See also:
> > >>>>>>>>> http://www.solarisinternals.com/wiki/index.php/ZFS_Evil_Tuning_Guide#Device_I.2FO_Queue_Size_.28I.2FO_Concurrency.29
> > >>>>>>>>>
> > >>>>>>>>> These tunables can be set at runtime with "mdb -K", as well as
> > >>>>>>>>> in the /etc/system file to survive reboots. One of our storage
> > >>>>>>>>> boxes has these example values in /etc/system:
> > >>>>>>>>>
> > >>>>>>>>> *# default: flush txg every 5sec (may be max 30sec, optimize
> > >>>>>>>>> *# for 5 sec writing)
> > >>>>>>>>> set zfs:zfs_txg_synctime = 5
> > >>>>>>>>>
> > >>>>>>>>> *# Spool to disk when the ZFS cache is 0x18000000 (384Mb) full
> > >>>>>>>>> set zfs:zfs_write_limit_override = 0x18000000
> > >>>>>>>>> *# ...for realtime changes use mdb.
> > >>>>>>>>> *# Example sets 0x18000000 (384Mb, 402653184 b):
> > >>>>>>>>> *# echo zfs_write_limit_override/W0t402653184 | mdb -kw
> > >>>>>>>>>
> > >>>>>>>>> *# ZFS queue depth per disk
> > >>>>>>>>> set zfs:zfs_vdev_max_pending = 3
> > >>>>>>>>>
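> > >>>>>>>>> *# A sketch for reading the live values back (assuming these
> > >>>>>>>>> *# symbols exist on your build; decimal output):
> > >>>>>>>>> *# echo zfs_txg_synctime/D | mdb -k
> > >>>>>>>>> *# echo zfs_write_limit_override/E | mdb -k
> > >>>>>>>>> *# echo zfs_vdev_max_pending/D | mdb -k
> > >>>>>>>>>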
> > >>>>>>>>> HTH,
> > >>>>>>>>> //Jim Klimov
> > >>>>>>>>>
> > >>>>>>>>>
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>
> > >>>>>>
> > >>>>>
> > >>>>
> > >>>
> > >>
> > >
>
>
> _______________________________________________
> OpenIndiana-discuss mailing list
> OpenIndiana-discuss at openindiana.org
> http://openindiana.org/mailman/listinfo/openindiana-discuss
>

