[OpenIndiana-discuss] Pool I/O
Joe Hetrick
jhetrick at bitjanitor.net
Fri May 8 20:16:31 UTC 2015
Tracked it down to about three gvfsd-metadata processes, maybe... I can't decide whether they were victims or root causes.
Shooting those in the head brought things back. I didn't see how our DCS3700s were buried, though; it appeared to me that pool I/O was effectively blocked, so I don't know whether the DDRdrives would have had any effect.
I would still like to be edumacated on a way to acquire a bit more insight into what the pool was busy waiting for when the spindles were so idle. I have no doubt NFS was suffering, but my number of threads was not at max and the system was relatively idle; I just couldn't get anything written to disk in a timely fashion.
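In the meantime, here's roughly where I plan to start poking next time. This is only a sketch, not something I've validated: zil_commit() and spa_sync() are the illumos function names I'd expect the fbt provider to see, but they may differ on a given build, so verify the probes with dtrace -l | grep zil_commit first.

#!/usr/sbin/dtrace -s
/* Rough sketch: how long synchronous commits (zil_commit) and txg syncs
 * (spa_sync) take, as nanosecond histograms, printed every 10 seconds. */

fbt::zil_commit:entry
{
        self->zc = timestamp;
}

fbt::zil_commit:return
/self->zc/
{
        @lat["zil_commit (ns)", execname] = quantize(timestamp - self->zc);
        self->zc = 0;
}

fbt::spa_sync:entry
{
        self->ss = timestamp;
}

fbt::spa_sync:return
/self->ss/
{
        @lat["spa_sync (ns)", "-"] = quantize(timestamp - self->ss);
        self->ss = 0;
}

tick-10s
{
        printa(@lat);
        trunc(@lat);
}

The rough idea: if the time piles up in zil_commit, the stall is in the sync/log path; if it piles up in spa_sync, it's the transaction group sync side.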
J
On 08 May 13:10, jason matthews wrote:
>
>
>
> sounds like it is blocking on NFS :-)
>
> Ask Chris for a try/buy DDRdrive X1 or whatever the latest
> concoction is... it could be life-changing for you.
>
> j.
>
> On 5/8/15 11:32 AM, Joe Hetrick wrote:
> >Today I played a bit with setting sync=disabled after watching write IOPS on a few filesystems. I can't decide if I've found a particular group of users with a new (more abusive) set of jobs.
> >
> >I'm looking more and more, and I've turned sync off on a handful of filesystems that are showing a sustained high number of write I/Os; when those filesystems are bypassing the ZIL, everything is happy. The ZIL devices are never in %w, the pool %b coincides with spindle %b (which is almost never higher than 50 or so), and things are streaming nicely.
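As an aside, here's a sketch of one way to see which paths are taking the sustained writes. I haven't run this as-is; the probe and argument names (op-write-start, args[1]->noi_curpath, args[2]->data.data_len) are from the illumos nfsv3 provider docs, so verify them on your build before trusting the numbers.

#!/usr/sbin/dtrace -s
#pragma D option quiet
/* Sketch: NFSv3 write bytes and write counts by file path, printed
 * every 30 seconds.  noi_curpath can come back as "<unknown>" for
 * some file handles. */

nfsv3:::op-write-start
{
        @bytes[args[1]->noi_curpath] = sum(args[2]->data.data_len);
        @writes[args[1]->noi_curpath] = count();
}

tick-30s
{
        printf("\n%-56s %12s %8s\n", "PATH", "BYTES", "WRITES");
        printa("%-56s %@12d %@8d\n", @bytes, @writes);
        trunc(@bytes);
        trunc(@writes);
}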
> >
> >Does anyone have any dtrace that I could use to poke into just what the pool is blocking on when these other workloads are in play? Looking at nfsv3 operations, I see a very large number of:
> >create
> >setattr
> >write
> >modify
> >rename
> >
> >and sometimes remove
> >and I suspect these users are doing something silly at HPC scale.
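Partly answering my own question inline: a counting script along these lines is where I'd start. Again, only a sketch; the probe names (op-create-start and friends) and args[0]->ci_remote are per the illumos nfsv3 provider docs, worth confirming with dtrace -ln 'nfsv3:::' first.

#!/usr/sbin/dtrace -s
#pragma D option quiet
/* Sketch: count NFSv3 metadata-heavy operations by client and op type,
 * printed every 30 seconds. */

nfsv3:::op-create-start,
nfsv3:::op-setattr-start,
nfsv3:::op-write-start,
nfsv3:::op-rename-start,
nfsv3:::op-remove-start
{
        @ops[args[0]->ci_remote, probename] = count();
}

tick-30s
{
        printf("\n%-20s %-24s %10s\n", "CLIENT", "OP", "COUNT");
        printa("%-20s %-24s %@10d\n", @ops);
        trunc(@ops);
}

If one or two clients dominate the create/setattr/rename counts, that's probably the job to go chase.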
> >
> >
> >Thanks!
> >
> >Joe
> >
> >
> >>Hi all,
> >>
> >>
> >> We've recently run into a situation where I'm seeing the pool at 90-100 %b and our ZILs at 90-100 %w, yet all of the spindles are relatively idle. Furthermore, local I/O is normal, and testing can quickly and easily put both the pool and the spindles in the vdevs into high activity.
> >>
> >> The system is primarily accessed via NFS (it's the home server for an HPC environment). We've had users do evil things before to cause pain, but this is most odd, as I would only expect this behavior if we had a faulty device in the pool with a high %b (we don't), or if we had some sort of COW-related issue, such as being below ~15% free space. In this case, we are less than half full on a 108TB raidz3 pool.
> >>
> >> latencytop shows a lot of ZFS ZIL Writer latency, but that's to be expected given what I see above. Pool I/O in zpool iostat is normal-ish, and as I said, simple raw writes to the pool show expected performance when done locally.
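(Inline note: the other thing worth measuring here would be the latency the clients actually see on WRITE and COMMIT, to line it up against the ZIL writer latency from latencytop. A sketch, assuming the documented nfsv3 provider probes and the noi_xid field for matching start to done:)

#!/usr/sbin/dtrace -s
/* Sketch: server-side latency of NFSv3 WRITE and COMMIT operations,
 * matched start-to-done by transaction id, as microsecond histograms. */

nfsv3:::op-write-start,
nfsv3:::op-commit-start
{
        start_ts[args[1]->noi_xid] = timestamp;
}

nfsv3:::op-write-done,
nfsv3:::op-commit-done
/start_ts[args[1]->noi_xid]/
{
        @lat[probename] = quantize((timestamp - start_ts[args[1]->noi_xid]) / 1000);
        start_ts[args[1]->noi_xid] = 0;
}

tick-10s
{
        printa(@lat);
        trunc(@lat);
}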
> >>
> >> Does anyone have any ideas?
> >>
> >>Thanks,
> >>
> >>Joe
> >>
> >>--
> >>Joe Hetrick
> >>perl -e 'print pack(h*,a6865647279636b604269647a616e69647f627e2e65647a0)'
> >>BOFH Excuse: doppler effect
> >>
> >
>
>
--
Joe Hetrick
perl -e 'print pack(h*,a6865647279636b604269647a616e69647f627e2e65647a0)'
BOFH Excuse: old inkjet cartridges emanate barium-based fumes