[OpenIndiana-discuss] Sudden ZFS performance issue

Irek Szczesniak iszczesniak at gmail.com
Fri Jul 5 18:09:45 UTC 2013


On Fri, Jul 5, 2013 at 8:00 PM, Saso Kiselkov <skiselkov.ml at gmail.com> wrote:
> On 05/07/2013 17:08, wim at vandenberge.us wrote:
>> Good morning,
>>
>> I have a weird problem with two of the 15+ OpenSolaris storage servers in our
>> environment. All the Nearline servers are essentially the same: Supermicro
>> X9DR3-F based servers, dual E5-2609s, 64GB memory, dual 10Gb SFP+ NICs, LSI
>> 9200-8e HBA, Supermicro CSE-826E26-R1200LPB storage arrays and Seagate
>> enterprise 2TB SATA or SAS drives (not mixed within a server). Root, L2ARC
>> and ZIL are all on Intel SSDs (SLC series 313 for ZIL, MLC 520 for L2ARC and
>> MLC 330 for boot).
>>
>> The volumes are built out of 9-drive RAID-Z1 groups, and ashift is set to 9
>> (which is supposed to be appropriate for the enterprise Seagates). The pools
>> are large (120-130TB) but are only between 27 and 32% full. Each server
>> serves an iSCSI (Comstar) and a CIFS (in-kernel server) volume from the same
>> pool. I realize this is not optimal from a recovery/resilver/rebuild
>> standpoint, but the servers are replicated and the data is easily
>> rebuildable.
>>
>> Initially these servers did great for several months; while certainly no
>> speed demons, 300+ MB/s for sequential reads/writes was not a problem.
>> Several weeks ago, literally overnight, replication times went through the
>> roof for one server. Simple testing showed that reading from the pool would
>> no longer go over 25 MB/s. Even a scrub that used to run at 400+ MB/s is now
>> crawling along at below 40 MB/s.
>>
>> Sometime yesterday the second server started to exhibit the exact same
>> behaviour. This one is used even less (it's our D2D2T server) and data is
>> written to it at night and read during the day to be written to tape.
>>
>> I've exhausted all I know and I'm at a loss. Does anyone have any ideas of what
>> to look at, or do any obvious reasons for this behaviour jump out from the
>> configuration above?
>
> Is iostat -Exn reporting any transport errors? Smells like a drive
> gone bad and forcing retries, which would cause about a 10x decrease in
> performance. Just a guess, though.
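
One way to run that check on an illumos/OpenIndiana box would be something
like the sketch below (device names are system-specific and the exact output
columns vary by release; all three commands only read statistics):

    # Per-device soft/hard/transport error counters accumulated since boot
    iostat -Exn

    # Sample extended per-device statistics every 5 seconds; one disk with a
    # far higher service time (asvc_t) than its peers often points at a drive
    # that is retrying internally
    iostat -xn 5

    # Dump the FMA error log in case the fault manager has recorded
    # device-level ereports
    fmdump -e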

Why should a retry cause a 10x decrease in performance? A proper
design would surely issue retries in parallel with other operations
(Reiser4 and btrfs do this), up to a certain number of failures in
flight.
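
Independent of that, it may be worth double-checking that the on-disk pool
geometry and ashift really match what was described above. A minimal sketch,
assuming the pool is named "tank" (substitute the real pool name):

    # Show the vdev layout, per-device state and any scrub progress/rate
    zpool status -v tank

    # Dump the cached pool configuration and pull out the per-vdev ashift
    zdb -C tank | grep ashift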

Irek


