[OpenIndiana-discuss] MPT SGL mem alloc failed

Timothy Coalson tsc5yc at mst.edu
Mon Jul 9 20:44:59 UTC 2012


The pool is 24 3TB disks (23 Hitachi Deskstar and 1 Seagate), arranged
as 2 raidz2 groups; the group that dropped included the Seagate.
It is a backup for our other NFS server (conducted nightly via rsync),
and has a single gigabit connection, so it doesn't get used heavily,
and it doesn't need much ARC.  Our workload is not very random at all
(no database or anything), even on the server it backs up.  The main
point is, it was working fine for months on oi_151a4, and months
before that on a3, a2, and a1, and now it is giving a driver error
(which mentions memory, which is why I posted memstat) and dropping
disks on scrub (which it has been running weekly since installation,
without problem until now).  It does happen to be running two (low,
intermittent usage) VMs via VirtualBox, but that was also the case on
151a4 (though I did update VirtualBox when I updated to 151a5).

Tim

On Mon, Jul 9, 2012 at 3:26 PM, Rich <rercola at acm.jhu.edu> wrote:
> I've got a number of mpt_sas-using Supermicro-hardware-running OI
> machines, and have never seen that error, so I'm impressed.
>
> That said, I don't think I'd call that "plenty of memory", depending
> on your dataset size. How many disks and how large are the pools? It's
> quite possible to eat up 24 GB very quickly with enough disk IO
> (assuming e.g. 100 MB/s per disk, perfect transfer efficiency, and no
> other users of your RAM, it'd take 240 disks writing in parallel to
> fill it in one second - in practice, I would not be surprised if half
> that were sufficient).
>
> - Rich
>
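As a rough check of the arithmetic above, here is a minimal Python
sketch. The 24 GB and 100 MB/s figures come from the thread; the
function name, the decimal units, and the extra disk counts (120, 24)
are assumptions for illustration. It only shows how fast data could
stream through, not how much of it the kernel would actually hold.

```python
# Back-of-the-envelope check of the figures quoted above.
RAM_GB = 24            # installed memory mentioned in the thread
PER_DISK_MB_S = 100    # assumed per-disk throughput from the quoted estimate


def seconds_to_fill(ram_gb, disks, per_disk_mb_s=PER_DISK_MB_S):
    """Seconds for `disks` disks streaming in parallel to move `ram_gb` GB.

    Uses decimal units (1 GB = 1000 MB) and assumes perfect efficiency.
    """
    return ram_gb * 1000 / (disks * per_disk_mb_s)


if __name__ == "__main__":
    for n in (240, 120, 24):
        print(f"{n:3d} disks: {seconds_to_fill(RAM_GB, n):.0f} s to move {RAM_GB} GB")
```

With these assumptions, 240 disks move 24 GB in about one second, and
the 24 disks in this pool would need about ten seconds.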
> On Mon, Jul 9, 2012 at 4:17 PM, Timothy Coalson <tsc5yc at mst.edu> wrote:
>> I upgraded a machine to oi_151a5 from oi_151a4 last week, and when its
>> weekly scrub rolled around, /var/adm/messages gathered a lot of these,
>> in groups of dozens at a time:
>>
>> Jul  7 01:15:21 myelin2 scsi: [ID 107833 kern.warning] WARNING: /pci@0,0/pci8086,340a@3/pci1000,30c0@0 (mpt_sas0):
>> Jul  7 01:15:21 myelin2         Unable to allocate dma memory for extra SGL.
>> Jul  7 01:15:21 myelin2 scsi: [ID 107833 kern.warning] WARNING: /pci@0,0/pci8086,340a@3/pci1000,30c0@0 (mpt_sas0):
>> Jul  7 01:15:21 myelin2         MPT SGL mem alloc failed
>>
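To see whether these warnings cluster around scrub times, one option is
to count them per minute. The following is a sketch only: the helper
name and the per-minute bucketing are assumptions, and the two sample
lines are the message lines quoted above. For real use, the lines would
be read from /var/adm/messages instead.

```python
import re
from collections import Counter

# Syslog-style prefix, e.g. "Jul  7 01:15:21 myelin2"; group 1 keeps
# the timestamp down to the minute.
STAMP = re.compile(r"^(\w{3}\s+\d+\s+\d{2}:\d{2}):\d{2}\s+\S+")
NEEDLE = "MPT SGL mem alloc failed"


def failures_per_minute(lines):
    """Count lines containing NEEDLE, keyed by 'Mon DD HH:MM'."""
    counts = Counter()
    for line in lines:
        if NEEDLE in line:
            m = STAMP.match(line)
            if m:
                counts[m.group(1)] += 1
    return counts


# Two message lines copied from the log excerpt in this thread.
sample = [
    "Jul  7 01:15:21 myelin2         Unable to allocate dma memory for extra SGL.",
    "Jul  7 01:15:21 myelin2         MPT SGL mem alloc failed",
]

if __name__ == "__main__":
    for minute, n in sorted(failures_per_minute(sample).items()):
        print(minute, n)
```

Comparing the busiest minutes against the scrub start time would show
whether the failures track IO load.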
>> zpool status showed a lot of failed reads, and ZFS dropped all the
>> disks on one of the two HBAs.  Under oi_151a4, I am fairly certain
>> these messages did not show up (there are none in /var/adm/messages.*,
>> which has entries from June 11 onward; the upgrade was on July 3).
>> After a zpool clear it accepted the disks again, and the resilver
>> didn't need to correct much, but another scrub caused the same
>> problem again.  It is running a pair of LSI 9201-16i HBAs connected
>> via fanout cables to SATA disks, and appears to have plenty of free
>> memory:
>>
>> tim@myelin2:~$ echo ::memstat | sudo mdb -k
>> Page Summary                Pages                MB  %Tot
>> ------------     ----------------  ----------------  ----
>> Kernel                    4553408             17786   72%
>> ZFS File Data              193505               755    3%
>> Anon                       108432               423    2%
>> Exec and libs                1351                 5    0%
>> Page cache                   5734                22    0%
>> Free (cachelist)            22007                85    0%
>> Free (freelist)           1402689              5479   22%
>>
>> Total                     6287126             24559
>> Physical                  6287125             24559
>>
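The memstat figures above can be sanity-checked with a short Python
sketch. The page counts are copied from the output in this thread; the
4096-byte page size and the parsing helper are assumptions (the page
size is the usual x86 base page, and it does reproduce the MB column
when truncated).

```python
PAGE_SIZE = 4096  # bytes; assumed x86 base page size

# Data rows copied from the ::memstat output quoted in this thread.
MEMSTAT = """\
Kernel                    4553408             17786   72%
ZFS File Data              193505               755    3%
Anon                       108432               423    2%
Exec and libs                1351                 5    0%
Page cache                   5734                22    0%
Free (cachelist)            22007                85    0%
Free (freelist)           1402689              5479   22%
"""


def parse_memstat(text):
    """Return {category: pages} for each data row of the table."""
    rows = {}
    for line in text.splitlines():
        # Category names may contain spaces, so split from the right:
        # the last three fields are pages, MB, and percent.
        parts = line.rsplit(None, 3)
        if len(parts) == 4 and parts[3].endswith("%"):
            rows[parts[0].strip()] = int(parts[1])
    return rows


rows = parse_memstat(MEMSTAT)
total = sum(rows.values())

if __name__ == "__main__":
    for name, pages in rows.items():
        mb = pages * PAGE_SIZE // 2**20
        print(f"{name:<18}{mb:>8} MB {100 * pages / total:5.1f}%")
    print(f"{'Total':<18}{total * PAGE_SIZE // 2**20:>8} MB")
```

The row sum matches the reported total of 6287126 pages, with roughly
5.4 GB on the freelist and about 72% attributed to the kernel.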
>> The errors in /var/adm/messages continue to show up, even while not
>> scrubbing, though less often, but zfs only seems to see problems when
>> scrubbing (or possibly any heavy IO load, but this machine doesn't get
>> much of that otherwise).  Any ideas on chasing this down?  Otherwise,
>> I plan to boot it back into 151a4 and try and reproduce the problem,
>> to check if the update is to blame.
>>
>> Tim
>>
>> _______________________________________________
>> OpenIndiana-discuss mailing list
>> OpenIndiana-discuss at openindiana.org
>> http://openindiana.org/mailman/listinfo/openindiana-discuss
