[OpenIndiana-discuss] oi_151a3 svc.startd chewing up memory

Jim Klimov jimklimov at cos.ru
Tue May 29 20:08:52 UTC 2012


2012-05-30 0:00, Milan Jurik wrote:
> Jim,
>
> Jim Klimov wrote on Tue, 29 May 2012 at 04:16 +0400:
>> 2012-05-29 3:39, Jim Klimov wrote:
>>> Hello all,
>>>
>>> On a test box running OI oi_151a3 I found that it cannot restart
>>> services because svc.startd has grown to 2500M in VM size. As a
>>> consequence, fork() fails due to insufficient free VM (swap space)
>>> and services cannot start.
>>>
>>> I guess there is a leak somewhere. Was anything like that fixed in
>>> the past month or so?
>>
>> A bit more research points to an old problem, marked as fixed:
>> http://defect.opensolaris.org/bz/show_bug.cgi?id=15761
>>
>> This box did have a pkg/server (startd/duration=child) instance
>> which did not start well and was left unattended. From what I see
>> now, a daemon is started and in particular grabs the TCP port,
>> but SMF thinks the service has failed and restarts it. Further
>> invocations fail because the port is busy, but the service does
>> not go into maintenance. I see svc.startd occupying more RAM at
>> a rate of about 1 MB every 1-2 minutes (glancing at top).
>>
>> The symptomatic part can be fixed by simply changing the base
>> pkg/server startd/duration attribute from "child" to "contract",
>> but the core problem - svc.startd leaking memory in case of such
>> unlimited restarts - is still in place. Also, for the past ten
>> minutes or so since I fixed the pkg/server, the restarter hasn't
>> released a byte ;)
>>
>
> Is https://www.illumos.org/issues/2801 your problem? :-)

Well, I did not trace that yet, but roughly agree that the
description might fit ;)

I believe the core of the problem is not pkg/server itself;
it merely exposes it. I think the problem is in some likely
infinite loop used to restart failed "child" services, since
those failures do not put the service into maintenance (by def?)
and it keeps being restarted instead. Maybe some recursion
happens instead of a loop? That could eat up RAM at least... i.e.:
* initial start
** detected an error condition
*** start again
**** detected an error condition
***** start again
****** detected an error condition
******* start again
...

instead of
* initial start
** detected an error condition
* start again
** detected an error condition
* start again
** detected an error condition
* start again
...
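
To make the two patterns above concrete, here is a minimal Python
sketch. It is only an illustration of the guess, not svc.startd code
(which I have not traced): try_start(), restart_recursive() and
restart_loop() are hypothetical names, and the "service" is assumed
to fail on every start. Each start attempt records the current
call-stack depth, so the growth between the first and last attempt
shows whether state piles up per restart.

```python
import inspect


def try_start(depths):
    """Hypothetical service start that always fails.

    Records the call-stack depth at the moment of the attempt.
    """
    depths.append(len(inspect.stack()))
    return False


def restart_recursive(depths, retries_left):
    """Retry from inside the error path: every retry adds a frame."""
    if try_start(depths):
        return
    if retries_left > 0:
        restart_recursive(depths, retries_left - 1)


def restart_loop(depths, retries):
    """Retry from the same frame: nothing accumulates between attempts."""
    for _ in range(retries + 1):
        if try_start(depths):
            return


def stack_growth(depths):
    """Difference in stack depth between the last and the first attempt."""
    return depths[-1] - depths[0]


recursive_depths, loop_depths = [], []
restart_recursive(recursive_depths, 5)
restart_loop(loop_depths, 5)

print(stack_growth(recursive_depths))  # 5: one extra frame per retry
print(stack_growth(loop_depths))       # 0: depth stays constant
```

With an unlimited number of restarts, the first variant keeps growing
for as long as the service keeps failing, while the second stays flat.
A real leak in the restarter could of course have a different cause
(for example heap allocations that are never freed on each restart);
the sketch only shows why the nested pattern would eat memory.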

//Jim


