[OpenIndiana-discuss] Proposed Samba updates

Jim Klimov jimklimov at cos.ru
Mon Jun 10 13:32:39 UTC 2013


On 2013-06-10 10:57, Christopher Chan wrote:
> Joking aside, the smbd processes dropping out of the control of the main
> smbd process can take up to half a day before it happens and a week to
> build up an appreciable number to be alarmed and they have also started
> within an hour and taken less than a day to set off the alarms and build

I wonder if it makes sense to create and package (and install into cron
or something else) automated test suites for known bug cases like this?
On our deployments we have different sorts of health-monitoring scripts,
mostly for known problematic cases with web servers and web application
servers, for hanging VirtualBox VMs, and so on. Such scripts can quickly
detect that a service has degraded or broken down outright and restart
it (and we admins can work with our developers or other software
authors to track and fix the problem). Overall this leads to a higher
proportion of uptime even with buggy software that we are not in a
position to fix or replace.

If the breakage scenario is known, you can make a script to test for
the symptoms (e.g. the number of smbd processes over a threshold, maybe
response times and disk IO loads, or the amount of context-switching
between thousands of processes bringing the whole server down to a
crawl) and actively send alarms to the local admin, who can forward
them or otherwise work with upstream to report and fix problems. This
would help automate testing for known elusive problems at least: if
the problem does not happen everywhere, more deployments which have
something in common can help track down the causes.
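
To make the idea concrete, here is a rough sketch of such a check for the
first symptom only (too many smbd processes). It is not a tested tool. The
threshold of 200 and the function names are made up for illustration, and
it assumes that "ps -e -o comm=" prints one command name or path per line
on the target system.

```python
#!/usr/bin/env python3
"""Sketch of a cron-driven check for runaway smbd processes."""
import subprocess
import sys

PROCESS_NAME = "smbd"  # process whose instances are counted
THRESHOLD = 200        # hypothetical limit; tune it per deployment


def count_processes(ps_output, name):
    """Count lines of ps output whose command basename equals name."""
    count = 0
    for line in ps_output.splitlines():
        command = line.strip()
        # Accept both "smbd" and a full path such as "/usr/sbin/smbd".
        if command and command.rsplit("/", 1)[-1] == name:
            count += 1
    return count


def build_alarm(count, threshold, name):
    """Return an alarm message, or None while count is within threshold."""
    if count > threshold:
        return "ALARM: %d %s processes (threshold %d)" % (count, name, threshold)
    return None


def main():
    """Run ps once, print an alarm to stderr if needed, return an exit code."""
    # Assumption: this ps invocation lists one command per line, no header.
    ps_output = subprocess.run(
        ["ps", "-e", "-o", "comm="],
        capture_output=True, text=True, check=True,
    ).stdout
    count = count_processes(ps_output, PROCESS_NAME)
    alarm = build_alarm(count, THRESHOLD, PROCESS_NAME)
    if alarm:
        print(alarm, file=sys.stderr)
        return 1
    return 0
```

To run it from cron, end the file with a call such as sys.exit(main()).
Cron normally mails any output of a job to its owner, so a silent run means
"healthy" and the alarm line arrives by mail. Checks for response times or
disk IO could be added as further functions in the same style.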

My 2c,
//Jim



