[OpenIndiana-discuss] bmc-watchdog SMF dependencies
Gordon Ross
gordon.w.ross at gmail.com
Wed Jun 27 05:34:25 UTC 2012
On Tue, Jun 26, 2012 at 4:26 PM, Jim Klimov <jimklimov at cos.ru> wrote:
> Hello all, I've got a new small matter for generic discussion:
> Now, in OpenSolaris (and portable to OI) there is a bmc-watchdog
> package for some proprietary hardware implementations, and a newly
> ported open-source driver is brewing as well. There is also an SMF
> service to wrap the watchdog. From what I see, upon service startup
> the HW timer is started, and for the duration of the service uptime
> the timer-resets are regularly issued. Upon service shutdown there
> are two possible approaches (as I tweaked the method script a lot,
> I am not sure what was there originally): either the daemon for
> regularly-restarting the timer is just killed (and the timer keeps
> ticking), or the timer is also stopped. On my system it happened
> to be the former, and during a shutdown which took longer than usual
> to proceed (for valid reasons), the box got reset by the timer.
>
> Now I looked at the SMF manifest, and see that the service only
> depends on filesystem/usr. In my practice this meant that upon OS
> shutdown, the bmc-watchdog daemon was quickly killed (as nothing
> depends on this service) and the timer ticked down to zero - boom!
>
> Question is: what is a valid way to avoid the watchdog killing
> the system upon lengthy shutdowns?
>
> I came up with a few ideas:
> 1) Redefine the stop method to not kill the daemon - not good, for
> pedantic reasons at least ;)
> 2) Redefine the stop method to kill the daemon and stop the timer -
> not good because the box can potentially also freeze during
> shutdown, and in that case we would want it automagically reset;
> 3) Make milestone/single-user a dependency of bmc-watchdog (I also
> tried to redefine bmc-watchdog instance to have a dependent -
> but this did not get picked up properly) - in this case the
> daemon works until all heavy services in milestone/multi-user
> get shut down properly, and only gets killed then (and the HW
> timer ticks for a few more seconds, until the system is rebooted).
4) Run your watchdog reset program outside of the SMF contract for
the service, i.e. "ctrun -l none bmc-watchdog", so that it will not
be killed when the service is stopped.
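
As a rough illustration of option 4, the start method could wrap the
daemon roughly as below. This is only a sketch: the binary path, the
WATCHDOG_BIN and DRY_RUN variables and the start_watchdog function are
assumptions made for the example, not what the packaged method script
contains. The only part taken from above is "ctrun -l none", which
starts the command in a new process contract and lets ctrun exit right
away, leaving that contract orphaned.

```shell
#!/bin/sh
# Sketch of a start-method fragment for option 4 (hypothetical names).

# Assumed location of the daemon; adjust to wherever your package installs it.
WATCHDOG_BIN="${WATCHDOG_BIN:-/usr/sbin/bmc-watchdog}"

# DRY_RUN=1 (the default here) only prints the command, so the fragment
# can be checked on a box without ctrun. Set DRY_RUN=0 in a real method.
DRY_RUN="${DRY_RUN:-1}"

start_watchdog() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "ctrun -l none $WATCHDOG_BIN"
    else
        # New contract, detached from the service's own contract, so
        # stopping the service does not signal the daemon.
        ctrun -l none "$WATCHDOG_BIN"
    fi
}
```

Calling start_watchdog with DRY_RUN=1 just prints the command line it
would run. The trade-off is that SMF no longer tracks the daemon once
it lives outside the service contract, so it will not be restarted if
it dies; that is the price for it surviving until the very end of
shutdown.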
--
Gordon Ross <gwr at nexenta.com>
Nexenta Systems, Inc. www.nexenta.com
Enterprise class storage for everyone