[OpenIndiana-discuss] bmc-watchdog SMF dependencies

Gordon Ross gordon.w.ross at gmail.com
Wed Jun 27 05:34:25 UTC 2012


On Tue, Jun 26, 2012 at 4:26 PM, Jim Klimov <jimklimov at cos.ru> wrote:
> Hello all, I've got a new small matter for generic discussion:

>  Now, in OpenSolaris (and portable to OI) there is a bmc-watchdog
> package for some proprietary hardware implementations, and a newly
> ported open-source driver is brewing as well. There is also an SMF
> service to wrap the watchdog. From what I see, upon service startup
> the HW timer is started, and for the duration of the service uptime
> the timer-resets are regularly issued. Upon service shutdown there
> are two possible approaches (as I tweaked the method script a lot,
> I am not sure what was there originally): either the daemon for
> regularly-restarting the timer is just killed (and the timer keeps
> ticking), or the timer is also stopped. On my system it happened
> to be the former, and during a shutdown which took longer than usual
> to proceed (for valid reasons), the box got reset by the timer.
>
>  Now I looked at the SMF manifest, and see that the service only
> depends on filesystem/usr. In my practice this meant that upon OS
> shutdown, the bmc-watchdog daemon was quickly killed (as nothing
> depends on this service) and the timer ticked down to zero - boom!
>
>  Question is: what is a valid way to avoid the watchdog killing
> the system upon lengthy shutdowns?
>
>  I came up with a few ideas:
> 1) Redefine the stop method to not kill the daemon - not good, for
>   pedantic reasons at least ;)
> 2) Redefine the stop method to kill the daemon and stop the timer -
>   not good because the box can potentially also freeze during
>   shutdown, and in that case we would want it automagically reset;
> 3) Make milestone/single-user a dependency of bmc-watchdog (I also
>   tried to redefine bmc-watchdog instance to have a dependent -
>   but this did not get picked up properly) - in this case the
>   daemon works until all heavy services in milestone/multi-user
>   get shut down properly, and only gets killed then (and the HW
>   timer ticks for a few more seconds, until the system is rebooted).

4)  Run your watchdog reset program outside of the service's SMF
     contract, e.g. "ctrun -l none bmc-watchdog", so it will not be
     killed when the service is stopped.
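
A minimal sketch of how a start method could do that. The function and
variable names are made up for illustration and are not taken from the
shipped bmc-watchdog method script; only the "ctrun -l none bmc-watchdog"
invocation comes from the suggestion above.

```shell
# Hypothetical helper for an SMF start method (illustrative names only).
CTRUN=${CTRUN:-ctrun}                # ctrun(1), as found on illumos/Solaris
WATCHDOG=${WATCHDOG:-bmc-watchdog}   # daemon name as used in this thread

start_watchdog() {
    # -l none: ctrun returns immediately and the new process contract is
    # orphaned, so the daemon is not a member of the service's own
    # contract and is not killed when the service is stopped.
    "$CTRUN" -l none "$WATCHDOG" "$@"
}
```

The start method would call start_watchdog once and exit. Since that
leaves no process in the service's own contract, the service probably has
to be a transient one (startd/duration set to "transient"); treat that as
an assumption to verify against the actual manifest.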


-- 
Gordon Ross <gwr at nexenta.com>
Nexenta Systems, Inc.  www.nexenta.com
Enterprise class storage for everyone


