[oi-dev] [developer] BMC driver on Illumos
Jim Klimov
jimklimov at cos.ru
Thu Mar 28 20:18:30 UTC 2013
On 2013-03-28 20:12, Garrett D'Amore wrote:
>
> On Mar 28, 2013, at 12:05 PM, Jim Klimov <jimklimov at cos.ru> wrote:
>
>> So, for the case of dedicated-hardware watchdogs, this is the part of
>> your post which I can't find as relevant: "The usual thing is to hook
>> this up to a system timer, which will catch hard hangs."
>
> What I mean is that what most systems do is not express an API out to userland, but just have something that runs out of the timer that tickles the hardware watchdog register. This guards against the hard hang of the entire system/scheduler, but it does nothing to ensure that some upper layer services are still being handled.
I see... Well, however a daemon is implemented (which is relevant for
Linux watchdogs, at least for legacy OpenSolaris, and maybe for the
illumos BMC port), it does rely on the timer interrupt still firing and
on the software not having hung in order to tickle the hardware
watchdog's separate timer.
In Linux, I would guess, the API is exposed as a /dev/watchdog node,
into which you can echo a character. The daemon is usually trivial:
reference code is part of some README, and it is a dozen lines long.
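To illustrate how small such a daemon can be, here is a minimal sketch in
Python rather than the README's own code. It assumes the usual Linux
convention that any write to the device node restarts the countdown, and
that writing the character 'V' just before close asks the driver to
disarm the timer (the "magic close", which a driver may ignore). The
function name, the interval and the max_pets parameter are hypothetical
and only there to make the sketch self-contained.

```python
import os
import time


def run_watchdog_daemon(device="/dev/watchdog", interval=10.0, max_pets=None):
    """Pet the watchdog device every `interval` seconds.

    `max_pets` is a hypothetical knob for this sketch: None means loop
    forever, a number means stop cleanly after that many writes.
    """
    fd = os.open(device, os.O_WRONLY)
    pets = 0
    try:
        while max_pets is None or pets < max_pets:
            # Any write restarts the hardware countdown.
            os.write(fd, b"\0")
            pets += 1
            time.sleep(interval)
        # Clean shutdown: 'V' asks the driver to disarm the timer on close.
        os.write(fd, b"V")
    finally:
        # If we got here through an exception, no 'V' was written, so an
        # armed watchdog keeps counting down and will eventually fire.
        os.close(fd)
    return pets
```

Calling run_watchdog_daemon() as root would keep the box alive for as long
as the process gets scheduled; the interval has to be shorter than the
timeout configured in the hardware.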
In OpenSolaris, I believe, there was a single (closed-source) driver
for all supported watchdog models and the bmc-watchdog program could
talk to it. In a way it was the user-accessible API to set and query
the watchdog. In case it helps, here are a few output examples:
# bmc-watchdog -g
Timer Use: SMS/OS
Timer: Running
Logging: Enabled
Timeout Action: Hard Reset
Pre-Timeout Interrupt: None
Pre-Timeout Interval: 0 seconds
Timer Use BIOS FRB2 Flag: Clear
Timer Use BIOS POST Flag: Clear
Timer Use BIOS OS Load Flag: Clear
Timer Use BIOS SMS/OS Flag: Clear
Timer Use BIOS OEM Flag: Clear
Initial Countdown: 900 seconds
Current Countdown: 842 seconds
# bmc-watchdog -h
Usage: bmc-watchdog <COMMAND> [OPTIONS]... [COMMAND_OPTIONS]...
COMMANDS:
  -s         --set                                Set BMC Watchdog Config.
  -g         --get                                Get BMC Watchdog Config.
  -r         --reset                              Reset BMC Watchdog Timer.
  -t         --start                              Start BMC Watchdog Timer.
  -y         --stop                               Stop BMC Watchdog Timer.
  -c         --clear                              Clear BMC Watchdog Config.
  -d         --daemon                             Run in Daemon Mode.
OPTIONS:
  -D STRING  --driver-type=IPMIDRIVER             Specify IPMI driver type.
             --disable-auto-probe                 Do not probe driver for default settings.
             --driver-address=DRIVER-ADDRESS      Specify driver address.
             --driver-device=DEVICE               Specify driver device path.
             --register-spacing=REGISTER-SPACING  Specify driver register spacing.
  -f STRING  --logfile=FILE                       Specify an alternate logfile
             --config-file=FILE                   Specify an alternate config file
  -n         --no-logging                         Turn off all logging
  -?         --help                               Output help menu.
  -V         --version                            Output version.
             --debug                              Turn on debugging.
> Now I've not looked at Linux and how it uses watchdogs… but I've experience with a few different embedded systems, and the above handling is almost precisely what I've seen done. NetBSD was nice because it instead offered a watchdog facility that extended into userland, allowing the service check to be done by a userland daemon, which is far more interesting than just that the clock interrupt handler is still working properly. :-)
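The difference between a blind timer tickle and a userland service check
can be sketched as follows. This is only an illustration of the idea, not
NetBSD's facility or any existing daemon: the function name, the shape of
the checks mapping and the pet callback are all hypothetical. The point
is that the watchdog is petted only when every service check passes, so
a hung upper-layer service lets the hardware timer expire even though
the clock interrupt is still running.

```python
def pet_if_healthy(checks, pet):
    """Call pet() only if every health check passes.

    checks: mapping of a check name to a zero-argument callable that
            returns True when the service is healthy (hypothetical shape).
    pet:    zero-argument callable that tickles the watchdog.

    Returns None when the watchdog was petted, otherwise the name of the
    first failing check (in which case the watchdog is left alone).
    """
    for name, check in checks.items():
        try:
            healthy = bool(check())
        except Exception:
            # A check that blows up counts as a failed service.
            healthy = False
        if not healthy:
            return name
    pet()
    return None
```

A daemon would call this once per interval with checks such as "can I
still fetch a page from the local web server" and a pet callback that
writes to the watchdog device.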