**Watchdog timer** - a hardware mechanism for detecting system hangs and recovering from them. It is a timer that the monitored system must reset periodically. If the timer is not reset within a set period of time, the system is forced to reboot. In some cases the watchdog signals the system to reboot itself (a "soft" reboot); in other cases the reboot is performed in hardware (by shorting the RST signal line or similar).
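The reset-or-reboot rule described above can be illustrated with a small shell simulation. This is only a sketch: the helper name watchdog_expired and the timestamps are made up for the example, and no real watchdog device is touched.

```shell
#!/bin/sh
# Simulation only: decide whether a watchdog timer would have fired.
# watchdog_expired LAST_RESET NOW TIMEOUT   (all values in seconds)
# Hypothetical helper: returns 0 (true) if the timer expired, 1 otherwise.
watchdog_expired() {
    last_reset=$1
    now=$2
    timeout=$3
    [ $((now - last_reset)) -gt "$timeout" ]
}

# Example: the timer was last reset at t=100 and the timeout is 60 seconds.
if watchdog_expired 100 130 60; then
    echo "t=130: timer expired, the system would be rebooted"
else
    echo "t=130: timer was reset in time"
fi
```

With these values only 30 seconds have passed since the last reset, so the second message is printed. Calling the helper with a current time of 161 or later would report an expired timer.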
Installation on Ubuntu/Debian Linux:
sudo apt-get install watchdog
A list of some of the files that will be installed on the system:
- /etc/init.d/watchdog
- /etc/init.d/wd_keepalive
- /etc/watchdog.conf
- /etc/default/watchdog
- /dev/watchdog
- /usr/sbin/watchdog
- /usr/sbin/wd_identify
- /usr/sbin/wd_keepalive
- /usr/share/doc/watchdog/
- /usr/share/man/man5/watchdog.conf.5.gz
- /usr/share/man/man8/watchdog.8.gz
- /usr/share/man/man8/wd_identify.8.gz
- /usr/share/man/man8/wd_keepalive.8.gz
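To see which of these files are actually present on a given machine, a short loop such as the following may help. It is a sketch: the helper name report_files is hypothetical, and the paths are a subset of the list above.

```shell
#!/bin/sh
# report_files PATH...: print "present" or "missing" for each path.
# Hypothetical helper, not part of the watchdog package.
report_files() {
    for path in "$@"; do
        if [ -e "$path" ]; then
            echo "present: $path"
        else
            echo "missing: $path"
        fi
    done
}

report_files /etc/watchdog.conf /etc/default/watchdog \
    /usr/sbin/watchdog /usr/sbin/wd_keepalive /dev/watchdog
```

On Debian-based systems, dpkg -L watchdog prints the full list of files owned by the package. Note that /dev/watchdog is a device node and normally appears only once a watchdog driver is loaded.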
Possible config parameters for /etc/watchdog.conf:
interval =
The interval between two writes to the watchdog device. The default value is 10 seconds. An interval longer than one minute can only be used with the -f command-line option.
logtick =
When logging is enabled, a log entry is written only once every specified number of intervals. For example, with logtick = 60 and interval = 10, entries are 60 × 10 = 600 seconds apart, so the log file gets at most one entry every 10 minutes.
max-load-1 =
The maximum allowed 1-minute load average. If it is exceeded, the system is restarted. 0 disables the check.
max-load-5 =
The maximum allowed 5-minute load average. If it is exceeded, the system is restarted. 0 disables the check.
max-load-15 =
The maximum allowed 15-minute load average. If it is exceeded, the system is restarted. 0 disables the check.
min-memory =
The minimum amount of virtual memory that must remain free, given in memory pages. 0 disables the check.
max-temperature =
Set the maximum temperature allowed.
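watchdog-device =
The name of the watchdog device, for example /dev/watchdog.
temperature-device =
The name of the device that provides temperature readings.
file =
Enables file mode: watchdog checks the given file. The option can be used more than once to check several files.
change =
The change interval, in seconds, for file mode. It applies to the file named in the preceding file line.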
pidfile =
The name of a PID file. watchdog checks that the process whose PID is stored in the file is still running. You can add a monitored process this way, for example "pidfile = /var/run/apache2.pid". If the process cannot be started, watchdog will keep rebooting the system.
ping =
An IP address to ping in order to check network connectivity. The option can be used more than once.
interface =
The name of a network interface to check in network mode.
test-binary =
The path to a program that runs a user-defined test.
test-timeout =
The maximum time, in seconds, that the user-defined test may run. 0 means unlimited.
repair-binary =
A program that is executed when a problem is detected, instead of rebooting the system.
admin =
The email address for notifications. Leave the value empty to disable them.
realtime =
Set to yes to lock watchdog into RAM so that it is never swapped out.
priority =
The scheduling priority used in realtime mode.
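As an illustration of the test-binary option above, here is a minimal sketch of a user test. watchdog treats an exit status of zero as success and a non-zero status as a problem. The helper name check_writable and the choice of directory are assumptions made for the example, not something the watchdog package provides.

```shell
#!/bin/sh
# Hypothetical user test for the test-binary option.
# Exit status 0 means "healthy"; any other status reports a problem.

# check_writable DIR: succeed only if a temporary file can be created in DIR.
check_writable() {
    dir=$1
    tmp="$dir/.wd_test.$$"
    if touch "$tmp" 2>/dev/null; then
        rm -f "$tmp"
        return 0
    fi
    return 1
}

# The temporary directory is only an example target;
# check a directory that actually matters on your system.
if check_writable "${TMPDIR:-/tmp}"; then
    echo "test passed"
else
    echo "test failed"
    exit 1
fi
```

Saved under a path of your choice (for instance a hypothetical /usr/local/bin/wd_test.sh, made executable), the script could be referenced with a test-binary line in /etc/watchdog.conf, together with a test-timeout value that leaves it enough time to finish.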
Example setup with the Intel TCO Watchdog Timer.
Load the module:
sudo modprobe iTCO_wdt
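Before continuing, it can be useful to confirm that the driver was really loaded. The sketch below reads /proc/modules, the same information that lsmod displays. The helper name module_loaded is hypothetical.

```shell
#!/bin/sh
# module_loaded NAME: succeed if kernel module NAME is listed in /proc/modules.
# Hypothetical helper; it does not detect drivers built into the kernel.
module_loaded() {
    grep -q "^$1 " /proc/modules 2>/dev/null
}

if module_loaded iTCO_wdt; then
    echo "iTCO_wdt is loaded"
else
    echo "iTCO_wdt is not loaded"
fi
```

If the module is loaded, the device node can be checked as well with ls -l /dev/watchdog.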
In /etc/watchdog.conf, edit or add the following lines:
watchdog-device = /dev/watchdog
interval = 10
In /etc/default/watchdog specify the module name:
watchdog_module="iTCO_wdt"
You can add the verbose option so that debugging information is written to syslog:
watchdog_options="-v"
Restart watchdog:
sudo /etc/init.d/watchdog restart
You can follow syslog entries in real time with the command:
tail -f /var/log/syslog
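On a busy system the syslog contains many unrelated entries. A simple filter, sketched below, keeps only the lines that mention watchdog. The function name only_watchdog is hypothetical, and the --line-buffered option assumes GNU grep.

```shell
#!/bin/sh
# only_watchdog: pass through only the lines that mention "watchdog"
# (case-insensitive). --line-buffered makes matches appear immediately
# when the input is a live stream such as the output of "tail -f".
only_watchdog() {
    grep -i --line-buffered 'watchdog'
}

# Usage (runs until interrupted with Ctrl+C):
#   tail -f /var/log/syslog | only_watchdog
```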