Watchdog timer - a hardware mechanism for detecting system hangs. It is a timer that the monitored system must reset periodically. If it is not reset within a set period of time, the system is forced to reboot. In some cases the watchdog sends the system a signal to reboot (a "soft" reboot); in other cases the reboot is done in hardware (by asserting the RST signal line or similar).
Installation on Ubuntu/Debian Linux:
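The reset cycle described above can be sketched in shell. This is a minimal illustration, not the watchdog daemon's own code: the function name pet_watchdog and its arguments are hypothetical, and it assumes a Linux watchdog device that is armed when opened, reset by any write, and (on drivers that support "magic close") disarmed when the character V is written just before the device is closed.

```shell
#!/bin/sh
# pet_watchdog DEVICE COUNT INTERVAL  (hypothetical helper, for illustration)
# Keeps DEVICE open, writes one byte every INTERVAL seconds COUNT times,
# then writes "V" so that drivers supporting magic close stop the timer.
pet_watchdog() {
    dev=$1
    count=$2
    interval=$3
    exec 3> "$dev"            # opening the device arms the timer
    i=0
    while [ "$i" -lt "$count" ]; do
        printf '.' >&3        # any write resets the countdown
        sleep "$interval"
        i=$((i + 1))
    done
    printf 'V' >&3            # magic close: ask the driver to disarm
    exec 3>&-                 # close the device
}
```

Running it against a real device such as /dev/watchdog requires root, and interrupting it before the final write can leave the timer armed and reboot the machine, so it is safer to try it against an ordinary file first.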
sudo apt-get install watchdog
A list of some of the files that will be installed on the system:
Possible configuration parameters in /etc/watchdog.conf:
interval - Interval between two write operations to the watchdog device. The default value is 10 seconds. An interval longer than a minute can only be used with the -f command-line option.
logtick - When logging is enabled, a log entry is written only once per the specified number of intervals. For example, with logtick = 60 and interval = 10 you get 600 seconds, so there will be no more than one entry in the log file every 10 minutes.
max-load-1 - The maximum allowed 1-minute load average, above which the system is restarted. 0 disables the check.
max-load-5 - The maximum allowed 5-minute load average, above which the system is restarted. 0 disables the check.
max-load-15 - The maximum allowed 15-minute load average, above which the system is restarted. 0 disables the check.
min-memory - The minimum amount of virtual memory that must remain free. 0 disables the check.
max-temperature - The maximum allowed temperature.
watchdog-device - The name of the watchdog device.
temperature-device - The name of the temperature device.
file - File mode: the name of a file to check.
change - The change interval for file mode.
pidfile - The name of a pid file. You can add a monitored process, for example "pidfile = /var/run/apache2.pid". If the process cannot be started, watchdog will keep rebooting the system.
ping - Ping mode: an address to ping in order to check the network connection. The option can be used more than once.
interface - The name of the network interface to monitor.
test-binary - A user-defined test program to run.
test-timeout - The maximum number of seconds the user test may run. 0 means unlimited.
repair-binary - A program executed when the system cannot be rebooted.
admin - Email address for notifications. Leave the value blank to disable them.
realtime - Set to yes to lock watchdog in memory so that it cannot be swapped out.
priority - The scheduling priority for realtime mode.
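The load-limit and pid-file checks described above can be approximated with two small shell functions, for example as building blocks of a user test script. This is only a sketch of the semantics, not the daemon's implementation: the function names are hypothetical, and it assumes the usual convention that a test program reports a problem with a non-zero exit status.

```shell
#!/bin/sh
# Hypothetical helpers mirroring two of the checks listed above.

# load_exceeds LOAD MAX: succeeds (status 0) when LOAD is above MAX.
# MAX = 0 disables the check, so the function then always fails.
load_exceeds() {
    awk -v load="$1" -v max="$2" 'BEGIN { exit !(max > 0 && load > max) }'
}

# pid_alive PIDFILE: succeeds when PIDFILE names a process that can be signalled.
# Note: kill -0 also fails for processes owned by another user unless run as root.
pid_alive() {
    [ -r "$1" ] || return 1
    pid=$(cat "$1")
    [ -n "$pid" ] && kill -0 "$pid" 2>/dev/null
}

# Example use inside a test script (threshold 24 is an assumed value):
#   read -r load1 rest < /proc/loadavg
#   load_exceeds "$load1" 24 && exit 1
#   pid_alive /var/run/apache2.pid || exit 1
#   exit 0
```

Keeping each check in its own function makes it easy to try the thresholds by hand before letting a failing check trigger a reboot.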
Example setup with the Intel TCO Watchdog Timer. Load the kernel module:
sudo modprobe iTCO_wdt
In /etc/watchdog.conf, edit or add the following lines:
watchdog-device = /dev/watchdog
interval = 10
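To confirm which values are actually set after editing, the "key = value" lines can be read back with a small helper. The function name conf_value is hypothetical, and the sketch assumes simple lines of the form shown above, with comments starting with #; values that themselves contain "=" are not handled.

```shell
#!/bin/sh
# conf_value FILE KEY  (hypothetical helper)
# Prints the value assigned to KEY in a "key = value" style file,
# skipping comment lines. A key that appears several times prints every value.
conf_value() {
    awk -F'=' -v key="$2" '
        /^[ \t]*#/ { next }
        {
            k = $1
            v = $2
            gsub(/^[ \t]+|[ \t]+$/, "", k)   # trim spaces around the key
            gsub(/^[ \t]+|[ \t]+$/, "", v)   # trim spaces around the value
            if (k == key) print v
        }' "$1"
}

# Example: conf_value /etc/watchdog.conf interval
```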
In /etc/default/watchdog specify the module name:
watchdog_module="iTCO_wdt"
You can add a debug option (for example, the -v flag in watchdog_options) so that debugging information is written to the syslog. Then restart the service:
sudo /etc/init.d/watchdog restart
You can follow syslog entries in real time with the command:
tail -f /var/log/syslog
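Since the syslog usually carries messages from many services, it can help to show only the lines that mention the watchdog. A minimal sketch, where the function name watchdog_entries is hypothetical and the match is a plain case-insensitive search for the word "watchdog":

```shell
#!/bin/sh
# watchdog_entries FILE  (hypothetical helper)
# Prints only the lines of FILE that mention "watchdog", ignoring case.
watchdog_entries() {
    grep -i 'watchdog' "$1"
}

# Example: watchdog_entries /var/log/syslog
# For a live view, the same filter can follow the log (GNU grep):
#   tail -f /var/log/syslog | grep --line-buffered -i 'watchdog'
```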