Installing and configuring the Watchdog
Instructions for installing and configuring Watchdog, a tool that monitors your server and recovers it from hangs
A watchdog timer is a hardware mechanism that detects system hangs and recovers from them. It is commonly used on dedicated servers and VPS to avoid prolonged crashes and freezes. A watchdog is essentially a timer that the monitored system must reset periodically. If the timer is not reset within the specified interval, it forces a system reboot. Depending on the implementation, this is either a “soft” reboot triggered through an OS signal or a hardware-level reset, for example by shorting the RST signal line or a similar mechanism.
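As a rough illustration of the reset-or-reboot logic described above, the following shell sketch models a watchdog purely in software. The names wd_kick, wd_expired and WD_TIMEOUT are hypothetical and the 60-second timeout is an assumed value; the sketch never touches /dev/watchdog and never reboots anything.

```shell
#!/bin/sh
# Software-only model of a watchdog timer (illustration; no hardware involved).
WD_TIMEOUT=60          # seconds the timer may run without being reset (assumed)
WD_LAST_KICK=0         # timestamp of the most recent reset

# wd_kick NOW: the monitored system resets ("kicks") the timer.
wd_kick() {
    WD_LAST_KICK=$1
}

# wd_expired NOW: succeeds (exit 0) if the timer ran out, i.e. a reboot would fire.
wd_expired() {
    [ $(( $1 - WD_LAST_KICK )) -gt "$WD_TIMEOUT" ]
}

wd_kick 1000
if wd_expired 1030; then echo "reboot"; else echo "ok"; fi   # 30 s since the kick
if wd_expired 1100; then echo "reboot"; else echo "ok"; fi   # 100 s since the kick
```

The first check falls inside the timeout and prints "ok"; the second falls outside it and prints "reboot".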
Installation on Linux Ubuntu/Debian:
sudo apt-get install watchdog
After installation, the following files and directories are added to the system:
- /etc/init.d/watchdog
- /etc/init.d/wd_keepalive
- /etc/watchdog.conf
- /etc/default/watchdog
- /dev/watchdog (device node; created by the kernel once a watchdog driver is loaded, not by the package itself)
- /usr/sbin/watchdog
- /usr/sbin/wd_identify
- /usr/sbin/wd_keepalive
- /usr/share/doc/watchdog/
- /usr/share/man/man5/watchdog.conf.5.gz
- /usr/share/man/man8/watchdog.8.gz
- /usr/share/man/man8/wd_identify.8.gz
- /usr/share/man/man8/wd_keepalive.8.gz
Main configuration options in /etc/watchdog.conf:
interval =
The interval, in seconds, between two write operations to the watchdog device. According to the watchdog.conf man page, the default is 1 second. Intervals longer than one minute can only be used with the -f command-line option.
logtick =
If logging is enabled, this parameter specifies how many intervals to skip between log entries. For example, with logtick = 60 and interval = 10, events will be logged no more than once every 10 minutes.
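The arithmetic behind that example can be checked with a small sketch. The function name log_period is hypothetical; it simply multiplies the two settings.

```shell
#!/bin/sh
# log_period INTERVAL LOGTICK: seconds between two consecutive log entries.
log_period() {
    echo $(( $1 * $2 ))
}

period=$(log_period 10 60)
echo "one log entry every ${period} s ($(( period / 60 )) min)"
```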
max-load-1 =
max-load-5 =
max-load-15 =
The maximum allowed system load average over 1, 5, and 15 minutes, respectively. If a limit is exceeded, the system will reboot. Setting 0 disables the check.
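To get a feel for a sensible limit, the current load can be compared against a candidate value by hand. The sketch below is only an approximation of the check: load_exceeds is a hypothetical helper, the limit of 24 is an assumed example, and the strict "greater than" comparison is an assumption about how the daemon compares values.

```shell
#!/bin/sh
# load_exceeds CURRENT LIMIT: succeeds if CURRENT is above LIMIT.
# A LIMIT of 0 means "check disabled", mirroring the configuration semantics.
load_exceeds() {
    awk -v cur="$1" -v max="$2" 'BEGIN { exit !(max > 0 && cur > max) }'
}

# /proc/loadavg starts with the 1-, 5- and 15-minute load averages (Linux only).
if [ -r /proc/loadavg ]; then
    read -r load1 load5 load15 _rest < /proc/loadavg
    if load_exceeds "$load1" 24; then
        echo "1-minute load $load1 exceeds 24"
    else
        echo "1-minute load $load1 is within the limit"
    fi
fi
```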
min-memory =
Minimum amount of virtual memory that must remain free. The watchdog.conf man page specifies this value in memory pages, not bytes. Setting 0 disables the check.
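Assuming, as the watchdog.conf man page describes, that min-memory is expressed in memory pages, a value given in mebibytes has to be converted first. The helper mib_to_pages is hypothetical, and the 64 MiB figure is only an example.

```shell
#!/bin/sh
# mib_to_pages MIB [PAGE_SIZE]: number of memory pages in MIB mebibytes.
# PAGE_SIZE defaults to the running system's page size in bytes.
mib_to_pages() {
    page_size=${2:-$(getconf PAGESIZE)}
    echo $(( $1 * 1024 * 1024 / page_size ))
}

# Print a ready-to-paste configuration line for 64 MiB (example figure).
echo "min-memory = $(mib_to_pages 64)"
```

With the common 4096-byte page size, 64 MiB corresponds to 16384 pages.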
max-temperature =
Maximum allowed system temperature. According to the watchdog.conf man page, the system is halted once this temperature is reached.
watchdog-device =
temperature-device =
The name of the watchdog device and the temperature sensor device.
file =
change =
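File monitoring mode. file names a file to check; change sets the maximum time, in seconds, that may pass without the file being modified. If change is omitted or 0, only the file's status is checked.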
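The idea behind file mode, as the watchdog.conf man page describes it, is that a file which has not been modified for too long indicates a stuck system. The sketch below approximates that check by hand; file_stale is a hypothetical helper, the path and the 1800-second limit are example values, and GNU stat is assumed.

```shell
#!/bin/sh
# file_stale FILE MAX_AGE [NOW]: succeeds if FILE is missing or was last
# modified more than MAX_AGE seconds before NOW (default: current time).
# Uses GNU stat (-c %Y prints the modification time as a Unix timestamp).
file_stale() {
    [ -e "$1" ] || return 0
    mtime=$(stat -c %Y "$1") || return 0
    now=${3:-$(date +%s)}
    [ $(( now - mtime )) -gt "$2" ]
}

if file_stale /var/log/syslog 1800; then
    echo "/var/log/syslog looks stale"
else
    echo "/var/log/syslog was modified recently"
fi
```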
pidfile =
The PID file of a process to monitor. For example, pidfile = /var/run/apache2.pid. If the process is not running, the watchdog will trigger a reboot.
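Before adding a pidfile entry, it is worth confirming that the file exists and points at a live process, since a wrong path would lead straight to a reboot. The following sketch does a comparable check by hand; pid_alive is a hypothetical helper and not part of the watchdog package.

```shell
#!/bin/sh
# pid_alive PIDFILE: succeeds if PIDFILE holds the PID of an existing process.
pid_alive() {
    [ -r "$1" ] || return 1
    pid=$(cat "$1")
    # kill -0 sends no signal; it only tests whether the process can be
    # signalled. Run as root, otherwise processes owned by other users
    # are reported as not alive.
    [ -n "$pid" ] && kill -0 "$pid" 2>/dev/null
}

if pid_alive /var/run/apache2.pid; then
    echo "apache2 is running"
else
    echo "apache2 is not running, or the PID file is missing"
fi
```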
ping =
interface =
Network checks. ping sets an IP address that the daemon pings to verify connectivity; interface names a network interface that is monitored for incoming traffic.
test-binary =
test-timeout =
repair-binary =
Parameters for running custom tests or repair programs. test-timeout sets the maximum duration of the test in seconds (0 means unlimited).
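A custom test can be as simple as a shell script, assuming the daemon treats exit status 0 as success and any non-zero status as failure. The sketch below is a hypothetical example that fails when the root filesystem is nearly full; the names usage_ok and root_fs_check and the 95 % threshold are illustrative only.

```shell
#!/bin/sh
# Hypothetical custom test: report failure when the root filesystem is nearly full.
MAX_USED_PCT=95   # assumed threshold, not a watchdog default

# usage_ok USED_PCT LIMIT: succeeds while USED_PCT does not exceed LIMIT.
usage_ok() {
    [ "$1" -le "$2" ] 2>/dev/null
}

# root_fs_check: exit status 0 means healthy, non-zero means the test failed.
root_fs_check() {
    # df -P prints POSIX-format output; on line 2, field 5 is the use percentage.
    used=$(df -P / | awk 'NR == 2 { sub("%", "", $5); print $5 }')
    usage_ok "$used" "$MAX_USED_PCT"
}

if root_fs_check; then
    echo "test passed"
else
    echo "test failed"
fi
# When saved as the test-binary, the script would finish with:
#   root_fs_check; exit $?
```

Keeping the test short matters, because a test that runs longer than test-timeout counts as a failure in its own right.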
admin =
Email address for notifications. Leave blank to disable notifications.
realtime =
priority =
Real-time mode settings. realtime = yes locks the watchdog daemon into memory so that it is never swapped out, and priority sets its scheduling priority in real-time mode.
Example setup with Intel TCO Watchdog Timer:
Load the module:
sudo modprobe iTCO_wdt
In /etc/watchdog.conf, uncomment or add the following lines:
watchdog-device = /dev/watchdog
interval = 10
In /etc/default/watchdog, specify the module name:
watchdog_module="iTCO_wdt"
To enable verbose logging to syslog for debugging, set the following in the same file:
watchdog_options="-v"
Restart the watchdog service:
sudo /etc/init.d/watchdog restart
Monitor logs in real time:
tail -f /var/log/syslog
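If the service does not start, it may help to confirm that the driver is loaded and that the device node exists. The sketch below is one possible check; module_loaded is a hypothetical helper that reads /proc/modules. It deliberately tests only for the existence of /dev/watchdog, because opening the device by hand can arm the timer.

```shell
#!/bin/sh
# module_loaded NAME [FILE]: succeeds if NAME is listed in FILE
# (default /proc/modules, where each line starts with a module name).
module_loaded() {
    awk -v m="$1" '$1 == m { found = 1 } END { exit !found }' \
        "${2:-/proc/modules}" 2>/dev/null
}

if module_loaded iTCO_wdt; then
    echo "iTCO_wdt is loaded"
else
    echo "iTCO_wdt is not loaded"
fi

# The device node is a character device once a watchdog driver is active.
if [ -c /dev/watchdog ]; then
    echo "/dev/watchdog is present"
else
    echo "/dev/watchdog is missing"
fi
```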