Installing and configuring the Watchdog
How to protect your server from hangs and freezes with a watchdog timer.
A watchdog timer is a hardware-backed failsafe that keeps your server from getting stuck indefinitely. The idea is simple: your system periodically sends a "heartbeat" signal to reset the timer. If that signal stops coming — because the system hung, crashed, or got stuck in a loop — the timer fires and triggers a reboot. Depending on the setup, that reboot can be a clean software-level restart or a hard reset at the hardware level (e.g. pulling the RST line).
On dedicated servers and VPS where you can't physically walk up to the machine, this kind of automatic recovery is genuinely valuable.
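The heartbeat logic described above can be sketched in a few lines of shell. This is only an illustration of the idea: a real watchdog timer counts down in hardware or in the kernel, and the function name and the timestamps used here are hypothetical.

```shell
#!/bin/sh
# Illustrative only: a real watchdog timer runs in hardware or in the kernel.
# timer_expired LAST_BEAT NOW TIMEOUT (all in seconds; names are hypothetical)
# Returns 0 (true) when no heartbeat arrived within TIMEOUT seconds.
timer_expired() {
    [ $(( $2 - $1 )) -ge "$3" ]
}

# Heartbeat 5 seconds ago, 60-second timeout: the timer keeps counting.
timer_expired 100 105 60 || echo "heartbeat in time"

# 70 seconds of silence: the timer fires.
timer_expired 100 170 60 && echo "timer fired: reboot"
```

Every heartbeat effectively moves LAST_BEAT forward, so a healthy system never reaches the timeout.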
Installing on Ubuntu / Debian
sudo apt-get install watchdog
The package uses the following key files and paths:
- /etc/init.d/watchdog: service init script
- /etc/watchdog.conf: main configuration file
- /etc/default/watchdog: startup options
- /dev/watchdog: the watchdog device (created by the kernel driver once a watchdog module is loaded)
- /usr/sbin/watchdog: the watchdog binary
Key parameters in /etc/watchdog.conf
Timing and logging:
- interval: how often the watchdog writes to the device. Defaults to 10 seconds. Values over 60 seconds require the -f flag at startup.
- logtick: controls how frequently events are written to the log. With logtick = 60 and interval = 10, events are logged at most once every 10 minutes.
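The relationship between the two settings is plain multiplication, which the following sketch spells out. The helper name log_period is made up for this example and is not part of the watchdog package.

```shell
#!/bin/sh
# log_period INTERVAL LOGTICK (hypothetical helper)
# Prints the approximate number of seconds between two log entries.
log_period() {
    echo $(( $1 * $2 ))
}

log_period 10 60   # prints 600, i.e. 10 minutes
```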
System load:
- max-load-1, max-load-5, max-load-15: maximum acceptable system load averages over 1, 5, and 15 minutes. If any threshold is exceeded, watchdog triggers a reboot. Set a value to 0 to disable that check.
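To get a feel for sensible thresholds before enabling them, you can compare the current load averages against a candidate limit by hand. The sketch below assumes a Linux system with /proc/loadavg; the helper names and the sample numbers are invented for illustration and do not reflect how watchdog implements the check internally.

```shell
#!/bin/sh
# load_exceeded CURRENT LIMIT (hypothetical helper, not part of watchdog)
# Returns 0 (true) when CURRENT is above LIMIT; a LIMIT of 0 disables the check.
load_exceeded() {
    awk -v cur="$1" -v max="$2" 'BEGIN { exit !(max > 0 && cur > max) }'
}

# current_loads prints the 1-, 5- and 15-minute averages from /proc/loadavg.
current_loads() {
    cut -d ' ' -f 1-3 /proc/loadavg
}

# Example with made-up numbers: a 1-minute load of 30.5 against a limit of 24.
if load_exceeded 30.5 24; then
    echo "max-load-1 exceeded"
fi
```

Running current_loads on the server during peak hours gives a baseline; limits set too close to normal peaks risk rebooting a machine that is merely busy.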
Memory and temperature:
- min-memory: minimum acceptable amount of free virtual memory, specified in memory pages. Set to 0 to disable.
- max-temperature: maximum acceptable temperature before a reboot is triggered.
- watchdog-device: path to the watchdog device (typically /dev/watchdog).
- temperature-device: path to the temperature sensor device.
File and process monitoring:
- file and change: monitor a file for changes. change sets the time window, in seconds, within which the file must have been modified.
- pidfile: path to the PID file of a process you want to keep alive. Example: pidfile = /var/run/apache2.pid. If the process isn't running, watchdog will reboot the system.
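Before pointing pidfile at a service, it is worth confirming that the PID file really tracks a live process, since a stale or missing file would lead to a reboot. A minimal sketch of such a manual check follows; pid_alive is a hypothetical helper, not a watchdog command.

```shell
#!/bin/sh
# pid_alive PIDFILE (hypothetical helper)
# Returns 0 (true) when PIDFILE is readable and the PID in it belongs to a
# running process. Run it as root: kill -0 also fails for processes owned by
# another user when the caller lacks permission to signal them.
pid_alive() {
    [ -r "$1" ] || return 1
    pid=$(cat "$1")
    [ -n "$pid" ] && kill -0 "$pid" 2>/dev/null
}

if pid_alive /var/run/apache2.pid; then
    echo "process is running"
else
    echo "process is not running"
fi
```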
Network:
- ping: check network connectivity by pinging a host.
- interface: name of a network interface to monitor for incoming traffic.
Custom tests:
- test-binary: path to a custom test script or program.
- test-timeout: maximum execution time for the test in seconds (0 for no limit).
- repair-binary: a program to run automatically when a problem is detected, before resorting to a reboot.
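A custom test reports its result through its exit status: 0 means healthy, anything else is treated as a failure. The script below is a minimal sketch of what a test-binary could look like, assuming you want to react to a nearly full root filesystem. The 95 percent limit and the function names are assumptions made for this example, not watchdog defaults.

```shell
#!/bin/sh
# Hypothetical custom test for watchdog's test-binary option.
# Exit status 0 = healthy; non-zero = failure reported to watchdog.

# usage_ok PERCENT_USED LIMIT: true when usage is below the limit
usage_ok() {
    [ "$1" -lt "$2" ]
}

# root_usage prints the used percentage of / as a bare integer
root_usage() {
    df -P / | awk 'NR == 2 { sub("%", "", $5); print $5 }'
}

LIMIT=95   # assumed threshold, not a watchdog default

# The script's exit status is the status of this last command.
usage_ok "$(root_usage)" "$LIMIT"
```

Because a failed test can end in a reboot, run the script by hand first and check its exit status with echo $? before referencing it in /etc/watchdog.conf.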
Notifications and priority:
- admin: email address for event notifications. Leave blank to disable.
- realtime = yes: locks the watchdog daemon in memory so it can't be swapped out.
- priority: real-time scheduling priority for the watchdog process.
Example setup with Intel TCO Watchdog
Load the kernel module:
sudo modprobe iTCO_wdt
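After loading the module, you may want to confirm that the driver is present and that the device node exists. The following is a cautious sketch with hypothetical helper names. It only inspects /proc/modules and the file type; it deliberately never opens /dev/watchdog, because opening the device starts the timer.

```shell
#!/bin/sh
# module_loaded NAME: true when NAME is listed in /proc/modules.
# A driver compiled directly into the kernel will not show up here.
module_loaded() {
    grep -q "^$1 " /proc/modules 2>/dev/null
}

# is_char_device PATH: true when PATH exists and is a character device
is_char_device() {
    [ -c "$1" ]
}

if module_loaded iTCO_wdt; then
    echo "iTCO_wdt is loaded"
else
    echo "iTCO_wdt is not loaded"
fi

if is_char_device /dev/watchdog; then
    echo "/dev/watchdog is present"
else
    echo "/dev/watchdog is missing"
fi
```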
In /etc/watchdog.conf, uncomment or add:
watchdog-device = /dev/watchdog
interval = 10
In /etc/default/watchdog, specify the module name:
watchdog_module="iTCO_wdt"
To enable verbose logging to syslog for debugging:
watchdog_options="-v"
Restart the service:
sudo /etc/init.d/watchdog restart
Watch the logs in real time to confirm everything's working:
tail -f /var/log/syslog