The AccuRev Server is designed for high reliability, to ensure the integrity of the repository and its availability to AccuRev users. But even the most robust software systems are occasionally compromised; the AccuRev Server can be brought down by a bad disk sector or an administrator’s mistaken command.
The reliability of the AccuRev Server is further enhanced by a companion program, called the “Watchdog”, which runs on the same machine. The sole function of the Watchdog is to monitor the Server and restart it in the event of a failure. The effect of the Watchdog on Server performance is insignificant.
Every 10 seconds, the Watchdog sends a simple command to the Server. If the Watchdog detects that the Server is not responding or is not functioning properly, the Watchdog restarts the Server. If the Watchdog detects five such failures within a three-minute timespan, it doesn’t restart the Server; such a situation indicates the need for server reconfiguration or investigation by the AccuRev support team. (If ACCUREV_WATCHDOG_FAST_FAIL_DISABLE is set in the Watchdog’s environment, it keeps trying to restart the Server indefinitely.)
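The restart rule described above can be illustrated with a short sketch. This is not the Watchdog's actual implementation; the class RestartPolicy, its method names, and the helper fast_fail_disabled are hypothetical names chosen for illustration. Only the numbers (five failures, three minutes) and the environment variable name come from the text. The sketch also assumes that a failure exactly 180 seconds old still counts as inside the window, which the text does not specify.

```python
from collections import deque

MAX_FAILURES = 5   # failures that trigger fast-fail (from the text)
WINDOW_S = 180     # three-minute window, in seconds (from the text)


def fast_fail_disabled(env):
    """Hypothetical helper: True if the variable is present in `env`.

    `env` is any mapping, for example os.environ.
    """
    return "ACCUREV_WATCHDOG_FAST_FAIL_DISABLE" in env


class RestartPolicy:
    """Illustrative sliding-window policy; not AccuRev's own code."""

    def __init__(self, disable_fast_fail=False):
        self.disable_fast_fail = disable_fast_fail
        self.failures = deque()  # timestamps of recent failures, oldest first

    def record_failure(self, now):
        """Record a failure at time `now` (seconds).

        Returns True if the Server should be restarted, False if the
        policy gives up because of too many recent failures.
        """
        self.failures.append(now)
        # Forget failures that have fallen out of the three-minute window.
        while now - self.failures[0] > WINDOW_S:
            self.failures.popleft()
        if self.disable_fast_fail:
            return True  # keep restarting indefinitely
        return len(self.failures) < MAX_FAILURES


# Five failures at 10-second intervals: the fifth is not followed by a restart.
policy = RestartPolicy()
decisions = [policy.record_failure(t) for t in (0, 10, 20, 30, 40)]
print(decisions)
```

In this sketch, failures spread more widely than the window never accumulate to five, so each one leads to a restart. Passing disable_fast_fail=True models the effect of setting ACCUREV_WATCHDOG_FAST_FAIL_DISABLE.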
The Watchdog maintains a simple log file, acwatchdog.log, in subdirectory logs of the site_slice directory. On UNIX/Linux server machines, the Watchdog log file is rotated similarly to the Server log file.