9.3 Configuring Event Failover Destination

Change Guardian supports failover and failback mechanisms that automatically switch to a reliable backup system when the primary resource is unavailable and switch back to the primary resource once it is running again. This disaster recovery mechanism ensures that data is not lost.

9.3.1 Failover Mechanism

The event failover mechanism uses a server that is configured to keep events flowing from the agents when their primary event destination is not reachable. When the primary event destination fails to respond, events are forwarded to the failover destination by default. Events are redirected to the primary event destination when it becomes reachable again. Every new or existing policy that is assigned to an agent is also assigned to the configured failover destination by default. This configuration allows Change Guardian to continue monitoring without interruption.

To configure an event failover destination:

  1. Log in to the web console, and then click CONFIGURATION > Events > Event Failover Destination.

  2. Specify the name, host, port, user name, and password.

  3. Click Configure.

  4. To view the configured failover destination and policy assignment, click CONFIGURATION > Policies > Assign Policies. Select the Assign Unassign icon under Agents. Select the policy and click Event Destinations.

NOTE: For an uninterrupted flow of events, ensure that the module licenses have been added to both the primary and failover event destinations.

9.3.2 Failback Mechanism

When the issues associated with the failure or outage are resolved and the primary destination is available, the event data flow is switched back to the primary destination automatically and immediately.
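
The failover and failback behavior described in this section can be pictured with a small Python sketch. It only illustrates the selection rule and is not Change Guardian code: the Destination class and the select_destination function are hypothetical names, and reachability is modeled as a simple flag rather than a real connection check.

```python
from dataclasses import dataclass


@dataclass
class Destination:
    """Hypothetical stand-in for an event destination (not a product type)."""
    name: str
    reachable: bool


def select_destination(primary, failover):
    """Return the destination that should receive events right now.

    The primary is always preferred when it is reachable (failback).
    The failover is used only while the primary is down.
    None means that neither destination is reachable.
    """
    if primary.reachable:
        return primary
    if failover.reachable:
        return failover
    return None


primary = Destination("primary", reachable=True)
failover = Destination("failover", reachable=True)

primary.reachable = False   # outage: events are forwarded to the failover
during_outage = select_destination(primary, failover).name

primary.reachable = True    # outage resolved: events switch back to the primary
after_recovery = select_destination(primary, failover).name
```

In this sketch, during_outage is "failover" and after_recovery is "primary", which mirrors the automatic switch in both directions.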

Recommendations

To avoid loss of event data in a single-level failover destination configuration, consider the following information:

  1. An event destination can handle 2,000 events per second (EPS) from a single HTTP server connector port. Ensure that the failover destination is set up so that event data does not exceed 4,000 EPS across two ports.

  2. When the primary event destination is down and the failover destination is operational, ensure that the downtime of the primary forwarder is less than 5 hours.

  3. When neither the primary nor the failover event destination is reachable, the Windows agent creates a cache of up to 200 MB and the UNIX agent creates a cache of up to 10 MB. If the cache exceeds its limit, event data is lost.

    NOTE: When the primary event destination becomes reachable first, the cached events are sent to the primary event destination. When the failover event destination becomes reachable first, the cached events are lost.

  4. Considering the EPS for the primary and failover event destinations, create data retention policies for both. For more information, see Data Retention Policy.
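
The arithmetic behind the EPS and cache recommendations can be sketched in Python. This is a rough planning aid under stated assumptions, not a product utility: the function names are hypothetical, the per-agent event rate and average event size are assumed inputs that this guide does not specify, and 1 MB is taken to be 1024 * 1024 bytes.

```python
EPS_PER_PORT = 2_000                            # per HTTP server connector port
CACHE_LIMIT_MB = {"windows": 200, "unix": 10}   # agent cache limits


def max_supported_eps(ports):
    """Upper bound on EPS for a destination with the given number of ports."""
    return ports * EPS_PER_PORT


def within_capacity(expected_eps, ports=2):
    """True if the expected event rate stays within the port capacity."""
    return expected_eps <= max_supported_eps(ports)


def seconds_until_cache_full(agent_os, agent_eps, avg_event_bytes):
    """Estimate how long an agent can cache events while both destinations are down.

    agent_eps and avg_event_bytes are assumed figures; measure them in your
    own environment before relying on the result.
    """
    limit_bytes = CACHE_LIMIT_MB[agent_os] * 1024 * 1024  # assumes 1 MB = 1024 * 1024 bytes
    return limit_bytes / (agent_eps * avg_event_bytes)


# Hypothetical load: 3,500 EPS in total, 10 EPS per agent, 1 KB per event.
fits = within_capacity(3_500)
unix_window = seconds_until_cache_full("unix", 10, 1024)
windows_window = seconds_until_cache_full("windows", 10, 1024)
```

With these assumed inputs, a UNIX agent fills its 10 MB cache in about 17 minutes (1024 seconds), while a Windows agent lasts roughly 5.7 hours (20480 seconds). Actual figures depend on the real event size and rate.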