2.2 Planning for High Availability

You can ensure that your organization provides reliable services to your employees without downtime by planning for and configuring high availability in Cloud Bridge 1.9.0 or later.

2.2.1 Understanding High Availability in Cloud Bridge

Cloud Bridge addresses both instance and site failover scenarios as follows:

  • Instance failover: If the hardware or container hosting the CBA fails, Cloud Bridge can immediately switch to another CBA instance with the same data center configuration that has access to the same data sources.

  • Site failover: If an entire data center location becomes inoperative due to a catastrophic event, Cloud Bridge can immediately switch to a CBA instance at another site that has access to the same data sources.

When you install a CBA, you specify the following properties that are saved to the bridge-agent.yml configuration file for the CBA:

  • Instance ID - A unique identifier for that instance

  • Site Weight - Specifies the priority (Primary, Secondary, or Backup) of the site in relation to other data center sites

    In the event of failure, all CBA instances within a Primary site take precedence over CBA instances with a Secondary site configuration, and the Secondary site instances take precedence over CBA instances with a Backup site configuration.

  • Instance Weight - Specifies the priority (Primary, Secondary, or Backup) of the instance in relation to other CBA instances

    In the event of failure, all CBA instances (within the same site) with a Primary instance weight take precedence over CBA instances with a Secondary instance configuration, and the Secondary instances take precedence over CBA instances with a Backup instance configuration.

Cloud Bridge combines the configured site weight and instance weight into a single haWeight value, which is the sum of the two weights (see Table 2-2). When a shutdown takes place, whether planned or unplanned, Cloud Bridge compares the haWeight values of all active CBA instances numerically and fails over to the instance with the highest value, which becomes the new target CBA instance. You can use whatever combination of weights you choose.

For a site, the weight values are as follows:

  • Primary = 30

  • Secondary = 20

  • Backup = 10

For an instance, the weight values are as follows:

  • Primary = 5

  • Secondary = 3

  • Backup = 1

For example, if you configure nine CBA instances in total, one for each possible combination of site weight and instance weight, failover occurs in the following order:

Table 2-2 Configuration Combinations

Order   Site             Instance         Calculated haWeight Value
1       Primary (30)     Primary (5)      35
2       Primary (30)     Secondary (3)    33
3       Primary (30)     Backup (1)       31
4       Secondary (20)   Primary (5)      25
5       Secondary (20)   Secondary (3)    23
6       Secondary (20)   Backup (1)       21
7       Backup (10)      Primary (5)      15
8       Backup (10)      Secondary (3)    13
9       Backup (10)      Backup (1)       11

NOTE:

  • Cloud Bridge does not prevent you from configuring multiple CBA instances with the same weight values. If multiple instances have the same weight, the Cloud Bridge Client selects the first instance with that weight that it detects.

  • Secondary and Backup sites and instances provide additional layers for failover in larger environments, and you can have multiple sites and instances at each level if needed. However, you do not need to assign both Secondary and Backup site and instance roles if your organization does not need this level of complexity.
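The weight arithmetic and the tie-breaking rule described above can be sketched in a few lines of Python. This is only an illustration, not Cloud Bridge code: it assumes that haWeight is the sum of the site weight and the instance weight (which matches Table 2-2), and the function names and instance IDs are hypothetical.

```python
# Weight values taken from the lists above.
SITE_WEIGHT = {"Primary": 30, "Secondary": 20, "Backup": 10}
INSTANCE_WEIGHT = {"Primary": 5, "Secondary": 3, "Backup": 1}


def ha_weight(site: str, instance: str) -> int:
    """Assumed formula: site weight plus instance weight (matches Table 2-2)."""
    return SITE_WEIGHT[site] + INSTANCE_WEIGHT[instance]


def select_target(active):
    """Pick the highest-weighted instance from (instance_id, site, instance) tuples.

    `active` is assumed to be in detection order. max() returns the first
    maximal element, so ties go to the instance that was detected first.
    """
    if not active:
        return None
    return max(active, key=lambda cba: ha_weight(cba[1], cba[2]))[0]


# Hypothetical instances, listed in the order in which they were detected.
active = [
    ("cba-b", "Secondary", "Primary"),  # haWeight 25
    ("cba-c", "Primary", "Backup"),     # haWeight 31
    ("cba-d", "Primary", "Backup"),     # haWeight 31, detected after cba-c
]
print(select_target(active))  # prints "cba-c"
```

Because the site weights are spaced further apart than the largest instance weight, any instance at a higher-priority site outranks every instance at a lower-priority site.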

2.2.2 Understanding Cloud Bridge Communication in Failover Scenarios

Each active CBA sends a heartbeat message when it starts up and every 30 seconds thereafter. The Cloud Bridge Client (CBC) monitors these heartbeats, which allows it to detect an instance outage quickly.

  • Planned shutdown: When an administrator intentionally shuts down a CBA instance (for maintenance, re-hosting, and so on), the CBA immediately sends a heartbeat to the Cloud Bridge Client, so that failover to a new target can take place right away.

  • Unplanned shutdown: When the target CBA instance can no longer communicate its heartbeat to the Cloud Bridge Client, an agent monitoring task in the Client detects that the target CBA has not communicated within a configured timeframe (one minute by default). The Client then attempts to ping the CBA. If the ping attempt fails, the Cloud Bridge Client marks the target CBA as unresponsive and initiates the CBA target selection process.

Whether the shutdown was planned or unplanned, after the CBC selects a new target CBA from the current list of active CBA instances, the CBC sends a ping command to all CBA instances to inform them of the new CBA target selection. The CBC also loads the current data source configurations into the new target CBA. All data collection sessions that were active at the time of the shutdown fail, and new commands and collections are routed to the new target CBA.

If no active and initialized CBA is available, the CBC marks the data center as unconnected and uninitialized. All command traffic that is sent to the data center receives an immediate “no agent available” error response.

If a non-target CBA instance is shut down, whether intentionally or not, the CBC simply removes it from the list of active agent instances.
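The client-side behavior described in this section can be modeled roughly as follows. This is a toy sketch under stated assumptions, not the product's implementation: the class and method names are hypothetical, a boolean stands in for the result of a real ping, and the haWeight values are supplied directly. The source does not say what happens when a heartbeat is overdue but the ping succeeds, so the sketch simply leaves the target in place in that case.

```python
from dataclasses import dataclass

HEARTBEAT_TIMEOUT = 60.0  # seconds; the default one-minute window


@dataclass
class Agent:
    instance_id: str
    ha_weight: int
    last_heartbeat: float   # timestamp of the most recent heartbeat
    reachable: bool = True  # stand-in for the result of a ping attempt


class Client:
    """Toy model of the client-side bookkeeping (hypothetical names)."""

    def __init__(self, agents):
        self.active = list(agents)  # kept in detection order
        self.target = self._select()

    def _select(self):
        # Highest haWeight wins; max() keeps the first agent on ties.
        return max(self.active, key=lambda a: a.ha_weight, default=None)

    def remove(self, agent):
        """Drop an agent from the active list; re-select if it was the target."""
        self.active.remove(agent)
        if agent is self.target:
            self.target = self._select()

    def check_target(self, now):
        """Monitoring task: heartbeat overdue, then ping, then mark unresponsive."""
        t = self.target
        if t and now - t.last_heartbeat > HEARTBEAT_TIMEOUT and not t.reachable:
            self.remove(t)

    def send_command(self, command):
        if self.target is None:
            return "no agent available"
        return f"{command} -> {self.target.instance_id}"


a = Agent("Hou_Instance_1", 35, last_heartbeat=0.0)
b = Agent("Pro_Instance_1", 25, last_heartbeat=100.0)
client = Client([a, b])
a.reachable = False             # simulate an unplanned outage of the target
client.check_target(now=120.0)  # heartbeat is 120 s old and the ping fails
print(client.send_command("collect"))  # prints "collect -> Pro_Instance_1"
```

Removing a non-target agent only shortens the active list, while removing the last remaining agent leaves no target, so every command receives the "no agent available" response.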

2.2.3 Understanding Encryption IV and Key Values

Cloud Bridge uses initialization vectors (IVs) and keys to encrypt and decrypt customer data and the data source connection passwords that are stored in the CBAs. For more information, see the TechTarget and Wikipedia entries on initialization vectors.

Each instance of the CBA configuration in your high availability environment must use the same encryption IV and Key values. Using the same values on all CBA instances ensures that no disruptions will occur during failovers.

If you create new encryption IV and Key values for the CBA configurations, the credentials stored in the CBAs (which were encrypted using the older values) become invalid. At that point, you must reset the password values. The CBA console indicates which credential sets are invalid.

For information about how to set the same encryption IV and Key values on secondary and backup CBA instances after you have installed your primary CBA instances, see Installing Secondary and Backup CBA Instances.

2.2.4 Essential Components of Successful Failover

The following are critical elements of successful configuration of the CBA high availability components:

  • Each CBA instance must have a unique Instance ID.

  • Each instance of the CBA configuration must use the same encryption IV and Key values. For more information, see Understanding Encryption IV and Key Values.

  • Cloud Bridge does not prevent you from configuring multiple CBA instances with the same weight values. If multiple instances have the same weight, the Cloud Bridge Client selects the first instance with that weight that it detects.

  • To enable effective failover, it is important to ensure that the set of data source credentials that are stored in the high availability CBAs are correct and consistent. After you have set up credentials for your first CBA, you can export and import credentials for the remaining CBAs.
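The requirements in this list lend themselves to a simple pre-deployment check. The following sketch is an assumption-laden illustration rather than a Cloud Bridge tool: the dictionary keys are hypothetical, and 'key_fingerprint' stands for any digest of the encryption IV and Key values that you record yourself, so that the secrets are never compared in the clear.

```python
from collections import Counter


def check_ha_config(instances):
    """Return a list of problems found in a list of instance dicts.

    Each dict is assumed to carry 'instance_id', 'ha_weight', and
    'key_fingerprint' (a hypothetical digest of the encryption IV and Key).
    """
    problems = []

    # Instance IDs must be unique.
    id_counts = Counter(i["instance_id"] for i in instances)
    for instance_id, n in id_counts.items():
        if n > 1:
            problems.append(f"error: Instance ID {instance_id!r} is used {n} times")

    # Every instance must use the same encryption IV and Key values.
    if len({i["key_fingerprint"] for i in instances}) > 1:
        problems.append("error: encryption IV and Key values differ between instances")

    # Equal weights are allowed, but the failover target then depends on
    # detection order, so report them as warnings only.
    weight_counts = Counter(i["ha_weight"] for i in instances)
    for weight, n in weight_counts.items():
        if n > 1:
            problems.append(f"warning: {n} instances share haWeight {weight}")

    return problems


# Hypothetical plan with a mismatched key and a shared weight.
plan = [
    {"instance_id": "Hou_Instance_1", "ha_weight": 35, "key_fingerprint": "ab12"},
    {"instance_id": "Hou_Instance_2", "ha_weight": 33, "key_fingerprint": "ab12"},
    {"instance_id": "Pro_Instance_1", "ha_weight": 33, "key_fingerprint": "ff00"},
]
for problem in check_ha_config(plan):
    print(problem)
```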

NOTE: It is possible to run a single CBA instance using a pre-1.9.0 CBA bridge-agent.yml file. In that case, the Instance ID defaults to an empty string, and the Site Weight and Instance Weight values default to Backup. However, we do not recommend this scenario except for continuity during a CBA upgrade process.

2.2.5 Planning for Failover

Before you begin preparing servers to install the Cloud Bridge Agent, you should determine the following:

  • If you have more than one data center, which one will be the primary site, and which ones will serve as secondary and backup sites in the event of catastrophic site failure

  • Number of CBA servers you plan to install in each data center

  • The priority (primary, secondary, or backup) of each CBA server you plan to install

  • The naming convention you will use to identify each CBA instance

We recommend that you create a spreadsheet similar to the following example to record your decisions. Ensure that you keep this document up to date as you make changes in your environment.

Table 2-3 High Availability Planning Spreadsheet

Data Center Site Name    Site Priority   Instance ID      Instance Priority
Houston data center      Primary         Hou_Instance_1   Primary
                                         Hou_Instance_2   Secondary
                                         Hou_Instance_3   Backup
Provo data center        Secondary       Pro_Instance_1   Primary
                                         Pro_Instance_2   Secondary
                                         Pro_Instance_3   Backup
Cambridge data center    Backup          Cam_Instance_1   Primary
                                         Cam_Instance_2   Secondary
                                         Cam_Instance_3   Backup
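If you keep the planning spreadsheet as a CSV file, a short script can confirm the failover order that your choices imply. The sketch below is illustrative only: the column names are hypothetical, the data is an excerpt of the example table, blank site cells are assumed to repeat the value from the row above, and haWeight is assumed to be the sum of the site and instance weights listed in Understanding High Availability in Cloud Bridge.

```python
import csv
import io

SITE_WEIGHT = {"Primary": 30, "Secondary": 20, "Backup": 10}
INSTANCE_WEIGHT = {"Primary": 5, "Secondary": 3, "Backup": 1}

# Excerpt of the planning table exported as CSV (hypothetical column names).
PLAN_CSV = """site,site_priority,instance_id,instance_priority
Houston data center,Primary,Hou_Instance_1,Primary
,,Hou_Instance_2,Secondary
Provo data center,Secondary,Pro_Instance_1,Primary
Cambridge data center,Backup,Cam_Instance_1,Primary
"""


def failover_order(plan_csv):
    """Return (instance_id, haWeight) pairs, highest weight first."""
    rows = []
    site_priority = None
    for row in csv.DictReader(io.StringIO(plan_csv)):
        # Carry the site priority down when the cell is left blank.
        site_priority = row["site_priority"] or site_priority
        weight = SITE_WEIGHT[site_priority] + INSTANCE_WEIGHT[row["instance_priority"]]
        rows.append((row["instance_id"], weight))
    return sorted(rows, key=lambda r: r[1], reverse=True)


for instance_id, weight in failover_order(PLAN_CSV):
    print(f"{weight:>2}  {instance_id}")
```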

2.2.6 Recommended Installation Order

If you have a large organization with multiple servers and sites, the most efficient method for you to set up your environment for successful failover is as follows:

  1. Install your first CBA instance, saving your encryption IV and Key values for reuse.

  2. Install all subsequent CBA instances using the encryption IV and Key values that you saved from your first CBA instance.

  3. Configure your first CBA instance with the credentials for all data sources you plan to use, then export those credentials to a file for reuse.

    NOTE: We recommend that you verify all data source credentials are correct before you export them for reuse.

  4. Import the data source credentials to each secondary and backup instance one at a time.