Agent Failover in AutoMate BPA Server 7

by marjo martinez, in Tech Talk, posted 10/5/09

Introduction

AutoMate BPA Server provides "Failover" capability for servers, systems or networks that require continuous availability and a high degree of reliability. Failover is the ability of an Agent to switch over automatically to a redundant or standby AutoMate BPA Server component upon failure or abnormal termination of the previously active server component. Failover happens without human intervention. The second server takes over the work of the first as soon as a loss of communication is detected.

 

Requirements

AutoMate BPA Failover relies on the following conditions in order to work properly:

  • Two AutoMate BPA Server components and at least one Agent.
  • An external database back end is in use as the BPA Datastore.

Although not a requirement, it is recommended that the primary, secondary and (if applicable) tertiary BPA Server components be installed on machines separate from those hosting the BPA Development Tools (i.e., the SMC, WFD and Task Builder components), the BPA Agents and the external database used as the BPA Datastore.

 

Concept

The Failover functionality is primarily Agent-based: remote Agents are responsible for communicating with the server and can connect to a secondary or tertiary server if the connection to the primary server is lost. This is performed independently of the BPA Server. Each Agent can be configured with up to three server hostnames (or IP addresses), giving a hierarchical order of primary, secondary and tertiary servers. The hostnames are stored in the registry under a single key value, separated by semicolons. The Agent always attempts to connect to the primary hostname first. If that server is not available, the Agent tries the secondary server and then the tertiary server. This process cycles until the Agent successfully connects to a server.
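The connection order described above can be sketched in a few lines of Python. This is only an illustration of the concept, not the Agent's actual implementation: the names parse_hosts, first_available and can_connect are hypothetical, and the max_cycles limit is an assumption added so that the example terminates (the Agent itself keeps cycling until it connects).

```python
from typing import Callable, List, Optional, Tuple

Server = Tuple[str, int]  # (hostname or IP address, port)


def parse_hosts(value: str) -> List[Server]:
    """Split 'name1:7100;name2:7100' into ordered (host, port) pairs."""
    servers = []
    for entry in value.split(";"):
        entry = entry.strip()
        if not entry:
            continue  # tolerate a trailing semicolon
        host, sep, port = entry.rpartition(":")
        if not sep or not host or not port.isdigit():
            raise ValueError(f"expected 'host:port', got {entry!r}")
        servers.append((host, int(port)))
    return servers


def first_available(
    servers: List[Server],
    can_connect: Callable[[Server], bool],
    max_cycles: int = 3,  # assumption: cap the loop so the sketch terminates
) -> Optional[Server]:
    """Try primary, then secondary, then tertiary; repeat up to max_cycles."""
    for _ in range(max_cycles):
        for server in servers:
            if can_connect(server):
                return server
    return None


servers = parse_hosts("servername1:7100;servername2:7100;servername3:7100")
# Pretend that only the secondary server is reachable.
chosen = first_available(servers, lambda s: s[0] == "servername2")
print(chosen)  # ('servername2', 7100)
```

Passing the reachability check in as a callback keeps the ordering logic separate from any real network code, so the order of attempts can be verified without live servers.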

If the Agent is connected to the primary server and that connection is lost, the Agent sets a fifteen-second timer and attempts to connect to the secondary and then the tertiary server, as described above.

If the Agent connects to a secondary or tertiary server, a thirty-second timer is set. When the timer expires, a separate thread is started to test whether the primary server is available. If it is, the current connection is closed and the Agent reconnects to the primary server.
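The two timers can be summarized as a small decision function. The sketch below is an interpretation of the behavior described above, not the Agent's code: the function name plan and the action labels are hypothetical, and it assumes the timer elapses before the action is taken. What happens when the connection to a secondary or tertiary server is itself lost is not described in this article, so the sketch does not model it.

```python
from typing import Tuple

RETRY_DELAY_S = 15      # from the article: timer set after the primary is lost
FAILBACK_CHECK_S = 30   # from the article: timer set while on a standby server


def plan(current_index: int, connection_lost: bool,
         primary_reachable: bool) -> Tuple[int, str]:
    """Return (seconds to wait, action) for the Agent's current situation.

    current_index: 0 = primary, 1 = secondary, 2 = tertiary.
    """
    if current_index == 0:
        if connection_lost:
            # Lost the primary: wait, then walk down the server list.
            return RETRY_DELAY_S, "try_secondary_then_tertiary"
        return 0, "stay"
    # Connected to a standby server: probe the primary after the timer.
    if primary_reachable:
        return FAILBACK_CHECK_S, "reconnect_to_primary"
    return FAILBACK_CHECK_S, "stay"


print(plan(0, connection_lost=True, primary_reachable=False))
print(plan(1, connection_lost=False, primary_reachable=True))
```

The first call models an Agent that has just lost the primary server; the second models an Agent on the secondary server whose probe finds the primary available again.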

 

Instructions

  1. On each machine where an Agent is installed, create a backup of the registry.
  2. After backing up the Registry, go to the following Registry location: HKEY_LOCAL_MACHINE\SOFTWARE\Network Automation\BPA Agent\TaskService\Agent\Host
  3. Enter the name or IP address of the primary, secondary and tertiary servers, each followed by its port, and separate the entries with semicolons (see Figure 1). For example: servername1:7100;servername2:7100;servername3:7100
  4. Exit the Registry and repeat for all relevant Agent machines.

NOTE: Be aware that careless Registry editing can make your system malfunction or even keep you from starting Windows. Proceed with caution.
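When the same value has to be entered on many Agent machines, it can help to generate the string once and paste it, rather than typing it each time. The following Python sketch builds the value in the format shown in step 3. The function name build_host_value is hypothetical, the default port 7100 is taken from the example above, and the script only produces the text; it does not touch the Registry.

```python
from typing import Sequence

MAX_SERVERS = 3  # primary, secondary and tertiary, per the article


def build_host_value(names: Sequence[str], port: int = 7100) -> str:
    """Join server names into 'name1:port;name2:port;name3:port'.

    Handles plain hostnames and IPv4 addresses only; names containing
    ':' or ';' are rejected because they would break the format.
    """
    if not 1 <= len(names) <= MAX_SERVERS:
        raise ValueError("expected one to three server names")
    if not 1 <= port <= 65535:
        raise ValueError("port out of range")
    for name in names:
        if not name or ";" in name or ":" in name or name != name.strip():
            raise ValueError(f"invalid server name: {name!r}")
    return ";".join(f"{name}:{port}" for name in names)


print(build_host_value(["servername1", "servername2", "servername3"]))
# servername1:7100;servername2:7100;servername3:7100
```

The order of the list is the failover order, so the primary server must come first.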

 

Figure 1.  Editing the registry key for Agent failover.

 

Conclusion

AutoMate BPA Server’s Failover functionality is a backup mechanism that automatically switches to a standby server if the primary system fails or is temporarily shut down. Failover is an important fault-tolerance function for mission-critical systems that rely on constant accessibility, because it redirects requests from the failed or unavailable system to a backup system that mirrors the operations of the primary system.