Configure agent for automatic fail over
Hope everyone is fine and safe during these tough times.
I would like to point out that I'm not an SME, but I have been given responsibility for RSA and am learning as I go.
We have one instance of Authentication Manager and one replica.
We have a few agents installed.
How do I either configure a new agent or reconfigure a pre-existing agent to automatically do the load balancing between available servers (primary and replica) and automatically fail over to the replica if primary instance becomes unreachable?
The reason I'm asking is that last week we had an issue at the site where the primary instance is located, and authentication did not switch over automatically to the replica.
Thank you in advance
At the risk of sounding like a consultant, the answer is "it depends."
Legacy agents that use UDP port 5500, such as the Authentication Agent for Windows version 7.4.x or earlier, learn which replicas are available and avoid those that are not responding. If a replica goes offline suddenly, they should time out and fail over in 30 seconds (5 retries at a 5-second timeout each). However, local files such as sdopts.rec and sdstatus.12 can override the use of some replicas. You can safely delete sdstatus.12 (or sdstatus.1 on PAM v. 7 or earlier agents), since it is basically an elephant cache file. If sdopts.rec exists, you need to investigate why someone implemented it. Look in the same directory as sdconf.rec.
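As a quick way to see whether any of those files are present on an agent host, something like the following could be run against the agent's configuration directory. This is only a sketch: the function name is made up for illustration, the directory you pass in is whatever holds sdconf.rec on your system, and it assumes the sdstatus files sit in that same directory.

```python
from pathlib import Path

# File names taken from the reply above. The directory is an assumption:
# pass whichever directory holds sdconf.rec on your agent host.
OVERRIDE_FILES = ("sdopts.rec", "sdstatus.12", "sdstatus.1")


def find_override_files(config_dir):
    """Return the names of override/cache files present in config_dir."""
    base = Path(config_dir)
    return [name for name in OVERRIDE_FILES if (base / name).is_file()]
```

Calling `find_override_files(...)` with your own path returns a list such as `["sdopts.rec"]`. An empty list means none of the three files were found there. The script only reports; it does not delete anything.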
Newer ReST agents, such as MFA agent v. 1.x or PAM agent 8.x in ReST configuration, only have fail-over capabilities that must be configured. ReST agents authenticate against TCP port 5555. ReST agent support must be enabled or configured in the Security Console under Setup, and unlike most Auth Manager settings, which are replicated from the primary to the replicas, this one must be enabled on the replicas too. If you have ReST agents that are not failing over, I'd check this first, but on the replica's Security Console.
There are also some TCP agents, mostly partner products, that use agent API ver. 8.5 or 8.6 and authenticate against TCP port 5500.
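Before digging into the console, it can help to confirm from an agent host that TCP port 5555 is reachable on both the primary and the replica. Here is a minimal sketch of such a check. The function names and the 3-second timeout are my own choices, not anything from RSA, and an open port only shows that something is listening. It does not prove that ReST agent support is enabled.

```python
import socket

REST_PORT = 5555  # TCP port named in the reply above


def port_open(host, port=REST_PORT, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution errors.
        return False


def check_servers(hosts, port=REST_PORT, timeout=3.0):
    """Map each host name or address to its reachability on the given port."""
    return {host: port_open(host, port, timeout) for host in hosts}
```

Pass `check_servers` a list containing your primary and replica host names. A `False` for the replica is a strong hint to look at the replica's Security Console and at any firewall in between.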
To expand on Jay's response, the newer "REST" agents are configured by a Fully Qualified Host Name (FQHN) in a URL. The agent will resolve this FQHN to one or more addresses. The agent will "round-robin" between all the addresses to which the FQHN resolved.
For example, the administrator could define a new DNS name like "authservices.corp.com". If a replica is being added, the replica's address could be added to the DNS entry once the server is available. Similarly, if a server were being shut down, its address would be removed from the DNS configuration. This same mechanism could be used to control specific sets of agents and force them to use only specific servers. I could have an FQHN for each agent group and use that FQHN to control the servers to which those agents send authentication requests. The agent uses a closed CA, so the FQHN for the server addresses need not have any relation to the specific system's FQHN.
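To see what an agent would get back for a name like that, you can resolve it yourself and list every address behind it. The sketch below illustrates the idea only and is not the agent's actual code. The function names are hypothetical, and the port default of 5555 simply follows the earlier reply.

```python
import itertools
import socket


def resolve_all(fqhn, port=5555):
    """Return every distinct address the name resolves to, in resolver order."""
    infos = socket.getaddrinfo(fqhn, port, type=socket.SOCK_STREAM)
    addresses = []
    for _family, _type, _proto, _canon, sockaddr in infos:
        if sockaddr[0] not in addresses:
            addresses.append(sockaddr[0])
    return addresses


def round_robin(addresses):
    """Yield the addresses in order, starting over after the last one."""
    return itertools.cycle(addresses)
```

Running `resolve_all` on your own DNS name should list both the primary and the replica. If only one address comes back, the agent has nothing to rotate to, whatever the server-side settings are.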
The REST agent provides a lot of flexibility and an improved user experience. Agents should recover more gracefully if a server becomes unavailable. Unlike the legacy UDP agent, the REST agent has a "fail-fast" design with a short connection timeout and a longer read timeout. If a server is unavailable (and the agent is unable to connect), the agent moves on to the next server. In the legacy UDP agent, a server failing to respond results in an authentication failure.
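The fail-fast idea can be sketched at the plain TCP level as follows. This is an illustration under assumptions, not the agent's implementation: the function name and the 2-second and 30-second timeouts are invented for the example, and a real REST agent would go on to speak HTTPS over the connection.

```python
import socket


def connect_first_available(endpoints, connect_timeout=2.0, read_timeout=30.0):
    """Try each (host, port) in turn and return the first socket that connects.

    A short connect timeout lets an unreachable server be skipped quickly;
    the longer read timeout applies only once a connection exists.
    """
    last_error = None
    for host, port in endpoints:
        try:
            sock = socket.create_connection((host, port), timeout=connect_timeout)
        except OSError as exc:
            last_error = exc  # remember why this server was skipped
            continue
        sock.settimeout(read_timeout)
        return sock, (host, port)
    raise ConnectionError(f"no server reachable: {last_error}")
```

With the primary listed first and the replica second, an unreachable primary costs at most the connect timeout before the replica is tried.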