FSM Adapters consume large amounts of memory and go into an unstable state.
Originally Published: 2010-06-07
Article Number
000061404
Applies To
RSA File Security Manager (FSM) 2.2.1
RSA File Security Manager (FSM)
Microsoft Windows Server 2003 SP2
Issue
FSM Adapters consume large amounts of memory and go into an unstable state.
The FSM VDSFWinPCService.exe application uses a large amount of memory.
Audit logs are sometimes not shipped.
Cause

When the Adapters on both clusters and the Adapter Manager are upgraded from 2.1.0.9 to 2.2.1, an issue in the upgrade causes both Adapters to get the same Adapter ID. At the customer's site it is A62F2F58-6E79-49f8-BDDC-A4AD18B8E1BD. As a result, both Adapters send heartbeats and audit logs to the Adapter Management Service (AMS) with the same Adapter ID.

AMS identifies Adapters by the Adapter ID, which is supposed to be unique for each Adapter. AMS also creates a folder named after the Adapter ID in the central repository to store all the audit log files that are sent from that Adapter. When an Adapter sends an audit file, the file is broken into small fragments (in this case 10MB each, as per FSAdapter.ini), and each fragment is sent to AMS as a single message together with the Adapter ID and a serial fragment number. When AMS receives a message from an Adapter, it extracts the Adapter ID and serial fragment number from the message and appends the fragment to the part file in the folder created for that Adapter in the central repository.

Since both Adapters have the same ID in this case, AMS cannot distinguish one from the other. When both Adapters send audit logs at the same time, their fragments are assembled into the same part file, so the assembly logic breaks down. This can cause unpredictable behavior in AMS, including a deadlock in which all the threads hang.
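The collision described above can be illustrated with a minimal Python sketch. It is not the AMS implementation: the Repository class, the function names, the 4-byte fragment size and the sample payloads are all hypothetical, and the only behavior taken from this article is that fragments are keyed by Adapter ID and serial fragment number.

```python
from collections import defaultdict

FRAGMENT_SIZE = 4  # bytes; tiny for illustration (the article cites 10MB)


def fragment(adapter_id, payload, size=FRAGMENT_SIZE):
    """Split payload into (adapter_id, serial_number, chunk) messages."""
    return [
        (adapter_id, serial, payload[offset:offset + size])
        for serial, offset in enumerate(range(0, len(payload), size))
    ]


class Repository:
    """Hypothetical stand-in for the central repository: one part file per Adapter ID."""

    def __init__(self):
        self.part_files = defaultdict(dict)  # adapter_id -> {serial: chunk}

    def receive(self, message):
        adapter_id, serial, chunk = message
        # Fragments are keyed only by Adapter ID and serial number.
        self.part_files[adapter_id][serial] = chunk

    def assembled(self, adapter_id):
        parts = self.part_files[adapter_id]
        return b"".join(parts[s] for s in sorted(parts))


def interleave(a, b):
    """Alternate messages from two senders, as if they arrived concurrently."""
    out = []
    for x, y in zip(a, b):
        out.extend([x, y])
    longer = a if len(a) > len(b) else b
    out.extend(longer[min(len(a), len(b)):])
    return out


def simulate(id_a, id_b):
    """Send two audit logs concurrently and return what the repository assembled."""
    log_a = b"AAAAAAAAAAAA"
    log_b = b"BBBBBBBB"
    repo = Repository()
    for msg in interleave(fragment(id_a, log_a), fragment(id_b, log_b)):
        repo.receive(msg)
    return repo.assembled(id_a), repo.assembled(id_b)
```

With distinct IDs, simulate("ID-1", "ID-2") returns both logs intact. With a shared ID, simulate("SAME", "SAME") returns a single mixed part file that matches neither original log, because fragments with the same serial number overwrite each other.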

Thread id = 2432 :SSLHeartbeatListener::Start().\SSL_heartbeat_listener.cpp:352 04/08/10 06:48:34 INFO : Waiting for new connection (8968625)...

Thread id = 2432 :SSLHeartbeatListener::PrintRequestQueue().\SSL_heartbeat_listener.cpp:399 04/08/10 08:22:36 INFO : Size of Request Queue = 3

Thread id = 2432 :SSLHeartbeatListener::Start().\SSL_heartbeat_listener.cpp:352 04/08/10 08:22:36 INFO : Waiting for new connection (8968626)...

Thread id = 2432 :SSLHeartbeatListener::PrintRequestQueue().\SSL_heartbeat_listener.cpp:399 04/08/10 09:28:32 INFO : Size of Request Queue = 4

...

Thread id = 2432 :SSLHeartbeatListener::Start().\SSL_heartbeat_listener.cpp:352 04/14/10 08:06:15 INFO : Waiting for new connection (8968684)...

Thread id = 2432 :SSLHeartbeatListener::PrintRequestQueue().\SSL_heartbeat_listener.cpp:399 04/14/10 08:19:13 INFO : Size of Request Queue = 62

In the above AMS logs, the request queue grows from 3 to 62 entries while the listener keeps waiting for new connections. This shows that all the AMS threads are deadlocked, so no threads are available in AMS to serve the new requests coming from the Adapters.
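The queue growth seen in the log excerpt above could be checked with a short Python sketch along these lines. The regular expression and the month/day/year timestamp reading are assumptions based only on the lines quoted in this article, and the helper names are hypothetical. A queue that never shrinks is only a hint of a hang, not proof of a deadlock.

```python
import re
from datetime import datetime

# Matches the PrintRequestQueue lines quoted in this article.
QUEUE_RE = re.compile(
    r"(\d{2}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}) INFO : Size of Request Queue = (\d+)"
)

SAMPLE = [
    r"Thread id = 2432 :SSLHeartbeatListener::Start().\SSL_heartbeat_listener.cpp:352 04/08/10 06:48:34 INFO : Waiting for new connection (8968625)...",
    r"Thread id = 2432 :SSLHeartbeatListener::PrintRequestQueue().\SSL_heartbeat_listener.cpp:399 04/08/10 08:22:36 INFO : Size of Request Queue = 3",
    r"Thread id = 2432 :SSLHeartbeatListener::PrintRequestQueue().\SSL_heartbeat_listener.cpp:399 04/08/10 09:28:32 INFO : Size of Request Queue = 4",
    r"Thread id = 2432 :SSLHeartbeatListener::PrintRequestQueue().\SSL_heartbeat_listener.cpp:399 04/14/10 08:19:13 INFO : Size of Request Queue = 62",
]


def queue_sizes(lines):
    """Return (timestamp, size) pairs from request queue log lines."""
    samples = []
    for line in lines:
        match = QUEUE_RE.search(line)
        if match:
            # Assumption: timestamps are month/day/two-digit year.
            stamp = datetime.strptime(match.group(1), "%m/%d/%y %H:%M:%S")
            samples.append((stamp, int(match.group(2))))
    return samples


def never_drains(samples):
    """True if the queue size never decreases across at least two samples."""
    sizes = [size for _, size in samples]
    return len(sizes) > 1 and all(b >= a for a, b in zip(sizes, sizes[1:]))
```

Calling never_drains(queue_sizes(SAMPLE)) on the quoted lines reports a queue that only grows, which is consistent with requests being accepted but never served.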

Since AMS is deadlocked, it does not accept any new messages. The Adapter keeps resending the same message until it succeeds. In the meantime, another thread in the Adapter keeps loading the remaining part of the audit log into the queue. As per the FSAdapter.ini file, the maximum queue size is 4GB, so the Adapter tries to fill the queue with audit logs up to 4GB. Because the maximum memory size of a process on this Windows system is 2GB (the default user-mode limit for a 32-bit process), the Adapter goes into an unstable state once it reaches 2GB.
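The mismatch between the two limits can be worked through with a few lines of Python. This is a rough sketch only: it assumes binary units, assumes queued audit data is held in memory as whole 10MB fragments, and ignores every other allocation in the process. The function names are hypothetical.

```python
GIB = 1024 ** 3
MIB = 1024 ** 2

configured_queue_limit = 4 * GIB  # maximum queue size from FSAdapter.ini, per this article
process_memory_limit = 2 * GIB    # per-process limit cited in this article
fragment_size = 10 * MIB          # fragment size cited in this article


def fragments_until_limit(limit_bytes, fragment_bytes):
    """Number of whole fragments that fit below the given limit."""
    return limit_bytes // fragment_bytes


def queue_cap_reachable(queue_limit, process_limit):
    """True only if the queue can fill to its cap within the process limit."""
    return queue_limit <= process_limit


print(fragments_until_limit(process_memory_limit, fragment_size))        # 204
print(queue_cap_reachable(configured_queue_limit, process_memory_limit))  # False
```

Under these assumptions the process limit is reached after about 204 queued fragments, long before the configured 4GB queue cap could apply, so the cap never protects the Adapter.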


Resolution

The main issue is that both Adapters have the same Adapter ID. To address this, the Adapter ID of one of the Adapters has to be changed. To change the Adapter ID, remove one of the Adapters and reconfigure it in Adapter Manager so that a new Adapter ID is assigned to it.
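Before and after reconfiguring, it may help to confirm that no two Adapters share an ID. The Python sketch below assumes the IDs have already been collected by hand into (host, Adapter ID) pairs. This article does not say where the ID is stored, so the inventory, the host names and the function name are hypothetical. IDs are compared case-insensitively because the ID quoted in this article mixes upper and lower case.

```python
from collections import defaultdict


def find_duplicate_ids(adapters):
    """Map each Adapter ID used by more than one host to the hosts using it.

    adapters: iterable of (host_name, adapter_id) pairs.
    """
    by_id = defaultdict(list)
    for host, adapter_id in adapters:
        # Normalize: trim whitespace, drop optional braces, ignore case.
        normalized = adapter_id.strip().strip("{}").upper()
        by_id[normalized].append(host)
    return {aid: hosts for aid, hosts in by_id.items() if len(hosts) > 1}


# Hypothetical inventory; only the first ID comes from this article.
inventory = [
    ("cluster-node-1", "A62F2F58-6E79-49f8-BDDC-A4AD18B8E1BD"),
    ("cluster-node-2", "{a62f2f58-6e79-49f8-bddc-a4ad18b8e1bd}"),
    ("cluster-node-3", "11111111-2222-3333-4444-555555555555"),
]

for adapter_id, hosts in find_duplicate_ids(inventory).items():
    print(adapter_id, "is shared by", ", ".join(hosts))
```

An empty result after the reconfiguration suggests that each Adapter now reports its own ID.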

The customer did not see any issue with the Adapter ID in 2.1.0.9 because AMS, the FSM component that identifies each Adapter by its Adapter ID, was introduced in 2.2.1.