Configurations
JMS topic configuration: Non-persistent, non-durable topic
JMS topic name: CLIENT-UPDATE
Publishers: Application server
Consumers: GUI
Message payload: Serialized Objects
Message purpose: Update GUI with latest information
Message expiration: No expiration on message
Message acknowledgement: Auto acknowledgement
System setup:
We have two active application servers publishing JMS messages to the topic when we receive data change events.
We have any number of Swing GUI clients on desktops that connect to the JMS topic for real-time updates to the UI.
What is the Expected Behavior?
Publishers publish messages on the CLIENT-UPDATE topic. The JMS broker delivers messages to the consumers that are active at the time. The JMS broker gets acknowledgements from those consumers. The message disappears from the topic. So, we should not see any queued messages in the topic at any given time.
What is the Issue?
The CLIENT-UPDATE topic accumulates messages. The messages do not get removed from the topic. They seem to stay on the topic as if the message broker were waiting for a durable consumer that is not active. Thus, we believe that the issue is caused by orphan JMS connections. That is, the JMS broker thinks these connections are alive while in reality they have no consumer associated with them.
How can you replicate the problem (root cause)?
Users log in to the system through the Swing GUI. As part of logging in, the GUI makes a JMS connection to the topic to receive updates to the data in the UI.
Here is the scenario in which the issue arises:
A user logs into the system remotely through VPN. The application is fully functional. The GUI makes JMS connections and receives messages from the JMS topic. After some time, the VPN gets disconnected. In this case, the application is not closed, but all of its connections (the connection to JMS and the connection to the application server) are severed.
Here, the JMS connection is not cleaned up, which may leave an "orphan" connection on the JMS broker. The JMS broker thinks that the JMS connection from this GUI is still active while in reality it is not.
The above scenario can be replicated by unplugging the network cable to simulate the VPN disconnect.
A similar case is possible when the GUI JVM crashes suddenly and the JMS connection does not get cleaned up.
In such a case, the JMS broker thinks that the "orphan" connection is still active and waits for an acknowledgement from the consumer. There will never be an acknowledgement, as there is no consumer associated with the connection. Therefore, the messages get queued in the topic as if it were a persistent topic.
In the rare case of a JVM crash, this seems to be pretty much unavoidable. Still, in such a setup, one would expect the JMS broker to recognize the "orphan" connection and automatically close or ignore it.
One would expect the JMS implementation of auto acknowledgement to recognize that a crashed client never cleaned up its JMS connection, and to issue a reconnect. However, this may not happen, because when the GUI starts up again it looks like a new application. We have configured the consumers as non-durable consumers, and the application may not present the same ID when it connects to JMS. Thus, the broker treats the connection as new rather than reusing the old one.
What is the Resolution?
One resolution for the case above is to set the message expiration to a small value, say 10 or 15 seconds. As these messages are non-persistent and the GUI is by design fine with not receiving every message, the messages can safely expire. So, even though there are "orphan" connections, the messages will get purged from the topic. This resolution masks the problem but does not actually resolve it.
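To make the expiration rule concrete, here is a minimal sketch of how a time-to-live translates into an expiration timestamp under the JMS rules: expiration is the send time plus the time-to-live, and a time-to-live of zero means the message never expires (which matches the current "No expiration on message" configuration). The class ExpiryRule and its method names are hypothetical and exist only for illustration; the sketch uses the plain JDK so it runs without a broker. In the real publisher, the corresponding setting would presumably be MessageProducer.setTimeToLive(long) with a value such as 10000 milliseconds.

```java
// Hypothetical helper that models how a JMS provider derives JMSExpiration.
// In real publisher code the equivalent setting would be
// producer.setTimeToLive(ttlMillis) on the javax.jms.MessageProducer.
final class ExpiryRule {
    private ExpiryRule() {}

    /** Expiration timestamp for a message sent at sendTimeMillis; 0 means "never expires". */
    static long expirationFor(long sendTimeMillis, long ttlMillis) {
        if (ttlMillis < 0) {
            throw new IllegalArgumentException("ttlMillis must be >= 0");
        }
        return ttlMillis == 0 ? 0L : sendTimeMillis + ttlMillis;
    }

    /** True once the expiration time has been reached and the message may be discarded. */
    static boolean isExpired(long expirationMillis, long nowMillis) {
        return expirationMillis != 0L && nowMillis >= expirationMillis;
    }
}
```

With a 10 second time-to-live, a message sent at t = 1000 ms becomes eligible for purging at t = 11000 ms, regardless of whether an orphan connection ever acknowledges it.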
One way to resolve the problem is to trap JMS exceptions and then use the Session interface to recover the session. This works as long as the client actually receives an exception when it is disconnected from the JMS broker. As discussed below, there are cases in which no client-side solution exists.
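The exception-trapping approach can be sketched as a small retry loop. This sketch rebuilds the connection rather than calling Session.recover(), on the assumption that a severed transport needs a fresh connection, session, and subscriber; that choice is ours, not something the setup above prescribes. Connector and Reconnector are hypothetical names, and the sleep function is injected so the logic can be exercised without real delays.

```java
import java.util.function.LongConsumer;

/** Hypothetical stand-in for "create connection, session and topic subscriber". */
interface Connector {
    void connect() throws Exception;
}

/** Retries a Connector a bounded number of times with a fixed pause between attempts. */
final class Reconnector {
    private final Connector connector;
    private final int maxAttempts;
    private final long pauseMillis;
    private final LongConsumer sleeper; // injected so callers decide how to wait

    Reconnector(Connector connector, int maxAttempts, long pauseMillis, LongConsumer sleeper) {
        this.connector = connector;
        this.maxAttempts = maxAttempts;
        this.pauseMillis = pauseMillis;
        this.sleeper = sleeper;
    }

    /** Returns the number of attempts used, or -1 if every attempt failed. */
    int reconnect() {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                connector.connect();
                return attempt;
            } catch (Exception e) {
                // Pause only if another attempt will follow.
                if (attempt < maxAttempts) {
                    sleeper.accept(pauseMillis);
                }
            }
        }
        return -1;
    }
}
```

In a real client the trigger would presumably be an ExceptionListener registered through Connection.setExceptionListener, whose onException callback calls reconnect().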
Why may there not be a solution in the client?
Case I: JVM Crash
The only time the client can recover the session is when the client code can catch the exception. However, when the JVM crashes, the whole process is dead. The application does not persist the JMS state, so when it starts up after the crash, the client has no knowledge of the problem. The client therefore cannot recover the session.
Case II: Network Disconnect
A network disconnect behaves similarly. In this case, the client does not know that it has been disconnected from the broker. This is because the client listener is a callback invoked by the broker; the client only waits for messages and never makes a call of its own that could fail. Because of this callback design, when the network disconnects, the client does not even know that the connection is broken.
The only way to keep the connection live is to implement a ping mechanism in which the client pings the JMS broker periodically. That way, if the connection is broken, the client finds out and can recover.
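The client-side bookkeeping for such a ping mechanism could look like the sketch below. It only tracks when the broker was last heard from and reports when that is too long ago; how the ping itself is sent (for example, a small message published to a dedicated heartbeat topic and received back) is an assumption left to the application. HeartbeatMonitor is a hypothetical name, and the clock is injected so the timeout logic can be checked without waiting.

```java
import java.util.function.LongSupplier;

/** Tracks when the broker was last heard from and flags a stale connection. */
final class HeartbeatMonitor {
    private final long timeoutMillis;
    private final LongSupplier clock; // e.g. System::currentTimeMillis in production
    private volatile long lastSeenMillis;

    HeartbeatMonitor(long timeoutMillis, LongSupplier clock) {
        this.timeoutMillis = timeoutMillis;
        this.clock = clock;
        this.lastSeenMillis = clock.getAsLong();
    }

    /** Call on every received message or successful ping round trip. */
    void recordActivity() {
        lastSeenMillis = clock.getAsLong();
    }

    /** True when nothing has been heard for longer than the timeout. */
    boolean isStale() {
        return clock.getAsLong() - lastSeenMillis > timeoutMillis;
    }
}
```

A periodic task (for instance one scheduled on a ScheduledExecutorService) could call isStale() and, when it returns true, tear down the old connection and create a new one. The timeout should be comfortably longer than the ping interval so that a single delayed ping does not trigger a reconnect.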