To reproduce the error, write a simple WinForms application in C# with a list box (or a console app):
1. Create a connection using the URI failover:(tcp://localhost:61616?keepAlive=true) and start it.
2. Create a session and a queue consumer for a queue, e.g. "TestQueue" (in the default AutoAcknowledge mode).
3. Set a message listener on the queue consumer, e.g. OnMessage.
4. In the OnMessage method, do something like Sleep(5000) and then display the received text message (use Invoke to add the message to the list box).
5. Using localhost:8161/admin, create the TestQueue and put about 20 persistent text messages on it.
6. Run the application. A new line should appear on screen every 5 seconds.
7. Restart the ActiveMQ broker (I'm using 220.127.116.11)
8. After the broker restarts, the application stops receiving messages.
9. Restart the broker again, and the application starts receiving messages again.
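A minimal console sketch of steps 1 to 4, assuming the Apache.NMS and Apache.NMS.ActiveMQ assemblies are referenced and a broker is listening on localhost:61616. The class name Repro is only illustrative, and the console output stands in for the list box.

```csharp
using System;
using System.Threading;
using Apache.NMS;
using Apache.NMS.ActiveMQ;

class Repro
{
    static void Main()
    {
        // Step 1: failover connection, started right away.
        IConnectionFactory factory =
            new ConnectionFactory("failover:(tcp://localhost:61616?keepAlive=true)");
        using (IConnection connection = factory.CreateConnection())
        {
            connection.Start();

            // Step 2: session in auto-acknowledge mode and a consumer on TestQueue.
            using (ISession session =
                       connection.CreateSession(AcknowledgementMode.AutoAcknowledge))
            using (IMessageConsumer consumer =
                       session.CreateConsumer(session.GetQueue("TestQueue")))
            {
                // Step 3: asynchronous delivery through a message listener.
                consumer.Listener += OnMessage;

                Console.WriteLine("Restart the broker while messages are arriving. Press Enter to quit.");
                Console.ReadLine();
            }
        }
    }

    // Step 4: slow handler, so a broker restart is likely to happen
    // while the consumer is still inside the listener.
    static void OnMessage(IMessage message)
    {
        Thread.Sleep(5000);
        ITextMessage text = message as ITextMessage;
        Console.WriteLine(text != null ? text.Text : message.ToString());
    }
}
```

With about 20 messages on the queue, this should print one line every 5 seconds until the broker is restarted.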
The problem is in NMS.
Most likely, when you restart the ActiveMQ broker, the client app is inside the OnMessage method (just sleeping there for 5 seconds). When those 5 seconds are over, NMS tries to send the ACK, and that call does not return until the failover thread has successfully reconnected. For that whole time there is a lock on unconsumedMessages.SyncRoot (see the MessageConsumer.Dispatch method). This is painful for another thread, which is trying to call unconsumedMessages.Clear() and needs the locked resource. (That thread is started in Connection.OnTransportInterrupted() and wants to call ClearMessagesInProgress on the MessageConsumer.)
Worse, because it is waiting for the locked resource, MessageConsumer.ClearMessagesInProgress cannot call TransportInterruptionProcessingComplete(), which I guess registers the consumers that have to be recovered when the connection is back.
So when the connection is back:
1. The failover thread runs DoRecover but does not find our consumer.
2. The Dispatch method releases the lock on unconsumedMessages.SyncRoot.
3. The worker thread registers the consumers to be recovered (far too late!).
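The ordering described above can be modelled without NMS at all. The sketch below is not the NMS source: it uses only System.Threading, and every name in it (LockOrderSketch, syncRoot, consumerRegistered and so on) is a hypothetical stand-in for the corresponding piece of MessageConsumer and the failover transport. It also assumes, as guessed above, that the consumer is registered for recovery only after the clear step gets the lock.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

class LockOrderSketch
{
    // Stand-in for the lock taken in MessageConsumer.Dispatch.
    static readonly object syncRoot = new object();
    // Set once the (simulated) failover transport has reconnected.
    static readonly ManualResetEventSlim reconnected = new ManualResetEventSlim(false);
    // Set once the dispatch thread owns syncRoot.
    static readonly ManualResetEventSlim dispatchHoldsLock = new ManualResetEventSlim(false);
    static readonly List<string> events = new List<string>();
    static volatile bool consumerRegistered;

    static void Log(string e) { lock (events) { events.Add(e); } }

    // Dispatch thread: holds the lock while the ack waits for the reconnect.
    static void Dispatch()
    {
        lock (syncRoot)
        {
            dispatchHoldsLock.Set();
            reconnected.Wait();
            Log("dispatch: ack sent, releasing lock");
        }
    }

    // Interruption thread: needs the same lock before it can register the consumer.
    static void ClearMessagesInProgress()
    {
        dispatchHoldsLock.Wait();
        lock (syncRoot) { Log("interrupt: cleared unconsumed messages"); }
        consumerRegistered = true;
        Log("interrupt: consumer registered for recovery");
    }

    // Failover thread: recovers whatever is registered, then unblocks the ack.
    static void FailoverRecover()
    {
        Log(consumerRegistered ? "failover: consumer recovered"
                               : "failover: consumer NOT found");
        reconnected.Set();
    }

    public static List<string> Run()
    {
        Thread dispatch = new Thread(Dispatch);
        Thread interrupt = new Thread(ClearMessagesInProgress);
        dispatch.Start();
        interrupt.Start();
        dispatchHoldsLock.Wait();
        FailoverRecover();
        dispatch.Join();
        interrupt.Join();
        return events;
    }

    static void Main()
    {
        foreach (string e in Run()) Console.WriteLine(e);
    }
}
```

In this model the recovery step always runs before the registration step, because registration sits behind a lock that is released only after the reconnect. That is the same ordering as in the three steps above.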