Details
-
Bug
-
Status: Resolved
-
Major
-
Resolution: Fixed
-
None
-
None
-
None
-
Patch
Description
If a broker is repeatedly killed and restarted when it is federated to another broker, the second broker will run out of channels with a "channel pool is empty" message. The channel being used for the federation link is not being returned to the channel pool, and eventually (after ~32k restarts), the channel pool becomes exhausted.
A reproducer is contained in the attached file "channel_test.sh". If a small change is made to the Link.cpp constructor which shrinks the channel pool to 5 (see the diff below), then this test will show the error without having to wait hours for ~32k restarts. The test does 10 restarts.
diff --git a/src/qpid/broker/Link.cpp b/src/qpid/broker/Link.cpp index 14737e730..790c8ac5e 100644 --- a/src/qpid/broker/Link.cpp +++ b/src/qpid/broker/Link.cpp @@ -149,7 +149,7 @@ Link::Link(const string& _name, currentInterval(1), reconnectNext(0), // Index of next address for reconnecting in url. nextFreeChannel(1), - freeChannels(1, framing::CHANNEL_MAX), + freeChannels(1, 6), connection(0), agent(0), listener(l),
Running the test with the above temporary patch, the following is observed:
Channel allocations from broker 6001: 3377:2020-03-06 13:29:36 [Broker] debug Link qpid.tcp:localhost:5001 allocates channel: 1 3538:2020-03-06 13:29:36 [Broker] debug Link qpid.tcp:localhost:5001 allocates channel: 2 3739:2020-03-06 13:29:39 [Broker] debug Link qpid.tcp:localhost:5001 allocates channel: 3 3934:2020-03-06 13:29:43 [Broker] debug Link qpid.tcp:localhost:5001 allocates channel: 4 4022:2020-03-06 13:29:47 [Broker] debug Link qpid.tcp:localhost:5001 allocates channel: 5 4110:2020-03-06 13:29:51 [System] debug Exception constructed: Link qpid.tcp:localhost:5001 channel pool is empty 4111:2020-03-06 13:29:51 [System] error Link qpid.tcp:localhost:5001 channel pool is empty 4154:2020-03-06 13:29:52 [System] debug Exception constructed: Link qpid.tcp:localhost:5001 channel pool is empty 4155:2020-03-06 13:29:52 [System] error Link qpid.tcp:localhost:5001 channel pool is empty 4253:2020-03-06 13:29:55 [System] debug Exception constructed: Link qpid.tcp:localhost:5001 channel pool is empty 4254:2020-03-06 13:29:55 [System] error Link qpid.tcp:localhost:5001 channel pool is empty 4297:2020-03-06 13:29:56 [System] debug Exception constructed: Link qpid.tcp:localhost:5001 channel pool is empty 4298:2020-03-06 13:29:56 [System] error Link qpid.tcp:localhost:5001 channel pool is empty ... (repeated several times more)
A fix which returns the channels on links that are closing to the channel pool is suggested in attached patch BZ1748054.patch. With this patch applied (together with the temporary pool-shrink patch above), the following is now observed:
Channel allocations from broker 6001: 2020-03-06 12:20:59 [Broker] debug Link qpid.tcp:localhost:5001 allocates channel: 1 2020-03-06 12:20:59 [Broker] debug Link qpid.tcp:localhost:5001 allocates channel: 2 2020-03-06 12:21:02 [Broker] debug Link qpid.tcp:localhost:5001 allocates channel: 3 2020-03-06 12:21:06 [Broker] debug Link qpid.tcp:localhost:5001 allocates channel: 4 2020-03-06 12:21:10 [Broker] debug Link qpid.tcp:localhost:5001 allocates channel: 5 2020-03-06 12:21:14 [Broker] debug Link qpid.tcp:localhost:5001 allocates channel: 1 2020-03-06 12:21:18 [Broker] debug Link qpid.tcp:localhost:5001 allocates channel: 3 2020-03-06 12:21:22 [Broker] debug Link qpid.tcp:localhost:5001 allocates channel: 4 2020-03-06 12:21:26 [Broker] debug Link qpid.tcp:localhost:5001 allocates channel: 5 2020-03-06 12:21:30 [Broker] debug Link qpid.tcp:localhost:5001 allocates channel: 1 2020-03-06 12:21:34 [Broker] debug Link qpid.tcp:localhost:5001 allocates channel: 3 2020-03-06 12:21:38 [Broker] debug Link qpid.tcp:localhost:5001 allocates channel: 4
Channel 2 is used by a bridge which remains open (qpid.bridge_session_amq.fanout). The channels are re-used in a cyclical pattern, which I think is the intention.