Description
LaxConnPool may leak connections when a caller cancels a pending lease request.
Consider a pool of just 1 connection, currently assigned to Thread 2.
Thread 1:
future = lease
... no available connection, so the request is added to pending and an unsatisfied future is returned ...
Thread 2:
release
... put connection back in available list ...
servicePendingRequest
pending.poll() -> Thread1's outstanding LeaseRequest
leaseRequest.isDone() = false
leaseRequest.getDeadline().isExpired() = false
entry = getAvailableEntry()
addLeased(entry)
Thread 1:
future.cancel()
...
future.completed = true
...
return true;
... the future is successfully cancelled. Thread 1 has given up waiting for a connection and moves on without one ...
Thread 2:
leaseRequest.completed(entry)
future.completed(entry)
if (future.completed) return false;
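The heart of the race is the future's completion contract: once cancel() has returned true, any later attempt to complete the future returns false. The standalone sketch below demonstrates this with java.util.concurrent.CompletableFuture (the pool actually uses org.apache.hc.core5.concurrent.BasicFuture, whose completed() behaves the same way in this respect; that equivalence is my reading, not taken from the report above):

```java
import java.util.concurrent.CompletableFuture;

public class CancelRaceDemo {
    public static void main(String[] args) {
        CompletableFuture<String> future = new CompletableFuture<>();

        // Thread 1: gives up waiting and cancels the pending request.
        boolean cancelled = future.cancel(true);

        // Thread 2: later tries to hand the connection to the waiter.
        boolean delivered = future.complete("connection");

        // delivered is false: nobody received the value. If the pool
        // ignores this result and leaves the entry in its leased set,
        // the entry is lost for good.
        System.out.println("cancelled=" + cancelled + " delivered=" + delivered);
    }
}
```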
At this point, the completed call has failed (returned false), meaning the entry was not received by any caller waiting on the future. But the code ignores that failure and proceeds as if it had succeeded, leaving the entry in the leased list, where it will now stay permanently because no caller will ever release it.
The code must test the result of attempting to complete the future and, if that fails, return the entry to the available pool.
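For illustration, here is a minimal, hypothetical sketch of what the corrected servicing logic could look like. This is a toy model, not the real LaxConnPool code; the names available, leased, pending and servicePendingRequest are borrowed from the trace above, and CompletableFuture stands in for HttpCore's BasicFuture:

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.CompletableFuture;

// Toy model of the per-route pool bookkeeping; NOT the real HttpCore code.
class PerRoutePoolSketch<T> {
    private final Deque<T> available = new ArrayDeque<>();
    private final Set<T> leased = new HashSet<>();
    private final Deque<CompletableFuture<T>> pending = new ArrayDeque<>();

    void addAvailable(T entry) {
        available.add(entry);
    }

    void enqueue(CompletableFuture<T> request) {
        pending.add(request);
    }

    int availableCount() {
        return available.size();
    }

    int leasedCount() {
        return leased.size();
    }

    void servicePendingRequest() {
        final CompletableFuture<T> request = pending.poll();
        if (request == null || request.isDone()) {
            return;
        }
        final T entry = available.poll();
        if (entry == null) {
            pending.addFirst(request);
            return;
        }
        leased.add(entry);
        // The crucial check: complete() returns false if the caller
        // cancelled the future between the isDone() check above and here.
        if (!request.complete(entry)) {
            // Undo the lease so the entry is not stranded.
            leased.remove(entry);
            available.add(entry);
        }
    }
}
```

The fix is the if (!request.complete(entry)) branch: instead of assuming the hand-off succeeded, the entry is moved back from leased to available when the waiter has already cancelled.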
The error can be recreated with the following test code:
import org.apache.hc.core5.io.CloseMode;
import org.apache.hc.core5.io.ModalCloseable;
import org.apache.hc.core5.pool.LaxConnPool;
import org.apache.hc.core5.pool.PoolEntry;
import org.apache.hc.core5.util.TimeValue;

import java.util.concurrent.Future;
import java.util.concurrent.Phaser;

public class PoolCancelTest {

    private static class TestConn implements ModalCloseable {

        @Override
        public void close() {
        }

        @Override
        public void close(CloseMode mode) {
        }
    }

    private static LaxConnPool<String, TestConn> pool;
    private static Phaser sync = new Phaser(1);

    private static void dbg(String msg) {
        System.out.println(Thread.currentThread().getName() + ": " + System.currentTimeMillis() + ": " + msg);
    }

    public static void main(String[] args) {
        final int poolSize = Integer.parseInt(args[0]);
        pool = new LaxConnPool<>(poolSize);
        // Create a pool entry purely to prime class loading
        new PoolEntry<>("", TimeValue.NEG_ONE_MILLISECONDS);
        for (int i = 0; i < poolSize * 2; i++) {
            sync.register();
            new Thread(String.format("req-%04d", i)) {
                @Override
                public void run() {
                    dbg("Started");
                    Future<PoolEntry<String, TestConn>> req = pool.lease("somehost", null);
                    PoolEntry<String, TestConn> entry = null;
                    if (req.isDone()) {
                        try {
                            entry = req.get();
                            if (!entry.hasConnection()) {
                                entry.assignConnection(new TestConn());
                            }
                        } catch (Exception shouldNotHappen) {
                            dbg("Should not happen!");
                            shouldNotHappen.printStackTrace();
                        }
                    }
                    sync.arriveAndAwaitAdvance();
                    if (entry != null) {
                        pool.release(entry, true);
                        dbg("Released pool entry " + System.identityHashCode(entry));
                    } else {
                        if (req.cancel(true)) {
                            req = pool.lease("somehost", null);
                        }
                    }
                    try {
                        dbg("Assigned pool entry " + System.identityHashCode(req.get()));
                    } catch (Exception fail) {
                        dbg("Getting pool entry failed");
                        fail.printStackTrace();
                    }
                }
            }.start();
        }
        sync.arriveAndDeregister();
    }
}
It generally requires a significant number of threads to make the right timing sequence occur reliably, but I find that on my machine, running with a command-line argument of 300 causes the code to lock up on 50+% of runs (a connection is leaked from the pool, so a thread ends up waiting forever).