Details
-
Bug
-
Status: Closed
-
Major
-
Resolution: Fixed
-
1.0.0-incubating.M3
-
None
Description
The following test will more than likely cause an OOME due to a timing issue in stopping the gateway sender.
public void closingSenderWhileBatchOperationsAreProcessingShouldNotHaveMultipleThreadsReadFromSameStream() throws Exception { Integer lnPort = (Integer)vm0.invoke(() -> WANTestBase.createFirstLocatorWithDSId( 1 )); Integer nyPort = (Integer)vm1.invoke(() -> WANTestBase.createFirstRemoteLocator( 2, lnPort )); createCacheInVMs(nyPort, vm2); createReceiverInVMs(vm2); createCacheInVMs(lnPort, vm4); //keep the maxQueueMemory low enough to trigger eviction vm4.invoke(() -> WANTestBase.createConcurrentSender( "ln", 2, false, 100, 101, false, false, null, true, 3, OrderPolicy.KEY )); vm2.invoke(() -> WANTestBase.createPartitionedRegion( getTestMethodName() + "_RR", null, 0, 10, isOffHeap() )); // vm2.invoke(() -> WANTestBase.createPartitionedRegion( // getTestMethodName() + "_RR", null, 0, 10, isOffHeap() )); startSenderInVMs("ln", vm4); vm2.invoke(() -> addListenerToSleepAfterCreateEvent(10, getTestMethodName() + "_RR")); // vm4.invoke(() -> WANTestBase.createPartitionedRegion( // getTestMethodName() + "_RR", null, 0, 10, isOffHeap() )); vm4.invoke(() -> WANTestBase.createReplicatedRegion( getTestMethodName() + "_RR", "ln", isOffHeap() )); vm4.invoke(() -> addListenerToSleepAfterCreateEvent(1, getTestMethodName() + "_RR")); vm4.invokeAsync(() -> WANTestBase.doPutsAfter300( getTestMethodName() + "_RR", 1000000 )); Thread.sleep(5000); stopSenderInVMsAsync("ln", vm4); Thread.sleep(10000); for (int i = 0; i < 100; i++) { startSenderInVMs("ln", vm4); Thread.sleep(10000); stopSenderInVMs("ln", vm4); Thread.sleep(5000); } // // stopSenderInVMsAsync("ln", vm4); // Thread.sleep(1000); // startSenderInVMs("ln", vm4); // Thread.sleep(1000); vm2.invoke(() -> WANTestBase.validateRegionSize( getTestMethodName() + "_RR", 10000, 240000)); }
Due to the way this test is written, I wouldn't necessarily want it checked in as it is very time based and possibly flakey. It will run into the OOME eventually but it's all based on timing.
The issue is that the ack reader thread is reading off the same socket as the gateway sender closing thread, which causes the stream to be corrupted.