Geode / GEODE-1588

Starting and stopping wan sender can cause OOME

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.0.0-incubating.M3
    • Fix Version/s: 1.0.0-incubating.M3
    • Component/s: wan
    • Labels: None

    Description

      The following test will very likely cause an OutOfMemoryError (OOME) because of a timing issue when stopping the gateway sender.

      public void closingSenderWhileBatchOperationsAreProcessingShouldNotHaveMultipleThreadsReadFromSameStream() throws Exception {

          // Create the locators for the two sites (ln and ny).
          Integer lnPort = (Integer)vm0.invoke(() -> WANTestBase.createFirstLocatorWithDSId( 1 ));
          Integer nyPort = (Integer)vm1.invoke(() -> WANTestBase.createFirstRemoteLocator( 2, lnPort ));

          // Receiving side: cache and gateway receiver in vm2.
          createCacheInVMs(nyPort, vm2);
          createReceiverInVMs(vm2);

          // Sending side: cache in vm4.
          createCacheInVMs(lnPort, vm4);

          // Keep the maxQueueMemory low enough to trigger eviction.
          vm4.invoke(() -> WANTestBase.createConcurrentSender( "ln", 2,
            false, 100, 101, false, false, null, true, 3, OrderPolicy.KEY ));

          vm2.invoke(() -> WANTestBase.createPartitionedRegion(
            getTestMethodName() + "_RR", null, 0, 10, isOffHeap() ));

          startSenderInVMs("ln", vm4);
          // Add a listener on the receiving side that sleeps after each create event.
          vm2.invoke(() -> addListenerToSleepAfterCreateEvent(10, getTestMethodName() + "_RR"));

          vm4.invoke(() -> WANTestBase.createReplicatedRegion(
            getTestMethodName() + "_RR", "ln", isOffHeap() ));
          // Add the same kind of listener on the sending side.
          vm4.invoke(() -> addListenerToSleepAfterCreateEvent(1, getTestMethodName() + "_RR"));

          // Start the puts in the background so the sender can be stopped while they run.
          vm4.invokeAsync(() -> WANTestBase.doPutsAfter300(
            getTestMethodName() + "_RR", 1000000 ));

          // Stop the sender asynchronously while the puts are still in progress.
          Thread.sleep(5000);
          stopSenderInVMsAsync("ln", vm4);

          // Repeatedly restart and stop the sender to provoke the race.
          Thread.sleep(10000);
          for (int i = 0; i < 100; i++) {
            startSenderInVMs("ln", vm4);
            Thread.sleep(10000);
            stopSenderInVMs("ln", vm4);
            Thread.sleep(5000);
          }

          // Validate the region size on the receiving side.
          vm2.invoke(() -> WANTestBase.validateRegionSize(
            getTestMethodName() + "_RR", 10000, 240000));
      }

      Because of the way this test is written, I wouldn't necessarily want it checked in: it is heavily timing-dependent and possibly flaky. It will eventually run into the OOME, but when that happens depends entirely on timing.
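
      As a general illustration of how a test can lean less on fixed Thread.sleep calls, here is a minimal sketch of a bounded polling helper. The class and method names (WaitUntil, waitUntil) are hypothetical and not part of WANTestBase, and this report does not say which condition would reliably expose the race, so the condition passed in is left to the caller.

```java
import java.util.concurrent.TimeUnit;
import java.util.function.BooleanSupplier;

// Hypothetical helper, not part of Geode or WANTestBase.
public class WaitUntil {

    /**
     * Polls the condition until it returns true or the timeout elapses.
     * Returns true if the condition was met, false on timeout or interrupt.
     */
    public static boolean waitUntil(BooleanSupplier condition, long timeoutMillis, long pollMillis) {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        while (true) {
            if (condition.getAsBoolean()) {
                return true;
            }
            if (System.nanoTime() - deadline >= 0) {
                return false; // timed out without the condition becoming true
            }
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt status
                return false;
            }
        }
    }
}
```

      A caller would replace a fixed sleep with something like waitUntil(() -> someObservableState(), 30000, 100), where someObservableState is whatever check the test can make. The wait is bounded, so a test that never reaches the state fails quickly instead of hanging.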

      The root cause is that the ack reader thread and the thread closing the gateway sender read from the same socket at the same time, which corrupts the stream.
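
      To illustrate how interleaved reads on one stream might lead to an OOME, the sketch below assumes a hypothetical framing of a 4-byte length followed by the payload. This report does not describe Geode's actual wire format, and the class and method names are made up for the example. A second reader is simulated by skipping bytes; the first reader then interprets payload bytes as a length.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;

// Hypothetical demo, not Geode code.
public class SharedStreamDemo {

    // Assumed framing: 4-byte big-endian length, then the payload bytes.
    static byte[] frame(byte[] payload) {
        return ByteBuffer.allocate(4 + payload.length)
                .putInt(payload.length)
                .put(payload)
                .array();
    }

    /**
     * Builds a stream of two frames, lets a simulated "other thread" consume
     * stolenBytes from it, and returns the length field the reader sees next.
     */
    public static int lengthSeenAfterInterleavedRead(int stolenBytes) {
        byte[] first = frame(new byte[] {0x7f, 0x7f, 0x7f, 0x7f, 0x7f, 0x7f});
        byte[] second = frame(new byte[] {1, 2, 3});
        byte[] wire = ByteBuffer.allocate(first.length + second.length)
                .put(first)
                .put(second)
                .array();
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(wire))) {
            in.skipBytes(stolenBytes); // bytes taken by the other reader
            return in.readInt();       // what this reader believes is a length
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(lengthSeenAfterInterleavedRead(0)); // prints 6
        System.out.println(lengthSeenAfterInterleavedRead(4)); // prints 2139062143
    }
}
```

      A reader that allocates a buffer of the length it just read would request roughly 2 GB in the misaligned case. Making sure that only one thread ever reads a given frame, for example by stopping the ack reader before the closing thread touches the socket, would avoid the misalignment. Whether that matches the actual fix is not stated in this report.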

          People

            Assignee: Jason Huynh (jasonhuynh)
            Reporter: Jason Huynh (jasonhuynh)
            Votes: 0
            Watchers: 2