Ignite / IGNITE-15146

Checking the snapshot creates a large number of unused threads that do not terminate.

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: 2.11
    • Fix Version/s: 2.11
    • None

    Description

      Each new run of snapshot verification creates dozens of new threads that do not terminate after the procedure is complete. Over time, this can lead to an OutOfMemoryError and node failure.

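          @Test
          public void testClusterSnapshotCheckMultipleTimes() throws Exception {
              IgniteEx ignite = startGridsWithCache(3, dfltCacheCfg, CACHE_KEYS_RANGE);

              startClientGrid();

              ignite.snapshot().createSnapshot(SNAPSHOT_NAME).get();

              // Baseline is taken after the snapshot exists, so only threads created by the checks are counted.
              // Note: Thread.activeCount() is an estimate for the current thread group.
              int activeThreadsCntBefore = Thread.activeCount();

              int iterations = 10;

              // Run the check repeatedly; no run should leave threads behind.
              for (int i = 0; i < iterations; i++)
                  snp(ignite).checkSnapshot(SNAPSHOT_NAME).get();

              int createdThreads = Thread.activeCount() - activeThreadsCntBefore;

              // Fails if, on average, at least one thread per check is still alive.
              assertTrue("Threads created: " + createdThreads, createdThreads < iterations);
          }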
      

      The reproducer shows that 10 snapshot checks add approximately 250 new threads.
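
To see which threads account for the growth, one option is to group live threads by name prefix before and after the checks. The sketch below is a generic JVM diagnostic, not part of Ignite or of the reproducer; the class name `ThreadCensus` and its methods are hypothetical, and the prefix rule assumes thread names end in a numeric suffix such as `-#2208`.

```java
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical helper: groups live JVM threads by name prefix. */
public class ThreadCensus {
    /** Strips a trailing numeric suffix such as "-#2208" or "-12" (assumed naming scheme). */
    static String prefixOf(String threadName) {
        return threadName.replaceAll("-#?\\d+$", "");
    }

    /** Counts live threads per name prefix, sorted by prefix. */
    static Map<String, Integer> countByPrefix() {
        Map<String, Integer> counts = new TreeMap<>();

        // getAllStackTraces() returns an entry for every live thread in the JVM.
        for (Thread t : Thread.getAllStackTraces().keySet())
            counts.merge(prefixOf(t.getName()), 1, Integer::sum);

        return counts;
    }

    public static void main(String[] args) {
        countByPrefix().forEach((prefix, cnt) -> System.out.println(cnt + "\t" + prefix));
    }
}
```

Comparing the output before and after a series of checks should show whether a single prefix is responsible for most of the new threads.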

      The dump of a "leaked" thread looks like this:

      "binary-metadata-writer-#2208" #2249 prio=5 os_prio=0 tid=0x00007f9974087000 nid=0x65b38 waiting on condition [0x00007f986cf9c000]
         java.lang.Thread.State: WAITING (parking)
      	at sun.misc.Unsafe.park(Native Method)
      	- parking to wait for  <merged>(a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
      	at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
      	at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
      	at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
      	at org.apache.ignite.internal.processors.cache.binary.BinaryMetadataFileStore$BinaryMetadataAsyncWriter.body0(BinaryMetadataFileStore.java:460)
      	at org.apache.ignite.internal.processors.cache.binary.BinaryMetadataFileStore$BinaryMetadataAsyncWriter.body(BinaryMetadataFileStore.java:441)
      	at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:120)
      	at java.lang.Thread.run(Thread.java:748)
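
The dump shows a worker parked in `LinkedBlockingQueue.take()`. A thread in that state stays alive indefinitely unless something hands it work that ends its loop or interrupts it. The following minimal sketch illustrates the general pattern with plain JDK classes; it is not Ignite's `BinaryMetadataAsyncWriter` and not the fix applied for this issue, and the class and thread names are hypothetical.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

/** Hypothetical demo: a queue-draining worker that only exits when interrupted. */
public class BlockedWorkerDemo {
    /** Starts a worker that takes tasks from a private queue in an endless loop. */
    static Thread startWorker(String name) {
        BlockingQueue<Runnable> queue = new LinkedBlockingQueue<>();

        Thread worker = new Thread(() -> {
            try {
                while (true)
                    queue.take().run(); // Parks here while the queue is empty.
            }
            catch (InterruptedException e) {
                // Interruption is the only exit path in this sketch.
            }
        }, name);

        worker.start();

        return worker;
    }

    public static void main(String[] args) throws InterruptedException {
        Thread worker = startWorker("demo-writer-#1");

        Thread.sleep(200); // Give the worker time to park on the empty queue.
        System.out.println("alive before interrupt: " + worker.isAlive());

        worker.interrupt(); // Without this call the thread would never terminate.
        worker.join(1000);
        System.out.println("alive after interrupt: " + worker.isAlive());
    }
}
```

If such a worker is started on every run of a procedure and never stopped, each run leaves one more parked thread behind, which matches the symptom described above.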
      

            People

              mmuzaf Maxim Muzafarov
              xtern Pavel Pereslegin
              Votes: 0
              Watchers: 4

                Time Tracking

                  Estimated: Not Specified
                  Remaining: 0h
                  Logged: 0.5h