  1. Ignite
  2. IGNITE-1977

IgniteSemaphore failover-related tests deadlock or fail

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.8, 1.9
    • Fix Version/s: 2.0
    • Component/s: data structures
    • Labels: None

    Description

      All IgniteSemaphore-related tests in GridCacheAbstractDataStructuresFailoverSelfTest may deadlock, which hangs the whole suite.

      The threads are waiting for the following condition:

      "topology-change-thread-3" prio=6 tid=0x000000001d98d800 nid=0x2b20 waiting on condition [0x000000002066f000]
         java.lang.Thread.State: WAITING (parking)
      	at sun.misc.Unsafe.park(Native Method)
      	- parking to wait for  <0x0000000798149948> (a org.apache.ignite.internal.processors.datastructures.GridCacheSemaphoreImpl$Sync)
      	at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
      	at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:834)
      	at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:994)
      	at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1303)
      	at org.apache.ignite.internal.processors.datastructures.GridCacheSemaphoreImpl.acquire(GridCacheSemaphoreImpl.java:538)
      	at org.apache.ignite.internal.processors.datastructures.GridCacheSemaphoreImpl.acquire(GridCacheSemaphoreImpl.java:525)
      	at org.apache.ignite.internal.processors.cache.datastructures.GridCacheAbstractDataStructuresFailoverSelfTest$7.apply(GridCacheAbstractDataStructuresFailoverSelfTest.java:571)
      	at org.apache.ignite.internal.util.lang.GridAbsClosure.run(GridAbsClosure.java:50)
      	at org.apache.ignite.testframework.GridTestUtils$7.call(GridTestUtils.java:967)
      	at org.apache.ignite.testframework.GridTestThread.run(GridTestThread.java:86)
      

      Probably the semaphore is not properly released when a node leaves the topology abruptly.
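
      The suspected failure mode can be illustrated locally. The sketch below is not Ignite code: it uses java.util.concurrent.Semaphore as a stand-in, and the class and method names (LeakedPermitSketch, acquireAfterHolderExit) are hypothetical. A holder thread that exits without releasing its permit plays the role of a node that leaves abruptly; a bounded tryAcquire is used so that the waiter reports failure instead of parking forever as in the thread dump above.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

public class LeakedPermitSketch {
    /**
     * Lets a holder thread take the only permit and exit, then tries to
     * acquire the permit from the calling thread within timeoutMs.
     * Returns true if the permit was obtained.
     */
    static boolean acquireAfterHolderExit(boolean holderReleases, long timeoutMs) {
        Semaphore sem = new Semaphore(1);

        Thread holder = new Thread(() -> {
            sem.acquireUninterruptibly();
            if (holderReleases) {
                sem.release();
            }
            // Otherwise the thread exits while still holding the permit,
            // standing in for a node that leaves without releasing.
        });

        try {
            holder.start();
            holder.join();
            // A plain acquire() would park indefinitely if the permit leaked;
            // the bounded variant turns the hang into a detectable failure.
            return sem.tryAcquire(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(acquireAfterHolderExit(true, 200));  // true
        System.out.println(acquireAfterHolderExit(false, 200)); // false
    }
}
```

      With a leaked permit the second call times out and returns false, whereas an unbounded acquire would block in the same way as the parked test thread.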

      In addition, the tests should be rewritten to follow the pattern used by the other data structure and atomics tests in this suite: ConstantTopologyChangeWorker and its descendants.

            People

              Assignee: Vladisav Jelisavcic (vladisav)
              Reporter: Denis A. Magda (dmagda)

                Time Tracking

                  Original Estimate: Not Specified
                  Remaining Estimate: 0h
                  Time Spent: 20m