IGNITE-2146: Service deployment using a cache-accessing predicate in node filter sometimes causes a deadlock


Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: ignite-1.4, 1.5.0.final
    • Fix Version/s: 1.5.0.final
    • Component/s: None
    • Labels: None
    • Environment: Mac OS X 10.11
      java version "1.8.0_66"
      Java(TM) SE Runtime Environment (build 1.8.0_66-b17)
      Java HotSpot(TM) 64-Bit Server VM (build 25.66-b17, mixed mode)
      Ignite 1.4.0 + Ignite-1.5.0-b1

    Description

      We deploy a service on a cluster group that is filtered by a predicate. The predicate reads a cache key to decide whether to return true for a given node.

      If a new node joins the cluster while the predicate is running, a deadlock occurs: cache operations do not return and the new node never finishes joining the cluster.
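
      The report does not state the root cause, so the following is only a plain-Java sketch of one common way such a hang can arise: a callback blocks on work that can only run on the thread the callback itself occupies. It does not use Ignite at all; the class name SelfWaitDemo, the method predicateCompletes, and the single-thread pool standing in for an internal worker are all hypothetical.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical illustration only; this is not Ignite code.
final class SelfWaitDemo {
    /**
     * Runs a "predicate" task on a pool of the given size. The predicate submits
     * a second task (standing in for a cache lookup) to the same pool and blocks
     * on its result. Returns true if the predicate finished within the timeout.
     */
    static boolean predicateCompletes(int poolSize, long timeoutMs) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        try {
            Future<Boolean> predicate = pool.submit(() -> {
                Future<Boolean> lookup = pool.submit(() -> true);
                // Blocks until some pool thread runs the lookup.
                return lookup.get();
            });
            try {
                return predicate.get(timeoutMs, TimeUnit.MILLISECONDS);
            } catch (TimeoutException e) {
                // The predicate is still waiting: treat this as a hang.
                return false;
            }
        } finally {
            // Interrupt any stuck worker so the JVM can exit.
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("pool of 1: completes = " + predicateCompletes(1, 500));
        System.out.println("pool of 2: completes = " + predicateCompletes(2, 500));
    }
}
```

      With a pool of one thread the predicate waits on a lookup that is queued behind itself, so it never finishes; with two threads the lookup runs and the predicate returns. Whether the Ignite hang has this shape would need to be confirmed from the attached stack trace.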

      I've created a sample program to reproduce this on 1.4.0 and also tried to reproduce it on 1.5.0-b1.
      On 1.4.0 the deadlock usually reproduces; on 1.5.0-b1 it may take a few tries. I suspect a timing issue, so I tried to improve the odds of reproduction by using a countdown latch.

      Code:

      Main.java
      import org.apache.ignite.Ignite;
      import org.apache.ignite.IgniteCache;
      import org.apache.ignite.Ignition;
      import org.apache.ignite.cache.CacheAtomicWriteOrderMode;
      import org.apache.ignite.cache.CacheAtomicityMode;
      import org.apache.ignite.cache.CacheMode;
      import org.apache.ignite.cache.CacheWriteSynchronizationMode;
      import org.apache.ignite.cluster.ClusterGroup;
      import org.apache.ignite.configuration.CacheConfiguration;
      import org.apache.ignite.configuration.IgniteConfiguration;
      import org.apache.ignite.services.Service;
      import org.apache.ignite.services.ServiceContext;
      
      import java.util.concurrent.CountDownLatch;
      
      /**
       * Created by noliran on 08/12/2015.
       */
      public class Main {
          public static IgniteCache<String, Object> cache1;
          public static CountDownLatch latch = new CountDownLatch(1);
      
          public static CacheConfiguration<String, Object> CACHE_CONFIG = new CacheConfiguration<String, Object>()
                  .setName("testCache")
                  .setAtomicityMode(CacheAtomicityMode.ATOMIC)
                  .setCacheMode(CacheMode.REPLICATED)
                  .setWriteSynchronizationMode(CacheWriteSynchronizationMode.FULL_SYNC)
                  .setAtomicWriteOrderMode(CacheAtomicWriteOrderMode.PRIMARY);
      
          public static void main(String[] args) throws InterruptedException {
              IgniteConfiguration igniteConfiguration = new IgniteConfiguration();
              Ignite ignite1 = Ignition.start(igniteConfiguration.setGridName("grid1"));
      
              System.out.println("Creating cache");
              cache1 = ignite1.getOrCreateCache(CACHE_CONFIG);
      
              // Node filter that reads the cache; the sleep widens the window in which a second node can join.
              ClusterGroup group = ignite1.cluster().forPredicate(node -> {
                  System.out.println("predicate: starting");
                  latch.countDown();
      
                  try {
                      System.out.println("predicate: before sleep");
                      Thread.sleep(10_000);
                  } catch (InterruptedException e) {
                      e.printStackTrace();
                  }
      
                  System.out.println("predicate: before containsKey");
                  boolean b1 = cache1.containsKey(node.id().toString());
      
                  System.out.println("predicate: returning");
                  return b1;
              });
      
              System.out.println("Deploying service with cache-based predicate");
              new Thread(() -> {
                  ignite1.services(group).deployNodeSingleton("testService", new TestService());
              }).start();
              System.out.println("Service in deployment.");
      
              // Wait until the predicate has started running before starting the second node.
              latch.await();
      
              System.out.println("Starting second Ignite instance..");
              Ignite ignite2 = Ignition.start(igniteConfiguration.setGridName("grid2"));
              System.out.println("Second Ignite instance started successfully!"); // This isn't going to be printed.
          }
      
          public static class TestService implements Service
          {
              public void execute(ServiceContext ctx) throws Exception { System.out.println("execute()"); }
              public void init(ServiceContext ctx) throws Exception { System.out.println("init()"); }
              public void cancel(ServiceContext ctx) { System.out.println("cancel()"); }
          }
      }
      
      

      Output:

      [16:27:01]    __________  ________________ 
      [16:27:01]   /  _/ ___/ |/ /  _/_  __/ __/ 
      [16:27:01]  _/ // (7 7    // /  / / / _/   
      [16:27:01] /___/\___/_/|_/___/ /_/ /___/  
      [16:27:01] 
      [16:27:01] ver. 1.5.0-b1#20151201-sha1:062d440c
      [16:27:01] 2015 Copyright(C) Apache Software Foundation
      [16:27:01] 
      [16:27:01] Ignite documentation: http://ignite.apache.org
      [16:27:01] 
      [16:27:01] Quiet mode.
      [16:27:01]   ^-- To see **FULL** console log here add -DIGNITE_QUIET=false or "-v" to ignite.{sh|bat}
      [16:27:01] 
      [16:27:01] OS: Mac OS X 10.11 x86_64
      [16:27:01] VM information: Java(TM) SE Runtime Environment 1.8.0_66-b17 Oracle Corporation Java HotSpot(TM) 64-Bit Server VM 25.66-b17
      [16:27:01] Initial heap size is 256MB (should be no less than 512MB, use -Xms512m -Xmx512m).
      [16:27:01] Configured plugins:
      [16:27:01]   ^-- None
      [16:27:01] 
      [16:27:01] Security status [authentication=off, tls/ssl=off]
      [16:27:03] To start Console Management & Monitoring run ignitevisorcmd.{sh|bat}
      [16:27:03] 
      [16:27:03] Ignite node started OK (id=4821080c, grid=grid1)
      [16:27:03] Topology snapshot [ver=1, servers=1, clients=0, CPUs=4, heap=3.6GB]
      Creating cache
      Deploying service with cache-based predicate
      Service in deployment.
      predicate: starting
      predicate: before sleep
      Starting second Ignite instance..
      [16:27:03]    __________  ________________ 
      [16:27:03]   /  _/ ___/ |/ /  _/_  __/ __/ 
      [16:27:03]  _/ // (7 7    // /  / / / _/   
      [16:27:03] /___/\___/_/|_/___/ /_/ /___/  
      [16:27:03] 
      [16:27:03] ver. 1.5.0-b1#20151201-sha1:062d440c
      [16:27:03] 2015 Copyright(C) Apache Software Foundation
      [16:27:03] 
      [16:27:03] Ignite documentation: http://ignite.apache.org
      [16:27:03] 
      [16:27:03] Quiet mode.
      [16:27:03]   ^-- To see **FULL** console log here add -DIGNITE_QUIET=false or "-v" to ignite.{sh|bat}
      [16:27:03] 
      [16:27:03] OS: Mac OS X 10.11 x86_64
      [16:27:03] VM information: Java(TM) SE Runtime Environment 1.8.0_66-b17 Oracle Corporation Java HotSpot(TM) 64-Bit Server VM 25.66-b17
      [16:27:03] Initial heap size is 256MB (should be no less than 512MB, use -Xms512m -Xmx512m).
      [16:27:03] Configured plugins:
      [16:27:03]   ^-- None
      [16:27:03] 
      [16:27:03] Security status [authentication=off, tls/ssl=off]
      [16:27:04] Topology snapshot [ver=2, servers=2, clients=0, CPUs=4, heap=3.6GB]
      predicate: before containsKey
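
      The log above stops at "predicate: before containsKey", so the useful evidence is a thread dump taken while the process is hung (as in the attached stacktrace.txt). A minimal sketch of capturing one from inside the JVM with the standard java.lang.management API follows; the class name HangDiagnostics and its method names are hypothetical, and running jstack against the process is an equivalent external option.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

// Hypothetical helper; uses only standard JDK management APIs.
final class HangDiagnostics {
    /** Returns the name, state and stack frames of every live thread as text. */
    static String dumpAllThreads() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        // Only request lock details the JVM says it can provide.
        ThreadInfo[] infos = mx.dumpAllThreads(
                mx.isObjectMonitorUsageSupported(),
                mx.isSynchronizerUsageSupported());
        StringBuilder sb = new StringBuilder();
        for (ThreadInfo info : infos) {
            sb.append('"').append(info.getThreadName())
              .append("\" state=").append(info.getThreadState()).append('\n');
            for (StackTraceElement frame : info.getStackTrace()) {
                sb.append("    at ").append(frame).append('\n');
            }
        }
        return sb.toString();
    }

    /** Number of threads the JVM reports as part of a lock cycle; 0 if none. */
    static int deadlockedThreadCount() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        long[] ids = mx.isSynchronizerUsageSupported()
                ? mx.findDeadlockedThreads()
                : mx.findMonitorDeadlockedThreads();
        return ids == null ? 0 : ids.length;
    }

    public static void main(String[] args) {
        System.out.print(dumpAllThreads());
        System.out.println("threads in a lock cycle: " + deadlockedThreadCount());
    }
}
```

      Note that the JVM's deadlock detection only reports cycles over monitors and ownable synchronizers. A thread parked waiting on a future that will never complete is not flagged, so the full dump is the part to read in that case.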
      

    Attachments

      1. stacktrace.txt, 117 kB, Noam Liran
    People

      Assignee: Unassigned
      Reporter: Noam Liran (noliran)
      Votes: 0
      Watchers: 4
