Details
Type: Bug
Status: Closed
Priority: Major
Resolution: Fixed
ignite-1.4, 1.5.0.final
Environment:
Mac OS X 10.11
java version "1.8.0_66"
Java(TM) SE Runtime Environment (build 1.8.0_66-b17)
Java HotSpot(TM) 64-Bit Server VM (build 25.66-b17, mixed mode)
Ignite 1.4.0 + Ignite-1.5.0-b1
Description
We're deploying a service on a cluster group. The cluster group is filtered using a predicate. The predicate accesses a cache key to determine if it should return true for a node or not.
If a new node joins the cluster while the predicate is running, a deadlock occurs: cache operations do not return and the new node never finishes joining the cluster.
I've created a sample program to reproduce this on 1.4.0 and also tried to reproduce it on 1.5.0-b1.
On 1.4.0 the deadlock usually reproduces; on 1.5.0-b1 it may take a few tries. I'm guessing it's a timing issue, so I tried to improve the odds of reproducing it by using a countdown latch.
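As a possible way to sidestep the hang while the issue is open, the predicate can be built from a point-in-time copy of the data it needs, so that evaluating it never calls into the cache. The sketch below is plain Java with no Ignite dependency; the class name `SnapshotPredicate` and the method `fromSnapshot` are hypothetical names chosen for illustration, and the approach is an assumption about what may work, not a confirmed fix.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;
import java.util.UUID;
import java.util.function.Predicate;

// Hypothetical helper, not part of Ignite: builds a node filter that
// reads only an immutable in-memory set instead of querying a cache.
public class SnapshotPredicate {

    // Copies the eligible node IDs once; the returned predicate never
    // performs a blocking cache or network call when it is evaluated.
    public static Predicate<UUID> fromSnapshot(Set<String> eligibleIds) {
        Set<String> snapshot = Collections.unmodifiableSet(new HashSet<>(eligibleIds));
        return nodeId -> snapshot.contains(nodeId.toString());
    }

    public static void main(String[] args) {
        UUID a = UUID.randomUUID();
        UUID b = UUID.randomUUID();

        // Stand-in for the keys that would be read from the cache up front.
        Set<String> keys = new HashSet<>();
        keys.add(a.toString());

        Predicate<UUID> filter = fromSnapshot(keys);

        // Changes made after the snapshot are not visible to the filter.
        keys.add(b.toString());

        System.out.println("a eligible: " + filter.test(a));
        System.out.println("b eligible: " + filter.test(b));
    }
}
```

With Ignite, such a filter could presumably be wrapped as `ignite1.cluster().forPredicate(node -> filter.test(node.id()))`. The trade-off is staleness: keys added to the cache after the snapshot is taken are not seen, so this only fits cases where the eligible set is known before deployment.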
Code:
import org.apache.ignite.Ignite;
import org.apache.ignite.IgniteCache;
import org.apache.ignite.Ignition;
import org.apache.ignite.cache.CacheAtomicWriteOrderMode;
import org.apache.ignite.cache.CacheAtomicityMode;
import org.apache.ignite.cache.CacheMode;
import org.apache.ignite.cache.CacheWriteSynchronizationMode;
import org.apache.ignite.cluster.ClusterGroup;
import org.apache.ignite.configuration.CacheConfiguration;
import org.apache.ignite.configuration.IgniteConfiguration;
import org.apache.ignite.services.Service;
import org.apache.ignite.services.ServiceContext;

import java.util.concurrent.CountDownLatch;

/**
 * Created by noliran on 08/12/2015.
 */
public class Main {
    public static IgniteCache<String, Object> cache1;

    public static CountDownLatch latch = new CountDownLatch(1);

    public static CacheConfiguration<String, Object> CACHE_CONFIG = new CacheConfiguration<String, Object>()
        .setName("testCache")
        .setAtomicityMode(CacheAtomicityMode.ATOMIC)
        .setCacheMode(CacheMode.REPLICATED)
        .setWriteSynchronizationMode(CacheWriteSynchronizationMode.FULL_SYNC)
        .setAtomicWriteOrderMode(CacheAtomicWriteOrderMode.PRIMARY);

    public static void main(String[] args) throws InterruptedException {
        IgniteConfiguration igniteConfiguration = new IgniteConfiguration();
        Ignite ignite1 = Ignition.start(igniteConfiguration.setGridName("grid1"));

        System.out.println("Creating cache");
        cache1 = ignite1.getOrCreateCache(CACHE_CONFIG);

        ClusterGroup group = ignite1.cluster().forPredicate(node -> {
            System.out.println("predicate: starting");
            latch.countDown();
            try {
                System.out.println("predicate: before sleep");
                Thread.sleep(10_000);
            } catch (InterruptedException e) {
                e.printStackTrace();
            }
            System.out.println("predicate: before containsKey");
            boolean b1 = cache1.containsKey(node.id().toString());
            System.out.println("predicate: returning");
            return b1;
        });

        System.out.println("Deploying service with cache-based predicate");
        new Thread(() -> {
            ignite1.services(group).deployNodeSingleton("testService", new TestService());
        }).start();
        System.out.println("Service in deployment.");

        latch.await();

        System.out.println("Starting second Ignite instance..");
        Ignite ignite2 = Ignition.start(igniteConfiguration.setGridName("grid2"));
        System.out.println("Second Ignite instance started successfully!"); // This isn't going to be printed.
    }

    public static class TestService implements Service {
        public void execute(ServiceContext ctx) throws Exception {
            System.out.println("execute()");
        }

        public void init(ServiceContext ctx) throws Exception {
            System.out.println("init()");
        }

        public void cancel(ServiceContext ctx) {
            System.out.println("cancel()");
        }
    }
}
Output:
[16:27:01]    __________  ________________
[16:27:01]   /  _/ ___/ |/ /  _/_  __/ __/
[16:27:01]  _/ // (7 7    // /  / / / _/
[16:27:01] /___/\___/_/|_/___/ /_/ /___/
[16:27:01]
[16:27:01] ver. 1.5.0-b1#20151201-sha1:062d440c
[16:27:01] 2015 Copyright(C) Apache Software Foundation
[16:27:01]
[16:27:01] Ignite documentation: http://ignite.apache.org
[16:27:01]
[16:27:01] Quiet mode.
[16:27:01]   ^-- To see **FULL** console log here add -DIGNITE_QUIET=false or "-v" to ignite.{sh|bat}
[16:27:01]
[16:27:01] OS: Mac OS X 10.11 x86_64
[16:27:01] VM information: Java(TM) SE Runtime Environment 1.8.0_66-b17 Oracle Corporation Java HotSpot(TM) 64-Bit Server VM 25.66-b17
[16:27:01] Initial heap size is 256MB (should be no less than 512MB, use -Xms512m -Xmx512m).
[16:27:01] Configured plugins:
[16:27:01]   ^-- None
[16:27:01]
[16:27:01] Security status [authentication=off, tls/ssl=off]
[16:27:03] To start Console Management & Monitoring run ignitevisorcmd.{sh|bat}
[16:27:03]
[16:27:03] Ignite node started OK (id=4821080c, grid=grid1)
[16:27:03] Topology snapshot [ver=1, servers=1, clients=0, CPUs=4, heap=3.6GB]
Creating cache
Deploying service with cache-based predicate
Service in deployment.
predicate: starting
predicate: before sleep
Starting second Ignite instance..
[16:27:03]    __________  ________________
[16:27:03]   /  _/ ___/ |/ /  _/_  __/ __/
[16:27:03]  _/ // (7 7    // /  / / / _/
[16:27:03] /___/\___/_/|_/___/ /_/ /___/
[16:27:03]
[16:27:03] ver. 1.5.0-b1#20151201-sha1:062d440c
[16:27:03] 2015 Copyright(C) Apache Software Foundation
[16:27:03]
[16:27:03] Ignite documentation: http://ignite.apache.org
[16:27:03]
[16:27:03] Quiet mode.
[16:27:03]   ^-- To see **FULL** console log here add -DIGNITE_QUIET=false or "-v" to ignite.{sh|bat}
[16:27:03]
[16:27:03] OS: Mac OS X 10.11 x86_64
[16:27:03] VM information: Java(TM) SE Runtime Environment 1.8.0_66-b17 Oracle Corporation Java HotSpot(TM) 64-Bit Server VM 25.66-b17
[16:27:03] Initial heap size is 256MB (should be no less than 512MB, use -Xms512m -Xmx512m).
[16:27:03] Configured plugins:
[16:27:03]   ^-- None
[16:27:03]
[16:27:03] Security status [authentication=off, tls/ssl=off]
[16:27:04] Topology snapshot [ver=2, servers=2, clients=0, CPUs=4, heap=3.6GB]
predicate: before containsKey