  Ignite / IGNITE-21663

Cluster load balancing when 1 node is killed doesn't work


Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 3.0.0-beta1
    • Fix Version/s: 3.0
    • Component/s: persistence, sql
    • Docs Required, Release Notes Required

    Description

      Steps to reproduce:

      1. Start a cluster with 2 nodes running locally.
      2. Connect a thin client to both nodes, like this:
      try (IgniteClient igniteClient = IgniteClient.builder()
              .retryPolicy(new RetryLimitPolicy())
              .addresses("localhost:10800", "localhost:10801")
              .build()) {
          try (Session session = igniteClient.sql().createSession()) {
              // Run the statements from steps 3-6 here.
          }
      }

      3. Create a table with a replication factor of 2.
      4. Insert 1 row and select from the table.
      5. Kill the first node in the client's address list.
      6. Select from the table again.
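
To confirm that step 5 took effect, and to tell a client-side failover problem apart from a server-side one, it can help to probe the thin-client endpoints directly. The sketch below is a hypothetical helper, not part of Ignite: the class name `EndpointProbe` and its methods are assumptions made for illustration, and it uses only the JDK. Note that the refused address in the stack trace of this report is port 3344, which is not one of the client endpoints from step 2.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper (not part of Ignite): checks which endpoints accept TCP connections. */
final class EndpointProbe {
    /** Returns true if a TCP connection to host:port succeeds within timeoutMillis. */
    static boolean isReachable(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Connection refused, timeout, or unresolved host: treat the endpoint as down.
            return false;
        }
    }

    /** Probes each "host:port" endpoint and keeps the input order in the result. */
    static Map<String, Boolean> probe(List<String> endpoints, int timeoutMillis) {
        Map<String, Boolean> result = new LinkedHashMap<>();
        for (String endpoint : endpoints) {
            int colon = endpoint.lastIndexOf(':');
            String host = endpoint.substring(0, colon);
            int port = Integer.parseInt(endpoint.substring(colon + 1));
            result.put(endpoint, isReachable(host, port, timeoutMillis));
        }
        return result;
    }

    public static void main(String[] args) {
        // Client endpoints from step 2; after step 5 the first one is expected to be down.
        System.out.println(probe(List.of("localhost:10800", "localhost:10801"), 500));
    }
}
```

If the second endpoint is reachable after step 5 and the select still fails, the client has most likely failed over correctly and the error originates on the server side.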

      Expected:
      The cluster keeps serving queries with the one remaining node.
      Actual:
      The select issued after the first node is killed fails with the exception below and is not executed.

      org.apache.ignite.sql.SqlException: IGN-CMN-65535 TraceId:92e48867-2e6e-4730-9781-527a4e204b32 Unable to send fragment [targetNode=ConnectionTest_cluster_0, fragmentId=1, cause=io.netty.channel.AbstractChannel$AnnotatedConnectException: Connection refused: no further information: /192.168.100.5:3344]
          at java.base/java.lang.invoke.MethodHandle.invokeWithArguments(MethodHandle.java:710)
          at org.apache.ignite.internal.util.ExceptionUtils$1.copy(ExceptionUtils.java:765)
          at org.apache.ignite.internal.util.ExceptionUtils$ExceptionFactory.createCopy(ExceptionUtils.java:699)
          at org.apache.ignite.internal.util.ExceptionUtils.copyExceptionWithCause(ExceptionUtils.java:525)
          at org.apache.ignite.internal.util.ExceptionUtils.copyExceptionWithCauseInternal(ExceptionUtils.java:634)
          at org.apache.ignite.internal.util.ExceptionUtils.copyExceptionWithCause(ExceptionUtils.java:476)
          at org.apache.ignite.internal.sql.AbstractSession.execute(AbstractSession.java:63)
          at org.gridgain.ai3tests.tests.teststeps.ThinClientSteps.lambda$executeQuery$0(ThinClientSteps.java:61)
          at io.qameta.allure.Allure.lambda$step$1(Allure.java:127)
          at io.qameta.allure.Allure.step(Allure.java:181)
          at io.qameta.allure.Allure.step(Allure.java:125)
          at org.gridgain.ai3tests.tests.teststeps.ThinClientSteps.executeQuery(ThinClientSteps.java:61)
          at org.gridgain.ai3tests.tests.ConnectionTest.testThinClientConnectionToMultipleHost(ConnectionTest.java:93)
          at java.base/java.lang.reflect.Method.invoke(Method.java:566)
          at java.base/java.util.ArrayList.forEach(ArrayList.java:1540)
          at java.base/java.util.ArrayList.forEach(ArrayList.java:1540)
      Caused by: java.util.concurrent.CompletionException: org.apache.ignite.sql.SqlException: IGN-CMN-65535 TraceId:92e48867-2e6e-4730-9781-527a4e204b32 Unable to send fragment [targetNode=ConnectionTest_cluster_0, fragmentId=1, cause=io.netty.channel.AbstractChannel$AnnotatedConnectException: Connection refused: no further information: /192.168.100.5:3344]
          at java.base/java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:331)
          at java.base/java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:346)
          at java.base/java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:870)
          at java.base/java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:837)
          at java.base/java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:506)
          at java.base/java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:2088)
          at org.apache.ignite.internal.client.TcpClientChannel.processNextMessage(TcpClientChannel.java:419)
          at org.apache.ignite.internal.client.TcpClientChannel.lambda$onMessage$3(TcpClientChannel.java:238)
          at java.base/java.util.concurrent.ForkJoinTask$RunnableExecuteAction.exec(ForkJoinTask.java:1426)
          at java.base/java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:290)
          at java.base/java.util.concurrent.ForkJoinPool$WorkQueue.topLevelExec(ForkJoinPool.java:1020)
          at java.base/java.util.concurrent.ForkJoinPool.scan(ForkJoinPool.java:1656)
          at java.base/java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1594)
          at java.base/java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:177)
      Caused by: org.apache.ignite.sql.SqlException: IGN-CMN-65535 TraceId:92e48867-2e6e-4730-9781-527a4e204b32 Unable to send fragment [targetNode=ConnectionTest_cluster_0, fragmentId=1, cause=io.netty.channel.AbstractChannel$AnnotatedConnectException: Connection refused: no further information: /192.168.100.5:3344]
          at java.base/java.lang.invoke.MethodHandle.invokeWithArguments(MethodHandle.java:710)
          at org.apache.ignite.internal.util.ExceptionUtils$1.copy(ExceptionUtils.java:765)
          at org.apache.ignite.internal.util.ExceptionUtils$ExceptionFactory.createCopy(ExceptionUtils.java:699)
          at org.apache.ignite.internal.util.ExceptionUtils.copyExceptionWithCause(ExceptionUtils.java:525)
          at org.apache.ignite.internal.client.TcpClientChannel.readError(TcpClientChannel.java:508)
          at org.apache.ignite.internal.client.TcpClientChannel.processNextMessage(TcpClientChannel.java:397)
          ... 7 more 

      Comments:
      The Java client sends the request to the surviving node, but that node then tries to send a query fragment to the killed node (targetNode=ConnectionTest_cluster_0) and gets a connection exception.
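
The expected behavior can be illustrated with a small model: when every partition has a replica on both nodes, a fragment should be sent to a replica that is still alive. The sketch below is a hypothetical illustration, not Ignite's implementation. The class `LiveReplicaMapping`, the method `pickTarget`, and the node name `ConnectionTest_cluster_1` are assumptions; only `ConnectionTest_cluster_0` appears in the report.

```java
import java.util.List;
import java.util.Optional;
import java.util.Set;

/** Hypothetical model, not Ignite's implementation: choose a live replica as the fragment target. */
final class LiveReplicaMapping {
    /** Returns the first replica that is also in liveNodes, or empty if none is alive. */
    static Optional<String> pickTarget(List<String> replicas, Set<String> liveNodes) {
        return replicas.stream().filter(liveNodes::contains).findFirst();
    }

    public static void main(String[] args) {
        // Assumed layout: one partition replicated on both nodes (replication factor 2).
        List<String> replicas = List.of("ConnectionTest_cluster_0", "ConnectionTest_cluster_1");
        // After step 5 only the second node is alive (its name is an assumption).
        Set<String> live = Set.of("ConnectionTest_cluster_1");
        // Prints the surviving replica instead of the killed node.
        System.out.println(pickTarget(replicas, live).orElse("no live replica"));
    }
}
```

In this model the killed node is never chosen as a target, which is the outcome the report expects from the real cluster.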

       


            People

              Assignee: Unassigned
              Reporter: Igor (lunigorn)