Ignite / IGNITE-22139

JDBC request to degraded cluster freezes forever


Details

    • Bug
    • Status: Resolved
    • Major
    • Resolution: Cannot Reproduce
    • 3.0.0-beta1
    • None
    • A 2-node or 3-node cluster running locally.

    • Docs Required, Release Notes Required

    Description

      Steps to reproduce:

      1. Create a zone with the replica count equal to the number of nodes (2 or 3, respectively).
      2. Create 10 tables inside the zone.
      3. Insert 100 rows into every table.
      4. Wait until the local state is "HEALTHY" for every table, partition, and node.
      5. Wait until the global state is "AVAILABLE" for every table, partition, and node.
      6. Kill the first node with kill -9.
      7. Assert that the local state is "HEALTHY" for every table, partition, and node.
      8. Wait until the global state for every table, partition, and node is "READ_ONLY" for the 2-node cluster or "DEGRADED" for the 3-node cluster.
      9. Execute a select query over JDBC, connecting to the second node (which is alive).
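
      The expected global states in step 8 are consistent with a simple majority rule over the replicas of a partition. The sketch below is only an illustration of that reading, not the Ignite implementation: the class name, the method name, and the "UNAVAILABLE" label for the case where no replica is alive are assumptions that do not appear in this report.

```java
// Illustrative sketch only: derives the global state that step 8 waits for,
// assuming a plain majority rule over the replicas of a partition.
final class ExpectedGlobalState {
    // AVAILABLE, DEGRADED and READ_ONLY come from the report;
    // UNAVAILABLE is a hypothetical label for "no replica alive".
    enum State { AVAILABLE, DEGRADED, READ_ONLY, UNAVAILABLE }

    static State expected(int replicas, int aliveReplicas) {
        if (replicas <= 0 || aliveReplicas < 0 || aliveReplicas > replicas) {
            throw new IllegalArgumentException(
                "replicas=" + replicas + ", aliveReplicas=" + aliveReplicas);
        }
        int majority = replicas / 2 + 1;      // 2 of 2, 2 of 3, 3 of 4, ...
        if (aliveReplicas == replicas) {
            return State.AVAILABLE;           // nothing lost
        }
        if (aliveReplicas >= majority) {
            return State.DEGRADED;            // some replicas lost, majority intact
        }
        if (aliveReplicas > 0) {
            return State.READ_ONLY;           // majority lost, data still readable
        }
        return State.UNAVAILABLE;             // every replica lost
    }

    public static void main(String[] args) {
        System.out.println(expected(2, 1));   // 2-node cluster after step 6
        System.out.println(expected(3, 2));   // 3-node cluster after step 6
    }
}
```

      Under this assumption, losing one of two replicas leaves no majority, which gives "READ_ONLY", while losing one of three leaves a majority of two, which gives "DEGRADED".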

      Expected:

      Data is returned.

      Actual:
      The select query at step 9 hangs and never returns.
      Errors on the server side:

      2024-04-30 00:04:02:965 +0200 [ERROR][%ClusterFailover3NodesTest_cluster_1%JRaft-StepDownTimer-8][AbstractClientService] Fail to connect ClusterFailover3NodesTest_cluster_0, exception: java.util.concurrent.TimeoutException.
      2024-04-30 00:04:02:965 +0200 [ERROR][%ClusterFailover3NodesTest_cluster_1%JRaft-StepDownTimer-8][ReplicatorGroupImpl] Fail to check replicator connection to peer=ClusterFailover3NodesTest_cluster_0, replicatorType=Follower.
      2024-04-30 00:04:02:980 +0200 [ERROR][%ClusterFailover3NodesTest_cluster_1%JRaft-Response-Processor-1][AbstractClientService] Fail to connect ClusterFailover3NodesTest_cluster_0, exception: java.util.concurrent.TimeoutException.
      2024-04-30 00:04:02:980 +0200 [ERROR][%ClusterFailover3NodesTest_cluster_1%JRaft-Response-Processor-1][ReplicatorGroupImpl] Fail to check replicator connection to peer=ClusterFailover3NodesTest_cluster_0, replicatorType=Follower.
      2024-04-30 00:04:02:981 +0200 [ERROR][%ClusterFailover3NodesTest_cluster_1%JRaft-Response-Processor-1][NodeImpl] Fail to add a replicator, peer=ClusterFailover3NodesTest_cluster_0.
      2024-04-30 00:04:02:981 +0200 [WARNING][ClusterFailover3NodesTest_cluster_1-client-8][RaftGroupServiceImpl] Recoverable error during the request occurred (will be retried on the randomly selected node) [request=WriteActionRequestImpl [command=[0, 9, 41, -117, -128, -8, -15, -83, -4, -54, -57, 1], deserializedCommand=SafeTimeSyncCommandImpl [safeTimeLong=112356769098760202], groupId=26_part_10], peer=Peer [consistentId=ClusterFailover3NodesTest_cluster_0, idx=0], newPeer=Peer [consistentId=ClusterFailover3NodesTest_cluster_1, idx=0]].
      java.util.concurrent.CompletionException: io.netty.channel.AbstractChannel$AnnotatedConnectException: Connection refused: no further information: /192.168.100.5:3344
        at java.base/java.util.concurrent.CompletableFuture.encodeRelay(CompletableFuture.java:367)
        at java.base/java.util.concurrent.CompletableFuture.completeRelay(CompletableFuture.java:376)
        at java.base/java.util.concurrent.CompletableFuture$UniCompose.tryFire(CompletableFuture.java:1074)
        at java.base/java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:506)
        at java.base/java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:2088)
        at org.apache.ignite.internal.network.netty.NettyUtils.lambda$toCompletableFuture$0(NettyUtils.java:74)
        at io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:590)
        at io.netty.util.concurrent.DefaultPromise.notifyListeners0(DefaultPromise.java:583)
        at io.netty.util.concurrent.DefaultPromise.notifyListenersNow(DefaultPromise.java:559)
        at io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:492)
        at io.netty.util.concurrent.DefaultPromise.setValue0(DefaultPromise.java:636)
        at io.netty.util.concurrent.DefaultPromise.setFailure0(DefaultPromise.java:629)
        at io.netty.util.concurrent.DefaultPromise.tryFailure(DefaultPromise.java:118)
        at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.fulfillConnectPromise(AbstractNioChannel.java:326)
        at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:342)
        at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:776)
        at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:724)
        at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:650)
        at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:562)
        at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997)
        at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
        at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
        at java.base/java.lang.Thread.run(Thread.java:834)
      Caused by: io.netty.channel.AbstractChannel$AnnotatedConnectException: Connection refused: no further information: /192.168.100.5:3344
      Caused by: java.net.ConnectException: Connection refused: no further information
        at java.base/sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
        at java.base/sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:779)
        at io.netty.channel.socket.nio.NioSocketChannel.doFinishConnect(NioSocketChannel.java:337)
        at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:339)
        at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:776)
        at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:724)
        at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:650)
        at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:562)
        at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997)
        at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
        at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
        at java.base/java.lang.Thread.run(Thread.java:834)
      2024-04-30 00:04:02:982 +0200 [ERROR][%ClusterFailover3NodesTest_cluster_1%JRaft-StepDownTimer-0][AbstractClientService] Fail to connect ClusterFailover3NodesTest_cluster_0, exception: java.util.concurrent.TimeoutException. 
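
      Because the query at step 9 never returns, a reproduction can block a test run indefinitely. One way to turn the hang into a test failure is to run the JDBC call on a separate thread and bound the wait. The following is a minimal sketch using only java.util.concurrent and java.sql; the class name, the helper names, and the jdbcUrl and sql parameters are hypothetical and are not taken from the original test.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;
import java.time.Duration;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Illustrative sketch only: bounds the time a blocking call may take.
final class QueryWatchdog {
    // Runs the task on a daemon thread and waits at most `timeout` for it.
    // Throws TimeoutException if the task has not finished by then.
    static <T> T callWithTimeout(Callable<T> task, Duration timeout) throws Exception {
        ExecutorService executor = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "query-watchdog");
            t.setDaemon(true);                // a stuck task must not keep the JVM alive
            return t;
        });
        Future<T> future = executor.submit(task);
        try {
            return future.get(timeout.toMillis(), TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            future.cancel(true);              // best effort: interrupt the worker
            throw e;
        } finally {
            executor.shutdownNow();
        }
    }

    // Hypothetical step-9 query: counts the rows returned by `sql`.
    // Both arguments are placeholders supplied by the caller.
    static int countRows(String jdbcUrl, String sql) throws Exception {
        try (Connection conn = DriverManager.getConnection(jdbcUrl);
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery(sql)) {
            int rows = 0;
            while (rs.next()) {
                rows++;
            }
            return rows;
        }
    }

    public static void main(String[] args) throws Exception {
        // Stand-in for a query that never completes, so no cluster is needed here.
        Callable<Integer> hangingQuery = () -> {
            Thread.sleep(60_000);
            return 0;
        };
        try {
            callWithTimeout(hangingQuery, Duration.ofMillis(200));
        } catch (TimeoutException e) {
            System.out.println("query timed out");
        }
    }
}
```

      In an actual reproduction the stand-in task would be replaced with something like () -> countRows(jdbcUrl, sql). Interrupting the worker thread does not guarantee that a blocked driver call unwinds, so the guard only keeps the calling test from hanging.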

            People

              Assignee: Unassigned
              Reporter: lunigorn Igor