Uploaded image for project: 'Ignite'
  1. Ignite
  2. IGNITE-21725

The exception "Primary replica has expired" on creation of 1000 tables

Attach filesAttach ScreenshotAdd voteVotersWatch issueWatchersCreate sub-taskLinkCloneUpdate Comment AuthorReplace String in CommentUpdate Comment VisibilityDelete Comments
    XMLWordPrintableJSON

Details

    • Bug
    • Status: Open
    • Major
    • Resolution: Unresolved
    • 3.0.0-beta1
    • None
    • general, persistence
    • Docs Required, Release Notes Required

    Description

      Steps to reproduce:

      1. Start cluster with 1 node with JVM options: "-Xms4096m -Xmx4096m"

      2. Create 1000 tables with 200 varchar columns each  and insert 1 row into each. One by one.

      Expected result:
      Tables are created.

      Actual result:

      On table 949 the exception is thrown:

      java.sql.SQLException: Primary replica has expired, transaction will be rolled back: [groupId = 1850_part_21, expected enlistment consistency token = 112069202113202526, commit timestamp = HybridTimestamp [physical=2024-03-10 03:13:16:057 +0000, logical=396, composite=112069207395991948], current primary replica = null]
                at org.apache.ignite.internal.jdbc.proto.IgniteQueryErrorCode.createJdbcSqlException(IgniteQueryErrorCode.java:57)
                at org.apache.ignite.internal.jdbc.JdbcStatement.execute0(JdbcStatement.java:154)
                at org.apache.ignite.internal.jdbc.JdbcPreparedStatement.executeWithArguments(JdbcPreparedStatement.java:765)
                at org.apache.ignite.internal.jdbc.JdbcPreparedStatement.executeUpdate(JdbcPreparedStatement.java:173)
                at org.gridgain.ai3tests.tests.TablesAmountCapacityTest.lambda$insertRowAndAssertTimeout$1(TablesAmountCapacityTest.java:166)
                at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
                at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
                at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
                at java.base/java.lang.Thread.run(Thread.java:834) 

      In server logs there is an exception:

      2024-03-10 03:13:24:222 +0000 [WARNING][%TablesAmountCapacityTest_cluster_0%partition-operations-8][TxManagerImpl] Failed to finish Tx. The operation will be retried [txId=018e2659-b09f-009c-23c0-6ab500000001].
      java.util.concurrent.CompletionException: org.apache.ignite.internal.replicator.exception.ReplicationTimeoutException: IGN-REP-3 TraceId:7ff7e851-9f18-4212-b317-a70a0a92fdfe Replication is timed out [replicaGrpId=1850_part_21]
          at java.base/java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:331)
          at java.base/java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:346)
          at java.base/java.util.concurrent.CompletableFuture$UniAccept.tryFire(CompletableFuture.java:704)
          at java.base/java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:506)
          at java.base/java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:2088)
          at org.apache.ignite.internal.replicator.ReplicaService.lambda$sendToReplica$0(ReplicaService.java:110)
          at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
          at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
          at java.base/java.lang.Thread.run(Thread.java:834)
      Caused by: org.apache.ignite.internal.replicator.exception.ReplicationTimeoutException: IGN-REP-3 TraceId:7ff7e851-9f18-4212-b317-a70a0a92fdfe Replication is timed out [replicaGrpId=1850_part_21]
          ... 4 more
      2024-03-10 03:13:24:290 +0000 [WARNING][%TablesAmountCapacityTest_cluster_0%partition-operations-22][TrackableNetworkMessageHandler] Message handling has been too long [duration=67ms, message=[class org.apache.ignite.raft.jraft.rpc.WriteActionRequestImpl]]
      2024-03-10 03:13:24:290 +0000 [WARNING][%TablesAmountCapacityTest_cluster_0%partition-operations-11][TrackableNetworkMessageHandler] Message handling has been too long [duration=67ms, message=[class org.apache.ignite.raft.jraft.rpc.WriteActionRequestImpl]]
      2024-03-10 03:13:24:290 +0000 [WARNING][%TablesAmountCapacityTest_cluster_0%partition-operations-19][TrackableNetworkMessageHandler] Message handling has been too long [duration=67ms, message=[class org.apache.ignite.raft.jraft.rpc.WriteActionRequestImpl]]
      2024-03-10 03:13:24:290 +0000 [WARNING][%TablesAmountCapacityTest_cluster_0%partition-operations-17][TrackableNetworkMessageHandler] Message handling has been too long [duration=67ms, message=[class org.apache.ignite.raft.jraft.rpc.WriteActionRequestImpl]]
      2024-03-10 03:13:24:290 +0000 [WARNING][%TablesAmountCapacityTest_cluster_0%partition-operations-23][TrackableNetworkMessageHandler] Message handling has been too long [duration=67ms, message=[class org.apache.ignite.raft.jraft.rpc.WriteActionRequestImpl]]
      2024-03-10 03:13:24:290 +0000 [WARNING][%TablesAmountCapacityTest_cluster_0%partition-operations-6][TrackableNetworkMessageHandler] Message handling has been too long [duration=67ms, message=[class org.apache.ignite.raft.jraft.rpc.WriteActionRequestImpl]]
      2024-03-10 03:13:24:290 +0000 [WARNING][%TablesAmountCapacityTest_cluster_0%partition-operations-9][TrackableNetworkMessageHandler] Message handling has been too long [duration=67ms, message=[class org.apache.ignite.raft.jraft.rpc.WriteActionRequestImpl]]
      2024-03-10 03:13:24:290 +0000 [WARNING][%TablesAmountCapacityTest_cluster_0%partition-operations-20][TrackableNetworkMessageHandler] Message handling has been too long [duration=67ms, message=[class org.apache.ignite.raft.jraft.rpc.WriteActionRequestImpl]]
      2024-03-10 03:13:24:290 +0000 [WARNING][%TablesAmountCapacityTest_cluster_0%partition-operations-10][TrackableNetworkMessageHandler] Message handling has been too long [duration=67ms, message=[class org.apache.ignite.raft.jraft.rpc.WriteActionRequestImpl]]
      2024-03-10 03:13:24:291 +0000 [WARNING][%TablesAmountCapacityTest_cluster_0%partition-operations-12][TrackableNetworkMessageHandler] Message handling has been too long [duration=67ms, message=[class org.apache.ignite.raft.jraft.rpc.WriteActionRequestImpl]]
      2024-03-10 03:13:24:291 +0000 [WARNING][%TablesAmountCapacityTest_cluster_0%partition-operations-2][TrackableNetworkMessageHandler] Message handling has been too long [duration=67ms, message=[class org.apache.ignite.raft.jraft.rpc.WriteActionRequestImpl]]
      2024-03-10 03:13:24:290 +0000 [WARNING][%TablesAmountCapacityTest_cluster_0%partition-operations-7][TrackableNetworkMessageHandler] Message handling has been too long [duration=67ms, message=[class org.apache.ignite.raft.jraft.rpc.WriteActionRequestImpl]]
      2024-03-10 03:13:24:290 +0000 [WARNING][%TablesAmountCapacityTest_cluster_0%partition-operations-18][TrackableNetworkMessageHandler] Message handling has been too long [duration=67ms, message=[class org.apache.ignite.raft.jraft.rpc.WriteActionRequestImpl]]
      2024-03-10 03:13:24:290 +0000 [WARNING][%TablesAmountCapacityTest_cluster_0%partition-operations-15][TrackableNetworkMessageHandler] Message handling has been too long [duration=67ms, message=[class org.apache.ignite.raft.jraft.rpc.WriteActionRequestImpl]]
      2024-03-10 03:13:24:290 +0000 [WARNING][%TablesAmountCapacityTest_cluster_0%partition-operations-13][TrackableNetworkMessageHandler] Message handling has been too long [duration=67ms, message=[class org.apache.ignite.raft.jraft.rpc.WriteActionRequestImpl]]
      2024-03-10 03:13:24:291 +0000 [WARNING][%TablesAmountCapacityTest_cluster_0%partition-operations-4][TrackableNetworkMessageHandler] Message handling has been too long [duration=67ms, message=[class org.apache.ignite.raft.jraft.rpc.WriteActionRequestImpl]]
      2024-03-10 03:13:24:291 +0000 [WARNING][%TablesAmountCapacityTest_cluster_0%partition-operations-3][TrackableNetworkMessageHandler] Message handling has been too long [duration=67ms, message=[class org.apache.ignite.raft.jraft.rpc.WriteActionRequestImpl]]
      2024-03-10 03:13:24:290 +0000 [WARNING][%TablesAmountCapacityTest_cluster_0%partition-operations-1][TrackableNetworkMessageHandler] Message handling has been too long [duration=67ms, message=[class org.apache.ignite.raft.jraft.rpc.WriteActionRequestImpl]]
      2024-03-10 03:13:24:291 +0000 [WARNING][%TablesAmountCapacityTest_cluster_0%scheduled-idle-safe-time-sync-thread-0][TrackableNetworkMessageHandler] Message handling has been too long [duration=67ms, message=[class org.apache.ignite.raft.jraft.rpc.WriteActionRequestImpl]]
      2024-03-10 03:13:24:291 +0000 [WARNING][%TablesAmountCapacityTest_cluster_0%partition-operations-21][TrackableNetworkMessageHandler] Message handling has been too long [duration=67ms, message=[class org.apache.ignite.raft.jraft.rpc.WriteActionRequestImpl]]
      2024-03-10 03:13:24:291 +0000 [WARNING][%TablesAmountCapacityTest_cluster_0%partition-operations-24][TrackableNetworkMessageHandler] Message handling has been too long [duration=67ms, message=[class org.apache.ignite.raft.jraft.rpc.WriteActionRequestImpl]]
      2024-03-10 03:13:24:291 +0000 [WARNING][%TablesAmountCapacityTest_cluster_0%partition-operations-14][TrackableNetworkMessageHandler] Message handling has been too long [duration=67ms, message=[class org.apache.ignite.raft.jraft.rpc.WriteActionRequestImpl]]
      2024-03-10 03:13:24:291 +0000 [WARNING][%TablesAmountCapacityTest_cluster_0%partition-operations-5][TrackableNetworkMessageHandler] Message handling has been too long [duration=67ms, message=[class org.apache.ignite.raft.jraft.rpc.WriteActionRequestImpl]]
      2024-03-10 03:13:24:290 +0000 [WARNING][%TablesAmountCapacityTest_cluster_0%partition-operations-16][TrackableNetworkMessageHandler] Message handling has been too long [duration=67ms, message=[class org.apache.ignite.raft.jraft.rpc.WriteActionRequestImpl]]
      2024-03-10 03:13:24:397 +0000 [INFO][%TablesAmountCapacityTest_cluster_0%partition-operations-4][JdbcQueryEventHandlerImpl] Exception while executing query [query=insert into test_table_922(id, column_1, column_2, column_3, column_4, column_5, column_6, column_7, column_8, column_9, column_10, column_11, column_12, column_13, column_14, column_15, column_16, column_17, column_18, column_19, column_20, column_21, column_22, column_23, column_24, column_25, column_26, column_27, column_28, column_29, column_30, column_31, column_32, column_33, column_34, column_35, column_36, column_37, column_38, column_39, column_40, column_41, column_42, column_43, column_44, column_45, column_46, column_47, column_48, column_49, column_50, column_51, column_52, column_53, column_54, column_55, column_56, column_57, column_58, column_59, column_60, column_61, column_62, column_63, column_64, column_65, column_66, column_67, column_68, column_69, column_70, column_71, column_72, column_73, column_74, column_75, column_76, column_77, column_78, column_79, column_80, column_81, column_82, column_83, column_84, column_85, column_86, column_87, column_88, column_89, column_90, column_91, column_92, column_93, column_94, column_95, column_96, column_97, column_98, column_99, column_100, column_101, column_102, column_103, column_104, column_105, column_106, column_107, column_108, column_109, column_110, column_111, column_112, column_113, column_114, column_115, column_116, column_117, column_118, column_119, column_120, column_121, column_122, column_123, column_124, column_125, column_126, column_127, column_128, column_129, column_130, column_131, column_132, column_133, column_134, column_135, column_136, column_137, column_138, column_139, column_140, column_141, column_142, column_143, column_144, column_145, column_146, column_147, column_148, column_149, column_150, column_151, column_152, column_153, column_154, column_155, column_156, column_157, column_158, column_159, column_160, column_161, column_162, column_163, column_164, column_165, column_166, column_167, column_168, column_169, column_170, column_171, column_172, column_173, column_174, column_175, column_176, column_177, column_178, column_179, column_180, column_181, column_182, column_183, column_184, column_185, column_186, column_187, column_188, column_189, column_190, column_191, column_192, column_193, column_194, column_195, column_196, column_197, column_198, column_199) values (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)]
      org.apache.ignite.internal.tx.impl.PrimaryReplicaExpiredException: IGN-TX-13 TraceId:98224b48-892e-4831-b5de-4d4fb09e55b5 Primary replica has expired, transaction will be rolled back: [groupId = 1850_part_21, expected enlistment consistency token = 112069202113202526, commit timestamp = HybridTimestamp [physical=2024-03-10 03:13:16:057 +0000, logical=396, composite=112069207395991948], current primary replica = null]
          at org.apache.ignite.internal.tx.impl.TxManagerImpl.lambda$verifyCommitTimestamp$23(TxManagerImpl.java:917)
          at java.base/java.util.concurrent.CompletableFuture$UniAccept.tryFire(CompletableFuture.java:714)
          at java.base/java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:506)
          at java.base/java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:2073)
          at org.apache.ignite.internal.util.PendingComparableValuesTracker.lambda$completeWaitersOnUpdate$0(PendingComparableValuesTracker.java:169)
          at java.base/java.util.concurrent.ConcurrentMap.forEach(ConcurrentMap.java:122)
          at org.apache.ignite.internal.util.PendingComparableValuesTracker.completeWaitersOnUpdate(PendingComparableValuesTracker.java:169)
          at org.apache.ignite.internal.util.PendingComparableValuesTracker.update(PendingComparableValuesTracker.java:103)
          at org.apache.ignite.internal.metastorage.server.time.ClusterTimeImpl.updateSafeTime(ClusterTimeImpl.java:146)
          at org.apache.ignite.internal.metastorage.impl.MetaStorageManagerImpl.onSafeTimeAdvanced(MetaStorageManagerImpl.java:823)
          at org.apache.ignite.internal.metastorage.impl.MetaStorageManagerImpl$1.onSafeTimeAdvanced(MetaStorageManagerImpl.java:430)
          at org.apache.ignite.internal.metastorage.server.WatchProcessor.lambda$advanceSafeTime$8(WatchProcessor.java:314)
          at java.base/java.util.concurrent.CompletableFuture$UniRun.tryFire(CompletableFuture.java:783)
          at java.base/java.util.concurrent.CompletableFuture$Completion.run(CompletableFuture.java:478)
          at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
          at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
          at java.base/java.lang.Thread.run(Thread.java:834)
      2024-03-10 03:13:24:569 +0000 [INFO][TablesAmountCapacityTest_cluster_0-srv-worker-1][ClientInboundMessageHandler] Partition primary replica changed, notifying client [connectionId=1, remoteAddress=/127.0.0.1:33690]
       

      Attachments

        Issue Links

        Activity

          This comment will be Viewable by All Users Viewable by All Users
          Cancel

          People

            Unassigned Unassigned
            lunigorn Igor

            Dates

              Created:
              Updated:

              Slack

                Issue deployment