Ignite / IGNITE-22011

aimem: repeated create table and drop table leads to "Failed to get the primary replica"

Details

    • Bug
    • Status: Open
    • Blocker
    • Resolution: Unresolved
    • 3.0.0-beta1
    • None
    • persistence
    • 2 nodes cluster running on remote machine or locally

    Description

      Comment:
      This issue is flaky. It can occur on any operation on a table that uses the aimem storage engine, provided the cluster lives long enough.

      Steps to reproduce:

      Run the following queries through IgniteSql in a loop of 50 iterations over a single connection:

      create zone if not exists "AIMEM" engine aimem
      create table selectFromDropMultipleJdbc(k1 INTEGER not null, k2 INTEGER not null, v1 VARCHAR(100), v2 VARCHAR(255), v3 TIMESTAMP not null, primary key (k1, k2)) with PRIMARY_ZONE='AIMEM'
      insert into selectFromDropMultipleJdbc(k1, k2, v1, v2, v3) values (3366, 3367, null, null, '1980-02-27 01:01:49.000000000')
      insert into selectFromDropMultipleJdbc(k1, k2, v1, v2, v3) values (3367, 3368, '1v1_1_1_1_1_1_1_1_1_1_1_1_1_1_1_1_1_1_1_1_1_1_1_1_1_1_1_1_1_1_1_1_1_1_1_1_1_1_1_1_1_1_1_1_1_1_1_1_1_', '1v2_1_1_1_1_1_1_1_1_1_1_1_1_1_1_1_1_1_1_1_1_1_1_1_1_1_1_1_1_1_1_1_1_1_1_1_1_1_1_1_1_1_1_1_1_1_1_1_1_1_1_1_1_1_1_1_1_1_1_1_1_1_1_1_1_1_1_1_1_1_1_1_1_1_1_1_1_1_1_1_1_1_1_1_1_1_1_1_1_1_1_1_1_1_1_1_1_1_1_1_1_1_1_1_1_1_1_1_1_1_1_1_1_1_1_1_1_1_1_1_1_1_1_1_1_1_1', '1980-02-28 01:01:50.000000000')
      insert into selectFromDropMultipleJdbc(k1, k2, v1, v2, v3) values (3368, 3369, '2v1_2_2_2_2_2_2_2_2_2_2_2_2_2_2_2_2_2_2_2_2_2_2_2_2_2_2_2_2_2_2_2_2_2_2_2_2_2_2_2_2_2_2_2_2_2_2_2_2_', '2v2_2_2_2_2_2_2_2_2_2_2_2_2_2_2_2_2_2_2_2_2_2_2_2_2_2_2_2_2_2_2_2_2_2_2_2_2_2_2_2_2_2_2_2_2_2_2_2_2_2_2_2_2_2_2_2_2_2_2_2_2_2_2_2_2_2_2_2_2_2_2_2_2_2_2_2_2_2_2_2_2_2_2_2_2_2_2_2_2_2_2_2_2_2_2_2_2_2_2_2_2_2_2_2_2_2_2_2_2_2_2_2_2_2_2_2_2_2_2_2_2_2_2_2_2_2_2', '1980-02-29 01:01:51.000000000')
      insert into selectFromDropMultipleJdbc(k1, k2, v1, v2, v3) values (3369, 3370, '3v1_3_3_3_3_3_3_3_3_3_3_3_3_3_3_3_3_3_3_3_3_3_3_3_3_3_3_3_3_3_3_3_3_3_3_3_3_3_3_3_3_3_3_3_3_3_3_3_3_', '3v2_3_3_3_3_3_3_3_3_3_3_3_3_3_3_3_3_3_3_3_3_3_3_3_3_3_3_3_3_3_3_3_3_3_3_3_3_3_3_3_3_3_3_3_3_3_3_3_3_3_3_3_3_3_3_3_3_3_3_3_3_3_3_3_3_3_3_3_3_3_3_3_3_3_3_3_3_3_3_3_3_3_3_3_3_3_3_3_3_3_3_3_3_3_3_3_3_3_3_3_3_3_3_3_3_3_3_3_3_3_3_3_3_3_3_3_3_3_3_3_3_3_3_3_3_3_3', '1980-03-01 01:01:52.000000000')
      insert into selectFromDropMultipleJdbc(k1, k2, v1, v2, v3) values (3370, 3371, null, null, '1980-03-02 01:01:53.000000000')
      insert into selectFromDropMultipleJdbc(k1, k2, v1, v2, v3) values (3371, 3372, '5v1_5_5_5_5_5_5_5_5_5_5_5_5_5_5_5_5_5_5_5_5_5_5_5_5_5_5_5_5_5_5_5_5_5_5_5_5_5_5_5_5_5_5_5_5_5_5_5_5_', '5v2_5_5_5_5_5_5_5_5_5_5_5_5_5_5_5_5_5_5_5_5_5_5_5_5_5_5_5_5_5_5_5_5_5_5_5_5_5_5_5_5_5_5_5_5_5_5_5_5_5_5_5_5_5_5_5_5_5_5_5_5_5_5_5_5_5_5_5_5_5_5_5_5_5_5_5_5_5_5_5_5_5_5_5_5_5_5_5_5_5_5_5_5_5_5_5_5_5_5_5_5_5_5_5_5_5_5_5_5_5_5_5_5_5_5_5_5_5_5_5_5_5_5_5_5_5_5', '1980-03-03 01:01:54.000000000')
      insert into selectFromDropMultipleJdbc(k1, k2, v1, v2, v3) values (3372, 3373, '6v1_6_6_6_6_6_6_6_6_6_6_6_6_6_6_6_6_6_6_6_6_6_6_6_6_6_6_6_6_6_6_6_6_6_6_6_6_6_6_6_6_6_6_6_6_6_6_6_6_', '6v2_6_6_6_6_6_6_6_6_6_6_6_6_6_6_6_6_6_6_6_6_6_6_6_6_6_6_6_6_6_6_6_6_6_6_6_6_6_6_6_6_6_6_6_6_6_6_6_6_6_6_6_6_6_6_6_6_6_6_6_6_6_6_6_6_6_6_6_6_6_6_6_6_6_6_6_6_6_6_6_6_6_6_6_6_6_6_6_6_6_6_6_6_6_6_6_6_6_6_6_6_6_6_6_6_6_6_6_6_6_6_6_6_6_6_6_6_6_6_6_6_6_6_6_6_6_6', '1980-03-04 01:01:55.000000000')
      insert into selectFromDropMultipleJdbc(k1, k2, v1, v2, v3) values (3373, 3374, '7v1_7_7_7_7_7_7_7_7_7_7_7_7_7_7_7_7_7_7_7_7_7_7_7_7_7_7_7_7_7_7_7_7_7_7_7_7_7_7_7_7_7_7_7_7_7_7_7_7_', '7v2_7_7_7_7_7_7_7_7_7_7_7_7_7_7_7_7_7_7_7_7_7_7_7_7_7_7_7_7_7_7_7_7_7_7_7_7_7_7_7_7_7_7_7_7_7_7_7_7_7_7_7_7_7_7_7_7_7_7_7_7_7_7_7_7_7_7_7_7_7_7_7_7_7_7_7_7_7_7_7_7_7_7_7_7_7_7_7_7_7_7_7_7_7_7_7_7_7_7_7_7_7_7_7_7_7_7_7_7_7_7_7_7_7_7_7_7_7_7_7_7_7_7_7_7_7_7', '1980-03-05 01:01:56.000000000')
      insert into selectFromDropMultipleJdbc(k1, k2, v1, v2, v3) values (3374, 3375, null, null, '1980-03-06 01:01:57.000000000')
      insert into selectFromDropMultipleJdbc(k1, k2, v1, v2, v3) values (3375, 3376, '9v1_9_9_9_9_9_9_9_9_9_9_9_9_9_9_9_9_9_9_9_9_9_9_9_9_9_9_9_9_9_9_9_9_9_9_9_9_9_9_9_9_9_9_9_9_9_9_9_9_', '9v2_9_9_9_9_9_9_9_9_9_9_9_9_9_9_9_9_9_9_9_9_9_9_9_9_9_9_9_9_9_9_9_9_9_9_9_9_9_9_9_9_9_9_9_9_9_9_9_9_9_9_9_9_9_9_9_9_9_9_9_9_9_9_9_9_9_9_9_9_9_9_9_9_9_9_9_9_9_9_9_9_9_9_9_9_9_9_9_9_9_9_9_9_9_9_9_9_9_9_9_9_9_9_9_9_9_9_9_9_9_9_9_9_9_9_9_9_9_9_9_9_9_9_9_9_9_9', '1980-03-07 01:01:58.000000000')
      select * from selectFromDropMultipleJdbc
      drop table selectFromDropMultipleJdbc 
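
The statement list above follows a regular pattern, so it can be regenerated instead of copied by hand. The sketch below is a hypothetical helper, not part of the original test suite: the class name `AimemReproStatements`, its methods, and the `Consumer<String>` executor are assumptions made for illustration. It assumes the v1 and v2 values are the repeating `<row>v1_<row>_...` pattern cut to the column width (100 and 255 characters), and that rows 0, 4 and 8 carry nulls, as in the listing.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical helper that rebuilds the statements from the steps to reproduce. */
final class AimemReproStatements {
    static final String TABLE = "selectFromDropMultipleJdbc";

    private static final DateTimeFormatter TS =
            DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss.SSSSSSSSS");

    // Timestamp of the first inserted row; each later row adds one day and one second.
    private static final LocalDateTime FIRST_TS = LocalDateTime.of(1980, 2, 27, 1, 1, 49);

    /** Builds a value such as "1v1_1_1_..." cut to exactly {@code length} characters. */
    static String pattern(int row, String column, int length) {
        StringBuilder sb = new StringBuilder(row + column);
        while (sb.length() < length) {
            sb.append('_').append(row);
        }
        sb.setLength(length);
        return sb.toString();
    }

    /** Builds the insert statement for row 0..9. */
    static String insert(int row) {
        boolean nulls = row % 4 == 0; // rows 0, 4 and 8 have null v1/v2 in the report
        String v1 = nulls ? "null" : "'" + pattern(row, "v1", 100) + "'";
        String v2 = nulls ? "null" : "'" + pattern(row, "v2", 255) + "'";
        String v3 = TS.format(FIRST_TS.plusDays(row).plusSeconds(row));
        return "insert into " + TABLE + "(k1, k2, v1, v2, v3) values ("
                + (3366 + row) + ", " + (3367 + row) + ", " + v1 + ", " + v2 + ", '" + v3 + "')";
    }

    /** One full cycle: create zone, create table, ten inserts, select, drop. */
    static List<String> cycle() {
        List<String> sql = new ArrayList<>();
        sql.add("create zone if not exists \"AIMEM\" engine aimem");
        sql.add("create table " + TABLE
                + "(k1 INTEGER not null, k2 INTEGER not null, v1 VARCHAR(100), v2 VARCHAR(255), "
                + "v3 TIMESTAMP not null, primary key (k1, k2)) with PRIMARY_ZONE='AIMEM'");
        for (int row = 0; row < 10; row++) {
            sql.add(insert(row));
        }
        sql.add("select * from " + TABLE);
        sql.add("drop table " + TABLE);
        return sql;
    }

    /** Feeds {@code repeats} cycles, statement by statement, to the given executor. */
    static void run(int repeats, Consumer<String> executor) {
        for (int i = 0; i < repeats; i++) {
            cycle().forEach(executor);
        }
    }
}
```

To reproduce against a cluster, the executor would be a lambda that forwards each statement to a single IgniteSql instance; `AimemReproStatements.run(50, System.out::println)` only prints the statements.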

      Expected:

      All queries are executed.

      Actual:

      On a random iteration, the client throws the following exception:

      org.apache.ignite.sql.SqlException: IGN-PLACEMENTDRIVER-1 TraceId:16e895ba-34d2-4aac-aeb5-4718a116a97d Failed to get the primary replica [tablePartitionId=18_part_22, awaitTimestamp=HybridTimestamp [physical=2024-04-09 10:38:37:478 +0200, logical=53, composite=112240356063838261]]
          at java.base/java.lang.invoke.MethodHandle.invokeWithArguments(MethodHandle.java:710)
          at org.apache.ignite.internal.util.ExceptionUtils$1.copy(ExceptionUtils.java:765)
          at org.apache.ignite.internal.util.ExceptionUtils$ExceptionFactory.createCopy(ExceptionUtils.java:699)
          at org.apache.ignite.internal.util.ExceptionUtils.copyExceptionWithCause(ExceptionUtils.java:525)
          at org.apache.ignite.internal.util.ExceptionUtils.copyExceptionWithCauseInternal(ExceptionUtils.java:634)
          at org.apache.ignite.internal.util.ExceptionUtils.copyExceptionWithCause(ExceptionUtils.java:476)
          at org.apache.ignite.internal.client.sql.ClientSql.execute(ClientSql.java:94)
          at org.gridgain.ai3tests.tests.teststeps.ThinClientSteps.lambda$executeInsertQueries$1(ThinClientSteps.java:87)
          at io.qameta.allure.Allure.lambda$step$0(Allure.java:113)
          at io.qameta.allure.Allure.lambda$step$1(Allure.java:127)
          at io.qameta.allure.Allure.step(Allure.java:181)
          at io.qameta.allure.Allure.step(Allure.java:125)
          at io.qameta.allure.Allure.step(Allure.java:112)
          at org.gridgain.ai3tests.tests.teststeps.ThinClientSteps.executeInsertQueries(ThinClientSteps.java:83)
          at org.gridgain.ai3tests.tests.droptable.DropTableTestBase.selectFromDroppedTableThinClient(DropTableTestBase.java:56)
          at org.gridgain.ai3tests.tests.droptable.DropTableMultipleTriesThinTest.dropExistingTableMultipleThinClient(DropTableMultipleTriesThinTest.java:33)
          at java.base/java.lang.reflect.Method.invoke(Method.java:566)
          at io.qameta.allure.junit5.AllureJunit5.interceptTestTemplateMethod(AllureJunit5.java:59)
          at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
          at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
          at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
          at java.base/java.lang.Thread.run(Thread.java:834)
      Caused by: java.util.concurrent.CompletionException: org.apache.ignite.sql.SqlException: IGN-PLACEMENTDRIVER-1 TraceId:16e895ba-34d2-4aac-aeb5-4718a116a97d Failed to get the primary replica [tablePartitionId=18_part_22, awaitTimestamp=HybridTimestamp [physical=2024-04-09 10:38:37:478 +0200, logical=53, composite=112240356063838261]]
          at java.base/java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:331)
          at java.base/java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:346)
          at java.base/java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:870)
          at java.base/java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:837)
          at java.base/java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:506)
          at java.base/java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:2088)
          at org.apache.ignite.internal.client.TcpClientChannel.processNextMessage(TcpClientChannel.java:419)
          at org.apache.ignite.internal.client.TcpClientChannel.lambda$onMessage$3(TcpClientChannel.java:238)
          at java.base/java.util.concurrent.ForkJoinTask$RunnableExecuteAction.exec(ForkJoinTask.java:1426)
          at java.base/java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:290)
          at java.base/java.util.concurrent.ForkJoinPool$WorkQueue.topLevelExec(ForkJoinPool.java:1020)
          at java.base/java.util.concurrent.ForkJoinPool.scan(ForkJoinPool.java:1656)
          at java.base/java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1594)
          at java.base/java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:177)
      Caused by: org.apache.ignite.sql.SqlException: IGN-PLACEMENTDRIVER-1 TraceId:16e895ba-34d2-4aac-aeb5-4718a116a97d Failed to get the primary replica [tablePartitionId=18_part_22, awaitTimestamp=HybridTimestamp [physical=2024-04-09 10:38:37:478 +0200, logical=53, composite=112240356063838261]]
          at java.base/java.lang.invoke.MethodHandle.invokeWithArguments(MethodHandle.java:710)
          at org.apache.ignite.internal.util.ExceptionUtils$1.copy(ExceptionUtils.java:765)
          at org.apache.ignite.internal.util.ExceptionUtils$ExceptionFactory.createCopy(ExceptionUtils.java:699)
          at org.apache.ignite.internal.util.ExceptionUtils.copyExceptionWithCause(ExceptionUtils.java:525)
          at org.apache.ignite.internal.client.TcpClientChannel.readError(TcpClientChannel.java:508)
          at org.apache.ignite.internal.client.TcpClientChannel.processNextMessage(TcpClientChannel.java:397)
          ... 7 more 

      The error in the server log:

      2024-04-09 10:25:59:093 +0200 [WARNING][%DropTableMultipleTriesThinTest_cluster_0%sql-execution-pool-3][ClientInboundMessageHandler] Error processing client request [connectionId=5, id=8, op=50, remoteAddress=/127.0.0.1:60049]:org.apache.ignite.sql.SqlException: IGN-PLACEMENTDRIVER-1 TraceId:a343b2a9-ffa6-4449-9c8b-be4fe52cd302 Failed to get the primary replica [tablePartitionId=16_part_22, awaitTimestamp=HybridTimestamp [physical=2024-04-09 10:25:29:087 +0200, logical=3, composite=112240304395845635]]
      java.util.concurrent.CompletionException: org.apache.ignite.sql.SqlException: IGN-PLACEMENTDRIVER-1 TraceId:a343b2a9-ffa6-4449-9c8b-be4fe52cd302 Failed to get the primary replica [tablePartitionId=16_part_22, awaitTimestamp=HybridTimestamp [physical=2024-04-09 10:25:29:087 +0200, logical=3, composite=112240304395845635]]
        at org.apache.ignite.client.handler.requests.sql.ClientSqlExecuteRequest.lambda$executeAsync$4(ClientSqlExecuteRequest.java:193)
        at java.base/java.util.concurrent.CompletableFuture.uniExceptionally(CompletableFuture.java:986)
        at java.base/java.util.concurrent.CompletableFuture$UniExceptionally.tryFire(CompletableFuture.java:970)
        at java.base/java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:506)
        at java.base/java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:2088)
        at org.apache.ignite.internal.sql.engine.SqlQueryProcessor$PrefetchCallback.onPrefetchComplete(SqlQueryProcessor.java:1124)
        at org.apache.ignite.internal.sql.engine.prepare.KeyValueModifyPlan.lambda$execute$3(KeyValueModifyPlan.java:141)
        at java.base/java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:859)
        at java.base/java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:837)
        at java.base/java.util.concurrent.CompletableFuture$Completion.run(CompletableFuture.java:478)
        at org.apache.ignite.internal.sql.engine.exec.ExecutionContext.lambda$execute$0(ExecutionContext.java:329)
        at org.apache.ignite.internal.sql.engine.exec.QueryTaskExecutorImpl.lambda$execute$0(QueryTaskExecutorImpl.java:83)
        at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
        at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
        at java.base/java.lang.Thread.run(Thread.java:834)
      Caused by: org.apache.ignite.sql.SqlException: IGN-PLACEMENTDRIVER-1 TraceId:a343b2a9-ffa6-4449-9c8b-be4fe52cd302 Failed to get the primary replica [tablePartitionId=16_part_22, awaitTimestamp=HybridTimestamp [physical=2024-04-09 10:25:29:087 +0200, logical=3, composite=112240304395845635]]
        at org.apache.ignite.internal.lang.SqlExceptionMapperUtil.mapToPublicSqlException(SqlExceptionMapperUtil.java:61)
        ... 10 more
      Caused by: org.apache.ignite.tx.TransactionException: IGN-PLACEMENTDRIVER-1 TraceId:a343b2a9-ffa6-4449-9c8b-be4fe52cd302 Failed to get the primary replica [tablePartitionId=16_part_22, awaitTimestamp=HybridTimestamp [physical=2024-04-09 10:25:29:087 +0200, logical=3, composite=112240304395845635]]
        at org.apache.ignite.internal.util.ExceptionUtils.lambda$withCause$1(ExceptionUtils.java:384)
        at org.apache.ignite.internal.util.ExceptionUtils.withCauseInternal(ExceptionUtils.java:446)
        at org.apache.ignite.internal.util.ExceptionUtils.withCause(ExceptionUtils.java:384)
        at org.apache.ignite.internal.table.distributed.storage.InternalTableImpl.lambda$enlist$76(InternalTableImpl.java:2011)
        at java.base/java.util.concurrent.CompletableFuture.uniHandle(CompletableFuture.java:930)
        at java.base/java.util.concurrent.CompletableFuture$UniHandle.tryFire(CompletableFuture.java:907)
        at java.base/java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:506)
        at java.base/java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:2088)
        at java.base/java.util.concurrent.CompletableFuture$Timeout.run(CompletableFuture.java:2792)
        at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
        at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
        at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:304)
        ... 3 more
      Caused by: java.util.concurrent.CompletionException: org.apache.ignite.internal.placementdriver.PrimaryReplicaAwaitTimeoutException: IGN-PLACEMENTDRIVER-1 TraceId:a343b2a9-ffa6-4449-9c8b-be4fe52cd302 The primary replica await timed out [replicationGroupId=16_part_22, referenceTimestamp=HybridTimestamp [physical=2024-04-09 10:25:29:087 +0200, logical=3, composite=112240304395845635], currentLease=Lease [leaseholder=DropTableMultipleTriesThinTest_cluster_1, leaseholderId=64046ce6-d2e3-4751-8b57-add4baaa15a6, accepted=false, startTime=HybridTimestamp [physical=2024-04-09 10:25:28:017 +0200, logical=94, composite=112240304325722206], expirationTime=HybridTimestamp [physical=2024-04-09 10:27:28:017 +0200, logical=0, composite=112240312190042112], prolongable=false, replicationGroupId=16_part_22]]
        at java.base/java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:314)
        at java.base/java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:319)
        at java.base/java.util.concurrent.CompletableFuture.uniExceptionally(CompletableFuture.java:990)
        at java.base/java.util.concurrent.CompletableFuture$UniExceptionally.tryFire(CompletableFuture.java:970)
        ... 9 more
      Caused by: org.apache.ignite.internal.placementdriver.PrimaryReplicaAwaitTimeoutException: IGN-PLACEMENTDRIVER-1 TraceId:a343b2a9-ffa6-4449-9c8b-be4fe52cd302 The primary replica await timed out [replicationGroupId=16_part_22, referenceTimestamp=HybridTimestamp [physical=2024-04-09 10:25:29:087 +0200, logical=3, composite=112240304395845635], currentLease=Lease [leaseholder=DropTableMultipleTriesThinTest_cluster_1, leaseholderId=64046ce6-d2e3-4751-8b57-add4baaa15a6, accepted=false, startTime=HybridTimestamp [physical=2024-04-09 10:25:28:017 +0200, logical=94, composite=112240304325722206], expirationTime=HybridTimestamp [physical=2024-04-09 10:27:28:017 +0200, logical=0, composite=112240312190042112], prolongable=false, replicationGroupId=16_part_22]]
        at org.apache.ignite.internal.placementdriver.leases.LeaseTracker.lambda$awaitPrimaryReplica$5(LeaseTracker.java:280)
        at java.base/java.util.concurrent.CompletableFuture.uniExceptionally(CompletableFuture.java:986)
        ... 10 more
      Caused by: java.util.concurrent.TimeoutException
        ... 7 more 
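
The timestamps in the server log above can be compared directly. The sketch below uses only values copied from that log and computes three intervals: from the await reference timestamp to the logged warning, from the lease start to the await reference timestamp, and the lease duration. The class and method names are hypothetical. It makes no claim about which configured timeout is involved; note only that the lease in the exception is reported with accepted=false.

```java
import java.time.Duration;
import java.time.OffsetDateTime;
import java.time.format.DateTimeFormatter;

/** Hypothetical helper for comparing timestamps copied from the server log. */
final class LeaseTimeline {
    // Layout of the log timestamps, e.g. "2024-04-09 10:25:59:093 +0200".
    private static final DateTimeFormatter LOG_TS =
            DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss:SSS Z");

    /** Milliseconds elapsed between two log timestamps. */
    static long millisBetween(String from, String to) {
        return Duration.between(
                OffsetDateTime.parse(from, LOG_TS),
                OffsetDateTime.parse(to, LOG_TS)).toMillis();
    }

    public static void main(String[] args) {
        String leaseStart = "2024-04-09 10:25:28:017 +0200";
        String awaitTs = "2024-04-09 10:25:29:087 +0200";
        String warningLogged = "2024-04-09 10:25:59:093 +0200";
        String leaseExpiration = "2024-04-09 10:27:28:017 +0200";

        // Time the request waited before the failure was logged: 30006 ms.
        System.out.println(millisBetween(awaitTs, warningLogged));
        // Age of the not-yet-accepted lease when the await started: 1070 ms.
        System.out.println(millisBetween(leaseStart, awaitTs));
        // Lease duration from startTime to expirationTime: 120000 ms.
        System.out.println(millisBetween(leaseStart, leaseExpiration));
    }
}
```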

          People

            Assignee: Unassigned
            Reporter: Igor (lunigorn)