Uploaded image for project: 'Ignite'
  1. Ignite
  2. IGNITE-15257

ITNodeTest.testNodeTaskOverload is flaky

    XMLWordPrintableJSON

Details

    • Bug
    • Status: Resolved
    • Blocker
    • Resolution: Fixed
    • 3.0.0-alpha2
    • 3.0.0-alpha5
    • None

    Description

      Test failed locally:

      Aug 05, 2021 10:02:29 AM org.apache.ignite.lang.IgniteLogger logInternalAug 05, 2021 10:02:29 AM org.apache.ignite.lang.IgniteLogger logInternalINFO: >>>>>>>>>>>>>>> Start test method: testNodeTaskOverload()Aug 05, 2021 10:02:29 AM org.apache.ignite.lang.IgniteLogger logInternalINFO: Connection created [address=/0:0:0:0:0:0:0:0:5003]Aug 05, 2021 10:02:29 AM org.apache.ignite.lang.IgniteLogger logInternalINFO: Node joined: ClusterNode [id=5f83a413-f3b0-406f-93fd-74ecc2ee1370, name=127.0.1.1:5003, address=127.0.1.1:5003]Aug 05, 2021 10:02:29 AM org.apache.ignite.lang.IgniteLogger logInternalINFO: Topology snapshot [nodes=1]  ^-- ClusterNode [id=5f83a413-f3b0-406f-93fd-74ecc2ee1370, name=127.0.1.1:5003, address=127.0.1.1:5003]Aug 05, 2021 10:02:30 AM org.apache.ignite.lang.IgniteLogger logInternalINFO: Starts FSMCaller successfully.Aug 05, 2021 10:02:30 AM org.apache.ignite.lang.IgniteLogger logInternalWARNING: No data for snapshot reader target/work/ITNodeTest/testNodeTaskOverload_1628146949951/snapshot.Aug 05, 2021 10:02:30 AM org.apache.ignite.lang.IgniteLogger logInternalINFO: Node <unittest/127.0.1.1:5003> init, term=0, lastLogId=LogId [index=0, term=0], conf=127.0.1.1:5003, oldConf=.Aug 05, 2021 10:02:30 AM org.apache.ignite.lang.IgniteLogger logInternalINFO: Node <unittest/127.0.1.1:5003> start vote and grant vote self, term=0.Aug 05, 2021 10:02:30 AM org.apache.ignite.lang.IgniteLogger logInternalINFO: Save raft meta, path=target/work/ITNodeTest/testNodeTaskOverload_1628146949951/meta, term=1, votedFor=127.0.1.1:5003, cost time=7 msAug 05, 2021 10:02:30 AM org.apache.ignite.lang.IgniteLogger logInternalINFO: Node <unittest/127.0.1.1:5003> become leader of group, term=1, conf=127.0.1.1:5003, oldConf=.Aug 05, 2021 10:02:30 AM org.apache.ignite.lang.IgniteLogger logInternalINFO: Start the RaftGroupService successfully <unittest/127.0.1.1:5003>Aug 05, 2021 10:02:30 AM org.apache.ignite.lang.IgniteLogger logInternalWARNING: Node <unittest/127.0.1.1:5003> applyQueue is overload.Status[EBUSY<1009>: Node is busy, has too many tasks.]Aug 05, 2021 10:02:30 AM org.apache.ignite.lang.IgniteLogger logInternalINFO: onLeaderStart: term=1.Aug 05, 2021 10:02:30 AM org.apache.ignite.lang.IgniteLogger logInternalSEVERE: Node <unittest/127.0.1.1:5003> append [2, 2] failed, status=Status[EIO<1014>: Corrupted LogStorage].Aug 05, 2021 10:02:30 AM org.apache.ignite.lang.IgniteLogger logInternalSEVERE: Encountered an error=Status[EBUSY<1009>: LogManager is busy, disk queue overload.] on StateMachine org.apache.ignite.raft.jraft.core.MockStateMachine, it's highly recommended to implement this method as raft stops working since some error occurs, you should figure out the cause and repair or remove this node.Error [type=ERROR_TYPE_LOG, status=Status[EBUSY<1009>: LogManager is busy, disk queue overload.]] at org.apache.ignite.raft.jraft.storage.impl.LogManagerImpl.reportError(LogManagerImpl.java:577)Status[EBUSY<1009>: Node is busy, has too many tasks.] at org.apache.ignite.raft.jraft.storage.impl.LogManagerImpl.appendEntries(LogManagerImpl.java:332) at org.apache.ignite.raft.jraft.core.NodeImpl.executeApplyingTasks(NodeImpl.java:1371) at org.apache.ignite.raft.jraft.core.NodeImpl$LogEntryAndClosureHandler.onEvent(NodeImpl.java:292) at org.apache.ignite.raft.jraft.core.NodeImpl$LogEntryAndClosureHandler.onEvent(NodeImpl.java:275) at org.apache.ignite.raft.jraft.disruptor.StripedDisruptor$StripeEntryHandler.onEvent(StripedDisruptor.java:214) at org.apache.ignite.raft.jraft.disruptor.StripedDisruptor$StripeEntryHandler.onEvent(StripedDisruptor.java:179) at com.lmax.disruptor.BatchEventProcessor.run(BatchEventProcessor.java:137) at java.base/java.lang.Thread.run(Thread.java:829)
      Aug 05, 2021 10:02:30 AM org.apache.ignite.lang.IgniteLogger logInternalWARNING: Node <unittest/127.0.1.1:5003> got error: Error [type=ERROR_TYPE_LOG, status=Status[EBUSY<1009>: LogManager is busy, disk queue overload.]].Aug 05, 2021 10:02:30 AM org.apache.ignite.lang.IgniteLogger warnWARNING: FSMCaller already in error status, ignore new errorError [type=ERROR_TYPE_LOG, status=Status[EBUSY<1009>: LogManager is busy, disk queue overload.]] at org.apache.ignite.raft.jraft.storage.impl.LogManagerImpl.reportError(LogManagerImpl.java:577) at org.apache.ignite.raft.jraft.storage.impl.LogManagerImpl.appendEntries(LogManagerImpl.java:332) at org.apache.ignite.raft.jraft.core.NodeImpl.executeApplyingTasks(NodeImpl.java:1371) at org.apache.ignite.raft.jraft.core.NodeImpl$LogEntryAndClosureHandler.onEvent(NodeImpl.java:292) at org.apache.ignite.raft.jraft.core.NodeImpl$LogEntryAndClosureHandler.onEvent(NodeImpl.java:275) at org.apache.ignite.raft.jraft.disruptor.StripedDisruptor$StripeEntryHandler.onEvent(StripedDisruptor.java:214) at org.apache.ignite.raft.jraft.disruptor.StripedDisruptor$StripeEntryHandler.onEvent(StripedDisruptor.java:179) at com.lmax.disruptor.BatchEventProcessor.run(BatchEventProcessor.java:137) at java.base/java.lang.Thread.run(Thread.java:829)
      Aug 05, 2021 10:02:30 AM org.apache.ignite.lang.IgniteLogger logInternalWARNING: Node <unittest/127.0.1.1:5003> applyQueue is overload.Aug 05, 2021 10:02:30 AM org.apache.ignite.lang.IgniteLogger logInternalINFO: Fail to find the next candidate.Aug 05, 2021 10:02:30 AM org.apache.ignite.lang.IgniteLogger logInternalStatus[EPERM<1008>: Leader stepped down]SEVERE: Node <unittest/127.0.1.1:5003> append [0, 0] failed, status=Status[EIO<1014>: Corrupted LogStorage].Status[EPERM<1008>: Leader stepped down]Status[EPERM<1008>: Leader stepped down]Status[EPERM<1008>: Leader stepped down]Aug 05, 2021 10:02:30 AM org.apache.ignite.lang.IgniteLogger logInternalINFO: onLeaderStop: status=Status[EBADNODE<10009>: Raft node(leader or candidate) is in error.].Aug 05, 2021 10:02:30 AM org.apache.ignite.lang.IgniteLogger logInternalSEVERE: Node <unittest/127.0.1.1:5003> append [0, 0] failed, status=Status[EIO<1014>: Corrupted LogStorage].Aug 05, 2021 10:02:30 AM org.apache.ignite.lang.IgniteLogger logInternalWARNING: Node <unittest/127.0.1.1:5003> applyQueue is overload.Status[EPERM<1008>: Is not leader.]Status[EBUSY<1009>: Node is busy, has too many tasks.]Status[EPERM<1008>: Is not leader.]Status[EPERM<1008>: Is not leader.]Aug 05, 2021 10:03:00 AM org.apache.ignite.lang.IgniteLogger logInternalINFO: Node <unittest/127.0.1.1:5003> shutdown, currTerm=1 state=STATE_ERROR.Aug 05, 2021 10:03:00 AM org.apache.ignite.lang.IgniteLogger logInternalINFO: Save raft meta, path=target/work/ITNodeTest/testNodeTaskOverload_1628146949951/meta, term=1, votedFor=127.0.1.1:5003, cost time=14 msAug 05, 2021 10:03:00 AM org.apache.ignite.lang.IgniteLogger logInternalINFO: Shutting down FSMCaller...Aug 05, 2021 10:03:00 AM org.apache.ignite.lang.IgniteLogger logInternalINFO: onShutdown.Aug 05, 2021 10:03:00 AM org.apache.ignite.lang.IgniteLogger logInternalINFO: ThreadPool is terminated: JRaft-Node-ScheduleThreadPool-null-, org.apache.ignite.raft.jraft.util.MetricScheduledThreadPoolExecutor@24f4621[Shutting down, pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 30].Aug 05, 2021 10:03:00 AM org.apache.ignite.lang.IgniteLogger logInternalINFO: Destroy timer: RepeatedTimer{timeout=null, stopped=true, running=false, destroyed=true, invoking=false, timeoutMs=1000, name='JRaft-ElectionTimer-null-'}.Aug 05, 2021 10:03:00 AM org.apache.ignite.lang.IgniteLogger logInternalINFO: Destroy timer: RepeatedTimer{timeout=null, stopped=true, running=false, destroyed=true, invoking=false, timeoutMs=1000, name='JRaft-VoteTimer-null-'}.Aug 05, 2021 10:03:00 AM org.apache.ignite.lang.IgniteLogger logInternalINFO: Destroy timer: RepeatedTimer{timeout=null, stopped=true, running=false, destroyed=true, invoking=false, timeoutMs=500, name='JRaft-StepDownTimer-null-'}.Aug 05, 2021 10:03:00 AM org.apache.ignite.lang.IgniteLogger logInternalINFO: Destroy timer: RepeatedTimer{timeout=null, stopped=true, running=false, destroyed=true, invoking=false, timeoutMs=3600000, name='JRaft-SnapshotTimer-null-'}.Aug 05, 2021 10:03:00 AM org.apache.ignite.lang.IgniteLogger logInternalINFO: DB destroyed, the db path is: target/work/ITNodeTest/testNodeTaskOverload_1628146949951/log.Aug 05, 2021 10:03:00 AM org.apache.ignite.lang.IgniteLogger logInternalINFO: Stop the RaftGroupService successfully.Aug 05, 2021 10:03:00 AM org.apache.ignite.lang.IgniteLogger logInternalINFO: Stopping 127.0.1.1:5003Aug 05, 2021 10:03:00 AM org.apache.ignite.lang.IgniteLogger logInternalINFO: Stopped 127.0.1.1:5003Aug 05, 2021 10:03:00 AM org.apache.ignite.lang.IgniteLogger logInternalINFO: >>>>>>>>>>>>>>> End test method: testNodeTaskOverload(), cost:30185 ms.Aug 05, 2021 10:03:00 AM org.apache.ignite.lang.IgniteLogger logInternalINFO: >>>>>>>>>>>>>>> Start test method: testRecoverFollower()
      java.util.concurrent.TimeoutException: joined timeout
       at org.apache.ignite.raft.jraft.closure.JoinableClosure.join(JoinableClosure.java:50) at org.apache.ignite.raft.jraft.entity.Task.join(Task.java:121) at org.apache.ignite.raft.jraft.entity.Task.joinAll(Task.java:154) at org.apache.ignite.raft.jraft.core.ITNodeTest.testNodeTaskOverload(ITNodeTest.java:256) at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.base/java.lang.reflect.Method.invoke(Method.java:566) at org.junit.platform.commons.util.ReflectionUtils.invokeMethod(ReflectionUtils.java:688) at org.junit.jupiter.engine.execution.MethodInvocation.proceed(MethodInvocation.java:60) at org.junit.jupiter.engine.execution.InvocationInterceptorChain$ValidatingInvocation.proceed(InvocationInterceptorChain.java:131) at org.junit.jupiter.engine.extension.TimeoutExtension.intercept(TimeoutExtension.java:149) at org.junit.jupiter.engine.extension.TimeoutExtension.interceptTestableMethod(TimeoutExtension.java:140) at org.junit.jupiter.engine.extension.TimeoutExtension.interceptTestMethod(TimeoutExtension.java:84) at org.junit.jupiter.engine.execution.ExecutableInvoker$ReflectiveInterceptorCall.lambda$ofVoidMethod$0(ExecutableInvoker.java:115) at org.junit.jupiter.engine.execution.ExecutableInvoker.lambda$invoke$0(ExecutableInvoker.java:105) at org.junit.jupiter.engine.execution.InvocationInterceptorChain$InterceptedInvocation.proceed(InvocationInterceptorChain.java:106) at org.junit.jupiter.engine.execution.InvocationInterceptorChain.proceed(InvocationInterceptorChain.java:64) at org.junit.jupiter.engine.execution.InvocationInterceptorChain.chainAndInvoke(InvocationInterceptorChain.java:45) at org.junit.jupiter.engine.execution.InvocationInterceptorChain.invoke(InvocationInterceptorChain.java:37) at org.junit.jupiter.engine.execution.ExecutableInvoker.invoke(ExecutableInvoker.java:104) at org.junit.jupiter.engine.execution.ExecutableInvoker.invoke(ExecutableInvoker.java:98) at org.junit.jupiter.engine.descriptor.TestMethodTestDescriptor.lambda$invokeTestMethod$6(TestMethodTestDescriptor.java:210) at org.junit.platform.engine.support.hierarchical.ThrowableCollector.execute(ThrowableCollector.java:73) at org.junit.jupiter.engine.descriptor.TestMethodTestDescriptor.invokeTestMethod(TestMethodTestDescriptor.java:206) at org.junit.jupiter.engine.descriptor.TestMethodTestDescriptor.execute(TestMethodTestDescriptor.java:131) at org.junit.jupiter.engine.descriptor.TestMethodTestDescriptor.execute(TestMethodTestDescriptor.java:65) at org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$5(NodeTestTask.java:139) at org.junit.platform.engine.support.hierarchical.ThrowableCollector.execute(ThrowableCollector.java:73) at org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$7(NodeTestTask.java:129) at org.junit.platform.engine.support.hierarchical.Node.around(Node.java:137) at org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$8(NodeTestTask.java:127) at org.junit.platform.engine.support.hierarchical.ThrowableCollector.execute(ThrowableCollector.java:73) at org.junit.platform.engine.support.hierarchical.NodeTestTask.executeRecursively(NodeTestTask.java:126) at org.junit.platform.engine.support.hierarchical.NodeTestTask.execute(NodeTestTask.java:84) at java.base/java.util.ArrayList.forEach(ArrayList.java:1541) at org.junit.platform.engine.support.hierarchical.SameThreadHierarchicalTestExecutorService.invokeAll(SameThreadHierarchicalTestExecutorService.java:38) at org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$5(NodeTestTask.java:143) at org.junit.platform.engine.support.hierarchical.ThrowableCollector.execute(ThrowableCollector.java:73) at org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$7(NodeTestTask.java:129) at org.junit.platform.engine.support.hierarchical.Node.around(Node.java:137) at org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$8(NodeTestTask.java:127) at org.junit.platform.engine.support.hierarchical.ThrowableCollector.execute(ThrowableCollector.java:73) at org.junit.platform.engine.support.hierarchical.NodeTestTask.executeRecursively(NodeTestTask.java:126) at org.junit.platform.engine.support.hierarchical.NodeTestTask.execute(NodeTestTask.java:84) at java.base/java.util.ArrayList.forEach(ArrayList.java:1541) at org.junit.platform.engine.support.hierarchical.SameThreadHierarchicalTestExecutorService.invokeAll(SameThreadHierarchicalTestExecutorService.java:38) at org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$5(NodeTestTask.java:143) at org.junit.platform.engine.support.hierarchical.ThrowableCollector.execute(ThrowableCollector.java:73) at org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$7(NodeTestTask.java:129) at org.junit.platform.engine.support.hierarchical.Node.around(Node.java:137) at org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$8(NodeTestTask.java:127) at org.junit.platform.engine.support.hierarchical.ThrowableCollector.execute(ThrowableCollector.java:73) at org.junit.platform.engine.support.hierarchical.NodeTestTask.executeRecursively(NodeTestTask.java:126) at org.junit.platform.engine.support.hierarchical.NodeTestTask.execute(NodeTestTask.java:84) at org.junit.platform.engine.support.hierarchical.SameThreadHierarchicalTestExecutorService.submit(SameThreadHierarchicalTestExecutorService.java:32) at org.junit.platform.engine.support.hierarchical.HierarchicalTestExecutor.execute(HierarchicalTestExecutor.java:57) at org.junit.platform.engine.support.hierarchical.HierarchicalTestEngine.execute(HierarchicalTestEngine.java:51) at org.junit.platform.launcher.core.EngineExecutionOrchestrator.execute(EngineExecutionOrchestrator.java:108) at org.junit.platform.launcher.core.EngineExecutionOrchestrator.execute(EngineExecutionOrchestrator.java:88) at org.junit.platform.launcher.core.EngineExecutionOrchestrator.lambda$execute$0(EngineExecutionOrchestrator.java:54) at org.junit.platform.launcher.core.EngineExecutionOrchestrator.withInterceptedStreams(EngineExecutionOrchestrator.java:67) at org.junit.platform.launcher.core.EngineExecutionOrchestrator.execute(EngineExecutionOrchestrator.java:52) at org.junit.platform.launcher.core.DefaultLauncher.execute(DefaultLauncher.java:96) at org.junit.platform.launcher.core.DefaultLauncher.execute(DefaultLauncher.java:75) at com.intellij.junit5.JUnit5IdeaTestRunner.startRunnerWithArgs(JUnit5IdeaTestRunner.java:71) at com.intellij.rt.junit.IdeaTestRunner$Repeater.startRunnerWithArgs(IdeaTestRunner.java:33) at com.intellij.rt.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:221) at com.intellij.rt.junit.JUnitStarter.main(JUnitStarter.java:54)
      

      Attachments

        Issue Links

          Activity

            People

              v.pyatkov Vladislav Pyatkov
              ibessonov Ivan Bessonov
              Votes:
              0 Vote for this issue
              Watchers:
              3 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Time Tracking

                  Estimated:
                  Original Estimate - Not Specified
                  Not Specified
                  Remaining:
                  Remaining Estimate - 0h
                  0h
                  Logged:
                  Time Spent - 10m
                  10m