  1. Apache Ozone
  2. HDDS-3816 Erasure Coding
  3. HDDS-6341

EC: Fix the race condition in TestECBlockReconstructedStripeInputStream


Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • None
    • EC-Branch
    • None

    Description

      I see this failure in TestECBlockReconstructedStripeInputStream.testReadFullStripesWithPartial:

      java.lang.NullPointerException
      2022-02-16T17:54:53.5848386Z  at org.apache.hadoop.ozone.client.rpc.read.TestECBlockReconstructedStripeInputStream.testReadFullStripesWithPartial(TestECBlockReconstructedStripeInputStream.java:177)
      2022-02-16T17:54:53.5849272Z  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
      2022-02-16T17:54:53.5849806Z  at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
      2022-02-16T17:54:53.5850416Z  at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
      2022-02-16T17:54:53.5850938Z  at java.lang.reflect.Method.invoke(Method.java:498)
      2022-02-16T17:54:53.5851434Z  at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
      2022-02-16T17:54:53.5852010Z  at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
      2022-02-16T17:54:53.5852613Z  at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
      2022-02-16T17:54:53.5853224Z  at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
      2022-02-16T17:54:53.5853816Z  at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
      2022-02-16T17:54:53.5854322Z  at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
      2022-02-16T17:54:53.5854827Z  at org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100)
      2022-02-16T17:54:53.5855335Z  at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:366)
      2022-02-16T17:54:53.5855868Z  at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103)
      2022-02-16T17:54:53.5856439Z  at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63)
      2022-02-16T17:54:53.5856947Z  at org.junit.runners.ParentRunner$4.run(ParentRunner.java:331)
      2022-02-16T17:54:53.5857395Z  at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79)
      2022-02-16T17:54:53.5857876Z  at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329)
      2022-02-16T17:54:53.5858345Z  at org.junit.runners.ParentRunner.access$100(ParentRunner.java:66)
      2022-02-16T17:54:53.5858808Z  at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293)
      2022-02-16T17:54:53.5859261Z  at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
      2022-02-16T17:54:53.5859710Z  at org.junit.runners.ParentRunner.run(ParentRunner.java:413)
      2022-02-16T17:54:53.5860121Z  at org.junit.runner.JUnitCore.run(JUnitCore.java:137)
      2022-02-16T17:54:53.5860532Z  at org.junit.runner.JUnitCore.run(JUnitCore.java:115)
      2022-02-16T17:54:53.5861053Z  at org.junit.vintage.engine.execution.RunnerExecutor.execute(RunnerExecutor.java:43)
      2022-02-16T17:54:53.5861773Z  at java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:183)
      2022-02-16T17:54:53.5862251Z  at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
      2022-02-16T17:54:53.5862721Z  at java.util.Iterator.forEachRemaining(Iterator.java:116)
      2022-02-16T17:54:53.5863217Z  at java.util.Spliterators$IteratorSpliterator.forEachRemaining(Spliterators.java:1801)
      2022-02-16T17:54:53.5863752Z  at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:482)
      2022-02-16T17:54:53.5864295Z  at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:472)
      2022-02-16T17:54:53.5864822Z  at java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:150)
      2022-02-16T17:54:53.5865338Z  at java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:173)
      2022-02-16T17:54:53.5865940Z  at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
      2022-02-16T17:54:53.5866462Z  at java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:485)
      2022-02-16T17:54:53.5867034Z  at org.junit.vintage.engine.VintageTestEngine.executeAllChildren(VintageTestEngine.java:82)
      2022-02-16T17:54:53.5867641Z  at org.junit.vintage.engine.VintageTestEngine.execute(VintageTestEngine.java:73)
      2022-02-16T17:54:53.5868430Z  at org.junit.platform.launcher.core.EngineExecutionOrchestrator.execute(EngineExecutionOrchestrator.java:108)
      2022-02-16T17:54:53.5869848Z  at org.junit.platform.launcher.core.EngineExecutionOrchestrator.execute(EngineExecutionOrchestrator.java:88)
      2022-02-16T17:54:53.5870531Z  at org.junit.platform.launcher.core.EngineExecutionOrchestrator.lambda$execute$0(EngineExecutionOrchestrator.java:54)
      2022-02-16T17:54:53.5871303Z  at org.junit.platform.launcher.core.EngineExecutionOrchestrator.withInterceptedStreams(EngineExecutionOrchestrator.java:67)
      2022-02-16T17:54:53.5872059Z  at org.junit.platform.launcher.core.EngineExecutionOrchestrator.execute(EngineExecutionOrchestrator.java:52)
      2022-02-16T17:54:53.5872667Z  at org.junit.platform.launcher.core.DefaultLauncher.execute(DefaultLauncher.java:96)
      2022-02-16T17:54:53.5873183Z  at org.junit.platform.launcher.core.DefaultLauncher.execute(DefaultLauncher.java:75)
      2022-02-16T17:54:53.5873816Z  at org.apache.maven.surefire.junitplatform.JUnitPlatformProvider.invokeAllTests(JUnitPlatformProvider.java:142)
      2022-02-16T17:54:53.5874519Z  at org.apache.maven.surefire.junitplatform.JUnitPlatformProvider.invoke(JUnitPlatformProvider.java:113)
      2022-02-16T17:54:53.5875201Z  at org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:384)
      2022-02-16T17:54:53.5875820Z  at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:345)
      2022-02-16T17:54:53.5876356Z  at org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:126)
      2022-02-16T17:54:53.5876848Z  at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:418)

       

      I first saw this failure in CI. I then ran the test in a tight loop, and it failed a couple of times at random. It was quite hard to track down, but after adding a couple of debug traces I found that multiple threads were adding streams to the ArrayList in parallel. ArrayList is not thread-safe: when elements are added from multiple threads without synchronization, some elements can end up null. So, looking at the code where we add streams to the ArrayList, the only plausible explanation is a race condition that leaves null elements in the list.

      TestBlockInputStream stream = new TestBlockInputStream(
          blockInfo.getBlockID(), blockInfo.getLength(),
          blockStreamData.get(repInd - 1), repInd);
      if (failIndexes.contains(repInd)) {
        stream.setShouldError(true);
      }

      // blockStreams is an ArrayList; this add can run on several threads at once.
      blockStreams.add(stream);
      return stream;
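
      One possible way to avoid this kind of race is to make the list thread-safe, for example by wrapping it with Collections.synchronizedList. The following is a minimal, self-contained sketch under that assumption; it is not necessarily the change that was committed for this issue. The class name SynchronizedListSketch and the helper addConcurrently are hypothetical and exist only for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration class; not part of the Ozone code base.
public class SynchronizedListSketch {

  // Hypothetical helper: each of `threads` workers adds `perThread`
  // integers to `target`, all starting at roughly the same moment.
  static void addConcurrently(List<Integer> target, int threads, int perThread)
      throws InterruptedException {
    ExecutorService pool = Executors.newFixedThreadPool(threads);
    CountDownLatch start = new CountDownLatch(1);
    for (int t = 0; t < threads; t++) {
      final int base = t * perThread;
      pool.submit(() -> {
        try {
          start.await(); // hold every worker until the latch is released
          for (int i = 0; i < perThread; i++) {
            target.add(base + i);
          }
        } catch (InterruptedException e) {
          Thread.currentThread().interrupt();
        }
      });
    }
    start.countDown();
    pool.shutdown();
    if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
      throw new IllegalStateException("workers did not finish in time");
    }
  }

  public static void main(String[] args) throws InterruptedException {
    // The synchronized wrapper serializes add() calls, so no slot is skipped.
    List<Integer> safe = Collections.synchronizedList(new ArrayList<>());
    addConcurrently(safe, 8, 10_000);

    long nulls;
    synchronized (safe) { // iterating a synchronizedList needs manual locking
      nulls = safe.stream().filter(e -> e == null).count();
    }
    System.out.println("size=" + safe.size() + " nulls=" + nulls);
  }
}
```

      With the synchronized wrapper, the sketch prints size=80000 nulls=0. Passing a plain ArrayList to the same helper may lose elements, leave null entries, or fail inside a worker thread, which matches the intermittent nature of the failure described above.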
      

       

            People

              Assignee: Uma Maheswara Rao G (umamaheswararao)
              Reporter: Uma Maheswara Rao G (umamaheswararao)