  1. Hadoop Distributed Data Store
  2. HDDS-2203

Race condition in ByteStringHelper.init()


    Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical
    • Resolution: Duplicate
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: Ozone Client, SCM
    • Labels:
      None

      Description

      The current init method:

      public static void init(boolean isUnsafeByteOperation) {
        final boolean set = INITIALIZED.compareAndSet(false, true);
        if (set) {
          ByteStringHelper.isUnsafeByteOperationsEnabled =
              isUnsafeByteOperation;
        } else {
          // already initialized, check values
          Preconditions.checkState(isUnsafeByteOperationsEnabled
              == isUnsafeByteOperation);
        }
      }
      

      If two threads call this method concurrently and their execution interleaves as follows, the second thread runs into an exception from Preconditions.checkState() in the else branch.

      Starting from an uninitialized state:

      • T1 enters the method with true as the parameter; the class has initialized isUnsafeByteOperationsEnabled to false
      • T1 sets INITIALIZED to true
      • T2 enters the method with true as the parameter
      • T2 reads the INITIALIZED value and, as it is no longer false, goes to the else branch
      • T2 checks whether the internal boolean property equals the true it wanted to set; as T1 has not yet assigned the value, checkState throws an IllegalStateException.
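
      The window between setting the flag and assigning the value can be closed by publishing both in a single atomic step. The following is a minimal sketch under that assumption, not the fix that was applied to the project: the class name SafeInitSketch, the field UNSAFE_ENABLED and the accessor isUnsafeEnabled() are hypothetical, and a plain IllegalStateException stands in for Preconditions.checkState() so the example does not depend on Guava.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical stand-in for ByteStringHelper; illustrative only.
final class SafeInitSketch {
  // null means "not initialized yet". The initialized marker and the
  // configured value live in one reference, so they become visible together.
  private static final AtomicReference<Boolean> UNSAFE_ENABLED =
      new AtomicReference<>();

  private SafeInitSketch() { }

  static void init(boolean isUnsafeByteOperation) {
    // Only the first caller succeeds in replacing null with its value.
    if (UNSAFE_ENABLED.compareAndSet(null,
        Boolean.valueOf(isUnsafeByteOperation))) {
      return;
    }
    // Later callers always observe a fully assigned value here.
    boolean current = UNSAFE_ENABLED.get();
    if (current != isUnsafeByteOperation) {
      throw new IllegalStateException(
          "Already initialized with " + current);
    }
  }

  static boolean isUnsafeEnabled() {
    Boolean value = UNSAFE_ENABLED.get();
    return value != null && value;
  }
}
```

      With this shape, a second call such as SafeInitSketch.init(true) after an earlier init(true) returns quietly regardless of thread timing, while a call with a conflicting value still fails with an IllegalStateException. Declaring the original method synchronized would be another way to get the same guarantee.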

      This race shows up in certain Hive query cases, which is where it was found during testing. The exception we see there is the following:

      Error: Error while processing statement: FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.tez.TezTask. Vertex failed, vertexName=Map 2, vertexId=vertex_1569486223160_0334_1_02, diagnostics=[Vertex vertex_1569486223160_0334_1_02 [Map 2] killed/failed
       due to:ROOT_INPUT_INIT_FAILURE, Vertex Input: item initializer failed, vertex=vertex_1569486223160_0334_1_02 [Map 2], java.io.IOException: Couldn't create RpcClient protocol
          at org.apache.hadoop.ozone.client.OzoneClientFactory.getClientProtocol(OzoneClientFactory.java:263)
          at org.apache.hadoop.ozone.client.OzoneClientFactory.getClientProtocol(OzoneClientFactory.java:239)
          at org.apache.hadoop.ozone.client.OzoneClientFactory.getRpcClient(OzoneClientFactory.java:203)
          at org.apache.hadoop.ozone.client.OzoneClientFactory.getRpcClient(OzoneClientFactory.java:165)
          at org.apache.hadoop.fs.ozone.BasicOzoneClientAdapterImpl.<init>(BasicOzoneClientAdapterImpl.java:158)
          at org.apache.hadoop.fs.ozone.OzoneClientAdapterImpl.<init>(OzoneClientAdapterImpl.java:50)
          at org.apache.hadoop.fs.ozone.OzoneFileSystem.createAdapter(OzoneFileSystem.java:102)
          at org.apache.hadoop.fs.ozone.BasicOzoneFileSystem.initialize(BasicOzoneFileSystem.java:155)
          at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:3315)
          at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:136)
          at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:3364)
          at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:3332)
          at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:491)
          at org.apache.hadoop.fs.Path.getFileSystem(Path.java:361)
          at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.generateSplitsInfo(OrcInputFormat.java:1821)
          at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.getSplits(OrcInputFormat.java:2002)
          at org.apache.hadoop.hive.ql.io.HiveInputFormat.addSplitsForGroup(HiveInputFormat.java:524)
          at org.apache.hadoop.hive.ql.io.HiveInputFormat.getSplits(HiveInputFormat.java:781)
          at org.apache.hadoop.hive.ql.exec.tez.HiveSplitGenerator.initialize(HiveSplitGenerator.java:243)
          at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable$1.run(RootInputInitializerManager.java:278)
          at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable$1.run(RootInputInitializerManager.java:269)
          at java.security.AccessController.doPrivileged(Native Method)
          at javax.security.auth.Subject.doAs(Subject.java:422)
          at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1876)
          at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable.call(RootInputInitializerManager.java:269)
          at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable.call(RootInputInitializerManager.java:253)
          at com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:108)
          at com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:41)
          at com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:77)
          at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
          at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
          at java.lang.Thread.run(Thread.java:748)
      Caused by: java.lang.IllegalStateException
          at org.apache.hadoop.ozone.shaded.com.google.common.base.Preconditions.checkState(Preconditions.java:129)
          at org.apache.hadoop.hdds.scm.ByteStringHelper.init(ByteStringHelper.java:47)
          at org.apache.hadoop.ozone.client.rpc.RpcClient.<init>(RpcClient.java:241)
          at org.apache.hadoop.ozone.client.OzoneClientFactory.getClientProtocol(OzoneClientFactory.java:256)
          ... 31 more
      

              People

              • Assignee: Istvan Fajth (pifta)
              • Reporter: Istvan Fajth (pifta)
              • Votes: 0
              • Watchers: 4

                  Time Tracking

                  • Original Estimate: Not Specified
                  • Remaining Estimate: 0h
                  • Time Spent: 1h 40m