Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.8.0, 2.7.4, 3.0.0-alpha1
    • Component/s: hdfs
    • Labels: None
    • Target Version/s:
    • Hadoop Flags: Reviewed

      Description

      DFSUtil.byteArray2PathString generates excessive object allocations:

      1. each byte array component is encoded to a string (copy)
      2. the string is appended to a builder, which extracts the chars from the intermediate string (copy) and adds them to its own char array
      3. the builder's char array is re-allocated once it outgrows its 16-char default capacity (copy)
      4. the builder's toString creates yet another string (copy)

      Instead of allocating all these objects and performing multiple byte/char encoding/decoding conversions, the byte array can be built in place with a single final conversion to a string, as sketched below.
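
      A minimal sketch of the in-place approach, assuming a plain byte[][] of path components. This is illustrative only, not the committed HDFS-10656 patch; the real DFSUtil method also takes offset/length arguments and handles root and empty-path edge cases.

      import java.nio.charset.StandardCharsets;

      class PathStringSketch {
        /** Join path components with '/' using one byte array and one final decode. */
        static String byteArray2PathString(byte[][] components) {
          if (components.length == 0) {
            return "";
          }
          // Pre-size the target: one '/' between each pair of components plus their bytes.
          int length = components.length - 1;
          for (byte[] component : components) {
            length += component.length;
          }
          byte[] path = new byte[length];
          int pos = 0;
          for (int i = 0; i < components.length; i++) {
            if (i > 0) {
              path[pos++] = (byte) '/'; // separator written directly into the target array
            }
            System.arraycopy(components[i], 0, path, pos, components[i].length);
            pos += components[i].length;
          }
          // The byte-to-char conversion happens exactly once, at the very end.
          return new String(path, StandardCharsets.UTF_8);
        }
      }

      With components { "", "foo", "bar" } (the leading empty component being the root), this returns "/foo/bar" and allocates only the target byte array and the final String, rather than one intermediate String per component plus the builder's growing char array.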

        Issue Links

          Activity

          daryn Daryn Sharp added a comment -

          overlaps

          hadoopqa Hadoop QA added a comment -
          -1 overall



          Vote Subsystem Runtime Comment
          0 reexec 0m 30s Docker mode activated.
          +1 @author 0m 0s The patch does not contain any @author tags.
          -1 test4tests 0m 0s The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.
          +1 mvninstall 8m 27s trunk passed
          +1 compile 0m 51s trunk passed
          +1 checkstyle 0m 29s trunk passed
          +1 mvnsite 0m 59s trunk passed
          +1 mvneclipse 0m 14s trunk passed
          +1 findbugs 2m 11s trunk passed
          +1 javadoc 0m 59s trunk passed
          +1 mvninstall 0m 56s the patch passed
          +1 compile 0m 49s the patch passed
          +1 javac 0m 49s the patch passed
          +1 checkstyle 0m 26s the patch passed
          +1 mvnsite 0m 56s the patch passed
          +1 mvneclipse 0m 10s the patch passed
          +1 whitespace 0m 0s The patch has no whitespace issues.
          +1 findbugs 2m 20s the patch passed
          +1 javadoc 0m 57s the patch passed
          +1 unit 62m 39s hadoop-hdfs in the patch passed.
          +1 asflicense 0m 21s The patch does not generate ASF License warnings.
          85m 39s



          Subsystem Report/Notes
          Docker Image yetus/hadoop:9560f25
          JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12819146/HDFS-10656.patch
          JIRA Issue HDFS-10656
          Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle
          uname Linux ccd3bc7fa258 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
          Build tool maven
          Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh
          git revision trunk / 9f473cf
          Default Java 1.8.0_101
          findbugs v3.0.0
          Test Results https://builds.apache.org/job/PreCommit-HDFS-Build/16278/testReport/
          modules C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs
          Console output https://builds.apache.org/job/PreCommit-HDFS-Build/16278/console
          Powered by Apache Yetus 0.4.0-SNAPSHOT http://yetus.apache.org

          This message was automatically generated.

          kihwal Kihwal Lee added a comment -

          It looks like existing test cases (e.g. TestPathComponents) are sufficient.
          +1

          kihwal Kihwal Lee added a comment -

          Committed this to trunk and branch-2.

          hudson Hudson added a comment -

          SUCCESS: Integrated in Hadoop-trunk-Commit #10202 (See https://builds.apache.org/job/Hadoop-trunk-Commit/10202/)
          HDFS-10656. Optimize conversion of byte arrays back to path string. (kihwal: rev bebf10d2455cad1aa8985553417d4d74a61150ee)

          • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSUtil.java
          zhz Zhe Zhang added a comment -

          I just committed the change to branch-2.8 and branch-2.7.

          xiaochen Xiao Chen added a comment - edited

          Thank you for the nice optimization, Daryn Sharp!

          Sorry for posting my late questions here:

          • It seems the new range check doesn't cover the scenario where length < 0, so length < 0 && offset + length >= 0 would be accepted as valid input. Should we worry about this? Edit: sorry, false alarm, my bad.
          • It seems the old behavior was to always return "" if the byte array is 0-length, without any input validation on offset/length, whereas the new behavior throws IndexOutOfBoundsException from the precondition check (illustrated in the sketch below).
            I'm only asking from a code review perspective, and I guess this is okay since DFSUtil is private? Not sure why the old behavior was like that.
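
          For readers following the thread, a hypothetical sketch of the behavioral difference discussed above. It assumes a Guava-style range check (Preconditions.checkPositionIndexes); the class and variable names are made up for illustration and are not taken from the actual patch.

          import com.google.common.base.Preconditions;

          public class RangeCheckDemo {
            public static void main(String[] args) {
              byte[][] empty = new byte[0][];
              int offset = 0, length = 1;

              // Old behavior (as described above): a zero-length component array
              // short-circuited to "" before offset/length were ever inspected.

              // New behavior: validating the requested [offset, offset + length)
              // range up front fails fast with IndexOutOfBoundsException, even
              // though the component array is empty.
              Preconditions.checkPositionIndexes(offset, offset + length, empty.length);
            }
          }

          Running this throws IndexOutOfBoundsException, matching the new fail-fast behavior described in the comment above.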

            People

            • Assignee: daryn Daryn Sharp
            • Reporter: daryn Daryn Sharp
            • Votes: 0
            • Watchers: 8

              Dates

               • Created:
               • Updated:
               • Resolved:

                Development