Uploaded image for project: 'Hadoop HDFS'
  1. Hadoop HDFS
  2. HDFS-13168

XmlImageVisitor - Prefer Array over LinkedList

Details

    • Improvement
    • Status: Resolved
    • Minor
    • Resolution: Fixed
    • 3.0.0
    • 3.1.0
    • hdfs
    • None

    Description

      ArrayDeque

      This class is likely to be faster than Stack when used as a stack, and faster than LinkedList when used as a queue.

      .. not to mention less memory fragmentation (single backing array v.s. many ArrayList nodes).

      https://docs.oracle.com/javase/8/docs/api/java/util/ArrayDeque.html

      Attachments

        1. HDFS-13168.2.patch
          3 kB
          David Mollitor
        2. HDFS-13168.1.patch
          2 kB
          David Mollitor

        Activity

          genericqa genericqa added a comment -
          -1 overall



          Vote Subsystem Runtime Comment
          0 reexec 0m 18s Docker mode activated.
                Prechecks
          +1 @author 0m 0s The patch does not contain any @author tags.
          -1 test4tests 0m 0s The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.
                trunk Compile Tests
          +1 mvninstall 17m 9s trunk passed
          +1 compile 0m 56s trunk passed
          +1 checkstyle 0m 39s trunk passed
          +1 mvnsite 1m 3s trunk passed
          +1 shadedclient 11m 50s branch has no errors when building and testing our client artifacts.
          +1 findbugs 1m 56s trunk passed
          +1 javadoc 0m 53s trunk passed
                Patch Compile Tests
          +1 mvninstall 0m 59s the patch passed
          +1 compile 0m 53s the patch passed
          +1 javac 0m 53s the patch passed
          +1 checkstyle 0m 35s hadoop-hdfs-project/hadoop-hdfs: The patch generated 0 new + 0 unchanged - 1 fixed = 0 total (was 1)
          +1 mvnsite 0m 57s the patch passed
          +1 whitespace 0m 0s The patch has no whitespace issues.
          +1 shadedclient 11m 10s patch has no errors when building and testing our client artifacts.
          +1 findbugs 2m 2s the patch passed
          +1 javadoc 0m 54s the patch passed
                Other Tests
          -1 unit 91m 4s hadoop-hdfs in the patch failed.
          +1 asflicense 0m 24s The patch does not generate ASF License warnings.
          143m 36s



          Reason Tests
          Failed junit tests hadoop.hdfs.server.namenode.TestTruncateQuotaUpdate
            hadoop.hdfs.web.TestWebHDFS



          Subsystem Report/Notes
          Docker Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:5b98639
          JIRA Issue HDFS-13168
          JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12911078/HDFS-13168.1.patch
          Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle
          uname Linux 52635e2aa02a 3.13.0-139-generic #188-Ubuntu SMP Tue Jan 9 14:43:09 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
          Build tool maven
          Personality /testptch/patchprocess/precommit/personality/provided.sh
          git revision trunk / 4d4dde5
          maven version: Apache Maven 3.3.9
          Default Java 1.8.0_151
          findbugs v3.1.0-RC1
          unit https://builds.apache.org/job/PreCommit-HDFS-Build/23114/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
          Test Results https://builds.apache.org/job/PreCommit-HDFS-Build/23114/testReport/
          Max. process+thread count 4071 (vs. ulimit of 5500)
          modules C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs
          Console output https://builds.apache.org/job/PreCommit-HDFS-Build/23114/console
          Powered by Apache Yetus 0.8.0-SNAPSHOT http://yetus.apache.org

          This message was automatically generated.

          genericqa genericqa added a comment - -1 overall Vote Subsystem Runtime Comment 0 reexec 0m 18s Docker mode activated.       Prechecks +1 @author 0m 0s The patch does not contain any @author tags. -1 test4tests 0m 0s The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.       trunk Compile Tests +1 mvninstall 17m 9s trunk passed +1 compile 0m 56s trunk passed +1 checkstyle 0m 39s trunk passed +1 mvnsite 1m 3s trunk passed +1 shadedclient 11m 50s branch has no errors when building and testing our client artifacts. +1 findbugs 1m 56s trunk passed +1 javadoc 0m 53s trunk passed       Patch Compile Tests +1 mvninstall 0m 59s the patch passed +1 compile 0m 53s the patch passed +1 javac 0m 53s the patch passed +1 checkstyle 0m 35s hadoop-hdfs-project/hadoop-hdfs: The patch generated 0 new + 0 unchanged - 1 fixed = 0 total (was 1) +1 mvnsite 0m 57s the patch passed +1 whitespace 0m 0s The patch has no whitespace issues. +1 shadedclient 11m 10s patch has no errors when building and testing our client artifacts. +1 findbugs 2m 2s the patch passed +1 javadoc 0m 54s the patch passed       Other Tests -1 unit 91m 4s hadoop-hdfs in the patch failed. +1 asflicense 0m 24s The patch does not generate ASF License warnings. 143m 36s Reason Tests Failed junit tests hadoop.hdfs.server.namenode.TestTruncateQuotaUpdate   hadoop.hdfs.web.TestWebHDFS Subsystem Report/Notes Docker Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:5b98639 JIRA Issue HDFS-13168 JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12911078/HDFS-13168.1.patch Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle uname Linux 52635e2aa02a 3.13.0-139-generic #188-Ubuntu SMP Tue Jan 9 14:43:09 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux Build tool maven Personality /testptch/patchprocess/precommit/personality/provided.sh git revision trunk / 4d4dde5 maven version: Apache Maven 3.3.9 Default Java 1.8.0_151 findbugs v3.1.0-RC1 unit https://builds.apache.org/job/PreCommit-HDFS-Build/23114/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt Test Results https://builds.apache.org/job/PreCommit-HDFS-Build/23114/testReport/ Max. process+thread count 4071 (vs. ulimit of 5500) modules C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs Console output https://builds.apache.org/job/PreCommit-HDFS-Build/23114/console Powered by Apache Yetus 0.8.0-SNAPSHOT http://yetus.apache.org This message was automatically generated.
          elgoiri Íñigo Goiri added a comment - - edited

          belugabehr, do you have any approximate performance improvements after using ArrayDequeue?
          I would go for it anyway but just for completeness.

          A minor comment on HDFS-13168.1.patch, we could use the implicit type and do new ArrayDeque<>();.
          The failed unit tests seem unrelated to me.
          The related test would be TestCheckpoint and it seems to pass; not sure it tests much though.

          elgoiri Íñigo Goiri added a comment - - edited belugabehr , do you have any approximate performance improvements after using ArrayDequeue ? I would go for it anyway but just for completeness. A minor comment on HDFS-13168.1.patch , we could use the implicit type and do new ArrayDeque<>(); . The failed unit tests seem unrelated to me. The related test would be TestCheckpoint and it seems to pass ; not sure it tests much though.
          belugabehr David Mollitor added a comment - - edited

          Tested a 4K FSImage file

          ## Current Implementation
          
          Result "org.apache.hadoop.hdfs.tools.offlineImageViewer.TestSpeed.testSpeed":
            1737.194 ±(99.9%) 19.775 ops/s [Average]
            (min, avg, max) = (1533.113, 1737.194, 1967.332), stdev = 83.729
            CI (99.9%): [1717.419, 1756.969] (assumes normal distribution)
          
          
          # Run complete. Total time: 00:06:45
          
          Benchmark                                Mode  Cnt     Score    Error  Units
          offlineImageViewer.TestSpeed.testSpeed  thrpt  200  1737.194 ± 19.775  ops/s
          
          
          ## Patch Impelementation
          
          Result "org.apache.hadoop.hdfs.tools.offlineImageViewer.TestSpeed.testSpeed":
            1781.301 ±(99.9%) 21.108 ops/s [Average]
            (min, avg, max) = (1407.945, 1781.301, 2020.971), stdev = 89.372
            CI (99.9%): [1760.193, 1802.409] (assumes normal distribution)
          
          
          # Run complete. Total time: 00:06:45
          
          Benchmark                                Mode  Cnt     Score    Error  Units
          offlineImageViewer.TestSpeed.testSpeed  thrpt  200  1781.301 ± 21.108  ops/s
          

          ... and memory fragmentation is not really counted here of the LinkedList because of large heap size

          belugabehr David Mollitor added a comment - - edited Tested a 4K FSImage file ## Current Implementation Result "org.apache.hadoop.hdfs.tools.offlineImageViewer.TestSpeed.testSpeed" : 1737.194 ±(99.9%) 19.775 ops/s [Average] (min, avg, max) = (1533.113, 1737.194, 1967.332), stdev = 83.729 CI (99.9%): [1717.419, 1756.969] (assumes normal distribution) # Run complete. Total time: 00:06:45 Benchmark Mode Cnt Score Error Units offlineImageViewer.TestSpeed.testSpeed thrpt 200 1737.194 ± 19.775 ops/s ## Patch Impelementation Result "org.apache.hadoop.hdfs.tools.offlineImageViewer.TestSpeed.testSpeed" : 1781.301 ±(99.9%) 21.108 ops/s [Average] (min, avg, max) = (1407.945, 1781.301, 2020.971), stdev = 89.372 CI (99.9%): [1760.193, 1802.409] (assumes normal distribution) # Run complete. Total time: 00:06:45 Benchmark Mode Cnt Score Error Units offlineImageViewer.TestSpeed.testSpeed thrpt 200 1781.301 ± 21.108 ops/s ... and memory fragmentation is not really counted here of the LinkedList because of large heap size

          Thanks belugabehr for the results, the performance improvements are nothing spectacular but as I said, I think it makes sense.
          The memory fragmentation is obviously there now, so that's another plus.

          HDFS-13168.2.patch now keeps element.toString(), is this intentional? I think what you had in HDFS-13168.1.patch made more sense there.

          elgoiri Íñigo Goiri added a comment - Thanks belugabehr for the results, the performance improvements are nothing spectacular but as I said, I think it makes sense. The memory fragmentation is obviously there now, so that's another plus. HDFS-13168.2.patch now keeps element.toString() , is this intentional? I think what you had in HDFS-13168.1.patch made more sense there.
          belugabehr David Mollitor added a comment -

          elgoiri It was intentional.

           

          Because the following method uses it and there's no way around it, so I wanted to keep it consistent.

          @Override
          void visit(ImageElement element, String value) throws IOException {
             writeTag(element.toString(), value);
          }
          belugabehr David Mollitor added a comment - elgoiri It was intentional.   Because the following method uses it and there's no way around it, so I wanted to keep it consistent. @Override void visit(ImageElement element, String value) throws IOException { writeTag(element.toString(), value); }

          Keeping it as is for consistency makes sense.
          Given we are not changing those toString(), not sure if there are many advantages of using char instead of String (for example '<' ).
          I would avoid making those changes just to keep the patch constraint (assuming my char vs single char String is correct).

          elgoiri Íñigo Goiri added a comment - Keeping it as is for consistency makes sense. Given we are not changing those toString() , not sure if there are many advantages of using char instead of String (for example '<' ). I would avoid making those changes just to keep the patch constraint (assuming my char vs single char String is correct).
          genericqa genericqa added a comment -
          -1 overall



          Vote Subsystem Runtime Comment
          0 reexec 0m 24s Docker mode activated.
                Prechecks
          +1 @author 0m 0s The patch does not contain any @author tags.
          -1 test4tests 0m 0s The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.
                trunk Compile Tests
          +1 mvninstall 15m 34s trunk passed
          +1 compile 0m 51s trunk passed
          +1 checkstyle 0m 37s trunk passed
          +1 mvnsite 0m 58s trunk passed
          +1 shadedclient 11m 1s branch has no errors when building and testing our client artifacts.
          +1 findbugs 1m 47s trunk passed
          +1 javadoc 0m 53s trunk passed
                Patch Compile Tests
          +1 mvninstall 0m 53s the patch passed
          +1 compile 0m 49s the patch passed
          +1 javac 0m 49s the patch passed
          +1 checkstyle 0m 34s hadoop-hdfs-project/hadoop-hdfs: The patch generated 0 new + 0 unchanged - 1 fixed = 0 total (was 1)
          +1 mvnsite 0m 53s the patch passed
          +1 whitespace 0m 0s The patch has no whitespace issues.
          +1 shadedclient 10m 26s patch has no errors when building and testing our client artifacts.
          +1 findbugs 1m 52s the patch passed
          +1 javadoc 0m 54s the patch passed
                Other Tests
          -1 unit 157m 19s hadoop-hdfs in the patch failed.
          +1 asflicense 0m 21s The patch does not generate ASF License warnings.
          206m 3s



          Reason Tests
          Failed junit tests hadoop.hdfs.server.namenode.TestTruncateQuotaUpdate
            hadoop.hdfs.server.namenode.ha.TestRetryCacheWithHA
            hadoop.hdfs.web.TestWebHdfsTimeouts
            hadoop.hdfs.TestDFSStripedOutputStreamWithFailure
            hadoop.hdfs.server.blockmanagement.TestBlockStatsMXBean



          Subsystem Report/Notes
          Docker Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:5b98639
          JIRA Issue HDFS-13168
          JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12911139/HDFS-13168.2.patch
          Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle
          uname Linux 993a55cfba1c 4.4.0-64-generic #85-Ubuntu SMP Mon Feb 20 11:50:30 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
          Build tool maven
          Personality /testptch/patchprocess/precommit/personality/provided.sh
          git revision trunk / b9a429b
          maven version: Apache Maven 3.3.9
          Default Java 1.8.0_151
          findbugs v3.1.0-RC1
          unit https://builds.apache.org/job/PreCommit-HDFS-Build/23123/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
          Test Results https://builds.apache.org/job/PreCommit-HDFS-Build/23123/testReport/
          Max. process+thread count 3366 (vs. ulimit of 5500)
          modules C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs
          Console output https://builds.apache.org/job/PreCommit-HDFS-Build/23123/console
          Powered by Apache Yetus 0.8.0-SNAPSHOT http://yetus.apache.org

          This message was automatically generated.

          genericqa genericqa added a comment - -1 overall Vote Subsystem Runtime Comment 0 reexec 0m 24s Docker mode activated.       Prechecks +1 @author 0m 0s The patch does not contain any @author tags. -1 test4tests 0m 0s The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.       trunk Compile Tests +1 mvninstall 15m 34s trunk passed +1 compile 0m 51s trunk passed +1 checkstyle 0m 37s trunk passed +1 mvnsite 0m 58s trunk passed +1 shadedclient 11m 1s branch has no errors when building and testing our client artifacts. +1 findbugs 1m 47s trunk passed +1 javadoc 0m 53s trunk passed       Patch Compile Tests +1 mvninstall 0m 53s the patch passed +1 compile 0m 49s the patch passed +1 javac 0m 49s the patch passed +1 checkstyle 0m 34s hadoop-hdfs-project/hadoop-hdfs: The patch generated 0 new + 0 unchanged - 1 fixed = 0 total (was 1) +1 mvnsite 0m 53s the patch passed +1 whitespace 0m 0s The patch has no whitespace issues. +1 shadedclient 10m 26s patch has no errors when building and testing our client artifacts. +1 findbugs 1m 52s the patch passed +1 javadoc 0m 54s the patch passed       Other Tests -1 unit 157m 19s hadoop-hdfs in the patch failed. +1 asflicense 0m 21s The patch does not generate ASF License warnings. 206m 3s Reason Tests Failed junit tests hadoop.hdfs.server.namenode.TestTruncateQuotaUpdate   hadoop.hdfs.server.namenode.ha.TestRetryCacheWithHA   hadoop.hdfs.web.TestWebHdfsTimeouts   hadoop.hdfs.TestDFSStripedOutputStreamWithFailure   hadoop.hdfs.server.blockmanagement.TestBlockStatsMXBean Subsystem Report/Notes Docker Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:5b98639 JIRA Issue HDFS-13168 JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12911139/HDFS-13168.2.patch Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle uname Linux 993a55cfba1c 4.4.0-64-generic #85-Ubuntu SMP Mon Feb 20 11:50:30 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux Build tool maven Personality /testptch/patchprocess/precommit/personality/provided.sh git revision trunk / b9a429b maven version: Apache Maven 3.3.9 Default Java 1.8.0_151 findbugs v3.1.0-RC1 unit https://builds.apache.org/job/PreCommit-HDFS-Build/23123/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt Test Results https://builds.apache.org/job/PreCommit-HDFS-Build/23123/testReport/ Max. process+thread count 3366 (vs. ulimit of 5500) modules C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs Console output https://builds.apache.org/job/PreCommit-HDFS-Build/23123/console Powered by Apache Yetus 0.8.0-SNAPSHOT http://yetus.apache.org This message was automatically generated.
          belugabehr David Mollitor added a comment -

          elgoiri Using 'char' type is intentional.  It's faster to add a 'char' to a StringBuilder than a string.  Adding a string requires a null check and a length check.  For a char, it cannot be null and it can only be a length of 1, so it's faster.

           

          Please consider this patch for inclusion into the project.  Thanks!

          belugabehr David Mollitor added a comment - elgoiri Using 'char' type is intentional.  It's faster to add a 'char' to a StringBuilder than a string.  Adding a string requires a null check and a length check.  For a char, it cannot be null and it can only be a length of 1, so it's faster.   Please consider this patch for inclusion into the project.  Thanks!
          elgoiri Íñigo Goiri added a comment - - edited

          StringBuilder is more optimal but I'm not sure how does + in String is implemented.
          Anyway, just philosophical at this point, it should be good either way.
          I'm committing HDFS-13168.2.patch to trunk; belugabehr, for the commit message do I use your name capitalized as it is here or do you prefer some other spelling?

          elgoiri Íñigo Goiri added a comment - - edited StringBuilder is more optimal but I'm not sure how does + in String is implemented. Anyway, just philosophical at this point, it should be good either way. I'm committing HDFS-13168.2.patch to trunk ; belugabehr , for the commit message do I use your name capitalized as it is here or do you prefer some other spelling?
          belugabehr David Mollitor added a comment -

          + is turned into StringBuilder under the covers.

          No preference, but I believe it's been all caps thus far. Thanks!!!

          belugabehr David Mollitor added a comment - + is turned into StringBuilder under the covers. No preference, but I believe it's been all caps thus far. Thanks!!!

          Thanks belugabehr for the patch, committed to trunk.

          elgoiri Íñigo Goiri added a comment - Thanks belugabehr for the patch, committed to trunk.
          hudson Hudson added a comment -

          SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #13687 (See https://builds.apache.org/job/Hadoop-trunk-Commit/13687/)
          HDFS-13168. XmlImageVisitor - Prefer Array over LinkedList. Contributed (inigoiri: rev 17c592e6cfd1ea3dbe9671c4703caabd095d87cf)

          • (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/offlineImageViewer/XmlImageVisitor.java
          hudson Hudson added a comment - SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #13687 (See https://builds.apache.org/job/Hadoop-trunk-Commit/13687/ ) HDFS-13168 . XmlImageVisitor - Prefer Array over LinkedList. Contributed (inigoiri: rev 17c592e6cfd1ea3dbe9671c4703caabd095d87cf) (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/offlineImageViewer/XmlImageVisitor.java

          People

            belugabehr David Mollitor
            belugabehr David Mollitor
            Votes:
            0 Vote for this issue
            Watchers:
            3 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved: