Uploaded image for project: 'Hadoop Common'
  1. Hadoop Common
  2. HADOOP-12185

NetworkTopology is not efficient adding/getting/removing nodes

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.7.0
    • Fix Version/s: 2.8.0, 3.0.0-alpha1
    • Component/s: None
    • Labels:
      None
    • Target Version/s:
    • Hadoop Flags:
      Reviewed

      Description

      NetworkToplogy uses nodes with a list of children. The access to these children is slow as it's a linear search.

      1. HADOOP-12185-1.patch
        7 kB
        Íñigo Goiri
      2. HADOOP-12185-2.patch
        10 kB
        Íñigo Goiri

        Issue Links

          Activity

          Hide
          elgoiri Íñigo Goiri added a comment -

          First, I realized that to get a node in NetworkTopology, it did a full linear search without stopping when the node was found:

                Node childnode = null;
                for(int i=0; i<children.size(); i++) {
                  if (children.get(i).getName().equals(path[0])) {
                    childnode = children.get(i);
                  }
                }
                if (childnode == null) return null; // non-existing node
          

          Adding a break when the child is found improves the performance a lot (2x in most cases).

          After profiling, I realized that this linear search pattern in general was pretty bad and using a HashMap improved the performance significantly.
          I did some testing adding, getting, and removing nodes, and these are the average results for 10 executions:
          Large cluster: 100 racks with 40 machine per rack:

          Default: 1067.0 1259.0 1157.3
          Break: 1305.2 602.3 116.7
          Fancy: 122.3 69.6 137.9

          Very large cluster: 800 racks with 800 machine per rack:

          Default: 22110 29600 17975
          Break: 21756 12816 1975
          Fancy: 6013 2362 2015

          Medium cluster: 100 racks with 100 machine per rack:

          Default: 57.2 52.9 50.6
          Break: 59.3 42.4 32.3
          Fancy: 40.1 24.6 41.2

          Regular cluster: 10 racks with 10 machine per rack:

          Default: 3.2 4.0 3.2
          Break: 3.1 4.1 3.9
          Fancy: 2.5 4.3 2.4

          The results are average running time for ten executions in milliseconds.
          The first column is addition of nodes, the second one is retrieving those nodes and the last one is removing them.
          "Default" is what NetworkTopology does right now, "Break" is adding the break in the search, and "Fancy" is the one that includes the HashMap to do the search (I'll submit a patch with it).

          Show
          elgoiri Íñigo Goiri added a comment - First, I realized that to get a node in NetworkTopology, it did a full linear search without stopping when the node was found: Node childnode = null ; for ( int i=0; i<children.size(); i++) { if (children.get(i).getName().equals(path[0])) { childnode = children.get(i); } } if (childnode == null ) return null ; // non-existing node Adding a break when the child is found improves the performance a lot (2x in most cases). After profiling, I realized that this linear search pattern in general was pretty bad and using a HashMap improved the performance significantly. I did some testing adding, getting, and removing nodes, and these are the average results for 10 executions: Large cluster: 100 racks with 40 machine per rack: Default: 1067.0 1259.0 1157.3 Break: 1305.2 602.3 116.7 Fancy: 122.3 69.6 137.9 Very large cluster: 800 racks with 800 machine per rack: Default: 22110 29600 17975 Break: 21756 12816 1975 Fancy: 6013 2362 2015 Medium cluster: 100 racks with 100 machine per rack: Default: 57.2 52.9 50.6 Break: 59.3 42.4 32.3 Fancy: 40.1 24.6 41.2 Regular cluster: 10 racks with 10 machine per rack: Default: 3.2 4.0 3.2 Break: 3.1 4.1 3.9 Fancy: 2.5 4.3 2.4 The results are average running time for ten executions in milliseconds. The first column is addition of nodes, the second one is retrieving those nodes and the last one is removing them. "Default" is what NetworkTopology does right now, "Break" is adding the break in the search, and "Fancy" is the one that includes the HashMap to do the search (I'll submit a patch with it).
          Hide
          elgoiri Íñigo Goiri added a comment -

          Use a HashMap to search in children.

          Show
          elgoiri Íñigo Goiri added a comment - Use a HashMap to search in children.
          Hide
          chris.douglas Chris Douglas added a comment -

          I haven't looked at the context at all, but is there any reason to retain both the List and the Map? Wouldn't it be sufficient to look at childrenMap.values() for any context requiring the List?

          Show
          chris.douglas Chris Douglas added a comment - I haven't looked at the context at all, but is there any reason to retain both the List and the Map? Wouldn't it be sufficient to look at childrenMap.values() for any context requiring the List?
          Hide
          elgoiri Íñigo Goiri added a comment -

          That was my first approach but there are a couple functions that use the indeces for adding leafs and so on.
          I would vote for replacing the whole thing by a Map but it would imply changing many lines and I was afraid of that causing too much controversy; said that, I'm all in for that.

          Show
          elgoiri Íñigo Goiri added a comment - That was my first approach but there are a couple functions that use the indeces for adding leafs and so on. I would vote for replacing the whole thing by a Map but it would imply changing many lines and I was afraid of that causing too much controversy; said that, I'm all in for that.
          Hide
          hadoopqa Hadoop QA added a comment -



          +1 overall



          Vote Subsystem Runtime Comment
          0 pre-patch 16m 52s Pre-patch trunk compilation is healthy.
          +1 @author 0m 0s The patch does not contain any @author tags.
          +1 tests included 0m 0s The patch appears to include 1 new or modified test files.
          +1 javac 7m 44s There were no new javac warning messages.
          +1 javadoc 9m 43s There were no new javadoc warning messages.
          +1 release audit 0m 22s The applied patch does not increase the total number of release audit warnings.
          +1 checkstyle 1m 5s There were no new checkstyle issues.
          +1 whitespace 0m 0s The patch has no lines that end in whitespace.
          +1 install 1m 31s mvn install still works.
          +1 eclipse:eclipse 0m 34s The patch built with eclipse:eclipse.
          +1 findbugs 1m 50s The patch does not introduce any new Findbugs (version 3.0.0) warnings.
          +1 common tests 22m 1s Tests passed in hadoop-common.
              61m 45s  



          Subsystem Report/Notes
          Patch URL http://issues.apache.org/jira/secure/attachment/12743558/HADOOP-12185-1.patch
          Optional Tests javadoc javac unit findbugs checkstyle
          git revision trunk / 2eae130
          hadoop-common test log https://builds.apache.org/job/PreCommit-HADOOP-Build/7142/artifact/patchprocess/testrun_hadoop-common.txt
          Test Results https://builds.apache.org/job/PreCommit-HADOOP-Build/7142/testReport/
          Java 1.7.0_55
          uname Linux asf907.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
          Console output https://builds.apache.org/job/PreCommit-HADOOP-Build/7142/console

          This message was automatically generated.

          Show
          hadoopqa Hadoop QA added a comment - +1 overall Vote Subsystem Runtime Comment 0 pre-patch 16m 52s Pre-patch trunk compilation is healthy. +1 @author 0m 0s The patch does not contain any @author tags. +1 tests included 0m 0s The patch appears to include 1 new or modified test files. +1 javac 7m 44s There were no new javac warning messages. +1 javadoc 9m 43s There were no new javadoc warning messages. +1 release audit 0m 22s The applied patch does not increase the total number of release audit warnings. +1 checkstyle 1m 5s There were no new checkstyle issues. +1 whitespace 0m 0s The patch has no lines that end in whitespace. +1 install 1m 31s mvn install still works. +1 eclipse:eclipse 0m 34s The patch built with eclipse:eclipse. +1 findbugs 1m 50s The patch does not introduce any new Findbugs (version 3.0.0) warnings. +1 common tests 22m 1s Tests passed in hadoop-common.     61m 45s   Subsystem Report/Notes Patch URL http://issues.apache.org/jira/secure/attachment/12743558/HADOOP-12185-1.patch Optional Tests javadoc javac unit findbugs checkstyle git revision trunk / 2eae130 hadoop-common test log https://builds.apache.org/job/PreCommit-HADOOP-Build/7142/artifact/patchprocess/testrun_hadoop-common.txt Test Results https://builds.apache.org/job/PreCommit-HADOOP-Build/7142/testReport/ Java 1.7.0_55 uname Linux asf907.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux Console output https://builds.apache.org/job/PreCommit-HADOOP-Build/7142/console This message was automatically generated.
          Hide
          elgoiri Íñigo Goiri added a comment -

          I've implemented it with only Map and there is one operation that gets much more expensive: chooseRandom(). Right now, picking a random node is done selecting a random index in the list of children which is O(k); however, the same operation is O for HashMap. I thought about using TreeMap or LinkedHashMap but both of them have the same issue to some degree.

          Given that chooseRandom() is one of the most commonly used operations (e.g., block placement in HDFS), I would stick to the addition of the auxiliary map. To speedup chooseRamdom (which uses indexOf heavily) we can replace my proposed HashMap<String,Node> by HashMap<String,Integer>. I'm going to do some performance analysis to see what's better.

          Show
          elgoiri Íñigo Goiri added a comment - I've implemented it with only Map and there is one operation that gets much more expensive: chooseRandom(). Right now, picking a random node is done selecting a random index in the list of children which is O(k); however, the same operation is O for HashMap. I thought about using TreeMap or LinkedHashMap but both of them have the same issue to some degree. Given that chooseRandom() is one of the most commonly used operations (e.g., block placement in HDFS), I would stick to the addition of the auxiliary map. To speedup chooseRamdom (which uses indexOf heavily) we can replace my proposed HashMap<String,Node> by HashMap<String,Integer>. I'm going to do some performance analysis to see what's better.
          Hide
          elgoiri Íñigo Goiri added a comment -

          Correcting myself here: we cannot do a Map based on indeces because we would have to update them when removing nodes and this gets expensive.

          Show
          elgoiri Íñigo Goiri added a comment - Correcting myself here: we cannot do a Map based on indeces because we would have to update them when removing nodes and this gets expensive.
          Hide
          elgoiri Íñigo Goiri added a comment -

          Added unit test for random and improved efficiency searching for excluded nodes when adding.

          Show
          elgoiri Íñigo Goiri added a comment - Added unit test for random and improved efficiency searching for excluded nodes when adding.
          Hide
          chris.douglas Chris Douglas added a comment -

          +1 lgtm

          Will commit when Jenkins comes back clean.

          Show
          chris.douglas Chris Douglas added a comment - +1 lgtm Will commit when Jenkins comes back clean.
          Hide
          hadoopqa Hadoop QA added a comment -



          +1 overall



          Vote Subsystem Runtime Comment
          0 pre-patch 16m 25s Pre-patch trunk compilation is healthy.
          +1 @author 0m 0s The patch does not contain any @author tags.
          +1 tests included 0m 0s The patch appears to include 1 new or modified test files.
          +1 javac 7m 33s There were no new javac warning messages.
          +1 javadoc 9m 36s There were no new javadoc warning messages.
          +1 release audit 0m 23s The applied patch does not increase the total number of release audit warnings.
          +1 checkstyle 1m 3s There were no new checkstyle issues.
          +1 whitespace 0m 0s The patch has no lines that end in whitespace.
          +1 install 1m 34s mvn install still works.
          +1 eclipse:eclipse 0m 33s The patch built with eclipse:eclipse.
          +1 findbugs 1m 49s The patch does not introduce any new Findbugs (version 3.0.0) warnings.
          +1 common tests 22m 5s Tests passed in hadoop-common.
              61m 4s  



          Subsystem Report/Notes
          Patch URL http://issues.apache.org/jira/secure/attachment/12743749/HADOOP-12185-2.patch
          Optional Tests javadoc javac unit findbugs checkstyle
          git revision trunk / fc92d3e
          hadoop-common test log https://builds.apache.org/job/PreCommit-HADOOP-Build/7161/artifact/patchprocess/testrun_hadoop-common.txt
          Test Results https://builds.apache.org/job/PreCommit-HADOOP-Build/7161/testReport/
          Java 1.7.0_55
          uname Linux asf905.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
          Console output https://builds.apache.org/job/PreCommit-HADOOP-Build/7161/console

          This message was automatically generated.

          Show
          hadoopqa Hadoop QA added a comment - +1 overall Vote Subsystem Runtime Comment 0 pre-patch 16m 25s Pre-patch trunk compilation is healthy. +1 @author 0m 0s The patch does not contain any @author tags. +1 tests included 0m 0s The patch appears to include 1 new or modified test files. +1 javac 7m 33s There were no new javac warning messages. +1 javadoc 9m 36s There were no new javadoc warning messages. +1 release audit 0m 23s The applied patch does not increase the total number of release audit warnings. +1 checkstyle 1m 3s There were no new checkstyle issues. +1 whitespace 0m 0s The patch has no lines that end in whitespace. +1 install 1m 34s mvn install still works. +1 eclipse:eclipse 0m 33s The patch built with eclipse:eclipse. +1 findbugs 1m 49s The patch does not introduce any new Findbugs (version 3.0.0) warnings. +1 common tests 22m 5s Tests passed in hadoop-common.     61m 4s   Subsystem Report/Notes Patch URL http://issues.apache.org/jira/secure/attachment/12743749/HADOOP-12185-2.patch Optional Tests javadoc javac unit findbugs checkstyle git revision trunk / fc92d3e hadoop-common test log https://builds.apache.org/job/PreCommit-HADOOP-Build/7161/artifact/patchprocess/testrun_hadoop-common.txt Test Results https://builds.apache.org/job/PreCommit-HADOOP-Build/7161/testReport/ Java 1.7.0_55 uname Linux asf905.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux Console output https://builds.apache.org/job/PreCommit-HADOOP-Build/7161/console This message was automatically generated.
          Hide
          chris.douglas Chris Douglas added a comment -

          I committed this to trunk and branch-2. Thanks, Inigo

          Show
          chris.douglas Chris Douglas added a comment - I committed this to trunk and branch-2. Thanks, Inigo
          Hide
          hudson Hudson added a comment -

          FAILURE: Integrated in Hadoop-trunk-Commit #8118 (See https://builds.apache.org/job/Hadoop-trunk-Commit/8118/)
          HADOOP-12185. NetworkTopology is not efficient adding/getting/removing nodes. Contributed by Inigo Goiri (cdouglas: rev 47a69ec7a5417cb56b75d07184dd6888ff068302)

          • hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/net/NetworkTopology.java
          • hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/net/TestClusterTopology.java
          Show
          hudson Hudson added a comment - FAILURE: Integrated in Hadoop-trunk-Commit #8118 (See https://builds.apache.org/job/Hadoop-trunk-Commit/8118/ ) HADOOP-12185 . NetworkTopology is not efficient adding/getting/removing nodes. Contributed by Inigo Goiri (cdouglas: rev 47a69ec7a5417cb56b75d07184dd6888ff068302) hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/net/NetworkTopology.java hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/net/TestClusterTopology.java
          Hide
          elgoiri Íñigo Goiri added a comment -

          Thank you!

          Show
          elgoiri Íñigo Goiri added a comment - Thank you!
          Hide
          hudson Hudson added a comment -

          FAILURE: Integrated in Hadoop-Yarn-trunk #979 (See https://builds.apache.org/job/Hadoop-Yarn-trunk/979/)
          HADOOP-12185. NetworkTopology is not efficient adding/getting/removing nodes. Contributed by Inigo Goiri (cdouglas: rev 47a69ec7a5417cb56b75d07184dd6888ff068302)

          • hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/net/NetworkTopology.java
          • hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/net/TestClusterTopology.java
          Show
          hudson Hudson added a comment - FAILURE: Integrated in Hadoop-Yarn-trunk #979 (See https://builds.apache.org/job/Hadoop-Yarn-trunk/979/ ) HADOOP-12185 . NetworkTopology is not efficient adding/getting/removing nodes. Contributed by Inigo Goiri (cdouglas: rev 47a69ec7a5417cb56b75d07184dd6888ff068302) hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/net/NetworkTopology.java hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/net/TestClusterTopology.java
          Hide
          hudson Hudson added a comment -

          FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #249 (See https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/249/)
          HADOOP-12185. NetworkTopology is not efficient adding/getting/removing nodes. Contributed by Inigo Goiri (cdouglas: rev 47a69ec7a5417cb56b75d07184dd6888ff068302)

          • hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/net/TestClusterTopology.java
          • hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/net/NetworkTopology.java
          Show
          hudson Hudson added a comment - FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #249 (See https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/249/ ) HADOOP-12185 . NetworkTopology is not efficient adding/getting/removing nodes. Contributed by Inigo Goiri (cdouglas: rev 47a69ec7a5417cb56b75d07184dd6888ff068302) hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/net/TestClusterTopology.java hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/net/NetworkTopology.java
          Hide
          hudson Hudson added a comment -

          FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #247 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/247/)
          HADOOP-12185. NetworkTopology is not efficient adding/getting/removing nodes. Contributed by Inigo Goiri (cdouglas: rev 47a69ec7a5417cb56b75d07184dd6888ff068302)

          • hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/net/NetworkTopology.java
          • hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/net/TestClusterTopology.java
          Show
          hudson Hudson added a comment - FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #247 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/247/ ) HADOOP-12185 . NetworkTopology is not efficient adding/getting/removing nodes. Contributed by Inigo Goiri (cdouglas: rev 47a69ec7a5417cb56b75d07184dd6888ff068302) hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/net/NetworkTopology.java hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/net/TestClusterTopology.java
          Hide
          hudson Hudson added a comment -

          FAILURE: Integrated in Hadoop-Mapreduce-trunk #2195 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2195/)
          HADOOP-12185. NetworkTopology is not efficient adding/getting/removing nodes. Contributed by Inigo Goiri (cdouglas: rev 47a69ec7a5417cb56b75d07184dd6888ff068302)

          • hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/net/NetworkTopology.java
          • hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/net/TestClusterTopology.java
          Show
          hudson Hudson added a comment - FAILURE: Integrated in Hadoop-Mapreduce-trunk #2195 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2195/ ) HADOOP-12185 . NetworkTopology is not efficient adding/getting/removing nodes. Contributed by Inigo Goiri (cdouglas: rev 47a69ec7a5417cb56b75d07184dd6888ff068302) hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/net/NetworkTopology.java hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/net/TestClusterTopology.java
          Hide
          hudson Hudson added a comment -

          FAILURE: Integrated in Hadoop-Hdfs-trunk #2176 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk/2176/)
          HADOOP-12185. NetworkTopology is not efficient adding/getting/removing nodes. Contributed by Inigo Goiri (cdouglas: rev 47a69ec7a5417cb56b75d07184dd6888ff068302)

          • hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/net/NetworkTopology.java
          • hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/net/TestClusterTopology.java
          Show
          hudson Hudson added a comment - FAILURE: Integrated in Hadoop-Hdfs-trunk #2176 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk/2176/ ) HADOOP-12185 . NetworkTopology is not efficient adding/getting/removing nodes. Contributed by Inigo Goiri (cdouglas: rev 47a69ec7a5417cb56b75d07184dd6888ff068302) hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/net/NetworkTopology.java hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/net/TestClusterTopology.java
          Hide
          hudson Hudson added a comment -

          FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #237 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/237/)
          HADOOP-12185. NetworkTopology is not efficient adding/getting/removing nodes. Contributed by Inigo Goiri (cdouglas: rev 47a69ec7a5417cb56b75d07184dd6888ff068302)

          • hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/net/NetworkTopology.java
          • hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/net/TestClusterTopology.java
          Show
          hudson Hudson added a comment - FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #237 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/237/ ) HADOOP-12185 . NetworkTopology is not efficient adding/getting/removing nodes. Contributed by Inigo Goiri (cdouglas: rev 47a69ec7a5417cb56b75d07184dd6888ff068302) hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/net/NetworkTopology.java hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/net/TestClusterTopology.java

            People

            • Assignee:
              elgoiri Íñigo Goiri
              Reporter:
              elgoiri Íñigo Goiri
            • Votes:
              0 Vote for this issue
              Watchers:
              8 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development