Details

    • Hadoop Flags:
      Reviewed
    • Release Note:
      Hide
      Add a new conf "dfs.balancer.max-size-to-move" so that Balancer.MAX_SIZE_TO_MOVE becomes configurable.
      Show
      Add a new conf "dfs.balancer.max-size-to-move" so that Balancer.MAX_SIZE_TO_MOVE becomes configurable.

      Description

      The original design of Balancer is intentionally to make it run slowly so that the balancing activities won't affect the normal cluster activities and the running jobs.

      There are new use case that cluster admin may choose to balance the cluster when the cluster load is low, or in a maintain window. So that we should have an option to allow Balancer to run faster.

      1. bal1.png
        21 kB
        Kihwal Lee
      2. bal2.png
        16 kB
        Kihwal Lee
      3. h8818_20150723.patch
        20 kB
        Tsz Wo Nicholas Sze
      4. h8818_20150727.patch
        20 kB
        Tsz Wo Nicholas Sze
      5. HDFS-8818-branch-2.7.00.patch
        19 kB
        Zhe Zhang

        Issue Links

          Activity

          Hide
          szetszwo Tsz Wo Nicholas Sze added a comment -

          h8818_20150723.patch: changes the global moveExecutor to per datanode moveExecutor(s) and changes MAX_SIZE_TO_MOVE to be configurable.

          Show
          szetszwo Tsz Wo Nicholas Sze added a comment - h8818_20150723.patch: changes the global moveExecutor to per datanode moveExecutor(s) and changes MAX_SIZE_TO_MOVE to be configurable.
          Hide
          hadoopqa Hadoop QA added a comment -



          -1 overall



          Vote Subsystem Runtime Comment
          0 pre-patch 21m 53s Pre-patch trunk compilation is healthy.
          +1 @author 0m 0s The patch does not contain any @author tags.
          +1 tests included 0m 0s The patch appears to include 1 new or modified test files.
          +1 javac 10m 6s There were no new javac warning messages.
          +1 javadoc 11m 43s There were no new javadoc warning messages.
          +1 release audit 0m 27s The applied patch does not increase the total number of release audit warnings.
          -1 checkstyle 1m 43s The applied patch generated 7 new checkstyle issues (total was 524, now 530).
          -1 whitespace 0m 1s The patch has 6 line(s) that end in whitespace. Use git apply --whitespace=fix.
          +1 install 1m 36s mvn install still works.
          +1 eclipse:eclipse 0m 40s The patch built with eclipse:eclipse.
          -1 findbugs 3m 8s The patch appears to introduce 1 new Findbugs (version 3.0.0) warnings.
          +1 native 3m 43s Pre-build of native portion
          -1 hdfs tests 174m 34s Tests failed in hadoop-hdfs.
              229m 40s  



          Reason Tests
          FindBugs module:hadoop-hdfs
          Failed unit tests hadoop.hdfs.server.datanode.fsdataset.impl.TestLazyPersistFiles
            hadoop.hdfs.TestParallelShortCircuitReadUnCached
            hadoop.hdfs.TestDistributedFileSystem
            hadoop.hdfs.server.namenode.ha.TestHAAppend



          Subsystem Report/Notes
          Patch URL http://issues.apache.org/jira/secure/attachment/12746966/h8818_20150723.patch
          Optional Tests javadoc javac unit findbugs checkstyle
          git revision trunk / 02c0181
          checkstyle https://builds.apache.org/job/PreCommit-HDFS-Build/11824/artifact/patchprocess/diffcheckstylehadoop-hdfs.txt
          whitespace https://builds.apache.org/job/PreCommit-HDFS-Build/11824/artifact/patchprocess/whitespace.txt
          Findbugs warnings https://builds.apache.org/job/PreCommit-HDFS-Build/11824/artifact/patchprocess/newPatchFindbugsWarningshadoop-hdfs.html
          hadoop-hdfs test log https://builds.apache.org/job/PreCommit-HDFS-Build/11824/artifact/patchprocess/testrun_hadoop-hdfs.txt
          Test Results https://builds.apache.org/job/PreCommit-HDFS-Build/11824/testReport/
          Java 1.7.0_55
          uname Linux asf904.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
          Console output https://builds.apache.org/job/PreCommit-HDFS-Build/11824/console

          This message was automatically generated.

          Show
          hadoopqa Hadoop QA added a comment - -1 overall Vote Subsystem Runtime Comment 0 pre-patch 21m 53s Pre-patch trunk compilation is healthy. +1 @author 0m 0s The patch does not contain any @author tags. +1 tests included 0m 0s The patch appears to include 1 new or modified test files. +1 javac 10m 6s There were no new javac warning messages. +1 javadoc 11m 43s There were no new javadoc warning messages. +1 release audit 0m 27s The applied patch does not increase the total number of release audit warnings. -1 checkstyle 1m 43s The applied patch generated 7 new checkstyle issues (total was 524, now 530). -1 whitespace 0m 1s The patch has 6 line(s) that end in whitespace. Use git apply --whitespace=fix. +1 install 1m 36s mvn install still works. +1 eclipse:eclipse 0m 40s The patch built with eclipse:eclipse. -1 findbugs 3m 8s The patch appears to introduce 1 new Findbugs (version 3.0.0) warnings. +1 native 3m 43s Pre-build of native portion -1 hdfs tests 174m 34s Tests failed in hadoop-hdfs.     229m 40s   Reason Tests FindBugs module:hadoop-hdfs Failed unit tests hadoop.hdfs.server.datanode.fsdataset.impl.TestLazyPersistFiles   hadoop.hdfs.TestParallelShortCircuitReadUnCached   hadoop.hdfs.TestDistributedFileSystem   hadoop.hdfs.server.namenode.ha.TestHAAppend Subsystem Report/Notes Patch URL http://issues.apache.org/jira/secure/attachment/12746966/h8818_20150723.patch Optional Tests javadoc javac unit findbugs checkstyle git revision trunk / 02c0181 checkstyle https://builds.apache.org/job/PreCommit-HDFS-Build/11824/artifact/patchprocess/diffcheckstylehadoop-hdfs.txt whitespace https://builds.apache.org/job/PreCommit-HDFS-Build/11824/artifact/patchprocess/whitespace.txt Findbugs warnings https://builds.apache.org/job/PreCommit-HDFS-Build/11824/artifact/patchprocess/newPatchFindbugsWarningshadoop-hdfs.html hadoop-hdfs test log https://builds.apache.org/job/PreCommit-HDFS-Build/11824/artifact/patchprocess/testrun_hadoop-hdfs.txt Test Results https://builds.apache.org/job/PreCommit-HDFS-Build/11824/testReport/ Java 1.7.0_55 uname Linux asf904.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux Console output https://builds.apache.org/job/PreCommit-HDFS-Build/11824/console This message was automatically generated.
          Hide
          jnp Jitendra Nath Pandey added a comment -

          +1 conditional on addressing findbugs/style issues.

          Show
          jnp Jitendra Nath Pandey added a comment - +1 conditional on addressing findbugs/style issues.
          Hide
          szetszwo Tsz Wo Nicholas Sze added a comment -

          h8818_20150727.patch: fixes the findbugs warning and some trailing whitespaces.

          Show
          szetszwo Tsz Wo Nicholas Sze added a comment - h8818_20150727.patch: fixes the findbugs warning and some trailing whitespaces.
          Hide
          hadoopqa Hadoop QA added a comment -



          -1 overall



          Vote Subsystem Runtime Comment
          0 pre-patch 17m 0s Pre-patch trunk compilation is healthy.
          +1 @author 0m 0s The patch does not contain any @author tags.
          +1 tests included 0m 0s The patch appears to include 1 new or modified test files.
          +1 javac 7m 36s There were no new javac warning messages.
          +1 javadoc 9m 34s There were no new javadoc warning messages.
          +1 release audit 0m 23s The applied patch does not increase the total number of release audit warnings.
          -1 checkstyle 1m 20s The applied patch generated 7 new checkstyle issues (total was 525, now 531).
          +1 whitespace 0m 0s The patch has no lines that end in whitespace.
          +1 install 1m 20s mvn install still works.
          +1 eclipse:eclipse 0m 33s The patch built with eclipse:eclipse.
          +1 findbugs 2m 32s The patch does not introduce any new Findbugs (version 3.0.0) warnings.
          +1 native 3m 8s Pre-build of native portion
          -1 hdfs tests 159m 31s Tests failed in hadoop-hdfs.
              203m 2s  



          Reason Tests
          Failed unit tests hadoop.hdfs.TestRollingUpgrade
            hadoop.hdfs.server.namenode.ha.TestStandbyIsHot



          Subsystem Report/Notes
          Patch URL http://issues.apache.org/jira/secure/attachment/12747448/h8818_20150727.patch
          Optional Tests javadoc javac unit findbugs checkstyle
          git revision trunk / 3e6fce9
          checkstyle https://builds.apache.org/job/PreCommit-HDFS-Build/11849/artifact/patchprocess/diffcheckstylehadoop-hdfs.txt
          hadoop-hdfs test log https://builds.apache.org/job/PreCommit-HDFS-Build/11849/artifact/patchprocess/testrun_hadoop-hdfs.txt
          Test Results https://builds.apache.org/job/PreCommit-HDFS-Build/11849/testReport/
          Java 1.7.0_55
          uname Linux asf909.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
          Console output https://builds.apache.org/job/PreCommit-HDFS-Build/11849/console

          This message was automatically generated.

          Show
          hadoopqa Hadoop QA added a comment - -1 overall Vote Subsystem Runtime Comment 0 pre-patch 17m 0s Pre-patch trunk compilation is healthy. +1 @author 0m 0s The patch does not contain any @author tags. +1 tests included 0m 0s The patch appears to include 1 new or modified test files. +1 javac 7m 36s There were no new javac warning messages. +1 javadoc 9m 34s There were no new javadoc warning messages. +1 release audit 0m 23s The applied patch does not increase the total number of release audit warnings. -1 checkstyle 1m 20s The applied patch generated 7 new checkstyle issues (total was 525, now 531). +1 whitespace 0m 0s The patch has no lines that end in whitespace. +1 install 1m 20s mvn install still works. +1 eclipse:eclipse 0m 33s The patch built with eclipse:eclipse. +1 findbugs 2m 32s The patch does not introduce any new Findbugs (version 3.0.0) warnings. +1 native 3m 8s Pre-build of native portion -1 hdfs tests 159m 31s Tests failed in hadoop-hdfs.     203m 2s   Reason Tests Failed unit tests hadoop.hdfs.TestRollingUpgrade   hadoop.hdfs.server.namenode.ha.TestStandbyIsHot Subsystem Report/Notes Patch URL http://issues.apache.org/jira/secure/attachment/12747448/h8818_20150727.patch Optional Tests javadoc javac unit findbugs checkstyle git revision trunk / 3e6fce9 checkstyle https://builds.apache.org/job/PreCommit-HDFS-Build/11849/artifact/patchprocess/diffcheckstylehadoop-hdfs.txt hadoop-hdfs test log https://builds.apache.org/job/PreCommit-HDFS-Build/11849/artifact/patchprocess/testrun_hadoop-hdfs.txt Test Results https://builds.apache.org/job/PreCommit-HDFS-Build/11849/testReport/ Java 1.7.0_55 uname Linux asf909.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux Console output https://builds.apache.org/job/PreCommit-HDFS-Build/11849/console This message was automatically generated.
          Hide
          lichangleo Chang Li added a comment -

          Tsz Wo Nicholas Sze thanks for the patch. One thing I am concerned is the change

          -      return srcBlocks.size() < SOURCE_BLOCKS_MIN_SIZE && blocksToReceive > 0;
          +      return blocksToReceive > 0;
          

          now that dispatcher will keep fetching more blocks from namenode every iteration, but namenode is likely to return very same list of blocks since the block moving is not that fast and namenode can't know the blocks just moved instantly. This could increase useless load on namenode.

          Show
          lichangleo Chang Li added a comment - Tsz Wo Nicholas Sze thanks for the patch. One thing I am concerned is the change - return srcBlocks.size() < SOURCE_BLOCKS_MIN_SIZE && blocksToReceive > 0; + return blocksToReceive > 0; now that dispatcher will keep fetching more blocks from namenode every iteration, but namenode is likely to return very same list of blocks since the block moving is not that fast and namenode can't know the blocks just moved instantly. This could increase useless load on namenode.
          Hide
          szetszwo Tsz Wo Nicholas Sze added a comment -

          > now that dispatcher will keep fetching more blocks from namenode every iteration, but namenode is likely to return very same list of blocks since the block moving is not that fast and namenode can't know the blocks just moved instantly. ...

          I think it is not the case by the following reasons.

          • blocksToReceive will become <= 0.
          • Balancer waits until all block transfer are done each iteration and Datanodes send a block receipt immediately once they receive a block.
          • BlockManager.getBlocks(..) uses a random offset to get the blocks from the list.
          Show
          szetszwo Tsz Wo Nicholas Sze added a comment - > now that dispatcher will keep fetching more blocks from namenode every iteration, but namenode is likely to return very same list of blocks since the block moving is not that fast and namenode can't know the blocks just moved instantly. ... I think it is not the case by the following reasons. blocksToReceive will become <= 0. Balancer waits until all block transfer are done each iteration and Datanodes send a block receipt immediately once they receive a block. BlockManager.getBlocks(..) uses a random offset to get the blocks from the list.
          Hide
          lichangleo Chang Li added a comment -

          Balancer waits until all block transfer are done each iteration and Datanodes send a block receipt immediately once they receive a block.

          But inside dispatchBlocks(), the Source will not wait for block gets transferred, it will quickly iterate and ask namenode for more blocks, even random offset can not prevent namenode from returning many same blocks, which will be waste

          Show
          lichangleo Chang Li added a comment - Balancer waits until all block transfer are done each iteration and Datanodes send a block receipt immediately once they receive a block. But inside dispatchBlocks(), the Source will not wait for block gets transferred, it will quickly iterate and ask namenode for more blocks, even random offset can not prevent namenode from returning many same blocks, which will be waste
          Hide
          szetszwo Tsz Wo Nicholas Sze added a comment -

          Datanode store blocks in TB scale. Balancer only gets GB blocks. It seems unlikely to get the same blocks.

          Show
          szetszwo Tsz Wo Nicholas Sze added a comment - Datanode store blocks in TB scale. Balancer only gets GB blocks. It seems unlikely to get the same blocks.
          Hide
          lichangleo Chang Li added a comment -

          Also right now Souce will fetch from namenode no more than 2GB blocks at a time. IMO it's better to increase MAX_BLOCKS_SIZE_TO_FETCH, say about 10G. It's not efficient to ask namenode for this little amount each time and ask it a lot of times.

          Show
          lichangleo Chang Li added a comment - Also right now Souce will fetch from namenode no more than 2GB blocks at a time. IMO it's better to increase MAX_BLOCKS_SIZE_TO_FETCH, say about 10G. It's not efficient to ask namenode for this little amount each time and ask it a lot of times.
          Hide
          szetszwo Tsz Wo Nicholas Sze added a comment -

          I am going to make MAX_BLOCKS_SIZE_TO_FETCH configurable in HDFS-8824.

          Show
          szetszwo Tsz Wo Nicholas Sze added a comment - I am going to make MAX_BLOCKS_SIZE_TO_FETCH configurable in HDFS-8824 .
          Hide
          jnp Jitendra Nath Pandey added a comment -

          +1 for the latest patch.

          Show
          jnp Jitendra Nath Pandey added a comment - +1 for the latest patch.
          Hide
          szetszwo Tsz Wo Nicholas Sze added a comment -

          I have committed this.

          Show
          szetszwo Tsz Wo Nicholas Sze added a comment - I have committed this.
          Hide
          hudson Hudson added a comment -

          FAILURE: Integrated in Hadoop-trunk-Commit #8281 (See https://builds.apache.org/job/Hadoop-trunk-Commit/8281/)
          HDFS-8818. Changes the global moveExecutor to per datanode executors and changes MAX_SIZE_TO_MOVE to be configurable. (szetszwo: rev b56daff6a186599764b046248565918b894ec116)

          • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSConfigKeys.java
          • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/balancer/MovedBlocks.java
          • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/balancer/Balancer.java
          • hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
          • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/balancer/Dispatcher.java
          • hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/balancer/TestBalancer.java
          Show
          hudson Hudson added a comment - FAILURE: Integrated in Hadoop-trunk-Commit #8281 (See https://builds.apache.org/job/Hadoop-trunk-Commit/8281/ ) HDFS-8818 . Changes the global moveExecutor to per datanode executors and changes MAX_SIZE_TO_MOVE to be configurable. (szetszwo: rev b56daff6a186599764b046248565918b894ec116) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSConfigKeys.java hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/balancer/MovedBlocks.java hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/balancer/Balancer.java hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/balancer/Dispatcher.java hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/balancer/TestBalancer.java
          Hide
          hudson Hudson added a comment -

          FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #284 (See https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/284/)
          HDFS-8818. Changes the global moveExecutor to per datanode executors and changes MAX_SIZE_TO_MOVE to be configurable. (szetszwo: rev b56daff6a186599764b046248565918b894ec116)

          • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/balancer/MovedBlocks.java
          • hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/balancer/TestBalancer.java
          • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/balancer/Balancer.java
          • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/balancer/Dispatcher.java
          • hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
          • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSConfigKeys.java
          Show
          hudson Hudson added a comment - FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #284 (See https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/284/ ) HDFS-8818 . Changes the global moveExecutor to per datanode executors and changes MAX_SIZE_TO_MOVE to be configurable. (szetszwo: rev b56daff6a186599764b046248565918b894ec116) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/balancer/MovedBlocks.java hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/balancer/TestBalancer.java hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/balancer/Balancer.java hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/balancer/Dispatcher.java hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSConfigKeys.java
          Hide
          hudson Hudson added a comment -

          FAILURE: Integrated in Hadoop-Yarn-trunk #1014 (See https://builds.apache.org/job/Hadoop-Yarn-trunk/1014/)
          HDFS-8818. Changes the global moveExecutor to per datanode executors and changes MAX_SIZE_TO_MOVE to be configurable. (szetszwo: rev b56daff6a186599764b046248565918b894ec116)

          • hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/balancer/TestBalancer.java
          • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSConfigKeys.java
          • hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
          • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/balancer/Balancer.java
          • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/balancer/Dispatcher.java
          • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/balancer/MovedBlocks.java
          Show
          hudson Hudson added a comment - FAILURE: Integrated in Hadoop-Yarn-trunk #1014 (See https://builds.apache.org/job/Hadoop-Yarn-trunk/1014/ ) HDFS-8818 . Changes the global moveExecutor to per datanode executors and changes MAX_SIZE_TO_MOVE to be configurable. (szetszwo: rev b56daff6a186599764b046248565918b894ec116) hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/balancer/TestBalancer.java hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSConfigKeys.java hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/balancer/Balancer.java hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/balancer/Dispatcher.java hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/balancer/MovedBlocks.java
          Hide
          hudson Hudson added a comment -

          FAILURE: Integrated in Hadoop-Mapreduce-trunk #2230 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2230/)
          HDFS-8818. Changes the global moveExecutor to per datanode executors and changes MAX_SIZE_TO_MOVE to be configurable. (szetszwo: rev b56daff6a186599764b046248565918b894ec116)

          • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/balancer/MovedBlocks.java
          • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/balancer/Balancer.java
          • hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
          • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/balancer/Dispatcher.java
          • hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/balancer/TestBalancer.java
          • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSConfigKeys.java
          Show
          hudson Hudson added a comment - FAILURE: Integrated in Hadoop-Mapreduce-trunk #2230 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2230/ ) HDFS-8818 . Changes the global moveExecutor to per datanode executors and changes MAX_SIZE_TO_MOVE to be configurable. (szetszwo: rev b56daff6a186599764b046248565918b894ec116) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/balancer/MovedBlocks.java hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/balancer/Balancer.java hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/balancer/Dispatcher.java hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/balancer/TestBalancer.java hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSConfigKeys.java
          Hide
          hudson Hudson added a comment -

          FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #281 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/281/)
          HDFS-8818. Changes the global moveExecutor to per datanode executors and changes MAX_SIZE_TO_MOVE to be configurable. (szetszwo: rev b56daff6a186599764b046248565918b894ec116)

          • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/balancer/MovedBlocks.java
          • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/balancer/Dispatcher.java
          • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/balancer/Balancer.java
          • hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/balancer/TestBalancer.java
          • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSConfigKeys.java
          • hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
          Show
          hudson Hudson added a comment - FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #281 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/281/ ) HDFS-8818 . Changes the global moveExecutor to per datanode executors and changes MAX_SIZE_TO_MOVE to be configurable. (szetszwo: rev b56daff6a186599764b046248565918b894ec116) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/balancer/MovedBlocks.java hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/balancer/Dispatcher.java hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/balancer/Balancer.java hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/balancer/TestBalancer.java hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSConfigKeys.java hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
          Hide
          hudson Hudson added a comment -

          FAILURE: Integrated in Hadoop-Hdfs-trunk #2211 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk/2211/)
          HDFS-8818. Changes the global moveExecutor to per datanode executors and changes MAX_SIZE_TO_MOVE to be configurable. (szetszwo: rev b56daff6a186599764b046248565918b894ec116)

          • hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/balancer/TestBalancer.java
          • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/balancer/Dispatcher.java
          • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/balancer/MovedBlocks.java
          • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSConfigKeys.java
          • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/balancer/Balancer.java
          • hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
          Show
          hudson Hudson added a comment - FAILURE: Integrated in Hadoop-Hdfs-trunk #2211 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk/2211/ ) HDFS-8818 . Changes the global moveExecutor to per datanode executors and changes MAX_SIZE_TO_MOVE to be configurable. (szetszwo: rev b56daff6a186599764b046248565918b894ec116) hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/balancer/TestBalancer.java hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/balancer/Dispatcher.java hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/balancer/MovedBlocks.java hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSConfigKeys.java hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/balancer/Balancer.java hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
          Hide
          hudson Hudson added a comment -

          FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #273 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/273/)
          HDFS-8818. Changes the global moveExecutor to per datanode executors and changes MAX_SIZE_TO_MOVE to be configurable. (szetszwo: rev b56daff6a186599764b046248565918b894ec116)

          • hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
          • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSConfigKeys.java
          • hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/balancer/TestBalancer.java
          • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/balancer/MovedBlocks.java
          • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/balancer/Balancer.java
          • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/balancer/Dispatcher.java
          Show
          hudson Hudson added a comment - FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #273 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/273/ ) HDFS-8818 . Changes the global moveExecutor to per datanode executors and changes MAX_SIZE_TO_MOVE to be configurable. (szetszwo: rev b56daff6a186599764b046248565918b894ec116) hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSConfigKeys.java hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/balancer/TestBalancer.java hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/balancer/MovedBlocks.java hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/balancer/Balancer.java hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/balancer/Dispatcher.java
          Hide
          zhz Zhe Zhang added a comment -

          Sorry to reopen this one. I think this is a valid improvement for branch-2.7 and I'm trying to backport it. I'll attach a branch-2.7 patch for Jenkins verification.

          Show
          zhz Zhe Zhang added a comment - Sorry to reopen this one. I think this is a valid improvement for branch-2.7 and I'm trying to backport it. I'll attach a branch-2.7 patch for Jenkins verification.
          Hide
          zhz Zhe Zhang added a comment -

          Quick note that 2 failures were found even without the patch. HDFS-10859

          Show
          zhz Zhe Zhang added a comment - Quick note that 2 failures were found even without the patch. HDFS-10859
          Hide
          hadoopqa Hadoop QA added a comment -
          -1 overall



          Vote Subsystem Runtime Comment
          0 reexec 11m 23s Docker mode activated.
          +1 @author 0m 0s The patch does not contain any @author tags.
          +1 test4tests 0m 0s The patch appears to include 1 new or modified test files.
          +1 mvninstall 7m 37s branch-2.7 passed
          +1 compile 0m 57s branch-2.7 passed with JDK v1.8.0_101
          +1 compile 1m 0s branch-2.7 passed with JDK v1.7.0_111
          +1 checkstyle 0m 30s branch-2.7 passed
          +1 mvnsite 1m 0s branch-2.7 passed
          +1 mvneclipse 0m 21s branch-2.7 passed
          +1 findbugs 2m 56s branch-2.7 passed
          +1 javadoc 0m 58s branch-2.7 passed with JDK v1.8.0_101
          +1 javadoc 1m 42s branch-2.7 passed with JDK v1.7.0_111
          +1 mvninstall 0m 51s the patch passed
          +1 compile 0m 55s the patch passed with JDK v1.8.0_101
          +1 javac 0m 55s the patch passed
          +1 compile 0m 58s the patch passed with JDK v1.7.0_111
          +1 javac 0m 58s the patch passed
          -0 checkstyle 0m 28s hadoop-hdfs-project/hadoop-hdfs: The patch generated 7 new + 574 unchanged - 1 fixed = 581 total (was 575)
          +1 mvnsite 0m 55s the patch passed
          +1 mvneclipse 0m 12s the patch passed
          -1 whitespace 0m 0s The patch has 1967 line(s) that end in whitespace. Use git apply --whitespace=fix <<patch_file>>. Refer https://git-scm.com/docs/git-apply
          -1 whitespace 0m 51s The patch 161 line(s) with tabs.
          +1 findbugs 3m 4s the patch passed
          +1 javadoc 0m 54s the patch passed with JDK v1.8.0_101
          +1 javadoc 1m 41s the patch passed with JDK v1.7.0_111
          -1 unit 42m 0s hadoop-hdfs in the patch failed with JDK v1.7.0_111.
          -1 asflicense 0m 19s The patch generated 3 ASF License warnings.
          128m 36s



          Reason Tests
          JDK v1.8.0_101 Failed junit tests hadoop.hdfs.web.TestWebHdfsTimeouts
            hadoop.hdfs.web.TestHttpsFileSystem
            hadoop.hdfs.server.namenode.snapshot.TestRenameWithSnapshots
          JDK v1.7.0_111 Failed junit tests hadoop.hdfs.TestDFSShell
            hadoop.hdfs.server.namenode.snapshot.TestRenameWithSnapshots



          Subsystem Report/Notes
          Docker Image:yetus/hadoop:c420dfe
          JIRA Issue HDFS-8818
          JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12828084/HDFS-8818-branch-2.7.00.patch
          Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle
          uname Linux bcf4885c725e 3.13.0-93-generic #140-Ubuntu SMP Mon Jul 18 21:21:05 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
          Build tool maven
          Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh
          git revision branch-2.7 / e807fde
          Default Java 1.7.0_111
          Multi-JDK versions /usr/lib/jvm/java-8-oracle:1.8.0_101 /usr/lib/jvm/java-7-openjdk-amd64:1.7.0_111
          findbugs v3.0.0
          checkstyle https://builds.apache.org/job/PreCommit-HDFS-Build/16711/artifact/patchprocess/diff-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt
          whitespace https://builds.apache.org/job/PreCommit-HDFS-Build/16711/artifact/patchprocess/whitespace-eol.txt
          whitespace https://builds.apache.org/job/PreCommit-HDFS-Build/16711/artifact/patchprocess/whitespace-tabs.txt
          unit https://builds.apache.org/job/PreCommit-HDFS-Build/16711/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs-jdk1.7.0_111.txt
          JDK v1.7.0_111 Test Results https://builds.apache.org/job/PreCommit-HDFS-Build/16711/testReport/
          asflicense https://builds.apache.org/job/PreCommit-HDFS-Build/16711/artifact/patchprocess/patch-asflicense-problems.txt
          modules C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs
          Console output https://builds.apache.org/job/PreCommit-HDFS-Build/16711/console
          Powered by Apache Yetus 0.4.0-SNAPSHOT http://yetus.apache.org

          This message was automatically generated.

          Show
          hadoopqa Hadoop QA added a comment - -1 overall Vote Subsystem Runtime Comment 0 reexec 11m 23s Docker mode activated. +1 @author 0m 0s The patch does not contain any @author tags. +1 test4tests 0m 0s The patch appears to include 1 new or modified test files. +1 mvninstall 7m 37s branch-2.7 passed +1 compile 0m 57s branch-2.7 passed with JDK v1.8.0_101 +1 compile 1m 0s branch-2.7 passed with JDK v1.7.0_111 +1 checkstyle 0m 30s branch-2.7 passed +1 mvnsite 1m 0s branch-2.7 passed +1 mvneclipse 0m 21s branch-2.7 passed +1 findbugs 2m 56s branch-2.7 passed +1 javadoc 0m 58s branch-2.7 passed with JDK v1.8.0_101 +1 javadoc 1m 42s branch-2.7 passed with JDK v1.7.0_111 +1 mvninstall 0m 51s the patch passed +1 compile 0m 55s the patch passed with JDK v1.8.0_101 +1 javac 0m 55s the patch passed +1 compile 0m 58s the patch passed with JDK v1.7.0_111 +1 javac 0m 58s the patch passed -0 checkstyle 0m 28s hadoop-hdfs-project/hadoop-hdfs: The patch generated 7 new + 574 unchanged - 1 fixed = 581 total (was 575) +1 mvnsite 0m 55s the patch passed +1 mvneclipse 0m 12s the patch passed -1 whitespace 0m 0s The patch has 1967 line(s) that end in whitespace. Use git apply --whitespace=fix <<patch_file>>. Refer https://git-scm.com/docs/git-apply -1 whitespace 0m 51s The patch 161 line(s) with tabs. +1 findbugs 3m 4s the patch passed +1 javadoc 0m 54s the patch passed with JDK v1.8.0_101 +1 javadoc 1m 41s the patch passed with JDK v1.7.0_111 -1 unit 42m 0s hadoop-hdfs in the patch failed with JDK v1.7.0_111. -1 asflicense 0m 19s The patch generated 3 ASF License warnings. 128m 36s Reason Tests JDK v1.8.0_101 Failed junit tests hadoop.hdfs.web.TestWebHdfsTimeouts   hadoop.hdfs.web.TestHttpsFileSystem   hadoop.hdfs.server.namenode.snapshot.TestRenameWithSnapshots JDK v1.7.0_111 Failed junit tests hadoop.hdfs.TestDFSShell   hadoop.hdfs.server.namenode.snapshot.TestRenameWithSnapshots Subsystem Report/Notes Docker Image:yetus/hadoop:c420dfe JIRA Issue HDFS-8818 JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12828084/HDFS-8818-branch-2.7.00.patch Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle uname Linux bcf4885c725e 3.13.0-93-generic #140-Ubuntu SMP Mon Jul 18 21:21:05 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux Build tool maven Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh git revision branch-2.7 / e807fde Default Java 1.7.0_111 Multi-JDK versions /usr/lib/jvm/java-8-oracle:1.8.0_101 /usr/lib/jvm/java-7-openjdk-amd64:1.7.0_111 findbugs v3.0.0 checkstyle https://builds.apache.org/job/PreCommit-HDFS-Build/16711/artifact/patchprocess/diff-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt whitespace https://builds.apache.org/job/PreCommit-HDFS-Build/16711/artifact/patchprocess/whitespace-eol.txt whitespace https://builds.apache.org/job/PreCommit-HDFS-Build/16711/artifact/patchprocess/whitespace-tabs.txt unit https://builds.apache.org/job/PreCommit-HDFS-Build/16711/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs-jdk1.7.0_111.txt JDK v1.7.0_111 Test Results https://builds.apache.org/job/PreCommit-HDFS-Build/16711/testReport/ asflicense https://builds.apache.org/job/PreCommit-HDFS-Build/16711/artifact/patchprocess/patch-asflicense-problems.txt modules C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs Console output https://builds.apache.org/job/PreCommit-HDFS-Build/16711/console Powered by Apache Yetus 0.4.0-SNAPSHOT http://yetus.apache.org This message was automatically generated.
          Hide
          hadoopqa Hadoop QA added a comment -
          -1 overall



          Vote Subsystem Runtime Comment
          0 reexec 11m 25s Docker mode activated.
          +1 @author 0m 0s The patch does not contain any @author tags.
          +1 test4tests 0m 0s The patch appears to include 1 new or modified test files.
          +1 mvninstall 7m 37s branch-2.7 passed
          +1 compile 1m 0s branch-2.7 passed with JDK v1.8.0_101
          +1 compile 1m 2s branch-2.7 passed with JDK v1.7.0_111
          +1 checkstyle 0m 30s branch-2.7 passed
          +1 mvnsite 0m 59s branch-2.7 passed
          +1 mvneclipse 0m 16s branch-2.7 passed
          +1 findbugs 2m 56s branch-2.7 passed
          +1 javadoc 0m 59s branch-2.7 passed with JDK v1.8.0_101
          +1 javadoc 1m 46s branch-2.7 passed with JDK v1.7.0_111
          +1 mvninstall 0m 51s the patch passed
          +1 compile 0m 58s the patch passed with JDK v1.8.0_101
          +1 javac 0m 58s the patch passed
          +1 compile 0m 58s the patch passed with JDK v1.7.0_111
          +1 javac 0m 58s the patch passed
          -0 checkstyle 0m 28s hadoop-hdfs-project/hadoop-hdfs: The patch generated 7 new + 574 unchanged - 1 fixed = 581 total (was 575)
          +1 mvnsite 0m 55s the patch passed
          +1 mvneclipse 0m 12s the patch passed
          -1 whitespace 0m 0s The patch has 1967 line(s) that end in whitespace. Use git apply --whitespace=fix <<patch_file>>. Refer https://git-scm.com/docs/git-apply
          -1 whitespace 0m 47s The patch 161 line(s) with tabs.
          +1 findbugs 3m 7s the patch passed
          +1 javadoc 0m 56s the patch passed with JDK v1.8.0_101
          +1 javadoc 1m 40s the patch passed with JDK v1.7.0_111
          -1 unit 43m 46s hadoop-hdfs in the patch failed with JDK v1.7.0_111.
          -1 asflicense 0m 18s The patch generated 3 ASF License warnings.
          131m 46s



          Reason Tests
          JDK v1.8.0_101 Failed junit tests hadoop.hdfs.server.namenode.snapshot.TestRenameWithSnapshots
            hadoop.hdfs.server.balancer.TestBalancer
            hadoop.hdfs.server.namenode.TestFSImageWithSnapshot
            hadoop.hdfs.server.datanode.TestBlockScanner
          JDK v1.7.0_111 Failed junit tests hadoop.hdfs.server.namenode.snapshot.TestRenameWithSnapshots
            hadoop.hdfs.server.balancer.TestBalancer



          Subsystem Report/Notes
          Docker Image:yetus/hadoop:c420dfe
          JIRA Issue HDFS-8818
          JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12828084/HDFS-8818-branch-2.7.00.patch
          Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle
          uname Linux cd07a2945a03 3.13.0-93-generic #140-Ubuntu SMP Mon Jul 18 21:21:05 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
          Build tool maven
          Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh
          git revision branch-2.7 / e807fde
          Default Java 1.7.0_111
          Multi-JDK versions /usr/lib/jvm/java-8-oracle:1.8.0_101 /usr/lib/jvm/java-7-openjdk-amd64:1.7.0_111
          findbugs v3.0.0
          checkstyle https://builds.apache.org/job/PreCommit-HDFS-Build/16712/artifact/patchprocess/diff-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt
          whitespace https://builds.apache.org/job/PreCommit-HDFS-Build/16712/artifact/patchprocess/whitespace-eol.txt
          whitespace https://builds.apache.org/job/PreCommit-HDFS-Build/16712/artifact/patchprocess/whitespace-tabs.txt
          unit https://builds.apache.org/job/PreCommit-HDFS-Build/16712/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs-jdk1.7.0_111.txt
          JDK v1.7.0_111 Test Results https://builds.apache.org/job/PreCommit-HDFS-Build/16712/testReport/
          asflicense https://builds.apache.org/job/PreCommit-HDFS-Build/16712/artifact/patchprocess/patch-asflicense-problems.txt
          modules C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs
          Console output https://builds.apache.org/job/PreCommit-HDFS-Build/16712/console
          Powered by Apache Yetus 0.4.0-SNAPSHOT http://yetus.apache.org

          This message was automatically generated.

          Show
          hadoopqa Hadoop QA added a comment - -1 overall Vote Subsystem Runtime Comment 0 reexec 11m 25s Docker mode activated. +1 @author 0m 0s The patch does not contain any @author tags. +1 test4tests 0m 0s The patch appears to include 1 new or modified test files. +1 mvninstall 7m 37s branch-2.7 passed +1 compile 1m 0s branch-2.7 passed with JDK v1.8.0_101 +1 compile 1m 2s branch-2.7 passed with JDK v1.7.0_111 +1 checkstyle 0m 30s branch-2.7 passed +1 mvnsite 0m 59s branch-2.7 passed +1 mvneclipse 0m 16s branch-2.7 passed +1 findbugs 2m 56s branch-2.7 passed +1 javadoc 0m 59s branch-2.7 passed with JDK v1.8.0_101 +1 javadoc 1m 46s branch-2.7 passed with JDK v1.7.0_111 +1 mvninstall 0m 51s the patch passed +1 compile 0m 58s the patch passed with JDK v1.8.0_101 +1 javac 0m 58s the patch passed +1 compile 0m 58s the patch passed with JDK v1.7.0_111 +1 javac 0m 58s the patch passed -0 checkstyle 0m 28s hadoop-hdfs-project/hadoop-hdfs: The patch generated 7 new + 574 unchanged - 1 fixed = 581 total (was 575) +1 mvnsite 0m 55s the patch passed +1 mvneclipse 0m 12s the patch passed -1 whitespace 0m 0s The patch has 1967 line(s) that end in whitespace. Use git apply --whitespace=fix <<patch_file>>. Refer https://git-scm.com/docs/git-apply -1 whitespace 0m 47s The patch 161 line(s) with tabs. +1 findbugs 3m 7s the patch passed +1 javadoc 0m 56s the patch passed with JDK v1.8.0_101 +1 javadoc 1m 40s the patch passed with JDK v1.7.0_111 -1 unit 43m 46s hadoop-hdfs in the patch failed with JDK v1.7.0_111. -1 asflicense 0m 18s The patch generated 3 ASF License warnings. 131m 46s Reason Tests JDK v1.8.0_101 Failed junit tests hadoop.hdfs.server.namenode.snapshot.TestRenameWithSnapshots   hadoop.hdfs.server.balancer.TestBalancer   hadoop.hdfs.server.namenode.TestFSImageWithSnapshot   hadoop.hdfs.server.datanode.TestBlockScanner JDK v1.7.0_111 Failed junit tests hadoop.hdfs.server.namenode.snapshot.TestRenameWithSnapshots   hadoop.hdfs.server.balancer.TestBalancer Subsystem Report/Notes Docker Image:yetus/hadoop:c420dfe JIRA Issue HDFS-8818 JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12828084/HDFS-8818-branch-2.7.00.patch Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle uname Linux cd07a2945a03 3.13.0-93-generic #140-Ubuntu SMP Mon Jul 18 21:21:05 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux Build tool maven Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh git revision branch-2.7 / e807fde Default Java 1.7.0_111 Multi-JDK versions /usr/lib/jvm/java-8-oracle:1.8.0_101 /usr/lib/jvm/java-7-openjdk-amd64:1.7.0_111 findbugs v3.0.0 checkstyle https://builds.apache.org/job/PreCommit-HDFS-Build/16712/artifact/patchprocess/diff-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt whitespace https://builds.apache.org/job/PreCommit-HDFS-Build/16712/artifact/patchprocess/whitespace-eol.txt whitespace https://builds.apache.org/job/PreCommit-HDFS-Build/16712/artifact/patchprocess/whitespace-tabs.txt unit https://builds.apache.org/job/PreCommit-HDFS-Build/16712/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs-jdk1.7.0_111.txt JDK v1.7.0_111 Test Results https://builds.apache.org/job/PreCommit-HDFS-Build/16712/testReport/ asflicense https://builds.apache.org/job/PreCommit-HDFS-Build/16712/artifact/patchprocess/patch-asflicense-problems.txt modules C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs Console output https://builds.apache.org/job/PreCommit-HDFS-Build/16712/console Powered by Apache Yetus 0.4.0-SNAPSHOT http://yetus.apache.org This message was automatically generated.
          Hide
          zhz Zhe Zhang added a comment -

          Committed to branch-2.7. Pre-existing flaky tests in TestBalancer are being addressed in HDFS-10859.

          Show
          zhz Zhe Zhang added a comment - Committed to branch-2.7. Pre-existing flaky tests in TestBalancer are being addressed in HDFS-10859 .
          Hide
          kihwal Kihwal Lee added a comment -

          -1. Proposing revert of this untested change.

          We've run it on a small (to us) 280 node cluster. It moves a little for a couple of iterations and hangs forever. So, contrary to the title, it becomes extremely slow. Blindly creating a thread pool per datanode is a bad idea.

            public void executePendingMove(final PendingMove p) {
              // move the block
              final DDatanode targetDn = p.target.getDDatanode();
              ExecutorService moveExecutor = targetDn.getMoveExecutor();
              if (moveExecutor == null) {
                final int nThreads = moverThreadAllocator.allocate(maxConcurrentMovesPerNode);
                if (nThreads > 0) {
                  moveExecutor = targetDn.initMoveExecutor(nThreads);
                }
              }
              if (moveExecutor == null) {
                LOG.warn("No mover threads available: skip moving " + p);
                return;
              }
          

          By default, this causes a pool of 50 threads to be created per target and at some point no more can be created. If this happens, the pending move is simply skipped. This causes eternal hang at waitForMoveCompletion(), because the pending moves are not removed. I saw 25 thread pools with 50 threads in the 280-node cluster in this hang state.

          IMO, creating more threads is not a scalable solution. Somehow allowing more thread pool creation is definitely not a solution either.

          Show
          kihwal Kihwal Lee added a comment - -1. Proposing revert of this untested change. We've run it on a small (to us) 280 node cluster. It moves a little for a couple of iterations and hangs forever. So, contrary to the title, it becomes extremely slow. Blindly creating a thread pool per datanode is a bad idea. public void executePendingMove( final PendingMove p) { // move the block final DDatanode targetDn = p.target.getDDatanode(); ExecutorService moveExecutor = targetDn.getMoveExecutor(); if (moveExecutor == null ) { final int nThreads = moverThreadAllocator.allocate(maxConcurrentMovesPerNode); if (nThreads > 0) { moveExecutor = targetDn.initMoveExecutor(nThreads); } } if (moveExecutor == null ) { LOG.warn( "No mover threads available: skip moving " + p); return ; } By default, this causes a pool of 50 threads to be created per target and at some point no more can be created. If this happens, the pending move is simply skipped. This causes eternal hang at waitForMoveCompletion() , because the pending moves are not removed. I saw 25 thread pools with 50 threads in the 280-node cluster in this hang state. IMO, creating more threads is not a scalable solution. Somehow allowing more thread pool creation is definitely not a solution either.
          Hide
          jojochuang Wei-Chiu Chuang added a comment - - edited

          Kihwal Lee
          FWIW, the hang you mentioned was fixed in HDFS-11377.

          Show
          jojochuang Wei-Chiu Chuang added a comment - - edited Kihwal Lee FWIW, the hang you mentioned was fixed in HDFS-11377 .
          Hide
          kihwal Kihwal Lee added a comment -

          Thanks for the update, Wei-Chiu Chuang. That fixes the hang, but the design is still flawed. Move decision is made then thread pools are created, which likely only cover a subset of targets in the schedule. Also, the schedule will less likely make use of all 50 threads anyway. In any case, the moves will all pile up to the first 25 targets (in our case, 25 thread pools seemed to be the limit). I don't think it will be faster than the previous one.

          Show
          kihwal Kihwal Lee added a comment - Thanks for the update, Wei-Chiu Chuang . That fixes the hang, but the design is still flawed. Move decision is made then thread pools are created, which likely only cover a subset of targets in the schedule. Also, the schedule will less likely make use of all 50 threads anyway. In any case, the moves will all pile up to the first 25 targets (in our case, 25 thread pools seemed to be the limit). I don't think it will be faster than the previous one.
          Hide
          kihwal Kihwal Lee added a comment -

          Reopening for revert. git won't revert it cleanly since there have been other changes since the commit. I manually reverted the changes, except for some of the non-core changes for configs and logging. Those seem to worth keeping. The balancer with the revert is being tested on a 2,400 node cluster.

          Show
          kihwal Kihwal Lee added a comment - Reopening for revert. git won't revert it cleanly since there have been other changes since the commit. I manually reverted the changes, except for some of the non-core changes for configs and logging. Those seem to worth keeping. The balancer with the revert is being tested on a 2,400 node cluster.
          Hide
          andrew.wang Andrew Wang added a comment -

          Can we open a new JIRA to track the revert in the changelog? This one has already gone out in some releases.

          Show
          andrew.wang Andrew Wang added a comment - Can we open a new JIRA to track the revert in the changelog? This one has already gone out in some releases.
          Hide
          kihwal Kihwal Lee added a comment -

          Attaching patches for revert of the main body of the change. It leaves config and logging changes as is.

          Show
          kihwal Kihwal Lee added a comment - Attaching patches for revert of the main body of the change. It leaves config and logging changes as is.
          Hide
          kihwal Kihwal Lee added a comment -

          I will open a new one and post the patch there.

          Show
          kihwal Kihwal Lee added a comment - I will open a new one and post the patch there.
          Hide
          szetszwo Tsz Wo Nicholas Sze added a comment -

          Hi Kihwal Lee, let's have a more detailed discussion before reverting.

          The patch here (a thread pool per datanode pair) is indeed an improvement for the previous design (a global thread pool) since it limits the number of threads assigned to a particular datanode pair. Previously, if the first datanode pair has a lot of pending moves, all the threads will be used to execute the moves for that pair so that it will be very slow since it cannot utilize the entire network.

          We also has tested the new code a lot and see significant performance improvement.

          Have you tested it with HDFS-11377?

          Show
          szetszwo Tsz Wo Nicholas Sze added a comment - Hi Kihwal Lee , let's have a more detailed discussion before reverting. The patch here (a thread pool per datanode pair) is indeed an improvement for the previous design (a global thread pool) since it limits the number of threads assigned to a particular datanode pair. Previously, if the first datanode pair has a lot of pending moves, all the threads will be used to execute the moves for that pair so that it will be very slow since it cannot utilize the entire network. We also has tested the new code a lot and see significant performance improvement. Have you tested it with HDFS-11377 ?
          Hide
          kihwal Kihwal Lee added a comment -

          We see this problem in EVERY single cluster we tried this on.

          The first big three bumps (20 min each) is 2.8 with this change reverted. The following small ones are the result of non-reverted 2.8 with the hang fix (HDFS-11377). It is still no where near what it used to be.

          This is the stock 2.8 balancer without HDFS-11377.

          I don't doubt that it worked great for the case you designed and tested for. However, please do realize that it will be a regression for many other users. I am not denying the shortcomings of the existing design. But the new design has clear issues, which cannot be worked around with config tweaks. Even if it works better for some cases, it is a regression.

          Show
          kihwal Kihwal Lee added a comment - We see this problem in EVERY single cluster we tried this on. The first big three bumps (20 min each) is 2.8 with this change reverted. The following small ones are the result of non-reverted 2.8 with the hang fix ( HDFS-11377 ). It is still no where near what it used to be. This is the stock 2.8 balancer without HDFS-11377 . I don't doubt that it worked great for the case you designed and tested for. However, please do realize that it will be a regression for many other users. I am not denying the shortcomings of the existing design. But the new design has clear issues, which cannot be worked around with config tweaks. Even if it works better for some cases, it is a regression.
          Hide
          szetszwo Tsz Wo Nicholas Sze added a comment -

          > ... However, please do realize that it will be a regression for many other users. ...

          I believe this statement is incorrect. The new design is more flexible than the previous one since we can control the number of thread per datanode pair.

          BTW, using the default configuration may not give us performance since we want the balancing activities not affecting the normal cluster activities and the running jobs.

          > ... which cannot be worked around with config tweaks. ...

          What are the config used exactly?

          Show
          szetszwo Tsz Wo Nicholas Sze added a comment - > ... However, please do realize that it will be a regression for many other users. ... I believe this statement is incorrect. The new design is more flexible than the previous one since we can control the number of thread per datanode pair. BTW, using the default configuration may not give us performance since we want the balancing activities not affecting the normal cluster activities and the running jobs. > ... which cannot be worked around with config tweaks. ... What are the config used exactly?
          Hide
          daryn Daryn Sharp added a comment -

          The new design is more flexible than the previous one since we can control the number of thread per datanode pair.

          That sounds good on paper but flexibility does not negate the fact it's proven not to scale. I'm sure the redesign works great on a couple dozen node cluster. As illustrated by Kihwal, it limps along on a 280 node cluster running slower than before and is virtually unusable on multi-thousand node clusters even with HDFS-11377.

          This has to be fixed in a manner that restores previous performance or be reverted. A jira touting "run faster" can't make the balancer slower and unfit for production...

          Show
          daryn Daryn Sharp added a comment - The new design is more flexible than the previous one since we can control the number of thread per datanode pair. That sounds good on paper but flexibility does not negate the fact it's proven not to scale. I'm sure the redesign works great on a couple dozen node cluster. As illustrated by Kihwal, it limps along on a 280 node cluster running slower than before and is virtually unusable on multi-thousand node clusters even with HDFS-11377 . This has to be fixed in a manner that restores previous performance or be reverted. A jira touting "run faster" can't make the balancer slower and unfit for production...
          Hide
          kihwal Kihwal Lee added a comment - - edited

          I set dfs.datanode.balance.max.concurrent.moves to 15, which is what we used to run with 2.7. 71 thread pools with 15 threads in each created. The 72th one is small and then started to fail. So I still see "skipping..." message in the log. The throughput is still visibly lower, after a minute of initial spike. The iteration lasts a bit longer, but the average throughput is still similarly low. Other config values are all set to default.

          I believe I have presented enough evidence.

          Show
          kihwal Kihwal Lee added a comment - - edited I set dfs.datanode.balance.max.concurrent.moves to 15, which is what we used to run with 2.7. 71 thread pools with 15 threads in each created. The 72th one is small and then started to fail. So I still see "skipping..." message in the log. The throughput is still visibly lower, after a minute of initial spike. The iteration lasts a bit longer, but the average throughput is still similarly low. Other config values are all set to default. I believe I have presented enough evidence.
          Hide
          szetszwo Tsz Wo Nicholas Sze added a comment - - edited

          15*72 = 1080. I guess you were using default dfs.balancer.moverThreads, which is 1,000. Have you tried to increased it, say 10,000 or 30,000?

          Show
          szetszwo Tsz Wo Nicholas Sze added a comment - - edited 15*72 = 1080. I guess you were using default dfs.balancer.moverThreads, which is 1,000. Have you tried to increased it, say 10,000 or 30,000?
          Hide
          szetszwo Tsz Wo Nicholas Sze added a comment -

          > ... I'm sure the redesign works great on a couple dozen node cluster. ...

          We never design Hadoop to ONLY work great on a couple dozen node cluster. We did have tested this with 500-node cluster. The performance did have been improved around 100x.

          Show
          szetszwo Tsz Wo Nicholas Sze added a comment - > ... I'm sure the redesign works great on a couple dozen node cluster. ... We never design Hadoop to ONLY work great on a couple dozen node cluster. We did have tested this with 500-node cluster. The performance did have been improved around 100x.
          Hide
          szetszwo Tsz Wo Nicholas Sze added a comment -

          Kihwal Lee, the metric replaceblockoperationspersec may not directly reflect the actual performance. Do you have HDFS-8824 in your runs? I suspect the first run has it but the second one does not.

          What are the values shown in the Balancer output?

          Show
          szetszwo Tsz Wo Nicholas Sze added a comment - Kihwal Lee , the metric replaceblockoperationspersec may not directly reflect the actual performance. Do you have HDFS-8824 in your runs? I suspect the first run has it but the second one does not. What are the values shown in the Balancer output?
          Hide
          kihwal Kihwal Lee added a comment -

          Do you have HDFS-8824 in your runs? I suspect the first run has it but the second one does not.

          It is the up-to-date branch-2.8, so all runs had it. About HDFS-8824, you do realize that over time older nodes will end up with only small blocks, if it is set permanently? It will look good for quick balancing, but may not be good in long term. We run with the min block size set to 1.

          The performance did have been improved around 100x.

          Can you reveal more details on the nature of the testing? It is unrealistic to expect 100x in our typical use case with the base line being 2.7.
          What was your config when you tested on the 500 node cluster? What was the nature of imbalance? Did the default values work? If not, how did you get there? Do you expect regular users to easily get there? At what point did you hit HDFS-11377?

          Show
          kihwal Kihwal Lee added a comment - Do you have HDFS-8824 in your runs? I suspect the first run has it but the second one does not. It is the up-to-date branch-2.8, so all runs had it. About HDFS-8824 , you do realize that over time older nodes will end up with only small blocks, if it is set permanently? It will look good for quick balancing, but may not be good in long term. We run with the min block size set to 1. The performance did have been improved around 100x. Can you reveal more details on the nature of the testing? It is unrealistic to expect 100x in our typical use case with the base line being 2.7. What was your config when you tested on the 500 node cluster? What was the nature of imbalance? Did the default values work? If not, how did you get there? Do you expect regular users to easily get there? At what point did you hit HDFS-11377 ?
          Hide
          daryn Daryn Sharp added a comment -

          Rather than creating fixed thread pools which will be idle as cluster size increases, perhaps cached thread pools that spawn dynamically would help.

          The previous balancer was easy to configure. I don't fully understand the previous design but a simpler approach that achieves the same improvement would be returning to a single fixed thread pool - with intelligent queuing of work. Ie. interleaving work for all targets, with a max queued limit, so replications are distributed evenly across nodes. I'm assuming it didn't do that.

          Do you have HDFS-8824 in your runs? I suspect the first run has it but the second one does not.

          over time older nodes will end up with only small blocks, if it is set permanently? It will look good for quick balancing, but may not be good in long term

          Exactly. We had to disable the feature because nodes become concentrated with small blocks. getBlocks becomes increasing expensive as it searches for a dwindling number of large blocks on unbalanced nodes. The client load increases on those nodes due to block volume. Eventually the balancer just plays a shell game moving the larger blocks.

          The current balancer probably works great when adding nodes, but not as a continuous service. If not reverted, something has to be done to restore previous steady state performance.

          Show
          daryn Daryn Sharp added a comment - Rather than creating fixed thread pools which will be idle as cluster size increases, perhaps cached thread pools that spawn dynamically would help. The previous balancer was easy to configure. I don't fully understand the previous design but a simpler approach that achieves the same improvement would be returning to a single fixed thread pool - with intelligent queuing of work. Ie. interleaving work for all targets, with a max queued limit, so replications are distributed evenly across nodes. I'm assuming it didn't do that. Do you have HDFS-8824 in your runs? I suspect the first run has it but the second one does not. over time older nodes will end up with only small blocks, if it is set permanently? It will look good for quick balancing, but may not be good in long term Exactly. We had to disable the feature because nodes become concentrated with small blocks. getBlocks becomes increasing expensive as it searches for a dwindling number of large blocks on unbalanced nodes. The client load increases on those nodes due to block volume. Eventually the balancer just plays a shell game moving the larger blocks. The current balancer probably works great when adding nodes, but not as a continuous service. If not reverted, something has to be done to restore previous steady state performance.
          Hide
          szetszwo Tsz Wo Nicholas Sze added a comment -

          > Can you reveal more details on the nature of the testing? It is unrealistic to expect 100x in our typical use case with the base line being 2.7.

          In our tests, we ran balancer over a 500-node cluster. We were only able to get ~5GB per minute before. Then, we were able to get ~500GB per minute after a serious of balancer improvement including this. This JIRA is the most critical since, without this, balancer schedules most the moves in the first few datanode pairs and the remaining datanodes are mostly idle.

          Below are the confs:

          • Datanode
            dfs.datanode.balance.max.concurrent.moves: 4 x #disks
            dfs.datanode.balance.bandwidthPerSec: 10737418240 (=10GB)
          • Balancer
            dfs.datanode.balance.max.concurrent.moves: 4 x #disks
            dfs.balancer.moverThreads: 20,000
            dfs.balancer.max-size-to-move: 107374182400 (=100GB)
            dfs.balancer.getBlocks.min-block-size: 104857600 (=100MB)
          Show
          szetszwo Tsz Wo Nicholas Sze added a comment - > Can you reveal more details on the nature of the testing? It is unrealistic to expect 100x in our typical use case with the base line being 2.7. In our tests, we ran balancer over a 500-node cluster. We were only able to get ~5GB per minute before. Then, we were able to get ~500GB per minute after a serious of balancer improvement including this. This JIRA is the most critical since, without this, balancer schedules most the moves in the first few datanode pairs and the remaining datanodes are mostly idle. Below are the confs: Datanode dfs.datanode.balance.max.concurrent.moves: 4 x #disks dfs.datanode.balance.bandwidthPerSec: 10737418240 (=10GB) Balancer dfs.datanode.balance.max.concurrent.moves: 4 x #disks dfs.balancer.moverThreads: 20,000 dfs.balancer.max-size-to-move: 107374182400 (=100GB) dfs.balancer.getBlocks.min-block-size: 104857600 (=100MB)
          Hide
          kihwal Kihwal Lee added a comment -

          I will post a patch for improvement tomorrow. I am letting it run thought the night.

          Show
          kihwal Kihwal Lee added a comment - I will post a patch for improvement tomorrow. I am letting it run thought the night.
          Hide
          kihwal Kihwal Lee added a comment -

          Posted a patch to HDFS-11742. Hopefully, this will prevent users from experiencing performance degradation and take the benefit of the per-target thread pool. I have test it overnight yesterday and it seems good. Now I see throughput improvement even with the default config values.

          Show
          kihwal Kihwal Lee added a comment - Posted a patch to HDFS-11742 . Hopefully, this will prevent users from experiencing performance degradation and take the benefit of the per-target thread pool. I have test it overnight yesterday and it seems good. Now I see throughput improvement even with the default config values.
          Hide
          hudson Hudson added a comment -

          SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #12043 (See https://builds.apache.org/job/Hadoop-trunk-Commit/12043/)
          HDFS-11742. Improve balancer usability after HDFS-8818. Contributed by (kihwal: rev 8e3a992eccff26a7344c3f0e719898fa97706b8c)

          • (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/balancer/Dispatcher.java
          Show
          hudson Hudson added a comment - SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #12043 (See https://builds.apache.org/job/Hadoop-trunk-Commit/12043/ ) HDFS-11742 . Improve balancer usability after HDFS-8818 . Contributed by (kihwal: rev 8e3a992eccff26a7344c3f0e719898fa97706b8c) (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/balancer/Dispatcher.java

            People

            • Assignee:
              szetszwo Tsz Wo Nicholas Sze
              Reporter:
              szetszwo Tsz Wo Nicholas Sze
            • Votes:
              0 Vote for this issue
              Watchers:
              19 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development