HDFS-12044 (Hadoop HDFS)

Mismatch between BlockManager#maxReplicationStreams and ErasureCodingWorker.stripedReconstructionPool pool size causes slow and bursty recovery

    Details

      Description

      By default, ErasureCodingWorker#stripedReconstructionPool has corePoolSize=2 and maxPoolSize=8, and it rejects further tasks once its queue is full.
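To make the rejection behaviour concrete, here is a minimal sketch using plain java.util.concurrent. It is not the ErasureCodingWorker code: the class name BoundedPoolSketch, the ArrayBlockingQueue with an explicit capacity, and the AbortPolicy handler are assumptions chosen only to show how a bounded ThreadPoolExecutor starts refusing work during a burst.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration; not taken from the Hadoop source tree.
class BoundedPoolSketch {
  /** Submits {@code burst} long-running tasks and returns how many were rejected. */
  static int countRejected(int core, int max, int queueCapacity, int burst) {
    CountDownLatch release = new CountDownLatch(1);
    ThreadPoolExecutor pool = new ThreadPoolExecutor(
        core, max, 60, TimeUnit.SECONDS,
        new ArrayBlockingQueue<>(queueCapacity),   // assumed bounded queue
        new ThreadPoolExecutor.AbortPolicy());     // assumed rejection policy
    int rejected = 0;
    try {
      for (int i = 0; i < burst; i++) {
        try {
          pool.execute(() -> {
            try {
              release.await();   // stands in for a slow reconstruction task
            } catch (InterruptedException e) {
              Thread.currentThread().interrupt();
            }
          });
        } catch (RejectedExecutionException e) {
          rejected++;            // threads and queue are both full
        }
      }
    } finally {
      release.countDown();       // let the blocked tasks finish
      pool.shutdown();
    }
    return rejected;
  }
}
```

A ThreadPoolExecutor fills its core threads first, then the queue, then grows to the maximum pool size, and only then rejects. Under these assumed settings the pool can hold max + queueCapacity tasks at once, so the smaller the queue, the larger the share of a burst that is thrown away.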

      BlockManager#maxReplicationStreams can be larger than the corePoolSize/maxPoolSize of ErasureCodingWorker#stripedReconstructionPool, for example maxReplicationStreams=20 with corePoolSize=2 and maxPoolSize=8. Meanwhile, the NN sends up to maxTransfer reconstruction tasks to a DN on each heartbeat, where maxTransfer is calculated in FSNamesystem:

      final int maxTransfer = blockManager.getMaxReplicationStreams() - xmitsInProgress;
      

      However, at any given time, ErasureCodingWorker#stripedReconstructionPool takes up only 2 xmitsInProgress. So on each heartbeat (every 3s), the NN sends about 20 - 2 = 18 reconstruction tasks to the DN, and the DN throws away most of them if there are already 8 tasks in the queue. The NN then takes longer to notice that these blocks are still under-replicated and to schedule new tasks.
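The arithmetic in this description can be written out as a small worked example. The maxTransfer formula is the one quoted from FSNamesystem; the class name HeartbeatMathSketch, the dropped helper, and the idea of a fixed number of free slots on the DN are assumptions made purely for illustration.

```java
// Hypothetical worked example of the per-heartbeat numbers in the description.
class HeartbeatMathSketch {
  /** Same formula as the FSNamesystem line quoted in the description. */
  static int maxTransfer(int maxReplicationStreams, int xmitsInProgress) {
    return maxReplicationStreams - xmitsInProgress;
  }

  /** Tasks the DN cannot accept, given an assumed number of free slots. */
  static int dropped(int sent, int freeSlots) {
    return Math.max(0, sent - Math.max(0, freeSlots));
  }
}
```

With maxReplicationStreams=20 and xmitsInProgress=2 the NN may send 18 tasks in one heartbeat. If the DN is assumed to have room for only 6 more, 12 of those 18 are dropped and have to wait for the NN to reschedule them.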

      1. HDFS-12044.00.patch
        1 kB
        Lei (Eddy) Xu
      2. HDFS-12044.01.patch
        5 kB
        Lei (Eddy) Xu
      3. HDFS-12044.02.patch
        8 kB
        Lei (Eddy) Xu
      4. HDFS-12044.03.patch
        14 kB
        Lei (Eddy) Xu
      5. HDFS-12044.04.patch
        17 kB
        Lei (Eddy) Xu
      6. HDFS-12044.05.patch
        17 kB
        Lei (Eddy) Xu

        Issue Links

          Activity

          eddyxu Lei (Eddy) Xu added a comment -

          There are two potential approaches to this.

          • Allow ErasureCodingWorker to accept an unbounded number of reconstruction tasks. The reconstruction worker for regular replicated files accepts an unbounded number of reconstruction tasks as well.
          • Make DFS_NAMENODE_REPLICATION_MAX_STREAMS_KEY and DFS_DN_EC_RECONSTRUCTION_STRIPED_BLK_THREADS_KEY the same, i.e., share one key and value.

          Attached a patch for the first approach, which is simpler.

          hadoopqa Hadoop QA added a comment -
          -1 overall



          Vote Subsystem Runtime Comment
          0 reexec 0m 13s Docker mode activated.
          +1 @author 0m 0s The patch does not contain any @author tags.
          -1 test4tests 0m 0s The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.
          +1 mvninstall 12m 42s trunk passed
          +1 compile 0m 42s trunk passed
          +1 checkstyle 0m 30s trunk passed
          +1 mvnsite 0m 49s trunk passed
          +1 findbugs 1m 34s trunk passed
          +1 javadoc 0m 37s trunk passed
          +1 mvninstall 0m 50s the patch passed
          +1 compile 0m 41s the patch passed
          +1 javac 0m 41s the patch passed
          +1 checkstyle 0m 28s the patch passed
          +1 mvnsite 0m 45s the patch passed
          +1 whitespace 0m 0s The patch has no whitespace issues.
          +1 findbugs 1m 36s the patch passed
          +1 javadoc 0m 42s the patch passed
          -1 unit 66m 23s hadoop-hdfs in the patch failed.
          +1 asflicense 0m 16s The patch does not generate ASF License warnings.
          90m 4s



          Reason Tests
          Failed junit tests hadoop.hdfs.web.TestWebHdfsTimeouts
            hadoop.hdfs.server.namenode.TestNamenodeCapacityReport
          Timed out junit tests org.apache.hadoop.hdfs.server.blockmanagement.TestBlockStatsMXBean



          Subsystem Report/Notes
          Docker Image:yetus/hadoop:14b5c93
          JIRA Issue HDFS-12044
          JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12874764/HDFS-12044.00.patch
          Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle
          uname Linux a8178b113a85 4.4.0-43-generic #63-Ubuntu SMP Wed Oct 12 13:48:03 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
          Build tool maven
          Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh
          git revision trunk / 63ce159
          Default Java 1.8.0_131
          findbugs v3.1.0-RC1
          unit https://builds.apache.org/job/PreCommit-HDFS-Build/20064/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
          Test Results https://builds.apache.org/job/PreCommit-HDFS-Build/20064/testReport/
          modules C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs
          Console output https://builds.apache.org/job/PreCommit-HDFS-Build/20064/console
          Powered by Apache Yetus 0.5.0-SNAPSHOT http://yetus.apache.org

          This message was automatically generated.

          eddyxu Lei (Eddy) Xu added a comment -

          Added a unit test for approach 1.

          andrew.wang Andrew Wang added a comment -

          Hi Eddy, thanks for working on this,

          IIUC the intent of this patch is to remove limiting on the DN side, since the NN already has limits. However, I don't follow all the changes. The max threads is changed to Integer.MAX_VALUE, but I think it still makes sense to limit the degree of parallelism. Can we increase the queue size? Is the NN sending duplicate tasks for items already in the queue?

          eddyxu Lei (Eddy) Xu added a comment -

          Hi, Andrew Wang

          The NN limits the rate at which it puts such tasks into the DN's queue to maxReplicationStreams - xmitsInProgress per heartbeat, so when the DN has active reconstruction tasks counted in xmitsInProgress, the NN adaptively slows down enqueueing and the queue length does not grow without bound. I think this is how non-EC block recovery works today without an Executor.

          I agree that explicitly limiting the parallelism in the Executor provides a stronger guarantee, so patch 02 uses an unbounded queue in the Executor to hold submitted reconstruction tasks. Do you think that is sufficient, Andrew Wang?
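One way to combine bounded parallelism with an unbounded task queue in plain java.util.concurrent is sketched below. This is not the code from patch 02, whose exact pool settings are not shown in this thread; the class name UnboundedQueuePoolSketch and both methods are hypothetical. The sketch relies on a documented ThreadPoolExecutor property: with an unbounded LinkedBlockingQueue the pool never grows beyond corePoolSize, so the parallelism cap has to be expressed through corePoolSize.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration; not the pool configuration used by the patch.
class UnboundedQueuePoolSketch {
  /** A pool that runs at most {@code parallelism} tasks and queues the rest. */
  static ThreadPoolExecutor newPool(int parallelism) {
    ThreadPoolExecutor pool = new ThreadPoolExecutor(
        parallelism, parallelism, 60, TimeUnit.SECONDS,
        new LinkedBlockingQueue<>());   // unbounded: submissions are never rejected
    pool.allowCoreThreadTimeOut(true);  // idle threads may exit after 60s
    return pool;
  }

  /** Returns {threads created, tasks waiting} after a burst of blocking tasks. */
  static int[] burst(int parallelism, int tasks) {
    CountDownLatch release = new CountDownLatch(1);
    ThreadPoolExecutor pool = newPool(parallelism);
    try {
      for (int i = 0; i < tasks; i++) {
        pool.execute(() -> {
          try {
            release.await();            // stands in for a slow reconstruction
          } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
          }
        });
      }
      return new int[] {pool.getPoolSize(), pool.getQueue().size()};
    } finally {
      release.countDown();
      pool.shutdown();
    }
  }
}
```

With parallelism 8 and a burst of 18 tasks, 8 threads are created and the other 10 tasks wait in the queue, so nothing is dropped. The trade-off is that the queue length is then limited only by how fast the NN hands out work.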

          hadoopqa Hadoop QA added a comment -
          -1 overall



          Vote Subsystem Runtime Comment
          0 reexec 0m 0s Docker mode activated.
          -1 patch 0m 5s HDFS-12044 does not apply to trunk. Rebase required? Wrong Branch? See https://wiki.apache.org/hadoop/HowToContribute for help.



          Subsystem Report/Notes
          JIRA Issue HDFS-12044
          JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12874965/HDFS-12044.02.patch
          Console output https://builds.apache.org/job/PreCommit-HDFS-Build/20083/console
          Powered by Apache Yetus 0.5.0-SNAPSHOT http://yetus.apache.org

          This message was automatically generated.

          eddyxu Lei (Eddy) Xu added a comment -

          Rebased and re-uploaded patch 02.

          hadoopqa Hadoop QA added a comment -
          -1 overall



          Vote Subsystem Runtime Comment
          0 reexec 0m 12s Docker mode activated.
          +1 @author 0m 0s The patch does not contain any @author tags.
          +1 test4tests 0m 0s The patch appears to include 1 new or modified test files.
          0 mvndep 0m 33s Maven dependency ordering for branch
          +1 mvninstall 16m 44s trunk passed
          +1 compile 1m 46s trunk passed
          +1 checkstyle 0m 48s trunk passed
          +1 mvnsite 1m 57s trunk passed
          +1 findbugs 3m 49s trunk passed
          +1 javadoc 1m 15s trunk passed
          0 mvndep 0m 9s Maven dependency ordering for patch
          +1 mvninstall 1m 45s the patch passed
          +1 compile 1m 54s the patch passed
          +1 javac 1m 54s the patch passed
          -0 checkstyle 0m 51s hadoop-hdfs-project: The patch generated 1 new + 29 unchanged - 0 fixed = 30 total (was 29)
          +1 mvnsite 1m 50s the patch passed
          +1 whitespace 0m 0s The patch has no whitespace issues.
          +1 findbugs 3m 59s the patch passed
          +1 javadoc 1m 8s the patch passed
          +1 unit 1m 24s hadoop-hdfs-client in the patch passed.
          -1 unit 74m 14s hadoop-hdfs in the patch failed.
          +1 asflicense 0m 21s The patch does not generate ASF License warnings.
          116m 28s



          Reason Tests
          Failed junit tests hadoop.hdfs.server.namenode.ha.TestEditLogsDuringFailover
            hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting



          Subsystem Report/Notes
          Docker Image:yetus/hadoop:14b5c93
          JIRA Issue HDFS-12044
          JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12874969/HDFS-12044.02.patch
          Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle
          uname Linux 0a8940a1337a 3.13.0-119-generic #166-Ubuntu SMP Wed May 3 12:18:55 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
          Build tool maven
          Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh
          git revision trunk / 990aa34
          Default Java 1.8.0_131
          findbugs v3.1.0-RC1
          checkstyle https://builds.apache.org/job/PreCommit-HDFS-Build/20084/artifact/patchprocess/diff-checkstyle-hadoop-hdfs-project.txt
          unit https://builds.apache.org/job/PreCommit-HDFS-Build/20084/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
          Test Results https://builds.apache.org/job/PreCommit-HDFS-Build/20084/testReport/
          modules C: hadoop-hdfs-project/hadoop-hdfs-client hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project
          Console output https://builds.apache.org/job/PreCommit-HDFS-Build/20084/console
          Powered by Apache Yetus 0.5.0-SNAPSHOT http://yetus.apache.org

          This message was automatically generated.

          andrew.wang Andrew Wang added a comment -

          Thanks for the explanation Eddy, very helpful. I refreshed myself on the block reconstruction process. A review for myself and other watchers:

          • BlockManager calculates reconstruction work and places tasks in the DatanodeDescriptor's queue, and also in PendingReconstruction work.
          • handleHeartbeat polls the DD queue and gives tasks to the DN on every heartbeat, based on maxReplicationStreams - xmitsInProgress.
          • PendingReconstruction will retry timed-out tasks after 5 minutes.

          The core of the current issue is that the DN is refusing work from the NN because it exceeds numReconstructionThreads, and the NN keeps assigning more work because it thinks the DN has additional xmit capacity.

          I think the basic fix here is to increment xmitsInProgress even for queued reconstruction work. This way maxReplicationStreams - xmitsInProgress will eventually be <= 0 and the NN will stop giving more work. Otherwise the queue will not converge.
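The convergence argument can be checked with a toy model. The sketch below is not Hadoop code: the class XmitAccountingSketch is hypothetical, and it assumes the worst case where the DN queue is unbounded and no task finishes during the simulated heartbeats. It compares a DN that reports only running tasks as xmitsInProgress with one that also counts queued tasks.

```java
// Hypothetical toy model of NN/DN heartbeat accounting; not Hadoop code.
class XmitAccountingSketch {
  /**
   * Returns how many tasks the DN holds (running plus queued) after the
   * given number of heartbeats, assuming no task completes in that time.
   */
  static int outstandingAfter(int heartbeats, int maxStreams,
                              int parallelism, boolean countQueued) {
    int outstanding = 0;
    for (int hb = 0; hb < heartbeats; hb++) {
      int running = Math.min(outstanding, parallelism);
      // What the DN reports to the NN as xmitsInProgress.
      int reported = countQueued ? outstanding : running;
      int maxTransfer = maxStreams - reported;
      if (maxTransfer > 0) {
        outstanding += maxTransfer;   // NN hands out more work
      }
    }
    return outstanding;
  }
}
```

With maxStreams=20 and parallelism=2 over 5 heartbeats, the running-only report leaves 92 tasks piled up on the DN (20 on the first heartbeat, then 18 on each of the next four), while counting queued tasks stops at 20 because maxReplicationStreams - xmitsInProgress reaches zero after the first heartbeat.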

          I also wonder about the relative weights of EC and replicated reconstruction. 20 EC reconstruction tasks is a different amount of work than 20 re-replication tasks. We should be counting each block reader in StripedReconstructor as its own xmit, i.e. an RS(10,4) recovery task would count as 10 xmits. I looked at this in HDFS-11023 and thought it was accounted for properly, but looking again that's not true.

          More generally, I think HDFS-11023 is still worth revisiting. The NN throttles are coarse and only operate on the heartbeat interval. The DN would ideally have byte-based throttles the same as the balancer settings, to be more user-friendly.

          eddyxu Lei (Eddy) Xu added a comment -

          Thanks for the review, Andrew.

          As you suggested, in the latest patch xmitsInProgress also increases for queued tasks, to throttle the rate at which the NN sends tasks to the DN. For an EC reconstruction task, xmits increases by a "weight", currently calculated as len(sources) + len(targets) to represent the number of network connections.

          I feel that this weight does not need to be exact, as long as it represents the relative cost of a recovery task (i.e., more connections usually mean more I/O (the block size is the same) and more CPU (because there is more data)).

          andrew.wang Andrew Wang added a comment -

          Thanks Eddy, looks real good, few comments:

          • Can the LinkedBlockingDeque be a LinkedBlockingQueue? I don't think it needs the Deque functionality.
          • SRInfo#getWeight adds together the # sources and # targets. I think this will overestimate. Note in StripedReader#readMinimumSources we only read from minRequiredSources. We also don't read and write at the same time, so it'd be better to take max(minSources, targets).
          • In ECWorker, xmits is incremented after submitting the task. Is there a possible small race here? We could increment first to reserve capacity, then try/catch to decrement if the submit fails.
          • Comment could be enhanced slightly, maybe:
                      // See HDFS-12044. We increase xmitsInProgress even if the task is only
                      // enqueued, so that
                      //   1) NN will not send more tasks than DN can execute and
                      //   2) DN will not throw away reconstruction tasks, and instead keeps an
                      //      unbounded number of tasks in the executor's task queue.
            

          I also had another question about accounting. The NN accounts for DN xceiver load when doing block placement, but the xmit count is not factored in. The source and target DNs will each use an xceiver to send or receive the block, but the DN running the reconstruction task doesn't (AFAICT). Should we twiddle the xceiver count (or use an xceiver?) to influence BPP?

          Aside, I noticed what looks like an existing bug, that DataNode#transferBlock does not create its Daemon in the xceiver thread group (which is how we currently count the # of xceivers). BlockRecoveryWorker#recoverBlocks is an example of something not in DataTransferProtocol that still counts against this thread group.

          Unit tests:

          • Could you add a unit test with two node failures, for some additional coverage? IIUC a single reconstruction task will recover all the missing blocks for an EC group, would be good to validate.
          • It would also be good to run some reconstruction tasks and validate at the end that xmitsInProgress for all DNs goes back to zero.
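Two of the suggestions in this comment, the max(minSources, targets) weight and reserving xmits before submitting, can be sketched as follows. This is an illustration under assumptions, not the patch: the class ReserveThenSubmitSketch and its methods are hypothetical, a generic java.util.concurrent.Executor stands in for the reconstruction pool, and releasing the xmits when the task finishes is assumed from the requirement that the count return to zero.

```java
import java.util.concurrent.Executor;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration of the review suggestions; not the actual patch.
class ReserveThenSubmitSketch {
  private final AtomicInteger xmitsInProgress = new AtomicInteger();
  private final Executor pool;

  ReserveThenSubmitSketch(Executor pool) {
    this.pool = pool;
  }

  /** Reads and writes do not overlap, so take the larger side. */
  static int weight(int minRequiredSources, int targets) {
    return Math.max(minRequiredSources, targets);
  }

  /** Reserves xmits before submitting and rolls back if the submit fails. */
  boolean submit(Runnable task, int xmits) {
    xmitsInProgress.addAndGet(xmits);            // reserve capacity first
    try {
      pool.execute(() -> {
        try {
          task.run();
        } finally {
          xmitsInProgress.addAndGet(-xmits);     // release when the task ends
        }
      });
      return true;
    } catch (RejectedExecutionException e) {
      xmitsInProgress.addAndGet(-xmits);         // undo the reservation
      return false;
    }
  }

  int getXmitsInProgress() {
    return xmitsInProgress.get();
  }
}
```

Incrementing before the submit closes the window in which a heartbeat could report a stale, lower count, and the finally block is what lets a test assert that the counter is back at zero once all tasks have completed. For an RS(10,4) task that reads from 10 sources and writes up to 4 targets, weight returns 10.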
          hadoopqa Hadoop QA added a comment -
          -1 overall



          Vote Subsystem Runtime Comment
          0 reexec 0m 20s Docker mode activated.
          +1 @author 0m 0s The patch does not contain any @author tags.
          +1 test4tests 0m 0s The patch appears to include 1 new or modified test files.
          0 mvndep 0m 23s Maven dependency ordering for branch
          +1 mvninstall 13m 22s trunk passed
          +1 compile 1m 29s trunk passed
          +1 checkstyle 0m 39s trunk passed
          +1 mvnsite 1m 34s trunk passed
          -1 findbugs 1m 30s hadoop-hdfs-project/hadoop-hdfs-client in trunk has 2 extant Findbugs warnings.
          -1 findbugs 1m 42s hadoop-hdfs-project/hadoop-hdfs in trunk has 10 extant Findbugs warnings.
          +1 javadoc 0m 58s trunk passed
          0 mvndep 0m 7s Maven dependency ordering for patch
          +1 mvninstall 1m 21s the patch passed
          +1 compile 1m 26s the patch passed
          +1 javac 1m 26s the patch passed
          -0 checkstyle 0m 39s hadoop-hdfs-project: The patch generated 1 new + 184 unchanged - 0 fixed = 185 total (was 184)
          +1 mvnsite 1m 26s the patch passed
          +1 whitespace 0m 0s The patch has no whitespace issues.
          +1 findbugs 3m 22s the patch passed
          +1 javadoc 0m 57s the patch passed
          +1 unit 1m 16s hadoop-hdfs-client in the patch passed.
          -1 unit 81m 37s hadoop-hdfs in the patch failed.
          +1 asflicense 0m 20s The patch does not generate ASF License warnings.
          115m 51s



          Reason Tests
          Failed junit tests hadoop.hdfs.TestDFSStripedOutputStreamWithFailure080
            hadoop.hdfs.TestDFSStripedInputStreamWithRandomECPolicy
            hadoop.hdfs.TestDFSStripedOutputStreamWithFailure070
            hadoop.hdfs.server.namenode.TestDecommissioningStatus
            hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting
            hadoop.hdfs.server.datanode.TestDataNodeHotSwapVolumes
            hadoop.hdfs.web.TestWebHdfsTimeouts
            hadoop.hdfs.TestFileChecksum
          Timed out junit tests org.apache.hadoop.hdfs.server.blockmanagement.TestBlockStatsMXBean



          Subsystem Report/Notes
          Docker Image:yetus/hadoop:14b5c93
          JIRA Issue HDFS-12044
          JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12875339/HDFS-12044.03.patch
          Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle
          uname Linux 6c831fdeb141 4.4.0-43-generic #63-Ubuntu SMP Wed Oct 12 13:48:03 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
          Build tool maven
          Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh
          git revision trunk / 147df30
          Default Java 1.8.0_131
          findbugs v3.1.0-RC1
          findbugs https://builds.apache.org/job/PreCommit-HDFS-Build/20123/artifact/patchprocess/branch-findbugs-hadoop-hdfs-project_hadoop-hdfs-client-warnings.html
          findbugs https://builds.apache.org/job/PreCommit-HDFS-Build/20123/artifact/patchprocess/branch-findbugs-hadoop-hdfs-project_hadoop-hdfs-warnings.html
          checkstyle https://builds.apache.org/job/PreCommit-HDFS-Build/20123/artifact/patchprocess/diff-checkstyle-hadoop-hdfs-project.txt
          unit https://builds.apache.org/job/PreCommit-HDFS-Build/20123/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
          Test Results https://builds.apache.org/job/PreCommit-HDFS-Build/20123/testReport/
          modules C: hadoop-hdfs-project/hadoop-hdfs-client hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project
          Console output https://builds.apache.org/job/PreCommit-HDFS-Build/20123/console
          Powered by Apache Yetus 0.5.0-SNAPSHOT http://yetus.apache.org

          This message was automatically generated.

          eddyxu Lei (Eddy) Xu added a comment -

          Thanks for the detailed suggestions, Andrew Wang. Updated the patch accordingly.

          "The NN accounts for DN xceiver load when doing block placement, but the xmit count is not factored in."

          I will file another JIRA for this.

          "... that DataNode#transferBlock does not create its Daemon in the xceiver thread group (which is how we currently count the # of xceivers)."

          Will file a JIRA for this as well.
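
          To illustrate the second point, here is a minimal, self-contained sketch of why a thread started outside a thread group is invisible to a count taken from that group. It is not DataNode code: the class name XceiverCountSketch, the method countedThreads, and the group and thread names are all hypothetical, and the only API assumed is the standard java.lang.ThreadGroup#activeCount, which the JDK documents as an estimate.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;

// Hypothetical illustration only; none of these names exist in Hadoop.
public class XceiverCountSketch {

  /**
   * Starts inGroup threads inside a dedicated ThreadGroup and outsideGroup
   * threads in the caller's default group, then returns the dedicated
   * group's activeCount() while every thread is still alive.
   */
  public static int countedThreads(int inGroup, int outsideGroup) {
    ThreadGroup xceiverGroup = new ThreadGroup("xceiver-group-sketch");
    CountDownLatch release = new CountDownLatch(1);

    // Each worker blocks until released, so it is alive when we count.
    Runnable work = () -> {
      try {
        release.await();
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      }
    };

    List<Thread> threads = new ArrayList<>();
    for (int i = 0; i < inGroup; i++) {
      // Explicitly placed in the counted group.
      threads.add(new Thread(xceiverGroup, work, "xceiver-" + i));
    }
    for (int i = 0; i < outsideGroup; i++) {
      // No group given: the thread joins the creating thread's group instead.
      threads.add(new Thread(work, "transfer-" + i));
    }
    for (Thread t : threads) {
      t.setDaemon(true);
      t.start();
    }

    int counted = xceiverGroup.activeCount();

    // Let the workers finish and wait for them before returning.
    release.countDown();
    for (Thread t : threads) {
      try {
        t.join();
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      }
    }
    return counted;
  }

  public static void main(String[] args) {
    System.out.println("threads seen by the group: " + countedThreads(3, 2));
  }
}
```

          In this sketch the group-based count covers only the three threads created inside the group; the two "transfer" threads do real work but never show up in it, which is the shape of the accounting gap described above.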

          andrew.wang Andrew Wang added a comment -

          LGTM +1, I just retriggered the precommit build since I don't see a run on the 04 patch.

          Could you link the follow-on JIRAs to this one? I might have missed them. Thanks Eddy!

          hadoopqa Hadoop QA added a comment -
          -1 overall



          Vote Subsystem Runtime Comment
          0 reexec 0m 14s Docker mode activated.
                Prechecks
          +1 @author 0m 0s The patch does not contain any @author tags.
          +1 test4tests 0m 0s The patch appears to include 1 new or modified test files.
                trunk Compile Tests
          0 mvndep 0m 8s Maven dependency ordering for branch
          -1 mvninstall 4m 26s root in trunk failed.
          +1 compile 1m 26s trunk passed
          +1 checkstyle 0m 43s trunk passed
          +1 mvnsite 1m 28s trunk passed
          -1 findbugs 1m 26s hadoop-hdfs-project/hadoop-hdfs-client in trunk has 2 extant Findbugs warnings.
          -1 findbugs 1m 42s hadoop-hdfs-project/hadoop-hdfs in trunk has 10 extant Findbugs warnings.
          +1 javadoc 1m 0s trunk passed
                Patch Compile Tests
          0 mvndep 0m 7s Maven dependency ordering for patch
          +1 mvninstall 1m 24s the patch passed
          +1 compile 1m 27s the patch passed
          +1 javac 1m 27s the patch passed
          -0 checkstyle 0m 42s hadoop-hdfs-project: The patch generated 2 new + 186 unchanged - 0 fixed = 188 total (was 186)
          +1 mvnsite 1m 29s the patch passed
          +1 whitespace 0m 0s The patch has no whitespace issues.
          +1 findbugs 3m 23s the patch passed
          +1 javadoc 0m 56s the patch passed
                Other Tests
          +1 unit 1m 12s hadoop-hdfs-client in the patch passed.
          -1 unit 68m 45s hadoop-hdfs in the patch failed.
          +1 asflicense 0m 20s The patch does not generate ASF License warnings.
          93m 47s



          Reason Tests
          Failed junit tests hadoop.hdfs.TestDFSStripedOutputStreamWithFailure150
            hadoop.hdfs.TestDFSStripedInputStreamWithRandomECPolicy
            hadoop.hdfs.TestDFSStripedOutputStreamWithFailure010
            hadoop.hdfs.TestDFSStripedOutputStreamWithFailure080
            hadoop.hdfs.TestFileChecksum
            hadoop.hdfs.server.blockmanagement.TestUnderReplicatedBlocks
            hadoop.hdfs.TestCrcCorruption



          Subsystem Report/Notes
          Docker Image:yetus/hadoop:14b5c93
          JIRA Issue HDFS-12044
          JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12878902/HDFS-12044.04.patch
          Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle
          uname Linux db232ee595db 3.13.0-117-generic #164-Ubuntu SMP Fri Apr 7 11:05:26 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
          Build tool maven
          Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh
          git revision trunk / f66fd11
          Default Java 1.8.0_131
          mvninstall https://builds.apache.org/job/PreCommit-HDFS-Build/20438/artifact/patchprocess/branch-mvninstall-root.txt
          findbugs v3.1.0-RC1
          findbugs https://builds.apache.org/job/PreCommit-HDFS-Build/20438/artifact/patchprocess/branch-findbugs-hadoop-hdfs-project_hadoop-hdfs-client-warnings.html
          findbugs https://builds.apache.org/job/PreCommit-HDFS-Build/20438/artifact/patchprocess/branch-findbugs-hadoop-hdfs-project_hadoop-hdfs-warnings.html
          checkstyle https://builds.apache.org/job/PreCommit-HDFS-Build/20438/artifact/patchprocess/diff-checkstyle-hadoop-hdfs-project.txt
          unit https://builds.apache.org/job/PreCommit-HDFS-Build/20438/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
          Test Results https://builds.apache.org/job/PreCommit-HDFS-Build/20438/testReport/
          modules C: hadoop-hdfs-project/hadoop-hdfs-client hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project
          Console output https://builds.apache.org/job/PreCommit-HDFS-Build/20438/console
          Powered by Apache Yetus 0.6.0-SNAPSHOT http://yetus.apache.org

          This message was automatically generated.

          eddyxu Lei (Eddy) Xu added a comment -

          The test failures seem to be relevant. Looking into it.

          eddyxu Lei (Eddy) Xu added a comment -

          Fixed TestFileChecksum. The rest of the tests pass on my machine.

          hadoopqa Hadoop QA added a comment -
          -1 overall



          Vote Subsystem Runtime Comment
          0 reexec 0m 18s Docker mode activated.
                Prechecks
          +1 @author 0m 0s The patch does not contain any @author tags.
          +1 test4tests 0m 0s The patch appears to include 1 new or modified test files.
                trunk Compile Tests
          0 mvndep 0m 8s Maven dependency ordering for branch
          +1 mvninstall 13m 9s trunk passed
          +1 compile 1m 26s trunk passed
          +1 checkstyle 0m 42s trunk passed
          +1 mvnsite 1m 27s trunk passed
          -1 findbugs 1m 22s hadoop-hdfs-project/hadoop-hdfs-client in trunk has 2 extant Findbugs warnings.
          -1 findbugs 1m 39s hadoop-hdfs-project/hadoop-hdfs in trunk has 10 extant Findbugs warnings.
          +1 javadoc 1m 1s trunk passed
                Patch Compile Tests
          0 mvndep 0m 8s Maven dependency ordering for patch
          +1 mvninstall 1m 20s the patch passed
          +1 compile 1m 23s the patch passed
          +1 javac 1m 23s the patch passed
          -0 checkstyle 0m 40s hadoop-hdfs-project: The patch generated 1 new + 186 unchanged - 0 fixed = 187 total (was 186)
          +1 mvnsite 1m 23s the patch passed
          +1 whitespace 0m 0s The patch has no whitespace issues.
          +1 findbugs 3m 14s the patch passed
          +1 javadoc 0m 56s the patch passed
                Other Tests
          +1 unit 1m 9s hadoop-hdfs-client in the patch passed.
          -1 unit 64m 59s hadoop-hdfs in the patch failed.
          +1 asflicense 0m 17s The patch does not generate ASF License warnings.
          98m 8s



          Reason Tests
          Failed junit tests hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting
            hadoop.hdfs.server.namenode.ha.TestBootstrapStandbyWithQJM



          Subsystem Report/Notes
          Docker Image:yetus/hadoop:14b5c93
          JIRA Issue HDFS-12044
          JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12879251/HDFS-12044.05.patch
          Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle
          uname Linux 56ce65acd785 3.13.0-119-generic #166-Ubuntu SMP Wed May 3 12:18:55 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
          Build tool maven
          Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh
          git revision trunk / e3c7300
          Default Java 1.8.0_131
          findbugs v3.1.0-RC1
          findbugs https://builds.apache.org/job/PreCommit-HDFS-Build/20445/artifact/patchprocess/branch-findbugs-hadoop-hdfs-project_hadoop-hdfs-client-warnings.html
          findbugs https://builds.apache.org/job/PreCommit-HDFS-Build/20445/artifact/patchprocess/branch-findbugs-hadoop-hdfs-project_hadoop-hdfs-warnings.html
          checkstyle https://builds.apache.org/job/PreCommit-HDFS-Build/20445/artifact/patchprocess/diff-checkstyle-hadoop-hdfs-project.txt
          unit https://builds.apache.org/job/PreCommit-HDFS-Build/20445/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
          Test Results https://builds.apache.org/job/PreCommit-HDFS-Build/20445/testReport/
          modules C: hadoop-hdfs-project/hadoop-hdfs-client hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project
          Console output https://builds.apache.org/job/PreCommit-HDFS-Build/20445/console
          Powered by Apache Yetus 0.6.0-SNAPSHOT http://yetus.apache.org

          This message was automatically generated.

          andrew.wang Andrew Wang added a comment -

          +1 thanks Eddy, that checkstyle error looks extant.

          andrew.wang Andrew Wang added a comment -

          BTW, did we file a JIRA for this follow-on?

          "... that DataNode#transferBlock does not create its Daemon in the xceiver thread group (which is how we currently count the # of xceivers)."

          eddyxu Lei (Eddy) Xu added a comment -

          Thanks for the reviews, Andrew Wang.

          Committed to trunk.

          Also filed HDFS-12215 and HDFS-12208 as follow-ons.

          hudson Hudson added a comment -

          SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #12068 (See https://builds.apache.org/job/Hadoop-trunk-Commit/12068/)
          HDFS-12044. Mismatch between BlockManager.maxReplicationStreams and (lei: rev 77791e4c36ddc9305306c83806bf486d4d32575d)

          • (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataNode.java
          • (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/erasurecode/StripedBlockReconstructor.java
          • (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/erasurecode/StripedReader.java
          • (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/erasurecode/StripedReconstructor.java
          • (edit) hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/DFSUtilClient.java
          • (edit) hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestReconstructStripedFile.java
          • (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/erasurecode/ErasureCodingWorker.java
          • (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/erasurecode/StripedReconstructionInfo.java

            People

            • Assignee:
              eddyxu Lei (Eddy) Xu
              Reporter:
              eddyxu Lei (Eddy) Xu
            • Votes:
              0
              Watchers:
              6
