Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.9.0, 3.0.0-alpha2
    • Component/s: datanode
    • Labels: None
    • Hadoop Flags: Reviewed

      Description

      Update DataNode to use the DatasetVolumeChecker class introduced in HDFS-11149 to parallelize disk checks.
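
      For context, a sketch of how the DataNode drives the parallel checker, based on the API visible in the review diffs later on this page; volumeChecker, data, and LOG are DataNode fields, and this is not a verbatim excerpt:

          // Synchronous path: blocks until every volume has been checked in
          // parallel and returns the set of volumes that failed their checks.
          Set<FsVolumeSpi> unhealthyVolumes = volumeChecker.checkAllVolumes(data);

          // Asynchronous path: returns immediately and reports the healthy and
          // failed volume sets through a callback once all checks complete.
          volumeChecker.checkAllVolumesAsync(
              data, (healthyVolumes, failedVolumes) -> {
                LOG.info("Got {} failed volumes: {}",
                    failedVolumes.size(), failedVolumes);
              });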

        Issue Links

          Activity

          Arpit Agarwal added a comment -

          Pushed to branch-2. Thank you for reviewing the branch-2 patch, Xiaoyu Yao.

          Xiaoyu Yao added a comment -

          Thanks Arpit Agarwal for working on the branch-2 patch. It looks good to me. +1

          Arpit Agarwal added a comment - edited

          Attached a patch for branch-2. Xiaoyu Yao, can you please take a look? The main difference from trunk is that in branch-2 DatasetVolumeChecker returns Set<FsVolumeSpi>, whereas in trunk it returns Set<StorageLocation>.

          This is because, due to branch divergence, we cannot trivially get the StorageLocation from an FsVolumeSpi in branch-2. I ran test-patch locally; the results follow the sketch below. I also tested the branch-2 patch by simulating a disk failure and verifying that the failure was caught and the volume was removed.
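
          For reference, trunk's mapping from failed volumes to storage locations is direct. A minimal sketch, assuming the FsVolumeSpi#getStorageLocation() accessor that appears in the review diffs below; the helper name toLocations is hypothetical:

              import java.util.HashSet;
              import java.util.Set;

              // Sketch: on trunk each failed FsVolumeSpi maps directly to its
              // StorageLocation. branch-2 lacks this trivial mapping, hence the
              // Set<FsVolumeSpi> return type there.
              static Set<StorageLocation> toLocations(Set<FsVolumeSpi> unhealthyVolumes) {
                Set<StorageLocation> locations = new HashSet<>(unhealthyVolumes.size());
                for (FsVolumeSpi vol : unhealthyVolumes) {
                  locations.add(vol.getStorageLocation());
                }
                return locations;
              }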

          Vote Subsystem Runtime Comment
          +1 @author 0m 0s The patch does not contain any @author tags.
          +1 test4tests 0m 0s The patch appears to include 20 new or modified test files.
          +1 mvninstall 3m 31s branch-2 passed
          +1 compile 0m 26s branch-2 passed
          +1 checkstyle 0m 29s branch-2 passed
          +1 mvnsite 0m 31s branch-2 passed
          +1 mvneclipse 0m 11s branch-2 passed
          +1 findbugs 0m 59s branch-2 passed
          +1 javadoc 0m 38s branch-2 passed
          +1 mvninstall 0m 26s the patch passed
          +1 compile 0m 23s the patch passed
          +1 javac 0m 23s the patch passed
          -1 checkstyle 0m 26s hadoop-hdfs-project/hadoop-hdfs: The patch generated 17 new + 994 unchanged - 23 fixed = 1011 total (was 1017)
          +1 mvnsite 0m 29s the patch passed
          +1 mvneclipse 0m 8s the patch passed
          +1 whitespace 0m 0s The patch has no whitespace issues.
          +1 xml 0m 1s The patch has no ill-formed XML file.
          +1 findbugs 1m 2s the patch passed
          +1 javadoc 0m 36s the patch passed
          +1 unit 201m 0s hadoop-hdfs in the patch passed.
          +1 asflicense 0m 12s The patch does not generate ASF License warnings.
          212m 21s
          Manjunath Anand added a comment -

          Arpit Agarwal, thanks for the clarification and for your effort on this jira, which improves performance through parallel disk checks.

          Hudson added a comment -

          SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #11022 (See https://builds.apache.org/job/Hadoop-trunk-Commit/11022/)
          HDFS-11182. Update DataNode to use DatasetVolumeChecker. Contributed by (xyao: rev f678080dbd25a218e0406463a3c3a1fc03680702)

          • (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsVolumeImpl.java
          • (edit) hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestBlockStatsMXBean.java
          • (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/FsDatasetSpi.java
          • (edit) hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/checker/TestDatasetVolumeChecker.java
          • (edit) hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/checker/TestDatasetVolumeCheckerFailures.java
          • (edit) hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestDataNodeVolumeFailureToleration.java
          • (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsVolumeList.java
          • (edit) hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestDataNodeVolumeFailure.java
          • (edit) hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestDataNodeHotSwapVolumes.java
          • (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataNode.java
          • (edit) hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/SimulatedFSDataset.java
          • (edit) hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestDiskError.java
          • (edit) hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/TestFsVolumeList.java
          • (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/checker/DatasetVolumeChecker.java
          • (edit) hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestDataNodeVolumeFailureReporting.java
          • (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsDatasetImpl.java
          • (edit) hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/extdataset/ExternalDatasetImpl.java
          • (edit) hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/TestFsDatasetImpl.java
          Arpit Agarwal added a comment -

          Thanks for the detailed code reviews and for committing it, Xiaoyu Yao!

          Xiaoyu Yao added a comment -

          Thanks Arpit Agarwal for the contribution and all for the reviews. I've committed the patch to trunk.

          Xiaoyu Yao added a comment -

          +1 for the latest patch. I tried the failed Jenkins test but could not reproduce it.
          I plan to commit it shortly with "git apply --whitespace=fix".
          The checkstyle indentation issue with Java 8 lambdas seems to be a known issue (illustrated below). Will open a separate ticket to fix that.
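
          For illustration, the kind of multi-line lambda that checkstyle's indentation rule tends to flag; a hypothetical snippet written in the style of this patch, not an excerpt from it:

              // Checkstyle's indentation check predates Java 8 lambdas and often
              // flags the continuation indent of a lambda body like this one.
              volumeChecker.checkAllVolumesAsync(
                  data, (healthyVolumes, failedVolumes) -> {
                    LOG.info("Got {} failed volumes", failedVolumes.size());
                  });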

          Hadoop QA added a comment -
          -1 overall



          Vote Subsystem Runtime Comment
          0 reexec 0m 12s Docker mode activated.
          +1 @author 0m 0s The patch does not contain any @author tags.
          +1 test4tests 0m 0s The patch appears to include 12 new or modified test files.
          +1 mvninstall 12m 39s trunk passed
          +1 compile 0m 44s trunk passed
          +1 checkstyle 0m 32s trunk passed
          +1 mvnsite 0m 51s trunk passed
          +1 mvneclipse 0m 13s trunk passed
          +1 findbugs 1m 43s trunk passed
          +1 javadoc 0m 39s trunk passed
          +1 mvninstall 0m 46s the patch passed
          +1 compile 0m 42s the patch passed
          +1 javac 0m 42s the patch passed
          -0 checkstyle 0m 29s hadoop-hdfs-project/hadoop-hdfs: The patch generated 14 new + 466 unchanged - 10 fixed = 480 total (was 476)
          +1 mvnsite 0m 48s the patch passed
          +1 mvneclipse 0m 10s the patch passed
          -1 whitespace 0m 0s The patch has 4 line(s) that end in whitespace. Use git apply --whitespace=fix <<patch_file>>. Refer https://git-scm.com/docs/git-apply
          +1 findbugs 1m 47s the patch passed
          +1 javadoc 0m 36s the patch passed
          -1 unit 62m 12s hadoop-hdfs in the patch failed.
          +1 asflicense 0m 18s The patch does not generate ASF License warnings.
          86m 36s



          Reason Tests
          Failed junit tests hadoop.hdfs.TestDFSStripedOutputStreamWithFailure110



          Subsystem Report/Notes
          Docker Image: yetus/hadoop:a9ad5d6
          JIRA Issue HDFS-11182
          GITHUB PR https://github.com/apache/hadoop/pull/168
          Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle
          uname Linux a6dfe967cca9 3.13.0-93-generic #140-Ubuntu SMP Mon Jul 18 21:21:05 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
          Build tool maven
          Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh
          git revision trunk / 1b401f6
          Default Java 1.8.0_111
          findbugs v3.0.0
          checkstyle https://builds.apache.org/job/PreCommit-HDFS-Build/17921/artifact/patchprocess/diff-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt
          whitespace https://builds.apache.org/job/PreCommit-HDFS-Build/17921/artifact/patchprocess/whitespace-eol.txt
          unit https://builds.apache.org/job/PreCommit-HDFS-Build/17921/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
          Test Results https://builds.apache.org/job/PreCommit-HDFS-Build/17921/testReport/
          modules C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs
          Console output https://builds.apache.org/job/PreCommit-HDFS-Build/17921/console
          Powered by Apache Yetus 0.5.0-SNAPSHOT http://yetus.apache.org

          This message was automatically generated.

          ASF GitHub Bot added a comment -

          Github user xiaoyuyao commented on a diff in the pull request:

          https://github.com/apache/hadoop/pull/168#discussion_r93171165

          --- Diff: hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataNode.java ---
          @@ -3208,68 +3200,49 @@ public ShortCircuitRegistry getShortCircuitRegistry() {
             }

             /**
          -   * Check the disk error
          +   * Check the disk error synchronously.
              */
          -  private void checkDiskError() {
          -    Set<StorageLocation> unhealthyLocations = data.checkDataDir();
          -    if (unhealthyLocations != null && !unhealthyLocations.isEmpty()) {
          +  @VisibleForTesting
          +  public void checkDiskError() throws IOException {
          +    Set<FsVolumeSpi> unhealthyVolumes;
          +    try {
          +      unhealthyVolumes = volumeChecker.checkAllVolumes(data);
          +      LOG.info("checkDiskError got {} failed volumes - {}",
          +          unhealthyVolumes.size(), unhealthyVolumes);
          +      lastDiskErrorCheck = Time.monotonicNow();
          +    } catch (InterruptedException e) {
          +      LOG.error("Interruped while running disk check", e);
          +      throw new IOException("Interrupted while running disk check", e);
          +    }
          +
          +    handleVolumeFailures(unhealthyVolumes);
          +  }
          +
          +  private void handleVolumeFailures(Set<FsVolumeSpi> unhealthyVolumes) {
          +    data.handleVolumeFailures(unhealthyVolumes);
          +    Set<StorageLocation> unhealthyLocations = new HashSet<>(
          +        unhealthyVolumes.size());
          +
          +    if (!unhealthyVolumes.isEmpty()) {
          +      StringBuilder sb = new StringBuilder("DataNode failed volumes:");
          +      for (FsVolumeSpi vol : unhealthyVolumes) {
          +        unhealthyLocations.add(vol.getStorageLocation());
          +        sb.append(vol.getStorageLocation()).append(";");
          +      }
          +
                 try {
                   // Remove all unhealthy volumes from DataNode.
                   removeVolumes(unhealthyLocations, false);
                 } catch (IOException e) {
                   LOG.warn("Error occurred when removing unhealthy storage dirs: "
                       + e.getMessage(), e);
                 }
          -    StringBuilder sb = new StringBuilder("DataNode failed volumes:");
          -    for (StorageLocation location : unhealthyLocations) {
          -      sb.append(location + ";");
          -    }
          +      LOG.info(sb.toString());
          --- End diff ---

          Thanks for fixing that.

          ASF GitHub Bot added a comment -

          Github user xiaoyuyao commented on a diff in the pull request:

          https://github.com/apache/hadoop/pull/168#discussion_r93171132

          --- Diff: hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataNode.java ---
          @@ -1944,6 +1935,8 @@ public void shutdown() {
                 }
               }

          +    volumeChecker.shutdownAndWait(1, TimeUnit.SECONDS);
          +
               if (storageLocationChecker != null) {
          --- End diff ---

          Agree. Let's do that in a follow-up jira.

          ASF GitHub Bot added a comment -

          Github user xiaoyuyao commented on a diff in the pull request:

          https://github.com/apache/hadoop/pull/168#discussion_r93171107

          --- Diff: hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/checker/DatasetVolumeChecker.java ---
          @@ -299,26 +327,35 @@ public void checkVolume(
             private class ResultHandler
                 implements FutureCallback<VolumeCheckResult> {
               private final FsVolumeReference reference;
          -    private final Set<StorageLocation> failedVolumes;
          -    private final Set<StorageLocation> healthyVolumes;
          -    private final CountDownLatch latch;
          -    private final AtomicLong numVolumes;
          +    private final Set<FsVolumeSpi> failedVolumes;
          +    private final Set<FsVolumeSpi> healthyVolumes;
          +    private final Semaphore semaphore;

               @Nullable
               private final Callback callback;

          +    /**
          +     *
          +     * @param reference FsVolumeReference to be released when the check is
          +     *                  complete.
          +     * @param healthyVolumes set of healthy volumes. If the disk check is
          +     *                       successful, add the volume here.
          +     * @param failedVolumes set of failed volumes. If the disk check fails,
          +     *                      add the volume here.
          +     * @param semaphore semaphore used to trigger callback invocation.
          --- End diff ---

          Thanks. The new logic looks good to me.

          ASF GitHub Bot added a comment -

          Github user arp7 commented on the issue:

          https://github.com/apache/hadoop/pull/168

          @xiaoyuyao I pushed one more commit to improve the logging. Now we log at warn if there is a volume failure and at debug if there is no failure.
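
          A minimal sketch of that behavior; illustrative only, the actual commit may differ:

              // Warn when the check found failed volumes; stay at debug otherwise.
              if (!failedVolumes.isEmpty()) {
                LOG.warn("checkDiskErrorAsync callback got {} failed volumes: {}",
                    failedVolumes.size(), failedVolumes);
              } else {
                LOG.debug("checkDiskErrorAsync callback: no failed volumes");
              }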

          ASF GitHub Bot added a comment -

          Github user arp7 commented on a diff in the pull request:

          https://github.com/apache/hadoop/pull/168#discussion_r93169191

          --- Diff: hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataNode.java ---
          @@ -1944,6 +1935,8 @@ public void shutdown() {
                 }
               }

          +    volumeChecker.shutdownAndWait(1, TimeUnit.SECONDS);
          +
               if (storageLocationChecker != null) {
          --- End diff ---

          By the way, regarding the reuse: I really wanted to do that too, but it's non-trivial because the handling logic differs between the two paths. It probably should never have been made different, but reconciling them now is a bit of work. We can look at it in a separate jira.

          ASF GitHub Bot added a comment -

          Github user arp7 commented on a diff in the pull request:

          https://github.com/apache/hadoop/pull/168#discussion_r93168784

          --- Diff: hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataNode.java ---
          @@ -3208,68 +3200,49 @@ public ShortCircuitRegistry getShortCircuitRegistry() {
             }

             /**
          -   * Check the disk error
          +   * Check the disk error synchronously.
              */
          -  private void checkDiskError() {
          -    Set<StorageLocation> unhealthyLocations = data.checkDataDir();
          -    if (unhealthyLocations != null && !unhealthyLocations.isEmpty()) {
          +  @VisibleForTesting
          +  public void checkDiskError() throws IOException {
          +    Set<FsVolumeSpi> unhealthyVolumes;
          +    try {
          +      unhealthyVolumes = volumeChecker.checkAllVolumes(data);
          +      LOG.info("checkDiskError got {} failed volumes - {}",
          +          unhealthyVolumes.size(), unhealthyVolumes);
          +      lastDiskErrorCheck = Time.monotonicNow();
          +    } catch (InterruptedException e) {
          +      LOG.error("Interruped while running disk check", e);
          +      throw new IOException("Interrupted while running disk check", e);
          +    }
          +
          +    handleVolumeFailures(unhealthyVolumes);
          +  }
          +
          +  private void handleVolumeFailures(Set<FsVolumeSpi> unhealthyVolumes) {
          +    data.handleVolumeFailures(unhealthyVolumes);
          +    Set<StorageLocation> unhealthyLocations = new HashSet<>(
          +        unhealthyVolumes.size());
          +
          +    if (!unhealthyVolumes.isEmpty()) {
          +      StringBuilder sb = new StringBuilder("DataNode failed volumes:");
          +      for (FsVolumeSpi vol : unhealthyVolumes) {
          +        unhealthyLocations.add(vol.getStorageLocation());
          +        sb.append(vol.getStorageLocation()).append(";");
          +      }
          +
                 try {
                   // Remove all unhealthy volumes from DataNode.
                   removeVolumes(unhealthyLocations, false);
                 } catch (IOException e) {
                   LOG.warn("Error occurred when removing unhealthy storage dirs: "
                       + e.getMessage(), e);
                 }
          -    StringBuilder sb = new StringBuilder("DataNode failed volumes:");
          -    for (StorageLocation location : unhealthyLocations) {
          -      sb.append(location + ";");
          -    }
          +      LOG.info(sb.toString());
          --- End diff ---

          Good point, will push an update shortly to improve the logging.

          ASF GitHub Bot added a comment -

          Github user arp7 commented on a diff in the pull request:

          https://github.com/apache/hadoop/pull/168#discussion_r93168774

          --- Diff: hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataNode.java ---
          @@ -1944,6 +1935,8 @@ public void shutdown() {
                 }
               }

          +    volumeChecker.shutdownAndWait(1, TimeUnit.SECONDS);
          +
               if (storageLocationChecker != null) {
          --- End diff ---

          Will address it in a follow-up patch.

          ASF GitHub Bot added a comment -

          Github user arp7 commented on a diff in the pull request:

          https://github.com/apache/hadoop/pull/168#discussion_r93168456

          --- Diff: hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataNode.java ---
          @@ -3208,68 +3200,49 @@ public ShortCircuitRegistry getShortCircuitRegistry() {
             }

             /**
          -   * Check the disk error
          +   * Check the disk error synchronously.
              */
          -  private void checkDiskError() {
          -    Set<StorageLocation> unhealthyLocations = data.checkDataDir();
          -    if (unhealthyLocations != null && !unhealthyLocations.isEmpty()) {
          +  @VisibleForTesting
          +  public void checkDiskError() throws IOException {
          +    Set<FsVolumeSpi> unhealthyVolumes;
          +    try {
          +      unhealthyVolumes = volumeChecker.checkAllVolumes(data);
          +      LOG.info("checkDiskError got {} failed volumes - {}",
          +          unhealthyVolumes.size(), unhealthyVolumes);
          +      lastDiskErrorCheck = Time.monotonicNow();
          --- End diff ---

          Same as above.

          ASF GitHub Bot added a comment -

          Github user arp7 commented on a diff in the pull request:

          https://github.com/apache/hadoop/pull/168#discussion_r93168439

          --- Diff: hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataNode.java ---
          @@ -2051,14 +2044,13 @@ public void shutdown() {
              * Check if there is a disk failure asynchronously and if so, handle the error
              */
             public void checkDiskErrorAsync() {
          -    synchronized(checkDiskErrorMutex) {
          -      checkDiskErrorFlag = true;
          -      if(checkDiskErrorThread == null) {
          -        startCheckDiskErrorThread();
          -        checkDiskErrorThread.start();
          -        LOG.info("Starting CheckDiskError Thread");
          -      }
          -    }
          +    volumeChecker.checkAllVolumesAsync(
          +        data, (healthyVolumes, failedVolumes) -> {
          +          LOG.info("checkDiskErrorAsync callback got {} failed volumes: {}",
          +              failedVolumes.size(), failedVolumes);
          +          lastDiskErrorCheck = Time.monotonicNow();
          --- End diff ---

          The DataNode does not maintain a timer object right now. It is only passed to DatasetVolumeChecker during construction for unit testability of that class.
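
          A sketch of that injection pattern; the two-argument constructor and the FakeTimer usage are assumptions based on this discussion, not copied from the patch:

              // A unit test can advance the injected timer deterministically
              // instead of sleeping, which is why the Timer is passed to
              // DatasetVolumeChecker at construction time.
              FakeTimer timer = new FakeTimer();
              DatasetVolumeChecker checker =
                  new DatasetVolumeChecker(new HdfsConfiguration(), timer);
              timer.advance(60_000);  // simulate a minute between disk checks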

          ASF GitHub Bot added a comment -

          Github user xiaoyuyao commented on a diff in the pull request:

          https://github.com/apache/hadoop/pull/168#discussion_r93166078

          --- Diff: hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataNode.java ---
          @@ -3208,68 +3200,49 @@ public ShortCircuitRegistry getShortCircuitRegistry() {
             }

             /**
          -   * Check the disk error
          +   * Check the disk error synchronously.
              */
          -  private void checkDiskError() {
          -    Set<StorageLocation> unhealthyLocations = data.checkDataDir();
          -    if (unhealthyLocations != null && !unhealthyLocations.isEmpty()) {
          +  @VisibleForTesting
          +  public void checkDiskError() throws IOException {
          +    Set<FsVolumeSpi> unhealthyVolumes;
          +    try {
          +      unhealthyVolumes = volumeChecker.checkAllVolumes(data);
          +      LOG.info("checkDiskError got {} failed volumes - {}",
          +          unhealthyVolumes.size(), unhealthyVolumes);
          +      lastDiskErrorCheck = Time.monotonicNow();
          --- End diff ---

          Should this be timer.monotonicNow()?

          ASF GitHub Bot added a comment -

          Github user xiaoyuyao commented on a diff in the pull request:

          https://github.com/apache/hadoop/pull/168#discussion_r93166028

          --- Diff: hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataNode.java ---
          @@ -2051,14 +2044,13 @@ public void shutdown() {
              * Check if there is a disk failure asynchronously and if so, handle the error
              */
             public void checkDiskErrorAsync() {
          -    synchronized(checkDiskErrorMutex) {
          -      checkDiskErrorFlag = true;
          -      if(checkDiskErrorThread == null) {
          -        startCheckDiskErrorThread();
          -        checkDiskErrorThread.start();
          -        LOG.info("Starting CheckDiskError Thread");
          -      }
          -    }
          +    volumeChecker.checkAllVolumesAsync(
          +        data, (healthyVolumes, failedVolumes) -> {
          +          LOG.info("checkDiskErrorAsync callback got {} failed volumes: {}",
          +              failedVolumes.size(), failedVolumes);
          +          lastDiskErrorCheck = Time.monotonicNow();
          --- End diff ---

          Should this be timer.monotonicNow()?

          ASF GitHub Bot added a comment -

          Github user xiaoyuyao commented on a diff in the pull request:

          https://github.com/apache/hadoop/pull/168#discussion_r93165658

          --- Diff: hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataNode.java ---
          @@ -1944,6 +1935,8 @@ public void shutdown() {
                 }
               }

          +    volumeChecker.shutdownAndWait(1, TimeUnit.SECONDS);
          +
               if (storageLocationChecker != null) {
          --- End diff ---

          Can we reuse the synchronous version of the DatasetVolumeChecker for datanode startup handling? That way we wouldn't need to maintain two checkers for the DataNode. This can be done as a follow-up.

          ASF GitHub Bot added a comment -

          Github user xiaoyuyao commented on a diff in the pull request:

          https://github.com/apache/hadoop/pull/168#discussion_r93165428

          --- Diff: hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataNode.java ---
          @@ -1944,6 +1935,8 @@ public void shutdown() {
                 }
               }

          +    volumeChecker.shutdownAndWait(1, TimeUnit.SECONDS);
          +
               if (storageLocationChecker != null) {
          --- End diff ---

          The DataNode#storageLocationChecker is only needed during datanode startup. We don't need to pass it as a parameter to the DataNode constructor and keep it alive for the datanode's whole lifetime until shutdown. This can be done as an optimization later.

          Arpit Agarwal added a comment -

          The remaining checkstyle warnings are for indentation and pre-existing long methods.

          Hadoop QA added a comment -
          -1 overall



          Vote Subsystem Runtime Comment
          0 reexec 0m 12s Docker mode activated.
          +1 @author 0m 0s The patch does not contain any @author tags.
          +1 test4tests 0m 0s The patch appears to include 12 new or modified test files.
          +1 mvninstall 12m 35s trunk passed
          +1 compile 0m 45s trunk passed
          +1 checkstyle 0m 32s trunk passed
          +1 mvnsite 0m 50s trunk passed
          +1 mvneclipse 0m 12s trunk passed
          +1 findbugs 1m 41s trunk passed
          +1 javadoc 0m 39s trunk passed
          +1 mvninstall 0m 46s the patch passed
          +1 compile 0m 44s the patch passed
          +1 javac 0m 44s the patch passed
          -0 checkstyle 0m 30s hadoop-hdfs-project/hadoop-hdfs: The patch generated 10 new + 467 unchanged - 10 fixed = 477 total (was 477)
          +1 mvnsite 0m 49s the patch passed
          +1 mvneclipse 0m 10s the patch passed
          -1 whitespace 0m 0s The patch has 4 line(s) that end in whitespace. Use git apply --whitespace=fix <<patch_file>>. Refer https://git-scm.com/docs/git-apply
          +1 findbugs 1m 49s the patch passed
          +1 javadoc 0m 37s the patch passed
          +1 unit 63m 39s hadoop-hdfs in the patch passed.
          +1 asflicense 0m 21s The patch does not generate ASF License warnings.
          88m 7s



          Subsystem Report/Notes
          Docker Image: yetus/hadoop:a9ad5d6
          JIRA Issue HDFS-11182
          GITHUB PR https://github.com/apache/hadoop/pull/168
          Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle
          uname Linux f3abaae15447 3.13.0-93-generic #140-Ubuntu SMP Mon Jul 18 21:21:05 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
          Build tool maven
          Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh
          git revision trunk / ef2dd7b
          Default Java 1.8.0_111
          findbugs v3.0.0
          checkstyle https://builds.apache.org/job/PreCommit-HDFS-Build/17904/artifact/patchprocess/diff-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt
          whitespace https://builds.apache.org/job/PreCommit-HDFS-Build/17904/artifact/patchprocess/whitespace-eol.txt
          Test Results https://builds.apache.org/job/PreCommit-HDFS-Build/17904/testReport/
          modules C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs
          Console output https://builds.apache.org/job/PreCommit-HDFS-Build/17904/console
          Powered by Apache Yetus 0.5.0-SNAPSHOT http://yetus.apache.org

          This message was automatically generated.

          Hadoop QA added a comment -
          -1 overall



          Vote Subsystem Runtime Comment
          0 reexec 0m 11s Docker mode activated.
          +1 @author 0m 0s The patch does not contain any @author tags.
          +1 test4tests 0m 0s The patch appears to include 12 new or modified test files.
          +1 mvninstall 13m 34s trunk passed
          +1 compile 0m 48s trunk passed
          +1 checkstyle 0m 32s trunk passed
          +1 mvnsite 0m 52s trunk passed
          +1 mvneclipse 0m 13s trunk passed
          +1 findbugs 1m 51s trunk passed
          +1 javadoc 0m 39s trunk passed
          +1 mvninstall 0m 45s the patch passed
          +1 compile 0m 43s the patch passed
          +1 javac 0m 43s the patch passed
          -0 checkstyle 0m 29s hadoop-hdfs-project/hadoop-hdfs: The patch generated 12 new + 467 unchanged - 10 fixed = 479 total (was 477)
          +1 mvnsite 0m 48s the patch passed
          +1 mvneclipse 0m 11s the patch passed
          -1 whitespace 0m 0s The patch has 4 line(s) that end in whitespace. Use git apply --whitespace=fix <<patch_file>>. Refer https://git-scm.com/docs/git-apply
          +1 findbugs 1m 46s the patch passed
          +1 javadoc 0m 38s the patch passed
          -1 unit 65m 14s hadoop-hdfs in the patch failed.
          +1 asflicense 0m 19s The patch does not generate ASF License warnings.
          90m 58s



          Reason Tests
          Failed junit tests hadoop.hdfs.server.balancer.TestBalancerWithSaslDataTransfer



          Subsystem Report/Notes
          Docker Image: yetus/hadoop:a9ad5d6
          JIRA Issue HDFS-11182
          GITHUB PR https://github.com/apache/hadoop/pull/168
          Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle
          uname Linux 0a64cf41a382 3.13.0-105-generic #152-Ubuntu SMP Fri Dec 2 15:37:11 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
          Build tool maven
          Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh
          git revision trunk / ef2dd7b
          Default Java 1.8.0_111
          findbugs v3.0.0
          checkstyle https://builds.apache.org/job/PreCommit-HDFS-Build/17901/artifact/patchprocess/diff-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt
          whitespace https://builds.apache.org/job/PreCommit-HDFS-Build/17901/artifact/patchprocess/whitespace-eol.txt
          unit https://builds.apache.org/job/PreCommit-HDFS-Build/17901/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
          Test Results https://builds.apache.org/job/PreCommit-HDFS-Build/17901/testReport/
          modules C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs
          Console output https://builds.apache.org/job/PreCommit-HDFS-Build/17901/console
          Powered by Apache Yetus 0.5.0-SNAPSHOT http://yetus.apache.org

          This message was automatically generated.

          Arpit Agarwal added a comment - edited

          Thanks for the feedback, Xiaoyu Yao!

          Pushed the following commit:
          https://github.com/apache/hadoop/pull/168/commits/7ef55e7b2622de9345dd2d138c3e828ad5916bfa

          • Removed the semaphore, as both you and Manjunath Anand commented on it, and replaced it with a simple counter initialized to the number of volumes (a sketch of the pattern follows below).
          • Factored the common test code you pointed out into a @Before method.
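
          A minimal sketch of the counter pattern; class and method names are hypothetical, for illustration only:

              import java.util.concurrent.atomic.AtomicLong;

              // Each per-volume check decrements the counter when it finishes;
              // the last one to complete fires the aggregate callback exactly once.
              class VolumeCheckCountdown {
                private final AtomicLong remaining;
                private final Runnable onAllChecksDone;

                VolumeCheckCountdown(int numVolumes, Runnable onAllChecksDone) {
                  this.remaining = new AtomicLong(numVolumes);
                  this.onAllChecksDone = onAllChecksDone;
                }

                void volumeCheckDone() {
                  if (remaining.decrementAndGet() == 0) {
                    onAllChecksDone.run();  // runs once, on the last completion
                  }
                }
              }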

Manjunath Anand, to address your other point about shared code between the sync and async checks, there are a couple of differences that make it harder to share code.

          githubbot ASF GitHub Bot added a comment -

          Github user arp7 commented on a diff in the pull request:

          https://github.com/apache/hadoop/pull/168#discussion_r93133957

— Diff: hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/checker/DatasetVolumeChecker.java —

```
@@ -299,26 +327,35 @@ public void checkVolume(
   private class ResultHandler
       implements FutureCallback<VolumeCheckResult> {
     private final FsVolumeReference reference;
-    private final Set<StorageLocation> failedVolumes;
-    private final Set<StorageLocation> healthyVolumes;
-    private final CountDownLatch latch;
-    private final AtomicLong numVolumes;
+    private final Set<FsVolumeSpi> failedVolumes;
+    private final Set<FsVolumeSpi> healthyVolumes;
+    private final Semaphore semaphore;

     @Nullable
     private final Callback callback;

+    /**
+     *
+     * @param reference FsVolumeReference to be released when the check is
+     *                  complete.
+     * @param healthyVolumes set of healthy volumes. If the disk check is
+     *                       successful, add the volume here.
+     * @param failedVolumes set of failed volumes. If the disk check fails,
+     *                      add the volume here.
+     * @param semaphore semaphore used to trigger callback invocation.
```
— End diff –

          I think this logic can be simplified. Will post an updated patch shortly.

          githubbot ASF GitHub Bot added a comment -

          Github user xiaoyuyao commented on a diff in the pull request:

          https://github.com/apache/hadoop/pull/168#discussion_r93132992

— Diff: hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsVolumeList.java —

```
@@ -235,23 +233,14 @@ public void run() {
    * Use {@link checkDirsLock}
    * to allow only one instance of checkDirs() call.
    *
-   * @return list of all the failed volumes.
+   * @param failedVolumes
    */
-  Set<StorageLocation> checkDirs() {
+  void handleVolumeFailures(Set<FsVolumeSpi> failedVolumes) {
     try (AutoCloseableLock lock = checkDirsLock.acquire()) {
-      Set<StorageLocation> failedLocations = null;
-      // Make a copy of volumes for performing modification
-      final List<FsVolumeImpl> volumeList = getVolumes();
-      for (Iterator<FsVolumeImpl> i = volumeList.iterator(); i.hasNext(); ) {
-        final FsVolumeImpl fsv = i.next();
+      for (FsVolumeSpi vol : failedVolumes) {
+        FsVolumeImpl fsv = (FsVolumeImpl) vol;
```
— End diff –

          Thanks for the explanation. Looks good to me.

          githubbot ASF GitHub Bot added a comment -

          Github user arp7 commented on a diff in the pull request:

          https://github.com/apache/hadoop/pull/168#discussion_r93130106

— Diff: hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsVolumeList.java —

```
@@ -235,23 +233,14 @@ public void run() {
    * Use {@link checkDirsLock}
    * to allow only one instance of checkDirs() call.
    *
-   * @return list of all the failed volumes.
+   * @param failedVolumes
    */
-  Set<StorageLocation> checkDirs() {
+  void handleVolumeFailures(Set<FsVolumeSpi> failedVolumes) {
     try (AutoCloseableLock lock = checkDirsLock.acquire()) {
-      Set<StorageLocation> failedLocations = null;
-      // Make a copy of volumes for performing modification
-      final List<FsVolumeImpl> volumeList = getVolumes();
-      for (Iterator<FsVolumeImpl> i = volumeList.iterator(); i.hasNext(); ) {
-        final FsVolumeImpl fsv = i.next();
+      for (FsVolumeSpi vol : failedVolumes) {
+        FsVolumeImpl fsv = (FsVolumeImpl) vol;
```
— End diff –

Yes. FsVolumeList is part of the fsdataset.impl package and its methods are only invoked from FsDatasetImpl, so it is safe to assume that the volume is an FsVolumeImpl.

          At least one existing method also makes the same assumption (see copyReplicaWithNewBlockIdAndGS).
          ```
          private File[] copyReplicaWithNewBlockIdAndGS(
          ReplicaInfo replicaInfo, String bpid, long newBlkId, long newGS)
          throws IOException {
          String blockFileName = Block.BLOCK_FILE_PREFIX + newBlkId;
          FsVolumeImpl v = (FsVolumeImpl) replicaInfo.getVolume();
          ```
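
If one did want the defensive check suggested above, it might look like the following sketch (hypothetical, not part of the patch; LOG and the surrounding types are assumed from the class context):

```
// Hypothetical defensive variant of the cast discussed above.
void handleVolumeFailures(Set<FsVolumeSpi> failedVolumes) {
  for (FsVolumeSpi vol : failedVolumes) {
    if (!(vol instanceof FsVolumeImpl)) {
      // Not expected: FsVolumeList only manages FsVolumeImpl instances.
      LOG.error("Unexpected volume type {}", vol.getClass());
      continue;
    }
    FsVolumeImpl fsv = (FsVolumeImpl) vol;
    // ... proceed with failure handling for fsv ...
  }
}
```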

          githubbot ASF GitHub Bot added a comment -

          Github user arp7 commented on a diff in the pull request:

          https://github.com/apache/hadoop/pull/168#discussion_r93129689

— Diff: hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/checker/DatasetVolumeChecker.java —

```
@@ -299,26 +327,35 @@ public void checkVolume(
   private class ResultHandler
       implements FutureCallback<VolumeCheckResult> {
     private final FsVolumeReference reference;
-    private final Set<StorageLocation> failedVolumes;
-    private final Set<StorageLocation> healthyVolumes;
-    private final CountDownLatch latch;
-    private final AtomicLong numVolumes;
+    private final Set<FsVolumeSpi> failedVolumes;
+    private final Set<FsVolumeSpi> healthyVolumes;
+    private final Semaphore semaphore;

     @Nullable
     private final Callback callback;

+    /**
+     *
+     * @param reference FsVolumeReference to be released when the check is
+     *                  complete.
+     * @param healthyVolumes set of healthy volumes. If the disk check is
+     *                       successful, add the volume here.
+     * @param failedVolumes set of failed volumes. If the disk check fails,
+     *                      add the volume here.
+     * @param semaphore semaphore used to trigger callback invocation.
```
— End diff –

CountDownLatch#countDown returns no value, so there is no easy way to detect when the count falls to zero and the callback can be invoked (it must be invoked once only). I was using an AtomicLong to detect the 0->1 transition, but it had a bug.

          The semaphore approach fixes it. We still need the CountDownLatch which we can use as an event. I could have used an Object mutex instead but that would have required extra code to deal with the spurious wakeup problem which CountDownLatch does not suffer from.
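
As an illustration of this pattern (a minimal sketch with invented names, not the committed patch): the semaphore starts references.size() - 1 permits in debt, so a permit only becomes available after the last check's release(), and at most one thread's tryAcquire() can take it, which makes the callback run exactly once.

```
import java.util.concurrent.Semaphore;

// Hypothetical sketch: fire a callback exactly once after n completions.
class OnceAfterAll {
  private final Semaphore semaphore;
  private final Runnable callback;

  OnceAfterAll(int n, Runnable callback) {
    // Start n - 1 permits in debt: the first n - 1 release() calls only
    // pay down the debt, so their tryAcquire() attempts fail.
    this.semaphore = new Semaphore(-n + 1);
    this.callback = callback;
  }

  // Called once per completed check, possibly concurrently.
  void onCheckDone() {
    semaphore.release();
    // Only one permit ever becomes available, so at most one tryAcquire()
    // succeeds across all threads and the callback runs exactly once.
    if (semaphore.tryAcquire()) {
      callback.run();
    }
  }
}
```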

          githubbot ASF GitHub Bot added a comment -

          Github user xiaoyuyao commented on a diff in the pull request:

          https://github.com/apache/hadoop/pull/168#discussion_r93123146

— Diff: hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsVolumeList.java —

```
@@ -235,23 +233,14 @@ public void run() {
    * Use {@link checkDirsLock}
    * to allow only one instance of checkDirs() call.
    *
-   * @return list of all the failed volumes.
+   * @param failedVolumes
    */
-  Set<StorageLocation> checkDirs() {
+  void handleVolumeFailures(Set<FsVolumeSpi> failedVolumes) {
     try (AutoCloseableLock lock = checkDirsLock.acquire()) {
-      Set<StorageLocation> failedLocations = null;
-      // Make a copy of volumes for performing modification
-      final List<FsVolumeImpl> volumeList = getVolumes();
-      for (Iterator<FsVolumeImpl> i = volumeList.iterator(); i.hasNext(); ) {
-        final FsVolumeImpl fsv = i.next();
+      for (FsVolumeSpi vol : failedVolumes) {
+        FsVolumeImpl fsv = (FsVolumeImpl) vol;
```
— End diff –

Is it a safe cast from FsVolumeSpi to FsVolumeImpl? Can we add some logging here in case the cast fails?

          githubbot ASF GitHub Bot added a comment -

          Github user xiaoyuyao commented on a diff in the pull request:

          https://github.com/apache/hadoop/pull/168#discussion_r93122724

— Diff: hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/checker/DatasetVolumeChecker.java —

```
@@ -299,26 +327,35 @@ public void checkVolume(
   private class ResultHandler
       implements FutureCallback<VolumeCheckResult> {
     private final FsVolumeReference reference;
-    private final Set<StorageLocation> failedVolumes;
-    private final Set<StorageLocation> healthyVolumes;
-    private final CountDownLatch latch;
-    private final AtomicLong numVolumes;
+    private final Set<FsVolumeSpi> failedVolumes;
+    private final Set<FsVolumeSpi> healthyVolumes;
+    private final Semaphore semaphore;

     @Nullable
     private final Callback callback;

+    /**
+     *
+     * @param reference FsVolumeReference to be released when the check is
+     *                  complete.
+     * @param healthyVolumes set of healthy volumes. If the disk check is
+     *                       successful, add the volume here.
+     * @param failedVolumes set of failed volumes. If the disk check fails,
+     *                      add the volume here.
+     * @param semaphore semaphore used to trigger callback invocation.
```
— End diff –

          The usage of semaphore here seems like a countUpLatch. Have you hit any problem with the existing CountDownLatch approach?

          hadoopqa Hadoop QA added a comment -
          +1 overall



          Vote Subsystem Runtime Comment
          0 reexec 0m 16s Docker mode activated.
          +1 @author 0m 0s The patch does not contain any @author tags.
          +1 test4tests 0m 0s The patch appears to include 12 new or modified test files.
          +1 mvninstall 7m 45s trunk passed
          +1 compile 0m 44s trunk passed
          +1 checkstyle 0m 33s trunk passed
          +1 mvnsite 0m 51s trunk passed
          +1 mvneclipse 0m 12s trunk passed
          +1 findbugs 1m 44s trunk passed
          +1 javadoc 0m 39s trunk passed
          +1 mvninstall 0m 45s the patch passed
          +1 compile 0m 43s the patch passed
          +1 javac 0m 43s the patch passed
          -0 checkstyle 0m 29s hadoop-hdfs-project/hadoop-hdfs: The patch generated 10 new + 467 unchanged - 10 fixed = 477 total (was 477)
          +1 mvnsite 0m 49s the patch passed
          +1 mvneclipse 0m 10s the patch passed
          +1 whitespace 0m 0s The patch has no whitespace issues.
          +1 findbugs 1m 47s the patch passed
          +1 javadoc 0m 37s the patch passed
          +1 unit 62m 23s hadoop-hdfs in the patch passed.
          +1 asflicense 0m 19s The patch does not generate ASF License warnings.
          82m 3s



          Subsystem Report/Notes
          Docker Image:yetus/hadoop:a9ad5d6
          JIRA Issue HDFS-11182
          GITHUB PR https://github.com/apache/hadoop/pull/168
          Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle
          uname Linux bb9d62902181 3.13.0-95-generic #142-Ubuntu SMP Fri Aug 12 17:00:09 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
          Build tool maven
          Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh
          git revision trunk / f5e0bd3
          Default Java 1.8.0_111
          findbugs v3.0.0
          checkstyle https://builds.apache.org/job/PreCommit-HDFS-Build/17858/artifact/patchprocess/diff-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt
          Test Results https://builds.apache.org/job/PreCommit-HDFS-Build/17858/testReport/
          modules C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs
          Console output https://builds.apache.org/job/PreCommit-HDFS-Build/17858/console
          Powered by Apache Yetus 0.5.0-SNAPSHOT http://yetus.apache.org

          This message was automatically generated.

          hadoopqa Hadoop QA added a comment -
          -1 overall



          Vote Subsystem Runtime Comment
          0 reexec 0m 14s Docker mode activated.
          +1 @author 0m 0s The patch does not contain any @author tags.
          +1 test4tests 0m 0s The patch appears to include 12 new or modified test files.
          +1 mvninstall 8m 10s trunk passed
          +1 compile 0m 47s trunk passed
          +1 checkstyle 0m 31s trunk passed
          +1 mvnsite 0m 53s trunk passed
          +1 mvneclipse 0m 13s trunk passed
          +1 findbugs 1m 46s trunk passed
          +1 javadoc 0m 42s trunk passed
          +1 mvninstall 0m 57s the patch passed
          +1 compile 0m 54s the patch passed
          +1 javac 0m 54s the patch passed
          -0 checkstyle 0m 31s hadoop-hdfs-project/hadoop-hdfs: The patch generated 10 new + 468 unchanged - 10 fixed = 478 total (was 478)
          +1 mvnsite 1m 4s the patch passed
          +1 mvneclipse 0m 11s the patch passed
          +1 whitespace 0m 0s The patch has no whitespace issues.
          +1 findbugs 2m 3s the patch passed
          +1 javadoc 0m 40s the patch passed
          -1 unit 65m 13s hadoop-hdfs in the patch failed.
          +1 asflicense 0m 20s The patch does not generate ASF License warnings.
          86m 42s



          Reason Tests
          Failed junit tests hadoop.hdfs.TestTrashWithSecureEncryptionZones
            hadoop.hdfs.TestSecureEncryptionZoneWithKMS



          Subsystem Report/Notes
          Docker Image:yetus/hadoop:a9ad5d6
          JIRA Issue HDFS-11182
          GITHUB PR https://github.com/apache/hadoop/pull/168
          Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle
          uname Linux 0547f5948b52 3.13.0-95-generic #142-Ubuntu SMP Fri Aug 12 17:00:09 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
          Build tool maven
          Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh
          git revision trunk / e24a923
          Default Java 1.8.0_111
          findbugs v3.0.0
          checkstyle https://builds.apache.org/job/PreCommit-HDFS-Build/17852/artifact/patchprocess/diff-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt
          unit https://builds.apache.org/job/PreCommit-HDFS-Build/17852/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
          Test Results https://builds.apache.org/job/PreCommit-HDFS-Build/17852/testReport/
          modules C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs
          Console output https://builds.apache.org/job/PreCommit-HDFS-Build/17852/console
          Powered by Apache Yetus 0.5.0-SNAPSHOT http://yetus.apache.org

          This message was automatically generated.

          hadoopqa Hadoop QA added a comment -
          -1 overall



          Vote Subsystem Runtime Comment
          0 reexec 0m 12s Docker mode activated.
          +1 @author 0m 0s The patch does not contain any @author tags.
          +1 test4tests 0m 0s The patch appears to include 12 new or modified test files.
          +1 mvninstall 6m 48s trunk passed
          +1 compile 0m 44s trunk passed
          +1 checkstyle 0m 32s trunk passed
          +1 mvnsite 0m 51s trunk passed
          +1 mvneclipse 0m 12s trunk passed
          +1 findbugs 1m 42s trunk passed
          +1 javadoc 0m 39s trunk passed
          +1 mvninstall 0m 45s the patch passed
          +1 compile 0m 43s the patch passed
          +1 javac 0m 43s the patch passed
          -0 checkstyle 0m 29s hadoop-hdfs-project/hadoop-hdfs: The patch generated 12 new + 468 unchanged - 10 fixed = 480 total (was 478)
          +1 mvnsite 0m 49s the patch passed
          +1 mvneclipse 0m 10s the patch passed
          +1 whitespace 0m 0s The patch has no whitespace issues.
          +1 findbugs 1m 46s the patch passed
          +1 javadoc 0m 37s the patch passed
          -1 unit 63m 59s hadoop-hdfs in the patch failed.
          +1 asflicense 0m 19s The patch does not generate ASF License warnings.
          82m 30s



          Reason Tests
          Failed junit tests hadoop.hdfs.TestTrashWithSecureEncryptionZones
            hadoop.hdfs.TestSecureEncryptionZoneWithKMS
            hadoop.hdfs.server.namenode.TestStartup



          Subsystem Report/Notes
          Docker Image:yetus/hadoop:a9ad5d6
          JIRA Issue HDFS-11182
          GITHUB PR https://github.com/apache/hadoop/pull/168
          Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle
          uname Linux d5af07d09463 3.13.0-93-generic #140-Ubuntu SMP Mon Jul 18 21:21:05 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
          Build tool maven
          Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh
          git revision trunk / ef34bf2
          Default Java 1.8.0_111
          findbugs v3.0.0
          checkstyle https://builds.apache.org/job/PreCommit-HDFS-Build/17850/artifact/patchprocess/diff-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt
          unit https://builds.apache.org/job/PreCommit-HDFS-Build/17850/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
          Test Results https://builds.apache.org/job/PreCommit-HDFS-Build/17850/testReport/
          modules C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs
          Console output https://builds.apache.org/job/PreCommit-HDFS-Build/17850/console
          Powered by Apache Yetus 0.5.0-SNAPSHOT http://yetus.apache.org

          This message was automatically generated.

          hadoopqa Hadoop QA added a comment -
          -1 overall



          Vote Subsystem Runtime Comment
          0 reexec 0m 17s Docker mode activated.
          +1 @author 0m 0s The patch does not contain any @author tags.
          +1 test4tests 0m 0s The patch appears to include 8 new or modified test files.
          +1 mvninstall 7m 29s trunk passed
          +1 compile 0m 46s trunk passed
          +1 checkstyle 0m 30s trunk passed
          +1 mvnsite 0m 52s trunk passed
          +1 mvneclipse 0m 13s trunk passed
          +1 findbugs 1m 45s trunk passed
          +1 javadoc 0m 40s trunk passed
          +1 mvninstall 0m 48s the patch passed
          +1 compile 0m 44s the patch passed
          +1 javac 0m 44s the patch passed
          -0 checkstyle 0m 29s hadoop-hdfs-project/hadoop-hdfs: The patch generated 21 new + 384 unchanged - 10 fixed = 405 total (was 394)
          +1 mvnsite 0m 52s the patch passed
          +1 mvneclipse 0m 11s the patch passed
          -1 whitespace 0m 0s The patch has 8 line(s) that end in whitespace. Use git apply --whitespace=fix <<patch_file>>. Refer https://git-scm.com/docs/git-apply
          +1 findbugs 1m 59s the patch passed
          +1 javadoc 0m 38s the patch passed
          -1 unit 70m 19s hadoop-hdfs in the patch failed.
          +1 asflicense 0m 20s The patch does not generate ASF License warnings.
          90m 13s



          Reason Tests
          Failed junit tests hadoop.hdfs.TestTrashWithSecureEncryptionZones
            hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting
            hadoop.hdfs.server.datanode.TestDataNodeHotSwapVolumes
            hadoop.hdfs.server.datanode.TestDirectoryScanner
            hadoop.hdfs.server.datanode.TestDataNodeVolumeFailure
            hadoop.hdfs.TestSecureEncryptionZoneWithKMS
            hadoop.hdfs.server.blockmanagement.TestBlockStatsMXBean



          Subsystem Report/Notes
          Docker Image:yetus/hadoop:a9ad5d6
          JIRA Issue HDFS-11182
          GITHUB PR https://github.com/apache/hadoop/pull/168
          Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle
          uname Linux 59a4a67b1803 3.13.0-93-generic #140-Ubuntu SMP Mon Jul 18 21:21:05 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
          Build tool maven
          Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh
          git revision trunk / c6a3923
          Default Java 1.8.0_111
          findbugs v3.0.0
          checkstyle https://builds.apache.org/job/PreCommit-HDFS-Build/17843/artifact/patchprocess/diff-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt
          whitespace https://builds.apache.org/job/PreCommit-HDFS-Build/17843/artifact/patchprocess/whitespace-eol.txt
          unit https://builds.apache.org/job/PreCommit-HDFS-Build/17843/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
          Test Results https://builds.apache.org/job/PreCommit-HDFS-Build/17843/testReport/
          modules C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs
          Console output https://builds.apache.org/job/PreCommit-HDFS-Build/17843/console
          Powered by Apache Yetus 0.5.0-SNAPSHOT http://yetus.apache.org

          This message was automatically generated.

          arpitagarwal Arpit Agarwal added a comment -

          Updated the pull request to address unit test failures from the last run. Removed two unit tests:

          1. FsVolumeList#testCheckDirsWithClosedVolume - replaced with TestDatasetVolumeCheckerFailures#testCheckingClosedVolume.
          2. TestFsDatasetImpl#testChangeVolumeWithRunningCheckDirs - no longer relevant as volume checks are always parallelized on an initial snapshot of the volume list.
          arpitagarwal Arpit Agarwal added a comment -

The test failures are caused by the changes. I will post an update to the PR later this week.

          arpitagarwal Arpit Agarwal added a comment -

          Hi Manjunath Anand,

The invokeCallback releases a permit in the Semaphore and then does a tryAcquire, which would allow it to acquire the permit it just released, and hence it would go on to call the callback.call method, which releases the latch via countDown.

Since the number of permits is initially negative, it takes references.size() releases before any acquire call succeeds. So the first references.size() - 1 threads will not be able to acquire the permit they just released.
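
To make the permit arithmetic concrete, here is a hypothetical single-threaded trace (not from the patch) for three volumes, i.e. new Semaphore(-references.size() + 1) == new Semaphore(-2):

```
import java.util.concurrent.Semaphore;

public class SemaphoreTrace {
  public static void main(String[] args) {
    Semaphore s = new Semaphore(-2);  // three volumes
    for (int check = 1; check <= 3; check++) {
      s.release();
      // Prints false, false, true: only the third (final) release leaves
      // a permit for tryAcquire to take.
      System.out.println("check " + check + " tryAcquire: " + s.tryAcquire());
    }
  }
}
```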

          manju_hadoop Manjunath Anand added a comment - - edited

Arpit Agarwal, I am a little confused about the reason for the change in the DatasetVolumeChecker.checkAllVolumes methods to use a Semaphore and a CountDownLatch instead of the earlier approach of using only a CountDownLatch.

The confusion is whether the CountDownLatch countDown is correctly triggered only after all the threads scheduled for the volume check have finished their jobs. Below is my understanding of why countDown is instead triggered by the first thread to complete its volume check, which would result in incorrect behaviour. Please correct me if I am wrong.

The code below effectively acquires references.size() - 1 permits up front (the semaphore's permit count starts negative).

```
final Semaphore semaphore = new Semaphore(-references.size() + 1);
```

The invokeCallback releases a permit in the Semaphore and then does a tryAcquire, which would allow it to acquire the permit it just released, and hence it would go on to call the callback.call method, which releases the latch via countDown.

So it looks like the first thread to complete its volume check will release the latch, which would mean we haven't waited for up to maxAllowedTimeForCheckMs in the await call made on the latch.

```
private void invokeCallback() {
  try {
    semaphore.release();
    if (callback != null && semaphore.tryAcquire()) {
      callback.call(healthyVolumes, failedVolumes);
    }
  } catch (Exception e) {
    // Propagating this exception is unlikely to be helpful.
    LOG.warn("Unexpected exception", e);
  }
}
```
```
 Futures.addCallback(future, new ResultHandler(
-          reference, healthyVolumes, failedVolumes, resultsLatch, null));
+          reference, healthyVolumes, failedVolumes, semaphore, new Callback() {
+        @Override
+        public void call(Set<StorageLocation> ignored1,
+                         Set<StorageLocation> ignored2) {
+          latch.countDown();
+        }
+      }));
```

Please share your thoughts on this.

          manju_hadoop Manjunath Anand added a comment - - edited

Arpit Agarwal, sorry for the late code review; this comment applies more to the JIRA HDFS-11149. Thanks for the effort on this change. In DatasetVolumeChecker, checkAllVolumes and checkAllVolumesAsync perform the same tasks except for a few additional steps in checkAllVolumes. The code could be modified so that checkAllVolumes in turn calls checkAllVolumesAsync, saving lines of code and ensuring both keep performing the same intended work without code duplication in the future as well.

```
public ResultHandler checkAllVolumesAsync(
    final FsDatasetSpi<? extends FsVolumeSpi> dataset,
    Callback callback) {

  if (timer.monotonicNow() - lastAllVolumesCheck < minDiskCheckGapMs) {
    numSkippedChecks.incrementAndGet();
    return null;
  }

  lastAllVolumesCheck = timer.monotonicNow();
  final Set<StorageLocation> healthyVolumes = new HashSet<>();
  final Set<StorageLocation> failedVolumes = new HashSet<>();
  final Set<StorageLocation> allVolumes = new HashSet<>();

  final FsDatasetSpi.FsVolumeReferences references =
      dataset.getFsVolumeReferences();
  final CountDownLatch latch = new CountDownLatch(references.size());

  LOG.info("Checking {} volumes", references.size());
  for (int i = 0; i < references.size(); ++i) {
    final FsVolumeReference reference = references.getReference(i);
    allVolumes.add(reference.getVolume().getStorageLocation());
    // The context parameter is currently ignored.
    ListenableFuture<VolumeCheckResult> future =
        delegateChecker.schedule(reference.getVolume(), IGNORED_CONTEXT);
    LOG.info("Scheduled health check for volume {}", reference.getVolume());
    Futures.addCallback(future, new ResultHandler(
        reference, healthyVolumes, failedVolumes, latch, callback));
  }
  numAsyncDatasetChecks.incrementAndGet();
  return new ResultHandler(
      null, healthyVolumes, failedVolumes, latch, callback);
}

public Set<StorageLocation> checkAllVolumes(
    final FsDatasetSpi<? extends FsVolumeSpi> dataset)
    throws InterruptedException {

  ResultHandler resultHandler = checkAllVolumesAsync(dataset, null);

  if (resultHandler == null) {
    return Collections.emptySet();
  }

  // Wait until our timeout elapses, after which we give up on
  // the remaining volumes.
  if (!resultHandler.getResultsLatch().await(
      maxAllowedTimeForCheckMs, TimeUnit.MILLISECONDS)) {
    LOG.warn("checkAllVolumes timed out after {} ms",
        maxAllowedTimeForCheckMs);
  }

  numSyncDatasetChecks.incrementAndGet();
  synchronized (this) {
    // All volumes that have not been detected as healthy should be
    // considered failed. This is a superset of 'failedVolumes'.
    //
    // Make a copy under the mutex as Sets.difference() returns a view
    // of a potentially changing set.
    return new HashSet<>(Sets.difference(
        resultHandler.getAllVolumes(), resultHandler.getHealthyVolumes()));
  }
}
```

Although this is a minor point, it would avoid duplicate code and the divergence in core behaviour between the two methods that already exists today in some log statements, e.g.:

           LOG.info("Scheduled health check for volume {}", reference.getVolume()); 

is present only in checkAllVolumes, whereas

           LOG.info("Checking {} volumes", references.size());

          is present only in checkAllVolumesAsync. Please let me know your thoughts on this.

          hadoopqa Hadoop QA added a comment -
          -1 overall



          Vote Subsystem Runtime Comment
          0 reexec 0m 20s Docker mode activated.
          +1 @author 0m 0s The patch does not contain any @author tags.
          +1 test4tests 0m 0s The patch appears to include 10 new or modified test files.
          +1 mvninstall 8m 9s trunk passed
          +1 compile 0m 51s trunk passed
          +1 checkstyle 0m 33s trunk passed
          +1 mvnsite 0m 56s trunk passed
          +1 mvneclipse 0m 15s trunk passed
          -1 findbugs 1m 48s hadoop-hdfs-project/hadoop-hdfs in trunk has 1 extant Findbugs warnings.
          +1 javadoc 0m 45s trunk passed
          +1 mvninstall 0m 57s the patch passed
          +1 compile 0m 48s the patch passed
          +1 javac 0m 48s the patch passed
          -0 checkstyle 0m 30s hadoop-hdfs-project/hadoop-hdfs: The patch generated 27 new + 447 unchanged - 13 fixed = 474 total (was 460)
          +1 mvnsite 0m 56s the patch passed
          +1 mvneclipse 0m 12s the patch passed
          -1 whitespace 0m 0s The patch has 5 line(s) that end in whitespace. Use git apply --whitespace=fix <<patch_file>>. Refer https://git-scm.com/docs/git-apply
          +1 findbugs 1m 58s the patch passed
          +1 javadoc 0m 42s the patch passed
          -1 unit 100m 48s hadoop-hdfs in the patch failed.
          +1 asflicense 0m 26s The patch does not generate ASF License warnings.
          122m 35s



          Reason Tests
          Failed junit tests hadoop.hdfs.server.datanode.TestDataNodeHotSwapVolumes
            hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting
            hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureToleration
            hadoop.metrics2.sink.TestRollingFileSystemSinkWithHdfs
            hadoop.hdfs.server.datanode.fsdataset.impl.TestFsDatasetImpl
            hadoop.hdfs.server.blockmanagement.TestBlockStatsMXBean
            hadoop.hdfs.server.datanode.fsdataset.impl.TestLazyPersistReplicaRecovery
          Timed out junit tests org.apache.hadoop.hdfs.server.datanode.TestDiskError
            org.apache.hadoop.hdfs.server.datanode.TestDataNodeVolumeFailure



          Subsystem Report/Notes
          Docker Image:yetus/hadoop:a9ad5d6
          JIRA Issue HDFS-11182
          GITHUB PR https://github.com/apache/hadoop/pull/168
          Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle
          uname Linux 180afbaf1a82 3.13.0-95-generic #142-Ubuntu SMP Fri Aug 12 17:00:09 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
          Build tool maven
          Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh
          git revision trunk / 69fb70c
          Default Java 1.8.0_111
          findbugs v3.0.0
          findbugs https://builds.apache.org/job/PreCommit-HDFS-Build/17719/artifact/patchprocess/branch-findbugs-hadoop-hdfs-project_hadoop-hdfs-warnings.html
          checkstyle https://builds.apache.org/job/PreCommit-HDFS-Build/17719/artifact/patchprocess/diff-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt
          whitespace https://builds.apache.org/job/PreCommit-HDFS-Build/17719/artifact/patchprocess/whitespace-eol.txt
          unit https://builds.apache.org/job/PreCommit-HDFS-Build/17719/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
          Test Results https://builds.apache.org/job/PreCommit-HDFS-Build/17719/testReport/
          modules C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs
          Console output https://builds.apache.org/job/PreCommit-HDFS-Build/17719/console
          Powered by Apache Yetus 0.4.0-SNAPSHOT http://yetus.apache.org

          This message was automatically generated.

          hadoopqa Hadoop QA added a comment -
          -1 overall



          Vote Subsystem Runtime Comment
          0 reexec 0m 11s Docker mode activated.
          +1 @author 0m 0s The patch does not contain any @author tags.
          +1 test4tests 0m 0s The patch appears to include 11 new or modified test files.
          +1 mvninstall 8m 51s trunk passed
          +1 compile 0m 51s trunk passed
          +1 checkstyle 0m 35s trunk passed
          +1 mvnsite 1m 1s trunk passed
          +1 mvneclipse 0m 13s trunk passed
          +1 findbugs 1m 50s trunk passed
          +1 javadoc 0m 44s trunk passed
          +1 mvninstall 0m 59s the patch passed
          +1 compile 0m 54s the patch passed
          +1 javac 0m 54s the patch passed
          -0 checkstyle 0m 35s hadoop-hdfs-project/hadoop-hdfs: The patch generated 24 new + 475 unchanged - 13 fixed = 499 total (was 488)
          +1 mvnsite 1m 8s the patch passed
          +1 mvneclipse 0m 12s the patch passed
          -1 whitespace 0m 0s The patch has 5 line(s) that end in whitespace. Use git apply --whitespace=fix <<patch_file>>. Refer https://git-scm.com/docs/git-apply
          +1 xml 0m 1s The patch has no ill-formed XML file.
          -1 findbugs 2m 5s hadoop-hdfs-project/hadoop-hdfs generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0)
          +1 javadoc 0m 43s the patch passed
          -1 unit 74m 14s hadoop-hdfs in the patch failed.
          +1 asflicense 0m 21s The patch does not generate ASF License warnings.
          96m 49s



          Reason Tests
          FindBugs module:hadoop-hdfs-project/hadoop-hdfs
            Private method org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeList.addVolumeFailureInfo(FsVolumeImpl) is never called At FsVolumeList.java:called At FsVolumeList.java:[lines 353-357]
          Failed junit tests hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting
            hadoop.hdfs.server.namenode.TestReconstructStripedBlocks
            hadoop.hdfs.server.namenode.ha.TestHASafeMode
            hadoop.hdfs.server.datanode.fsdataset.impl.TestFsDatasetImpl
            hadoop.hdfs.server.datanode.TestDataNodeHotSwapVolumes
            hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureToleration
            hadoop.hdfs.server.namenode.ha.TestDFSUpgradeWithHA
            hadoop.hdfs.server.datanode.TestDataNodeLifeline
            hadoop.hdfs.server.datanode.TestDataNodeVolumeFailure
            hadoop.hdfs.server.blockmanagement.TestBlockStatsMXBean
          Timed out junit tests org.apache.hadoop.hdfs.server.datanode.TestDiskError



          Subsystem Report/Notes
          Docker Image:yetus/hadoop:a9ad5d6
          JIRA Issue HDFS-11182
          GITHUB PR https://github.com/apache/hadoop/pull/168
          Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle xml
          uname Linux 68c63b5ac198 3.13.0-93-generic #140-Ubuntu SMP Mon Jul 18 21:21:05 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
          Build tool maven
          Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh
          git revision trunk / 8f6e143
          Default Java 1.8.0_111
          findbugs v3.0.0
          checkstyle https://builds.apache.org/job/PreCommit-HDFS-Build/17708/artifact/patchprocess/diff-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt
          whitespace https://builds.apache.org/job/PreCommit-HDFS-Build/17708/artifact/patchprocess/whitespace-eol.txt
          findbugs https://builds.apache.org/job/PreCommit-HDFS-Build/17708/artifact/patchprocess/new-findbugs-hadoop-hdfs-project_hadoop-hdfs.html
          unit https://builds.apache.org/job/PreCommit-HDFS-Build/17708/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
          Test Results https://builds.apache.org/job/PreCommit-HDFS-Build/17708/testReport/
          modules C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs
          Console output https://builds.apache.org/job/PreCommit-HDFS-Build/17708/console
          Powered by Apache Yetus 0.4.0-SNAPSHOT http://yetus.apache.org

          This message was automatically generated.

          hadoopqa Hadoop QA added a comment -
          -1 overall



          Vote Subsystem Runtime Comment
          0 reexec 0m 19s Docker mode activated.
          +1 @author 0m 0s The patch does not contain any @author tags.
          +1 test4tests 0m 0s The patch appears to include 11 new or modified test files.
          +1 mvninstall 7m 31s trunk passed
          +1 compile 0m 49s trunk passed
          +1 checkstyle 0m 32s trunk passed
          +1 mvnsite 0m 57s trunk passed
          +1 mvneclipse 0m 13s trunk passed
          +1 findbugs 1m 45s trunk passed
          +1 javadoc 0m 39s trunk passed
          +1 mvninstall 0m 49s the patch passed
          +1 compile 0m 44s the patch passed
          +1 javac 0m 44s the patch passed
          -0 checkstyle 0m 29s hadoop-hdfs-project/hadoop-hdfs: The patch generated 27 new + 533 unchanged - 13 fixed = 560 total (was 546)
          +1 mvnsite 0m 50s the patch passed
          +1 mvneclipse 0m 10s the patch passed
          +1 whitespace 0m 0s The patch has no whitespace issues.
          +1 xml 0m 1s The patch has no ill-formed XML file.
          -1 findbugs 1m 53s hadoop-hdfs-project/hadoop-hdfs generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0)
          -1 javadoc 0m 37s hadoop-hdfs-project_hadoop-hdfs generated 3 new + 7 unchanged - 0 fixed = 10 total (was 7)
          -1 unit 94m 13s hadoop-hdfs in the patch failed.
          +1 asflicense 0m 21s The patch does not generate ASF License warnings.
          114m 13s



          Reason Tests
          FindBugs module:hadoop-hdfs-project/hadoop-hdfs
            Private method org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeList.addVolumeFailureInfo(FsVolumeImpl) is never called At FsVolumeList.java:called At FsVolumeList.java:[lines 353-357]
          Failed junit tests hadoop.hdfs.TestRead
            hadoop.hdfs.TestPread
            hadoop.hdfs.server.datanode.TestReadOnlySharedStorage
            hadoop.hdfs.TestDFSStripedInputStream
            hadoop.hdfs.server.blockmanagement.TestBlockTokenWithDFS
            hadoop.hdfs.server.datanode.TestDataNodeTransferSocketSize
            hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureToleration
            hadoop.hdfs.server.balancer.TestBalancerWithEncryptedTransfer
            hadoop.hdfs.server.balancer.TestBalancerWithNodeGroup
            hadoop.hdfs.TestSetrepIncreasing
            hadoop.hdfs.server.datanode.fsdataset.impl.TestFsDatasetImpl
            hadoop.hdfs.server.datanode.TestDataNodeMetrics
            hadoop.hdfs.server.balancer.TestBalancerWithHANameNodes
            hadoop.hdfs.server.balancer.TestBalancerWithSaslDataTransfer
            hadoop.hdfs.TestInjectionForSimulatedStorage
            hadoop.hdfs.server.blockmanagement.TestBlockTokenWithDFSStriped
            hadoop.hdfs.TestDecommission
            hadoop.hdfs.server.namenode.TestNameNodeMXBean
            hadoop.hdfs.server.datanode.TestDataNodeVolumeFailure
            hadoop.hdfs.TestHDFSServerPorts
            hadoop.hdfs.server.balancer.TestBalancer
            hadoop.hdfs.server.datanode.TestDataNodeInitStorage
            hadoop.hdfs.server.datanode.TestDataNodeHotSwapVolumes
            hadoop.hdfs.TestFileCreation
            hadoop.hdfs.TestWriteBlockGetsBlockLengthHint
            hadoop.hdfs.server.namenode.TestFileLimit
            hadoop.hdfs.server.balancer.TestBalancerWithMultipleNameNodes
            hadoop.hdfs.TestSmallBlock
            hadoop.hdfs.TestReplication
            hadoop.hdfs.server.blockmanagement.TestBlockStatsMXBean
            hadoop.hdfs.TestDFSXORStripedInputStream
            hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting
            hadoop.hdfs.TestDFSClientRetries
            hadoop.hdfs.server.namenode.TestAddOverReplicatedStripedBlocks
          Timed out junit tests org.apache.hadoop.hdfs.server.datanode.TestDiskError



          Subsystem Report/Notes
          Docker Image:yetus/hadoop:a9ad5d6
          JIRA Issue HDFS-11182
          GITHUB PR https://github.com/apache/hadoop/pull/168
          Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle xml
          uname Linux 8ce4bbdddb1b 3.13.0-96-generic #143-Ubuntu SMP Mon Aug 29 20:15:20 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
          Build tool maven
          Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh
          git revision trunk / d60a60b
          Default Java 1.8.0_111
          findbugs v3.0.0
          checkstyle https://builds.apache.org/job/PreCommit-HDFS-Build/17687/artifact/patchprocess/diff-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt
          findbugs https://builds.apache.org/job/PreCommit-HDFS-Build/17687/artifact/patchprocess/new-findbugs-hadoop-hdfs-project_hadoop-hdfs.html
          javadoc https://builds.apache.org/job/PreCommit-HDFS-Build/17687/artifact/patchprocess/diff-javadoc-javadoc-hadoop-hdfs-project_hadoop-hdfs.txt
          unit https://builds.apache.org/job/PreCommit-HDFS-Build/17687/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
          Test Results https://builds.apache.org/job/PreCommit-HDFS-Build/17687/testReport/
          modules C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs
          Console output https://builds.apache.org/job/PreCommit-HDFS-Build/17687/console
          Powered by Apache Yetus 0.4.0-SNAPSHOT http://yetus.apache.org

          This message was automatically generated.

          arpitagarwal Arpit Agarwal added a comment -

          v01 patch for Jenkins run.

          githubbot ASF GitHub Bot added a comment -

          GitHub user arp7 opened a pull request:

          https://github.com/apache/hadoop/pull/168

          HDFS-11182. Update DataNode to use DatasetVolumeChecker.

          Preliminary patch for Jenkins runs.

          You can merge this pull request into a Git repository by running:

          $ git pull https://github.com/arp7/hadoop HDFS-11182

          Alternatively you can review and apply these changes as the patch at:

          https://github.com/apache/hadoop/pull/168.patch

          To close this pull request, make a commit to your master/trunk branch
          with (at least) the following in the commit message:

          This closes #168


          commit 621fdc6118c8fe3f738ca8a86ebc9dac71804298
          Author: Arpit Agarwal <arp@apache.org>
          Date: 2016-11-23T19:38:14Z

          HDFS-11149. Support for parallel checking of FsVolumes.

          commit b9ede5b8b990821a681bc8fbc9975ee93fb158d6
          Author: Arpit Agarwal <arp@apache.org>
          Date: 2016-11-29T01:25:51Z

          HDFS-11182. Update DataNode to use DatasetVolumeChecker.



            People

            • Assignee:
              arpitagarwal Arpit Agarwal
              Reporter:
              arpitagarwal Arpit Agarwal
            • Votes:
              0
              Watchers:
              8
