Uploaded image for project: 'Hadoop HDFS'
  1. Hadoop HDFS
  2. HDFS-8380

Always call addStoredBlock on blocks which have been shifted from one storage to another

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.6.0
    • Fix Version/s: 2.8.0, 3.0.0-alpha1
    • Component/s: None
    • Labels:
      None
    • Target Version/s:

      Description

      We should always call addStoredBlock on blocks which have been shifted from one storage to another.

      1. HDFS-8380.001.patch
        10 kB
        Colin P. McCabe

        Issue Links

          Activity

          Hide
          cmccabe Colin P. McCabe added a comment -

          Background: HDFS-6830 attempted to implement "block shifting logic," whereby when the NameNode received a report about some replica saying it was in some DataNode storage, it would update the NN's internal data structures to reflect the fact that this replica was not in any other storages on that DataNode. The assumption was (and still is) that each replica is present in at most one storage on each DN (an assumption we might want to revisit at some point, but that's outside the scope of this JIRA...).

          HDFS-6830 was flawed, however. Although it changed BlockManager#addBlock to update the storage which a particular block was in, it would not actually call BlockManager#addBlock on blocks it received in the full block report, if it had already seen their IDs. So in the case where blocks were moved between storages, HDFS-6830 would not actually update the internal data structures on the NameNode... they would remain in the old storages.

          HDFS-6991, although it would appear to be unrelated based on the title, actually has a partial fix for the bug in HDFS-6830, in the form of this code:

          -        && (!storedBlock.findDatanode(dn)
          -        || corruptReplicas.isReplicaCorrupt(storedBlock, dn))) {
          +        && (storedBlock.findStorageInfo(storageInfo) == -1 ||
          +            corruptReplicas.isReplicaCorrupt(storedBlock, dn))) {
                            addBlock(...)
          

          However, HDFS-6991 doesn't fix the issue for RBW blocks. Admittedly, it is much less likely for RBW blocks to be shifted between storages, because when restarting a datanode, the RBW replicas become RWR. However, for the sake of robustness, we should implement the shifting behavior there too.

          This patch does that. It also adds logging for the first time we receive a storage report for a given storage. This should happen only once per storage, so it won't generate too many logs. It will be useful for tracing what is going on. It also adds debug logs to the initial storage report, similar to the debug logs available for the non-initial storage report. Finally, it adds a unit test for the shifting behavior. The unit test tests shifting of finalized blocks rather than RBW ones, so it doesn't require the rest of the patch to pass, but it's still very useful for preventing regressions.

          Show
          cmccabe Colin P. McCabe added a comment - Background: HDFS-6830 attempted to implement "block shifting logic," whereby when the NameNode received a report about some replica saying it was in some DataNode storage, it would update the NN's internal data structures to reflect the fact that this replica was not in any other storages on that DataNode. The assumption was (and still is) that each replica is present in at most one storage on each DN (an assumption we might want to revisit at some point, but that's outside the scope of this JIRA...). HDFS-6830 was flawed, however. Although it changed BlockManager#addBlock to update the storage which a particular block was in, it would not actually call BlockManager#addBlock on blocks it received in the full block report, if it had already seen their IDs. So in the case where blocks were moved between storages, HDFS-6830 would not actually update the internal data structures on the NameNode... they would remain in the old storages. HDFS-6991 , although it would appear to be unrelated based on the title, actually has a partial fix for the bug in HDFS-6830 , in the form of this code: - && (!storedBlock.findDatanode(dn) - || corruptReplicas.isReplicaCorrupt(storedBlock, dn))) { + && (storedBlock.findStorageInfo(storageInfo) == -1 || + corruptReplicas.isReplicaCorrupt(storedBlock, dn))) { addBlock(...) However, HDFS-6991 doesn't fix the issue for RBW blocks. Admittedly, it is much less likely for RBW blocks to be shifted between storages, because when restarting a datanode, the RBW replicas become RWR. However, for the sake of robustness, we should implement the shifting behavior there too. This patch does that. It also adds logging for the first time we receive a storage report for a given storage. This should happen only once per storage, so it won't generate too many logs. It will be useful for tracing what is going on. It also adds debug logs to the initial storage report, similar to the debug logs available for the non-initial storage report. Finally, it adds a unit test for the shifting behavior. The unit test tests shifting of finalized blocks rather than RBW ones, so it doesn't require the rest of the patch to pass, but it's still very useful for preventing regressions.
          Hide
          hadoopqa Hadoop QA added a comment -



          -1 overall



          Vote Subsystem Runtime Comment
          0 pre-patch 14m 37s Pre-patch trunk compilation is healthy.
          +1 @author 0m 0s The patch does not contain any @author tags.
          +1 tests included 0m 0s The patch appears to include 1 new or modified test files.
          +1 javac 7m 28s There were no new javac warning messages.
          +1 javadoc 9m 40s There were no new javadoc warning messages.
          +1 release audit 0m 22s The applied patch does not increase the total number of release audit warnings.
          -1 checkstyle 2m 16s The applied patch generated 1 new checkstyle issues (total was 228, now 228).
          +1 whitespace 0m 0s The patch has no lines that end in whitespace.
          +1 install 1m 34s mvn install still works.
          +1 eclipse:eclipse 0m 33s The patch built with eclipse:eclipse.
          +1 findbugs 3m 3s The patch does not introduce any new Findbugs (version 2.0.3) warnings.
          +1 native 3m 12s Pre-build of native portion
          -1 hdfs tests 164m 41s Tests failed in hadoop-hdfs.
              207m 29s  



          Reason Tests
          Failed unit tests hadoop.tools.TestHdfsConfigFields
            hadoop.tracing.TestTraceAdmin



          Subsystem Report/Notes
          Patch URL http://issues.apache.org/jira/secure/attachment/12732424/HDFS-8380.001.patch
          Optional Tests javadoc javac unit findbugs checkstyle
          git revision trunk / f24452d
          checkstyle https://builds.apache.org/job/PreCommit-HDFS-Build/10943/artifact/patchprocess/diffcheckstylehadoop-hdfs.txt
          hadoop-hdfs test log https://builds.apache.org/job/PreCommit-HDFS-Build/10943/artifact/patchprocess/testrun_hadoop-hdfs.txt
          Test Results https://builds.apache.org/job/PreCommit-HDFS-Build/10943/testReport/
          Java 1.7.0_55
          uname Linux asf909.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
          Console output https://builds.apache.org/job/PreCommit-HDFS-Build/10943/console

          This message was automatically generated.

          Show
          hadoopqa Hadoop QA added a comment - -1 overall Vote Subsystem Runtime Comment 0 pre-patch 14m 37s Pre-patch trunk compilation is healthy. +1 @author 0m 0s The patch does not contain any @author tags. +1 tests included 0m 0s The patch appears to include 1 new or modified test files. +1 javac 7m 28s There were no new javac warning messages. +1 javadoc 9m 40s There were no new javadoc warning messages. +1 release audit 0m 22s The applied patch does not increase the total number of release audit warnings. -1 checkstyle 2m 16s The applied patch generated 1 new checkstyle issues (total was 228, now 228). +1 whitespace 0m 0s The patch has no lines that end in whitespace. +1 install 1m 34s mvn install still works. +1 eclipse:eclipse 0m 33s The patch built with eclipse:eclipse. +1 findbugs 3m 3s The patch does not introduce any new Findbugs (version 2.0.3) warnings. +1 native 3m 12s Pre-build of native portion -1 hdfs tests 164m 41s Tests failed in hadoop-hdfs.     207m 29s   Reason Tests Failed unit tests hadoop.tools.TestHdfsConfigFields   hadoop.tracing.TestTraceAdmin Subsystem Report/Notes Patch URL http://issues.apache.org/jira/secure/attachment/12732424/HDFS-8380.001.patch Optional Tests javadoc javac unit findbugs checkstyle git revision trunk / f24452d checkstyle https://builds.apache.org/job/PreCommit-HDFS-Build/10943/artifact/patchprocess/diffcheckstylehadoop-hdfs.txt hadoop-hdfs test log https://builds.apache.org/job/PreCommit-HDFS-Build/10943/artifact/patchprocess/testrun_hadoop-hdfs.txt Test Results https://builds.apache.org/job/PreCommit-HDFS-Build/10943/testReport/ Java 1.7.0_55 uname Linux asf909.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux Console output https://builds.apache.org/job/PreCommit-HDFS-Build/10943/console This message was automatically generated.
          Hide
          atm Aaron T. Myers added a comment -

          Great find, Colin. The patch looks good to me. Pretty confident the test failures are unrelated - both fail for me locally, even without the patch applied. The checkstyle warning is about the length of the BlockManager class file, which I don't think there's much we can do about.

          +1

          Show
          atm Aaron T. Myers added a comment - Great find, Colin. The patch looks good to me. Pretty confident the test failures are unrelated - both fail for me locally, even without the patch applied. The checkstyle warning is about the length of the BlockManager class file, which I don't think there's much we can do about. +1
          Hide
          cmccabe Colin P. McCabe added a comment -

          HDFS-7980 is also a fix for this, since it ensures that the initial block report pathway is always used for the first block report of a storage. And the initial block report pathway doesn't have this bug.

          Show
          cmccabe Colin P. McCabe added a comment - HDFS-7980 is also a fix for this, since it ensures that the initial block report pathway is always used for the first block report of a storage. And the initial block report pathway doesn't have this bug.
          Hide
          hudson Hudson added a comment -

          FAILURE: Integrated in Hadoop-trunk-Commit #7824 (See https://builds.apache.org/job/Hadoop-trunk-Commit/7824/)
          HDFS-8380. Always call addStoredBlock on blocks which have been shifted from one storage to another (cmccabe) (cmccabe: rev 281d47a96937bc329b1b4051ffcb8f5fcac98354)

          • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java
          • hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestNameNodePrunesMissingStorages.java
          • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockInfoContiguous.java
          • hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
          Show
          hudson Hudson added a comment - FAILURE: Integrated in Hadoop-trunk-Commit #7824 (See https://builds.apache.org/job/Hadoop-trunk-Commit/7824/ ) HDFS-8380 . Always call addStoredBlock on blocks which have been shifted from one storage to another (cmccabe) (cmccabe: rev 281d47a96937bc329b1b4051ffcb8f5fcac98354) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestNameNodePrunesMissingStorages.java hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockInfoContiguous.java hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
          Hide
          cmccabe Colin P. McCabe added a comment -

          committed, thanks

          Show
          cmccabe Colin P. McCabe added a comment - committed, thanks
          Hide
          hudson Hudson added a comment -

          FAILURE: Integrated in Hadoop-Yarn-trunk #927 (See https://builds.apache.org/job/Hadoop-Yarn-trunk/927/)
          HDFS-8380. Always call addStoredBlock on blocks which have been shifted from one storage to another (cmccabe) (cmccabe: rev 281d47a96937bc329b1b4051ffcb8f5fcac98354)

          • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockInfoContiguous.java
          • hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestNameNodePrunesMissingStorages.java
          • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java
          • hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
          Show
          hudson Hudson added a comment - FAILURE: Integrated in Hadoop-Yarn-trunk #927 (See https://builds.apache.org/job/Hadoop-Yarn-trunk/927/ ) HDFS-8380 . Always call addStoredBlock on blocks which have been shifted from one storage to another (cmccabe) (cmccabe: rev 281d47a96937bc329b1b4051ffcb8f5fcac98354) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockInfoContiguous.java hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestNameNodePrunesMissingStorages.java hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
          Hide
          hudson Hudson added a comment -

          SUCCESS: Integrated in Hadoop-Yarn-trunk-Java8 #196 (See https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/196/)
          HDFS-8380. Always call addStoredBlock on blocks which have been shifted from one storage to another (cmccabe) (cmccabe: rev 281d47a96937bc329b1b4051ffcb8f5fcac98354)

          • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java
          • hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
          • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockInfoContiguous.java
          • hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestNameNodePrunesMissingStorages.java
          Show
          hudson Hudson added a comment - SUCCESS: Integrated in Hadoop-Yarn-trunk-Java8 #196 (See https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/196/ ) HDFS-8380 . Always call addStoredBlock on blocks which have been shifted from one storage to another (cmccabe) (cmccabe: rev 281d47a96937bc329b1b4051ffcb8f5fcac98354) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockInfoContiguous.java hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestNameNodePrunesMissingStorages.java
          Hide
          hudson Hudson added a comment -

          FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #195 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/195/)
          HDFS-8380. Always call addStoredBlock on blocks which have been shifted from one storage to another (cmccabe) (cmccabe: rev 281d47a96937bc329b1b4051ffcb8f5fcac98354)

          • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockInfoContiguous.java
          • hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
          • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java
          • hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestNameNodePrunesMissingStorages.java
          Show
          hudson Hudson added a comment - FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #195 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/195/ ) HDFS-8380 . Always call addStoredBlock on blocks which have been shifted from one storage to another (cmccabe) (cmccabe: rev 281d47a96937bc329b1b4051ffcb8f5fcac98354) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockInfoContiguous.java hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestNameNodePrunesMissingStorages.java
          Hide
          hudson Hudson added a comment -

          FAILURE: Integrated in Hadoop-Hdfs-trunk #2125 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk/2125/)
          HDFS-8380. Always call addStoredBlock on blocks which have been shifted from one storage to another (cmccabe) (cmccabe: rev 281d47a96937bc329b1b4051ffcb8f5fcac98354)

          • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockInfoContiguous.java
          • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java
          • hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
          • hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestNameNodePrunesMissingStorages.java
          Show
          hudson Hudson added a comment - FAILURE: Integrated in Hadoop-Hdfs-trunk #2125 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk/2125/ ) HDFS-8380 . Always call addStoredBlock on blocks which have been shifted from one storage to another (cmccabe) (cmccabe: rev 281d47a96937bc329b1b4051ffcb8f5fcac98354) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockInfoContiguous.java hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestNameNodePrunesMissingStorages.java
          Hide
          hudson Hudson added a comment -

          FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #185 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/185/)
          HDFS-8380. Always call addStoredBlock on blocks which have been shifted from one storage to another (cmccabe) (cmccabe: rev 281d47a96937bc329b1b4051ffcb8f5fcac98354)

          • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java
          • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockInfoContiguous.java
          • hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
          • hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestNameNodePrunesMissingStorages.java
          Show
          hudson Hudson added a comment - FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #185 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/185/ ) HDFS-8380 . Always call addStoredBlock on blocks which have been shifted from one storage to another (cmccabe) (cmccabe: rev 281d47a96937bc329b1b4051ffcb8f5fcac98354) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockInfoContiguous.java hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestNameNodePrunesMissingStorages.java
          Hide
          hudson Hudson added a comment -

          SUCCESS: Integrated in Hadoop-Mapreduce-trunk #2143 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2143/)
          HDFS-8380. Always call addStoredBlock on blocks which have been shifted from one storage to another (cmccabe) (cmccabe: rev 281d47a96937bc329b1b4051ffcb8f5fcac98354)

          • hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
          • hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestNameNodePrunesMissingStorages.java
          • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java
          • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockInfoContiguous.java
          Show
          hudson Hudson added a comment - SUCCESS: Integrated in Hadoop-Mapreduce-trunk #2143 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2143/ ) HDFS-8380 . Always call addStoredBlock on blocks which have been shifted from one storage to another (cmccabe) (cmccabe: rev 281d47a96937bc329b1b4051ffcb8f5fcac98354) hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestNameNodePrunesMissingStorages.java hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockInfoContiguous.java

            People

            • Assignee:
              cmccabe Colin P. McCabe
              Reporter:
              cmccabe Colin P. McCabe
            • Votes:
              0 Vote for this issue
              Watchers:
              5 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development