Fix NPE in CacheReplicationMonitor#rescanCachedBlockMap

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.8.0
    • Fix Version/s: 2.8.0, 3.0.0-alpha1
    • Component/s: caching
    • Labels:
      None
    • Target Version/s:
    • Attachments:
      1. HDFS-10525.01.patch (1 kB, Xiao Chen)
      2. HDFS-10525.02.patch (1 kB, Xiao Chen)

        Activity

        xiaochen Xiao Chen added a comment -

        Thanks Colin P. McCabe for the +1 and commit.

        hudson Hudson added a comment -

        SUCCESS: Integrated in Hadoop-trunk-Commit #9966 (See https://builds.apache.org/job/Hadoop-trunk-Commit/9966/)
        HDFS-10525. Fix NPE in CacheReplicationMonitor#rescanCachedBlockMap (cmccabe: rev 2ca73445f5c2929d9c2ff4232dca58a63a0570a0)

        • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/CacheReplicationMonitor.java
        cmccabe Colin P. McCabe added a comment -

        +1.

        Committed to 2.8. Thanks, Xiao Chen.

        cmccabe Colin P. McCabe added a comment -

        +1. Thanks, Xiao Chen.

        hadoopqa Hadoop QA added a comment -
        -1 overall



        Vote Subsystem Runtime Comment
        0 reexec 0m 21s Docker mode activated.
        +1 @author 0m 0s The patch does not contain any @author tags.
        -1 test4tests 0m 0s The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.
        +1 mvninstall 6m 19s trunk passed
        +1 compile 0m 45s trunk passed
        +1 checkstyle 0m 24s trunk passed
        +1 mvnsite 0m 51s trunk passed
        +1 mvneclipse 0m 11s trunk passed
        +1 findbugs 1m 44s trunk passed
        +1 javadoc 0m 55s trunk passed
        +1 mvninstall 0m 47s the patch passed
        +1 compile 0m 44s the patch passed
        +1 javac 0m 44s the patch passed
        +1 checkstyle 0m 22s the patch passed
        +1 mvnsite 0m 50s the patch passed
        +1 mvneclipse 0m 8s the patch passed
        +1 whitespace 0m 0s The patch has no whitespace issues.
        +1 findbugs 1m 47s the patch passed
        +1 javadoc 0m 53s the patch passed
        +1 unit 69m 9s hadoop-hdfs in the patch passed.
        +1 asflicense 0m 19s The patch does not generate ASF License warnings.
        87m 41s



        Subsystem Report/Notes
        Docker Image:yetus/hadoop:2c91fd8
        JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12810590/HDFS-10525.02.patch
        JIRA Issue HDFS-10525
        Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle
        uname Linux dbc32f751c5a 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
        Build tool maven
        Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh
        git revision trunk / 8e8cb4c
        Default Java 1.8.0_91
        findbugs v3.0.0
        Test Results https://builds.apache.org/job/PreCommit-HDFS-Build/15768/testReport/
        modules C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs
        Console output https://builds.apache.org/job/PreCommit-HDFS-Build/15768/console
        Powered by Apache Yetus 0.3.0 http://yetus.apache.org

        This message was automatically generated.

        cmccabe Colin P. McCabe added a comment -

        +1. Thanks, Xiao Chen.

        hadoopqa Hadoop QA added a comment -
        -1 overall



        Vote Subsystem Runtime Comment
        0 reexec 0m 13s Docker mode activated.
        +1 @author 0m 0s The patch does not contain any @author tags.
        -1 test4tests 0m 0s The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.
        +1 mvninstall 7m 5s trunk passed
        +1 compile 0m 51s trunk passed
        +1 checkstyle 0m 28s trunk passed
        +1 mvnsite 0m 59s trunk passed
        +1 mvneclipse 0m 11s trunk passed
        +1 findbugs 1m 48s trunk passed
        +1 javadoc 1m 1s trunk passed
        +1 mvninstall 0m 55s the patch passed
        +1 compile 0m 52s the patch passed
        +1 javac 0m 52s the patch passed
        +1 checkstyle 0m 25s the patch passed
        +1 mvnsite 0m 56s the patch passed
        +1 mvneclipse 0m 9s the patch passed
        +1 whitespace 0m 0s The patch has no whitespace issues.
        +1 findbugs 2m 0s the patch passed
        +1 javadoc 1m 1s the patch passed
        -1 unit 62m 48s hadoop-hdfs in the patch failed.
        +1 asflicense 0m 19s The patch does not generate ASF License warnings.
        83m 22s



        Reason Tests
        Failed junit tests hadoop.hdfs.server.datanode.TestDataNodeHotSwapVolumes



        Subsystem Report/Notes
        Docker Image:yetus/hadoop:2c91fd8
        JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12810566/HDFS-10525.01.patch
        JIRA Issue HDFS-10525
        Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle
        uname Linux b0469cdf6d12 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
        Build tool maven
        Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh
        git revision trunk / 8e8cb4c
        Default Java 1.8.0_91
        findbugs v3.0.0
        unit https://builds.apache.org/job/PreCommit-HDFS-Build/15766/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
        unit test logs https://builds.apache.org/job/PreCommit-HDFS-Build/15766/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
        Test Results https://builds.apache.org/job/PreCommit-HDFS-Build/15766/testReport/
        modules C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs
        Console output https://builds.apache.org/job/PreCommit-HDFS-Build/15766/console
        Powered by Apache Yetus 0.3.0 http://yetus.apache.org

        This message was automatically generated.

        xiaochen Xiao Chen added a comment - - edited

        Thanks Colin P. McCabe, good idea to have more logging to track this. Patch 2 added the log. Truly sorry for missing this check in HDFS-9549.

        cmccabe Colin P. McCabe added a comment -

        Thanks, Xiao Chen. Can you add a LOG.debug to the "if" statement that talks about the block ID that is getting skipped?

        +1 once that's done.

        xiaochen Xiao Chen added a comment -

        Patch 1 to fix this in the capacity check.

        The check added in HDFS-9549 was meant to handle a rare case where a block gets stuck in pending once the capacity is reached.

        I think there are 2 options for the fix:

        • ignore null blockInfo. We already have logic to remove the blocks that cannot be found on NN
        • remove immediately. I feel this may add a source of confusion since we have 1 more place to remove.

        To limit the change scope, patch 1 goes with option 1.
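
        A minimal sketch of option 1 follows, together with the debug log requested in review. It is not the committed patch: the class NullBlockSkipSketch, the simplified BlockInfo type, the pendingBytes method and the use of java.util.logging in place of Hadoop's LOG.debug are all hypothetical stand-ins chosen for illustration. It only shows the shape of the guard: a block ID with no BlockInfo is logged and skipped instead of being dereferenced.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.logging.Logger;

/** Hypothetical stand-in for the capacity check; not the real CacheReplicationMonitor. */
public class NullBlockSkipSketch {
  private static final Logger LOG = Logger.getLogger("NullBlockSkipSketch");

  /** Simplified, assumed stand-in for the NameNode's per-block metadata. */
  static final class BlockInfo {
    final long numBytes;
    BlockInfo(long numBytes) { this.numBytes = numBytes; }
  }

  /**
   * Sums the bytes of pending cached blocks that the block map still knows about.
   * IDs with no BlockInfo are logged, recorded in {@code skipped}, and ignored
   * (option 1), on the assumption that other logic removes them later.
   */
  static long pendingBytes(List<Long> pendingIds,
                           Map<Long, BlockInfo> blocksMap,
                           List<Long> skipped) {
    long total = 0;
    for (long id : pendingIds) {
      BlockInfo info = blocksMap.get(id);
      if (info == null) {
        // Without this guard, info.numBytes below would throw a NullPointerException.
        LOG.fine("Skipping block " + id + ": not found in the block map");
        skipped.add(id);
        continue;
      }
      total += info.numBytes;
    }
    return total;
  }

  public static void main(String[] args) {
    Map<Long, BlockInfo> blocksMap = new HashMap<>();
    blocksMap.put(1L, new BlockInfo(100));
    blocksMap.put(3L, new BlockInfo(50));
    List<Long> skipped = new ArrayList<>();
    // Block 2 is pending in the cache but unknown to the block map.
    long total = pendingBytes(Arrays.asList(1L, 2L, 3L), blocksMap, skipped);
    System.out.println("bytes=" + total + " skipped=" + skipped);
  }
}
```

        Option 2 would replace the skip with an immediate removal of the ID at this point, which is the extra removal site the comment above wants to avoid.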

        xiaochen Xiao Chen added a comment -

        An example stack trace is

        2016-06-13 15:20:32,769 INFO org.apache.hadoop.hdfs.server.blockmanagement.CacheReplicationMonitor: Rescanning because of pending operations
        2016-06-13 15:20:32,770 ERROR org.apache.hadoop.hdfs.server.blockmanagement.CacheReplicationMonitor: Thread exiting
        java.lang.NullPointerException
                at org.apache.hadoop.hdfs.server.blockmanagement.CacheReplicationMonitor.rescanCachedBlockMap(CacheReplicationMonitor.java:507)
                at org.apache.hadoop.hdfs.server.blockmanagement.CacheReplicationMonitor.rescan(CacheReplicationMonitor.java:305)
                at org.apache.hadoop.hdfs.server.blockmanagement.CacheReplicationMonitor.run(CacheReplicationMonitor.java:191)
        

        We should handle the case where the block is not available from the NN.
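
        To illustrate why a single NPE matters here, the sketch below assumes the monitor runs its scans inside one loop that logs and exits on any unexpected exception, as the "Thread exiting" line in the log suggests. The class MonitorExitSketch and the method runScans are hypothetical and only model that behaviour; they are not Hadoop code.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical model of a monitor loop that stops at the first unexpected exception. */
public class MonitorExitSketch {

  /** Runs up to maxScans scans and returns how many completed before the loop exited. */
  static int runScans(Runnable scan, int maxScans) {
    int completed = 0;
    try {
      for (int i = 0; i < maxScans; i++) {
        scan.run();
        completed++;
      }
    } catch (Throwable t) {
      // Assumed to mirror the ERROR "Thread exiting" line: no further scans happen.
      System.err.println("Thread exiting: " + t);
    }
    return completed;
  }

  public static void main(String[] args) {
    Map<Long, String> blocks = new HashMap<>();
    blocks.put(1L, "blk_1");
    long[] ids = {1L, 2L};
    int[] next = {0};
    int done = runScans(() -> {
      long id = ids[next[0]++];
      // For id 2 the lookup returns null, so the call below throws a NullPointerException.
      blocks.get(id).length();
    }, 2);
    System.out.println("scans completed: " + done);
  }
}
```

        Under that assumption, one block missing from the NN is enough to stop all later rescans until the process restarts.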


          People

          • Assignee: Xiao Chen (xiaochen)
          • Reporter: Xiao Chen (xiaochen)
          • Votes: 0
          • Watchers: 5
