Hadoop HDFS / HDFS-8770

ReplicationMonitor thread received Runtime exception: NullPointerException when BlockManager.chooseExcessReplicates

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Critical
    • Resolution: Duplicate
    • Affects Version/s: 2.6.0, 2.7.0
    • Fix Version/s: None
    • Component/s: namenode
    • Labels: None

    Description

      The NameNode shut down when the ReplicationMonitor thread received a runtime exception:

      2015-07-08 16:43:55,167 ERROR org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: ReplicationMonitor thread received Runtime exception.
      java.lang.NullPointerException
      at org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy.adjustSetsWithChosenReplica(BlockPlacementPolicy.java:189)
      at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseExcessReplicates(BlockManager.java:2911)
      at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.processOverReplicatedBlock(BlockManager.java:2849)
      at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.processMisReplicatedBlock(BlockManager.java:2780)
      at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.rescanPostponedMisreplicatedBlocks(BlockManager.java:1931)
      at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager$ReplicationMonitor.run(BlockManager.java:3628)
      at java.lang.Thread.run(Thread.java:744)

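      We use hadoop-2.6.0 configured with heterogeneous storage, and we applied the One_SSD storage policy to some paths with setStoragePolicy. When a block has excess replicas, for example 2 SSD replicas on different racks (the exactlyOne set) and 2 DISK replicas on the same rack (the moreThanOne set), BlockPlacementPolicyDefault.chooseReplicaToDelete returns null, because only the moreThanOne set is searched for an SSD replica. The null result is consistent with the NullPointerException raised in BlockPlacementPolicy.adjustSetsWithChosenReplica in the stack trace above.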
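
The scenario described above can be illustrated with a small standalone model. This is only a sketch and not Hadoop's code: the class ExcessReplicaSketch, the Replica type, the node and rack names, and both chooser methods are hypothetical. The rule that the exactlyOne set is consulted only when moreThanOne is empty is an assumption made to match the report's description, and the fallback variant is just one possible guard, not necessarily what HDFS-9313 or HDFS-9314 implement.

```java
import java.util.Arrays;
import java.util.List;

public class ExcessReplicaSketch {
    enum StorageType { SSD, DISK }

    // Hypothetical stand-in for a replica location; not a Hadoop class.
    static final class Replica {
        final String node;
        final String rack;
        final StorageType type;

        Replica(String node, String rack, StorageType type) {
            this.node = node;
            this.rack = rack;
            this.type = type;
        }
    }

    // Returns the first replica of the given storage type, or null if none exists.
    static Replica firstOfType(List<Replica> set, StorageType type) {
        for (Replica r : set) {
            if (r.type == type) {
                return r;
            }
        }
        return null;
    }

    // Assumed behaviour from the report: search a single set only.
    // exactlyOne is used only when moreThanOne is empty (assumption).
    static Replica chooseNarrow(List<Replica> moreThanOne, List<Replica> exactlyOne,
                                StorageType excessType) {
        List<Replica> candidates = moreThanOne.isEmpty() ? exactlyOne : moreThanOne;
        return firstOfType(candidates, excessType);
    }

    // Hypothetical guard: fall back to exactlyOne when moreThanOne has no match.
    static Replica chooseWithFallback(List<Replica> moreThanOne, List<Replica> exactlyOne,
                                      StorageType excessType) {
        Replica chosen = firstOfType(moreThanOne, excessType);
        return chosen != null ? chosen : firstOfType(exactlyOne, excessType);
    }

    public static void main(String[] args) {
        // Two SSD replicas, each alone on its rack: the exactlyOne set.
        List<Replica> exactlyOne = Arrays.asList(
            new Replica("dn1", "/rackA", StorageType.SSD),
            new Replica("dn2", "/rackB", StorageType.SSD));
        // Two DISK replicas sharing one rack: the moreThanOne set.
        List<Replica> moreThanOne = Arrays.asList(
            new Replica("dn3", "/rackC", StorageType.DISK),
            new Replica("dn4", "/rackC", StorageType.DISK));

        Replica narrow = chooseNarrow(moreThanOne, exactlyOne, StorageType.SSD);
        Replica fallback = chooseWithFallback(moreThanOne, exactlyOne, StorageType.SSD);

        System.out.println("narrow search: " + (narrow == null ? "null" : narrow.node));
        System.out.println("fallback search: " + (fallback == null ? "null" : fallback.node));
    }
}
```

In this model the narrow search yields null because the only SSD replicas sit in the exactlyOne set, which is never examined. A caller that dereferences the result without a null check would fail in the way the stack trace suggests.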

      Attachments

        1. HDFS-8770_v1.patch
          2 kB
          chunde

          Activity

            Xiao Chen (xiaochen) added a comment:

            Hi aderen,
            Thanks for reporting the issue and providing a patch. The fix makes sense to me.
            Could you add a unit test to reproduce the scenario that you're trying to fix?

            Walter Su (walter.k.su) added a comment:

            HDFS-9313 probably fixed this as a workaround, and HDFS-9314 has been filed to improve on it.

            Closing as a duplicate. Feel free to reopen if you disagree.

            Xiao Chen (xiaochen) added a comment:

            Thanks walter.k.su for the reference. This should be the same as HDFS-9314; I will follow up there.


            People

              Assignee: chunde (aderen)
              Reporter: chunde (aderen)
              Votes: 0
              Watchers: 5