Hadoop HDFS / HDFS-3787

BlockManager#close races with ReplicationMonitor#run

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Minor
    • Resolution: Duplicate
    • Affects Version/s: 2.0.0-alpha
    • Fix Version/s: None
    • Component/s: namenode
    • Labels: None

    Description

      We saw TestDirectoryScanner fail during shutdown:

      2012-08-09 12:17:19,844 WARN  datanode.DataNode (BPServiceActor.java:run(683)) - Ending block pool service for: Block pool BP-610123021-172.29.121.238-1344539835759 (storage id DS-1581877160-172.29.121.238-43609-1344539837880) service to localhost/127.0.0.1:40012
      ...
      2012-08-09 12:17:19,876 FATAL blockmanagement.BlockManager (BlockManager.java:run(3039)) - ReplicationMonitor thread received Runtime exception. 
      java.lang.NullPointerException
      	at org.apache.hadoop.hdfs.server.blockmanagement.BlocksMap.getBlockCollection(BlocksMap.java:101)
      	at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.computeReplicationWorkForBlocks(BlockManager.java:1141)
      	at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.computeReplicationWork(BlockManager.java:1116)
      	at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.computeDatanodeWork(BlockManager.java:3070)
      	at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager$ReplicationMonitor.run(BlockManager.java:3032)
      	at java.lang.Thread.run(Thread.java:662)
      

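      Inspecting the code, it appears that BlockManager#close calls BlocksMap#close, which can set blocks to null while the ReplicationMonitor thread is still inside computeDatanodeWork. That matches the NullPointerException in BlocksMap#getBlockCollection above.

      The fix seems simple: have close just set an exit flag, and have ReplicationMonitor#run call BlocksMap#close once its loop exits, so the map is torn down only by the thread that uses it.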
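
The proposed ordering can be sketched roughly as follows. This is a minimal illustration under assumptions, not the attached patch: ShutdownSketch, SimpleBlocksMap, SimpleBlockManager, shouldRun, and runDemo are hypothetical stand-ins, and the real BlockManager and BlocksMap have far more state. The point is only that close sets a flag and waits, while the monitor thread itself closes the map after its last pass.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: none of these classes are the real HDFS types.
class ShutdownSketch {

    /** Stand-in for BlocksMap: close() drops the backing map, as in the report. */
    static class SimpleBlocksMap {
        private Map<Long, String> blocks = new HashMap<>();

        void put(long blockId, String collection) { blocks.put(blockId, collection); }

        // Throws NullPointerException if called after close().
        String getBlockCollection(long blockId) { return blocks.get(blockId); }

        void close() { blocks = null; }

        boolean isClosed() { return blocks == null; }
    }

    /** Stand-in for BlockManager together with its monitor thread. */
    static class SimpleBlockManager {
        final SimpleBlocksMap blocksMap = new SimpleBlocksMap();
        private volatile boolean shouldRun = true; // the exit flag
        volatile Throwable failure;                // any runtime exception the monitor hit
        private final Thread monitor = new Thread(this::runMonitor, "monitor-sketch");

        void start() { monitor.start(); }

        private void runMonitor() {
            try {
                while (shouldRun) {
                    blocksMap.getBlockCollection(1L); // stands in for computeDatanodeWork
                    Thread.sleep(1);
                }
            } catch (InterruptedException e) {
                // close() interrupted the sleep; fall through to cleanup.
            } catch (RuntimeException e) {
                failure = e;
            } finally {
                // Only the monitor thread tears down the map, after its last pass.
                blocksMap.close();
            }
        }

        /** Signals the monitor to stop and waits for it; does not touch the map. */
        void close() throws InterruptedException {
            shouldRun = false;
            monitor.interrupt();
            monitor.join();
        }
    }

    /** Runs one start/close cycle; true if the monitor exited without an exception. */
    static boolean runDemo() {
        SimpleBlockManager bm = new SimpleBlockManager();
        bm.blocksMap.put(1L, "file-a");
        bm.start();
        try {
            Thread.sleep(50);
            bm.close();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return bm.failure == null && bm.blocksMap.isClosed();
    }

    public static void main(String[] args) {
        System.out.println("clean shutdown: " + runDemo());
    }
}
```

Because close joins the monitor before returning, the caller can still rely on the map being closed afterwards, and the map is never nulled while a pass is in progress.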

      Attachments

        1. hdfs-3787.txt
          1 kB
          Andy Isaacson
        2. hdfs-3787-2.txt
          1 kB
          Eli Collins
        3. hdfs-3787-2.txt
          1 kB
          Andy Isaacson

            People

              Assignee: Andy Isaacson (adi2)
              Reporter: Andy Isaacson (adi2)
              Votes: 0
              Watchers: 8
