Hadoop Common / HADOOP-774

Datanode fails to heartbeat when a directory with a large number of blocks is deleted


Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.10.0
    • Component/s: None
    • Labels: None

    Description

      If a user removes a few very large files, the namenode sends a BlockInvalidate command to the relevant Datanodes. The Datanode processes the blockInvalidate command as part of its heartbeat thread. If the number of blocks to be invalidated is huge, the datanode takes a long time to process it and does not send new heartbeats to the namenode. The namenode then declares the datanode dead.

      1. One option is to process the blockInvalidate command in a separate thread from the heartbeat thread in the Datanode.
      2. Another option would be to constrain the namenode to send at most a fixed number of blocks (e.g. 500) per blockInvalidate message; a sketch of this approach follows the list.
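      A minimal sketch of option 2, assuming hypothetical names (Block, MAX_INVALIDATE_BLOCKS, nextInvalidateBatch); this is not the code in the attached chunkinvalidateBlocks2.java, only an illustration of capping how many block IDs are handed to a datanode per heartbeat while leaving the remainder queued.

      import java.util.ArrayList;
      import java.util.List;

      /**
       * Illustrative only: drain a bounded batch from a datanode's
       * pending-invalidate queue so a single heartbeat reply never
       * carries an unbounded number of blocks.
       */
      public class ChunkedBlockInvalidator {

          /** Upper bound on blocks per blockInvalidate message (e.g. 500). */
          private static final int MAX_INVALIDATE_BLOCKS = 500;

          /** Placeholder for a block identifier. */
          public static final class Block {
              final long blockId;
              Block(long blockId) { this.blockId = blockId; }
          }

          /**
           * Remove and return at most MAX_INVALIDATE_BLOCKS entries from the
           * pending list; the rest stays queued for later heartbeats.
           */
          public static List<Block> nextInvalidateBatch(List<Block> pendingInvalidates) {
              int n = Math.min(pendingInvalidates.size(), MAX_INVALIDATE_BLOCKS);
              List<Block> batch = new ArrayList<Block>(pendingInvalidates.subList(0, n));
              pendingInvalidates.subList(0, n).clear();   // leave the remainder queued
              return batch;
          }
      }

      Because each heartbeat reply now carries a bounded batch, the heartbeat thread's invalidate work stays small and heartbeats keep flowing while the remaining blocks are deleted over subsequent heartbeats.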

      Attachments

        1. chunkinvalidateBlocks2.java (3 kB), attached by Dhruba Borthakur

        Issue Links

        Activity


          People

            Assignee: dhruba Dhruba Borthakur
            Reporter: dhruba Dhruba Borthakur
            Votes: 0
            Watchers: 0

            Dates

              Created:
              Updated:
              Resolved:
