Hadoop Common / HADOOP-1501

Block reports from all datanodes arrive at the namenode within a small band of time


Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.14.0
    • Component/s: None
    • Labels: None

    Description

      I have a 2000-node cluster with the block report interval set to 1 hour. Most block reports arrive within a few minutes of one another. For example, I have seen block reports from all 2000 nodes arrive within 5 minutes of one another. This overloads the CPU on the namenode and causes calls to be dropped from the call queue.

      My proposal is to make the datanode send a block report as soon as it starts. It then waits for a random time between 0 and 1 hour (the configured value) before sending the next block report. From then on, block reports from that datanode are sent once every hour (the configured value).
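
      The schedule proposed above could be sketched roughly as follows. This is only an illustration, not the code in the attached patch: the class name BlockReportSchedule, its methods, and the millisecond clock values passed in by the caller are all hypothetical.

```java
import java.util.Random;

/**
 * Hypothetical sketch of the proposed block report schedule.
 * Not taken from randomBlockReportInterval.patch; all names are assumptions.
 */
final class BlockReportSchedule {
    private final long intervalMs;  // configured block report interval
    private final Random random;    // source of the one-time random offset
    private long nextReportMs;      // time at which the next report is due
    private int reportsSent = 0;    // number of reports sent so far

    BlockReportSchedule(long intervalMs, long startMs, Random random) {
        if (intervalMs <= 0) {
            throw new IllegalArgumentException("intervalMs must be positive");
        }
        this.intervalMs = intervalMs;
        this.random = random;
        // The first report is due as soon as the datanode starts.
        this.nextReportMs = startMs;
    }

    /** Returns true if a block report should be sent at time nowMs. */
    boolean isDue(long nowMs) {
        return nowMs >= nextReportMs;
    }

    /** Records a report sent at nowMs and schedules the following one. */
    void reportSent(long nowMs) {
        reportsSent++;
        if (reportsSent == 1) {
            // After the startup report, wait a random time in [0, intervalMs).
            long delay = (long) (random.nextDouble() * intervalMs);
            nextReportMs = nowMs + delay;
        } else {
            // From then on, report once per configured interval.
            nextReportMs = nowMs + intervalMs;
        }
    }

    long nextReportMs() {
        return nextReportMs;
    }
}
```

      If the random offsets are spread evenly, 2000 datanodes on a 1-hour interval would average roughly 0.56 reports per second at the namenode, compared with about 6.7 per second when all 2000 arrive within 5 minutes.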

      Attachments

        1. randomBlockReportInterval.patch
          0.9 kB
          Dhruba Borthakur

        Activity

          People

            Assignee: Dhruba Borthakur
            Reporter: Dhruba Borthakur
            Votes: 0
            Watchers: 0
