    Details

    • Type: Sub-task
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.0.0-alpha, 0.23.7
    • Component/s: datanode, namenode
    • Labels:
      None
    • Hadoop Flags:
      Incompatible change

      Description

      I have a cluster with 1800 datanodes. Each datanode hosts around 50000 blocks and sends a block report to the namenode once every hour, so on average the namenode processes one block report every 2 seconds. Each block report lists all blocks that the datanode currently hosts. As a result, the namenode repeatedly compares a huge number of blocks that remain practically unchanged between two consecutive reports, which wastes CPU on the namenode.

      The problem becomes worse when the number of datanodes increases.
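
      The figures above can be checked with a few lines of arithmetic. The sketch below is only an illustration of that calculation; the class and method names are hypothetical and are not part of Hadoop.

```java
// Hypothetical helper, not part of Hadoop: reproduces the load estimate
// from the description (1800 datanodes, ~50000 blocks each, hourly reports).
class BlockReportLoad {
    /** Average number of seconds between block reports arriving at the namenode. */
    public static double secondsBetweenReports(int datanodes, int reportIntervalSeconds) {
        return (double) reportIntervalSeconds / datanodes;
    }

    /** Average number of block entries the namenode has to compare per second. */
    public static double blocksComparedPerSecond(int datanodes, int blocksPerDatanode,
                                                 int reportIntervalSeconds) {
        return (double) datanodes * blocksPerDatanode / reportIntervalSeconds;
    }

    public static void main(String[] args) {
        int datanodes = 1800;
        int blocksPerDatanode = 50_000;
        int reportIntervalSeconds = 3600; // one full report per datanode per hour

        System.out.println("Seconds between reports: "
                + secondsBetweenReports(datanodes, reportIntervalSeconds));
        System.out.println("Block entries compared per second: "
                + blocksComparedPerSecond(datanodes, blocksPerDatanode, reportIntervalSeconds));
    }
}
```

      With these inputs the namenode sees one report every 2 seconds and compares about 25000 block entries per second, and the second figure grows linearly with the number of datanodes.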

      One proposal is to make succeeding block reports (after a successful send of a full block report) be incremental. This will make the namenode process only those blocks that were added/deleted in the last period.
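
      A minimal sketch of the idea, assuming blocks can be identified by a plain long ID and ignoring generation stamps, acknowledgements, and failure handling. All names here (IncrementalBlockReportSketch, DatanodeTracker, Delta, applyDelta) are hypothetical and do not reflect the attached patches or Hadoop's actual API.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch, not Hadoop code: the datanode records only the blocks
// added or deleted since its last report, and the namenode applies that delta.
class IncrementalBlockReportSketch {

    /** Changes accumulated since the last successfully sent report. */
    public static final class Delta {
        public final Set<Long> added;
        public final Set<Long> deleted;

        Delta(Set<Long> added, Set<Long> deleted) {
            this.added = added;
            this.deleted = deleted;
        }
    }

    /** Datanode side: tracks block changes between two reports. */
    public static final class DatanodeTracker {
        private final Set<Long> added = new HashSet<>();
        private final Set<Long> deleted = new HashSet<>();

        public void blockAdded(long blockId) {
            // Re-adding a block deleted in the same period cancels the deletion.
            if (!deleted.remove(blockId)) {
                added.add(blockId);
            }
        }

        public void blockDeleted(long blockId) {
            // Deleting a block added in the same period cancels the addition.
            if (!added.remove(blockId)) {
                deleted.add(blockId);
            }
        }

        /** Returns the pending delta and resets the tracker for the next period. */
        public Delta drain() {
            Delta delta = new Delta(new HashSet<>(added), new HashSet<>(deleted));
            added.clear();
            deleted.clear();
            return delta;
        }
    }

    /** Namenode side: updates its view of one datanode using only the delta. */
    public static void applyDelta(Set<Long> knownBlocks, Delta delta) {
        knownBlocks.removeAll(delta.deleted);
        knownBlocks.addAll(delta.added);
    }

    public static void main(String[] args) {
        // State established by an earlier full block report.
        Set<Long> knownBlocks = new HashSet<>(Set.of(1L, 2L, 3L));

        DatanodeTracker tracker = new DatanodeTracker();
        tracker.blockAdded(4L);
        tracker.blockDeleted(2L);
        tracker.blockAdded(5L);
        tracker.blockDeleted(5L); // added and deleted in the same period: no net change

        applyDelta(knownBlocks, tracker.drain());
        System.out.println(knownBlocks.equals(Set.of(1L, 3L, 4L)));
    }
}
```

      In this sketch the namenode's work per report is proportional to the number of changed blocks rather than to the total number of blocks on the datanode. A real implementation would still need an occasional full report to recover from lost or misapplied deltas.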

        Attachments

        1. blockReportPeriod.patch
          3 kB
          Dhruba Borthakur
        2. explicitAcks.patch-3
          31 kB
          Tomasz Nykiel
        3. explicitAcks.patch-4
          30 kB
          Tomasz Nykiel
        4. explicitAcks.patch-5
          30 kB
          Tomasz Nykiel
        5. explicitAcks.patch-6
          31 kB
          Tomasz Nykiel
        6. explicitDeleteAcks.patch
          20 kB
          Tomasz Nykiel

            People

            • Assignee:
              tnykiel Tomasz Nykiel
            • Reporter:
              dhruba Dhruba Borthakur

              Dates

              • Created:
                Updated:
                Resolved:
