Uploaded image for project: 'Hadoop HDFS'
  1. Hadoop HDFS
  2. HDFS-9434

Recommission a datanode with 500k blocks may pause NN for 30 seconds

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.8.0, 2.7.2, 2.6.3, 3.0.0-alpha1
    • Component/s: namenode
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      In BlockManager, processOverReplicatedBlocksOnReCommission is called within the namespace lock. There is a (not very useful) log message printed in processOverReplicatedBlock. When there is a large number of blocks stored in a storage, printing the log message for each block can pause NN to process any other operations. We did see that it could pause NN for 30 seconds for a storage with 500k blocks.

      I suggest to change the log message to trace level as a quick fix.

        Attachments

        1. h9434_20151116_branch-2.6.patch
          1 kB
          Tsz Wo Nicholas Sze
        2. h9434_20151116.patch
          1 kB
          Tsz Wo Nicholas Sze

          Issue Links

            Activity

              People

              • Assignee:
                szetszwo Tsz Wo Nicholas Sze
                Reporter:
                szetszwo Tsz Wo Nicholas Sze
              • Votes:
                0 Vote for this issue
                Watchers:
                16 Start watching this issue

                Dates

                • Created:
                  Updated:
                  Resolved: