  1. Hadoop HDFS
  2. HDFS-16439

Makes calculating maxNodesPerRack simpler

Details

    • Improvement
    • Status: In Progress
    • Major
    • Resolution: Unresolved
    • 3.4.0
    • None
    • namenode

    Description

      When creating a new file, the client usually contacts the NameNode first to obtain the DataNodes that will serve as target locations for the file's blocks. During this step, BlockPlacementPolicyDefault#getMaxNodesPerRack() is executed. If the requested number of replicas is so large that the total exceeds the number of nodes in the cluster, the following piece of code runs:
      int clusterSize = clusterMap.getNumOfLeaves();
      int totalNumOfReplicas = numOfChosen + numOfReplicas;
      if (totalNumOfReplicas > clusterSize) {
        numOfReplicas -= (totalNumOfReplicas - clusterSize);
        totalNumOfReplicas = clusterSize;
      }

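      Here, the update of numOfReplicas is more complicated than it needs to be. Because totalNumOfReplicas equals numOfChosen + numOfReplicas, subtracting the overshoot (totalNumOfReplicas - clusterSize) from numOfReplicas always yields clusterSize - numOfChosen, so the update can be written directly as:
      numOfReplicas = clusterSize - numOfChosen;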

      This form is easier to understand and also saves a little CPU (though not much).
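
      The equivalence of the two forms can be checked with a small standalone sketch. It is not Hadoop code: the class name MaxNodesClampSketch and the methods clampOriginal, clampSimplified and formsAgree are hypothetical, and clusterSize is passed in as a plain int rather than read from clusterMap.getNumOfLeaves().

```java
public class MaxNodesClampSketch {

  // Mirrors the existing logic: subtract the overshoot from numOfReplicas.
  static int clampOriginal(int numOfChosen, int numOfReplicas, int clusterSize) {
    int totalNumOfReplicas = numOfChosen + numOfReplicas;
    if (totalNumOfReplicas > clusterSize) {
      numOfReplicas -= (totalNumOfReplicas - clusterSize);
    }
    return numOfReplicas;
  }

  // Proposed form: assign the remaining capacity directly.
  static int clampSimplified(int numOfChosen, int numOfReplicas, int clusterSize) {
    int totalNumOfReplicas = numOfChosen + numOfReplicas;
    if (totalNumOfReplicas > clusterSize) {
      numOfReplicas = clusterSize - numOfChosen;
    }
    return numOfReplicas;
  }

  // Compares both forms for every combination of inputs in [0, limit).
  static boolean formsAgree(int limit) {
    for (int chosen = 0; chosen < limit; chosen++) {
      for (int replicas = 0; replicas < limit; replicas++) {
        for (int size = 0; size < limit; size++) {
          if (clampOriginal(chosen, replicas, size)
              != clampSimplified(chosen, replicas, size)) {
            return false;
          }
        }
      }
    }
    return true;
  }

  public static void main(String[] args) {
    // 1 replica already chosen, 10 more requested, 5 nodes in the cluster.
    System.out.println(clampOriginal(1, 10, 5));
    System.out.println(clampSimplified(1, 10, 5));
    System.out.println(formsAgree(20));
  }
}
```

      Both methods return the same value for any inputs that do not overflow int, since numOfReplicas - (numOfChosen + numOfReplicas - clusterSize) reduces algebraically to clusterSize - numOfChosen.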

            People

              Assignee: jianghuazhu JiangHua Zhu
              Reporter: jianghuazhu JiangHua Zhu
              Votes: 0
              Watchers: 1

                Time Tracking

                  Estimated: Not Specified
                  Remaining: 0h
                  Logged: 40m