Hadoop HDFS / HDFS-325

DFS should not use a round robin policy in determining on which volume (file system partition) to allocate the next block



    • Type: Improvement
    • Status: Reopened
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: None
    • Labels:


      When multiple file system partitions are configured for a data node's storage,
      the data node uses a strict round robin policy to decide which partition to use for writing the next block.
      This may result in anomalous cases in which the blocks of a file are not evenly distributed across
      the partitions. For example, when we use distcp to copy files with 4 mappers running concurrently on each node,
      those 4 mappers write to DFS at about the same rate. Thus, it is possible that the 4 mappers write out
      their blocks in an interleaved fashion. If there are 4 file system partitions configured for the local data node,
      each mapper may then continue to write all of its blocks to the same file system partition.

      A simple random placement policy would avoid such anomalies and does not have any obvious drawbacks.
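The interleaving problem described above can be illustrated with a small simulation. This is a hypothetical sketch, not the actual FSDataset volume-selection code: it contrasts a strict round robin chooser with a random chooser, and shows that when 4 writers take turns in lock-step on 4 volumes, round robin pins each writer to a single volume. The class and method names are invented for illustration.

```java
import java.util.Random;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch of the two policies discussed in this issue.
public class VolumePolicyDemo {

    // Strict round robin: one shared counter across all writers.
    static class RoundRobinChooser {
        private final AtomicInteger next = new AtomicInteger(0);
        int choose(int numVolumes) {
            return Math.floorMod(next.getAndIncrement(), numVolumes);
        }
    }

    // Random placement: each block goes to a uniformly chosen volume,
    // so no writer can get "locked" onto one partition by interleaving.
    static class RandomChooser {
        private final Random rand = new Random();
        int choose(int numVolumes) {
            return rand.nextInt(numVolumes);
        }
    }

    public static void main(String[] args) {
        int volumes = 4, mappers = 4, blocksPerMapper = 8;
        RoundRobinChooser rr = new RoundRobinChooser();

        // Simulate 4 mappers writing blocks in lock-step: on each round,
        // mapper m asks for a volume in turn, so mapper m always receives
        // counter values congruent to m modulo 4.
        int[][] chosen = new int[mappers][blocksPerMapper];
        for (int b = 0; b < blocksPerMapper; b++) {
            for (int m = 0; m < mappers; m++) {
                chosen[m][b] = rr.choose(volumes);
            }
        }

        // Under round robin with perfect interleaving, every block of a
        // given mapper lands on the same volume -- the anomaly this issue
        // describes.
        for (int m = 0; m < mappers; m++) {
            for (int b = 0; b < blocksPerMapper; b++) {
                if (chosen[m][b] != chosen[m][0]) {
                    throw new AssertionError("expected mapper " + m
                            + " to be pinned to one volume");
                }
            }
        }
        System.out.println("round robin pinned each mapper to one volume");
    }
}
```

With the `RandomChooser` substituted in, the same simulation spreads each mapper's blocks across volumes with high probability, which is the fix this issue proposes.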


        1. randomDatanodePartition.patch (0.7 kB) — Dhruba Borthakur

              • Assignee: Dhruba Borthakur
              • Reporter: Runping Qi
              • Votes: 0
              • Watchers: 4


                • Created: