Hadoop HDFS
HDFS-325

DFS should not use round robin policy in determining on which volume (file system partition) to allocate the next block


    • Type: Improvement
    • Status: Reopened
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: None
    • Labels:


      When multiple file system partitions are configured for the data storage of a data node,
      it uses a strict round robin policy to decide which partition to use for writing the next block.
      This may result in anomalous cases in which the blocks of a file are not evenly distributed across
      the partitions. For example, when we use distcp to copy files with each node having 4 mappers running
      concurrently, those 4 mappers write to DFS at about the same rate. Thus, it is possible that the
      4 mappers write out blocks in an interleaved fashion. If there are 4 file system partitions configured
      for the local data node, it is possible that each mapper will continue to write its blocks onto the
      same file system partition.
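
      The lockstep effect described above can be seen with a toy simulation. The sketch below is
      illustrative only; the shared counter and the fixed writer interleaving are assumptions for
      the purpose of the example, not Hadoop's actual FSDataset code:

      public class RoundRobinAnomaly {
          public static void main(String[] args) {
              final int numVolumes = 4;
              final int numWriters = 4;
              int rrCounter = 0; // shared strict round-robin index over the volumes

              // Each round, the writers take turns writing one block apiece,
              // mimicking 4 mappers that write to DFS at about the same rate.
              for (int round = 0; round < 3; round++) {
                  for (int writer = 0; writer < numWriters; writer++) {
                      int volume = rrCounter % numVolumes;
                      rrCounter++;
                      System.out.printf("round %d: mapper %d -> partition %d%n",
                                        round, writer, volume);
                  }
              }
              // Because numWriters == numVolumes and the interleaving is regular,
              // mapper i lands on partition i every round: each file's blocks
              // pile up on a single partition.
          }
      }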

      A simple random placement policy would avoid such anomalous cases, and does not have any obvious drawbacks.
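
      A minimal sketch of such a random policy, assuming a plain list of configured volume paths;
      the class and method names here are hypothetical, not Hadoop's actual API:

      import java.util.Random;

      public class RandomVolumePolicy {
          private final String[] volumes; // configured file system partitions
          private final Random random = new Random();

          public RandomVolumePolicy(String[] volumes) {
              this.volumes = volumes;
          }

          // Pick a partition uniformly at random for the next block. Concurrent
          // writers no longer march in lockstep with a shared counter, so a
          // file's blocks spread evenly across partitions in expectation.
          public String chooseVolumeForNextBlock() {
              return volumes[random.nextInt(volumes.length)];
          }
      }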

            • Assignee:
              dhruba borthakur
            • Reporter:
              Runping Qi
            • Votes:
              0
            • Watchers:
              4

