Hadoop HDFS / HDFS-325

DFS should not use a round robin policy when determining on which volume (file system partition) to allocate the next block

    Details

    • Type: Improvement
    • Status: Reopened
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: None
    • Labels: None

      Description

      When multiple file system partitions are configured for the data storage of a data node,
      it uses a strict round robin policy to decide which partition to use for writing the next block.
      This may result in anomalous cases in which the blocks of a file are not evenly distributed across
      the partitions. For example, when we use distcp to copy files with each node running 4 mappers concurrently,
      those 4 mappers write to DFS at about the same rate. Thus, it is possible that the 4 mappers write out
      blocks in an interleaved fashion. If there are 4 file system partitions configured for the local data node,
      it is possible that each mapper will keep writing its blocks onto the same file system partition.

      A simple random placement policy would avoid such anomalous cases, and does not have any obvious drawbacks.
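
      To make the anomaly concrete, here is a small, self-contained simulation; it is illustrative only, and
      VolumePlacementSketch and its constants are hypothetical names rather than actual DataNode code. Four
      writers emit blocks in a strictly interleaved order: with a shared round robin counter every writer's
      blocks land on a single volume, while uniform random placement spreads each writer's blocks across all
      volumes.

          import java.util.Arrays;
          import java.util.Random;

          public class VolumePlacementSketch {

              static final int NUM_VOLUMES = 4;
              static final int NUM_WRITERS = 4;
              static final int BLOCKS_PER_WRITER = 8;

              // Strict round robin: one counter shared by every writer on the data node.
              static int chooseRoundRobin(int[] counter) {
                  return counter[0]++ % NUM_VOLUMES;
              }

              // Random placement: pick any volume uniformly, independent of write order.
              static int chooseRandom(Random rng) {
                  return rng.nextInt(NUM_VOLUMES);
              }

              public static void main(String[] args) {
                  int[] rrCounter = {0};
                  Random rng = new Random(42);
                  int[][] rr  = new int[NUM_WRITERS][NUM_VOLUMES];
                  int[][] rnd = new int[NUM_WRITERS][NUM_VOLUMES];

                  // The 4 writers take turns, one block each per round, mimicking 4 mappers
                  // that write to DFS at about the same rate and interleave their blocks.
                  for (int round = 0; round < BLOCKS_PER_WRITER; round++) {
                      for (int writer = 0; writer < NUM_WRITERS; writer++) {
                          rr[writer][chooseRoundRobin(rrCounter)]++;
                          rnd[writer][chooseRandom(rng)]++;
                      }
                  }

                  System.out.println("round robin (blocks per volume, one row per writer):");
                  for (int[] row : rr)  System.out.println("  " + Arrays.toString(row));
                  System.out.println("random (blocks per volume, one row per writer):");
                  for (int[] row : rnd) System.out.println("  " + Arrays.toString(row));
              }
          }

      Under round robin each row comes out as all 8 blocks on one volume (e.g. [8, 0, 0, 0]); under random
      placement each writer's blocks are spread across the 4 volumes, which is the behavior this issue proposes.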


              People

              • Assignee: dhruba borthakur
              • Reporter: Runping Qi
              • Votes: 0
              • Watchers: 4
