  1. Hadoop HDFS
  2. HDFS-822

Appends to already-finalized blocks can rename across volumes

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: 0.21.0, 0.22.0
    • Fix Version/s: 0.21.0
    • Component/s: datanode
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      This is a performance issue. As I understand the code in FSDataset.append, if the block is already finalized, it has to be moved into the RBW directory so it can go back into a "being written" state. The target volume is picked with volumes.getNextVolume, with no preference for the volume the block currently lives on. It seems to me that this could cause a lot of slow cross-volume copies for applications that periodically append/close/append/close a file. Instead, getNextVolume could provide an alternate form that gives preference to a particular volume, so the rename stays on the same disk.
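
      The proposal can be illustrated with a minimal sketch. This is not the actual FSDataset code or the attached patch: the Volume and VolumeSet classes, their fields, and the two-argument getNextVolume overload are hypothetical names invented for illustration, and the round-robin policy is an assumption about how the default choice behaves.

      ```java
      import java.util.ArrayList;
      import java.util.List;

      /** Hypothetical stand-in for one datanode storage volume (one disk). */
      class Volume {
          final String name;
          long available; // free bytes on this volume

          Volume(String name, long available) {
              this.name = name;
              this.available = available;
          }
      }

      /** Hypothetical volume set; illustrates the idea only. */
      class VolumeSet {
          private final List<Volume> volumes = new ArrayList<>();
          private int next = 0; // round-robin cursor

          VolumeSet(List<Volume> vols) {
              volumes.addAll(vols);
          }

          /** Default form: round-robin over volumes with enough free space. */
          synchronized Volume getNextVolume(long bytesNeeded) {
              for (int i = 0; i < volumes.size(); i++) {
                  Volume v = volumes.get(next);
                  next = (next + 1) % volumes.size();
                  if (v.available >= bytesNeeded) {
                      return v;
                  }
              }
              throw new IllegalStateException(
                  "no volume with " + bytesNeeded + " bytes free");
          }

          /**
           * Alternate form: keep the block on the volume it already occupies
           * when that volume has room, so the move into RBW is a same-disk
           * rename. Falls back to the default choice otherwise.
           */
          synchronized Volume getNextVolume(Volume preferred, long bytesNeeded) {
              if (preferred != null
                      && volumes.contains(preferred)
                      && preferred.available >= bytesNeeded) {
                  return preferred;
              }
              return getNextVolume(bytesNeeded);
          }
      }
      ```

      In this sketch the preferred-volume path does not advance the round-robin cursor, so repeated appends to the same block do not skew placement of new blocks. Whether that is the right trade-off for the real datanode is a design question the sketch does not settle.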

        Attachments

        1. HDFS-822.patch
          3 kB
          Hairong Kuang
        2. HDFS-822.patch
          0.7 kB
          Hairong Kuang
        3. HDFS-822.patch
          0.8 kB
          Hairong Kuang

          Activity

            People

            • Assignee: Hairong Kuang (hairong)
            • Reporter: Todd Lipcon (tlipcon)
            • Votes: 0
            • Watchers: 8
