  1. Hadoop HDFS
  2. HDFS-8031 Follow-on work for erasure coding phase I (striping layout)
  3. HDFS-10971

Distcp should not copy replication factor if source file is erasure coded

Details

    Description

      The current erasure coding implementation uses the replication factor field to store the erasure coding policy.

      Distcp copies the source file's replication factor to the destination if -pr is specified. However, if the source file is erasure coded (EC), the replication factor field (which actually holds the EC policy) should not be copied to the destination file. When an HdfsFileStatus is converted to a FileStatus, the replication factor is set to 0 if the file is an EC file.

      I will attach a test case showing that trying to copy the replication factor of an EC file to the destination results in an IOException: "Requested replication factor of 0 is less than the required minimum of 1 for /tmp/dst/dest2"
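
      To illustrate the intended behavior, here is a minimal, self-contained sketch of the decision described above. It is not the attached patch and does not use the real DistCp or Hadoop classes: SourceFile, shouldPreserveReplication, and targetReplication are hypothetical names, and the default replication of 3 is an assumption made only for the example.

```java
// Hypothetical sketch: decide which replication factor a copy should request
// for the destination file. None of these names come from DistCp itself.
final class ReplicationPreserveSketch {

    /** Stand-in for the few source attributes the decision needs. */
    static final class SourceFile {
        final String path;
        final short replication;    // reported as 0 for EC files, per the description
        final boolean erasureCoded;

        SourceFile(String path, short replication, boolean erasureCoded) {
            this.path = path;
            this.replication = replication;
            this.erasureCoded = erasureCoded;
        }
    }

    /** Preserve replication only if it was requested and the source is not EC. */
    static boolean shouldPreserveReplication(boolean preserveRequested, SourceFile src) {
        return preserveRequested && !src.erasureCoded;
    }

    /** Replication to request for the destination file. */
    static short targetReplication(boolean preserveRequested, SourceFile src,
                                   short defaultReplication) {
        return shouldPreserveReplication(preserveRequested, src)
                ? src.replication
                : defaultReplication;
    }

    public static void main(String[] args) {
        short defaultReplication = 3; // assumed destination default, for illustration
        SourceFile replicated = new SourceFile("/tmp/src/file1", (short) 2, false);
        SourceFile ec = new SourceFile("/tmp/src/file2", (short) 0, true);

        // With preservation requested, the replicated file keeps its factor,
        // while the EC file falls back to the destination default instead of 0.
        System.out.println(replicated.path + " -> "
                + targetReplication(true, replicated, defaultReplication));
        System.out.println(ec.path + " -> "
                + targetReplication(true, ec, defaultReplication));
    }
}
```

      With a check like this, an EC source never causes a replication factor of 0 to be requested for the destination, which is the condition that triggers the IOException quoted above.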

      Attachments

        1. HDFS-10971.testcase.patch
          3 kB
          Wei-Chiu Chuang
        2. HDFS-10971.01.patch
          8 kB
          Manoj Govindassamy
        3. HDFS-10971.02.patch
          11 kB
          Manoj Govindassamy

            People

              manojg Manoj Govindassamy
              weichiu Wei-Chiu Chuang
