Hadoop Common / HADOOP-14254

Add a Distcp option to preserve Erasure Coding attributes

Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.0.0-alpha4
    • Fix Version/s: 3.4.0
    • Component/s: tools/distcp
    • Labels: None
    • Hadoop Flags: Reviewed

    Description

      Currently, DistCp does not preserve erasure coding attributes. I propose adding a "-pe" switch so that files that are erasure coded at the source are copied as erasure coded files at the destination.

      For example, suppose the src cluster has the following directories and files, which are copied to the dest cluster:
      hdfs://src/           root directory, replicated
      hdfs://src/foo        directory with erasure coding enabled
      hdfs://src/foo/bar    erasure coded file

      After distcp, hdfs://dest/foo and hdfs://dest/foo/bar are not erasure coded.

      Adding this capability would be useful in at least two cases: disaster recovery and out-of-place cluster upgrades.
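
      The scenario in the description can be modelled with a small toy sketch. It does not use any Hadoop API: the Entry class, the copy_tree function and the preserve_ec parameter are hypothetical names standing in for the copy with and without the proposed "-pe" switch, and the policy name "RS-6-3-1024k" is an assumption, since the description does not say which policy the source uses.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Entry:
    """One path in a toy namespace (hypothetical model, not a Hadoop class)."""
    is_dir: bool
    ec_policy: Optional[str] = None  # None means plain replication


def copy_tree(src: Dict[str, Entry], preserve_ec: bool) -> Dict[str, Entry]:
    """Copy every entry; keep the EC policy only when preserve_ec is set."""
    dest = {}
    for path, entry in src.items():
        policy = entry.ec_policy if preserve_ec else None
        dest[path] = Entry(is_dir=entry.is_dir, ec_policy=policy)
    return dest


# Source layout from the description; the policy name is an assumption.
SRC = {
    "/": Entry(is_dir=True),
    "/foo": Entry(is_dir=True, ec_policy="RS-6-3-1024k"),
    "/foo/bar": Entry(is_dir=False, ec_policy="RS-6-3-1024k"),
}

without_pe = copy_tree(SRC, preserve_ec=False)  # behaviour reported in the issue
with_pe = copy_tree(SRC, preserve_ec=True)      # behaviour the "-pe" switch asks for

for path in SRC:
    print(path, without_pe[path].ec_policy, with_pe[path].ec_policy)
```

      In this model the replicated root stays replicated in both copies, while /foo and /foo/bar keep their policy only when preserve_ec is set, which is the difference the issue asks DistCp to support.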

      Attachments

        1. HDFS-11472.001.patch (10 kB, Wei-Chiu Chuang)
        2. HADOOP-14254-04.patch (22 kB, Ayush Saxena)
        3. HADOOP-14254-03.patch (29 kB, Ayush Saxena)
        4. HADOOP-14254-02.patch (22 kB, Ayush Saxena)
        5. HADOOP-14254-01.patch (22 kB, Ayush Saxena)
        6. HADOOP-14254.test.patch (3 kB, Wei-Chiu Chuang)

            People

              Assignee: Ayush Saxena (ayushtkn)
              Reporter: Wei-Chiu Chuang (weichiu)
              Votes: 0
              Watchers: 8
