  1. Hadoop Common
  2. HADOOP-14254

Add a Distcp option to preserve Erasure Coding attributes

    Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 3.0.0-alpha4
    • Fix Version/s: None
    • Component/s: tools/distcp
    • Labels: None

      Description

      Currently, Distcp does not preserve erasure coding attributes. I propose adding a "-pe" switch so that files that are erasure coded at the source are copied as erasure coded files at the destination.

      For example, suppose the src cluster has the following directories and files, which are copied to the dest cluster:
      hdfs://src/          root directory, replicated
      hdfs://src/foo       directory with erasure coding enabled
      hdfs://src/foo/bar   erasure coded file

      After distcp, hdfs://dest/foo and hdfs://dest/foo/bar are not erasure coded.
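
The scenario above can be illustrated with a small toy model. This is only a sketch and does not use any Hadoop API: `ToyNamespace`, `toy_distcp`, and the `preserve_ec` flag are hypothetical names invented for illustration, and the policy name "RS-6-3-1024k" is an assumed example value. The model assumes a path without an explicit policy inherits the policy of its nearest ancestor, and that a copy without the proposed option sets no policy at the destination.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class ToyNamespace:
    """Hypothetical stand-in for a file system namespace.

    Maps each path to its explicitly set erasure coding policy name,
    or None when no policy is set on that path.
    """
    policies: Dict[str, Optional[str]] = field(default_factory=dict)

    def add(self, path: str, policy: Optional[str] = None) -> None:
        self.policies[path] = policy

    def effective_policy(self, path: str) -> Optional[str]:
        """Return the nearest explicit policy on the path or its ancestors."""
        current = path
        while True:
            policy = self.policies.get(current)
            if policy is not None:
                return policy
            if current == "/":
                return None  # reached the root: plain replication
            current = current.rsplit("/", 1)[0] or "/"


def toy_distcp(src: ToyNamespace, dest: ToyNamespace,
               preserve_ec: bool = False) -> None:
    """Copy every path; carry the effective policy over only if asked to."""
    for path in sorted(src.policies):
        policy = src.effective_policy(path) if preserve_ec else None
        dest.add(path, policy)


# Source layout from the example (policy name is an assumed value).
src = ToyNamespace()
src.add("/")                      # replicated root
src.add("/foo", "RS-6-3-1024k")   # erasure coding enabled directory
src.add("/foo/bar")               # file, inherits the policy from /foo

plain = ToyNamespace()
toy_distcp(src, plain)                     # current behaviour
kept = ToyNamespace()
toy_distcp(src, kept, preserve_ec=True)    # behaviour proposed for "-pe"

print(plain.effective_policy("/foo/bar"))
print(kept.effective_policy("/foo/bar"))
```

In this model the plain copy leaves /foo/bar without a policy, while the copy with `preserve_ec=True` keeps the policy inherited from /foo. How a real implementation would record and reapply the policy is left open here.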

      Such a capability would be useful in at least two cases: disaster recovery and out-of-place cluster upgrades.

        Attachments

        1. HDFS-11472.001.patch
          10 kB
          Wei-Chiu Chuang
        2. HADOOP-14254.test.patch
          3 kB
          Wei-Chiu Chuang

              People

              • Assignee: Unassigned
              • Reporter: Wei-Chiu Chuang (weichiu)
              • Votes: 0
              • Watchers: 5
