Currently DistCp does not preserve erasure coding attributes. I propose we add a "-pe" switch to ensure that files that are erasure coded at the source are copied as erasure coded files at the destination.
For example, suppose the source cluster has the following directories and files, which are copied to the destination cluster:
hdfs://src/ root directory is replicated
hdfs://src/foo directory with erasure coding enabled
hdfs://src/foo/bar erasure coded file
After DistCp, hdfs://dest/foo and hdfs://dest/foo/bar will not be erasure coded.
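The difference between the current behavior and the proposed one can be sketched with a small model. This is only an illustration and not DistCp code: the `plan_copy` function, the step strings, and the `preserve_ec` parameter (standing in for the proposed "-pe" switch) are all hypothetical, and the policy name RS-6-3-1024k is just an example value, since the description above does not say which policy is in use.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical model of a source tree:
# path -> (is_directory, erasure coding policy name, or None if replicated).
SourceTree = Dict[str, Tuple[bool, Optional[str]]]

# The example tree from the description; the policy name is an assumed value.
EXAMPLE_TREE: SourceTree = {
    "/": (True, None),
    "/foo": (True, "RS-6-3-1024k"),
    "/foo/bar": (False, "RS-6-3-1024k"),
}


def plan_copy(src: SourceTree, preserve_ec: bool) -> List[str]:
    """Return the ordered steps a copy tool could take for each source path."""
    steps: List[str] = []
    # Sorting places a parent path before its children, so a directory's
    # policy is set before any file is written beneath it.
    for path in sorted(src):
        is_dir, policy = src[path]
        keep_policy = preserve_ec and policy is not None
        if is_dir:
            steps.append(f"mkdir {path}")
            if keep_policy:
                steps.append(f"set-ec-policy {path} {policy}")
        elif keep_policy:
            steps.append(f"copy {path} (ec={policy})")
        else:
            steps.append(f"copy {path} (replicated)")
    return steps
```

With `preserve_ec=False` the plan never sets a policy, which corresponds to the behavior described above. With `preserve_ec=True` the policy is applied to /foo before /foo/bar is copied.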
It would be useful to add this capability. One potential use is disaster recovery. Another is out-of-place cluster upgrades.