Hadoop Common / HADOOP-18564

Use file-level checksum by default when copying between two different file systems



    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved



      Reduce user friction


      When distcp'ing between two different file systems, distcp still uses a block-level checksum by default, even though the two file systems can differ substantially in how they manage blocks, so a block-level checksum comparison no longer makes sense between them.

      e.g. distcp between HDFS and Ozone without overriding dfs.checksum.combine.mode throws an IOException because the blocks of the same file on the two FSes are different (as expected):

      $ hadoop distcp -i -pp /test o3fs://buck-test1.vol1.ozone1/
      java.lang.Exception: java.io.IOException: File copy failed: hdfs://duong-1.duong.root.hwx.site:8020/test/test.bin --> o3fs://buck-test1.vol1.ozone1/test/test.bin
      	at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:492)
      	at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:552)
      Caused by: java.io.IOException: File copy failed: hdfs://duong-1.duong.root.hwx.site:8020/test/test.bin --> o3fs://buck-test1.vol1.ozone1/test/test.bin
      	at org.apache.hadoop.tools.mapred.CopyMapper.copyFileWithRetry(CopyMapper.java:262)
      	at org.apache.hadoop.tools.mapred.CopyMapper.map(CopyMapper.java:219)
      	at org.apache.hadoop.tools.mapred.CopyMapper.map(CopyMapper.java:48)
      	at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:146)
      	at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:799)
      	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:347)
      	at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:271)
      	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
      	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
      	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
      	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
      	at java.lang.Thread.run(Thread.java:748)
      Caused by: java.io.IOException: Couldn't run retriable-command: Copying hdfs://duong-1.duong.root.hwx.site:8020/test/test.bin to o3fs://buck-test1.vol1.ozone1/test/test.bin
      	at org.apache.hadoop.tools.util.RetriableCommand.execute(RetriableCommand.java:101)
      	at org.apache.hadoop.tools.mapred.CopyMapper.copyFileWithRetry(CopyMapper.java:258)
      	... 11 more
      Caused by: java.io.IOException: Checksum mismatch between hdfs://duong-1.duong.root.hwx.site:8020/test/test.bin and o3fs://buck-test1.vol1.ozone1/.distcp.tmp.attempt_local1346550241_0001_m_000000_0. Source and destination filesystems are of different types
      Their checksum algorithms may be incompatible. You can choose file-level checksum validation via -Ddfs.checksum.combine.mode=COMPOSITE_CRC when block-sizes or filesystems are different. Or you can skip checksum-checks altogether with -skipcrccheck.

      And it works when we use a file-level checksum like COMPOSITE_CRC:

      With -Ddfs.checksum.combine.mode=COMPOSITE_CRC
      $ hadoop distcp -i -pp /test o3fs://buck-test2.vol1.ozone1/ -Ddfs.checksum.combine.mode=COMPOSITE_CRC
      22/10/18 19:07:42 INFO mapreduce.Job: Job job_local386071499_0001 completed successfully
      22/10/18 19:07:42 INFO mapreduce.Job: Counters: 30
      	File System Counters
      		FILE: Number of bytes read=219900
      		FILE: Number of bytes written=794129
      		FILE: Number of read operations=0
      		FILE: Number of large read operations=0
      		FILE: Number of write operations=0
      		HDFS: Number of bytes read=0
      		HDFS: Number of bytes written=0
      		HDFS: Number of read operations=13
      		HDFS: Number of large read operations=0
      		HDFS: Number of write operations=2
      		HDFS: Number of bytes read erasure-coded=0
      		O3FS: Number of bytes read=0
      		O3FS: Number of bytes written=0
      		O3FS: Number of read operations=5
      		O3FS: Number of large read operations=0
      		O3FS: Number of write operations=0


      (Changing the global default could potentially break distcp'ing between HDFS/S3/etc., and weichiu mentioned COMPOSITE_CRC was only added in Hadoop 3.1.1, so this might be the only way.)

      Don't touch the global default, and make it a client-side config.

      e.g. add a config that automatically enables COMPOSITE_CRC (dfs.checksum.combine.mode) when distcp'ing between HDFS and Ozone. This would be equivalent to specifying -Ddfs.checksum.combine.mode=COMPOSITE_CRC on the distcp command, but the end user wouldn't have to specify it every single time.
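In the meantime, a sketch of the workaround available today (an assumption about deployment, not the new config this issue proposes): the existing dfs.checksum.combine.mode property can be pinned in the client-side configuration so every distcp run picks it up without the -D flag.

```xml
<!-- client-side hdfs-site.xml: make distcp validate with the file-level
     COMPOSITE_CRC checksum instead of the block-level default
     (MD5MD5CRC). Requires Hadoop 3.1.1+ on the client. -->
<property>
  <name>dfs.checksum.combine.mode</name>
  <value>COMPOSITE_CRC</value>
</property>
```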

      cc duongnguyen weichiu




            Assignee: Unassigned
            Reporter: Siyao Meng (smeng)