Hadoop HDFS / HDFS-13660

DistCp job fails when new data is appended to a file while the DistCp copy job is running

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 3.3.0, 3.1.4, 3.2.2
    • Component/s: distcp
    • Labels: None

    Description

      Steps to reproduce:

      Suppose a DistCp MapReduce job is copying the file /tmp/web_returns_merged/data-m-002 and, while the job is still running, we append more data to this file with the command

      hadoop fs -appendToFile xaa  /tmp/web_returns_merged/data-m-002

      The job fails with the exception

      Mismatch in length of source:hdfs://mycluster0/tmp/web_returns_merged/data-m-002 and target.

      Logs are attached.
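
The race described above can be illustrated with a rough local-filesystem analogy. This is only a sketch: it does not run DistCp or touch HDFS, the file names are borrowed from the report, and the interleaving (copy reads the file, append lands, then a length check runs) is an assumption about one plausible ordering, not a statement of how DistCp verifies copies internally.

```shell
#!/bin/sh
# Local analogy of an append racing with a copy-then-verify step.
# Assumption: the check compares the current source length with the
# number of bytes written to the target. Not actual DistCp code.
set -eu

workdir=$(mktemp -d)
src="$workdir/data-m-002"
dst="$workdir/data-m-002.copy"

# Source file contents at the moment the copy task reads it.
printf 'original rows\n' > "$src"
cp "$src" "$dst"

# An append lands on the source before the length check runs
# (stands in for `hadoop fs -appendToFile`).
printf 'appended rows\n' >> "$src"

# Byte counts of source and target; tr strips padding some wc builds add.
src_len=$(wc -c < "$src" | tr -d ' ')
dst_len=$(wc -c < "$dst" | tr -d ' ')

# Length check performed after the copy.
if [ "$src_len" -ne "$dst_len" ]; then
  result="Mismatch in length: source=$src_len target=$dst_len"
else
  result="Lengths match: $src_len"
fi
echo "$result"

rm -rf "$workdir"
```

Run as written, the script prints `Mismatch in length: source=28 target=14`, because the source grew by 14 bytes after the target was written. Any check that assumes the source is immutable for the duration of the copy would be exposed to the same kind of mismatch.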

            People

              Assignee: Mukund Thakur (mukund-thakur)
              Reporter: Mukund Thakur (mukund-thakur)
              Votes: 0
              Watchers: 8
