Hadoop HDFS / HDFS-13660

DistCp job fails when new data is appended to the file while the DistCp copy job is running

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 3.3.0, 3.1.4, 3.2.2
    • Component/s: distcp
    • Labels:
      None

      Description

      Steps to reproduce:

      Suppose a DistCp MR job is copying the file /tmp/web_returns_merged/data-m-002 and, while the job is still running, we append more data to this file with the command

      hadoop fs -appendToFile xaa /tmp/web_returns_merged/data-m-002

      The job then fails with the exception

      Mismatch in length of source:hdfs://mycluster0/tmp/web_returns_merged/data-m-002 and target.

      The logs are attached.
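
      The race described above can be illustrated with a small in-memory simulation. This is only a sketch and does not use any Hadoop API: it assumes the length check compares the source length recorded when the copy was planned against the number of bytes that ended up in the target. The names copy, check_lengths and run are hypothetical, and the bounded copy is shown as one possible way to avoid the mismatch, not necessarily the change committed for this issue.

```python
def copy(source, limit=None, on_chunk=None, chunk_size=4):
    """Copy bytes from a live source buffer in small chunks.

    If limit is None the copy runs to the current end of the source,
    so data appended during the copy is picked up as well.
    If limit is set, at most `limit` bytes are copied.
    """
    target = bytearray()
    offset = 0
    while True:
        # Re-read the live length on every iteration, like a reader on a growing file.
        end = len(source) if limit is None else min(limit, len(source))
        if offset >= end:
            break
        chunk = source[offset:min(offset + chunk_size, end)]
        target += chunk
        offset += len(chunk)
        if on_chunk is not None:
            on_chunk(offset)
    return target


def check_lengths(recorded_len, target):
    """Hypothetical post-copy check: recorded source length vs. bytes written."""
    if recorded_len != len(target):
        raise IOError(
            f"Mismatch in length of source ({recorded_len}) and target ({len(target)})"
        )


def run(copy_bounded):
    source = bytearray(b"0123456789")
    recorded_len = len(source)  # length captured before the copy starts
    appended = False

    def append_once(_offset):
        # Simulates `hadoop fs -appendToFile` running while the copy is in progress.
        nonlocal appended
        if not appended:
            source.extend(b"xaa!")
            appended = True

    limit = recorded_len if copy_bounded else None
    target = copy(source, limit, append_once)
    check_lengths(recorded_len, target)
    return target
```

      In this sketch, run(copy_bounded=False) raises the length-mismatch error because the copy also reads the appended bytes, while run(copy_bounded=True) stops at the recorded length and passes the check.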

              People

              • Assignee: Mukund Thakur (mukund-thakur)
              • Reporter: Mukund Thakur (mukund-thakur)