Hadoop HDFS / HDFS-13660

DistCp job fails when new data is appended to a file while the DistCp copy job is running

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 3.3.0, 3.1.4, 3.2.2
    • Component/s: distcp
    • Labels:
      None

      Description

      Steps to reproduce: 

      Suppose a DistCp MapReduce job is copying the file /tmp/web_returns_merged/data-m-002 and, while the copy is in progress, more data is appended to that file with the command

      hadoop fs -appendToFile xaa  /tmp/web_returns_merged/data-m-002

      The job then fails with the exception

      Mismatch in length of source:hdfs://mycluster0/tmp/web_returns_merged/data-m-002 and target.

      The logs are attached.
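
The reproduction above can be illustrated with a small simulation. This is only a sketch under stated assumptions: it uses in-memory streams instead of HDFS, and every name in it (`copy_until_eof`, `copy_bounded`, `check_length`, `simulate`) is hypothetical and not part of DistCp. It assumes the job records the source length when the copy is planned and compares it with the number of bytes written afterwards, so a copy that reads to end of file picks up the appended bytes and trips the check, while a copy bounded by the recorded length does not. The change actually made for this issue may differ.

```python
import io

CHUNK = 4  # small buffer so the copy takes several reads


def copy_until_eof(src, dst):
    """Copy until the source reports EOF; also copies bytes appended after listing."""
    copied = 0
    while True:
        buf = src.read(CHUNK)
        if not buf:
            break
        dst.write(buf)
        copied += len(buf)
    return copied


def copy_bounded(src, dst, limit):
    """Copy at most `limit` bytes, i.e. the length recorded before the copy started."""
    copied = 0
    while copied < limit:
        buf = src.read(min(CHUNK, limit - copied))
        if not buf:
            break
        dst.write(buf)
        copied += len(buf)
    return copied


def check_length(expected, actual):
    """Hypothetical post-copy check, modelled on the message in the report."""
    if expected != actual:
        raise IOError(f"Mismatch in length of source ({expected}) and target ({actual})")


def simulate(bounded):
    """Return (recorded source length, bytes copied) for one simulated copy."""
    src = io.BytesIO(b"12345678")
    listed_length = 8            # length recorded when the copy was planned
    src.seek(0, io.SEEK_END)
    src.write(b"xaa")            # stands in for `hadoop fs -appendToFile`
    src.seek(0)
    dst = io.BytesIO()
    if bounded:
        copied = copy_bounded(src, dst, listed_length)
    else:
        copied = copy_until_eof(src, dst)
    return listed_length, copied
```

In this sketch, `check_length(*simulate(False))` raises the mismatch error because the unbounded copy also reads the appended bytes, whereas `check_length(*simulate(True))` passes.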

            People

            • Assignee: Mukund Thakur (mukund-thakur)
            • Reporter: Mukund Thakur (mukund-thakur)
