
Details

    • Type: Bug
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.21.0
    • Component/s: distcp
    • Labels: None
    • Hadoop Flags: Reviewed

    Description

      1. distcp -update launches a job whenever the source paths contain at least one directory, even when there is nothing to copy.

      HADOOP-5675 added a fileCount > 0 check to decide whether to launch the job. HADOOP-5762 then changed the check to fileCount + dirCount > 0 so that empty directories are still copied to the destination. With -update, however, dirCount is incremented without checking whether the directory already exists at the destination, so the job is launched because dirCount > 0 even though there is nothing to copy. With -update, dirCount should not be incremented for a directory that already exists at the destination.
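
The counting rule proposed above can be sketched as follows. This is a minimal illustration, not the actual DistCp code or the attached patch: the class `UpdateCountSketch`, its method names, and the use of plain strings for paths are all hypothetical, and the set of existing destination directories stands in for a real file system lookup.

```java
import java.util.List;
import java.util.Set;

// Hypothetical sketch of the proposed counting rule; not the actual DistCp code.
final class UpdateCountSketch {

    /**
     * Counts the source directories that still have to be created at the destination.
     * With -update, a directory that already exists at the destination is not counted.
     */
    static int countDirsToCreate(List<String> sourceDirs,
                                 Set<String> existingDestDirs,
                                 boolean update) {
        int dirCount = 0;
        for (String dir : sourceDirs) {
            if (update && existingDestDirs.contains(dir)) {
                continue; // nothing to do for this directory
            }
            dirCount++;
        }
        return dirCount;
    }

    /** Launch check described in the issue: launch only if something must be copied or created. */
    static boolean shouldLaunchJob(int fileCount, int dirCount) {
        return fileCount + dirCount > 0;
    }

    public static void main(String[] args) {
        List<String> sourceDirs = List.of("a", "a/b");
        Set<String> existingDestDirs = Set.of("a", "a/b");
        int dirCount = countDirsToCreate(sourceDirs, existingDestDirs, true);
        System.out.println("launch job: " + shouldLaunchJob(0, dirCount));
    }
}
```

In this sketch, when every source directory already exists at the destination and no files need copying, both counts are zero and the launch check fails, which is the behavior the issue asks for. Without -update, every source directory is still counted, so the HADOOP-5762 behavior for empty directories is kept.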

      2. distcp -update does not skip copying a single file when the destination file already exists.

      When we run

      hadoop distcp -update srcfilename destfilename

      distcp appears to compare the checksums of srcfilename and destfilename/srcfilename, so the copy is never skipped. It should compare the checksums of srcfilename and destfilename.
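
The difference between the two comparison targets can be illustrated with a small sketch. It is not the actual DistCp code: the class `UpdateTargetSketch`, its methods, and the `singleFileToFile` flag are hypothetical, and treating "a single source file with an existing destination file" as the condition for using the destination path directly is an assumption made for the example.

```java
// Hypothetical sketch of choosing the path to compare against; not the actual DistCp code.
final class UpdateTargetSketch {

    /**
     * Returns the destination path whose checksum should be compared with the source.
     * Assumption: singleFileToFile is true when the source is a single file and the
     * destination already exists as a file; then the destination itself is the target.
     */
    static String compareTarget(String src, String dest, boolean singleFileToFile) {
        if (singleFileToFile) {
            return dest; // compare srcfilename with destfilename
        }
        String name = src.substring(src.lastIndexOf('/') + 1);
        return dest.endsWith("/") ? dest + name : dest + "/" + name;
    }

    /** The copy can be skipped only when both checksums are known and equal. */
    static boolean canSkip(String srcChecksum, String destChecksum) {
        return srcChecksum != null && srcChecksum.equals(destChecksum);
    }

    public static void main(String[] args) {
        System.out.println(compareTarget("/data/srcfilename", "/backup/destfilename", true));
        System.out.println(compareTarget("/data/srcfilename", "/backup/destfilename", false));
    }
}
```

With the flag set, the target is the destination file itself. Without it, the sketch reproduces the reported behavior, where the target becomes destfilename/srcfilename, a path that does not exist, so no checksum is available and the copy is not skipped.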

      See also MAPREDUCE-644.

      Attachments

        1. d_648_644.patch
          8 kB
          Ravi Gummadi
        2. d_dirCount_648.patch
          2 kB
          Ravi Gummadi
        3. d_dirCount648.patch
          2 kB
          Ravi Gummadi
        4. d_dirCount648.v1.patch
          2 kB
          Ravi Gummadi


          People

            Assignee: Ravi Gummadi (ravidotg)
            Reporter: Ravi Gummadi (ravidotg)
            Votes: 0
            Watchers: 3
