  1. Hadoop Map/Reduce
  2. MAPREDUCE-654

Add an option -count to distcp for displaying some info about the src files

Details

    • Improvement
    • Status: Closed
    • Major
    • Resolution: Fixed
    • 0.21.0
    • 0.21.0
    • distcp
    • None
    • Reviewed

    Description

      Add a -count option to distcp that displays metadata about the src files, such as the number of files to be copied and their total size.
      With -count, distcp does not copy anything; it only displays this information and exits.
      This is especially useful in combination with -update:
      distcp -update -count <src>* <dst>
      This would display the number of files to be updated and the total size of the data that needs to be copied, determined by comparing file sizes and checksums at src and dst. Based on this information, users can decide how many nodes to allocate for the actual update job.
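
      The comparison described above can be illustrated with a minimal sketch. This is not distcp's implementation or the code in the attached patches; `FileInfo` and `count_pending` are hypothetical names, and the per-file size and checksum are assumed to be already available as in-memory mappings keyed by relative path.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class FileInfo:
    """Hypothetical per-file metadata: size in bytes and a checksum string."""
    size: int
    checksum: str


def count_pending(src: Dict[str, FileInfo],
                  dst: Dict[str, FileInfo]) -> Tuple[int, int]:
    """Return (number of files to copy, total bytes to copy).

    A file counts as pending when it is missing at dst, or when its
    size or checksum differs from the copy at dst.
    """
    files = 0
    total_bytes = 0
    for path, info in src.items():
        existing = dst.get(path)
        if (existing is None
                or existing.size != info.size
                or existing.checksum != info.checksum):
            files += 1
            total_bytes += info.size
    return files, total_bytes


# Illustrative data only: "a" is unchanged, "b" differs by checksum,
# "c" does not exist at dst yet.
src = {
    "a": FileInfo(100, "c1"),
    "b": FileInfo(200, "c2"),
    "c": FileInfo(50, "c3"),
}
dst = {
    "a": FileInfo(100, "c1"),
    "b": FileInfo(200, "cX"),
}

pending_files, pending_bytes = count_pending(src, dst)
print(f"files to copy: {pending_files}, bytes to copy: {pending_bytes}")
```

      The two totals are the kind of figures a user could consult before sizing the actual update job; nothing is copied while computing them.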

      Attachments

        1. d_count_v1.patch
          7 kB
          Ravi Gummadi
        2. d_count.patch
          6 kB
          Ravi Gummadi
        3. d_count654.patch
          8 kB
          Ravi Gummadi
        4. M654-2.patch
          7 kB
          Christopher Douglas

          People

            ravidotg Ravi Gummadi
            ravidotg Ravi Gummadi