Hadoop Common / HADOOP-8940

Add a resume feature to the copyFromLocal and put commands

    Details

    • Type: New Feature
    • Status: In Progress
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: 2.0.1-alpha
    • Fix Version/s: None
    • Component/s: tools
    • Labels: None

      Description

      Add a resume feature to the copyFromLocal command. Failures in large transfers result in a great deal of wasted time. For large files, it would be good to be able to continue from the last good block onwards. The file would have to be unavailable to other clients for reads or regular writes until the "resume" process was completed.
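      The "continue from the last good block" idea can be sketched on local files as follows. This is only an illustration under stated assumptions: `resume_offset` and `resume_copy` are hypothetical names, the block size is passed in by the caller, and the bytes already at the destination up to the last complete block are assumed to be valid (nothing here verifies them). It does not use any HDFS API.

```python
import os


def resume_offset(bytes_at_destination: int, block_size: int) -> int:
    """Round the transferred byte count down to the last complete block boundary."""
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    return (bytes_at_destination // block_size) * block_size


def resume_copy(src_path: str, dst_path: str, block_size: int) -> int:
    """Continue a partial copy and return the offset the copy resumed from.

    Assumption: everything before the returned offset is already correct
    at the destination; a trailing partial block is discarded and re-sent.
    """
    already = os.path.getsize(dst_path) if os.path.exists(dst_path) else 0
    # Never resume past the end of the source file.
    offset = resume_offset(min(already, os.path.getsize(src_path)), block_size)
    # "r+b" keeps the existing bytes; "wb" creates the file when it is missing.
    mode = "r+b" if os.path.exists(dst_path) else "wb"
    with open(src_path, "rb") as src, open(dst_path, mode) as dst:
        src.seek(offset)
        dst.seek(offset)
        dst.truncate()  # drop anything after the last complete block
        while True:
            chunk = src.read(block_size)
            if not chunk:
                break
            dst.write(chunk)
    return offset
```

      Rounding down to a block boundary means at most one block is re-sent after a failure. A real implementation would also need the locking described above so that other clients cannot read or write the file while it is incomplete.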

        Activity

        Transition: Open to In Progress
        Time In Source Status: 1012d 21h 42m
        Execution Times: 1
        Last Executer: Mahesh Dharmasena
        Last Execution Date: Sunday 17:20
        Harsh J made changes -
        Fix Version/s 2.0.1-alpha [ 12322467 ]
        Harsh J added a comment -

        "Then it can look at time stamps on the files, and possibly checksums as well, to pick up where it left off on a failure."

        You could also do this with DistCp's -update flag, passing -Dmapreduce.framework.name=local for local file system (file:///) sources. I'm not sure the checksum checks would work, though, unless the files were written by the Checksumming FS. This is useful when copying many files, but probably not if what's needed is an independent, file-level, append-like resume.
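        As a rough illustration of the DistCp suggestion, the command line could be assembled like this. The helper name `distcp_update_command` and the example URIs are hypothetical; only the `-update` flag and the `-Dmapreduce.framework.name=local` property come from the comment above. The sketch only builds the argument list and does not run anything.

```python
from typing import List


def distcp_update_command(source_uri: str, target_uri: str) -> List[str]:
    """Build a DistCp invocation that runs in the local job runner.

    The generic -D option is placed before DistCp's own flags.
    """
    return [
        "hadoop",
        "distcp",
        "-Dmapreduce.framework.name=local",  # run without a cluster-side MapReduce job
        "-update",  # skip files already present and unchanged at the target
        source_uri,
        target_uri,
    ]


# Hypothetical paths, for illustration only.
command = distcp_update_command("file:///data/big", "hdfs://namenode:8020/data/big")
print(" ".join(command))
```

        The resulting list could be handed to `subprocess.run` on a machine with the Hadoop client installed. Whether the skip decision is reliable for local sources is exactly the open checksum question raised in the comment.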

        Mahesh Dharmasena made changes -
        Status Open [ 1 ] In Progress [ 3 ]
        Mahesh Dharmasena made changes -
        Assignee Mahesh Dharmasena [ mahesh.ksl ]
        Eli Collins made changes -
        Field Original Value New Value
        Project Hadoop HDFS [ 12310942 ] Hadoop Common [ 12310240 ]
        Key HDFS-3971 HADOOP-8940
        Affects Version/s 2.0.1-alpha [ 12322467 ]
        Affects Version/s 2.0.1-alpha [ 12322465 ]
        Target Version/s 2.0.1-alpha [ 12322465 ]
        Fix Version/s 2.0.1-alpha [ 12322467 ]
        Fix Version/s 2.0.1-alpha [ 12322465 ]
        Component/s tools [ 12319643 ]
        Component/s tools [ 12312944 ]
        Adam Muise added a comment -

        Yes, this should probably look like rsync.

        No, sqoop does not support this use case.

        Eli Collins added a comment -

        Yea, something like rsync or Sqoop for file systems seems more appropriate.

        Robert Joseph Evans added a comment -

        It almost sounds like you want to turn this into something like rsync. I think it would be much more useful to just add an rsync command with a similar set of features and flags than to try to reinvent it piecemeal. Then it can look at time stamps on the files, and possibly checksums as well, to pick up where it left off on a failure.
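        A minimal sketch of the time stamp and checksum check described here, written against local files. The names `needs_copy` and `file_sha256` are hypothetical, and the rules (size first, then either modification time or a SHA-256 digest) are an assumption made for illustration; this is not rsync's actual algorithm and not an existing Hadoop command.

```python
import hashlib
import os


def file_sha256(path: str, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large files do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def needs_copy(src: str, dst: str, use_checksum: bool = False) -> bool:
    """Decide whether src must be (re)copied to dst after an interrupted run."""
    if not os.path.exists(dst):
        return True
    src_stat, dst_stat = os.stat(src), os.stat(dst)
    if src_stat.st_size != dst_stat.st_size:
        return True  # a partial or stale copy
    if use_checksum:
        return file_sha256(src) != file_sha256(dst)
    # Quick check: copy again only if the source is newer than the destination.
    return src_stat.st_mtime > dst_stat.st_mtime
```

        A restarted transfer would call `needs_copy` for each file and skip those that return `False`. The quick check is cheap but can miss same-size changes, which is why the checksum option exists.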

        Adam Muise created issue -

          People

          • Assignee: Mahesh Dharmasena
          • Reporter: Adam Muise
          • Votes: 0
          • Watchers: 8
