Hadoop Common / HADOOP-15208

DistCp to offer -xtrack <path> option to save src/dest filesets as alternative to delete()

    Details

    • Type: New Feature
    • Status: Resolved
    • Priority: Major
    • Resolution: Duplicate
    • Affects Version/s: 2.9.0
    • Fix Version/s: None
    • Component/s: tools/distcp
    • Labels: None

      Description

      There are opportunities to improve DistCp delete performance and scalability against object stores, but any optimization has to be tested against production datasets to confirm that it actually helps and does not, for example, run out of memory.

      An option to save the sequence files of the source and destination listings would let people (myself included) experiment with different strategies offline, rather than committing one that turns out not to scale.
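      As an illustration of the kind of experiment saved listings would allow, here is a minimal Python sketch of one possible delete-planning strategy. It is not DistCp's implementation. It assumes the two listings have already been exported as plain path strings, sorted by path component; reading the sequence files themselves is out of scope here. The function names are hypothetical. The sketch streams both listings to find destination entries with no source counterpart, then drops entries under a directory that is already marked for deletion, on the assumption that the parent is deleted recursively.

```python
from typing import Iterable, Iterator, List


def path_key(path: str) -> List[str]:
    """Sort key: compare paths component by component, so that every
    descendant of a directory sorts directly after that directory."""
    return path.split("/")


def missing_from_source(src_sorted: Iterable[str],
                        dst_sorted: Iterable[str]) -> Iterator[str]:
    """Yield destination paths that have no matching source path.

    Both inputs must be sorted with path_key. Only one entry from each
    listing is held in memory at a time.
    """
    src_iter = iter(src_sorted)
    src = next(src_iter, None)
    for dst in dst_sorted:
        # Advance the source cursor until it reaches or passes dst.
        while src is not None and path_key(src) < path_key(dst):
            src = next(src_iter, None)
        if src != dst:
            yield dst


def prune_children(paths_sorted: Iterable[str]) -> Iterator[str]:
    """Drop paths that sit under the most recently yielded path.

    Assumes the parent is removed with a recursive delete, so deleting
    its children individually would be redundant.
    """
    last = None
    for path in paths_sorted:
        if last is not None and path.startswith(last + "/"):
            continue
        last = path
        yield path


if __name__ == "__main__":
    source = sorted(["/a", "/a/1", "/c"], key=path_key)
    dest = sorted(["/a", "/a/1", "/a-b", "/b", "/b/1", "/b/2", "/c", "/d"],
                  key=path_key)
    for path in prune_children(missing_from_source(source, dest)):
        print("delete", path)
```

      Sorting by path component rather than by raw string matters for the pruning step: with a plain string sort, "/a-b" would land between "/a" and "/a/1", so a directory's children would not always follow it directly.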

        Attachments

        1. HADOOP-15208-001.patch
          69 kB
          Steve Loughran
        2. HADOOP-15208-002.patch
          64 kB
          Steve Loughran
        3. HADOOP-15208-003.patch
          67 kB
          Steve Loughran

              People

              • Assignee: Steve Loughran (stevel@apache.org)
              • Reporter: Steve Loughran (stevel@apache.org)
              • Votes: 0
              • Watchers: 3
