Hadoop Common / HADOOP-15208

DistCp to offer -xtrack <path> option to save src/dest filesets as alternative to delete()

Details

    • Type: New Feature
    • Status: Resolved
    • Priority: Major
    • Resolution: Duplicate
    • Affects Version/s: 2.9.0
    • Fix Version/s: None
    • Component/s: tools/distcp
    • Labels: None

    Description

      There are opportunities to improve the performance and scalability of DistCp's delete phase against object stores, but any optimization has to be tested against production datasets to confirm that it actually works and does not, for example, run out of memory.

      With an option to save the sequence files of the source and destination listings, people (myself included) can experiment with different strategies before committing to one, rather than committing to one that turns out not to scale.
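
As an illustration of the kind of experiment saved listings would allow, here is a minimal sketch of one possible strategy: a single-pass merge over two sorted listings that yields the destination entries with no source counterpart, holding only the result in memory. It assumes the listings have already been read and reduced to sorted relative-path strings; the class and method names (`ListingDiff`, `missingFromSource`) are hypothetical and are not part of DistCp, and reading the actual sequence files is out of scope here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

/** Hypothetical helper, not part of DistCp: diff two sorted path listings. */
public class ListingDiff {

  /**
   * Returns the destination paths that have no matching source path.
   * Assumption: both iterators yield paths sorted in natural String order.
   */
  public static List<String> missingFromSource(Iterator<String> src,
                                               Iterator<String> dest) {
    List<String> toDelete = new ArrayList<>();
    String s = src.hasNext() ? src.next() : null;
    while (dest.hasNext()) {
      String d = dest.next();
      // Advance the source cursor until it is >= the current destination path.
      while (s != null && s.compareTo(d) < 0) {
        s = src.hasNext() ? src.next() : null;
      }
      // No source entry equal to d: d is a candidate for deletion.
      if (s == null || !s.equals(d)) {
        toDelete.add(d);
      }
    }
    return toDelete;
  }

  public static void main(String[] args) {
    List<String> src = Arrays.asList("/a", "/b/1", "/c");
    List<String> dest = Arrays.asList("/a", "/b/1", "/b/2", "/d");
    // Prints the destination-only entries: [/b/2, /d]
    System.out.println(missingFromSource(src.iterator(), dest.iterator()));
  }
}
```

Because both inputs are consumed as iterators, the same comparison could in principle be driven by streaming readers over very large listings, which is the sort of scaling question this issue wants to make testable.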

      Attachments

        1. HADOOP-15208-001.patch
          69 kB
          Steve Loughran
        2. HADOOP-15208-002.patch
          64 kB
          Steve Loughran
        3. HADOOP-15208-003.patch
          67 kB
          Steve Loughran

            People

              Assignee: Steve Loughran (stevel@apache.org)
              Reporter: Steve Loughran (stevel@apache.org)
              Votes: 0
              Watchers: 3
