  1. HBase
  2. HBASE-28686

MapReduceBackupCopyJob should support custom DistCp options

Details

    Description

      Problem

      The MapReduceBackupCopyJob class provides no means for updating DistCp job options. This means that you're stuck with defaults, which isn't always desirable. For example, my workplace would like the freedom to deviate from at least two DistCp defaults:

      1. distcp.direct.write: we would like to set this to true, because writing and renaming tmp files is expensive in S3 (where we store our backups).
      2. The number of mappers that DistCp runs: we would like control over this as well.

      Proposed Solution

      It is not the prettiest solution, but I'm proposing that we support DistCp customizations via the given backup client configuration. This conf -> arg conversion is necessary because we still want to use DistCp's run method, which expects args, so as not to change any error codes. Hadoop actually does something similar in the opposite direction: the DistCp job has logic to convert the args back to configurations (lol).

      Further, the DistCp API is unfortunately not well designed for programmatic use, so it doesn't leave us great alternatives. For example, if you use the run method, it doesn't matter what you pass in as DistCpOptions to the constructor: your options will be overwritten based on the args that you pass in. Alternatively, if you pass the DistCpOptions to the constructor and use DistCp#execute or DistCp#createAndSubmitJob, then you get none of the error specificity!
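
To make the conf -> arg conversion described above concrete, here is a minimal, self-contained sketch. It is not the actual patch: a plain Map stands in for the Hadoop Configuration, and the class name DistCpArgSketch and method toArgs are hypothetical. The two config labels and their switches (distcp.direct.write -> -direct, distcp.max.maps -> -m) reflect my understanding of DistCp's option switches and should be checked against the Hadoop version in use.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: turns DistCp-related configuration entries into the
// command-line style args that DistCp's run method expects.
public class DistCpArgSketch {
  // Config labels whose switch takes no value; emitted only when set to "true".
  static final Map<String, String> FLAG_SWITCHES = new LinkedHashMap<>();
  // Config labels whose switch is followed by the configured value.
  static final Map<String, String> VALUE_SWITCHES = new LinkedHashMap<>();

  static {
    // Assumed label-to-switch pairs; verify against your Hadoop release.
    FLAG_SWITCHES.put("distcp.direct.write", "-direct");
    VALUE_SWITCHES.put("distcp.max.maps", "-m");
  }

  // Builds the arg list: option switches first, then source and destination.
  static List<String> toArgs(Map<String, String> conf, String src, String dst) {
    List<String> args = new ArrayList<>();
    for (Map.Entry<String, String> e : FLAG_SWITCHES.entrySet()) {
      // parseBoolean returns false for null, so unset keys are skipped.
      if (Boolean.parseBoolean(conf.get(e.getKey()))) {
        args.add(e.getValue());
      }
    }
    for (Map.Entry<String, String> e : VALUE_SWITCHES.entrySet()) {
      String value = conf.get(e.getKey());
      if (value != null && !value.isEmpty()) {
        args.add(e.getValue());
        args.add(value);
      }
    }
    args.add(src);
    args.add(dst);
    return args;
  }

  public static void main(String[] unused) {
    Map<String, String> conf = new LinkedHashMap<>();
    conf.put("distcp.direct.write", "true");
    conf.put("distcp.max.maps", "4");
    // Placeholder paths, for illustration only.
    System.out.println(toArgs(conf, "hdfs://src/path", "s3a://bucket/backup"));
  }
}
```

Keys that are unset, or flags set to anything other than "true", produce no args, so the DistCp defaults still apply unless the backup client configuration overrides them.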

        Activity

          hudson Hudson added a comment -

          Results for branch branch-2
          build #1112 on builds.a.o: -1 overall


          details (if available):

          +1 general checks
          – For more information see general report

          +1 jdk8 hadoop2 checks
          – For more information see jdk8 (hadoop2) report

          -1 jdk8 hadoop3 checks
          – For more information see jdk8 (hadoop3) report

          +1 jdk11 hadoop3 checks
          – For more information see jdk11 report

          +1 jdk17 hadoop3 checks
          – For more information see jdk17 report

          +1 source release artifact
          – See build output for details.

          +1 client integration test

          hudson Hudson added a comment -

          Results for branch branch-2.6
          build #172 on builds.a.o: -1 overall


          details (if available):

          +1 general checks
          – For more information see general report

          +1 jdk8 hadoop2 checks
          – For more information see jdk8 (hadoop2) report

          -1 jdk8 hadoop3 checks
          – For more information see jdk8 (hadoop3) report

          +1 jdk11 hadoop3 checks
          – For more information see jdk11 report

          -1 jdk17 hadoop3 checks
          – For more information see jdk17 report

          +1 source release artifact
          – See build output for details.

          +1 client integration test

          ndimiduk Nick Dimiduk added a comment -

          Pushed to branch-2.6+. Thanks for the contribution rmdmattingly. Would you mind also adding a release note explaining the change?

          hudson Hudson added a comment -

          Results for branch master
          build #1125 on builds.a.o: -1 overall


          details (if available):

          +1 general checks
          – For more information see general report

          +1 jdk17 hadoop3 checks
          – For more information see jdk17 report

          +1 source release artifact
          – See build output for details.

          -1 client integration test
          – Something went wrong with this stage, check relevant console output.

          hudson Hudson added a comment -

          Results for branch branch-3
          build #254 on builds.a.o: -1 overall


          details (if available):

          +1 general checks
          – For more information see general report

          +1 jdk17 hadoop3 checks
          – For more information see jdk17 report

          +1 source release artifact
          – See build output for details.

          -1 client integration test
          – Something went wrong with this stage, check relevant console output.


          People

            rmdmattingly Ray Mattingly