Uploaded image for project: 'Hadoop Map/Reduce'
  1. Hadoop Map/Reduce
  2. MAPREDUCE-6638

Do not attempt to recover progress from previous job attempts if spill encryption is enabled

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.7.2
    • Fix Version/s: 2.9.0, 3.0.0-alpha2
    • Component/s: applicationmaster
    • Labels:
      None
    • Target Version/s:
    • Hadoop Flags:
      Reviewed

      Description

      Post the fix to CVE-2015-1776, jobs with ecrypted spills enabled cannot be recovered if the AM fails. We should store the key some place safe so they can actually be recovered. If there is no "safe" place, at least we should restart the job by re-running all mappers/reducers.

      1. mapreduce6638.001.patch
        6 kB
        Haibo Chen
      2. mapreduce6638.002.patch
        6 kB
        Haibo Chen
      3. mapreduce6638.003.patch
        9 kB
        Haibo Chen
      4. mapreduce6638.004.patch
        9 kB
        Haibo Chen
      5. mapreduce6683.005.patch
        9 kB
        Haibo Chen

        Issue Links

          Activity

          Hide
          kasha Karthik Kambatla added a comment -

          Can't we store the key on KMS or encrypted HDFS so the second AM can recover it using delegation tokens?

          Arun Suresh, Vinod Kumar Vavilapalli - thoughts?

          Show
          kasha Karthik Kambatla added a comment - Can't we store the key on KMS or encrypted HDFS so the second AM can recover it using delegation tokens? Arun Suresh , Vinod Kumar Vavilapalli - thoughts?
          Hide
          haibochen Haibo Chen added a comment -

          The patch skips recovery if spill encryption is enabled and rerun all mappers and reducers. Have not seen any safe place to store the spill encryption key in MRAppMaster.

          Show
          haibochen Haibo Chen added a comment - The patch skips recovery if spill encryption is enabled and rerun all mappers and reducers. Have not seen any safe place to store the spill encryption key in MRAppMaster.
          Hide
          hadoopqa Hadoop QA added a comment -
          -1 overall



          Vote Subsystem Runtime Comment
          0 reexec 0m 18s Docker mode activated.
          +1 @author 0m 0s The patch does not contain any @author tags.
          +1 test4tests 0m 0s The patch appears to include 1 new or modified test files.
          +1 mvninstall 7m 8s trunk passed
          +1 compile 0m 23s trunk passed
          +1 checkstyle 0m 17s trunk passed
          +1 mvnsite 0m 29s trunk passed
          +1 mvneclipse 0m 15s trunk passed
          +1 findbugs 0m 41s trunk passed
          +1 javadoc 0m 16s trunk passed
          +1 mvninstall 0m 24s the patch passed
          +1 compile 0m 20s the patch passed
          +1 javac 0m 20s the patch passed
          -1 checkstyle 0m 15s hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app: The patch generated 2 new + 191 unchanged - 0 fixed = 193 total (was 191)
          +1 mvnsite 0m 26s the patch passed
          +1 mvneclipse 0m 12s the patch passed
          +1 whitespace 0m 0s The patch has no whitespace issues.
          +1 findbugs 0m 40s the patch passed
          +1 javadoc 0m 13s the patch passed
          -1 unit 9m 42s hadoop-mapreduce-client-app in the patch failed.
          +1 asflicense 0m 15s The patch does not generate ASF License warnings.
          22m 52s



          Reason Tests
          Failed junit tests hadoop.mapreduce.v2.app.TestRecovery



          Subsystem Report/Notes
          Docker Image:yetus/hadoop:9560f25
          JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12825352/mapreduce6638.001.patch
          JIRA Issue MAPREDUCE-6638
          Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle
          uname Linux 63f482cc2a93 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
          Build tool maven
          Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh
          git revision trunk / a1f3293
          Default Java 1.8.0_101
          findbugs v3.0.0
          checkstyle https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/6695/artifact/patchprocess/diff-checkstyle-hadoop-mapreduce-project_hadoop-mapreduce-client_hadoop-mapreduce-client-app.txt
          unit https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/6695/artifact/patchprocess/patch-unit-hadoop-mapreduce-project_hadoop-mapreduce-client_hadoop-mapreduce-client-app.txt
          unit test logs https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/6695/artifact/patchprocess/patch-unit-hadoop-mapreduce-project_hadoop-mapreduce-client_hadoop-mapreduce-client-app.txt
          Test Results https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/6695/testReport/
          modules C: hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app U: hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app
          Console output https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/6695/console
          Powered by Apache Yetus 0.3.0 http://yetus.apache.org

          This message was automatically generated.

          Show
          hadoopqa Hadoop QA added a comment - -1 overall Vote Subsystem Runtime Comment 0 reexec 0m 18s Docker mode activated. +1 @author 0m 0s The patch does not contain any @author tags. +1 test4tests 0m 0s The patch appears to include 1 new or modified test files. +1 mvninstall 7m 8s trunk passed +1 compile 0m 23s trunk passed +1 checkstyle 0m 17s trunk passed +1 mvnsite 0m 29s trunk passed +1 mvneclipse 0m 15s trunk passed +1 findbugs 0m 41s trunk passed +1 javadoc 0m 16s trunk passed +1 mvninstall 0m 24s the patch passed +1 compile 0m 20s the patch passed +1 javac 0m 20s the patch passed -1 checkstyle 0m 15s hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app: The patch generated 2 new + 191 unchanged - 0 fixed = 193 total (was 191) +1 mvnsite 0m 26s the patch passed +1 mvneclipse 0m 12s the patch passed +1 whitespace 0m 0s The patch has no whitespace issues. +1 findbugs 0m 40s the patch passed +1 javadoc 0m 13s the patch passed -1 unit 9m 42s hadoop-mapreduce-client-app in the patch failed. +1 asflicense 0m 15s The patch does not generate ASF License warnings. 22m 52s Reason Tests Failed junit tests hadoop.mapreduce.v2.app.TestRecovery Subsystem Report/Notes Docker Image:yetus/hadoop:9560f25 JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12825352/mapreduce6638.001.patch JIRA Issue MAPREDUCE-6638 Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle uname Linux 63f482cc2a93 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux Build tool maven Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh git revision trunk / a1f3293 Default Java 1.8.0_101 findbugs v3.0.0 checkstyle https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/6695/artifact/patchprocess/diff-checkstyle-hadoop-mapreduce-project_hadoop-mapreduce-client_hadoop-mapreduce-client-app.txt unit https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/6695/artifact/patchprocess/patch-unit-hadoop-mapreduce-project_hadoop-mapreduce-client_hadoop-mapreduce-client-app.txt unit test logs https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/6695/artifact/patchprocess/patch-unit-hadoop-mapreduce-project_hadoop-mapreduce-client_hadoop-mapreduce-client-app.txt Test Results https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/6695/testReport/ modules C: hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app U: hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app Console output https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/6695/console Powered by Apache Yetus 0.3.0 http://yetus.apache.org This message was automatically generated.
          Hide
          haibochen Haibo Chen added a comment -

          Uploading a new patch to address the unit test failure and check style warning

          Show
          haibochen Haibo Chen added a comment - Uploading a new patch to address the unit test failure and check style warning
          Hide
          hadoopqa Hadoop QA added a comment -
          +1 overall



          Vote Subsystem Runtime Comment
          0 reexec 0m 18s Docker mode activated.
          +1 @author 0m 0s The patch does not contain any @author tags.
          +1 test4tests 0m 0s The patch appears to include 1 new or modified test files.
          +1 mvninstall 6m 51s trunk passed
          +1 compile 0m 22s trunk passed
          +1 checkstyle 0m 18s trunk passed
          +1 mvnsite 0m 29s trunk passed
          +1 mvneclipse 0m 16s trunk passed
          +1 findbugs 0m 36s trunk passed
          +1 javadoc 0m 15s trunk passed
          +1 mvninstall 0m 22s the patch passed
          +1 compile 0m 20s the patch passed
          +1 javac 0m 20s the patch passed
          +1 checkstyle 0m 15s the patch passed
          +1 mvnsite 0m 27s the patch passed
          +1 mvneclipse 0m 12s the patch passed
          +1 whitespace 0m 0s The patch has no whitespace issues.
          +1 findbugs 0m 40s the patch passed
          +1 javadoc 0m 13s the patch passed
          +1 unit 8m 48s hadoop-mapreduce-client-app in the patch passed.
          +1 asflicense 0m 15s The patch does not generate ASF License warnings.
          21m 35s



          Subsystem Report/Notes
          Docker Image:yetus/hadoop:9560f25
          JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12825396/mapreduce6638.002.patch
          JIRA Issue MAPREDUCE-6638
          Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle
          uname Linux 242a65b7551a 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
          Build tool maven
          Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh
          git revision trunk / 5a6fc5f
          Default Java 1.8.0_101
          findbugs v3.0.0
          Test Results https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/6697/testReport/
          modules C: hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app U: hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app
          Console output https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/6697/console
          Powered by Apache Yetus 0.3.0 http://yetus.apache.org

          This message was automatically generated.

          Show
          hadoopqa Hadoop QA added a comment - +1 overall Vote Subsystem Runtime Comment 0 reexec 0m 18s Docker mode activated. +1 @author 0m 0s The patch does not contain any @author tags. +1 test4tests 0m 0s The patch appears to include 1 new or modified test files. +1 mvninstall 6m 51s trunk passed +1 compile 0m 22s trunk passed +1 checkstyle 0m 18s trunk passed +1 mvnsite 0m 29s trunk passed +1 mvneclipse 0m 16s trunk passed +1 findbugs 0m 36s trunk passed +1 javadoc 0m 15s trunk passed +1 mvninstall 0m 22s the patch passed +1 compile 0m 20s the patch passed +1 javac 0m 20s the patch passed +1 checkstyle 0m 15s the patch passed +1 mvnsite 0m 27s the patch passed +1 mvneclipse 0m 12s the patch passed +1 whitespace 0m 0s The patch has no whitespace issues. +1 findbugs 0m 40s the patch passed +1 javadoc 0m 13s the patch passed +1 unit 8m 48s hadoop-mapreduce-client-app in the patch passed. +1 asflicense 0m 15s The patch does not generate ASF License warnings. 21m 35s Subsystem Report/Notes Docker Image:yetus/hadoop:9560f25 JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12825396/mapreduce6638.002.patch JIRA Issue MAPREDUCE-6638 Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle uname Linux 242a65b7551a 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux Build tool maven Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh git revision trunk / 5a6fc5f Default Java 1.8.0_101 findbugs v3.0.0 Test Results https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/6697/testReport/ modules C: hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app U: hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app Console output https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/6697/console Powered by Apache Yetus 0.3.0 http://yetus.apache.org This message was automatically generated.
          Hide
          kasha Karthik Kambatla added a comment -

          We should have two JIRAs for this - (1) Avoid recovering an AM if encrypted spill is enabled, and (2) Support recovering an AM even when encrypted spill is enabled. I am fine with using this JIRA for either, but we should file the other one too.

          Comments on the patch:

          1. The comment for the variable can be simplified to be clear. For instance, "When intermediate data is encrypted, recovering the job requires access to the key. Until the encryption key is persisted, we should avoid recovery attempts."
          2. Can the boolean condition be simplified to numReduceTasks <= 0 || shuffleKeyValidForRecovery || !spillEncrypted?
          3. I assume the new method recovered() is for tests only. Should we annotate it @VisibleForTesting?
          Show
          kasha Karthik Kambatla added a comment - We should have two JIRAs for this - (1) Avoid recovering an AM if encrypted spill is enabled, and (2) Support recovering an AM even when encrypted spill is enabled. I am fine with using this JIRA for either, but we should file the other one too. Comments on the patch: The comment for the variable can be simplified to be clear. For instance, "When intermediate data is encrypted, recovering the job requires access to the key. Until the encryption key is persisted, we should avoid recovery attempts." Can the boolean condition be simplified to numReduceTasks <= 0 || shuffleKeyValidForRecovery || !spillEncrypted ? I assume the new method recovered() is for tests only. Should we annotate it @VisibleForTesting?
          Hide
          haibochen Haibo Chen added a comment -

          Karthik Kambatla Thanks for your reviews! I have left this jira to do 1), so I will create another jira that does 2) after this is done. But as I said in my previous comment, MR does not seem to have any safe store to persist the encryption key across job attempts (Not sure how much efforts it is to introduce one).

          Can the boolean condition be simplified to numReduceTasks <= 0 || shuffleKeyValidForRecovery || !spillEncrypted?

          Not quite sure about when to make something like this incline or a tiny method. I was thinking of using the method name as comment sort of. Any guideline on this?

          Show
          haibochen Haibo Chen added a comment - Karthik Kambatla Thanks for your reviews! I have left this jira to do 1), so I will create another jira that does 2) after this is done. But as I said in my previous comment, MR does not seem to have any safe store to persist the encryption key across job attempts (Not sure how much efforts it is to introduce one). Can the boolean condition be simplified to numReduceTasks <= 0 || shuffleKeyValidForRecovery || !spillEncrypted? Not quite sure about when to make something like this incline or a tiny method. I was thinking of using the method name as comment sort of. Any guideline on this?
          Hide
          kasha Karthik Kambatla added a comment -

          MR does not seem to have any safe store to persist the encryption key across job attempts

          One could store the encrypted key in KMS. Once stored, we could do one of the following:

          1. Tasks have a way to fetch this key directly
          2. Leave the tasks as is, but augment the AM to recover this key as part of recovery.

          Not quite sure about when to make something like this incline or a tiny method.

          It is okay to pull this into a separate method. That said, I feel this method could do with improved logging. For instance, when we don't recover, the method dumps a bunch of information that is hard for end-users to parse. Instead, I would like something like this:

              boolean attemptRecovery = true;
              boolean recoveryEnabled = getConfig().getBoolean(
                  MRJobConfig.MR_AM_JOB_RECOVERY_ENABLE,
                  MRJobConfig.MR_AM_JOB_RECOVERY_ENABLE_DEFAULT);
              if (!recoveryEnabled) {
                LOG.info("Not attempting to recover. Recovery disabled. To enable " +
                    "recovery, set " + MRJobConfig.MR_AM_JOB_RECOVERY_ENABLE);
                attemptRecovery = false;
              }
          
              boolean recoverySupportedByCommitter = isRecoverySupported();
              if (!recoverySupportedByCommitter) {
                LOG.info("Not attempting to recover. Recovery is not supported by " +
                    "committer. Use an OutputCommitter that allows recovery.");
                attemptRecovery = false;
              }
          
              int numReduceTasks = getConfig().getInt(MRJobConfig.NUM_REDUCES, 0);
              boolean shuffleKeyValidForRecovery =
                  TokenCache.getShuffleSecretKey(jobCredentials) != null;
              if (numReduceTasks > 0 && !shuffleKeyValidForRecovery) {
                LOG.info("Not attempting to recover. Shuffle key is not valid for " +
                    "recovery.");
                attemptRecovery = false;
              }
          
              boolean recoverySucceeded = true;
              if (attemptRecovery) {
                LOG.info("Attempting to recover.");
                try {
                  parsePreviousJobHistory();
                } catch (IOException e) {
                  LOG.warn("Unable to parse prior job history, aborting recovery", e);
                  recoverySucceeded = false;
                }
              }
              
              if (!attemptRecovery || !recoverySucceeded) {
                // Get the amInfos anyways whether recovery is enabled or not
                am
          

          Alternatively, processRecovery could return true if recovery succeeded or if it is first attempt, and false otherwise.

          Show
          kasha Karthik Kambatla added a comment - MR does not seem to have any safe store to persist the encryption key across job attempts One could store the encrypted key in KMS. Once stored, we could do one of the following: Tasks have a way to fetch this key directly Leave the tasks as is, but augment the AM to recover this key as part of recovery. Not quite sure about when to make something like this incline or a tiny method. It is okay to pull this into a separate method. That said, I feel this method could do with improved logging. For instance, when we don't recover, the method dumps a bunch of information that is hard for end-users to parse. Instead, I would like something like this: boolean attemptRecovery = true ; boolean recoveryEnabled = getConfig().getBoolean( MRJobConfig.MR_AM_JOB_RECOVERY_ENABLE, MRJobConfig.MR_AM_JOB_RECOVERY_ENABLE_DEFAULT); if (!recoveryEnabled) { LOG.info( "Not attempting to recover. Recovery disabled. To enable " + "recovery, set " + MRJobConfig.MR_AM_JOB_RECOVERY_ENABLE); attemptRecovery = false ; } boolean recoverySupportedByCommitter = isRecoverySupported(); if (!recoverySupportedByCommitter) { LOG.info( "Not attempting to recover. Recovery is not supported by " + "committer. Use an OutputCommitter that allows recovery." ); attemptRecovery = false ; } int numReduceTasks = getConfig().getInt(MRJobConfig.NUM_REDUCES, 0); boolean shuffleKeyValidForRecovery = TokenCache.getShuffleSecretKey(jobCredentials) != null ; if (numReduceTasks > 0 && !shuffleKeyValidForRecovery) { LOG.info( "Not attempting to recover. Shuffle key is not valid for " + "recovery." ); attemptRecovery = false ; } boolean recoverySucceeded = true ; if (attemptRecovery) { LOG.info( "Attempting to recover." ); try { parsePreviousJobHistory(); } catch (IOException e) { LOG.warn( "Unable to parse prior job history, aborting recovery" , e); recoverySucceeded = false ; } } if (!attemptRecovery || !recoverySucceeded) { // Get the amInfos anyways whether recovery is enabled or not am Alternatively, processRecovery could return true if recovery succeeded or if it is first attempt, and false otherwise.
          Hide
          haibochen Haibo Chen added a comment -

          Thanks for your comments, Karthik! Uploading a new patch to address your suggestions, which does make the logging much more user-friendly.

          One could store the encrypted key in KMS. Once stored, we could do one of the following: 1) Tasks have a way to fetch this key directly 2) Leave the tasks as is, but augment the AM to recover this key as part of recovery.

          I think option 2 is advantageous in that the change needed is minimal and only one query for each job attempt to the safe store is needed instead of one for each task.

          Will create a follow up jira once this is done.

          Show
          haibochen Haibo Chen added a comment - Thanks for your comments, Karthik! Uploading a new patch to address your suggestions, which does make the logging much more user-friendly. One could store the encrypted key in KMS. Once stored, we could do one of the following: 1) Tasks have a way to fetch this key directly 2) Leave the tasks as is, but augment the AM to recover this key as part of recovery. I think option 2 is advantageous in that the change needed is minimal and only one query for each job attempt to the safe store is needed instead of one for each task. Will create a follow up jira once this is done.
          Hide
          hadoopqa Hadoop QA added a comment -
          -1 overall



          Vote Subsystem Runtime Comment
          0 reexec 0m 11s Docker mode activated.
          +1 @author 0m 0s The patch does not contain any @author tags.
          +1 test4tests 0m 0s The patch appears to include 1 new or modified test files.
          +1 mvninstall 6m 59s trunk passed
          +1 compile 0m 24s trunk passed
          +1 checkstyle 0m 18s trunk passed
          +1 mvnsite 0m 28s trunk passed
          +1 mvneclipse 0m 15s trunk passed
          +1 findbugs 0m 34s trunk passed
          +1 javadoc 0m 15s trunk passed
          +1 mvninstall 0m 23s the patch passed
          +1 compile 0m 19s the patch passed
          +1 javac 0m 19s the patch passed
          -1 checkstyle 0m 15s hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app: The patch generated 1 new + 191 unchanged - 0 fixed = 192 total (was 191)
          +1 mvnsite 0m 26s the patch passed
          +1 mvneclipse 0m 12s the patch passed
          +1 whitespace 0m 0s The patch has no whitespace issues.
          +1 findbugs 0m 39s the patch passed
          +1 javadoc 0m 12s the patch passed
          +1 unit 8m 52s hadoop-mapreduce-client-app in the patch passed.
          +1 asflicense 0m 15s The patch does not generate ASF License warnings.
          21m 33s



          Subsystem Report/Notes
          Docker Image:yetus/hadoop:9560f25
          JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12828963/mapreduce6638.003.patch
          JIRA Issue MAPREDUCE-6638
          Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle
          uname Linux ba895a9268a2 3.13.0-93-generic #140-Ubuntu SMP Mon Jul 18 21:21:05 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
          Build tool maven
          Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh
          git revision trunk / 4174b97
          Default Java 1.8.0_101
          findbugs v3.0.0
          checkstyle https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/6727/artifact/patchprocess/diff-checkstyle-hadoop-mapreduce-project_hadoop-mapreduce-client_hadoop-mapreduce-client-app.txt
          Test Results https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/6727/testReport/
          modules C: hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app U: hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app
          Console output https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/6727/console
          Powered by Apache Yetus 0.3.0 http://yetus.apache.org

          This message was automatically generated.

          Show
          hadoopqa Hadoop QA added a comment - -1 overall Vote Subsystem Runtime Comment 0 reexec 0m 11s Docker mode activated. +1 @author 0m 0s The patch does not contain any @author tags. +1 test4tests 0m 0s The patch appears to include 1 new or modified test files. +1 mvninstall 6m 59s trunk passed +1 compile 0m 24s trunk passed +1 checkstyle 0m 18s trunk passed +1 mvnsite 0m 28s trunk passed +1 mvneclipse 0m 15s trunk passed +1 findbugs 0m 34s trunk passed +1 javadoc 0m 15s trunk passed +1 mvninstall 0m 23s the patch passed +1 compile 0m 19s the patch passed +1 javac 0m 19s the patch passed -1 checkstyle 0m 15s hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app: The patch generated 1 new + 191 unchanged - 0 fixed = 192 total (was 191) +1 mvnsite 0m 26s the patch passed +1 mvneclipse 0m 12s the patch passed +1 whitespace 0m 0s The patch has no whitespace issues. +1 findbugs 0m 39s the patch passed +1 javadoc 0m 12s the patch passed +1 unit 8m 52s hadoop-mapreduce-client-app in the patch passed. +1 asflicense 0m 15s The patch does not generate ASF License warnings. 21m 33s Subsystem Report/Notes Docker Image:yetus/hadoop:9560f25 JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12828963/mapreduce6638.003.patch JIRA Issue MAPREDUCE-6638 Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle uname Linux ba895a9268a2 3.13.0-93-generic #140-Ubuntu SMP Mon Jul 18 21:21:05 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux Build tool maven Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh git revision trunk / 4174b97 Default Java 1.8.0_101 findbugs v3.0.0 checkstyle https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/6727/artifact/patchprocess/diff-checkstyle-hadoop-mapreduce-project_hadoop-mapreduce-client_hadoop-mapreduce-client-app.txt Test Results https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/6727/testReport/ modules C: hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app U: hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app Console output https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/6727/console Powered by Apache Yetus 0.3.0 http://yetus.apache.org This message was automatically generated.
          Hide
          hadoopqa Hadoop QA added a comment -
          +1 overall



          Vote Subsystem Runtime Comment
          0 reexec 0m 11s Docker mode activated.
          +1 @author 0m 0s The patch does not contain any @author tags.
          +1 test4tests 0m 0s The patch appears to include 1 new or modified test files.
          +1 mvninstall 6m 53s trunk passed
          +1 compile 0m 24s trunk passed
          +1 checkstyle 0m 18s trunk passed
          +1 mvnsite 0m 30s trunk passed
          +1 mvneclipse 0m 16s trunk passed
          +1 findbugs 0m 40s trunk passed
          +1 javadoc 0m 16s trunk passed
          +1 mvninstall 0m 23s the patch passed
          +1 compile 0m 20s the patch passed
          +1 javac 0m 20s the patch passed
          +1 checkstyle 0m 15s the patch passed
          +1 mvnsite 0m 25s the patch passed
          +1 mvneclipse 0m 12s the patch passed
          +1 whitespace 0m 0s The patch has no whitespace issues.
          +1 findbugs 0m 43s the patch passed
          +1 javadoc 0m 13s the patch passed
          +1 unit 9m 4s hadoop-mapreduce-client-app in the patch passed.
          +1 asflicense 0m 15s The patch does not generate ASF License warnings.
          21m 56s



          Subsystem Report/Notes
          Docker Image:yetus/hadoop:9560f25
          JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12828966/mapreduce6638.004.patch
          JIRA Issue MAPREDUCE-6638
          Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle
          uname Linux b1be9b1eecdb 3.13.0-93-generic #140-Ubuntu SMP Mon Jul 18 21:21:05 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
          Build tool maven
          Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh
          git revision trunk / 4174b97
          Default Java 1.8.0_101
          findbugs v3.0.0
          Test Results https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/6728/testReport/
          modules C: hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app U: hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app
          Console output https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/6728/console
          Powered by Apache Yetus 0.3.0 http://yetus.apache.org

          This message was automatically generated.

          Show
          hadoopqa Hadoop QA added a comment - +1 overall Vote Subsystem Runtime Comment 0 reexec 0m 11s Docker mode activated. +1 @author 0m 0s The patch does not contain any @author tags. +1 test4tests 0m 0s The patch appears to include 1 new or modified test files. +1 mvninstall 6m 53s trunk passed +1 compile 0m 24s trunk passed +1 checkstyle 0m 18s trunk passed +1 mvnsite 0m 30s trunk passed +1 mvneclipse 0m 16s trunk passed +1 findbugs 0m 40s trunk passed +1 javadoc 0m 16s trunk passed +1 mvninstall 0m 23s the patch passed +1 compile 0m 20s the patch passed +1 javac 0m 20s the patch passed +1 checkstyle 0m 15s the patch passed +1 mvnsite 0m 25s the patch passed +1 mvneclipse 0m 12s the patch passed +1 whitespace 0m 0s The patch has no whitespace issues. +1 findbugs 0m 43s the patch passed +1 javadoc 0m 13s the patch passed +1 unit 9m 4s hadoop-mapreduce-client-app in the patch passed. +1 asflicense 0m 15s The patch does not generate ASF License warnings. 21m 56s Subsystem Report/Notes Docker Image:yetus/hadoop:9560f25 JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12828966/mapreduce6638.004.patch JIRA Issue MAPREDUCE-6638 Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle uname Linux b1be9b1eecdb 3.13.0-93-generic #140-Ubuntu SMP Mon Jul 18 21:21:05 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux Build tool maven Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh git revision trunk / 4174b97 Default Java 1.8.0_101 findbugs v3.0.0 Test Results https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/6728/testReport/ modules C: hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app U: hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app Console output https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/6728/console Powered by Apache Yetus 0.3.0 http://yetus.apache.org This message was automatically generated.
          Hide
          kasha Karthik Kambatla added a comment -

          Patch looks pretty good. Really like the actionable log messages.

          Couple of nits:

          1. In the following log message, s/recovery/recover:
                if (reducerCount > 0 && spillEncrypted) {
                  LOG.info("Not attempting to recovery. Intermediate spill encryption" +
                      " is enabled.");
            
          2. Consider renaming attemptRecoveryIfNotFirstAttempt to shouldAttemptRecoveryIfNotFirstAttempt. That does become quite long, which makes me think: how about shouldAttemptRecovery? This method could also check for the first attempt case and return false. And, the if condition towards the end of processRecovery could include a check for first attempt. What do you think?
          Show
          kasha Karthik Kambatla added a comment - Patch looks pretty good. Really like the actionable log messages. Couple of nits: In the following log message, s/recovery/recover: if (reducerCount > 0 && spillEncrypted) { LOG.info( "Not attempting to recovery. Intermediate spill encryption" + " is enabled." ); Consider renaming attemptRecoveryIfNotFirstAttempt to shouldAttemptRecoveryIfNotFirstAttempt . That does become quite long, which makes me think: how about shouldAttemptRecovery ? This method could also check for the first attempt case and return false. And, the if condition towards the end of processRecovery could include a check for first attempt. What do you think?
          Hide
          haibochen Haibo Chen added a comment -

          Thanks for the reviews, Karthik! Uploading a new patch to address your comments.

          Show
          haibochen Haibo Chen added a comment - Thanks for the reviews, Karthik! Uploading a new patch to address your comments.
          Hide
          kasha Karthik Kambatla added a comment -

          +1, pending Jenkins.

          Show
          kasha Karthik Kambatla added a comment - +1, pending Jenkins.
          Hide
          hadoopqa Hadoop QA added a comment -
          +1 overall



          Vote Subsystem Runtime Comment
          0 reexec 0m 14s Docker mode activated.
          +1 @author 0m 0s The patch does not contain any @author tags.
          +1 test4tests 0m 0s The patch appears to include 1 new or modified test files.
          +1 mvninstall 6m 45s trunk passed
          +1 compile 0m 22s trunk passed
          +1 checkstyle 0m 17s trunk passed
          +1 mvnsite 0m 27s trunk passed
          +1 mvneclipse 0m 14s trunk passed
          +1 findbugs 0m 36s trunk passed
          +1 javadoc 0m 15s trunk passed
          +1 mvninstall 0m 21s the patch passed
          +1 compile 0m 20s the patch passed
          +1 javac 0m 20s the patch passed
          +1 checkstyle 0m 14s hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app: The patch generated 0 new + 190 unchanged - 1 fixed = 190 total (was 191)
          +1 mvnsite 0m 25s the patch passed
          +1 mvneclipse 0m 12s the patch passed
          +1 whitespace 0m 0s The patch has no whitespace issues.
          +1 findbugs 0m 39s the patch passed
          +1 javadoc 0m 12s the patch passed
          +1 unit 8m 49s hadoop-mapreduce-client-app in the patch passed.
          +1 asflicense 0m 16s The patch does not generate ASF License warnings.
          21m 13s



          Subsystem Report/Notes
          Docker Image:yetus/hadoop:9560f25
          JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12829943/mapreduce6683.005.patch
          JIRA Issue MAPREDUCE-6638
          Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle
          uname Linux 9f1d1d805c63 3.13.0-93-generic #140-Ubuntu SMP Mon Jul 18 21:21:05 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
          Build tool maven
          Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh
          git revision trunk / 40acace
          Default Java 1.8.0_101
          findbugs v3.0.0
          Test Results https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/6736/testReport/
          modules C: hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app U: hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app
          Console output https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/6736/console
          Powered by Apache Yetus 0.3.0 http://yetus.apache.org

          This message was automatically generated.

          Show
          hadoopqa Hadoop QA added a comment - +1 overall Vote Subsystem Runtime Comment 0 reexec 0m 14s Docker mode activated. +1 @author 0m 0s The patch does not contain any @author tags. +1 test4tests 0m 0s The patch appears to include 1 new or modified test files. +1 mvninstall 6m 45s trunk passed +1 compile 0m 22s trunk passed +1 checkstyle 0m 17s trunk passed +1 mvnsite 0m 27s trunk passed +1 mvneclipse 0m 14s trunk passed +1 findbugs 0m 36s trunk passed +1 javadoc 0m 15s trunk passed +1 mvninstall 0m 21s the patch passed +1 compile 0m 20s the patch passed +1 javac 0m 20s the patch passed +1 checkstyle 0m 14s hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app: The patch generated 0 new + 190 unchanged - 1 fixed = 190 total (was 191) +1 mvnsite 0m 25s the patch passed +1 mvneclipse 0m 12s the patch passed +1 whitespace 0m 0s The patch has no whitespace issues. +1 findbugs 0m 39s the patch passed +1 javadoc 0m 12s the patch passed +1 unit 8m 49s hadoop-mapreduce-client-app in the patch passed. +1 asflicense 0m 16s The patch does not generate ASF License warnings. 21m 13s Subsystem Report/Notes Docker Image:yetus/hadoop:9560f25 JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12829943/mapreduce6683.005.patch JIRA Issue MAPREDUCE-6638 Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle uname Linux 9f1d1d805c63 3.13.0-93-generic #140-Ubuntu SMP Mon Jul 18 21:21:05 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux Build tool maven Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh git revision trunk / 40acace Default Java 1.8.0_101 findbugs v3.0.0 Test Results https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/6736/testReport/ modules C: hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app U: hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app Console output https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/6736/console Powered by Apache Yetus 0.3.0 http://yetus.apache.org This message was automatically generated.
          Hide
          hitesh Hitesh Shah added a comment - - edited

          (1) Avoid recovering an AM if encrypted spill is enabled

          Encrypted spill w.r.t recovery is not the same as a committer not supporting recovery. Any reason we cannot just re-run the job from scratch if all reducers have not completed ( or re-run all maps and incomplete reducers )?

          Ideally speaking, you could just re-run most of the job tasks again if needed to support proper fault tolerance even in scenarios where the key cannot be stored securely. In this scenario, the new AM can generate a new key. I would agree that this might not be a performant solution but it atleast solves the problem of not having the user to re-submit the job. If performance is an issue, users can turn off recovery when encryption is enabled for scenarios where the key cannot be stored securely.

          Show
          hitesh Hitesh Shah added a comment - - edited (1) Avoid recovering an AM if encrypted spill is enabled Encrypted spill w.r.t recovery is not the same as a committer not supporting recovery. Any reason we cannot just re-run the job from scratch if all reducers have not completed ( or re-run all maps and incomplete reducers )? Ideally speaking, you could just re-run most of the job tasks again if needed to support proper fault tolerance even in scenarios where the key cannot be stored securely. In this scenario, the new AM can generate a new key. I would agree that this might not be a performant solution but it atleast solves the problem of not having the user to re-submit the job. If performance is an issue, users can turn off recovery when encryption is enabled for scenarios where the key cannot be stored securely.
          Hide
          haibochen Haibo Chen added a comment -

          Hi Hitesh Shah Thanks very much for your comments! Sorry for the confusing phase in there. I guess a better way to put it is probably something like
          Avoid recovering progress made in previous attempts if encrypted spill is enabled.

          If recovery and spill encryption are both enabled, the job attempt does not fail. The job attempt will simply start from scratch because no history file is parsed. If the naming in the patch is too misleading, I can create a new follow up jira to rename the shouldAttemptRecovery. How about shouldAttemptToRecoveryProgress?

          Show
          haibochen Haibo Chen added a comment - Hi Hitesh Shah Thanks very much for your comments! Sorry for the confusing phase in there. I guess a better way to put it is probably something like Avoid recovering progress made in previous attempts if encrypted spill is enabled. If recovery and spill encryption are both enabled, the job attempt does not fail. The job attempt will simply start from scratch because no history file is parsed. If the naming in the patch is too misleading, I can create a new follow up jira to rename the shouldAttemptRecovery. How about shouldAttemptToRecoveryProgress?
          Hide
          kasha Karthik Kambatla added a comment -

          Hitesh Shah - are you comfortable with checking this in?

          Show
          kasha Karthik Kambatla added a comment - Hitesh Shah - are you comfortable with checking this in?
          Hide
          hitesh Hitesh Shah added a comment - - edited

          Havent looked at the patch in detail but Haibo Chen's clarifying comments make sense. Jira title could be modified accordingly. +0 from my side.

          Show
          hitesh Hitesh Shah added a comment - - edited Havent looked at the patch in detail but Haibo Chen 's clarifying comments make sense. Jira title could be modified accordingly. +0 from my side.
          Hide
          kasha Karthik Kambatla added a comment -

          Thanks Hitesh. Checking this in with the updated title.

          Show
          kasha Karthik Kambatla added a comment - Thanks Hitesh. Checking this in with the updated title.
          Hide
          kasha Karthik Kambatla added a comment -

          Just committed to trunk and branch-2. Thanks Haibo Chen for cleaning this up, and Hitesh Shah for chiming in on the appropriateness of the title/commit message.

          Show
          kasha Karthik Kambatla added a comment - Just committed to trunk and branch-2. Thanks Haibo Chen for cleaning this up, and Hitesh Shah for chiming in on the appropriateness of the title/commit message.
          Hide
          hudson Hudson added a comment -

          SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #10532 (See https://builds.apache.org/job/Hadoop-trunk-Commit/10532/)
          MAPREDUCE-6638. Do not attempt to recover progress from previous job (kasha: rev de7a0a92ca1983b35ca4beb7ab712fd700a9e6e0)

          • (edit) hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/TestRecovery.java
          • (edit) hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/MRAppMaster.java
          Show
          hudson Hudson added a comment - SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #10532 (See https://builds.apache.org/job/Hadoop-trunk-Commit/10532/ ) MAPREDUCE-6638 . Do not attempt to recover progress from previous job (kasha: rev de7a0a92ca1983b35ca4beb7ab712fd700a9e6e0) (edit) hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/TestRecovery.java (edit) hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/MRAppMaster.java

            People

            • Assignee:
              haibochen Haibo Chen
              Reporter:
              kasha Karthik Kambatla
            • Votes:
              0 Vote for this issue
              Watchers:
              7 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development